“The marvelous thing is that even in studying linguistics, we find that the universe as a whole is patterned, ordered, and to some degree intelligible to us.” Linguist and anthropologist Kenneth Pike speaks to an optimistic view of what we can comprehend about the world we so often thoughtlessly flow through; we can find something intelligible about the world even in the “unlimited domain.” Of course, being of a fairly skeptical mind, I had to go out and seek the answers myself.

A year ago, I had the wonderful opportunity to meet with Professor Hubert Dreyfus at UC Berkeley. Dreyfus was a significant (if somewhat polemical) figure during the early days of research in artificial intelligence, and my discussion with him couldn't have come at a better time. I had just picked up his book What Computers Can't Do and was getting an introduction to the fundamental problems in AI research that Dreyfus considered extraordinarily difficult, if not wholly insurmountable.

It had been several years since he last discussed the book with an interested reader, but the same passion and pessimism that mark the tone of his writing were still evident in his language as we walked together for a short while. As I was preparing to leave, he made a point that is still very important to me: “AI research is still on the ground floor. Any contribution at all is significant.”

His point was both discouraging and empowering. There was so much to do, and if Professor Dreyfus' analysis was accurate, only a select few knew where to start. For a short time, I fancied myself a believer in his philosophy. As Professor Dreyfus termed it, I wanted to be one of the “Black Knights of AI,” the rebellious few who believed that the GOFAI (“good old-fashioned artificial intelligence”) endeavor was fundamentally doomed, and that meaningful progress toward artificial intelligence would only be possible after a major re-evaluation of the prevailing research paradigm.

In time, I came to realize that although much of AI research to date has lent some support to his skepticism (most notably in the absence of the sentient robots once predicted for this day and age), many excellent discoveries have still been made by scientists in the field. The verdant optimism that possessed scientists and philosophers at the outset has since given way to a more realistic approach, as the field came to appreciate just how complex the “unlimited domain” is; even so, what has been accomplished is nothing to scoff at. Expert systems, improved sensors, artificial reasoning, and machine learning are just some of the important developments that were heavily influenced by research in AI, if not directly products of it, to say nothing of the contributions made to other major fields such as linguistics, philosophy, psychology, and computer science.

With this nascent belief in mind, I sought to bridge the gap between my experience as an undergraduate in philosophy and problems in AI. After discussing my interests with professors in both the Philosophy and Computer Science departments, I was directed to Philosophy Professor Richmond Thomason for my first foray into independent research in AI.

Throughout the semester project, I came to grasp just how intricate the human understanding of discourse is, and how difficult it is to reflect this understanding computationally. The study was a review of the TACITUS project by Jerry Hobbs et al., specifically its treatment of reference resolution. My fairly general approach to the issues involved (I considered problems in coreference, anaphora, unclear utterances in discourse, and so on, all with regard to the abductive inference theory at the heart of the TACITUS project) served as an introduction to scholarly writing within this academic sphere.

Although I had initially perceived the problem of reference resolution (and by extension, several of the special cases worth considering) as a fairly focused topic, the paper's scope escalated quickly. Covering as much as I originally intended tipped me off to just how much intricacy goes into the fluid language human beings use all the time. After trimming the fairly voluminous draft down to a more reasonable 35 pages or so, I submitted the paper as a general examination of the concerns that any algorithm for reference resolution ought to address in order to adequately reflect the fluidity of human discourse.

This summer (2009) I was given the opportunity to work with the Computational Linguistics and Information Retrieval (CLAIR) research group under Professor Dragomir Radev. Much of my contribution involved examining a database of papers and annotating author affiliations for the ACL Anthology Network (AAN). The AAN provides a web interface for examining individual references from a large number of papers in the database, and it supports a variety of research projects and tasks. Sifting through that large body of papers introduced me to a panoply of prior work in computational linguistics and gave me an idea of where much important work was being done. During my annotation tasks, some institutions stood out as significant contributors to the field, as did several individuals who were forerunners in certain sub-domains of research. It was an exercise in the rigors of high-level research: it is not always glamorous or exciting; rather, many advances are made incrementally by dedicated staff and researchers. Just as importantly, it showed me the expectations of quality work at the university level.

This fall semester, I've continued to expand my work in reference resolution through further independent study here at the University of Michigan with Professor Steven Abney, this time developing a Java implementation of the Lappin-Leass algorithm for pronominal anaphora resolution. Working from textbooks and the original paper introducing the algorithm, building my own implementation from the ground up has presented a whole new set of coding challenges beyond the algorithm itself: transforming Charniak parser output into a data structure compatible with the required manipulations, correctly tagging the parts of speech of words from the parse tree, and fitting an accurate implementation of the algorithm's salience measures into that framework.
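To give a sense of the salience bookkeeping involved, the sketch below shows the kind of structure at the center of my implementation. It is illustrative rather than my actual code: the class and method names are invented for this example, while the factor weights are the initial values Lappin and Leass report in their original paper, with every referent's salience halved at each sentence boundary.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Minimal sketch of Lappin-Leass salience bookkeeping (illustrative only).
 * Each discourse referent accumulates weight from the salience factors it
 * satisfies; all salience values are halved at every sentence boundary.
 */
class Referent {
    final String surfaceForm;
    double salience;

    Referent(String surfaceForm) {
        this.surfaceForm = surfaceForm;
        this.salience = 0.0;
    }
}

public class SalienceModel {
    // Initial factor weights from Lappin and Leass (1994).
    static final double SENTENCE_RECENCY   = 100.0;
    static final double SUBJECT_EMPHASIS   = 80.0;
    static final double EXISTENTIAL_EMPH   = 70.0;
    static final double ACCUSATIVE_EMPH    = 50.0;
    static final double INDIRECT_OBJECT    = 40.0;
    static final double HEAD_NOUN_EMPHASIS = 80.0;
    static final double NON_ADVERBIAL      = 50.0;

    private final List<Referent> referents = new ArrayList<>();

    /** Record a new referent with whatever factors its parse position earns it. */
    void introduce(String form, double... factorWeights) {
        Referent r = new Referent(form);
        for (double w : factorWeights) {
            r.salience += w;
        }
        referents.add(r);
    }

    /** Halve every referent's salience when a sentence boundary is crossed. */
    void nextSentence() {
        for (Referent r : referents) {
            r.salience /= 2.0;
        }
    }

    /** Among agreement-compatible candidates, propose the most salient antecedent. */
    Referent resolve(List<Referent> compatibleCandidates) {
        Referent best = null;
        for (Referent r : compatibleCandidates) {
            if (best == null || r.salience > best.salience) {
                best = r;
            }
        }
        return best;
    }
}
```

In the full algorithm, of course, candidates are first filtered for morphological agreement and syntactic constraints before salience decides among the survivors; the difficulty in practice has been feeding that filtering from the Charniak parse trees reliably.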

Getting a taste of the scope of work being done in computational linguistics was (and continues to be) intimidating. Wading into the waters of the field made me realize just how far we are from HAL. Pushing forward, though, I realize that regardless of whether I see true AI born in my lifetime, there are many problems that are extremely challenging and academically interesting to consider.

Anaphora and reference resolution is a fairly focused topic in applied linguistics, and it is also just a small snapshot of my interests and future aspirations. Although it remains a passion of mine, it forms just a small part of my growing interest in conversational agents. Of particular interest to me is the development of conversational agents that make vocal interaction between human beings and computers more fluid and natural, to the point that this discourse approaches the way humans interact with each other verbally. What goes into conversational agents covers a great deal of linguistics as a field (theories of syntax and semantics, pragmatics, discourse, and coherence are all salient to a reasonable system, and they hardly exhaust the set of important considerations), and in that respect it parallels my interdisciplinary background: many varied elements brought together to form an entity whose value is greater than the sum of its parts.

That being said, I hope to continue my work on anaphora and reference resolution problems in my graduate studies. On a more focused scale, I would like time to study tree-based methods for reference resolution, especially the algorithm described by Jerry Hobbs, which is (as I perceive it) syntactically based, as opposed to relying on a discourse model and weighted preferences as the Lappin-Leass algorithm does.
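For a rough picture of what “syntactically based” means here, the sketch below gives a heavily simplified version of the tree walk underlying Hobbs's algorithm. The names are invented for illustration, and the real procedure adds several conditions on intervening nodes (and a fallback to earlier sentences) that I omit for brevity.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** A bare-bones parse tree node for the illustration below. */
class Node {
    final String label;            // e.g. "NP", "S", "VP", or a word
    final Node parent;
    final List<Node> children = new ArrayList<>();

    Node(String label, Node parent) {
        this.label = label;
        this.parent = parent;
        if (parent != null) parent.children.add(this);
    }
}

public class HobbsSketch {
    /**
     * Walk up from the pronoun to successive NP/S ancestors and, at each one,
     * breadth-first search the subtrees to the left of the upward path for an
     * NP candidate. Hobbs's extra constraints are omitted for brevity.
     */
    static Node proposeAntecedent(Node pronoun) {
        Node path = pronoun;
        for (Node anc = pronoun.parent; anc != null; anc = anc.parent) {
            if (anc.label.equals("NP") || anc.label.equals("S")) {
                Node candidate = bfsLeftOfPath(anc, path);
                if (candidate != null) return candidate;
            }
            path = anc;
        }
        return null;  // nothing in this tree; the full algorithm moves to prior sentences
    }

    /** Breadth-first, left-to-right search of root's subtrees left of the path node. */
    static Node bfsLeftOfPath(Node root, Node path) {
        Deque<Node> queue = new ArrayDeque<>();
        for (Node child : root.children) {
            if (child == path) break;   // only consider material left of the path
            queue.add(child);
        }
        while (!queue.isEmpty()) {
            Node n = queue.remove();
            if (n.label.equals("NP")) return n;
            queue.addAll(n.children);
        }
        return null;
    }
}
```

What draws me to this approach is precisely the contrast visible in the sketch: the search is driven entirely by tree geometry, with no salience weights or discourse model to maintain.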

These problems in AI are, in my opinion, interdisciplinary in nature. While I've focused much of my prior research and project experience on gaining a foothold in computational linguistics, I've also widened my grasp of concepts in other fields related to AI, such as logic, epistemology, and theory of mind as part of my undergraduate philosophy degree. In addition, I've studied general artificial intelligence algorithms and robotics as part of my minor in computer science. The focus of much of my undergraduate career, then, has been satisfying my interest in AI and finding my niche. To develop this interdisciplinary background, I've always pushed myself to gain experience in fields that challenged me and forced myself in directions where I wasn't always comfortable. My studies in linguistics (one course and two independent research projects) have all earned high marks from my guiding professors. While I'm comfortable in the niche my experience in computational linguistics has carved for me, the interdisciplinary (some would say eclectic) development of my education gives me the experience needed to attack problems in my field creatively.