New collaboration: "The VerbCorner Project"

May 21, 2013

MIT Intelligence Initiative Fellow Tim O'Donnell is part of a new collaborative project, "The VerbCorner Project," hosted by GamesWithWords.org (host of a number of large-scale Web-based research projects).

 
Collaborators: Timothy O'Donnell (MIT), Joshua Hartshorne (MIT), Martha Palmer (CU-Boulder), Daniel Peterson (CU-Boulder). MIT undergraduate student Gabriel Frattallone provided the website design.

On Tuesday, May 21, 2013, researchers at MIT, in collaboration with researchers at CU-Boulder, launched a massive crowd-sourced project aimed at determining what verbs mean. One could, of course, just look up those same verbs in a dictionary. The problem with a dictionary, though, is that it defines words in terms of other words. The hard work is done by the human reading the dictionary, not the dictionary itself.

Core Components of Meaning

If we do not want to define words in terms of other words, what should they be defined in terms of? This remains an open research question, but many researchers have argued that -- particularly in the case of verbs -- there are a relatively small number of core concepts around which meaning is organized, and that many of the features of these words can be explained by which of the core concepts they encode.

Consider the fact that while you can say Sally hit the vase or Sally hit at the vase, and you can say Sally kicked the vase or Sally kicked at the vase, you can only say Sally broke the vase and not Sally broke at the vase. In the same fashion, while you can say Sally broke the vase or The vase broke, you cannot say The vase hit or The vase kicked, unless you are talking about a very special type of vase.

Researchers have explained these patterns in terms of meaning: Sally hit/kicked the vase specifies that Sally came in contact with the vase, not that she changed it in some way (the vase might break as a result, but the sentence leaves this ambiguous). In contrast, Sally broke the vase indicates that she caused the vase to change state (from whole to broken), though whether she did this through physical contact or by remote control is unstated. Which types of sentences a verb can appear in is dictated by which components of meaning it has. Interestingly, the same core components of meaning often discussed in language research -- contact, causation, change-of-state, intentionality, etc. -- are frequently discussed by developmental psychologists, where it has been argued that they form the core underlying principles of infant cognition, suggesting that these components of meaning are rooted deep in the human mind.
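The link between components of meaning and permissible sentence types can be illustrated with a small sketch. This is a toy model only: the feature assignments, the frame labels, and the two rules are simplifying assumptions based on the hit/kick/break examples above, not the project's actual analysis.

```python
# Toy illustration: which sentence frames a verb allows, given its
# components of meaning. All data and rules here are assumptions made
# for the example, not VerbCorner's real annotations.
VERB_COMPONENTS = {
    "hit": {"contact"},
    "kick": {"contact"},
    "break": {"change_of_state"},
}


def allowed_frames(verb):
    """Return the sentence frames this toy model predicts for a verb."""
    components = VERB_COMPONENTS[verb]
    frames = ["X VERBed Y"]  # plain transitive: Sally hit/broke the vase
    if "contact" in components:
        frames.append("X VERBed at Y")  # Sally hit at the vase
    if "change_of_state" in components:
        frames.append("Y VERBed")  # The vase broke
    return frames


for verb in VERB_COMPONENTS:
    print(verb, allowed_frames(verb))
```

Under these assumed rules, hit and kick allow the "at" frame but not the intransitive one, while break shows the reverse pattern, matching the judgments described above.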

Scaling Up

One limitation of this work is that it has tended to focus on small numbers of verbs. As Joshua Hartshorne, a post-doctoral fellow in the Department of Brain and Cognitive Sciences and the lead researcher on this project, explains, "There are a few dozen core components of meaning and there are tens of thousands of verbs in English. Even worse, each verb can appear in many different sentence contexts, which can affect its meaning. The Joads loaded the hay onto the wagon means that all the hay is now on the wagon, but doesn't require that the wagon be full, whereas The Joads loaded the wagon with the hay does exactly the opposite. A given verb can appear in as many as a couple dozen sentence contexts. To work out for every verb in every sentence context exactly which components of meaning are involved is simply not feasible even for a large research group."

For this reason, the researchers turned to crowd-sourcing. They developed a website (http://gameswithwords.org/VerbCorner/) where volunteers can answer questions about sentences, with the answer to each question determining the presence or absence of a component of meaning. If successful, the VerbCorner project would not be the first to harness crowd-sourcing for science: some of the more famous examples are Galaxy Zoo, which successfully classified a million galaxies based on images, and Foldit, a protein-folding game. Like these and other successful "Citizen Science" projects, VerbCorner has been "gamified." The questions involve fanciful backstories, and users earn points and badges, with the most productive users earning a place on the leader board.
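One simple way to turn many volunteers' answers into a single judgment is a majority vote. The sketch below is an assumption for illustration only: the article does not describe how VerbCorner actually aggregates responses, and the function name, the record layout, and the `min_votes` threshold are all hypothetical.

```python
from collections import Counter


def aggregate(responses, min_votes=3):
    """Majority-vote volunteer answers (hypothetical scheme, not VerbCorner's).

    responses: iterable of (verb, frame, component, answer) tuples, where
    answer is True if the volunteer judged the component to be present.
    Returns {(verb, frame, component): bool} for items with enough votes
    and a clear majority; ties and under-sampled items are left out.
    """
    votes = {}
    for verb, frame, component, answer in responses:
        votes.setdefault((verb, frame, component), Counter())[answer] += 1

    decided = {}
    for key, counter in votes.items():
        total = counter[True] + counter[False]
        if total < min_votes or counter[True] * 2 == total:
            continue  # too few answers, or an exact tie
        decided[key] = counter[True] * 2 > total
    return decided


sample = [
    ("break", "X VERBed Y", "change_of_state", True),
    ("break", "X VERBed Y", "change_of_state", True),
    ("break", "X VERBed Y", "change_of_state", False),
    ("hit", "X VERBed Y", "change_of_state", False),
]
print(aggregate(sample))
```

In the sample, the item for break is decided by a two-to-one vote, while the item for hit is left out because it has only one answer so far.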

"The response so far has been fantastic," wrote Timothy O'Donnell, an I2 post-doctoral fellow and researcher on the team. "Within the first twelve hours after launch, we had around 200 users who had answered 4,500 questions."

Implications

The research team expects the results of the project to be useful to researchers from a variety of disciplines. Having a systematic understanding of word meaning is vital for computer scientists trying to create computers that understand what language means and for developmental psychologists trying to understand how children learn what language means. Similarly, the data could be used by linguists who are trying to understand the relationship between meaning and grammar or to characterize the differences in the structure of different languages.

Hartshorne gives an example. "One really important theory of language learning suggests that children and even adults use the verbs they already know to help guess the meanings of new verbs. For instance, if I hear a new verb dax in a sentence -- Sally daxed the vase -- my first guess as to its meaning is basically the average of all the verbs I already know that could have fit in the same sentence. But since we as researchers have not yet identified the meanings of even most such verbs, this is a very hard prediction to test. Many theories -- both in psychology and linguistics -- require that most verbs that can go in the sentence Sally VERBed the vase indicate that the vase changed somehow. I suspect this is true, but at the moment there is no way to test the hypothesis. If VerbCorner succeeds, we'll change that."
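The "average of known verbs" idea in Hartshorne's example can be sketched in a few lines. The verbs and feature values below are invented for illustration, which is precisely the gap the project aims to fill with real data; a value of 1.0 means the verb encodes the component and 0.0 means it does not.

```python
# Hypothetical lexicon: verbs assumed to fit "Sally VERBed the vase",
# each with made-up component-of-meaning values (not real annotations).
KNOWN_VERBS = {
    "break": {"change_of_state": 1.0, "contact": 0.0},
    "shatter": {"change_of_state": 1.0, "contact": 0.0},
    "hit": {"change_of_state": 0.0, "contact": 1.0},
    "kick": {"change_of_state": 0.0, "contact": 1.0},
}


def guess_meaning(verbs_in_frame):
    """Average the component values of all known verbs that fit the frame."""
    features = sorted({f for v in verbs_in_frame for f in KNOWN_VERBS[v]})
    n = len(verbs_in_frame)
    return {
        f: sum(KNOWN_VERBS[v].get(f, 0.0) for v in verbs_in_frame) / n
        for f in features
    }


# First guess for the novel verb "dax" in "Sally daxed the vase".
print(guess_meaning(list(KNOWN_VERBS)))
```

With this made-up lexicon the guess is evenly split between change of state and contact. Testing the prediction in earnest would require component values for most verbs that fit the frame.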

The project has already gotten attention within the research community. Within 24 hours of the launch, the team had been contacted by a researcher in the UK who wanted to collaborate in the further development of the project, and by a researcher in Italy who was interested in developing an Italian version. "We want to make sure the English version gets off the ground first," wrote O'Donnell. "We have had a really great response so far, which as I am writing this has not yet let up. But it really depends on whether we continue to get enough volunteer support. But it would be fantastic to extend this work to other languages."

Joshua Hartshorne is funded by an NIH NRSA post-doctoral fellowship. 

The Boulder researchers are funded by NSF (IIS-1116782) and DARPA/IPTO (HR0011-06-C-0022, HR0011-11-C-0145 (BOLT), FA8750-09-C-0179 (M.R.)).
