Oct 30

The one place in our NLP system where compositionality is fully abandoned is the Semantic Tagger. Because of the syntactic and semantic complexity of the structures we have to analyse, this module can’t derive the interpretation of a given construction (say, a finite sentence about the relation “be employee of”) step by step from the interpretation of its syntactic parts. That is not only because we lack a sufficiently powerful syntactic analyser; we simply don’t need one.

To extract a relation like “be employee of” we use a number of patterns relying on entities recognized previously. Look at (1).

(1) In the mid-1990 the former German citizen Heinz Schimmelbuch becomes CEO of Weissel company.

If the processor already knows that mid-1990 is a Date/Period, Heinz Schimmelbuch is a Person, CEO is a Job Title, Weissel company is an Organization, and become is a verb of coming into being or something similar, and that there is no punctuation between these items, it has enough information to conclude that the sentence might be about an employment relation, provided several additional conditions are satisfied. The first condition is the order of the items in the sentence. Sentence (2) might also be employment-related, but (3) is not.

(2) The former German citizen Heinz Schimmelbuch becomes CEO of Weissel company in the mid-1990.
(3) In the mid-1990 Heinz Schimmelbuch becomes American citizen and meets the CEO of Weissel company.

The second condition concerns specific constraints on the key verb. E.g., the verb become has to be finite: with a non-finite verb, the probability that the sentence is irrelevant increases, cf. (4).

(4) In the mid-1990 the former German citizen Heinz Schimmelbuch dreamed about becoming CEO of Weissel company.

Then, taking these observations into account, we can safely add a pattern that uses just the recognized items (Person, Job Title, etc.) and some punctuation marks (commas, periods, etc.), “hiding” all the irrelevant parts of the sentence.

(5) … the mid-1990 … Heinz Schimmelbuch … become … CEO … Weissel company.
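The “hiding” step could be sketched roughly like this in Python. This is only an illustration, not our actual code: the Span class, the function names and the label strings (including "Punct") are made up for the example, and the annotations for sentence (1) are typed in by hand rather than produced by a real recognizer.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Span:
    """A stretch of text, optionally labelled by an earlier recognizer."""
    text: str
    label: Optional[str] = None  # None means "not recognized, irrelevant here"


def reduce_to_items(spans: List[Span]) -> List[Span]:
    """Keep only the recognized items (entities, key verb, punctuation)."""
    return [s for s in spans if s.label is not None]


def render(spans: List[Span]) -> str:
    """Show the sentence with every unrecognized stretch collapsed to an ellipsis."""
    parts: List[str] = []
    for s in spans:
        if s.label is not None:
            parts.append(s.text)
        elif not parts or parts[-1] != "…":
            parts.append("…")
    return " ".join(parts)


# Sentence (1), annotated by hand (hypothetical labels).
sentence_1 = [
    Span("In the"),
    Span("mid-1990", "Date"),
    Span("the former German citizen"),
    Span("Heinz Schimmelbuch", "Person"),
    Span("becomes", "becomeVG"),
    Span("CEO", "JobTitle"),
    Span("of"),
    Span("Weissel company", "Organization"),
    Span(".", "Punct"),
]

print(render(sentence_1))
print([s.label for s in reduce_to_items(sentence_1)])
```

The reduced list of labels is what a pattern would then be matched against; the surface words in between no longer matter.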

Now we can compose such a pattern.

({Date} | {StartPoint})? {Person}
{becomeVG.VOICE = “act”, becomeVG.MOOD = “ind”}
{JobTitle} {Organization}

And yes, this pattern is indeed used in our system.
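To make the mechanics of such a pattern concrete, here is a rough Python sketch of a matcher for it. It is not how our pattern engine is implemented; it is a hand-written stand-in that assumes the sentence has already been reduced to a list of recognized items. The Item class, the match_employment function, the keys of the returned dictionary and the feature value "ger" for a non-finite form are all invented for the illustration; only the labels and the VOICE/MOOD constraints come from the pattern above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Item:
    """A recognized item: an entity, a verb group, or a punctuation mark."""
    label: str
    text: str
    features: Dict[str, str] = field(default_factory=dict)


def match_employment(items: List[Item]) -> Optional[Dict[str, str]]:
    """Try the pattern at every position; return the captured slots or None."""
    for start in range(len(items)):
        i = start
        slots: Dict[str, str] = {}
        # ({Date} | {StartPoint})?  -- optional leading time expression
        if items[i].label in ("Date", "StartPoint"):
            slots["when"] = items[i].text
            i += 1
        if i + 4 > len(items):
            continue
        person, verb, title, org = items[i:i + 4]
        if (person.label == "Person"
                and verb.label == "becomeVG"
                and verb.features.get("VOICE") == "act"
                and verb.features.get("MOOD") == "ind"
                and title.label == "JobTitle"
                and org.label == "Organization"):
            slots["person"] = person.text
            slots["job_title"] = title.text
            slots["organization"] = org.text
            return slots
    return None


# Sentence (1): finite "becomes", so the pattern should fire.
sentence_1 = [
    Item("Date", "mid-1990"),
    Item("Person", "Heinz Schimmelbuch"),
    Item("becomeVG", "becomes", {"VOICE": "act", "MOOD": "ind"}),
    Item("JobTitle", "CEO"),
    Item("Organization", "Weissel company"),
    Item("Punct", "."),
]

# Sentence (4): non-finite "becoming" (hypothetical MOOD value "ger").
sentence_4 = [
    Item("Date", "mid-1990"),
    Item("Person", "Heinz Schimmelbuch"),
    Item("becomeVG", "becoming", {"VOICE": "act", "MOOD": "ger"}),
    Item("JobTitle", "CEO"),
    Item("Organization", "Weissel company"),
    Item("Punct", "."),
]

print(match_employment(sentence_1))
print(match_employment(sentence_4))
```

Because the items must be adjacent in the reduced list, a punctuation item between, say, the Person and the verb group blocks the match, which mirrors the "no punctuation between these items" requirement.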