UNIGE document Proceedings chapter
Title

Extraction of multi-word collocations using syntactic bigram composition

Authors Seretan, Violeta; Nerima, Luka; Wehrli, Eric
Published in Proceedings of the Fourth International Conference on Recent Advances in NLP (RANLP-2003), Borovets (Bulgaria), 10-12 September 2003, p. 424-431
Abstract This paper presents a method for extracting multi-word collocations (MWCs) from text corpora, based on the prior extraction of syntactically bound collocation bigrams. We describe an iterative word-linking procedure that relies on a syntactic criterion and builds up arbitrarily long expressions representing multi-word collocation candidates. We propose several measures to rank candidates according to their collocational strength, and we present the results of a trigram extraction experiment. The methodology is particularly well suited to identifying collocations whose terms are arbitrarily distant because of syntactic processes (passivization, relativization, dislocation, topicalization).
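The composition idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a parser has already produced syntactic bigrams as pairs of (position, lemma) token occurrences per sentence, and that two bigrams from the same sentence are linked when they share exactly one token occurrence. The sample data, the name compose_trigrams, and the use of raw frequency as a ranking score are all hypothetical stand-ins; the measures proposed in the paper are not reproduced here.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Hypothetical parser output: (sentence_id, ((pos, lemma), (pos, lemma))).
# Sentence 0 ~ "... play an important role ..."
# Sentence 1 ~ "the important role that ... played" (relativization)
BIGRAMS = [
    (0, ((1, "play"), (4, "role"))),
    (0, ((3, "important"), (4, "role"))),
    (1, ((5, "play"), (2, "role"))),
    (1, ((1, "important"), (2, "role"))),
    (2, ((1, "heavy"), (2, "rain"))),  # no partner bigram, so no trigram
]


def compose_trigrams(bigrams):
    """Link bigrams of one sentence that share exactly one token occurrence."""
    by_sentence = defaultdict(list)
    for sent_id, pair in bigrams:
        by_sentence[sent_id].append(frozenset(pair))

    counts = Counter()
    for pairs in by_sentence.values():
        for a, b in combinations(pairs, 2):
            if len(a & b) == 1:  # same word occurrence appears in both bigrams
                # Assumption: sort lemmas so surface order does not matter.
                lemmas = tuple(sorted(lemma for _, lemma in a | b))
                counts[lemmas] += 1
    return counts


TRIGRAMS = compose_trigrams(BIGRAMS)
# Raw frequency as a placeholder ranking (not one of the paper's measures).
for lemmas, freq in TRIGRAMS.most_common():
    print(freq, " ".join(lemmas))
```

Because candidates are keyed on lemmas rather than surface order, the canonical and relativized sentences contribute to the same trigram candidate, which is the kind of case the abstract highlights. Repeating the linking step on the resulting trigrams would, under the same assumptions, yield longer candidates.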
Full text
Proceedings chapter (155 Kb) - public document Free access
Other version: http://lml.bas.bg/ranlp2003/
Structures
Research group Laboratoire d'Analyse et de Traitement du Langage (LATL)
Citation
(ISO format)
SERETAN, Violeta, NERIMA, Luka, WEHRLI, Eric. Extraction of multi-word collocations using syntactic bigram composition. In: Proceedings of the Fourth International Conference on Recent Advances in NLP (RANLP-2003). Borovets (Bulgaria). [s.l.] : [s.n.], 2003. p. 424-431. https://archive-ouverte.unige.ch/unige:17034

Deposited on: 2011-09-26
