UNIGE document Proceedings chapter

ResToRinG CaPitaLiZaTion in #TweeTs

Bontcheva, Kalina
Gorrell, Genevieve
Published in 3rd International Workshop on Natural Language Processing for Social Media in conjunction with WWW 2015. Florence (Italy), 18-22 May 2015. 2015, p. 1111-1115
Abstract The rapid proliferation of microblogs such as Twitter has resulted in a vast quantity of written text becoming available that contains interesting information for NLP tasks. However, the noise level in tweets is so high that standard NLP tools perform poorly. In this paper, we present a statistical truecaser for tweets using a 3-gram language model built with truecased newswire texts and tweets. Our truecasing method shows an improvement in named entity recognition and part-of-speech tagging tasks.
Stable URL https://archive-ouverte.unige.ch/unige:72793
Research group Laboratoire d'Analyse et de Traitement du Langage (LATL)
Project FNS: P1GEP1_151615
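
To make the idea in the abstract concrete, here is a minimal sketch of n-gram-based truecasing. It is not the authors' implementation: it uses raw trigram, bigram and unigram counts with greedy left-to-right decoding, where the paper uses a 3-gram language model built from truecased newswire texts and tweets. The function names `train` and `truecase` and the `<s>` padding token are assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train(sentences):
    """Count cased unigrams, bigrams and trigrams from truecased sentences."""
    counts = Counter()
    variants = defaultdict(set)  # lowercased form -> casings seen in training
    for sent in sentences:
        toks = ["<s>", "<s>"] + sent.split()  # hypothetical start-of-sentence padding
        for i in range(2, len(toks)):
            variants[toks[i].lower()].add(toks[i])
            counts[(toks[i],)] += 1
            counts[(toks[i - 1], toks[i])] += 1
            counts[(toks[i - 2], toks[i - 1], toks[i])] += 1
    return counts, variants


def truecase(text, counts, variants):
    """Greedily pick, for each token, the casing best supported by its left context."""
    out = ["<s>", "<s>"]
    for tok in text.split():
        # Tokens never seen in training are left unchanged.
        cands = variants.get(tok.lower(), {tok})

        def score(c):
            # Crude backoff: compare trigram count first, then bigram, then unigram.
            # The string itself is the last key so that ties break deterministically.
            return (counts[(out[-2], out[-1], c)], counts[(out[-1], c)], counts[(c,)], c)

        out.append(max(cands, key=score))
    return " ".join(out[2:])
```

With a toy corpus such as `["The president visits London", "I love Paris in the spring"]`, calling `truecase("i LoVe pArIs", counts, variants)` restores the casing seen in training. A real system would replace the count tuple with smoothed language-model probabilities and search over whole sequences instead of deciding greedily.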

Deposited on : 2015-05-26
