Proceedings chapter
English

Generating Usable Formats for Metadata and Annotations in a Large Meeting Corpus

Presented at: Prague (Czech Republic), June
Publication date: 2007
Abstract

The AMI Meeting Corpus is now publicly available, including manual annotation files generated in the NXT XML format, but lacking explicit metadata for the 171 meetings of the corpus. To increase the usability of this important resource, a representation format based on relational databases is proposed, which maximizes informativeness, simplicity and reusability of the metadata and annotations. The annotation files are converted to a tabular format using an easily adaptable XSLT-based mechanism, and their consistency is verified in the process. Metadata files are generated directly in the IMDI XML format from implicit information, and converted to tabular format using a similar procedure. The results and tools will be freely available with the AMI Corpus. Sharing the metadata using the Open Archives network will contribute to increasing the visibility of the AMI Corpus.
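
The conversion described in the abstract, from XML annotation files to tabular rows with a consistency check along the way, can be illustrated with a minimal sketch. The chapter uses an XSLT-based mechanism; the sketch below substitutes Python's standard-library XML parser, and the element names, attribute names, meeting identifier and sample content are all hypothetical stand-ins rather than the actual NXT schema or AMI data.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical NXT-like fragment: names and values are illustrative assumptions,
# not taken from the AMI Corpus. Word "w3" is deliberately inconsistent.
SAMPLE_XML = """<words meeting="M001" speaker="A">
  <w id="w1" starttime="0.50" endtime="0.82">hello</w>
  <w id="w2" starttime="0.82" endtime="1.10">everyone</w>
  <w id="w3" starttime="1.40" endtime="1.10">oops</w>
</words>"""


def xml_to_rows(xml_text):
    """Flatten word elements into tuples; collect ids that fail the time check."""
    root = ET.fromstring(xml_text)
    meeting = root.get("meeting")
    speaker = root.get("speaker")
    rows, errors = [], []
    for w in root.iter("w"):
        start = float(w.get("starttime"))
        end = float(w.get("endtime"))
        if end < start:
            # Consistency check: an annotation cannot end before it starts.
            errors.append(w.get("id"))
            continue
        rows.append((meeting, speaker, w.get("id"), start, end, w.text or ""))
    return rows, errors


def rows_to_tsv(rows):
    """Serialize rows as tab-separated text, ready for bulk loading into a table."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter="\t", lineterminator="\n")
    writer.writerow(["meeting", "speaker", "word_id", "start", "end", "text"])
    writer.writerows(rows)
    return buf.getvalue()


rows, errors = xml_to_rows(SAMPLE_XML)
print(rows_to_tsv(rows))
print("inconsistent ids:", errors)
```

Each row carries the meeting and speaker identifiers alongside the annotation itself, so that a relational database can join annotation tables with metadata tables on those columns.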

Citation (ISO format)
POPESCU-BELIS, Andréi, ESTRELLA, Paula Susana. Generating Usable Formats for Metadata and Annotations in a Large Meeting Corpus. In: Proceedings of the 45th International Conference of the Association for Computational Linguistics (ACL 2007): Interactive Poster and Demonstration Sessions. Prague (Czech Republic). [s.l.] : [s.n.], 2007. p. 93–96.
Main files (1)
Proceedings chapter
Access level: Public
Identifiers
  • PID : unige:3462
594 views
395 downloads

Technical information

Creation: 10/02/2009 9:28:58 AM
First validation: 10/02/2009 9:28:58 AM
Update time: 03/14/2023 3:14:53 PM
Status update: 03/14/2023 3:14:53 PM
Last indexation: 10/29/2024 12:18:30 PM
All rights reserved by Archive ouverte UNIGE and the University of Geneva