UNIGE document Report
Title

Development of a flexible tool for the automatic comparison of bibliographic records. Application to sample collections

Authors
Borel, Alain
Year 2009
Description 107p.
Abstract With the proliferation of digital bibliographic catalogues (open repositories, library and bookseller catalogues), information specialists face the challenge of mass-processing large amounts of metadata for various purposes. Among the many possible applications, determining the similarity between records is an important one. Such similarity is of interest from a bibliographic point of view (do two records describe the same document? The answer is useful for deduplication and for collection overlap studies) as well as from a thematic point of view (suggesting documents to users, managing content within the framework of a library policy, automatically classifying documents, and so on). To meet these varied needs, we propose a flexible, open-source, multiplatform software tool that supports the implementation of multiple record comparison strategies. In a second step, we study the relevance and performance of several algorithms applied to a selection of collections that vary in size, origin, document type, and so on.
Keywords Marcximil; Deduplication; Near duplicates; Duplicates; Doublons; Dédoublonner; Dédoublonage; Similarité; Similarity; Similitude; Collection; Information retrieval
Full text
Report (1.1 MB) - public document Free access
Other version: http://infoscience.epfl.ch/record/141894
Citation
(ISO format)
BOREL, Alain, KRAUSE, Jan Brice. Development of a flexible tool for the automatic comparison of bibliographic records. Application to sample collections. 2009 https://archive-ouverte.unige.ch/unige:23174

236 hits

208 downloads

Deposited on: 2012-10-04
