UNIGE document Proceedings chapter
Title

Summarizing Sets of Categorical Sequences: selecting and visualizing representative sequences

Authors
Published in International Conference on Knowledge Discovery and Information Retrieval. Madeira, 6-8 October 2009
Abstract This paper is concerned with summarizing a set of categorical sequence data. More specifically, the problem studied is to determine the smallest possible number of representative sequences that ensure a given coverage of the whole set, i.e. that together have a given percentage of the sequences in their neighborhood. The goal is to yield a representative set that exhibits the key features of the whole sequence data set and permits easy and sound interpretation. We propose a heuristic for determining the representative set that first builds a list of candidates using a representativeness score and then eliminates redundancy. We also propose a visualization tool for rendering the results and quality measures for evaluating them. The proposed tools have been implemented in TraMineR, our R package for mining and visualizing sequence data, and we demonstrate their efficiency on a real-world example from the social sciences. The methods are nonetheless in no way limited to social science data and should prove useful in many other domains.
Keywords Categorical sequence data; Representativeness; Dissimilarity; Discrepancy of sequences; Summarizing
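The two-stage heuristic outlined in the abstract (score the candidates, then eliminate redundancy until the requested coverage is reached) can be illustrated with a minimal Python sketch. It is not the TraMineR implementation. The abstract does not say which representativeness score or dissimilarity measure is used, so this sketch assumes neighborhood density as the score and Hamming distance as the dissimilarity. The names `hamming`, `select_representatives`, `radius` and `coverage` are hypothetical.

```python
from typing import List, Sequence, Tuple


def hamming(a: Sequence[str], b: Sequence[str]) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def select_representatives(
    seqs: List[Sequence[str]], radius: int, coverage: float
) -> Tuple[List[int], float]:
    """Greedy sketch: return indices of representatives and the coverage reached.

    Assumptions (not taken from the paper): the representativeness score is
    the neighborhood density, i.e. the number of sequences within `radius`,
    and the dissimilarity is the Hamming distance.
    """
    n = len(seqs)
    if n == 0:
        return [], 0.0

    # Pairwise dissimilarities between all sequences.
    dist = [[hamming(s, t) for t in seqs] for s in seqs]

    # Stage 1: candidate list sorted by decreasing score (ties broken by index).
    density = [sum(d <= radius for d in row) for row in dist]
    candidates = sorted(range(n), key=lambda i: (-density[i], i))

    # Stage 2: walk the list, skipping candidates that are redundant with an
    # already selected representative, until the coverage target is met.
    chosen: List[int] = []
    covered = set()
    for i in candidates:
        if len(covered) >= coverage * n:
            break
        if any(dist[i][j] <= radius for j in chosen):
            continue  # redundant: lies in the neighborhood of a chosen one
        chosen.append(i)
        covered.update(k for k in range(n) if dist[i][k] <= radius)

    return chosen, len(covered) / n
```

For example, `select_representatives(["AAAA", "AAAB", "AABB", "CCCC", "CCCD", "BDBD"], radius=1, coverage=0.8)` selects the sequences at indices 1 and 3, whose neighborhoods together contain five of the six sequences. A greedy pass of this kind does not guarantee the smallest possible representative set; it only approximates it.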
Full text
Proceedings chapter - public document Free access
Citation
(ISO format)
GABADINHO, Alexis et al. Summarizing Sets of Categorical Sequences: selecting and visualizing representative sequences. In: International Conference on Knowledge Discovery and Information Retrieval. Madeira. [s.l.] : [s.n.], 2009. https://archive-ouverte.unige.ch/unige:4528

Deposited on : 2009-12-01
