Automatic Keyphrase Extraction
Abstract
Increasing number of documents in the Web caused the growth of needs for tools supporting automatic search and classification of texts. Keywords are one of characteristic features of documents that may be used as criteria in automatic document management. In the paper we describe the technique for automatic keyphrase extraction based on the KEA algorithm [1]. The main modifications consist in changes in the stemming method and simplification of the discretization technique. Besides, in the presented algorithm the keyphrase list may contain proper names, and the candidate phrase list may contain number sequences. We describe experiments, that were done on the set of English language documents available in the Internet and that allow for optimization of extraction parameters. The comparison of the efficiency of the algorithm with the KEA technique is presented.
Full Text:
PDFDOI: http://dx.doi.org/10.17951/ai.2006.5.1.101-111
Date of publication: 2006-01-01 00:00:00
Date of submission: 2016-04-27 10:15:50
Statistics
Total abstract view - 352
Downloads (from 2020-06-17) - PDF - 0
Indicators
Refbacks
- There are currently no refbacks.
Copyright (c) 2015 Annales UMCS Sectio AI Informatica
This work is licensed under a Creative Commons Attribution 4.0 International License.