Automatic Keyphrase Extraction

Danuta Zakrzewska, Katarzyna Mataśka

Abstract


The growing number of documents on the Web has increased the need for tools that support automatic search and classification of texts. Keywords are one of the characteristic features of documents that may be used as criteria in automatic document management. In this paper we describe a technique for automatic keyphrase extraction based on the KEA algorithm [1]. The main modifications are changes to the stemming method and a simplification of the discretization technique. In addition, in the presented algorithm the keyphrase list may contain proper names, and the candidate phrase list may contain number sequences. We describe experiments that were carried out on a set of English-language documents available on the Internet and that allow the extraction parameters to be optimized. A comparison of the efficiency of the algorithm with that of the KEA technique is presented.
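
To make the candidate-phrase stage more concrete, the following is a minimal sketch in Python. It is not the authors' implementation. It assumes the two features usually associated with KEA (TF×IDF and the relative position of the first occurrence), and it stops before the discretization and classification steps. The stopword list, the `toy_stem` function, the add-one smoothing of the IDF term, and all function names are hypothetical stand-ins chosen for illustration. For brevity the sketch also lets phrases span sentence boundaries, which a real extractor would normally prevent.

```python
import math
import re

# Tiny illustrative stopword list (an assumption, not the paper's list).
STOPWORDS = {"a", "an", "and", "the", "of", "in", "on", "for", "to", "is", "are", "with"}


def tokenize(text):
    """Lowercase the text and split it into word tokens and number tokens."""
    return re.findall(r"[a-z]+|[0-9]+", text.lower())


def toy_stem(token):
    """Crude plural stripping; a hypothetical stand-in for a real stemmer."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def candidates(tokens, max_len=3):
    """Map each stemmed candidate phrase to (count, position of first token)."""
    found = {}
    for start in range(len(tokens)):
        for length in range(1, max_len + 1):
            gram = tokens[start:start + length]
            if len(gram) < length:
                break  # ran past the end of the document
            # Skip phrases that begin or end with a stopword.
            if gram[0] in STOPWORDS or gram[-1] in STOPWORDS:
                continue
            key = " ".join(toy_stem(t) for t in gram)
            count, first = found.get(key, (0, start))
            found[key] = (count + 1, first)
    return found


def features(text, doc_freq, n_docs):
    """Return {phrase: (tf_idf, first_occurrence)} for one document.

    doc_freq maps a stemmed phrase to the number of corpus documents
    containing it; n_docs is the corpus size.
    """
    tokens = tokenize(text)
    total = len(tokens)
    result = {}
    for phrase, (count, first) in candidates(tokens).items():
        tf = count / total
        # Add-one smoothing (an assumption) keeps unseen phrases finite.
        idf = math.log((n_docs + 1) / (doc_freq.get(phrase, 0) + 1))
        result[phrase] = (tf * idf, first / total)
    return result
```

Because the tokenizer keeps digit runs, a number sequence such as "2006" can appear inside a candidate phrase, in line with the modification described above. In a complete system the two feature values would then be discretized and passed to a trained classifier that ranks the candidates.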



DOI: http://dx.doi.org/10.17951/ai.2006.5.1.101-111
Date of publication: 2006-01-01 00:00:00
Date of article submission: 2016-04-27 10:15:50




Copyright (c) 2015 Annales UMCS Sectio AI Informatica

This work is licensed under a Creative Commons Attribution 4.0 International License.