A refined methodology for automatic keyphrase assignment to digital documents

Title: A refined methodology for automatic keyphrase assignment to digital documents
Publication Type: Journal Article
Year of Publication: 2011
Authors: Khan, S; Fatima, I; Irfan, R; Latif, K
Journal: Journal of Digital Information Management
Volume: 9
Issue: 2
Pagination: 55-63
Date Published: 2011
Keywords: Automatic indexing, Keyphrase assignment, Vocabulary
Abstract

Keyphrases express the primary topics and themes of a document and are valuable for cataloging and classification. Manually assigning keyphrases to existing documents is tedious, so automatic keyphrase generation has been used extensively to classify digital documents. Existing automatic keyphrase generation algorithms, however, have limited ability to assign semantically relevant keyphrases. In this paper we propose a methodology that refines the result set of keyphrases generated automatically by the Keyphrase Extraction Algorithm (KEA++), so that the keyphrases represent the content of the document accurately and precisely. Our approach is an additional layer on top of KEA++ that exploits the semantic relationships and hierarchical structure of the controlled vocabulary to filter irrelevant keyphrases out of the result set generated by KEA++. The methodology was applied to different sets of academic publications for evaluation. The evaluation demonstrates that the proposed refinement methodology improves the quality of the generated keyphrases.
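The general idea of filtering candidate keyphrases through a vocabulary hierarchy can be illustrated with a minimal sketch. This is not the authors' refinement algorithm and does not call KEA++. The toy vocabulary `BROADER`, the helper names, and the rule "keep a candidate only if it shares a hierarchy branch with at least one other candidate" are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical toy vocabulary: each term maps to its broader term
# (None marks a top-level term). A real controlled vocabulary would
# be loaded from a thesaurus file instead.
BROADER: Dict[str, Optional[str]] = {
    "computer science": None,
    "information retrieval": "computer science",
    "automatic indexing": "information retrieval",
    "keyphrase assignment": "automatic indexing",
    "agriculture": None,
    "soil science": "agriculture",
}


def ancestors(term: str) -> List[str]:
    """Return the chain of broader terms above `term`, nearest first."""
    chain: List[str] = []
    parent = BROADER.get(term)
    while parent is not None:
        chain.append(parent)
        parent = BROADER.get(parent)
    return chain


def related(a: str, b: str) -> bool:
    """Assumed relatedness rule: the two terms lie in the same branch,
    i.e. one is an ancestor of the other or they share a broader term."""
    line_a = {a, *ancestors(a)}
    line_b = {b, *ancestors(b)}
    return bool(line_a & line_b)


def refine(candidates: List[str]) -> List[str]:
    """Keep only candidates related to at least one other candidate."""
    return [
        c for c in candidates
        if any(related(c, other) for other in candidates if other != c)
    ]


# Candidates as they might come from an extractor (made-up input).
candidates = ["automatic indexing", "keyphrase assignment", "soil science"]
refined = refine(candidates)
```

With this made-up input, "soil science" has no hierarchical connection to the other two candidates and is dropped, while the two indexing terms support each other and are kept.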

URL: http://www.scopus.com/inward/record.url?eid=2-s2.0-79960690550&partnerID=40&md5=aeb220f4fd01fa335a93432979341abe

Collaborative Partner

Institute of Electronic and Information Technology (IEIT)