Title | A comparative study on key phrase extraction methods in automatic Web Site Summarization |
Publication Type | Journal Article |
Year of Publication | 2007 |
Authors | Zhang, Y, Milios, E, Zincir-Heywood, N |
Journal | Journal of Digital Information Management |
Volume | 5 |
Issue | 5 |
Pagination | 323 - 332 |
Date Published | 2007 |
Keywords | Key phrase extraction, Web retrieval, Web site study |
Abstract | Web Site Summarization is the process of automatically generating a concise and informative summary for a given Web site. It has gained more and more attention in recent years as effective summarization could lead to enhanced Web information retrieval systems such as searching for Web sites. Extraction-based approaches to Web site summarization rely on the extraction of the most significant sentences from the target Web site based on the density of a list of key phrases that best describe the entire Web site. In this work, we benchmark five alternative key phrase extraction methods, TFIDF, KEA, Keyword, Keyterm, and Mixture, in an automatic Web site summarization framework we previously developed. We investigate the performance of these underlying methods via a formal user study and demonstrate that Keyterm is the best choice for key phrase extraction while Mixture should be used to obtain key sentences. We also discuss why one method performs better than another and what could be done to further improve the summarization system. |
URL | http://www.scopus.com/inward/record.url?eid=2-s2.0-70350686085&partnerID=40&md5=1e62cb16df24be6167a1dbde14d4281c |
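
Illustrative note: the abstract describes selecting key sentences by the density of key phrases they contain. The sketch below is one possible reading of that idea, not the authors' framework. The function names (`tokenize`, `phrase_density`, `top_sentences`) are hypothetical, and the density definition used here (the fraction of a sentence's tokens covered by any key phrase match) is an assumption, since the record does not give the exact formula.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def phrase_density(sentence, key_phrases):
    """Return the fraction of sentence tokens covered by key phrase matches.

    Assumed definition for illustration only; the paper may weight
    phrases or normalize differently.
    """
    tokens = tokenize(sentence)
    if not tokens:
        return 0.0
    covered = [False] * len(tokens)
    for phrase in key_phrases:
        phrase_tokens = tokenize(phrase)
        n = len(phrase_tokens)
        if n == 0:
            continue
        # Slide a window over the sentence and mark every exact phrase match.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == phrase_tokens:
                for j in range(i, i + n):
                    covered[j] = True
    return sum(covered) / len(tokens)


def top_sentences(sentences, key_phrases, k=2):
    """Return the k sentences with the highest key phrase density.

    Ties keep their original order because Python's sort is stable.
    """
    ranked = sorted(
        sentences,
        key=lambda s: phrase_density(s, key_phrases),
        reverse=True,
    )
    return ranked[:k]
```

In this reading, the key phrase list could come from any of the five benchmarked methods (TFIDF, KEA, Keyword, Keyterm, or Mixture); the sentence ranking step stays the same and only the input list changes.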