Title | Querying and ranking XML documents based on data synopses |
Publication Type | Journal Article |
Year of Publication | 2011 |
Authors | He, W, Lv, T |
Journal | Journal of Digital Information Management |
Volume | 9 |
Issue | 5 |
Pagination | 199 - 205 |
Date Published | 2011 |
Keywords | Document ranking, Query processing, Query synopses, XML |
Abstract | Interest in querying and ranking XML documents has increased in recent years. In this paper, we present a new framework for querying and ranking schema-less XML documents based on concise summaries of their structural and textual content. We introduce a novel data synopsis structure that summarizes the textual content of an XML document for efficient indexing. More importantly, we extend the traditional vector space model to rank XML documents effectively over the proposed data synopses. We conduct extensive experiments over XML benchmark data to demonstrate the advantages of the indexing scheme and the effectiveness of our ranking scheme. We also compare our framework with Lucene to demonstrate that our extended TF*IDF scoring function is effective. |
URL | http://www.scopus.com/inward/record.url?eid=2-s2.0-84855362331&partnerID=40&md5=85a8a3de60a877da1d0a3495eaf27e22 |