Title | The Infocious Web search engine: Improving Web searching through linguistic analysis |
Publication Type | Journal Article |
Year of Publication | 2007 |
Authors | Ntoulas, A, Chao, G, Cho, J |
Journal | Journal of Digital Information Management |
Volume | 5 |
Issue | 5 |
Pagination | 277 - 291 |
Date Published | 2007 |
Keywords | Infocious search engine, Linguistic processing, Web pages, Web search engines |
Abstract | In this paper we present the Infocious Web search engine [23], which currently indexes more than 2 billion pages collected from the Web. The main goal of Infocious is to improve the way people find relevant information on the Web by resolving ambiguities present in natural language text. Towards this goal, Infocious performs linguistic analysis on the content of Web pages prior to indexing and exploits the output of this analysis when ranking and presenting results to users. Our hope is that this additional step of linguistic processing provides Infocious with two main advantages. First, Infocious tries to gain a deeper understanding of the content of Web pages and to better match users' queries with the indexed documents, improving the relevance of the returned results. Second, based on its linguistic processing, Infocious tries to organize and present the results to users in a structured and more intuitive way. In this paper we present the linguistic processing technologies that we investigated and/or incorporated into the Infocious search engine, and we discuss the main challenges in applying these technologies to Web documents. We also present the various components in the architecture of Infocious and describe how each of these components benefits from the added linguistic processing. Finally, we present preliminary results from our experimental study, which evaluates the effectiveness of the described linguistic analysis. |
URL | http://www.scopus.com/inward/record.url?eid=2-s2.0-70350638936&partnerID=40&md5=51ba97210c13f5678967a5c10ff4a58a |