Title | Distributed Web 2.0 crawling for ontology evolution |
Publication Type | Journal Article |
Year of Publication | 2009 |
Authors | Juffinger, A, Neidhart, T, Granitzer, M, Kern, R, Weichselbraun, A, Wohlgenannt, G, Scharl, A |
Journal | Journal of Digital Information Management |
Volume | 7 |
Issue | 2 |
Pagination | 114 - 119 |
Date Published | 2009 |
Keywords | Distributed crawling, Executed crawling, Ontology evolution, Ontology learning, Web 2.0 crawling |
Abstract | The World Wide Web as a social network reflects changes of interest in certain domains. It has been shown that free online content available through blogs, wikis, news media and online forums is a valuable source of information for identifying trends in certain domains. Utilizing this data, one can construct ontologies that describe this information and provide a semantically correct overview of a domain. Tracked over time, this also enables a user to identify trends and hypes. The decentralised structure of the Internet, the huge amount of data and upcoming Web 2.0 technologies pose several challenges to a crawling system for ontology learning, evolution and trend analysis. This paper presents a distributed crawling system with browser integration for Web 2.0. The proposed crawler is a high-performance Web data retrieval system designed to gather browser-equivalent textual Web content and prepare it for ontology learning. |
URL | http://www.scopus.com/inward/record.url?eid=2-s2.0-70350630600&partnerID=40&md5=7b88c38575eca72b620a0a6e113a8f55 |