Title | Framework for mixed entity resolving system using unsupervised clustering |
Publication Type | Journal Article |
Year of Publication | 2010 |
Authors | On, B-W; Lee, I |
Journal | Journal of Digital Information Management |
Volume | 8 |
Issue | 6 |
Pagination | 362 - 368 |
Date Published | 2010 |
Keywords | Mixed entity resolution, Unsupervised clustering |
Abstract | During web search, confusion can arise from homonyms when users supply a non-unique value as the search term for an entity. In particular, when only part of an entity's name is used as its identifier, distinct entities become mixed under that identifier; we call this the mixed entity resolution problem, whose goal is to separate out the erroneous entities. For example, if only the last name is used as an identifier, we cannot distinguish "Vannessa Bush" from "George Bush." The mixed entity resolution problem is common in Web page data. In this paper, to resolve such mixed entities on the Web, we propose a prototype system that includes a web-service-based interface, an unsupervised clustering scheme, and cluster ranking algorithms. In particular, since the correct number of clusters is often unknown, we study a state-of-the-art unsupervised clustering solution based on propagation of pairwise similarities between entities. Experimental results show that our approach outperforms the main competing solution. |
URL | http://www.scopus.com/inward/record.url?eid=2-s2.0-79960657815&partnerID=40&md5=b7024f58cda537138350a05e3d88ac8c |