Link analysis algorithms leverage hyperlinks created by authors as semantic endorsements between pages, while social bookmarks provide a way to leverage annotations by information consumers as a source of information about pages. This project explores a novel approach that is a synergy of the two: soliciting annotations from users about the content of pages, in a way that implicitly forms networks of relationships between and among resources and tags. These socially generated relationships are then aggregated to build bottom-up, global semantic similarity networks. Algorithms are developed to construct, analyze, and mine these networks in support of search and recommendation applications, exploratory navigation interfaces, resource management utilities, tag spam detection, and incentive games to accelerate the achievement of critical mass.
To extrapolate both annotations about content (tags) and semantic relationships (similarity) from single users to the “wisdom of the crowd,” the project investigates an information-theoretic model that extracts semantic assessments from information structures that many users already maintain, namely the bookmarks and tags they manage in their browsers or online. This entails the design and evaluation of several network-based measures and algorithms, such as similarity, novelty, centrality, and focus. Among the aims of this model are the exploration of the duality between resources (URLs) and concepts (tags or categories) and the integration of social annotation and collaborative filtering. One way to provide users with immediate value is to integrate client-based taxonomies and server-based folksonomies for social bookmark management. Both traditional users of browser bookmarks and social users of online bookmarks can then take advantage of the same semantic maps while retaining the convenience of intuitive browser interfaces and centralized storage.
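As an illustration of how a similarity network might be derived from bookmark structures, the sketch below applies a Lin-style information-theoretic similarity to each user's folder hierarchy and then averages the per-user scores. This is a minimal sketch under stated assumptions, not necessarily the project's actual model: the choice of Lin's measure, the averaging step, the function names `lin_similarities` and `aggregate_similarities`, and the representation of a bookmark file as a mapping from URL to folder path are all hypothetical.

```python
import math
from collections import defaultdict
from itertools import combinations


def lin_similarities(bookmarks):
    """Lin-style similarity for every URL pair in ONE user's bookmarks.

    bookmarks: dict mapping url -> tuple of folder names (path from the root).
    Returns a dict mapping frozenset({url_a, url_b}) -> similarity in [0, 1].
    Assumption: each URL appears once per user, so P(url) = 1 / n.
    """
    n = len(bookmarks)
    # Number of URLs stored under each folder prefix; () is the root.
    counts = defaultdict(int)
    for path in bookmarks.values():
        for depth in range(len(path) + 1):
            counts[path[:depth]] += 1

    sims = {}
    for (u, pu), (v, pv) in combinations(bookmarks.items(), 2):
        # Lowest common ancestor folder = longest shared path prefix.
        k = 0
        while k < min(len(pu), len(pv)) and pu[k] == pv[k]:
            k += 1
        # Lin: 2 log P(ancestor) / (log P(u) + log P(v)).
        # With P(u) = P(v) = 1/n this simplifies to log(n / count) / log(n).
        sims[frozenset((u, v))] = math.log(n / counts[pu[:k]]) / math.log(n)
    return sims


def aggregate_similarities(users):
    """Average per-user similarities over all users (an assumed aggregation).

    A pair that a user has not bookmarked together contributes 0 for that user.
    """
    total = defaultdict(float)
    for bookmarks in users:
        for pair, s in lin_similarities(bookmarks).items():
            total[pair] += s
    return {pair: s / len(users) for pair, s in total.items()}
```

Under these assumptions, two URLs filed in the same small folder score close to 1 for that user, while two URLs whose only common ancestor is the root score 0. The aggregated scores can be read as edge weights of a global similarity network between resources.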
Strategic collaborations to share data, accelerate evaluation, and maximize impact are under way with key groups in Europe through the TAGora Project and its partners at Rome Sapienza, Sony Paris, the ISI Foundation in Torino, and the BibSonomy group at Kassel University. GiveALink.org (supported by a wonderful computing and storage infrastructure) is an open social bookmarking platform developed to experiment with and demonstrate the ideas of this project. The algorithms and data generated by the project are made available to the Web community to facilitate analysis, the development of improved network algorithms, and integration with other Internet applications. Early results of this project have been presented at various conferences and workshops, including LinkKDD2005, AAAI2006, and HT2008. More recent publications are listed below. To learn more, donate your bookmarks, play with our system, and download our data and applications, please visit GiveALink.org.
Collaborators & Alumni:
We should also acknowledge Todd Holloway for his contributions to the early search engine; Luis Rocha and Ana Maguitman for suggesting the idea of ranking and searching by novelty; Mark Meiss, who thought of the catchy name for GiveALink; and Rob Henderson, quite possibly the greatest sysadmin around.
This project is supported by the National Science Foundation under award IIS-0811994: Social Integration of Semantic Annotation Networks for Web Applications. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.