The aim of this project is to characterize, study, and model the sources of bias that emerge from the complex network structure of the Web and from the use of search engines. The feedback loops between users searching for information, users creating content, and the ranking algorithms of search engines that mediate between them lead to surprising results. We are studying how all these systems and communities influence and feed on each other in a dynamic information ecology, and how these interactions affect their evolution and their impact on the global processes of information discovery, retrieval, and utilization.

For example, by studying the relationship between Web traffic and PageRank, we have shown that, given the heterogeneity of topical interests expressed by search queries, search engines mitigate the popularity bias generated by the rich-get-richer structure of the Web graph. These results, which dispel the feared Googlearchy effect, have been published in Proc. Natl. Acad. Sci. USA, presented at the WAW 2006 keynote (slides), and generated some media attention. You can see some movies demonstrating the finding. The result also inspired a robust rank-based model of scale-free network growth, published in Phys. Rev. Lett. (press release).

We also study sources of bias that stem from legal, political, or economic factors. The CenSEARCHip tool visualizes the differences between results obtained from different search engines, or from different country versions of the same search engine. This tool, based on a technique described in this paper in First Monday, drew many reactions in the media and the blogosphere (press release).

Project Participants

Fil Menczer, PI

Sandro Flammini

Alex Vespignani

Santo Fortunato

Mark Meiss

Support

Mark Meiss is supported by the Advanced Network Management Laboratory, which is one of the Pervasive Technology Labs established at Indiana University with the assistance of the Lilly Endowment.
Santo Fortunato was supported by a Volkswagen Foundation grant.
This research is also supported in part by the National Science Foundation under awards 0348940, 0513650, and 0705676.

Opinions, findings, conclusions, recommendations or points of view of this group are those of the authors and do not necessarily represent the official position of the National Science Foundation, the Volkswagen Foundation, or Indiana University.

Search engines are not biased towards well-known Web sites. In fact, they produce an egalitarian effect on where traffic is directed, say researchers at the Indiana University School of Informatics. Their study, "Topical interests and the mitigation of search engine bias," appears in the Aug. 7-11 issue of the Proceedings of the National Academy of Sciences and challenges the “Googlearchy” theory: the perception that search engines push Web traffic toward popular sites, thus giving those sites a monopoly over lesser-known ones.

The study was cited by New Scientist, MIT Technology Review, Scientific American MIND, New Scientist Online, UPI, VNUnet, Forskning & Framsteg (Sweden), Sole 24 Ore (Italy), Ars Technica, and Slashdot. Interviews aired on BBC World Service (MP3), Deutschlandradio (MP3), WFHB (MP3), and WFIU. Earlier, preliminary reports of our findings appeared in The Economist, Slashdot, PhysicsWeb, IDS, Le Scienze (Italian Edition of Scientific American), and IEEE Spectrum Online (see also our piece in IEEE Spectrum). Radio interviews were broadcast by Italian Radio (MP3 in Italian) and Swiss Radio (MP3 in Italian). Other news sources that picked up the story include Monsters and Critics, PhysOrg, TechNews Daily, Political Gateway, Daily India, ACM TechNews (Aug 9, Aug 28 2006), IT Week, Science Daily, EurekAlert, computing, LaboratoryTalk, PC World, SDA Asia, What PC, BrightSurf, PC Authority, TRN, and hundreds of blogs.

CenSEARCHip received intense coverage, including in Slashdot, Network World, PhysOrg, IDS, ACM TechNews, Technology News Daily, Computer World, CCNews, ePrairie, PC World, LaboratoryTalk, Search Engine Journal, USA Today, dozens of news sources around the world (including France, Sweden, Norway, Poland, Russia, Italy, and Mexico), and many blogs around the world (list from Technorati or Google). A radio interview aired on WFIU, WIBC, and other NPR affiliates (20 March 2006).