Congratulations to Rion Correia, who successfully defended his PhD dissertation on Prediction of Drug Interaction and Adverse Reactions, with data from Electronic Health Records, Clinical Reporting, Scientific Literature, and Social Media, using Complexity Science Methods. Dr. Correia’s research used network science, machine learning, and data science to uncover population-level associations of drugs and symptoms, useful for public health surveillance. His findings show that social media (Instagram and Twitter) and the electronic health records of an entire city in Southern Brazil are very useful for revealing how the drug interaction phenomenon varies across distinct groups. For instance, he identified gender biases and specific communities of interest in chronic disease (e.g., epilepsy and depression). In addition to Complex Networks and Systems, his dissertation contributes to the fields of biomedical informatics and precision public health by leveraging heterogeneous data sources at multiple levels to understand population and individual pharmacology differences and other public health problems.
Congratulations to Dimitar Nikolov, who successfully defended his PhD dissertation on Information Exposure Biases in Online Behaviors. Dr. Nikolov’s research explored the unintentional biases introduced by filtering, ranking, and recommendation algorithms that mediate our online consumption of information. His findings show that our reliance on modern online technologies limits exposure to diverse points of view and makes us vulnerable to misinformation. In particular, he analyzed two massive Web traffic datasets to quantify the popularity and homogeneity bias of several popular online platforms including social media, email, personalized news, and search engines. He also leveraged Twitter data to characterize the link between political partisanship and vulnerability to online pollution, such as fake news, conspiracy theories, and junk science. His dissertation contributes to the field of computational social science by putting the study of bias in information consumption and derived phenomena like political polarization, echo chambers, and online pollution on a firmer quantitative foundation.
Speaker: Ricardo Baeza-Yates, Universitat Pompeu Fabra, Spain & Universidad de Chile
Title: Data and Algorithmic Bias in the Web
Room: Info East 122
Abstract: The Web is the largest public big data repository that humankind has created. In this overwhelming data ocean we need to be aware of the quality and, in particular, of the biases that exist in this data, such as redundancy, spam, etc. These biases affect the algorithms that we design to improve the user experience. This problem is further exacerbated by biases that are added by these algorithms, especially in the context of search and recommendation systems. They include ranking bias, presentation bias, position bias, etc. We give several examples and their relation to sparsity, novelty, and privacy, stressing the importance of the user context to avoid these biases.
Bio: Ricardo Baeza-Yates’s areas of expertise are information retrieval, web search and data mining, data science, and algorithms. He was VP of Research at Yahoo Labs, based in Barcelona, Spain, and later in Sunnyvale, California, from January 2006 to February 2016. He is a part-time Professor at DTIC of the Universitat Pompeu Fabra, in Barcelona, Spain. Until 2004 he was Professor and founding director of the Center for Web Research at the Dept. of Computing Science of the University of Chile. He obtained a Ph.D. in CS from the University of Waterloo, Canada, in 1989. He is co-author of the best-selling textbook Modern Information Retrieval, published by Addison-Wesley in 2011 (2nd ed.), which won the ASIST 2012 Book of the Year award. From 2002 to 2004 he was elected to the board of governors of the IEEE Computer Society and in 2012 he was elected to the ACM Council. Since 2010 he has been a founding member of the Chilean Academy of Engineering. In 2009 he was named ACM Fellow and in 2011 IEEE Fellow, among other awards and distinctions.
This sabbatical is providing wonderful opportunities for me to present our work and establish/strengthen collaborations with several groups in Italy. Recently I have given invited seminars on social search at the Department of Informatics at the University of Torino (hosts Matteo Sereno and Mino Anglano) and on Web traffic at the Department of Math at the University of Padova (host Massimo Marchiori). In the next few weeks I will give a talk on social search at the Department of Informatics and Information Science at the University of Genova (host Marina Ribaudo) and one on search engine bias and Web modeling at my old stomping ground, the Institute of Cognitive Sciences and Technologies of the National Research Council in Rome (host my undergraduate advisor and mentor Domenico Parisi).
No, it’s not an Italian spin-off of the popular TV show. CSI Piemonte is organizing a meeting on Understanding Complexity: a Journey through Science to be held November 22-23 at the Lingotto Convention Center here in Torino. We will have demos and posters on 6S, GiveALink, and the egalitarian effect of search engines. I look forward in particular to seeing my good old friend Dario and my mentor, Domenico.
Search engines are not biased towards well-known Web sites. In fact, they actually produce an egalitarian effect as to where traffic is directed, say researchers at the Indiana University School of Informatics. Their study, Topical interests and the mitigation of search engine bias, appears in the Aug. 7-11 issue of the Proceedings of the National Academy of Sciences and challenges the “Googlearchy” theory – the perception that search engines push Web traffic toward popular sites, thus creating a monopoly over lesser-known sites.
The study was cited by New Scientist, MIT Technology Review, Scientific American MIND, New Scientist Online, UPI, VNUnet, Forskning & Framsteg (Sweden), Sole 24 Ore (Italy), Ars Technica, and Slashdot. Interviews aired on BBC World Service (MP3), Deutschlandradio (MP3), WFHB (MP3), and WFIU. Earlier, preliminary reports of our findings appeared in The Economist, Slashdot, PhysicsWeb, IDS, Le Scienze (Italian Edition of Scientific American), and IEEE Spectrum Online (see also our piece in IEEE Spectrum). Radio interviews were broadcast by Italian Radio (MP3 in Italian) and Swiss Radio (MP3 in Italian). Other news sources that picked up the story include Monsters and Critics, PhysOrg, TechNews Daily, Political Gateway, Daily India, ACM TechNews (Aug 9, Aug 28 2006), IT Week, Science Daily, EurekAlert, computing, LaboratoryTalk, PC World, SDA Asia, What PC, BrightSurf, PC Authority, TRN, and hundreds of blogs.
CenSEARCHip received intense coverage, including in Slashdot, Network World, PhysOrg, IDS, ACM TechNews, Technology News Daily, Computer World, CCNews, ePrairie, PC World, LaboratoryTalk, Search Engine Journal, USA Today, dozens of news sources around the world (including France, Sweden, Norway, Poland, Russia, Italy, Mexico, etc.), and many blogs around the world (list from technorati or google). A radio interview aired on WFIU, WIBC, and other NPR affiliates (20 March 2006).