Thanks to support from the Indiana University Network Science Institute (IUNI) and Digital Science Center (DSC), the full content of the Twitter data repository from the Observatory on Social Media (OSoMe) is now available to all IU researchers. Many tools to detect social bots, study the spread of fake news, visualize meme diffusion networks, trends, and maps, as well as APIs to access this data, have been available to the general public since mid-2016. Now, however, the IU research community can access enhanced data and content from the large collection, based on a 10% sample of all public tweets. A dedicated portal allows IU faculty and students to submit queries to the OSoMe cluster based on hashtags, URLs, keywords, geo-coordinates, and other criteria. At any time the system can search and retrieve data from the previous 18 months. We hope this resource will spur and support new research projects in all areas of computing, natural, and social sciences. Click here to read how to get access and learn more about the data, or attend our Open Science Forum!
LinkedIn announced that YY Ahn and his team of Ph.D. students from the Center for Complex Networks and Systems Research, including Yizhi Jing, Azadeh Nematzadeh, Jaehyuk Park, and Ian Wood, are among the 11 winners of the LinkedIn Economic Graph Challenge.
Their project, “Forecasting large-scale industrial evolution,” aims to understand the macro-evolution of industries to track businesses and emerging skills. This data would be used to forecast economic trends and guide professionals toward promising career paths.
“This is a fascinating opportunity to study the network of industries and people with unprecedented details and size. All of us are very excited to collaborate with LinkedIn and our LinkedIn mentor, Mike Conover, who is a recent Informatics PhD alumnus, on this topic,” said Ahn.
Congratulations to Przemyslaw Grabowicz, Luca Aiello, and Fil Menczer for winning the WICI Data Challenge. A prize of $10,000 CAD accompanies this award from the Waterloo Institute for Complexity and Innovation at the University of Waterloo. The Challenge called for tools and methods that improve the exploration, analysis, and visualization of complex-systems data. The winning entry, titled “Fast visualization of relevant portions of large dynamic networks,” is an algorithm that selects the subsets of nodes and edges that best represent an evolving graph and visualizes them either by creating a movie or by streaming them to an interactive network visualization tool. The algorithm is deployed in the movie generation tool of the Truthy system, which allows users to create, in near-real time, YouTube videos that illustrate the spread and co-occurrence of memes on Twitter. Przemek and Luca worked on this project while visiting CNetS in 2011 and collaborating with the Truthy team. Bravo!
UPDATE: With legal review completed, we re-launched Kinsey Reporter V.2!
CNetS, in collaboration with The Kinsey Institute, has released Kinsey Reporter, a global mobile survey platform for collecting and sharing anonymous data about sexual and other intimate behaviors. The pilot project allows citizen observers around the world to use free applications now available for Apple and Android mobile platforms to not only report on sexual behavior and experiences, but also to share, explore and visualize the accumulated data.
This new platform will allow us to explore issues that have been challenging to study until now, such as the prevalence of unreported sexual violence in different parts of the world, or the correlation between sexual practices, such as condom use, and the cultural, political, religious or health contexts of particular geographical areas.
The Kinsey Institute’s longstanding seminal studies of sexual behaviors created a perfect synergy with research going on at CNetS related to mining big data crowd-sourced from mobile social media. The sensitive domain — sexual relations — added an intriguing challenge in finding a way to share useful data with the community while protecting the privacy and anonymity of the reporting volunteers.
To foster the study of the structure and dynamics of Web traffic networks, we are making available to the research community a large Click Dataset of 53.5 billion HTTP requests collected at Indiana University. Between 2006 and 2010, our system generated data at a rate of about 60 million requests per day, or about 30 GB/day of raw data. We hope that this data will help develop a better understanding of user behavior online and create more realistic models of Web traffic. The potential applications of this data include improved designs for networks, sites, and server software; more accurate forecasting of traffic trends; classification of sites based on the patterns of activity they inspire; and improved ranking algorithms for search results.
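For readers planning storage or processing, the quoted figures support a quick back-of-the-envelope estimate. The sketch below uses only the numbers stated above and assumes that 1 GB means 10^9 bytes and that the quoted daily rate is a representative average. The function names are illustrative, and the total-size figure is an extrapolation rather than a published number.

```python
# Figures quoted in the announcement.
TOTAL_REQUESTS = 53.5e9      # HTTP requests in the Click Dataset
REQUESTS_PER_DAY = 60e6      # approximate collection rate
RAW_BYTES_PER_DAY = 30e9     # about 30 GB/day (assumption: 1 GB = 10**9 bytes)


def bytes_per_request(raw_bytes_per_day, requests_per_day):
    """Average raw size of one logged request, in bytes."""
    return raw_bytes_per_day / requests_per_day


def days_at_rate(total_requests, requests_per_day):
    """Number of days needed to log total_requests at a constant daily rate."""
    return total_requests / requests_per_day


def total_raw_terabytes(total_requests, per_request_bytes):
    """Extrapolated raw size of the whole dataset, in TB (10**12 bytes)."""
    return total_requests * per_request_bytes / 1e12


if __name__ == "__main__":
    per_request = bytes_per_request(RAW_BYTES_PER_DAY, REQUESTS_PER_DAY)
    print(f"Average raw size per request: {per_request:.0f} bytes")
    print(f"Days of collection at the quoted rate: "
          f"{days_at_rate(TOTAL_REQUESTS, REQUESTS_PER_DAY):.0f}")
    print(f"Extrapolated raw size: "
          f"{total_raw_terabytes(TOTAL_REQUESTS, per_request):.2f} TB")
```

Under these assumptions, each request averages about 500 bytes of raw data, the full collection corresponds to roughly 892 days at the quoted rate, and the raw data would total on the order of 27 TB. The day count is an equivalent at a constant rate, not a statement about the actual collection schedule.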