When: Tuesday, May 7, 2019, 2:00 pm
Where: Informatics West, Room 232
Speaker: Jinhyuk Yun
Inequality in the formation of collaborative knowledge - the case of Wikipedia
See: J. Yun et al., Nature Human Behaviour 3, 155 (2019)
Abstract: Wikipedia and its sibling projects have served as a representative medium of the worldwide knowledge market, allowing individuals to share their knowledge in the information age. It has been commonly believed that such open-editing communal data sets accelerate the democratization of knowledge, shifting the possession of knowledge from a privileged class to the general public. However, recent studies have observed inequality in the distribution of authority and power among editors. One essential question is what mechanism underlies the spontaneous emergence of such unexpected authority. Until now, the lack of attention to data sets other than English Wikipedia has obscured the genuine nature of the editing dynamics of such communal data sets. In this study, we propose an unbiased framework encompassing every element of the Wikimedia projects. Our analysis, using the complete edit history of 267,304,095 articles from all 863 Wikimedia projects, reveals universal growth patterns across projects, regardless of category and language. The interplay between the number of edits, number of editors, number of articles, and total length of text is characterized by a single set of exponents. Moreover, we observe a rapid increase in the Gini coefficient and suggest that this entrenched inequality stems from the nature of such open-editing communal data sets. We introduce a generative model with short-term and long-term memory, which successfully elucidates the mechanism behind the oligarchy in Wikipedia.
Biography: Jinhyuk Yun is a senior research scientist at KISTI (Korea Institute of Science and Technology Information). He worked as a data scientist in the Search Division at Naver Corporation after receiving his Ph.D. in statistical physics from KAIST in 2016. His research aims to reveal the hidden structure and dynamics of complex systems using large-scale datasets and mathematical frameworks, with a focus on areas such as society, culture, media, and collective knowledge.