Read our latest paper, Social Dynamics of Science, in Nature Scientific Reports (doi:10.1038/srep01069). Authors Xiaoling Sun, Jasleen Kaur, Staša Milojević, Alessandro Flammini & Filippo Menczer ask: how do scientific disciplines emerge? No quantitative model to date allows us to validate competing theories on the different roles of endogenous processes, such as social collaborations, and exogenous events, such as scientific discoveries. Here we propose an agent-based model in which the evolution of disciplines is guided mainly by social interactions among agents representing scientists. Disciplines emerge from the splitting and merging of social communities in a collaboration network. We find that this social model can account for a number of stylized facts about the relationships between disciplines, scholars, and publications. These results provide strong quantitative support for the key role of social interactions in shaping the dynamics of science. While several “science of science” theories exist, this is the first account of the emergence of disciplines that is validated on the basis of empirical data.

Figure: Meme diffusion networks

In our paper Competition among memes in a world with limited attention, published in Nature Scientific Reports, Lilian Weng and coauthors Sandro Flammini, Alex Vespignani, and Fil Menczer report that the massive heterogeneity in the popularity and persistence of memes can be explained by a combination of the competition for our limited attention and the structure of the social network, without the need to assume different intrinsic values among ideas. The findings have been mentioned in the popular press, including Information Week, The Atlantic, and the Dutch daily NRC.

Dr. Mark Meiss

On December 16, Mark Meiss presented our paper “Modeling Traffic on the Web Graph” (with Bruno, José, Sandro, and Fil) at the 7th Workshop on Algorithms and Models for the Web Graph (WAW 2010), at Stanford. In this paper we introduce an agent-based model that explains many statistical features of aggregate and individual Web traffic data through realistic elements such as bookmarks, tabbed browsing, and topical interests.

We have been working on various types of agent-based modeling frameworks. We apply them in several of our projects, such as recommendation systems, RNA editing, evolving cellular automata, and artificial immune systems.

Agent

Semiotic agents as maintaining a generalized control relation with their environments (Cliff & Rocha 2000).

  • It has some degree of autonomy of action and decides what action to take next

  • It is distinguishable from its environment

  • It possesses some kind of identity that makes it identifiable in its environment


Agency

  • Strong sense of agency: a dynamically incoherent system-environment engagement or coupling.

  • Weak sense of agency: some degree of identity and autonomy in a dynamically coherent system-environment coupling.

Visit the Agent-based modeling page to learn more.

Project Members

Luis Rocha


Selected Project Publications

The Agent-Based T-Cell Cross-regulation Model for Document Classification

We have developed a solution for binary classification of textual documents inspired by T-cell cross-regulation in the vertebrate adaptive immune system, a complex adaptive system of millions of cells interacting to distinguish between self and nonself substances. By analogy, automatic document classification assumes that the interaction and co-occurrence of thousands of words in text can be used to identify conceptually related classes of documents: at a minimum, two classes with relevant and irrelevant documents for a given concept (e.g. articles with protein-protein interaction information). Our agent-based method for document classification expands the analytical model of Carneiro et al. by allowing us to deal simultaneously with many distinct populations of antigen-specific T-cells and their collective dynamics. We have extended this model to produce a spam-detection system. We have also developed our agent-based model further to apply it to biomedical article classification, testing it on a dataset of biomedical articles provided by the BioCreative 2.5 challenge. Our results are useful for biomedical text mining, but they also help us understand T-cell cross-regulation as a potential general principle of classification available to collectives of molecules without a central controller. While there is still much to learn about the specifics of T-cell cross-regulation in adaptive immunity, Artificial Life allows us to explore alternative emergent classification principles while producing useful bio-inspired tools. Recently, we started expanding this algorithm to other forms of classification, such as sensor data from human-robot interactions, under an IUCRG project.


Project Members

Luis Rocha

Al Abi-Haidar

Ian Wood



Funding

Project partially funded by:


Selected Project Publications

Selected Self-organization in Genotype-Phenotype Maps

Agent-based simulation of self-organizing, evolving agents with and without genotype-phenotype mappings


We are interested in the linguistic/symbolic aspects of the living organization (the gene as a carrier of information, and DNA as memory) which play a large role in the seemingly open-ended evolution defined by natural selection. This symbolic vision of biology (bio-semiotics), at first glance, seems to be at odds with notions of self-organization so dear to complex systems scientists and a more developmental approach to biology. Therefore, we have been studying the interplay between self-organization and natural selection (in embodied agents), introducing the concept of selected self-organization [Rocha, 1996a; Rocha, 1998a].

We are particularly interested in the problem of how information, symbols, representations and the like can arise from a purely dynamical system of many components. In addition to our work on collective computation and origin of representations, we have worked on simulations of evolving agents with different kinds of reproduction strategies (self-inspection and via a symbolic genotype-phenotype mapping). For these simulations we developed a genetic algorithm with an indirect encoding implemented with Fuzzy Development Programs, which model self-organizing development processes. More information on these simulations is available in the Fuzzy Development Programs’ Resource page, which contains publications and software for understanding and using them. You can also check a paper where these simulations are detailed. The figure depicts a run of our agent-based model in which agents that reproduce via a genotype-phenotype mapping completely overtake, within a few generations, a population that also contains agents that reproduce by self-inspection without such mappings.


Project Members

Luis Rocha

Wim Hordijk

Artemy Kolchinsky


Selected Project Publications

Agent with separate codotype and editype components of their genotype in our Evolutionary Model of Genotype Editing (Rocha et al., 2007).

Evolutionary models in theoretical biology at large, and computational biology and artificial life in particular, rarely deal with ontogenetic, non-inherited alteration of genetic information because they are based on a direct genotype-phenotype mapping. In contrast, in Nature several processes have been discovered which alter genetic information encoded in DNA before it is translated into amino-acid chains. Ontogenetically altered genetic information is not inherited but is extensively used in the regulation and development of phenotypes, giving organisms the ability to, in a sense, re-program their genotypes according to environmental cues. An example of post-transcriptional alteration of gene-encoding sequences is the process of RNA Editing. Our latest agent-based model of genotype editing presents a novel architecture for evolving agents in which coding and non-coding genetic components are allowed to coevolve. Our goal is twofold: (1) to study the role of RNA Editing regulation in the evolutionary process, and (2) to investigate the conditions under which genotype editing improves the optimization performance of evolutionary algorithms. We have shown that genotype editing allows evolving agents to perform better in several classes of fitness functions, both in static and dynamic environments. We are also investigating the ways in which the indirect genotype/phenotype mapping resulting from genotype editing leads to a better exploration/exploitation compromise in the search process. In the past year we developed an entirely new modeling platform in Python to run experiments to explore the evolutionary advantages of RNA editing.

Some characteristics of our model of RNA Editing:

Genome contains both coding and non-coding portions: Codome and Editome (Editosome)

  • Agents with editome perform better in changing environments

Study of regulation via non-coding DNA

  • Observe emergence of regulation with promoter signals
  • Memory of previous environments

Bio-inspired algorithm for optimization

  • Outperforms traditional evolutionary algorithms on many classes of functions

This research is described in greater detail in the separate Evolutionary Models of Genotype Editing page.
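
The separation between inherited and edited genetic information described above can be illustrated with a minimal Python sketch. It is not the group's modeling platform: the `Agent` class, the representation of editors as (pattern, replacement) pairs, and the bit-matching `fitness` function are all assumptions chosen for brevity. The sketch only shows that edits act on a copy of the codotype at evaluation time, so the inherited codotype stays unchanged.

```python
from dataclasses import dataclass


@dataclass
class Agent:
    codotype: str  # inherited coding string (bits here, purely for illustration)
    editype: list  # inherited editors, assumed to be (pattern, replacement) pairs

    def phenotype(self):
        """Return the edited copy of the codotype that gets evaluated."""
        edited = self.codotype
        for pattern, replacement in self.editype:
            edited = edited.replace(pattern, replacement)
        return edited


def fitness(agent, target):
    """Hypothetical fitness: count positions where the edited string matches target."""
    return sum(a == b for a, b in zip(agent.phenotype(), target))


# Same codotype, with and without an editor that rewrites "00" as "11".
edited_agent = Agent(codotype="110100", editype=[("00", "11")])
plain_agent = Agent(codotype="110100", editype=[])

edited_score = fitness(edited_agent, "111111")
plain_score = fitness(plain_agent, "111111")
```

In this toy case the editor raises the number of matching positions from three to five, while `edited_agent.codotype` is still the original string that offspring would inherit. A changing environment could be imitated by swapping the target string between evaluations.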

 

Project Members

Luis Rocha

Ana Maguitman

Chien-Feng Huang

Jonathan Frankel

Jasleen Kaur

Artemy Kolchinsky

 

Selected Project Publications

Complex Adaptive Systems and Computational Intelligence

We are a research group at Indiana University and the Instituto Gulbenkian de Ciencia working on complex systems. We are particularly interested in the informational properties of natural and artificial systems which enable them to adapt and evolve. This means both understanding how information is fundamental for the evolutionary capabilities of natural systems, as well as abstracting principles from natural systems to produce adaptive information technology.

Our research projects (see below) are on computational and systems biology, complex networks, text and literature mining, evolutionary systems, adaptive search and recommendation, cognitive science, artificial life, and biosemiotics. Additional information is available on Luis Rocha’s Website and our group page at the Instituto Gulbenkian de Ciencia.

For information on joining our group, see our Academics page. As a group, we are seriously interconnected with other research groups and networks: The Center for Complex Networks and Systems (CNets), Alife@IU, Biocomplexity Institute, Cognitive Science Program, Complex Systems & Networks, FLAD Computational Biology Collaboratorium, InfoVis Lab, Instituto Gulbenkian de Ciencia, Networks and Agents (NAN).

You are welcome to join our mailing list CASCI-L by either:

  • sending an e-mail to listserv@indiana.edu with subscribe CASCI-L in the body (with no subject), or
  • via the LISTSERV web interface at https://listserv.indiana.edu/cgi-bin/wa-iub.exe?HOME : click Subscriber’s Corner at the top of the page, search for “CASCI-L”, select it, and click Submit.

CASCI projects

Literature Mining

Biomedical Literature Mining

Collective Dynamics in Complex Biochemical Networks

Models of RNA Editing

Artificial Immune Systems

 Semi-metric Network Analysis

Network Analysis of Weighted and Fuzzy Graphs

 The Adaptive Web and Bio-inspired designs for Recommendation Systems

The Adaptive Web

Microarray Analysis

Genomic Multivariate Analysis

 Biosemiotics: interplay between self-organization and selection

Biosemiotics

Agent-based modeling

Uncertainty and Generalized Information Theory

Figure: structure of a logical web session

We study the structure and dynamics of Web traffic networks based on data from HTTP requests made by users at Indiana University. Gathering anonymized requests directly from the network rather than relying on server logs and browser instrumentation allows us to examine large volumes of traffic data while minimizing biases associated with other data sources. It also gives us valuable referrer information that we can use to reconstruct the subset of the Web graph actually traversed by users.

Our Web traffic (click) dataset is available!

Our goal is to develop a better understanding of user behavior online and to create more realistic models of Web traffic. The potential applications of this analysis include improved designs for networks, sites, and server software; more accurate forecasting of traffic trends; classification of sites based on the patterns of activity they inspire; and improved ranking algorithms for search results.

Among our more intriguing findings are that server traffic (as measured by number of clicks) and site popularity (as measured by distinct users) both follow distributions so broad that they lack any well-defined mean. Actual Web traffic turns out to violate three assumptions of the random surfer model: users don’t start from any page at random, they don’t follow outgoing links with equal probability, and their probability of jumping depends on their current location. Search engines appear to be directly responsible for a smaller share of Web traffic than is often supposed. These results were presented at WSDM 2008.
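
For reference, the random surfer model that these findings contradict can be sketched in a few lines of Python. This is a generic textbook-style simulation, not code from the project; the function name, the toy graph, and the jump probability of 0.15 are illustrative assumptions. The comments mark where each of the three assumptions enters.

```python
import random


def random_surfer(graph, steps, jump_prob=0.15, seed=0):
    """Simulate a random surfer and count how often each page is visited.

    graph maps each page to the list of pages it links to.
    """
    rng = random.Random(seed)
    pages = sorted(graph)
    visits = {page: 0 for page in pages}
    current = rng.choice(pages)  # assumption 1: start from a uniformly random page
    for _ in range(steps):
        visits[current] += 1
        links = graph[current]
        # assumption 3: the jump probability is the same on every page
        # (pages without outgoing links always jump)
        if not links or rng.random() < jump_prob:
            current = rng.choice(pages)  # jump to a uniformly random page
        else:
            current = rng.choice(links)  # assumption 2: all links equally likely
    return visits


# Hypothetical four-page graph; page "d" has no incoming links.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["a"]}
visits = random_surfer(toy_graph, steps=10_000)
```

In this baseline, a page such as "d" is reached only through random jumps, so its visit count is set by the uniform jump rule rather than by any user preference.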

Another paper (presented at Hypertext 2009) examined the conventional notion of a Web session as a sequence of requests terminated by an inactivity timeout. Such a definition turns out to yield statistics dependent primarily on the timeout value selected, which we find to be arbitrary. For that reason, we have proposed logical sessions defined by the target and referrer URLs present in a user’s Web requests.
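
The contrast between the two session definitions can be made concrete with a small Python sketch. It is a simplified reading of the idea, not the procedure from the paper: the request fields `time`, `referrer`, and `target`, the example URLs, and the rule for attaching a request to a session are all assumptions.

```python
def timeout_sessions(requests, timeout):
    """Split one user's time-ordered requests wherever the gap exceeds timeout."""
    sessions = []
    last_time = None
    for req in requests:
        if last_time is None or req["time"] - last_time > timeout:
            sessions.append([])
        sessions[-1].append(req)
        last_time = req["time"]
    return sessions


def logical_sessions(requests):
    """Group requests by following referrer-to-target links.

    A request whose referrer is missing or has not been seen as a target
    opens a new session; otherwise it joins the session that most recently
    requested its referrer URL.
    """
    sessions = []
    session_of_url = {}  # URL -> index of the session that last loaded it
    for req in requests:
        idx = session_of_url.get(req["referrer"])
        if idx is None:
            sessions.append([])
            idx = len(sessions) - 1
        sessions[idx].append(req)
        session_of_url[req["target"]] = idx
    return sessions


# Hypothetical click stream: two browser tabs used in an interleaved way.
requests = [
    {"time": 0, "referrer": None, "target": "news.example/"},
    {"time": 10, "referrer": None, "target": "mail.example/"},
    {"time": 20, "referrer": "news.example/", "target": "news.example/story"},
    {"time": 2000, "referrer": "mail.example/", "target": "mail.example/inbox"},
]

by_long_timeout = timeout_sessions(requests, timeout=1800)
by_short_timeout = timeout_sessions(requests, timeout=5)
by_referrer = logical_sessions(requests)
```

On this toy input the timeout definition yields two sessions with a 1800-second timeout and four with a 5-second timeout, whereas the referrer-based grouping yields two sessions, one per tab, with no timeout parameter to choose.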

Inspired by these findings, we designed a model of Web surfing able to recreate not only the broad distribution of traffic, but also the basic statistics of logical sessions. Late-breaking results were presented at WSDM 2009. Our final report on the ABC model was presented at WAW 2010.

Project Participants

Mark Meiss

Bruno Gonçalves

Fil Menczer, PI

Sandro Flammini

Jose Ramasco

Alex Vespignani

Santo Fortunato

Support

Mark Meiss was supported by the Advanced Network Management Laboratory, one of the Pervasive Technology Labs established at Indiana University with the assistance of the Lilly Endowment.
This research was also supported in part by the National Science Foundation under awards 0348940, 0513650, and 0705676.
This research was also supported in part by the Institute for Information Infrastructure Protection research program. The I3P is managed by Dartmouth College and supported under Award Number 2003-TK-TX-0003 from the U.S. DHS, Science and Technology Directorate.

Opinions, findings, conclusions, recommendations or points of view of this group are those of the authors and do not necessarily represent the official position of the U.S. Department of Homeland Security, Science and Technology Directorate, I3P, National Science Foundation, or Indiana University.