Read our latest paper, “Social Dynamics of Science,” in Nature Scientific Reports. Authors Xiaoling Sun, Jasleen Kaur, Staša Milojević, Alessandro Flammini & Filippo Menczer ask: how do scientific disciplines emerge? No quantitative model to date allows us to validate competing theories on the different roles of endogenous processes, such as social collaborations, and exogenous events, such as scientific discoveries. Here we propose an agent-based model in which the evolution of disciplines is guided mainly by social interactions among agents representing scientists. Disciplines emerge from the splitting and merging of social communities in a collaboration network. We find that this social model can account for a number of stylized facts about the relationships between disciplines, scholars, and publications. These results provide strong quantitative support for the key role of social interactions in shaping the dynamics of science. While several “science of science” theories exist, this is the first account of the emergence of disciplines that is validated on the basis of empirical data.

In our paper on Competition among memes in a world with limited attention in Nature Scientific Reports, Lilian Weng and coauthors Sandro Flammini, Alex Vespignani, and Fil Menczer report that we can explain the massive heterogeneity in the popularity and persistence of memes as deriving from a combination of the competition for our limited attention and the structure of the social network, without the need to assume different intrinsic values among ideas. The findings have been mentioned in the popular press, including Information Week, The Atlantic, and the Dutch daily NRC.
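The interplay of limited attention and network structure can be illustrated with a toy simulation. This is only a minimal sketch, not the model from the paper: the posting rule, the parameter names (`p_new`, `memory`) and the function `simulate_memes` are assumptions made for illustration. All memes are treated identically, so any differences in popularity come from competition for bounded attention and from who is connected to whom.

```python
import random
from collections import Counter, deque

def simulate_memes(neighbors, steps, p_new=0.3, memory=5, seed=0):
    """Toy limited-attention meme model (illustrative parameters only).

    neighbors: dict mapping each agent to the list of agents who see its posts.
    Returns a Counter of how many times each meme was posted.
    """
    rng = random.Random(seed)
    # Each agent's attention is a bounded queue: when it is full,
    # the oldest meme is silently dropped as a new one arrives.
    screens = {agent: deque(maxlen=memory) for agent in neighbors}
    popularity = Counter()
    agents = sorted(neighbors)
    next_id = 0
    for _ in range(steps):
        agent = rng.choice(agents)
        if not screens[agent] or rng.random() < p_new:
            # Post a brand-new meme; no meme has any intrinsic advantage.
            meme = next_id
            next_id += 1
        else:
            # Repost something currently within the agent's attention.
            meme = rng.choice(list(screens[agent]))
        popularity[meme] += 1
        for friend in neighbors[agent]:
            screens[friend].appendleft(meme)
    return popularity

# Hypothetical four-agent network used only to exercise the sketch.
network = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
counts = simulate_memes(network, steps=500, seed=1)
```

Plotting the distribution of `counts.values()` for larger networks is one way to explore how popularity spreads out even though every meme starts out equal.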

On December 16, Mark Meiss presented our paper “Modeling Traffic on the Web Graph” (with Bruno, José, Sandro, and Fil) at the 7th Workshop on Algorithms and Models for the Web Graph (WAW 2010), at Stanford. In this paper we introduce an agent-based model that explains many statistical features of aggregate and individual Web traffic data through realistic elements such as bookmarks, tabbed browsing, and topical interests.
We have been working on various types of agent-based modeling frameworks. We apply them in several of our projects, such as recommendation systems, RNA editing, evolving cellular automata, and artificial immune systems.
Agent

We view semiotic agents as maintaining a generalized control relation with their environments (Joslyn & Rocha 2000). An agent:
- has some degree of autonomy of action and decides what action to take next
- is distinguishable from its environment
- possesses some kind of identity that makes it identifiable in its environment
Agency
- We refer to dynamically incoherent system-environment engagement or coupling as the strong sense of agency, and to the view of agency as some degree of identity and autonomy in dynamically coherent system-environment coupling as the weak sense of agency.
Visit the Agent-based modeling page to learn more.
Project Members
Selected Project Publications
- Joslyn, Cliff and Luis M. Rocha [2000]. “Towards Semiotic Agent-Based Models of Socio-Technical Organizations”. Proc. AI, Simulation and Planning in High Autonomy Systems (AIS 2000) Conference, Tucson, Arizona, USA. H.S. Sarjoughian et al. (Eds.), pp. 70-79.
- Rocha, Luis M. [2000]. “Syntactic autonomy, cellular automata, and RNA editing: or why self-organization needs symbols to evolve and how it might evolve them”. In: Closure: Emergent Organizations and Their Dynamics. J.L.R. Chandler and G. Van de Vijver (Eds.). Annals of the New York Academy of Sciences. Vol. 901, pp. 207-223.
- Rocha, Luis M. [1999]. “Complex Systems Modeling: Using Metaphors From Nature in Simulation and Scientific Models”. In: BITS: Computer and Communications News. Computing, Information, and Communications Division. Los Alamos National Laboratory. November 1999.
We have developed a bio-inspired solution for binary classification of textual documents inspired by T-cell cross-regulation in the vertebrate adaptive immune system, which is a complex adaptive system of millions of cells interacting to distinguish between self and nonself substances. In analogy, automatic document classification assumes that the interaction and co-occurrence of thousands of words in text can be used to identify conceptually related classes of documents: at a minimum, two classes with relevant and irrelevant documents for a given concept (e.g. articles with protein-protein interaction information). Our agent-based method for document classification expands the analytical model of Carneiro et al. by allowing us to deal simultaneously with many distinct populations of antigen-specific T-cells and their collective dynamics. We have extended this model to produce a spam-detection system. We have also developed our agent-based model further to apply it to biomedical article classification, testing it on a dataset of biomedical articles provided by the BioCreative 2.5 challenge. Our results are useful for biomedical text mining, but they also help us understand T-cell cross-regulation as a potential general principle of classification available to collectives of molecules without a central controller. While there is still much to learn about the specifics of T-cell cross-regulation in adaptive immunity, Artificial Life allows us to explore alternative emergent classification principles while producing useful bio-inspired tools. Recently, we started extending this algorithm to other forms of classification, such as sensor data from human-robot interactions, under an IUCRG project.
Project Members
Funding
Project partially funded by:
- Indiana University Collaborative Research Grants 2013. Project title: “Social SLAM: Creating Dynamical Socio-Environmental Models for Mobile Robots”.
- IARPA Contract: Early Model-Based Event Recognition with Surrogates (EMBERS), 2012-2014.
Selected Project Publications
- A. Abi-Haidar [2011]. “An adaptive document classifier inspired by T-Cell cross-regulation in the immune system”. PhD Dissertation, Indiana University.
- A. Abi-Haidar and L.M. Rocha [2011]. “Collective Classification of Textual Documents by Guided Self-Organization in T-Cell Cross-Regulation Dynamics”. Evolutionary Intelligence. 4(2):69-80. DOI: 10.1007/s12065-011-0052-5.
- A. Abi-Haidar and L.M. Rocha [2010]. “Collective Classification of Biomedical Articles using T-Cell Cross-regulation”. In: Artificial Life XII: Twelfth International Conference on the Simulation and Synthesis of Living Systems. H. Fellermann et al. (Eds.). MIT Press, pp. 706-713.
- A. Abi-Haidar and L.M. Rocha [2010]. “Biomedical Article Classification Using an Agent-Based Model of T-Cell Cross-Regulation”. In: Artificial Immune Systems: 9th International Conference (ICARIS 2010). E. Hart, C. McEwan, J. Timmis, and A. Hone (Eds.). Lecture Notes in Computer Science. Springer-Verlag, 6209: 237-249. Recipient of the Best Paper Award at ICARIS 2010.
- A. Abi-Haidar and L.M. Rocha [2008]. “Adaptive Spam Detection Inspired by a Cross-Regulation Model of Immune Dynamics: A Study of Concept Drift”. In: Artificial Immune Systems: 7th International Conference (ICARIS 2008). P. Bentley, D. Lee, and S. Jung (Eds.). Lecture Notes in Computer Science. Springer-Verlag, 5132: 36-47.
- A. Abi-Haidar and L.M. Rocha [2008]. “Adaptive Spam Detection Inspired by the Immune System”. In: Artificial Life XI: Eleventh International Conference on the Simulation and Synthesis of Living Systems. S. Bullock, J. Noble, R. A. Watson, and M. A. Bedau (Eds.). MIT Press, pp. 1-8.
Selected Self-organization in Genotype-Phenotype Maps

Agent-based simulation of self-organizing, evolving agents with and without genotype-phenotype mappings
We are interested in the linguistic/symbolic aspects of the living organization (the gene as a carrier of information, and DNA as memory) which play a large role in the seemingly open-ended evolution defined by natural selection. This symbolic vision of biology (bio-semiotics), at first glance, seems to be at odds with notions of self-organization so dear to complex systems scientists and a more developmental approach to biology. Therefore, we have been studying the interplay between self-organization and natural selection (in embodied agents), introducing the concept of selected self-organization [Rocha, 1996a; Rocha, 1998a].
We are particularly interested in the problem of how information, symbols, representations and the like can arise from a purely dynamical system of many components. In addition to our work on collective computation and the origin of representations, we have worked on simulations of evolving agents with different kinds of reproduction strategies (by self-inspection and via a symbolic genotype-phenotype mapping). For these simulations we developed a genetic algorithm with an indirect encoding implemented with Fuzzy Development Programs, which model self-organizing development processes. More information on these simulations is available on the Fuzzy Development Programs’ Resource page, which contains publications and software for understanding and using them. You can also check a paper where these simulations are detailed. The figure depicts a run of our agent-based model in which agents that reproduce via a genotype-phenotype mapping take over, within a few generations, a population that also contains agents that reproduce by self-inspection without such mappings.
Project Members
Selected Project Publications
- L.M. Rocha [2007]. “Reality is Stranger than Fiction: What can Artificial Life do about Advances in Biology?”. Invited presentation for the “Biocomplexity” discussion section at the 9th European Conference on Artificial Life, September 12, 2007 in Lisbon, Portugal.
- Rocha, Luis M. and W. Hordijk [2005]. “Material Representations: From the Genetic Code to the Evolution of Cellular Automata”. Artificial Life. 11 (1-2), pp. 189-214.
- Rocha, Luis M. [2001]. “Evolution with Material Symbol Systems”. Biosystems. Vol. 60, pp. 95-121.
- Rocha, Luis M. (Ed.) [2001]. The Physics and Evolution of Symbols and Codes. BioSystems Vol. 60, No. 1-3. Editorial: Biosystems Vol. 60, pp. 1-4.
- Rocha, Luis M. [2000]. “Syntactic autonomy, cellular automata, and RNA editing: or why self-organization needs symbols to evolve and how it might evolve them”. In: Closure: Emergent Organizations and Their Dynamics. J.L.R. Chandler and G. Van de Vijver (Eds.). Annals of the New York Academy of Sciences. Vol. 901, pp. 207-223.
- Rocha, Luis M. [1998]. “Selected Self-Organization and the Semiotics of Evolutionary Systems”. In: Evolutionary Systems: The Biological and Epistemological Perspectives on Selection and Self-Organization. S. Salthe, G. Van de Vijver, and M. Delpos (Eds.). Kluwer Academic Publishers, pp. 341-358.
- Rocha, Luis M. and Cliff Joslyn [1998]. “Simulations of Evolving Embodied Semiosis: Emergent Semantics in Artificial Environments”. Simulation Series; Vol. 30, (2), pp. 233-238.
- Rocha, Luis M. [1998]. “Syntactic Autonomy”. In: Proceedings of the Joint Conference on the Science and Technology of Intelligent Systems (ISIC/CIRA/ISAS 98). National Institute of Standards and Technology, Gaithersburg, MD. IEEE Press, pp. 706-711.
- Rocha, Luis M. [1997]. “Evidence Sets and Contextual Genetic Algorithms: Exploring Uncertainty, Context, and Embodiment in Cognitive and Biological Systems”. PhD Dissertation. State University of New York at Binghamton.
- Rocha, Luis M. [1996]. “Eigenbehavior and symbols”. In: Systems Research Vol. 13, No. 3, pp. 371-384.
- Rocha, Luis M. [1995]. “Contextual Genetic Algorithms: Evolving Developmental Rules”. In: Advances in Artificial Life. F. Moran, A. Moreno, J.J. Merelo, and P. Chacon (Eds.). Series: Lecture Notes in Artificial Intelligence, Springer-Verlag, pp. 368-382.

Agent with separate codotype and editype components of their genotype in our Evolutionary Model of Genotype Editing. Rocha, et al (2007)
Evolutionary models in theoretical biology at large, and computational biology and artificial life in particular, rarely deal with ontogenetic, non-inherited alteration of genetic information because they are based on a direct genotype-phenotype mapping. In contrast, several processes have been discovered in Nature that alter genetic information encoded in DNA before it is translated into amino-acid chains. Ontogenetically altered genetic information is not inherited but is extensively used in the regulation and development of phenotypes, giving organisms the ability to, in a sense, re-program their genotypes according to environmental cues. An example of post-transcriptional alteration of gene-encoding sequences is the process of RNA Editing. Our latest agent-based model of genotype editing presents a novel architecture for evolving agents in which coding and non-coding genetic components are allowed to coevolve. Our goal is twofold: (1) to study the role of RNA Editing regulation in the evolutionary process, and (2) to investigate the conditions under which genotype edition improves the optimization performance of evolutionary algorithms. We have shown that genotype edition allows evolving agents to perform better in several classes of fitness functions, both in static and dynamic environments. We are also investigating the ways in which the indirect genotype/phenotype mapping resulting from genotype editing leads to a better exploration/exploitation compromise in the search process. In the past year we developed an entirely new modeling platform in Python to run experiments that explore the evolutionary advantages of RNA editing.
Some characteristics of our model of RNA Editing:
- Genome contains both coding and non-coding portions: Codome and Editome (Editosome)
  - Agents with editome perform better in changing environments
- Study of regulation via non-coding DNA
  - Observe emergence of regulation with promoter signals
  - Memory of previous environments
- Bio-inspired algorithm for optimization
  - Outperforms traditional evolutionary algorithms on many classes of functions
This research is described in greater detail in the separate Evolutionary Models of Genotype Editing page.
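The central idea, that an edited copy of the genotype is expressed while the unedited genotype is what gets inherited, can be sketched in a few lines of Python. This is a hedged illustration rather than the published model: representing the codome as a bit string, representing each editor as a hypothetical `(tag, replacement)` pair, and the function names `edit_genotype` and `evaluate` are all assumptions made here.

```python
def edit_genotype(codome, editome):
    """Return the edited copy of `codome` that is expressed as the phenotype.

    codome: bit string holding the coding portion of the genome.
    editome: list of (tag, replacement) editors (a hypothetical encoding);
             wherever an editor's tag matches, it rewrites that substring.
    """
    expressed = codome
    for tag, replacement in editome:
        expressed = expressed.replace(tag, replacement)
    return expressed

def evaluate(agent, fitness):
    """Score an agent on its edited genotype; the agent itself is unchanged."""
    codome, editome = agent
    return fitness(edit_genotype(codome, editome))

# Illustrative fitness: number of 1-bits in the expressed string.
count_ones = lambda bits: bits.count("1")

agent = ("110100", [("10", "11")])
score = evaluate(agent, count_ones)   # fitness uses the edited string "111110"
inherited = agent[0]                  # offspring still receive "110100"
```

Because selection acts on `score` while reproduction copies the unedited `agent`, the codome and the editome can coevolve, which is the property the model description emphasizes.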
Project Members
Selected Project Publications
- L.M. Rocha and J. Kaur [2007]. “Genotype Editing and the Evolution of Regulation and Memory”. Proceedings of the 9th European Conference on Artificial Life. Lecture Notes in Artificial Intelligence (LNAI), 4648: 63-73 (Springer-Verlag).
- C. Huang, J. Kaur, A. Maguitman, L.M. Rocha [2007]. “Agent-Based Model of Genotype Editing”. Evolutionary Computation, 15(3): 253-89.
- Rocha, L.M., A. Maguitman, C. Huang, J. Kaur, and S. Narayanan [2006]. “An Evolutionary Model of Genotype Editing”. In: Artificial Life 10: Tenth International Conference on the Simulation and Synthesis of Living Systems. L.M. Rocha, L. Yaeger, M. Bedau, D. Floreano, R. Goldstone, and A. Vespignani (Eds.). MIT Press, In Press.
Complex Adaptive Systems and Computational Intelligence
We are a research group at Indiana University and the Instituto Gulbenkian de Ciencia working on complex systems. We are particularly interested in the informational properties of natural and artificial systems which enable them to adapt and evolve. This means both understanding how information is fundamental for the evolutionary capabilities of natural systems, as well as abstracting principles from natural systems to produce adaptive information technology.
Our research projects (see below) are on computational and systems biology, complex networks, text and literature mining, evolutionary systems, adaptive search and recommendation, cognitive science, artificial life, and biosemiotics. Additional information is available on Luis Rocha’s Website and our group page at the Instituto Gulbenkian de Ciencia.
For information on joining our group, see our Academics page. As a group, we are closely interconnected with other research groups and networks: The Center for Complex Networks and Systems (CNets), Alife@IU, Biocomplexity Institute, Cognitive Science Program, Complex Systems & Networks, FLAD Computational Biology Collaboratorium, InfoVis Lab, Instituto Gulbenkian de Ciencia, Networks and Agents (NAN).
You are welcome to join our mailing list CASCI-L by either:
- sending an e-mail to listserv@indiana.edu with subscribe CASCI-L in the body (with no subject), or
- using the LISTSERV web interface: https://listserv.indiana.edu/cgi-bin/wa-iub.exe?HOME ; click Subscriber’s Corner at the top of the page, search for “CASCI-L”, select it, and click Submit.
CASCI projects
We study the structure and dynamics of Web traffic networks based on data from HTTP requests made by users at Indiana University. Gathering anonymized requests directly from the network rather than relying on server logs and browser instrumentation allows us to examine large volumes of traffic data while minimizing biases associated with other data sources. It also gives us valuable referrer information that we can use to reconstruct the subset of the Web graph actually traversed by users.
Our Web traffic (click) dataset is available!
Our goal is to develop a better understanding of user behavior online and to create more realistic models of Web traffic. The potential applications of this analysis include improved designs for networks, sites, and server software; more accurate forecasting of traffic trends; classification of sites based on the patterns of activity they inspire; and improved ranking algorithms for search results.
Among our more intriguing findings are that server traffic (as measured by number of clicks) and site popularity (as measured by distinct users) both follow distributions so broad that they lack any well-defined mean. Actual Web traffic turns out to violate three assumptions of the random surfer model: users don’t start from any page at random, they don’t follow outgoing links with equal probability, and their probability of jumping is dependent on their current location. Search engines appear to be directly responsible for a smaller share of Web traffic than often supposed. These results were presented at WSDM 2008.
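For reference, the three random-surfer assumptions that the traffic data contradict can be made explicit in a short simulation. This is a minimal sketch of the textbook random surfer, not of our traffic model; the function name `random_surfer`, the value of `jump_prob`, and the tiny example graph are illustrative assumptions.

```python
import random

def random_surfer(links, steps, jump_prob=0.15, seed=0):
    """Simulate the classic random surfer and return visit counts per page.

    links: dict mapping each page to the list of pages it links to
           (every link target is assumed to also be a key of `links`).
    """
    rng = random.Random(seed)
    pages = sorted(links)
    visits = {page: 0 for page in pages}
    # Assumption 1: the surfer starts from a page chosen uniformly at random.
    page = rng.choice(pages)
    for _ in range(steps):
        visits[page] += 1
        outgoing = links[page]
        # Assumption 3: the jump probability is the same on every page.
        # (A page without outgoing links always forces a jump.)
        if not outgoing or rng.random() < jump_prob:
            page = rng.choice(pages)
        else:
            # Assumption 2: every outgoing link is equally likely to be followed.
            page = rng.choice(outgoing)
    return visits

# Hypothetical four-page graph used only to exercise the sketch.
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": []}
visits = random_surfer(graph, steps=1000, seed=3)
```

Relaxing any of the three commented assumptions, for example by weighting `rng.choice(outgoing)` with observed click counts, is one way to compare such a baseline against measured traffic.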
Another paper (presented at Hypertext 2009) examined the conventional notion of a Web session as a sequence of requests terminated by an inactivity timeout. Such a definition turns out to yield statistics that depend primarily on the timeout value selected, and we find any choice of that value to be arbitrary. For that reason, we have proposed logical sessions defined by the target and referrer URLs present in a user’s Web requests.
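One way to picture a session definition based on referrers rather than timeouts is the following sketch. The grouping rule used here is an assumption chosen for illustration (a request joins the session that most recently reached its referrer, and a request with no known referrer opens a new session); the exact rules in the paper may differ, and the name `logical_sessions` is hypothetical.

```python
def logical_sessions(requests):
    """Group one user's requests into sessions using referrer information.

    requests: time-ordered list of (referrer, target) URL pairs, where
              referrer is None when the request carries no referrer.
    Returns a list of sessions, each a list of (referrer, target) pairs.
    """
    sessions = []
    owner = {}  # URL -> index of the session that most recently reached it
    for referrer, target in requests:
        if referrer in owner:
            # The referrer was reached earlier: continue that session,
            # no matter how much time has passed.
            index = owner[referrer]
        else:
            # Unknown or missing referrer: a new session starts here.
            index = len(sessions)
            sessions.append([])
        sessions[index].append((referrer, target))
        owner[target] = index
    return sessions

# Two interleaved browsing threads, as might occur with tabbed browsing.
clicks = [(None, "a"), ("a", "b"), (None, "x"), ("a", "c"), ("x", "y")]
sessions = logical_sessions(clicks)
# sessions[0] holds the thread rooted at "a", sessions[1] the one rooted at "x"
```

Note that no timeout parameter appears anywhere, so the resulting statistics cannot depend on one.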
Inspired by these findings, we designed a model of Web surfing able to recreate not only the broad distribution of traffic, but also the basic statistics of logical sessions. Late-breaking results were presented at WSDM 2009. Our final report on the ABC model was presented at WAW 2010.
Project Participants
Support
Mark Meiss was supported by the Advanced Network Management Laboratory, one of the Pervasive Technology Labs established at Indiana University with the assistance of the Lilly Endowment.
This research was also supported in part by the National Science Foundation under awards 0348940, 0513650, and 0705676.
This research was also supported in part by the Institute for Information Infrastructure Protection (I3P) research program. The I3P is managed by Dartmouth College and supported under Award Number 2003-TK-TX-0003 from the U.S. DHS, Science and Technology Directorate.
Opinions, findings, conclusions, recommendations or points of view of this group are those of the authors and do not necessarily represent the official position of the U.S. Department of Homeland Security, Science and Technology Directorate, I3P, National Science Foundation, or Indiana University.



























