
Meltdown: Why Our Systems Fail and What We Can Do About It


Authors and related work

  • Chris Clearfield 
    • Chris is the founder of System Logic, an independent research and consulting firm focusing on the challenges posed by risk and complexity. He previously worked as a derivatives trader at Jane Street, a quantitative trading firm, in New York, Tokyo, and Hong Kong, where he analyzed and devised mitigations for the financial and regulatory risks inherent in the business of technologically complex high-speed trading. He has written about catastrophic failure, technology, and finance for The Guardian, Forbes, the Harvard Kennedy School Review, the popular science magazine Nautilus, and the Harvard Business Review blog.
  • András Tilcsik
    • András holds the Canada Research Chair in Strategy, Organizations, and Society at the University of Toronto’s Rotman School of Management. He has been recognized as one of the world’s top forty business professors under forty and as one of thirty management thinkers most likely to shape the future of organizations. The United Nations named his course on organizational failure as the best course on disaster risk management in a business school. 
  • How to Prepare for a Crisis You Couldn’t Possibly Predict
    • Over the past five years, we have studied dozens of unexpected crises in all sorts of organizations and interviewed a broad swath of people — executives, pilots, NASA engineers, Wall Street traders, accident investigators, doctors, and social scientists — who have discovered valuable lessons about how to prepare for the unexpected. Here are three of those lessons.


This book looks at the underlying reasons for accidents that emerge from complexity, and at how diversity can mitigate them. It builds on Charles Perrow’s concept of Normal Accidents as a property of high-risk systems.

Normal Accidents are unpredictable yet inevitable combinations of small failures that build upon each other within an unforgiving environment. Normal accidents include catastrophic failures such as reactor meltdowns, airplane crashes, and stock market collapses. Though each failure is unique, all of these failures share common properties:

    • The system’s components are tightly coupled: a change in one place has rapid consequences elsewhere.
    • The system is densely connected, so that the actions of one part affect many others.
    • The system’s internals are difficult to observe, so that failure can appear without warning.

What happens in all these accidents is that effort gets misdirected in a way that makes the problem worse. Often, this is because the humans in the system are too homogeneous. They all see the problem from the same perspective, and they all implicitly trust each other (tight coupling and dense connection).

The addition of diversity is a way to solve this problem. Diversity does three things:

    • It provides additional perspectives on the problem. This only works if there is a large enough representation of diverse groups that they do not succumb to social pressure.
    • It lowers the amount of trust within the group, so that proposed solutions are exposed to a higher level of skepticism.
    • It slows the process down, making the solution less reflexive and more thoughtful.

Designing systems to be transparent, loosely coupled, and sparsely connected reduces the risk of catastrophe. If that’s not possible, ensure that the people involved in the system are diverse.

My more theoretical thoughts:

There are two factors that affect the response of the network: the level of connectivity and the stiffness of the links. When the nodes have a velocity component, a sufficiently stiff network (either many somewhat-stiff links or a few very stiff ones) has to move as a single entity. Nodes with sparse and slack connections make safe systems, but not responsive ones. Stiff, homogeneous networks (similarity is an implicit stiff coupling) are prone to stampede. Think of a ball rolling down a hill as opposed to a lump of jello.

When all the nodes are pushing in the same direction, then the network as a whole will move into more dangerous belief spaces. That’s a stampede. When some percentage of these connections are slack connections to diverse nodes (e.g. moving in other directions), the structure as a whole is more resistant to stampede.

I think that dimension reduction is inevitable in a stiffening network. In physical systems, where the nodes have mass, a stiff structure really has only two degrees of freedom: its direction of travel and its axis of rotation. This means that regardless of the number of initial dimensions, a stiff body’s motion reduces to two components. Looking at stampedes and panics, I’d say this is true for behaviors as well, though causality could run in either direction. This is another reason that diversity helps keep a system away from dangerous conditions, though at the expense of efficiency.
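The ball-versus-jello picture can be tried out numerically. The sketch below is my illustrative toy, not the simulation code these notes mention: nodes on a random graph carry 2-D velocities, and each link pulls its endpoints’ velocities together with a given stiffness (plain graph-Laplacian dynamics). The function name and every parameter here are invented for the example.

```python
import numpy as np

def velocity_spread(n=50, stiffness=0.1, steps=500, dt=0.02, seed=0):
    """Spread of node velocities after Laplacian (spring-like) alignment.

    Nodes on a random graph start with independent 2-D velocities; every
    link pulls its endpoints' velocities together with the given stiffness.
    Returns the mean per-component standard deviation across nodes.
    """
    rng = np.random.default_rng(seed)
    A = (rng.random((n, n)) < 0.2).astype(float)  # random adjacency
    A = np.triu(A, 1)
    A = A + A.T                                   # symmetric, no self-loops
    L = np.diag(A.sum(axis=1)) - A                # graph Laplacian
    v = rng.normal(size=(n, 2))                   # initial 2-D velocities
    for _ in range(steps):
        v = v - dt * stiffness * (L @ v)          # Euler step of dv/dt = -k L v
    return v.std(axis=0).mean()

slack = velocity_spread(stiffness=0.01)  # jello: loose links
stiff = velocity_spread(stiffness=1.0)   # ball: stiff links
print(f"slack spread: {slack:.3f}, stiff spread: {stiff:.3f}")
```

With stiff links the velocity spread collapses and the network has to move as a single entity, the stampede case; with slack links most of the initial diversity survives.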


  • Such a collision should have been impossible. The entire Washington Metro system, made up of over one hundred miles of track, was wired to detect and control trains. When trains got too close to each other, they would automatically slow down. But that day, as Train 112 rounded a curve, another train sat stopped on the tracks ahead—present in the real world, but somehow invisible to the track sensors. Train 112 automatically accelerated; after all, the sensors showed that the track was clear. By the time the driver saw the stopped train and hit the emergency brake, the collision was inevitable. (Page 2)
  • The second element of Perrow’s theory (of normal accidents) has to do with how much slack there is in a system. He borrowed a term from engineering: tight coupling. When a system is tightly coupled, there is little slack or buffer among its parts. The failure of one part can easily affect the others. Loose coupling means the opposite: there is a lot of slack among parts, so when one fails, the rest of the system can usually survive. (Page 25)
  • Perrow called these meltdowns normal accidents. “A normal accident,” he wrote, “is where everyone tries very hard to play safe, but unexpected interaction of two or more failures (because of interactive complexity) causes a cascade of failures (because of tight coupling).” Such accidents are normal not in the sense of being frequent but in the sense of being natural and inevitable. “It is normal for us to die, but we only do it once,” he quipped. (Page 27)
    • This is exactly what I see in my simulations and in modelling with graph Laplacians: the network’s response depends on the level of connectivity and the stiffness of the links, and when the nodes have a velocity component, a sufficiently stiff network has to behave as a single entity.
  • These were unintended interactions between the glitch in the content filter, Talbot’s photo, other Twitter users’ reactions, and the resulting media coverage. When the content filter broke, it increased tight coupling because the screen now pulled in any tweet automatically. And the news that Starbucks had a PR disaster in the making spread rapidly on Twitter—a tightly coupled system by design. (Page 30)
  • This approach—reducing complexity and adding slack—helps us escape from the danger zone. It can be an effective solution, one we’ll explore later in this book. But in recent decades, the world has actually been moving in the opposite direction: many systems that were once far from the danger zone are now in the middle of it. (Page 33)
  • Today, smartphone videos create complexity because they link things that weren’t always connected (Page 37)
  • For nearly thirty minutes, Knight’s trading system had gone haywire and sent out hundreds of unintended orders per second in 140 stocks. Those very orders had caused the anomalies that John Mueller and traders across Wall Street saw on their screens. And because Knight’s mistake roiled the markets in such a visible way, traders could reverse engineer its positions. Knight was a poker player whose opponents knew exactly what cards it held, and it was already all in. For thirty minutes, the company had lost more than $15 million per minute. (Page 41)
  • Though a small software glitch caused Knight’s failure, its roots lay much deeper. The previous decade of technological innovation on Wall Street created the perfect conditions for the meltdown. Regulation and technology transformed stock trading from a fragmented, inefficient, relationship-based activity to a tightly connected endeavor dominated by computers and algorithms. Firms like Knight, which once used floor traders and phones to execute trades, had to adapt to a new world. (Page 42)
    • This is an important point. There is short-term survival value in becoming homogeneous and tightly connected. Diversity only helps in the long run.
  • As the crew battled the blowout, complexity struck again. The rig’s elaborate emergency systems were just too overwhelming. There were as many as thirty buttons to control a single safety system, and a detailed emergency handbook described so many contingencies that it was hard to know which protocol to follow. When the accident began, the crew was frozen. The Horizon’s safety systems paralyzed them. (Page 49)
    • I think that this may argue the opposite of the authors’ point. The complexity here is a form of diversity. The safety system was a high-dimensional system that required an effective user to be aligned with it, like a free climber on a cliff face. A user highly educated in the system could probably have made it work, even better than with a big STOP button. But expecting that user is a mistake. The authors actually discuss this later, when they describe how safety training was reduced to simple practices that ignored catastrophic events perceived as unlikely.
  • “The real threat,” Greenberg explained, “comes from malicious actors that connect things together. They use a chain of bugs to jump from one system to the next until they achieve full code execution.” In other words, they exploit complexity: they use the connections in the system to move from the software that controls the radio and GPS to the computers that run the car itself. “As cars add more features,” Greenberg told us, “there are more opportunities for abuse.” And there will be more features: in driverless cars, computers will control everything, and some models might not even have a steering wheel or brake pedal. (Page 60)
    • In this case it’s not the stiffness of the connections, it’s the density of connections.
  • Attacks on cars, ATMs, and cash registers aren’t accidents. But they, too, originate from the danger zone. Complex computer programs are more likely to have security flaws. Modern networks are rife with interconnections and unexpected interactions that attackers can exploit. And tight coupling means that once a hacker has a foothold, things progress swiftly and can’t easily be undone. In fact, in all sorts of areas, complexity creates opportunities for wrongdoing, and tight coupling amplifies the consequences. It’s not just hackers who exploit the danger zone to do wrong; it’s also executives at some of the world’s biggest companies. (Page 62)
  • By the year 2000, Fastow and his predecessors had created over thirteen hundred specialized companies to use in these complicated deals. “Accounting rules and regulations and securities laws and regulation are vague,” Fastow later explained. “They’re complex. . . . What I did at Enron and what we tended to do as a company [was] to view that complexity, that vagueness . . . not as a problem, but as an opportunity.” Complexity was an opportunity. (Page 69)
    • I’m not sure how to fit this in, but I think there is something here about high-dimensional spaces being essentially invisible. This is the same thing as the safety system on the Deepwater Horizon.
  • But like the core of a nuclear power plant, the truth behind such writing is difficult to observe. And research shows that unobservability is a key ingredient to news fabrications. Compared to genuine articles, falsified stories are more likely to be filed from distant locations and to focus on topics that lend themselves to the use of secret sources, such as war and terrorism; they are rarely about big public events like baseball games. (Page 77)
    • More heuristics for map building
  • Charles Perrow once wrote that “safety systems are the biggest single source of catastrophic failure in complex, tightly coupled systems.” (Page 85)
    • Dimensions reduce through use, which is a kind of conversation between the users and the designers. Safety systems are rarely used, so this conversation doesn’t happen.
  • Perrow’s matrix is helpful even though it doesn’t tell us what exactly that “crazy failure” will look like. Simply knowing that a part of our system—or organization or project—is vulnerable helps us figure out if we need to reduce complexity and tight coupling and where we should concentrate our efforts. It’s a bit like wearing a seatbelt. The reason we buckle up isn’t that we have predicted the exact details of an impending accident and the injuries we’ll suffer. We wear seatbelts because we know that something unforeseeable might happen. We give ourselves a cushion of time when cooking an elaborate holiday dinner not because we know what will go wrong but because we know that something will. “You don’t need to predict it to prevent it,” Miller told us. “But you do need to treat complexity and coupling as key variables whenever you plan something or build something.” (Page 88)
  • A fundamental feature of complex systems is that we can’t find all the problems by simply thinking about them. Complexity can cause such strange and rare interactions that it’s impossible to predict most of the error chains that will emerge. But before they fall apart, complex systems give off warning signs that reveal these interactions. The systems themselves give us clues as to how they might unravel. (Page 141)
  • Over the course of several years, Rerup conducted an in-depth study of global pharmaceutical powerhouse Novo Nordisk, one of the world’s biggest insulin producers. In the early 1990s, Rerup found, it was difficult for anyone at Novo Nordisk to draw attention to even serious threats. “You had to convince your own boss, his boss, and his boss that this was an issue,” one senior vice president explained. “Then he had to convince his boss that it was a good idea to do things in a different way.” But, as in the childhood game of telephone—where a message gets more and more garbled as it passes between people—the issues became oversimplified as they worked their way up the chain of command. “What was written in the original version of the report . . . and which was an alarm bell for the specialist,” the CEO told Rerup, “was likely to be deleted in the version that senior management read.” (Page 146)
    • Dimension reduction, leading to stampede
  • Once an issue has been identified, the group brings together ad hoc teams from different departments and levels of seniority to dig into how it might affect their business and to figure out what they can do to prevent problems. The goal is to make sure that the company doesn’t ignore weak signs of brewing trouble.  (Page 147)
    • Environmental awareness as a deliberate counter to dimension reduction
  • “We show that a deviation from the group opinion is regarded by the brain as a punishment,” said the study’s lead author, Vasily Klucharev. And the error message combined with a dampened reward signal produces a brain impulse indicating that we should adjust our opinion to match the consensus. Interestingly, this process occurs even if there is no reason for us to expect any punishment from the group. As Klucharev put it, “This is likely an automatic process in which people form their own opinion, hear the group view, and then quickly shift their opinion to make it more compliant with the group view.” (Page 154)
    • Reinforcement Learning Signal Predicts Social Conformity
      • Vasily Klucharev
      • We often change our decisions and judgments to conform with normative group behavior. However, the neural mechanisms of social conformity remain unclear. Here we show, using functional magnetic resonance imaging, that conformity is based on mechanisms that comply with principles of reinforcement learning. We found that individual judgments of facial attractiveness are adjusted in line with group opinion. Conflict with group opinion triggered a neuronal response in the rostral cingulate zone and the ventral striatum similar to the “prediction error” signal suggested by neuroscientific models of reinforcement learning. The amplitude of the conflict-related signal predicted subsequent conforming behavioral adjustments. Furthermore, the individual amplitude of the conflict-related signal in the ventral striatum correlated with differences in conforming behavior across subjects. These findings provide evidence that social group norms evoke conformity via learning mechanisms reflected in the activity of the rostral cingulate zone and ventral striatum.
  • When people agreed with their peers’ incorrect answers, there was little change in activity in the areas associated with conscious decision-making. Instead, the regions devoted to vision and spatial perception lit up. It’s not that people were consciously lying to fit in. It seems that the prevailing opinion actually changed their perceptions. If everyone else said the two objects were different, a participant might have started to notice differences even if the objects were identical. Our tendency for conformity can literally change what we see. (Page 155)
    • Gregory Berns
      • Dr. Berns specializes in the use of brain imaging technologies to understand human – and now, canine – motivation and decision-making.  He has received numerous grants from the National Institutes of Health, National Science Foundation, and the Department of Defense and has published over 70 peer-reviewed original research articles.
    • Neurobiological Correlates of Social Conformity and Independence During Mental Rotation
      • Background: When individual judgment conflicts with a group, the individual will often conform his judgment to that of the group. Conformity might arise at an executive level of decision making, or it might arise because the social setting alters the individual’s perception of the world.
      • Methods: We used functional magnetic resonance imaging and a task of mental rotation in the context of peer pressure to investigate the neural basis of individualistic and conforming behavior in the face of wrong information.
      • Results: Conformity was associated with functional changes in an occipital-parietal network, especially when the wrong information originated from other people. Independence was associated with increased amygdala and caudate activity, findings consistent with the assumptions of social norm theory about the behavioral saliency of standing alone.
      • Conclusions: These findings provide the first biological evidence for the involvement of perceptual and emotional processes during social conformity.
      • The Pain of Independence: Compared to behavioral research of conformity, comparatively little is known about the mechanisms of non-conformity, or independence. In one psychological framework, the group provides a normative influence on the individual. Depending on the particular situation, the group’s influence may be purely informational – providing information to an individual who is unsure of what to do. More interesting is the case in which the individual has definite opinions of what to do but conforms due to a normative influence of the group due to social reasons. In this model, normative influences are presumed to act through the aversiveness of being in a minority position
    • A Neural Basis for Social Cooperation
      • Cooperation based on reciprocal altruism has evolved in only a small number of species, yet it constitutes the core behavioral principle of human social life. The iterated Prisoner’s Dilemma Game has been used to model this form of cooperation. We used fMRI to scan 36 women as they played an iterated Prisoner’s Dilemma Game with another woman to investigate the neurobiological basis of cooperative social behavior. Mutual cooperation was associated with consistent activation in brain areas that have been linked with reward processing: nucleus accumbens, the caudate nucleus, ventromedial frontal/orbitofrontal cortex, and rostral anterior cingulate cortex. We propose that activation of this neural network positively reinforces reciprocal altruism, thereby motivating subjects to resist the temptation to selfishly accept but not reciprocate favors.
  • These results are alarming because dissent is a precious commodity in modern organizations. In a complex, tightly coupled system, it’s easy for people to miss important threats, and even seemingly small mistakes can have huge consequences. So speaking up when we notice a problem can make a big difference. (Page 155)
  • KRAWCHECK: I think when you get diverse groups together who’ve got these different backgrounds, there’s more permission in the room—as opposed to, “I can’t believe I don’t understand this and I’d better not ask because I might lose my job.” There’s permission to say, “I come from someplace else, can you run that by me one more time?” And I definitely saw that happen. But as time went on, the management teams became less diverse. And in fact, the financial services industry went into the downturn white, male and middle aged. And it came out whiter, maler and middle-aged-er. (Page 176)
  • “The diverse markets were much more accurate than the homogeneous markets,” said Evan Apfelbaum, an MIT professor and one of the study’s authors. “In homogeneous markets, if someone made a mistake, then others were more likely to copy it,” Apfelbaum told us. “In diverse groups, mistakes were much less likely to spread.” (Page 177)
  • Having minority traders wasn’t valuable because they contributed unique perspectives. Minority traders helped markets because, as the researchers put it, “their mere presence changed the tenor of decision making among all traders.” In diverse markets, everyone was more skeptical. (Page 178)
  • In diverse groups, we don’t trust each other’s judgment quite as much, and we call out the naked emperor. And that’s very valuable when dealing with a complex system. If small errors can be fatal, then giving others the benefit of the doubt when we think they are wrong is a recipe for disaster. Instead, we need to dig deeper and stay critical. Diversity helps us do that. (Page 180)
  • Ironically, lab experiments show that while homogeneous groups do less well on complex tasks, they report feeling more confident about their decisions. They enjoy the tasks they do as a group and think they are doing well. (Page 182)
    • Another stampede contribution
  • The third issue was the lack of productive conflict. When amateur directors were just a small minority on a board, it was hard for them to challenge the experts. On a board with many bankers, one CEO told the researchers, “Everybody respects each other’s ego at that table, and at the end of the day, they won’t really call each other out.” (Page 193)
    • Need to figure out what productive conflict is and how to measure it
  • Diversity is like a speed bump. It’s a nuisance, but it snaps us out of our comfort zone and makes it hard to barrel ahead without thinking. It saves us from ourselves. (Page 197)
  • A stranger is someone who is in a group but not of the group. Simmel’s archetypal stranger was the Jewish merchant in a medieval European town—someone who lived in the community but was different from the insiders. Someone close enough to understand the group, but at the same time, detached enough to have an outsider’s perspective. (Page 199)
    • Can AI be trained to be a stranger?
  • But Volkswagen didn’t just suffer from an authoritarian culture. As a corporate governance expert noted, “Volkswagen is well known for having a particularly poorly run and structured board: insular, inward-looking, and plagued with infighting.” On the firm’s twenty-member supervisory board, ten seats were reserved for Volkswagen workers, and the rest were split between senior managers and the company’s largest shareholders. Both Piëch and his wife, a former kindergarten teacher, sat on the board. There were no outsiders. This kind of insularity went well beyond the boardroom. As Milne put it, “Volkswagen is notoriously anti-outsider in terms of culture. Its leadership is very much homegrown.” And that leadership is grown in a strange place. Wolfsburg, where Volkswagen has its headquarters, is the ultimate company town. “It’s this incredibly peculiar place,” according to Milne. “It didn’t exist eighty years ago. It’s on a wind-swept plain between Hanover and Berlin. But it’s the richest town in Germany—thanks to Volkswagen. VW permeates everything. They’ve got their own butchers, they’ve got their own theme park; you don’t escape VW there. And everybody comes through this system.” (Page 209)
  • Most companies have lots of people with different skills. The problem is, when you bring people together to work on the same problem, if all they have are those individual skills . . . it’s very hard for them to collaborate. What tends to happen is that each individual discipline represents its own point of view. It basically becomes a negotiation at the table as to whose point of view wins, and that’s when you get gray compromises where the best you can achieve is the lowest common denominator between all points of view. The results are never spectacular but, at best, average. (Page 236)
    • The idea here is that there is either total consensus and groupthink, or grinding compromise. The authors are focusing too much on the ends of the spectrum. The environmentally aware, social middle is the sweet spot where flocking occurs.
  • Or think about driverless cars. They will almost certainly be safer than human drivers. They’ll eliminate accidents due to fatigued, distracted, and drunk driving. And if they’re well engineered, they won’t make the silly mistakes that we make, like changing lanes while another car is in our blind spot. At the same time, they’ll be susceptible to meltdowns—brought on by hackers or by interactions in the system that engineers didn’t anticipate. (Page 242)
  • We can design safer systems, make better decisions, notice warning signs, and learn from diverse, dissenting voices. Some of these solutions might seem obvious: Use structured tools when you face a tough decision. Learn from small failures to avoid big ones. Build diverse teams and listen to skeptics. And create systems with transparency and plenty of slack. (Page 242)

Thinking slow, acting reflexively

I just finished the cover story in Communications of the ACM, “Human-Level Intelligence or Animal-Like Abilities?”. Overall it is interesting and insightful, but what really caught my eye was Adnan Darwiche’s discussion of models and maps:

  • “In his The Book of Why: The New Science of Cause and Effect, Judea Pearl explained further the differences between a (causal) model and a function, even though he did not use the term “function” explicitly. In Chapter 1, he wrote: “There is only one way a thinking entity (computer or human) can work out what would happen in multiple scenarios, including some that it has never experienced before. It must possess, consult, and manipulate a mental causal model of that reality.” He then gave an example of a navigation system based on either reasoning with a map (model) or consulting a GPS system that gives only a list of left-right turns for arriving at a destination (function). The rest of the discussion focused on what can be done with the model but not the function. Pearl’s argument particularly focused on how a model can handle novel scenarios (such as encountering roadblocks that invalidate the function recommendations) while pointing to the combinatorial impossibility of encoding such contingencies in the function, as it must have a bounded size.”
  • This is a Lists and Maps argument, and it leaves out stories, but it also implies something powerful that I need to start to think about. There is another interface, one that bridges human and machine: the dynamic model. What follows is a bunch of (at the moment – 10.8.18) incomplete thoughts. I think that models/games are another sociocultural interface, one that may be as affected by computers as the Ten Blue Links. So I’m using this as a staging area.
  • Games
    • Games and play are probably the oldest form of dynamic model. Often, and particularly in groups, they are abstract simulations of some kind of conflict. It can be a simple game of skill such as Ringing the Bull, or a complex wargame such as chess:
      • “Historically chess must be classed as a game of war. Two players direct a conflict between two armies of equal strength upon a field of battle, circumscribed in extent, and offering no advantage of ground to either side. The players have no assistance other than that afforded by their own reasoning faculties, and the victory usually falls to the one whose strategical imagination is the greater, whose direction of his forces is the more skilful, whose ability to foresee positions is the more developed.” Murray, H.J.R.. A History of Chess: The Original 1913 Edition (Kindle Locations 576-579). Skyhorse Publishing. Kindle Edition.
    • More recently, video games afford play that can follow narrative templates:
      • Person vs. Fate/God
      • Person vs. Self
      • Person vs. Person
      • Person vs Society
      • Person vs. Nature
      • Person vs. Supernatural
      • Person vs. Technology
    • More on this later. I think this sort of computer-human interaction is really interesting because it seems to open up spaces that would otherwise be inaccessible to humans, given the data manipulation requirements (would flight simulators exist without non-human computation?).
  • Moving Maps
    • I would argue that the closer to interactive rates a model runs, the more dynamic it is. A map is a static model, a snapshot of the current geopolitical space, but maps change because the underlying data changes. Borders shift. Countries come into and go out of existence. Islands are created, and coastlines erode. And the next edition of the map will incorporate these changes.
    • Online radar weather maps are an interesting case, since they reflect a rapidly changing environment and often now support playback of the last few hours (and prediction for the next few hours) of imagery at variable time scales.
  • Cognition
    • Traditional simulation and humans
        • Simulations provide a mechanism for humans to explore a space of possibilities larger than what can be reached by purely mental means. Further, these simulations create artifacts that can be examined independently by other humans.
        • Every model is a theory—a very-well specified theory. In the case of simulations, the models are theories expressed in so much detail that their consequences can be checked by execution on a computer [Bryson, 2015]
      • The assumptions that provide the basis for the simulation are the model. The computer provides the dynamics. The use of simulation allows users to explore the space in the same way that one would explore the environment. Discoveries can be made that exist outside of the social constructs that led to the construction of the simulator and the assumptions that the simulator is based on.
      • What I think this means is that humans bring meaning to the outputs of the simulation. But it also means that there is a level of friction required to get from the outputs as computed to the level of meaningfulness the users need. In other words, if you have a theory of galaxy formation, but the results of the simulation match observations only when you add something new, like negative gravity, that addition could reflect a previously undiscovered component of the current theory of the formation of the universe.
      • I think this is the heart of my thinking. Just as maps allow the construction of trajectories across physical (or belief) spaces, dynamic models such as simulations support ways of evaluating potential (and simplified/general) spaces that exist outside the realms of current understanding. This can be in the form of alternatives not yet encountered (a hurricane will hit the Florida panhandle on Thursday) or systems not yet understood (interactive protein-folding simulators).
      • From At Home in the Universe: Physicists roll out this term, “universality class,” to refer to a class of models all of which exhibit the same robust behavior. So the behavior in question does not depend on the details of the model. Thus a variety of somewhat incorrect models of the real world may still succeed in telling us how the real world works, as long as the real world and the models lie in the same universality class. (Page 283)
    • Traditional simulation and ML(models and functions)
      • Darwiche discusses how the ML community has focused on “functional” AI at the expense of “model-based” AI. I think his insight is that functional AI is closer to reflex, analogous to “thinking fast,” while model-based AI more closely resembles “thinking slow.”
      • I would contend that building simulators may be the slowest possible thinking. And I wonder if using simulators to train functional AI that can then be evaluated against real-world data, which is then used to modify the model in a “round trip” approach might be a way to use the fundamental understandability of simulation with the reflexive speed of trained NN systems.
      • What this means is that “slow” AI explicitly includes building testable models. The tests will not always be confirmations of point predictions, because of chaotic sensitivity, but there can be predictions about the characteristics of a model. For example, I’m working on using agent-based simulation of movement in belief space to generate seeds for RNNs that produce strings resembling conversations. Here, the prediction would be about the “spectral” characteristics of the conversation – how word usage changes over time compared with actual conversations in which consensus evolves.

At Home in the Universe: The Search for the Laws of Self-Organization and Complexity

[Figure: Kauffman’s NK model – fitness vs. distance landscapes for large, medium, and small K]

At Home in the Universe: The Search for the Laws of Self-Organization and Complexity (Kindle Edition)

Stuart Kauffman (Wikipedia)

Quick takeaway:

  • The book’s central thesis is that complexity in general (and life in particular) is an inevitable consequence of self-organizing principles that come into play in non-equilibrium systems. He explores the underlying principles in a variety of ways, including binary networks, autocatalytic sets, NK models, and fitness landscapes, both static and co-evolving.
  • Reading this 20-year-old book, I had the impression that his work, particularly on how fitness landscapes are explored, has direct relevance to the construction of complex systems today. In particular, I was struck by how applicable his work with fitness landscapes and NK models would be to the evaluation of the hyperparameter space associated with building neural networks.
  • Another point I found particularly compelling is his description of the incalculable size of the high-dimensional spaces of combinatorial possibility. Enumerating the potential combinations of even a smallish binary network would take longer than the age of the universe. As such, there need to be mechanisms that allow a faster, “good enough” evaluation of the space. That’s why we have historical narratives: they describe a path through this space that has worked. As an example, compare tic-tac-toe to chess. In the former, every possibility in the game space can be known. Chess has too many possibilities, so instead there are openings, gambits, and endgames, discovered by chess masters, that come to us as stories.
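To make the scale concrete, here is a back-of-the-envelope sketch (my own illustration, not from the book; the 10^18 states-per-second rate is an arbitrary, deliberately generous assumption):

```python
# Time needed to enumerate every state of an n-node binary network,
# assuming an (arbitrary, generous) 1e18 states per second.

def exhaustive_search_years(n_nodes, states_per_second=1e18):
    states = 2 ** n_nodes                 # size of the binary state space
    seconds = states / states_per_second
    return seconds / (60 * 60 * 24 * 365)

AGE_OF_UNIVERSE_YEARS = 1.38e10

for n in (50, 100, 200):
    years = exhaustive_search_years(n)
    print(f"n={n:3d}: {years:.2e} years "
          f"(~{years / AGE_OF_UNIVERSE_YEARS:.2e} universe lifetimes)")
```

Fifty variables are trivial, a hundred already cost tens of thousands of years, and two hundred dwarf the age of the universe, which is exactly why "good enough" heuristics and narrative paths through the space are necessary.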


  • Chapter 1: At home in the Universe
    • In all these cases, the order that emerges depends on robust and typical properties of the systems, not on the details of structure and function. Under a vast range of different conditions, the order can barely help but express itself. (page 19)
    • Nonequilibrium ordered systems like the Great Red Spot are sustained by the persistent dissipation of matter and energy, and so were named dissipative structures by the Nobel laureate Ilya Prigogine some decades ago. These systems have received enormous attention. In part, the interest lies in their contrast to equilibrium thermodynamic systems, where equilibrium is associated with collapse to the most probable, least ordered states. In dissipative systems, the flux of matter and energy through the system is a driving force generating order. In part, the interest lies in the awareness that free-living systems are dissipative structures, complex metabolic whirlpools. (page 21).
    • The theory of computation is replete with deep theorems. Among the most beautiful are those showing that, in most cases by far, there exists no shorter means to predict what an algorithm will do than to simply execute it, observing the succession of actions and states as they unfold. The algorithm itself is its own shortest description. It is, in the jargon of the field, incompressible. (page 22)
    • And yet, even if it is true that evolution is such an incompressible process, it does not follow that we may not find deep and beautiful laws governing that unpredictable flow. For we are not precluded from the possibility that many features of organisms and their evolution are profoundly robust and insensitive to details. (page 23)
    • Strikingly, such coevolving systems also behave in an ordered regime, a chaotic regime, and a transition regime. (page 27)
      • Note that this reflects our Nomadic (chaotic), Flocking (transition) and Stampeding (ordered) states
    • This seemingly haphazard process also shows an ordered regime where poor compromises are found quickly, a chaotic regime where no compromise is ever settled on, and a phase transition where compromises are achieved, but not quickly. The best compromises appear to occur at the phase transition between order and chaos. (page 28)
  • Chapter 4: Order for Free
    • But evolution requires more than simply the ability to change, to undergo heritable variation. To engage in the Darwinian saga, a living system must first be able to strike an internal compromise between malleability and stability. To survive in a variable environment, it must be stable, to be sure, but not so stable that it remains forever static. Nor can it be so unstable that the slightest internal chemical fluctuation causes the whole teetering structure to collapse. (page 73)
    • It is now well known that in most cells, such molecular feedback can give rise to complex chemical oscillations in time and space. (page 74)
      • Olfati-Saber and graph laplacians!
    • The point in using idealizations in science is that they help capture the main issues. Later one must show that the issues so captured are not altered by removing the idealizations. (page 75)
      • Start with observation, build initial simulation and then measure the difference and modify
    • If started in one state, over time the system will flow through some sequence of states. This sequence is called a trajectory (page 77)
      • I wonder if this can be portrayed as a map? You have to go through one state to get to the next. In autocatalytic systems there may be multiple trajectories that are similar and yet have branch points (plant cells, animal cells, bacteria).
    • To answer these questions we need to understand the concept of an attractor. More than one trajectory can flow into the same state cycle. Start a network with any of these different initial patterns and, after churning through a sequence of states, it will settle into the same state cycle, the same pattern of blinking. In the language of dynamical systems, the state cycle is an attractor and the collection of trajectories that flow into it is called the basin of attraction. We can roughly think of an attractor as a lake, and the basin of attraction as the water drainage flowing into that lake. (page 78)
      • Also applicable to social and socio-technical systems. The technology changes the connectivity which could change the shape of the landscape
    • One feature is simply how many “inputs” control any lightbulb. If each bulb is controlled by only one or two other lightbulbs, if the network is “sparsely connected,” then the system exhibits stunning order. If each bulb is controlled by many other light-bulbs, then the network is chaotic. So “tuning” the connectivity of a network tunes whether one finds order or chaos. The second feature that controls the emergence of order or chaos is simple biases in the control rules themselves. Some control rules, the AND and OR Boolean functions we talked about, tend to create orderly dynamics. Other control rules create chaos. (page 80)
      • In our more velocity-oriented system, this is Social Influence Horizon) and is dynamic over time
    • Consider networks in which each lightbulb receives input from only one other. In these K = 1 networks, nothing very interesting happens. They quickly fall into very short state cycles, so short that they often consist of but a single state, a single pattern of illumination. Launch such a K = 1 network and it freezes up, saying the same thing over and over for all time. (page 81)
    • At the other end of the scale, consider networks in which K = N, meaning that each lightbulb receives an input from all lightbulbs, including itself. One quickly discovers that the length of the networks’ state cycles is the square root of the number of states. Consider the implications. For a network with only 200 binary variables—bulbs that can be on or off—there are 2^200, or about 10^60, possible states. (page 81)
    • Such K = N networks do show signs of order, however. The number of attractors in a network, the number of lakes, is only N/e, where e is the basis of the natural logarithms, 2.71828. So a K = N network with 100,000 binary variables would harbor about 37,000 of these attractors. Of course, 37,000 is a big number, but very very much smaller than 2^100,000, the size of its state space. (page 82)
      • Need to look into whether there is some kind of equivalent in the SIH setting
    • The order arises, sudden and stunning, in K = 2 networks. For these well-behaved networks, the length of state cycles is not the square root of the number of states, but, roughly, the square root of the number of binary variables. Let’s pause to translate this as clearly as we can. Think of a randomly constructed Boolean network with N = 100,000 lightbulbs, each receiving K = 2 inputs. The “wiring diagram” would look like a madhatterly scrambled jumble, an impenetrable jungle. Each lightbulb has also been assigned at random a Boolean function. The logic is, therefore, a similar mad scramble, haphazardly assembled, mere junk. The system has 2^100,000 or 10^30,000 states—megaparsecs of possibilities—and what happens? The massive network quickly and meekly settles down and cycles among the square root of 100,000 states, a mere 317. (page 83)
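The K = 1 / K = 2 / K = N contrast is easy to reproduce in miniature. Below is a minimal random Boolean network simulator (my own sketch, not Kauffman's code; the network size, seeds, and sample counts are arbitrary choices) that measures the length of the attractor each run settles into:

```python
import random

def random_boolean_network(n, k, rng):
    """Each node gets k input nodes and a random Boolean function,
    stored as a truth table over the 2**k input patterns."""
    inputs = [rng.sample(range(n), k) for _ in range(n)]
    tables = [[rng.randint(0, 1) for _ in range(2 ** k)] for _ in range(n)]
    return inputs, tables

def step(state, inputs, tables):
    """Synchronously update every node from its inputs' current values."""
    new = []
    for ins, table in zip(inputs, tables):
        idx = 0
        for i in ins:
            idx = (idx << 1) | state[i]   # pack input bits into a table index
        new.append(table[idx])
    return tuple(new)

def cycle_length(n, k, rng):
    """Iterate from a random start until a state repeats; the gap
    between the two visits is the length of the attractor reached."""
    inputs, tables = random_boolean_network(n, k, rng)
    state = tuple(rng.randint(0, 1) for _ in range(n))
    seen = {}
    t = 0
    while state not in seen:
        seen[state] = t
        state = step(state, inputs, tables)
        t += 1
    return t - seen[state]

rng = random.Random(0)
for k in (1, 2, 12):
    lengths = sorted(cycle_length(12, k, rng) for _ in range(50))
    print(f"K={k:2d}: median attractor length = {lengths[25]}")
```

With N = 12, K = 1 networks typically freeze onto very short cycles, K = 2 stays short, and K = N cycles run far longer, echoing the square-root-of-states scaling in the quote above.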
    • The reason complex systems exist on, or in the ordered regime near, the edge of chaos is because evolution takes them there. (page 89)
  • Chapter 5: The Mystery of Ontology
    • Another way to ensure orderly behavior is to construct networks using what are called canalyzing Boolean functions. These Boolean rules have the easy property that at least one of the molecular inputs has one value, which might be 1 or 0, which by itself can completely determine the response of the regulated gene. The OR function is an example of a canalyzing function (Figure 5.3a). An element regulated by this function is active at the next moment if its first, or its second, or both inputs are active at the current moment. Thus if the first input is active, then the regulated element is guaranteed to be active at the next moment, regardless of the activity of the second input. (page 103)
      • This is max pooling
    • For most perturbations, a genomic system on any attractor will exhibit homeostatic return to the same attractor. The cell types are fundamentally stable. But for some perturbations, the system flows to a different attractor. So differentiation occurs naturally. And the further critical property is this: from any one attractor, it is possible to undergo transitions to only a few neighboring attractors, and from them other perturbations drive the system to still other attractors. Each lake, as it were, is close to only a few other lakes. (page 110)
  • Chapter 6: Noah’s Vessel
    • That we eat our meals rather than fusing with them marks, I believe, a profound fact. The biosphere itself is supracritical. Our cells are just subcritical. Were we to fuse with the salad, the molecular diversity this fusion would engender within our cells would unleash a cataclysmic supracritical explosion. The explosion of molecular novelty would soon be lethal to the unhappy cells harboring the explosion. The fact that we eat is not an accident, one of many conceivable methods evolution might have alighted on to get new molecules into our metabolic webs. Eating and digestion, I suspect, reflect our need to protect ourselves from the supracritical molecular diversity of the biosphere. (page 122)
    • We may be discovering a universal in biology, a new law: if our cells are subcritical, then, presumably, so too are all cells—bacteria, bracken, fern, bird, man. Throughout the supracritical explosion of the biosphere, cells since the Paleozoic, cells since the start, cells since 3.45 billion years ago must have remained subcritical. If so, then this subcritical–supracritical boundary must have always set an upper limit on the molecular diversity that can be housed within one cell. A limit exists, then, on the molecular complexity of the cell. (page 126)
    • If local ecosystems are metabolically poised at the subcritical–supracritical boundary, while the biosphere as a whole is supracritical? Then what a new tale we tell, of life cooperating to beget ever new kinds of molecules, and a biosphere where local ecosystems are poised at the boundary, but have collectively crept slowly upward in total diversity by the supracritical character of the whole planet. The whole biosphere is broadly collectively autocatalytic, catalyzing its own maintenance and ongoing molecular exploration. (page 130)
  • Chapter 8: High-Country Adventures
    • what would happen if, in addition to attempting to evolve such a computer program, we were more ambitious and attempted to evolve the shortest possible program that will carry out the task? Such a “shortest program” is one that is maximally compressed; that is, all redundancies have been squeezed out of it. Evolving a serial computer program is either very hard or essentially impossible because it is incredibly fragile. Serial computer programs contain instructions such as “compare two numbers and do such and such depending on which is larger” or “repeat the following action 1,000 times.” The computation performed is extremely sensitive to the order in which actions are carried out, the precise details of the logic, numbers of iterations, and so forth. The result is that almost any random change in a computer program produces “garbage.” Familiar computer programs are precisely the kind of complex systems that do not have the property that small changes in structure yield small changes in behavior. Almost all small changes in structure lead to catastrophic changes in behavior. (page 152)
      • This is the inherent problem we are grappling with in our “barely controlled systems”. All the elements involved are brittle and un-evolvable
    • It seems likely that there is no way to evolve a maximally compressed program in less time than it would take to exhaustively generate all possible programs, testing each to see if it carries out the desired task. When all redundancy has been squeezed from a program, virtually any change in any symbol would be expected to cause catastrophic variation in the behavior of the algorithm. Thus nearby variants in the program compute very different algorithms. (page 154)
    • because the program is maximally compressed, any change will cause catastrophic alterations in the computation performed. The fitness landscape is entirely random. The next fact is this: the landscape has only a few peaks that actually perform the desired algorithm. In fact, it has recently been shown by the mathematician Gregory Chaitin that for most problems there is only one or, at most, a few such minimal programs. It is intuitively clear that if the landscape is random, providing no clues about good directions to search, then at best the search must be a random or systematic search of all the 10^300 possible programs to find the needle in the haystack, the possibly unique minimal program. This is just like finding Mont Blanc by searching every square meter of the Alps; the search time is, at best, proportional to the size of the program space. (page 155)
      • I’ve been thinking about hyperparameter tuning in the wrong way. There need(?) to be two approaches – one that works in evolvable spaces where there can be gradualism. The other approach has to work in discontinuous regions, such as the choice of activation function.
    • The question of what kinds of complex systems can be assembled by an evolutionary search process not only is important for understanding biology, but may be of practical importance in understanding technological and cultural evolution as well. The sensitivity of our most complex artifacts to catastrophic failure from tiny causes—for example, the Challenger disaster, the failed Mars Observer mission, and power-grid failures affecting large regions—suggests that we are now butting our heads against a problem that life has nuzzled for enormously longer periods: how to produce complex systems that do not teeter on the brink of collapse. Perhaps general principles governing search in vast spaces of possibilities cover all these diverse evolutionary processes, and will help us design—or even evolve—more robust systems. (page 157)
    • Once we understand the nature of these random landscapes and evolution on them, we will better appreciate what it is about organisms that is different, how their landscapes are nonrandom, and how that nonrandomness is critical to the evolutionary assembly of complex organisms. We will find reasons to believe that it is not natural selection alone that shapes the biosphere. Evolution requires landscapes that are not random. The deepest source of such landscapes may be the kind of principles of self-organization that we seek. (page 165)
    • On random landscapes, finding the global peak by searching uphill is totally useless; we have to search the entire space of possibilities. But even for modestly complex genotypes, or programs, that would take longer than the history of the universe. (page 167)
    • Things capable of evolving—metabolic webs of molecules, single cells, multicellular organisms, ecosystems, economic systems, people—all live and evolve on landscapes that themselves have a special property: they allow evolution to “work.” These real fitness landscapes, the types that underlie Darwin’s gradualism, are “correlated.” Nearby points tend to have similar heights. The high points are easier to find, for the terrain offers clues about the best directions in which to proceed. (page 169)
    • In short, the contribution to overall fitness of the organism of one state of one trait may depend in very complex ways on the states of many other traits. Similar issues arise if we think of a haploid genotype with N genes, each having two alleles. The fitness contribution of one allele of one gene to the whole organism may depend in complex ways on the alleles of other genes. Geneticists call this coupling between genes epistasis or epistatic coupling, meaning that genes at other places on the chromosomes affect the fitness contribution of a gene at a given place. (page 170)
    • The NK model captures such networks of epistatic couplings and models the complexity of the coupling effects. It models epistasis itself by assigning to each trait, or gene, epistatic “inputs” from K other traits or genes. Thus the fitness contribution of each gene depends on the gene’s own allele state, plus the allele states of the K other genes that affect that gene. (page 171)
    • I find the NK model fascinating because of this essential point: altering the number of epistatic inputs per gene, K, alters the ruggedness and number of peaks on the landscape. Altering K is like twisting a control knob. (page 172)
      • This is really important and should also work with graph Laplacians. In other words, not only can we model the connectivity, we can also model the stiffness.
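The "K knob" is simple enough to implement directly. Here is a minimal NK-landscape sketch (my own illustration; the lazy caching of fitness contributions and all parameter values are implementation choices, not Kauffman's specification), plus a crude probe of ruggedness via the average fitness change caused by a single mutation:

```python
import random

def nk_landscape(n, k, seed=0):
    """Sketch of an NK model: gene i's fitness contribution depends on
    its own allele plus the alleles of k other randomly chosen genes.
    Contributions are uniform random draws, cached per local pattern."""
    rng = random.Random(seed)
    neighbors = [rng.sample([j for j in range(n) if j != i], k)
                 for i in range(n)]
    cache = {}

    def fitness(genome):
        total = 0.0
        for i in range(n):
            key = (i, genome[i]) + tuple(genome[j] for j in neighbors[i])
            if key not in cache:
                cache[key] = rng.random()   # drawn lazily, then fixed
            total += cache[key]
        return total / n

    return fitness

def mean_flip_effect(n, k, samples=200, seed=1):
    """Average |fitness change| from flipping one random gene:
    a rough measure of landscape ruggedness."""
    rng = random.Random(seed)
    f = nk_landscape(n, k, seed)
    total = 0.0
    for _ in range(samples):
        g = tuple(rng.randint(0, 1) for _ in range(n))
        i = rng.randrange(n)
        g2 = g[:i] + (1 - g[i],) + g[i + 1:]
        total += abs(f(g2) - f(g))
    return total / samples

for k in (1, 4, 8):
    print(f"K={k}: mean |delta fitness| per mutation = "
          f"{mean_flip_effect(16, k):.3f}")
```

Raising K makes a single mutation disturb more fitness contributions (its own gene's plus every gene it feeds into), so the measured per-mutation effect grows with K: the landscape becomes more rugged as the knob is turned.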
    • our model organism, with its network of epistatic interactions among its genes, is caught in a web of conflicting constraints. The higher K is—the more interconnected the genes are—the more conflicting constraints exist, so the landscape becomes ever more rugged with ever more local peaks. (page 173)
      • This sounds oddly like how word2vec is calculated. Which implies that all connected neural networks are correlated and epistatic.
    • It is these conflicting constraints that make the landscape rugged and multipeaked. Because so many constraints are in conflict, there is a large number of rather modest compromise solutions rather than an obvious superb solution. (page 173)
      • Dimension reduction and polarization are a social solution to this problem
    • landscapes with moderate degrees of ruggedness share a striking feature: it is the highest peaks that can be scaled from the greatest number of initial positions! This is very encouraging, for it may help explain why evolutionary search does so well on this kind of landscape. On a rugged (but not random) landscape, an adaptive walk is more likely to climb to a high peak than a low one. If an adapting population were to “jump” randomly into such a landscape many times and climb uphill each time to a peak, we would find that there is a relationship between how high the peak is and how often the population climbed to it. If we turned our landscapes upside down and sought instead the lowest valleys, we would find that the deepest valleys drain the widest basins. (page 177)
    • The property that the highest peaks are the ones to which the largest fraction of genotypes can climb is not inevitable. The highest peaks could be very narrow but very high pinnacles on a low-lying landscape with modest broad hilltops. If an adapting population were released at a random spot and walked uphill, it would then find itself trapped on the top of a mere local hilltop. The exciting fact we have just discovered is that for an enormous family of rugged landscapes, the NK family, the highest peaks “drain” the largest basins. This may well be a very general property of most rugged landscapes reflecting complex webs of conflicting constraints. (page 177)
      •  I think this may be a function of how the landscapes are made. The K in NK somewhat dictates the amount of correlation
    • Recall another striking feature of random landscapes: with every step one takes uphill, the number of directions leading higher is cut by a constant fraction, one-half, so it becomes ever harder to keep improving. As it turns out, the same property shows up on almost any modestly rugged or very rugged landscape. Figure 8.9 shows the dwindling fraction of fitter neighbors along adaptive walks for different K values (Figure 8.9a) and the increased waiting times to find fitter variants for different K values (Figure 8.9b). Once K is modestly large, about K = 8 or greater, at each step uphill the number of directions uphill falls by a constant fraction, and the waiting time or number of tries to find that way uphill increases by a constant fraction. This means that as one climbs higher and higher, it becomes not just harder, but exponentially harder to find further directions uphill. So if one can make one try per unit time, the rate of improving slows exponentially. (page 178)
      • This is very important in understanding how hyperparameter space needs to be explored
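The halving of uphill directions is easiest to see on a fully uncorrelated landscape. The sketch below (my own toy, assuming a hash-keyed random fitness as a stand-in for a maximally rugged landscape; all parameters are arbitrary) runs a greedy adaptive walk and records the fraction of fitter one-mutant neighbors at each step:

```python
import random

def fitness(genome, seed=0):
    """Fully uncorrelated landscape: each genotype gets an independent
    uniform fitness, derived deterministically by hashing the genome."""
    return random.Random(hash((seed,) + genome)).random()

def flip(genome, i):
    return genome[:i] + (1 - genome[i],) + genome[i + 1:]

def adaptive_walk(n=20, seed=0):
    """Greedy uphill walk; record the fraction of one-mutant
    neighbors that are fitter at each step."""
    rng = random.Random(seed)
    g = tuple(rng.randint(0, 1) for _ in range(n))
    fractions = []
    while True:
        f0 = fitness(g, seed)
        fitter = [i for i in range(n) if fitness(flip(g, i), seed) > f0]
        fractions.append(len(fitter) / n)
        if not fitter:
            return fractions          # stranded on a local peak
        g = flip(g, rng.choice(fitter))

print(adaptive_walk())
```

On such a landscape the recorded fractions shrink by roughly half per uphill step in expectation, and the walk strands on a local peak after only a handful of moves, which is the exponential slowdown the quote describes.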
    • Optimal solutions to one part of the overall design problem conflict with optimal solutions to other parts of the overall design. Then we must find compromise solutions to the joint problem that meet the conflicting constraints of the different subproblems. (page 179)
    • Selection, in crafting the kinds of organisms that exist, may also help craft the kinds of landscapes over which they evolve, picking landscapes that are most capable of supporting evolution—not only by mutation alone, but by recombination as well. Evolvability itself is a triumph. To benefit from mutation, recombination, and natural selection, a population must evolve on rugged but “well-correlated” landscapes. In the framework of NK landscapes, the “K knob” must be well tuned. (page 182)
      • This is going to be the trick for machine learning
    • even if the population is released on a local peak, it may not stay there! Simply put, the rate of mutation is so high that it causes the population to “diffuse” away from the peak faster than the selective differences between less fit and more fit mutants can return the population to the peak. An error catastrophe, first discovered by Nobel laureate Manfred Eigen and theoretical chemist Peter Schuster, has occurred, for the useful genetic information built up in the population is lost as the population diffuses away from the peak. (page 184)
    • Eigen and Schuster were the first to emphasize the importance of this error catastrophe, for it implies a limit to the power of natural selection. At a high enough mutation rate, an adapting population cannot assemble useful genetic variants into a working whole; instead, the mutation-induced “diffusion” over the space overcomes selection, pulling the population away from adaptive peaks. (page 184)
    • This limitation is even more marked when seen from another vantage point. Eigen and Schuster also emphasized that for a constant mutation rate per gene, the error catastrophe will arise when the number of genes in the genotype increases beyond a critical number. Thus there appears to be a limit on the complexity of a genome that can be assembled by mutation and selection! (page 184)
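The error threshold can be demonstrated with a toy quasispecies simulation. In the sketch below (my own illustration; the single-peak fitness function, population size, and mutation rates are arbitrary choices), pushing the per-gene mutation rate past roughly ln(1+s)/N causes the population to diffuse off the peak:

```python
import random

def error_catastrophe(n_genes=20, pop_size=500, s=0.5,
                      mu=0.02, generations=200, seed=0):
    """Toy quasispecies model: a single fitness peak (the all-ones
    genome) with selective advantage s; each gene mutates with
    probability mu per generation. Returns the final fraction of
    the population sitting exactly on the peak."""
    rng = random.Random(seed)
    peak = (1,) * n_genes
    pop = [peak] * pop_size
    for _ in range(generations):
        # fitness-proportional reproduction
        weights = [1.0 + s if g == peak else 1.0 for g in pop]
        parents = rng.choices(pop, weights=weights, k=pop_size)
        # per-gene mutation
        pop = [tuple(b if rng.random() > mu else 1 - b for b in g)
               for g in parents]
    return sum(g == peak for g in pop) / pop_size

for mu in (0.005, 0.05):
    print(f"mu={mu}: fraction on peak = {error_catastrophe(mu=mu):.2f}")
```

Below the threshold (here roughly ln(1.5)/20 ≈ 0.02), selection holds a sizable fraction of the population on the peak; well above it, the peak empties out even though nothing about the fitness function changed. This is the sense in which mutation rate limits the complexity that selection can maintain.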
    • We are seeking a new conceptual framework that does not yet exist. Nowhere in science have we an adequate way to state and study the interleaving of self-organization, selection, chance, and design. We have no adequate framework for the place of law in a historical science and the place of history in a lawful science. (page 185)
      • This is the research part of the discussion in the iConference paper. Use the themes in the following paragraphs (self organization, selection, etc. ) to build up the areas that need to be discussed and researched.
    • The inevitability of historical accident is the third theme. We can have a rational morphology of crystals, because the number of space groups that atoms in a crystal can occupy is rather limited. We can have a periodic table of the elements because the number of stable arrangements of the subatomic constituents is relatively limited. But once at the level of chemistry, the space of possible molecules is vaster than the number of atoms in the universe. Once this is true, it is evident that the actual molecules in the biosphere are a tiny fraction of the space of the possible. Almost certainly, then, the molecules we see are to some extent the results of historical accidents in this history of life. History arises when the space of possibilities is too large by far for the actual to exhaust the possible. (page 186)
    • Here is a firm foothold: an evolutionary process, to be successful, requires that the landscapes it searches are more or less correlated. (page 186)
      • This is a meta design constraint that needs to be discussed (iConference? Antonio’s workshop?)
    • Nonequilibrium systems can be robust as well. A whirlpool dissipative system is robust in the sense that a wide variety of shapes of the container, flow rates, kinds of fluids, and initial conditions of the fluids lead to vortices that may persist for long periods. So small changes in the construction parameters of the system, and initial conditions, lead to small changes in behavior. (page 187)
    • Whirlpools are attractors in a dynamical system. Attractors, however, can be both stable and unstable. Instability arises in two senses. First, small changes in the construction of the system may dramatically alter the behavior of the system. Such systems are called structurally unstable. In addition, small changes in initial conditions, the butterfly effect, can sharply change subsequent behavior. Conversely, stable dynamical systems can be stable in both senses. Small changes in construction may typically lead to small changes in behavior. The system is structurally stable. And small changes in initial conditions can lead to small changes in behavior. (page 187)
    • We know that there is a clear link between the stability of the dynamical system and the ruggedness of the landscape over which it adapts. Chaotic Boolean networks, and many other classes of chaotic dynamical systems, are structurally unstable. Small changes wreak havoc on their behavior. Such systems adapt on very rugged landscapes. In contrast, Boolean networks in the ordered regime are only slightly modified by mutations to their structure. These networks adapt on relatively smooth fitness landscapes. (page 187)
    • We know from the NK landscape models discussed in this chapter that there is a relationship between the richness of conflicting constraints in a system and the ruggedness of the landscape over which it must evolve. We plausibly believe that selection can alter organisms and their components so as to modify the structure of the fitness landscapes over which those organisms evolve. By taking genomic networks from the chaotic to the ordered regime, selection tunes network behavior to be sure. By tuning epistatic coupling of genes, selection also tunes landscape structure from rugged to smooth. Changing the level of conflicting constraints in the construction of an organism from low to high tunes how rugged a landscape such organisms explore. (page 188)
    • And so we return to a tantalizing possibility: that self-organization is a prerequisite for evolvability, that it generates the kinds of structures that can benefit from natural selection. It generates structures that can evolve gradually, that are robust, for there is an inevitable relationship among spontaneous order, robustness, redundancy, gradualism, and correlated landscapes. Systems with redundancy have the property that many mutations cause no or only slight modifications in behavior. Redundancy yields gradualism. But another name for redundancy is robustness. Robust properties are ones that are insensitive to many detailed alterations. The robustness of the lipid vesicle, or of the cell type attractors in genomic networks in the ordered regime, is just another version of redundancy. Robustness is precisely what allows such systems to be molded by gradual accumulation of variations. Thus another name for redundancy is structural stability—a folded protein, an assembled virus, a Boolean network in the ordered regime. The stable structures and behaviors are ones that can be molded. (page 188)
      • This is why evolution may be the best approach for machine learning hyperparameter tuning
    • If this view is roughly correct, then precisely that which is self-organized and robust is what we are likely to see preeminently utilized by selection. (page 188)
    • The more rare and improbable the forms that selection seeks, the less typical and robust they are and the stronger will be the pressure of mutations to revert to what is typical and robust. (page 189)
  • Chapter 9: Organisms and Artifacts
    •  Might the same general laws govern major aspects of biological and technological evolution? Both organisms and artifacts confront conflicting design constraints. As shown, it is those constraints that create rugged fitness landscapes. Evolution explores its landscapes without the benefit of intention. We explore the landscapes of technological opportunity with intention, under the selective pressure of market forces. But if the underlying design problems result in similar rugged landscapes of conflicting constraints, it would not be astonishing if the same laws governed both biological and technological evolution. (page 192)
    • I begin by describing a simple, idealized kind of adaptive walk—long-jump adaptation—on a correlated but rugged landscape. We have already looked at adaptive walks that proceed by generating and selecting single mutations that lead to fitter variants. Here, an adaptive walk proceeds step-by-step in the space of possibilities, marching steadfastly uphill to a local peak. Suppose instead that we consider simultaneously making a large number of mutations that alter many features at once, so that the organism takes a “long jump” across its fitness landscape. Suppose we are in the Alps and take a single normal step. Typically, the altitude where we land is closely correlated with the altitude from which we started. There are, of course, catastrophic exceptions; cliffs do occur here and there. But suppose we jump 50 kilometers away. The altitude at which we land is essentially uncorrelated with the altitude from which we began, because we have jumped beyond what is called the correlation length of the landscape. (page 192)
    • A very simple law governs such long-jump adaptation. The result, exactly mimicking adaptive walks via fitter single-mutant variants on random landscapes is this: every time one finds a fitter long-jump variant, the expected number of tries to find a still better long-jump variant doubles! (page 193)
      • Intelligence is computation, and expensive
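      • The doubling law is easy to check with a short simulation. This is a sketch, not Kauffman’s model: i.i.d. uniform fitness draws stand in for long jumps on an uncorrelated landscape, and the per-record try counter is capped because record waiting times are heavy-tailed:

```python
import random

def long_jump_walk(n_improvements=6, seed=1, max_tries=200_000):
    """Count the tries needed to find each successive fitter long-jump
    variant, modeling an uncorrelated landscape as i.i.d. uniform
    fitness draws. Waits are censored at max_tries to stay bounded."""
    rng = random.Random(seed)
    best = rng.random()
    waits = []
    for _ in range(n_improvements):
        tries = 0
        while tries < max_tries:
            tries += 1
            candidate = rng.random()
            if candidate > best:
                best = candidate
                break
        waits.append(tries)
    return waits

# Waiting times between improvements grow roughly geometrically.
print(long_jump_walk())
```

Averaged over many seeds, each successive wait is roughly twice the last, matching the “expected number of tries doubles” law.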
    • As the number of genes increases, long-jump adaptation becomes less and less fruitful; the more complex an organism, the more difficult it is to make and accumulate useful drastic changes through natural selection. (Page 194)
    • The germane issue is this: the “universal law” governing long-jump adaptation suggests that adaptation on a correlated landscape should show three time scales—an observation that may bear on the Cambrian explosion. Suppose that we are adapting on a correlated, but rugged NK landscape, and begin evolving at an average fitness value. Since the initial position is of average fitness, half of all nearby variants will be better. But because of the correlation structure or shape of the landscape, those nearby variants are only slightly better. In contrast, consider distant variants. Because the initial point is of average fitness, again half the distant variants are fitter. But because the distant variants are far beyond the correlation length of the landscape, some of them can be very much fitter than the initial point. (By the same token, some distant variants can be very much worse.) Now consider an adaptive process in which some mutant variants change only a few genes, and hence search the nearby vicinity, while other variants mutate many genes, and hence search far away. Suppose that the fittest of the variants will tend to sweep through the population the fastest. Thus early in such an adaptive process, we might expect the distant variants, which are very much fitter than the nearby variants, to dominate the process. If the adapting population can branch in more than one direction, this should give rise to a branching process in which distant variants of the initial genotype, differing in many ways from one another as well, emerge rapidly. Thus early on, dramatically variant forms should arise from the initial stem. Just as in the Cambrian explosion, the species exhibiting the different major body plans, or phyla, are the first to appear. (Page 195)
    • Because the fraction of fitter nearby variants dwindles very much more slowly than in the long-jump case. In short, in the mid term of the process, the adaptive branching populations should begin to climb local hills. (Page 195)
    • The implication is this: when fitness is average, the fittest variants will be found far away. As fitness improves, the fittest variants will be found closer and closer to the current position. (Page 196)
      • So with hyperparameter tuning: change many variables initially, then reduce the number of simultaneous changes as fitness levels out, and proceed up the local hill
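      • A minimal sketch of that coarse-to-fine schedule, assuming a toy smooth objective in place of real validation accuracy (the objective, step sizes, and stall threshold are all arbitrary choices):

```python
import random

def fitness(x, y):
    # Hypothetical smooth objective standing in for validation accuracy.
    return -(x - 3.0) ** 2 - (y + 1.0) ** 2

def coarse_to_fine_search(steps=2000, seed=0):
    """Hill climb that starts with big jumps and halves the mutation
    size whenever improvements stall: search far while fitness is
    average, near once it is high."""
    rng = random.Random(seed)
    x, y = rng.uniform(-10, 10), rng.uniform(-10, 10)
    best = fitness(x, y)
    step, stall = 8.0, 0
    for _ in range(steps):
        nx = x + rng.uniform(-step, step)
        ny = y + rng.uniform(-step, step)
        f = fitness(nx, ny)
        if f > best:
            x, y, best = nx, ny, f
            stall = 0
        else:
            stall += 1
            if stall > 20:          # few fitter variants nearby: zoom in
                step, stall = max(step * 0.5, 1e-3), 0
    return best

print(coarse_to_fine_search())  # climbs toward the optimum fitness of 0.0
```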
    • Uniting these two features of rugged but correlated landscapes, we should find radiation that initially both is bushy and occurs among dramatically different variants, and then quiets to scant branching among similar variants later on as fitness increases. (page 198)
    • Despite the fact that human crafting of artifacts is guided by intent and intelligence, both processes often confront problems of conflicting constraints. (Page 202)
      • Dimension reduction is a way of reducing those constraints, but the cost is ignoring the environment. Ideologies must be simple to allow for dense connection without conflict
    • As better designs are found, it becomes progressively harder to find further improvements, so variations become progressively more modest. Insofar as this is true, it is obviously reminiscent of the claims for the Cambrian explosion, where the higher taxa filled in from the top down. (Page 202)
      • This is a design trap. Since designing for more constraints limits hill climbing, designing for individuals and cultures could make everything grind to a halt. Designing for cultures needs to have a light footprint
    • There is something very familiar about this in the context of technological trajectories and learning effects: the rate of finding fitter variants (that is, making better products or producing them more cheaply) slows exponentially, and then ceases when a local optimum is found. This is already almost a restatement of two of the well-known aspects of learning effects. First, the total number of “tries” between finding fitter variants increases exponentially; thus we expect that increasingly long periods will pass with no improvements at all, and then rapid improvements as a fitter variant is suddenly found. Second, adaptive walks that are restricted to search the local neighborhood ultimately terminate on local optima. Further improvement ceases. (Page 204)
    • It seems worthwhile to consider seriously the possibility that the patterns of branching radiation in biological and technological evolution are governed by similar general laws. Not so surprising, this, for all these forms of adaptive evolution are exploring vast spaces of possibilities on more or less rugged “fitness” or “cost” landscapes. If the structures of such landscapes are broadly similar, the branching adaptive processes on them should also be similar. (Page 205)
  • Chapter 10: An Hour upon the Stage
    • The vast puzzle is that the emergent order in communities—in community assembly itself, in coevolution, and in the evolution of coevolution—almost certainly reflects selection acting at the level of the individual organism. (Page 208)
    • Models like those of Lotka and Volterra have provided ecologists with simple “laws” that may govern predator-prey relationships. Similar models study the population changes, or population dynamics, when species are linked into more complex communities with tens, hundreds, or thousands of species. Some of these links are “food webs,” which show which species eat which species. But communities are more complex than food webs, for two species may be mutualists, may be competitors, may be host and parasite, or may be coupled by a variety of other linkages. In general, the diverse populations in such model communities might exhibit simple steady-state patterns of behavior, complex oscillations, or chaotic behavior. (Page 211)
      • Building an ecology for intelligent machines means doing this. I guess we’ll find out what it’s like to build the Garden of Eden
    • Pimm and his colleagues have struggled to understand these phenomena and have arrived at ideas deeply similar to the models of fitness landscapes we discussed in Chapters 8 and 9. Different communities are imagined as points on a community landscape. Change the initial set of species, and the community will climb to a different peak, a different stable community. (Page 212)
    • In these models, Pimm and friends toss randomly chosen species into a “plot” and watch the population trajectories. If any species goes to zero population, hence extinct, it is “removed” from the plot. The results are both fascinating and still poorly understood. What one finds is that, at first, it is easy to add new species, but as more species are introduced, it becomes harder and harder. That is, more randomly chosen species must be tossed into the plot to find one that can survive with the rest of the assembling community. Eventually, the model community is saturated and stable; no further species can be added. (Page 212)
    • The community-assembly simulation studies are fascinating for a number of reasons beyond the distribution of extinction events. In particular, it is not obvious why model communities should “saturate,” so that it becomes increasingly difficult and finally impossible to add new species. If one constructs a “community landscape,” in which each point of the terrain represents a different combination of species, then the peaks will represent points of high fitness—combinations that are stable. While a species navigates a fitness landscape by mutating genes, a community navigates a community landscape by adding or deleting a species. Pimm argues that as communities climb higher and higher toward some fitness peak, the ascension becomes harder and harder. As the climb proceeds, there are fewer directions uphill, and hence it is harder to add new species. At a peak, no new species can be added. Saturation is attained. And from one initial point the community can climb to different local peaks, each representing a different stable community. (Page 214)
      • In belief spaces, this could help to explain the concept of velocity. It is a mechanism for stumbling into new parts of the fitness landscape. And there is something about how ideas go stale.
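      • Pimm-style saturation can be mimicked with a toy packing model. This is an analogy, not his actual population-dynamics simulations: “species” here are random bitstrings, and a candidate joins only if its niche (bit pattern) sits far enough from every resident’s:

```python
import random

def assemble_community(n_bits=12, min_distance=5, max_tries=2000, seed=4):
    """Toy assembly: a random candidate 'species' joins only if its
    bitstring is at Hamming distance >= min_distance from every
    resident. Stops once max_tries candidates in a row all fail,
    i.e. the community has saturated."""
    rng = random.Random(seed)
    def distance(a, b):
        return sum(x != y for x, y in zip(a, b))
    residents, waits, tries = [], [], 0
    while tries < max_tries:
        tries += 1
        cand = [rng.randint(0, 1) for _ in range(n_bits)]
        if all(distance(cand, r) >= min_distance for r in residents):
            residents.append(cand)
            waits.append(tries)   # candidates tried for this addition
            tries = 0
    return residents, waits

community, waits = assemble_community()
print(len(community), waits)  # early additions are cheap; later ones cost more
```

As in Pimm’s plots, the number of random candidates needed per successful addition climbs until no further species can be added.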
    • In a coevolutionary arms race, when the Red Queen dominates, all species keep changing and changing their genotypes indefinitely in a never-ending race merely to sustain their fitness level. (Page 216)
      • This should also apply to belief spaces
    • Two main behaviors are imagined. The first image is of Red Queen behavior, where all organisms keep changing their genotypes in a persistent “arms race,” and hence the coevolving population never settles down to an unchanging mixture of genotypes. The second main image is of coevolving populations within or between species that reach a stable ratio of genotypes, an evolutionary stable strategy, and then stop altering genotypes. Red Queen behavior is, as we will soon see, a kind of chaotic behavior. ESS behavior, when all species stop changing, is a kind of ordered regime. (Page 221)
    • Just as we can use the NK model to show how genes interact with genes or how traits interact with traits within one organism, we can also use it to show how traits interact with traits between organisms in an ecosystem. (Page 225)
    • The ecosystem tends to settle into the ordered, evolutionary stable strategies regime if either epistatic connections, K, within each species are high, so that there are lots of peaks to become trapped on, or if couplings between species, C, is low, so landscapes do not deform much at the adaptive moves of the partners. Or an ESS regime might result when a third parameter, S, the number of species each species interacts with, is low, so that moves by one do not deform the landscapes of many others. (Page 226)
    • There is also a chaotic Red Queen regime where the species virtually never stop coevolving (Figure 10.4c). This Red Queen regime tends to occur when landscapes have few peaks to get trapped on, thus when K is low; when each landscape is deformed a great deal by adaptive moves of other species, thus when C is high; or when S is high so that each species is directly affected by very many other species. Basically, in this case, each species is chasing peaks that move away faster than the species can chase them. (Page 228)
    • At first, it might seem surprising that low K leads to chaotic ecosystems; in the NK Boolean networks, high K led to chaos. The more inter-couplings, the more likely a small change was to propagate throughout and cause the Boolean system to veer off into butterfly behavior. But with coupled landscapes it is the interconnectedness between the species that counts. When intercoupling, C, is high, moves by one species strongly deform the fitness landscapes of its partners. If any trait in the frog is affected by many traits in the fly, and vice versa, then a small change in traits of one species alters the landscape of the other a lot. The system will tend to be chaotic. Conversely, the ecosystem will tend to be in the ordered regime when the couplings between species, C, is sufficiently low. For much the same reason, if we were to keep K and C the same, but change the number of species S any one species directly interacts with, we would find that if the number is low the system will tend to be ordered, while if the number is high the ecosystem will tend to be chaotic. (Page 228)
      • There is something about Tajfel’s opposition identity that might lead to Red Queen scenarios. This would also help to explain the differences between left and right wing behaviours. Right wing is driven by “liberal tears” more than the opposition.
    • In fact, the results of our simulations suggest that the very highest fitness occurs precisely between ordered and chaotic behavior! (Page 228)
    • [Figure: EpistasisTuning]
    • Tuning an ecosystem. As the richness of epistatic connections between species, K, is increased, tuning the ecosystem from the chaotic to the orderly regime, average fitness at first increases and then decreases. It reaches the highest value midway between the extremes. The experiment is based on a 5 × 5 square lattice ecosystem, in which each of 25 species interacts with at most four other species. (Species on the corners of the lattice interact with two neighbors [CON = 2]; species on the edges interact with three neighbors [CON = 3]; and interior species interact with four neighbors [CON = 4]. N = 24, C = 1, S =25.) (page 229)
    • One might start the system with all species having very high K values, coevolving on very rugged landscapes, or all might have very low K values, coevolving on smooth landscapes. If K were not allowed to change, then deep within the high-K ordered regime, species would settle to ESS rapidly; that is, species would climb to poor local peaks and cling to them. In the second, low K, Red Queen chaotic regime, species would never attain fitness peaks. The story no longer stops there, however, for the species can now evolve the ruggedness of their landscapes, and the persistent attempts by species to invade new niches, when successful, will insert a new species into an old niche and may disrupt any ESS attained. (Page 232)
    • [Figure: CoevolvingLandscapes]
    • Figures 10.7 and 10.8 show these results. Each species has N = 44 traits; hence epistatic coupling can be as high as 43, creating random landscapes, or as low as 0, creating Fujiyama landscapes. As generations pass, the average value of K in the coevolving system converges onto an intermediate value of K, 15 to 25, and stays largely within this narrow range of intermediate landscape ruggedness (above). Here fitness is high, and the species do reach ESS equilibria where all genotypes stop changing for considerable periods of time, before some invader or invaders disrupt the balance by driving one or more of the coadapted species extinct. (Page 232)
    • When K is held high or low, deep in the ordered regime or deep in the chaotic regime, huge extinction avalanches rumble through the model ecosystems. The vast sizes of these events reflect the fact that fitness is low deep in the ordered regime because of the high-K conflicting constraints, and fitness is low deep in the chaotic regime because of the chaotic rising and plunging fitness values. In either case, low fitness of a species makes it very vulnerable to invasion and extinction. The very interesting result is that when the coevolving system can adjust its K range, it self-tunes to values where average fitness is as high as possible; therefore, the species are least vulnerable to invasion and extinction, so extinction avalanches appear to be as rare as possible. This shows up in Figure 10.8, which compares the size distribution and total number of extinction events deep in the ordered regime and after the system has self-tuned to optimize landscape ruggedness, K, and fitness. After the ecosystem self-tunes, the avalanches of extinction events remain a power law—the slope is about the same as when deep in the ordered regime. But over the same total number of generations, far fewer extinction events of each size occur. The self-tuned ecosystem also has far fewer extinction events than does an ecosystem deep in the chaotic regime. In short, the ecosystem self-tunes to minimize the rate of extinction! As if by an invisible hand, all the coevolving species appear to alter the rugged structures of the landscapes over which they evolve such that, on average, all have the highest fitness and survive as long as possible. (Page 234)
  • Chapter 11: In Search of Excellence
    • Organisms, artifacts, and organizations all evolve and coevolve on rugged, deforming, fitness landscapes. Organisms, artifacts, and organizations, when complex, all face conflicting constraints. So it can be no surprise if attempts to evolve toward good compromise solutions and designs must seek peaks on rugged landscapes. Nor, since the space of possibilities is typically vast, can it be a surprise that even human agents must search more or less blindly. Chess, after all, is a finite game, yet no grand master can sit at the board after two moves and concede defeat because the ultimate checkmate by the opponent 130 moves later can now be seen as inevitable. And chess is simple compared with most of real life. We may have our intentions, but we remain blind watchmakers. We are all, cells and CEOs, rather blindly climbing deforming fitness landscapes. If so, then the problems confronted by an organization—cellular, organismic, business, governmental, or otherwise—living in niches created by other organizations, is preeminently how to evolve on its deforming landscape, to track the moving peaks. (Page 247)
    • Evolution is a search procedure on rugged fixed or deforming landscapes. No search procedure can guarantee locating the global peak in an NP-hard problem in less time than that required to search the entire space of possibilities. And that, as we have repeatedly seen, can be hyperastronomical. Real cells, organisms, ecosystems, and, I suspect, real complex artifacts and real organizations never find the global optima of their fixed or deforming landscapes. The real task is to search out the excellent peaks and track them as the landscape deforms. Our “patches” logic appears to be one way complex systems and organizations can accomplish this. (Page 248)
    • The basic idea of the patch procedure is simple: take a hard, conflict-laden task in which many parts interact, and divide it into a quilt of nonoverlapping patches. Try to optimize within each patch. As this occurs, the couplings between parts in two patches across patch boundaries will mean that finding a “good” solution in one patch will change the problem to be solved by the parts in the adjacent patches. Since changes in each patch will alter the problems confronted by the neighboring patches, and the adaptive moves by those patches in turn will alter the problem faced by yet other patches, the system is just like our model coevolving ecosystems. Each patch is the analogue of what we called a species in Chapter 10. Each patch climbs toward fitness peaks on its own landscape, but in doing so deforms the fitness landscapes of its partners. As we saw, this process may spin out of control in Red Queen chaotic behavior and never converge on any good overall solution. Here, in this chaotic regime, our system is a crazy quilt of ceaseless changes. Alternatively, in the analogue of the evolutionary stable strategy (ESS) ordered regime, our system might freeze up, getting stuck on poor local peaks. Ecosystems, we saw, attained the highest average fitness if poised between Red Queen chaos and ESS order. We are about to see that if the entire conflict-laden task is broken into the properly chosen patches, the coevolving system lies at a phase transition between order and chaos and rapidly finds very good solutions. Patches, in short, may be a fundamental process we have evolved in our social systems, and perhaps elsewhere, to solve very hard problems. (Page 253)
    • It is the very fact that patches coevolve with one another that begins to hint at powerful advantages of patches compared with the Stalinist limit of a single large patch. What if, in the Stalinist limit, the entire lattice settles into a “bad” local minimum, one with high energy rather than an excellent low-energy minimum? The single-patch Stalinist system is stuck forever in the bad minimum. Now let’s think a bit. If we break the lattice up into four 5 × 5 patches just after the Stalinist system hits this bad minimum, what is the chance that this bad minimum is not only a local minimum for the lattice as a whole, but also a local minimum for each of the four 5 × 5 patches individually? You see, in order for the system broken into four patches to “stay” at the same bad minimum, it would have to be the case that the same minimum of the entire lattice happens also to be a minimum for all four of the 5 × 5 patches individually. If not, one or more of the patches will be able to flip a part, and hence begin to move. Once one patch begins to move, the entire lattice is no longer frozen in the bad local minimum. (Page 256)
    • Breaking large systems into patches allows the patches literally to coevolve with one another. Each climbs toward its fitness peaks, or energy minima, but its moves deform the fitness landscape or energy landscape of neighboring patches. (Page 257)
    • In the chaotic Leftist Italian limit, the average energy achieved by the lattice is only a slight bit less, about 0.47. In short, if the patches are too numerous and too small, the total system is in a disordered, chaotic regime. Parts keep flipping between their states, and the average energy of the lattice is high. (Page 258)
    • The answer depends on how rugged the landscape is. Our results suggest that if K is low so the landscape is highly correlated and quite smooth, the best results are found in the Stalinist limit. For simple problems with few conflicting constraints, there are few local minima in which to get trapped. But as the landscape becomes more rugged, reflecting the fact that the underlying number of conflicting constraints is becoming more severe, it appears best to break the total system into a number of patches such that the system is near the phase transition between order and chaos. (Page 258)
    • Here, then, is the first main and new result. It is by no means obvious that the lowest total energy of the lattice will be achieved if the lattice is broken into quilt patches, each of which tries to lower its own energy regardless of the effects on surrounding patches. Yet this is true. It can be a very good idea, if a problem is complex and full of conflicting constraints, to break it into patches, and let each patch try to optimize, such that all patches coevolve with one another. (Page 262)
    • But what, if anything, characterizes the optimum patch-size distribution? The edge of chaos. Small patches lead to chaos; large patches freeze into poor compromises. When an intermediate optimum patch size exists, it is typically very close to a transition between the ordered and the chaotic regime. (Page 262)
      • I’m pretty sure that this can be determined iteratively and within a desired epsilon. It should resemble the way a neural net converges on an accuracy.
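      • The patch logic can be sketched on a toy NK-style ring (the lattice size, patch size, and greedy move rule are all assumptions, not Kauffman’s exact setup). Each patch accepts a flip only if it lowers its own sites’ energy, ignoring spillover into neighbours; setting the patch size to the whole lattice gives the Stalinist limit:

```python
import random

def make_site_energy(N, seed):
    """Random NK-style energies: each site's energy depends on itself
    and its two ring neighbours (K = 2)."""
    rng = random.Random(seed)
    table = [[rng.random() for _ in range(8)] for _ in range(N)]
    def site_energy(s, i):
        idx = 4 * s[(i - 1) % N] + 2 * s[i] + s[(i + 1) % N]
        return table[i][idx]
    return site_energy

def optimize(patch_size, N=60, flips=2000, seed=7):
    """Greedy bit flips where a patch accepts a flip only if it lowers
    the energy of its *own* sites. patch_size == N is the single-patch
    'Stalinist' limit; small patches coevolve with their neighbours."""
    rng = random.Random(seed)
    site_energy = make_site_energy(N, seed + 1)
    s = [rng.randint(0, 1) for _ in range(N)]
    for _ in range(flips):
        i = rng.randrange(N)
        patch = range((i // patch_size) * patch_size,
                      min((i // patch_size + 1) * patch_size, N))
        before = sum(site_energy(s, j) for j in patch)
        s[i] ^= 1
        if sum(site_energy(s, j) for j in patch) >= before:
            s[i] ^= 1              # revert: no gain for this patch
    return sum(site_energy(s, j) for j in range(N)) / N

whole = optimize(patch_size=60)    # Stalinist limit: one big patch
patched = optimize(patch_size=6)   # quilt of coevolving patches
print(whole, patched)
```

Sweeping patch_size (and comparing final average energies across seeds) is one way to probe for the intermediate optimum the text describes.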
    • I find it fascinating that hard problems with many linked variables and loads of conflicting constraints can be well solved by breaking the entire problem into nonoverlapping domains. Further, it is fascinating that as the conflicting constraints become worse, patches become ever more helpful. (Page 264)
    • I suspect that analogues of patches, systems having various kinds of local autonomy, may be a fundamental mechanism underlying adaptive evolution in ecosystems, economic systems, and cultural systems. (Page 254)
    • We are constructing global communication networks, and whipping off into space in fancy tin cans powered by Newton’s third law. The Challenger disaster, brownouts, the Hubble trouble, the hazards of failure in vast linked computer networks—our design marvels press against complexity boundaries we do not understand. (Page 265)
    • Patching systems so that they are poised on the edge of chaos may be extremely useful for two quite different reasons: not only do such systems rapidly attain good compromise solutions, but even more essentially, such poised systems should track the moving peaks on a changing landscape very well. The poised, edge-of-chaos systems are “nearly melted.” Suppose that the total landscape changes because external conditions alter. Then the detailed locations of local peaks will shift. A rigid system deep in the ordered regime will tend to cling stubbornly to its peaks. Poised systems should track shifting peaks more fluidly. (Page 266)
    • Misspecification arises all the time. Physicists and biologists, trying to figure out how complex biopolymers such as proteins fold their linear sequence of amino acids into compact three-dimensional structures, build models of the landscape guiding such folding and solve for the deep energy minima. Having done so, the scientists find that the real protein does not look like the predicted one. The physicists and biologists have “guessed” the wrong potential surface; they have guessed the wrong landscape and hence have solved the wrong hard problem. They are not fools, for we do not know the right problem. (Page 266)
      • Same for Hyperparameter tuning
    • We must learn how to learn in the face of persistent misspecification. Suppose we model the production facility, and learn from that model that a particular way to break it into patches is optimal, allowing the system to converge on a suggested solution. If we have misspecified the problem, the detailed solution is probably of little value. But it may often be the case that the optimal way to break the problem into patches is itself very insensitive to misspecifications of the problem. In the NK lattice and patch model we have studied, a slight change in the NK landscape energies will shift the locations of the minima substantially, but may not alter the fact that the lattice should still be broken into 6 × 6 patches. Therefore, rather than taking the suggested solution to the misspecified problem and imposing it on the real facility, it might be far smarter to take the suggested optimal patching of the misspecified problem, impose that on the real production facility, and then try to optimize performance within each of the now well-defined patches. In short, learning how to optimize the misspecified problem may not give us the solution to the real problem, but may teach us how to learn about the real problem, how to break it into quilt patches that coevolve to find excellent solutions. (Page 267)
      • This is really worth looking at, because it can apply to round-tripping between simulations and real-world systems as well. A fitness test could be the time to divergence.
    • Receiver-based communication is roughly this: all the agents in a system that is trying to coordinate behavior let other agents know what is happening to them. The receivers of this information use it to decide what they are going to do. The receivers base their decisions on some overall specification of “team” goal. (Page 268)
    • [Figure: ReceiverAttention]
    • This observation suggests that it might be useful if, in our receiver-based communication system, we allowed sites to ignore some of their customers. Let’s say that each site pays attention to itself and a fraction, P, of its customers, and ignores 1 – P of them. What happens if we “tune” P? What happens is shown in Figure 11.8. The lowest energy for the entire lattice occurs when a small fraction of customers is ignored! As Figure 11.8 shows, if each site tries to help itself and all its customers, the system does less well than if each site pays attention, on average, to about 95 percent of its customers. In the actual numerical simulation, we do this by having each site consider each of its customers and pay attention to that customer with a 95 percent probability. In the limit where each site pays attention to no customers, of course, energy of the entire lattice is very high, and hence bad. (Page 268)
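      • The Figure 11.8 experiment might be sketched on the same kind of toy ring (the energy table, neighbour structure, and flip counts are assumptions, not the book’s setup): a site accepts a flip only if it helps itself plus a random ~P fraction of the neighbours whose energy terms it enters.

```python
import random

def run_lattice(P, N=60, flips=4000, seed=11):
    """Receiver-based tweak: a flip at site i is kept only if it lowers
    the summed energy of i plus a random ~P fraction of its 'customers'
    (the ring neighbours whose energies depend on i). P = 1.0 means
    every site listens to all of its customers."""
    rng = random.Random(seed)
    trng = random.Random(99)
    table = [[trng.random() for _ in range(8)] for _ in range(N)]
    def e(s, i):
        return table[i][4 * s[(i - 1) % N] + 2 * s[i] + s[(i + 1) % N]]
    s = [rng.randint(0, 1) for _ in range(N)]
    for _ in range(flips):
        i = rng.randrange(N)
        audience = [i] + [j for j in ((i - 1) % N, (i + 1) % N)
                          if rng.random() < P]
        before = sum(e(s, j) for j in audience)
        s[i] ^= 1
        if sum(e(s, j) for j in audience) >= before:
            s[i] ^= 1              # revert: the audience is not helped
    return sum(e(s, j) for j in range(N)) / N

for P in (0.0, 0.95, 1.0):
    print(P, round(run_lattice(P), 3))
```

Averaging the final energies over many seeds for each P is the analogue of tuning P in the book’s numerical experiment.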
  • Chapter 12: An Emerging Global Civilization
    • Catalytic closure is not mysterious. But it is not a property of any single molecule; it is a property of a system of molecules. It is an emergent property. (Page 275)
    • But Fontana found a second type of reproduction. If he “disallowed” general copiers, so they did not arise and take over the soup, he found that he evolved precisely what I might have hoped for: collectively autocatalytic sets of Lisp expressions. That is, he found that his soup evolved to contain a “core metabolism” of Lisp expressions, each of which was formed as the product of the actions of one or more other Lisp expressions in the soup. (Page 278)
    • Fontana called copiers “level-0 organizations” and autocatalytic sets “level-1 organizations.” (Page 279)
    • The ever-transforming economy begins to sound like the ever-transforming biosphere, with trilobites dominating for a long, long run on Main Street Earth, replaced by other arthropods, then others again. If the patterns of the Cambrian explosion, filling in the higher taxa from the top down, bespeak the same patterns in early stages of a technological trajectory when many strong variants of an innovation are tried until a few dominant designs are chosen and the others go extinct, might it also be the case that the panorama of species evolution and coevolution, ever transforming, has its mirror in technological coevolution as well? Maybe principles deeper than DNA and gearboxes underlie biological and technological coevolution, principles about the kinds of complex things that can be assembled by a search process, and principles about the autocatalytic creation of niches that invite the innovations, which in turn create yet further niches. It would not be exceedingly strange were such general principles to exist. Organismic evolution and coevolution and technological evolution and coevolution are rather similar processes of niche creation and combinatorial optimization. While the nuts-and-bolts mechanisms underlying biological and technological evolution are obviously different, the tasks and resultant macroscopic features may be deeply similar. (Page 281)
    • The difficulty derives from the fact that economists have no obvious way to build a theory that incorporates what they call complementarities. The automobile and gasoline are consumption complementarities. You need both the car and the gas to go anywhere. (Page 282)
    • The use, I claim, is that we can discover the kinds of things that we would expect in the real world if our “as if” mock-up of the true laws lies in the same universality class. Physicists roll out this term, “universality class,” to refer to a class of models all of which exhibit the same robust behavior. So the behavior in question does not depend on the details of the model. Thus a variety of somewhat incorrect models of the real world may still succeed in telling us how the real world works, as long as the real world and the models lie in the same universality class. (Page 283)
    • An “enzyme” might be a symbol string in the same pot with a “template matching” (000) site somewhere in it. Here the “enzyme match rule” is that a 0 on the enzyme matches a 1 on the substrate, rather like nucleotide base-pairing. Then given such a rule for “enzymatic sites,” we can allow the symbol strings in the pot to act on one another. One way is to imagine two randomly chosen symbol strings colliding. If either string has an “enzymatic site” that matches a “substrate site” on the other, then the enzymatic site “acts on” the substrate site and carries out the substitution mandated in the corresponding row (Page 285)
    • Before we turn to economic models, let us consider some of the kinds of things that can happen in our pot of symbol strings as they act on one another, according to the laws of substitution we might happen to choose. A new world of possibilities lights up and may afford us clues about technological and other forms of evolution. Bear in mind that we can consider our strings as models of molecules, models of goods and services in an economy, perhaps even models of cultural memes such as fashions, roles, and ideas. Bear in mind that grammar models give us, for the first time, kinds of general “mathematical” or formal theories in which to study what sorts of patterns emerge when “entities” can be both the “object” acted on and transformed and the entities that do the acting, creating niches for one another in their unfolding. Grammar models, therefore, help make evident patterns we know about intuitively but cannot talk about very precisely. (Page 287)
    • These grammar models also suggest a possible new factor in economic takeoff: diversity probably begets diversity; hence diversity may help beget growth. (Page 292)

        Diversity begets growth opportunities. Pure growth is fastest in a monoculture of simple items with short maturity cycles.

    • Diversity: The number of renewable goods with which an economy is endowed is plotted against the number of pairs of symbol strings in the grammar, which captures the hypothetical “laws of substitutability and complementarity.” A curve separates a subcritical regime below the curve and a supracritical regime above the curve. As the diversity of renewable resources or the complexity of the grammar rules increases, the system explodes with a diversity of products. (Page 193)
    • Friend, you cannot even predict the motions of three coupled pendula. You have hardly a prayer with three mutually gravitating objects. We let loose pesticides on our crops; the insects become ill and are eaten by birds that sicken and die, allowing the insects to proliferate in increased abundance. The crops are destroyed. So much for control. Bacon, you were brilliant, but the world is more complex than your philosophy. (Page 302)
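The “enzyme match rule” from page 285 is easy to make concrete. Below is a minimal sketch of my own, not Kauffman's code: an enzyme's template site matches its complement on a substrate string (a 0 on the enzyme pairs with a 1 on the substrate), and a matched site is rewritten according to a grammar table. The table itself is an invented example, since the book leaves the substitution rules open.

```python
# Toy version of the "enzyme match rule": 0 on the enzyme matches 1 on the
# substrate, like nucleotide base-pairing. The grammar row below is a
# hypothetical substitution, not one from the book.

GRAMMAR = {"111": "00"}  # assumed rule: rewrite a matched "111" site as "00"

def complement(site):
    """Base-pairing complement of a template site: "000" -> "111"."""
    return "".join("1" if c == "0" else "0" for c in site)

def act(enzyme_site, substrate):
    """If the substrate contains the complement of the enzyme's template
    site, apply the grammar substitution at the first match."""
    target = complement(enzyme_site)
    i = substrate.find(target)
    if i < 0 or target not in GRAMMAR:
        return substrate  # no enzymatic site matches: substrate unchanged
    return substrate[:i] + GRAMMAR[target] + substrate[i + len(target):]

print(act("000", "0111010"))  # -> "000010"
```

Iterating `act` over randomly colliding pairs of strings in a “pot” would give the soup of mutually transforming symbol strings the notes describe.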

When Worlds Collide


Charlottesville demonstrations, summer 2017 (link)

I’ve been thinking about this picture a lot recently. My research explores how extremist groups can develop using modern computer-mediated communication, particularly recommender systems. This picture lays out the main parts like a set of nested puzzle pieces.

This is a picture of a physical event. In August 2017, various “Alt-Right” online communities came to Charlottesville, Virginia, ostensibly to protest the removal of Confederate statues, which in turn was a response to the Charleston, South Carolina, church shooting of 2015. From August 11th through 12th, sanctioned and unsanctioned protests and counterprotests happened in and around Emancipation Park.

Although this is not a runway in Paris, London, or New York, this photo contains what I can best call “fashion statements,” in the most serious sense of the term. They are mechanisms for signifying and conveying identity, immediately visible. What are they trying to say to each other and to us, the public behind the camera?

Standing on the right-hand side of the image is a middle-aged white man wearing a type of uniform: on his cap and shirt are images of the Confederate “battle flag”. He wears a military-style camouflage vest and carries an AR-15 rifle and a 9mm handgun. These are archetypal components of the Alt-right identity.

He is yelling at a young black man moving in from the left side of the photo, who is also wearing a uniform of a sort. In addition to the black t-shirt and the dreadlocks, he is carrying multiple cameras – the sine qua non of credibility for young black men in modern America. Lastly, he is wearing literal chains and shackles, ensuring that no one will forget the slave heritage behind these protests.

Let’s consider these carried items, the cameras and the guns. The fashion accessories, if you will.

Cameras exist to record a selected instant of reality. It may be framed, with parts left out and others enhanced, but photographs and videos are a compelling document that something in the world happened. Further, these are internet-connected cameras, capable of sharing their content widely and quickly. These two elements, photographic evidence and distribution, are a foundation of the #blacklivesmatter movement, which arose in response to the wide distribution of videos of American police killing unarmed black men. These videos changed the greater social understanding of a reality encountered by a minority that was incomprehensible to the majority before the videos emerged.

Now to the other accessory, the guns. They are mechanisms “of violence to compel our opponent to fulfil our will”. Unlike cameras, which are used to provide a perspective on reality, these weapons are used to create a reality through their display and their threatened use. They also reflect a perception among those who wield them that the world has become so threatening that battlefield weapons make sense at a public event.

Oddly, this may also be a picture of an introduction of sorts. Alt-right and #blacklivesmatter groups almost certainly interact rarely, if at all. In fact, it is doubtful that, even though they speak a common language, one group can comprehend the other. The trajectories of their defining stories are so different, so misaligned, that the concepts of one slide off the brain of the other.

Within each group, it is a different story. Each group shares a common narrative that is expressed in words, appearance, and belief. And within each group, there is discussion and consensus. The two men are the most extreme examples of the people that we see in the photo; I don’t see anyone else in the image wearing chains or openly carrying guns. The presence of these individuals within their respective groups exerts a pull on the group’s overall orientation and position, on the things that it will accept. Additionally, the individuals in one group can cluster in opposition to a different group, a pressure that drives the groups further apart.

Lastly, we come to the third actor in the image, the viewer. The photo was taken by Shelby Lum, an award-winning staff photographer for the Richmond Times-Dispatch. Through framing, focus, and timing, she captures the frame that tells this story. Looking at this photo, we the audience feel that we understand the situation. But photographs are inherently simplifying. The audience fills in the gaps: what happened before, the backstory of the people in the image. This image can mean many things to many people. And as such, it’s what we do with the photo, what we say about it and what we connect with it, that makes the image as much about us as it is about the characters within the frame.

It is those interactions that I focus on: the ways that we as populations interact with information that supports, expands, or undermines our beliefs. My theory is that humans move through belief space like animals move on the plains of the Serengeti. And just as the status of an ecosystem can be inferred from the behaviors of its animal population, the health and status of our belief spaces can be determined from our digital behaviors.

Using this approach, I believe that we may be able to look at populations at scale to determine the “health” of the underlying information. Wildebeest behave differently in risky environments: their patterns of congregation are different, and they can stampede, particularly when the terrain is against them, such as at a narrow water crossing. Humans can behave in similar ways when their core beliefs about their identity are challenged, as when Galileo was tried by the church for essentially moving man from the literal center of the universe.

I think that this sort of approach can be used to identify at-risk (stampeding) groups and provide avenues for intervention that can “nudge” groups off of dangerous trajectories. It may also be possible to recognize the presence of deliberate actors attempting to drive groups into dangerous terrain, like Native Americans driving buffalo off of pishkun cliffs or, more recently, the Russian Internet Research Agency instigating and coordinating a #bluelivesmatter and a #blacklivesmatter demonstration to occur at the same time and place in Texas.

This theory is based on simulations built on the assumption that people coordinate in high-dimensional belief spaces based on orientation, velocity, and social influence. Rather than coming to a static consensus, these interactions are dynamic and follow intuitions of belief movement across an information terrain. That dynamic process is what I’ll be discussing over the next several posts.
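The core of those assumptions can be sketched in a few lines. The following is my own toy illustration, not the actual simulation code: agents have a position in a low-dimensional “belief space” (two dimensions here, purely for readability), a heading, and a social-influence radius within which they align with their neighbors, flocking-style. The constants are invented parameters.

```python
import random

DIMS = 2        # belief-space dimensions; kept low for readability
RADIUS = 0.5    # assumed social-influence horizon
ALIGN = 0.1     # assumed strength of social influence on heading

def dist(a, b):
    """Euclidean distance between two belief positions."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

class Agent:
    def __init__(self):
        self.pos = [random.uniform(-1, 1) for _ in range(DIMS)]
        self.vel = [random.uniform(-0.1, 0.1) for _ in range(DIMS)]

    def step(self, others):
        # Average the headings of agents within the influence radius,
        # then nudge this agent's own heading toward that average.
        near = [o for o in others
                if o is not self and dist(o.pos, self.pos) < RADIUS]
        if near:
            for d in range(DIMS):
                avg = sum(o.vel[d] for o in near) / len(near)
                self.vel[d] += ALIGN * (avg - self.vel[d])
        self.pos = [p + v for p, v in zip(self.pos, self.vel)]

random.seed(1)
flock = [Agent() for _ in range(30)]
for _ in range(100):
    for agent in flock:
        agent.step(flock)
```

In this framing, a “stampede” is what you see when alignment dominates: the whole population ends up sharing one heading regardless of where the terrain leads.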

Karl Marx and the Tradition of Western Political Thought – The Broken Thread of Tradition

Hannah Arendt – Thinking Without a Banister

These two connected statements had already been torn asunder by a tradition that translated the one by declaring that man is a social being, a banality for which one would not have needed Aristotle, and the other by defining man as the animal rationale, the reasoning animal. (pg 23)

What Aristotle had seen as one and the same human quality, to live together with others in the modus of speaking, now became two distinct characteristics, to have reason and to be social. And these two characteristics, almost from the beginning, were not thought merely to be distinct, but antagonistic to each other: the conflict between man’s rationality and his sociability can be seen throughout our tradition of political thought (pg 23)

The law was now no longer the boundary (which the citizens ought to defend like the walls of the city, because it had the same function for the citizens’ political life as the city’s wall had for their physical existence and distinctness, as Heraclitus had said), but became a yardstick by which rule could be measured. Rule now either conformed to or overruled the law, and in the latter case the rule was called tyrannical usually, although not necessarily, exerted by one man-and therefore a kind of perverted monarchy. From then on, law and power became the two conceptual pillars of all definitions of government, and these definitions hardly changed during the more than two thousand years that separate Aristotle from Montesquieu. (pg 28)

But bureaucracy should not be mistaken for totalitarian domination. If the October Revolution had been permitted to follow the lines prescribed by Marx and Lenin, which was not the case, it would probably have resulted in bureaucratic rule. The rule of nobody, not anarchy, or disappearance of rule, or oppression, is the ever present danger of any society based on universal equality. (pg 33)

In Marx’s own opinion, what made socialism scientific and distinguished it from that of his predecessors, the “utopian socialists,” was not an economic theory with its scientific insights as well as its errors, but the discovery of a law of movement that ruled matter and, at the same time, showed itself in the reasoning capacity of man as “consciousness,” either of the self or of a class. (pg 35)

The logic of dialectal movement enables Marx to combine nature with history, or matter with man; man becomes the author of a meaningful, comprehensible history because his metabolism with nature, unlike an animal’s, is not merely consumptive but requires an activity, namely, labor. For Marx labor is the uniting link between matter and man, between nature and history. He is a “materialist” insofar as the specifically human form of consuming matter is to him the beginning of everything (pg 35)

Politics, in other words, is derivative in a twofold sense: it has its origin in the pre-political data of biological life, and it has its end in the post-political, highest possibility of human destiny (pg 40)

the fact that the multitude, whom the Greeks called hoi polloi, threatens the existence of every single person, runs like a red thread throughout the centuries that separate Plato from the modern age. In this context it is irrelevant whether this attitude expresses itself in secular terms, as in Plato and Aristotle, or if it does so in the terms of Christianity. (pg 40)

true Christians wohnen fern voneinander, that is, dwell far from each other and are as forlorn among the multitude as were the ancient philosophers. (pg 41)

Each new birth endangers the continuity of the polis because with each new birth a new world potentially comes into being. The laws hedge in these new beginnings and guarantee the preexistence of a common world, the permanence of a continuity that transcends the individual life span of each generation, and in which each single man in his mortality can hope to leave a trace of permanence behind him. (pg 46)

introduced the terms nomo and physei, by law or by nature. Thus, the order of the universe, the kosmos of natural things, was differentiated from the world of human affairs, whose order is laid down by men since it is an order of things made and done by men. This distinction, too, survives in the beginning of our tradition, where Aristotle expressly states that political science deals with things that are nomo and not physei. (pg 47)



Beyond Individual Choice

Beyond Individual Choice: Teams and Frames in Game Theory

  • Michael Bacharach
  • Natalie Gold
  • Robert Sugden
  • In the classical tradition of game theory, Bacharach models human beings as rational actors, but he revises the standard definition of rationality to incorporate two major new ideas. He enlarges the model of a game so that it includes the ways agents describe to themselves (or “frame”) their decision problems. And he allows the possibility that people reason as members of groups (or “teams”), each taking herself to have reason to perform her component of the combination of actions that best achieves the group’s common goal. Bacharach shows that certain tendencies for individuals to engage in team reasoning are consistent with recent findings in social psychology and evolutionary biology.
  • The following list of notes is oldest (bottom) to newest (top)
  • It is a central component of resolute choice, as presented by McClennen, that (unless new information becomes available) later transient agents recognise the authority of plans made by earlier agents. Being resolute just is recognising that authority (although McClennen’ s arguments for the rationality and psychological feasibility of resoluteness apply only in cases in which the earlier agents’ plans further the common ends of earlier and later agents). This feature of resolute choice is similar to Bacharach’ s analysis of direction, explained in section 5. If the relationship between transient agents is modelled as a sequential game, resolute choice can be thought of as a form of direction, in which the first transient agent plays the role of director; the plan chosen by that agent can be thought of as a message sent by the director to the other agents. To the extent that each later agent is confident that this plan is in the best interests of the continuing person, that confidence derives from the belief that the first agent identified with the person and that she was sufficiently rational and informed to judge which sequence of actions would best serve the person’s objectives. (pg 197)
  • The problem posed by Heads and Tails is not that the players lack a common understanding of salience; it is that game theory lacks an adequate explanation of how salience affects the decisions of rational players. All we gain by adding preplay communication to the model is the realisation that game theory also lacks an adequate explanation of how costless messages affect the decisions of rational players. (pg 180)
  • The fundamental principle of this morality is that what each agent ought to do is to co-operate, with whoever else is co-operating, in the production of the best consequences possible given the behaviour of non-co-operators’ (Regan 1980, p. 124). (pg 167)
  • Ordered On Social Facts
    • Are social groups real in any sense that is independent of the thoughts, actions, and beliefs of the individuals making up the group? Using methods of philosophy to examine such longstanding sociological questions, Margaret Gilbert gives a general characterization of the core phenomena at issue in the domain of human social life.
  • Schema 3: Team reasoning (from a group viewpoint) pg 153
    • We are the members of S.
    • Each of us identifies with S.
    • Each of us wants the value of U to be maximized.
    • A uniquely maximizes U.
    • Each of us should choose her component of A.
  • Schema 4: Team reasoning (from an individual viewpoint) pg 159
    • I am a member of S.
    • It is common knowledge in S that each member of S identifies
      with S.
    • It is common knowledge in S that each member of S wants the
      value of U to be maximized.
    • It is common knowledge in S that A uniquely maximizes U.
    • I should choose my component of A.
  • Schema 7: Basic team reasoning pg 161
    • I am a member of S.
    • It is common knowledge in S that each member of S identifies
      with S.
    • It is common knowledge in S that each member of S wants the
      value of U to be maximized.
    • It is common knowledge in S that each member of S knows his
      component of the profile that uniquely maximizes U.
    • I should choose my component of the profile that uniquely
      maximizes U.

      • Bacharach notes to himself the ‘hunch’ that this schema is ‘the basic rational capacity’ which leads to high in Hi-Lo, and that it ‘seems to be indispensable if a group is ever to choose the best plan in the most ordinary organizational circumstances’. Notice that Schema 7 does not require that the individual who uses it know everyone’s component of the profile that maximizes U.
  • His hypothesis is that group identification is an individual’s psychological response to the stimulus of a particular decision situation. It is not in itself a group action. (To treat it as a group action would, in Bacharach’ s framework, lead to an infinite regress.) In the theory of circumspect team reasoning, the parameter w is interpreted as a property of a psychological mechanism-the probability that a person who confronts the relevant stimulus will respond by framing the situation as a problem ‘for us’. The idea is that, in coming to frame the situation as a problem ‘for us’, an individual also gains some sense of how likely it is that another individual would frame it in the same way; in this way, the value of w becomes common knowledge among those who use this frame. (Compare the case of the large cube in the game of Large and Small Cubes, discussed in section 4 of the introduction.) Given this model, it seems that the ‘us’ in terms of which the problem is framed must be determined by how the decision situation first appears to each individual. Thus, except in the special case in which w == 1, we must distinguish S (the group with which individuals are liable to identify, given the nature of the decision situation) from T (the set of individuals who in fact identify with S). pg 163
  • The psychology of group identity allows us to understand that group identification can be due to factors that have nothing to do with the individual preferences. Strong interdependence and other forms of common individual interest are one sort of favouring condition, but there are many others, such as comembership of some existing social group, sharing a birthday, and the artificial categories of the minimal group paradigm. (pg 150)
  • Wherever we may expect group identity we may also expect team reasoning. The effect of team reasoning on behavior is different from that of individualistic reasoning. We have already seen this for Hi-Lo. This has wide implications. It makes the theory of team reasoning a much more powerful explanatory and predictive theory than it would be if it came on line only in games with the right kind of common interest. To take just one example, if management brings it about so that the firm’s employees identify with the firm, we may expect them to team-reason and so to make choices that are not predicted by the standard theories of rational choice. (pg 150)
  • As we have seen, the same person passes through many group identities in the flux of life, and even on a single occasion more than one of these identities may be stimulated. So we will need a model of identity in which the probability of a person’s identification is distributed over not just two alternatives-personal self-identity or identity with a fixed group-but, in principle, arbitrarily many. (pg 151)
  • The explanatory potential of team reasoning is not confined to pure coordination games like Hi-Lo. Team reasoning is assuredly important for its role in explaining the mystery facts about Hi-Lo; but I think we have stumbled on something bigger than a new theory of behaviour in pure coordination games. The key to endogenous group identification is not identity of interest but common interest giving rise to strong interdependence. There is common interest in Stag Hunts, Battles of the Sexes, bargaining games and even Prisoner’s Dilemmas. Indeed, in any interaction modelable as a ‘mixed motive’ game there is an element of common interest. Moreover, in most of the landmark cases, including the Prisoner’s Dilemma, the common interest is of the kind that creates strong interdependence, and so on the account of chapter 2 creates pressure for group identification. And given group identification, we should expect team reasoning. (pg 144)
  • There is a second evolutionary argument in favour of the spontaneous team-reasoning hypothesis. Suppose there are two alternative mental mechanisms that, given common interest, would lead humans to act to further that interest. Other things being equal, the cognitively cheapest reliable mechanism will be favoured by selection. As Sober and Wilson (1998) put it, mechanisms will be selected that score well on availability, reliability and energy efficiency. Team reasoning meets these criteria; more exactly, it does better on them than the alternative heuristics suggested in the game theory and psychology literature for the efficient solution of common-interest games. (pg 146)
  • BIC_pg 149 (pg 149)
  • I think MB is getting at the theory for why there is explore/exploit in populations
  • We have progressed towards a plausible explanation of the behavioural fact about Hi-Lo. It is explicable as an outcome of group identification by the players, because this is likely to produce a way of reasoning, team reasoning, that at once yields A. Team reasoning satisfies the conditions for the mode-P reasoning that we concluded in chapter 1 must be operative if people are ever to reason their way to A. It avoids magical thinking. It takes the profile-selection problem by the scruff of the neck. What explains its onset is an agency transformation in the mind of the player; this agency transformation leads naturally to profile-based reasoning and is a natural consequence of self-identification with the player group. (pg 142)
  • Hi-Lo induces group identification. A bit more fully: the circumstances of Hi-Lo cause each player to tend to group-identify as a member of the group G whose membership is the player-set and whose goal is the shared payoff. (pg 142)
  • If what induces A-choices is a piece of reasoning which is part of our mental constitution, we are likely to have the impression that choosing A is obviously right. Moreover, if the piece of reasoning does not involve a belief that the coplayer is bounded, we will feel that choosing A is obviously right against a player as intelligent as ourselves; that is, our intuitions will be an instance of the judgemental fact. I suspect, too, that if the reasoning schema we use is valid, rather than involving fallacy, our intuitions of reality are likely to be more robust. Later I shall argue that team reasoning is indeed nonfallacious. (pg 143)
    • I think this is more than “as intelligent as ourselves”, I think this is a position/orientation/velocity case. I find it compelling that people with different POVs regard each other as ‘stupid’
    • When framing tendencies are culture-wide, people in whom a certain frame is operative are aware that it may be operative in others; and if its availability is high, those in it think that it is likely to be operative in others. Here the framing tendency is-so goes my claim-universal, and a fortiori it is culture-wide. (pg 144)
    • But for the theory of endogenous team reasoning there are two differences between the Hi-Lo case and these other cases of strong interdependence. First, outside Hi-Los there are counterpressures towards individual self-identification and so I-framing of the problem. In my model this comes out as a reduction in the salience of the strong interdependence, or an increase in that of other features. One would expect these pressures to be very strong in games like Prisoner’s Dilemma, and the fact that C rates are in the 40 per cent range rather than the 90 percent range, so far from surprising, is a prediction of the present theory (pg 144)
      • This is where MB starts to get to explore/exploit in populations. There are pressures that drive groups together and apart. And as individuals, our thresholds for group identification vary.
  • Now it is the case, and increasingly widely recognized to be, that in games in general there’s no way players can rationally deliberate to a Nash equilibrium. Rather, classical canons of rationality do not in general support playing in Nash equilibria. So it looks as though shared intentions cannot, in the general run of games, by classical canons, be rationally formed! And that means in the general run of life as well. This is highly paradoxical if you think that rational people can have shared intentions. The paradox is not resolved by the thought that when they do, the context is not a game: any situation in which people have to make the sorts of decisions that issue in shared intentions must be a game, which is, after all, just a situation in which combinations of actions matter to the combining parties. (pg 139)
  • Turn to the idea that a joint intention to do (x,y) is rationally produced in 1 and 2 by common knowledge of two conditional intentions: Pl has the intention expressed by ‘I’ll do x if and only if she does y’, and P2 the counterpart one. Clearly P1 doesn’t have the intention to do x if and. only if P2 in fact does y whether or not Pl believes P2 will do y; the right condition must be along the lines of:
    (C1) P1 intends to do x if and only if she believes P2 will do y. (pg 139)

    • So this is in belief space, and belief is based on awareness and trust
  • There are two obstacles to showing this, one superable, the other not, I think. First, there are two Nash equilibria, and nothing in the setup to suggest that some standard refinement (strengthening) of the Nash equilibrium condition will eliminate one. However, I suspect that my description of the situation could be refined without ‘changing the subject’. Perhaps the conditional intention Cl should really be ‘I’ll do x if and only if she’ll do y, and that’s what I would like best’. For example, if x and y are the two obligations in a contract being discussed, it is natural to suppose that Pl thinks that both signing would be better than neither signing. If we accept this gloss then the payoff structure becomes a Stag Hunt – Hi-Lo if both are worse off out of equilibrium than in the poor equilibrium (x’ ,y’). To help the cause of rationally deriving the joint intention (x,y), assume the Hi-Lo case. What are the prospects now? As I have shown in chapter 1, there is no chance of deriving (x,y) by the classical canons, and the only (so far proposed) way of doing to is by team reasoning. (pg 140)
  • The nature of team reasoning, and of the conditions under which it is likely to be primed in individual agents, has a consequence that gives further support to this claim. This is that joint intentions arrived at by the route of team reasoning involve, in the individual agents, a ‘sense of collectivity’. The nature of team reasoning has this effect, because the team reasoner asks herself not ‘What should I do?’ but ‘What should we do?’ So, to team-reason, you must already be in a frame in which first-person plural concepts are activated. The priming conditions for team reasoning have this effect because, as we shall see later in this chapter, team reasoning, for a shared objective, is likely to arise spontaneously in an individual who is in the psychological state of group-identifying with the set of interdependent actors; and to self-identify as a member of a group essentially involves a sense of collectivity. (pg 141)
  • One of the things that MB seems to be saying here is that group identification has two parts. First is the self-identification with the group. Second is the mechanism that supports that framing. You can’t belong to a group you don’t see.
  • To generalize the notions of team mechanism and team to unreliable contexts, we need the idea of the profile that gets enacted if all the agents function under a mechanism. Call this the protocol delivered by the mechanism. The protocol is , roughly, what everyone is supposed to do, what everyone does if the mechanism functions without any failure. But because there may well be failures, the protocol of a mechanism may not get enacted, some agents not playing their part but doing their default actions instead. For this reason the best protocol to have is not in general the first-best profile o*. In judging mechanisms we must take account of the states of the world in which there are failures, with their associated probabilities. How? Put it this way: if we are choosing a mechanism, we want one that delivers the protocol that maximizes the expected value of U. (pg 131)
  • Group identification is a framing phenomenon. Among the many different dimensions of the frame of a decision-maker is the ‘unit of agency’ dimension: the framing agent may think of herself as an individual doer or as part of some collective doer. The first type of frame is operative in ordinary game-theoretic, individualistic reasoning, and the second in team reasoning. The concept-clusters of these two basic framings center round ‘I/ she/he’ concepts and ‘we’ concepts respectively. Players in the two types of frame begin their reasoning with the two basic conceptualizations of the situation, as a ‘What shall I do?’ problem, and a ‘What shall we do?’ problem, respectively. (pg 137)
  • A mechanism is a general process. The idea (which I here leave only roughly stated) is of a causal process which determines (wholly or partly) what the agents do in any simple coordination context. It will be seen that all the examples I have mentioned are of this kind; contrast a mechanism that applies, say, only in two-person cases, or only to matching games, or only in business affairs. In particular, team reasoning is this kind of thing. It applies to any simple coordination context whatsoever. It is a mode of reasoning rather than an argument specific to a context. (pg 126)
  • In particular, [if U is Paretian] the correct theory of Hi-Lo says that all play A. In short, an intuition in favour of C’ supports A-playing in Hi-Lo if we believe that all players are rational and there is one rationality. (pg 130)
    • Another form of dimension reduction – “We are all the same”
  • There are many conceivable team mechanisms apart from simple direction and team reasoning; they differ in the way in which computation is distributed and the pattern of message sending. For example, one agent might compute o* and send instructions to the others. With the exception of team reasoning, these mechanisms involve the communication of information. If they do I shall call them modes of organization or protocols. (pg 125)
  • Image placeholders: BIC_102 (pg 102), BIC107 (pg 107), BIC107b (pg 107)
  • Evolutionary reasons for cooperation as group fitness, where group payoff is maximized. This makes the stag salient in stag hunt.
  • Explaining the evolution of any human behavior trait (say, a tendency to play C in Prisoner’s Dilemmas) raises three questions. The first is the behavior selection question: why did this trait, rather than some other, get selected by natural selection? Answering this involves giving details of the selection process, and saying what made the disposition confer fitness in the ecology in which selection took place. But now note that ‘When a behavior evolves, a proximate mechanism also must evolve that allows the organism to produce the target behavior. Ivy plants grow toward the light. This is a behavior, broadly construed. For phototropism to evolve, there must be some mechanism inside of ivy plants that causes them to grow in one direction rather than in another’ (Sober and Wilson 1998, pp. 199-200). This raises the second question, the production question: how is the behavior produced within the individual-what is the ‘proximate mechanism’? In the human case, the interest is often in a psychological mechanism: we ask what perceptual, affective and cognitive processes issue in the behavior. Finally, note that these processes must also have evolved, so an answer to the second question brings a third: why did this proximate mechanism evolve rather than some other that could have produced the same behavior? This is the mechanism selection question. (pg 95)
    • These are good questions to answer, or at least address. Roughly, I think my answers are:
      • Selection Question: The three phases are a very efficient way to exploit an environment.
      • Production Question: Neural coupling, as developed in physical swarms and carried forward into cognitive clustering.
      • Mechanism Question: Oscillator frequency locking provides a natural foundation for collective behavior. Dimension reduction is how the axes are selected for matching.
  • “We need to know, in detail, what deliberations are like that people engage in when they group-identify”. Also, agency transformation: AgencyTransformation
  • Dimension reduction is a form of induced conceptual myopia (pg 89)? Conceptual Myopia
  • GroupIdentification
  • Group as Frame
  • Categorization and bias
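Bacharach’s point that the best protocol is not in general the first-best profile o* can be made concrete with a toy calculation. The Hi-Lo game, the failure model, and every name below are my own illustration, not Bacharach’s: each agent enacts its part of the protocol unless it fails, in which case it plays a default action, and we pick the protocol that maximizes the expected value of U.

```python
import itertools

# Hi-Lo group payoff U: both choose "high" -> 2, both "low" -> 1, mismatch -> 0.
def U(profile):
    a, b = profile
    if a == b:
        return 2 if a == "high" else 1
    return 0

def expected_U(protocol, default, p_fail):
    """Expected U when each agent independently fails to enact its part of
    the protocol (reverting to `default`) with probability p_fail."""
    total = 0.0
    for fails in itertools.product([False, True], repeat=len(protocol)):
        prob = 1.0
        enacted = []
        for act, failed in zip(protocol, fails):
            prob *= p_fail if failed else 1 - p_fail
            enacted.append(default if failed else act)
        total += prob * U(tuple(enacted))
    return total

# The best protocol need not be the first-best profile o* = (high, high):
# when failures default to "low" and are likely enough, (low, low) is safer.
for p_fail in (0.1, 0.6):
    best = max(itertools.product(["high", "low"], repeat=2),
               key=lambda proto: expected_U(proto, "low", p_fail))
    print(p_fail, best, round(expected_U(best, "low", p_fail), 3))
```

With a low failure rate the first-best profile wins; with a high failure rate the safer (low, low) protocol has the higher expected U, which is exactly why judging mechanisms must account for failure states.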

Three views of the Odyssey

  • I’ve been thinking of ways to describe the differences between information visualizations with respect to maps, diagrams, and lists. Here’s The Odyssey as a geographic map:
  • Odysseus'_Journey
  • The first thing that I notice is just how far Odysseus travelled. That’s about half of the Mediterranean! I thought that it all happened close to Greece. Maps afford this understanding. They are diagrams that support the plotting of trajectories. Which brings me to the point that we lose a lot of information about relationships in narratives. That’s not their point. This doesn’t mean that non-map diagrams don’t help sometimes. Here’s a chart of the characters and their relationships in the Odyssey:
  •  odyssey
  • There is a lot of information here that is helpful, and this I do remember and understood from reading the book. Stories are good at depicting how people interact. But though this chart shows relationships, the layout does not really support navigation. For example, the gods are all related by blood and can pretty much contact each other at will. This chart would have Poseidon accessing Aeolus and Circe by going through Odysseus. So this chart is not a map.
  • Lastly, there is the relationship that comes to us through search. Because the implicit geographic information about the Odyssey is not specifically in the text, a search request within the corpora cannot produce a result that lets us integrate it:
  • OdysseySearchJourney
  • There is a lot of ambiguity in this result, which is similar to other searches that I tried that included travel, sail, and other descriptive terms. This doesn’t mean that it’s bad; it just shows how search does not handle context well. It’s not designed to. It’s designed around precision and recall. Context requires a deeper understanding of meaning, and even recent innovations such as sharded views with cards, single answers, and pro/con results only skim the surface of providing situationally appropriate, meaningful context.

Some thoughts on alignment in belief space. 


A nagging question for me is why phase locking, a naturally occurring phenomenon, was selected for to produce collective intelligence instead of something else. My intuition is that building communities using rules of physical and cognitive alignment takes advantage of randomness to produce a good balance of explore/exploit behaviors in the population.

Flocking depends on the ability to align, based on a relationship with neighbors. The ease of alignment depends on two things (I think):

  1. A low number of dimensions. The fewer the dimensions, the easier the alignment. It is easier to get a herd of cattle to stampede in a slot canyon than an open field. This is the fundamental piece.
  2. A contributing factor to the type of collective behavior is the turning rate with respect to velocity. The easier it is to turn, the easier it is to flock. It’s no accident that starlings, a small, nimble bird, can produce murmurations. Larger birds, such as geese, have much less dynamic formations.
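The turning-rate point can be sketched in a few lines. This is an illustration with invented parameters, not a model of starlings or geese: each agent turns toward the group’s current mean heading by at most a fixed amount per step, and nimbler agents (a larger per-step turn) reach consensus sooner.

```python
import math
import random

def align_steps(n_agents, max_turn, tol=0.95, seed=1):
    """Steps until the group's heading order (mean resultant length)
    exceeds tol, when each agent turns toward the current mean heading
    by at most max_turn radians per step."""
    rng = random.Random(seed)
    headings = [rng.uniform(-math.pi, math.pi) for _ in range(n_agents)]
    for step in range(1, 1000):
        mx = sum(math.cos(h) for h in headings) / n_agents
        my = sum(math.sin(h) for h in headings) / n_agents
        if math.hypot(mx, my) > tol:   # order: 0 = scattered, 1 = aligned
            return step
        mean = math.atan2(my, mx)
        # Turn toward the mean, clamped to the agent's turning ability.
        headings = [h + max(-max_turn,
                            min(max_turn,
                                math.atan2(math.sin(mean - h),
                                           math.cos(mean - h))))
                    for h in headings]
    return None

# Nimbler agents (larger max_turn) align in far fewer steps.
print(align_steps(50, max_turn=0.5), align_steps(50, max_turn=0.05))
```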

This applies to belief space as well. It is easier for people to agree when a concept is simplified. Similarly, the pattern of consensus will reflect the group’s overall acceptance of or resistance to change. I think this is a critical difference between a progressive and a reactionary.

Within an established population that exhibits collective behavior, there should be two things then:

  1. A shared perception of a low-dimension physical/belief space
  2. A similar velocity and turning rate between individuals

I’m going to assume that like in most populations, these qualities have a normal distribution. There will be a majority that have very common dimension perception, velocity, and turning rates. There will also be individuals at either tail of the population. At one end, there will be those who see the world very simply. At the other, there will be those who see complexity where the majority don’t. At one end, there will be those who cannot adapt to any change. At the other, there will be those who hold no fixed opinion on anything.

Flocking depends on alignment. But the individuals at the extremes will have difficulty staying with the relative safety of the flock. This means that there will be selection pressures. Those individuals who oversimplify and are unable to change direction should be selected against. When it’s more important to attend to your neighbors than to find food, things don’t end well. What happens at the other end?

There is one tail of this population that produces nimble individuals that perceive a greater complexity in the world. They also have difficulty staying with the flock, because their patterns of behavior are influenced by things that they perceive that the rest of the flock does not. In cooperative game theory, this ‘noticing too much’ disrupts the common frames (alignment) that groups use to make implicit decisions (page 14).

I believe that these individuals become explorers. Explorers are also selected against, but not as much. The additional perception provides a better understanding of potential threats. Nimbleness helps to prevent getting caught. These explorers provide an extended footprint for the population, which means greater resilience if the primary population encounters problems.

A population can rebuild from an explorer diaspora. Initially, the population will consist of too many explorers, and will have poor collective behaviors, but over time, selection pressures will push the mean so that there is sufficient alignment for flocking, but not so much that there is regular stampeding.

A final thought. There is no reason that these selection pressures exist only on populations that use genes to control their evolution. Looked at in, for example, a machine learning context, the options can be restated (loosely) in statistical language:

  1. Nomadic: Overfit to the environmental term and underfit to the social term
  2. Flocking: Fit with rough equivalence to the environmental and social terms
  3. Stampede: Overfit to the social term and underfit to the environmental term
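The three regimes can be sketched as weightings of two terms in a heading update. This is a loose illustration, with assumed weights and cue directions rather than a fitted model:

```python
import math

def heading_update(env_dir, social_dir, w_env, w_soc):
    """Blend an agent's environmental cue with its neighbors' mean heading.
    w_env and w_soc weight the two terms (loosely, how hard each is 'fit')."""
    x = w_env * math.cos(env_dir) + w_soc * math.cos(social_dir)
    y = w_env * math.sin(env_dir) + w_soc * math.sin(social_dir)
    return math.atan2(y, x)

env, social = 0.0, math.pi / 2   # the two cues disagree by 90 degrees

regimes = {
    "nomadic":  (1.0, 0.1),   # overfit environmental, underfit social
    "flocking": (1.0, 1.0),   # rough equivalence of the two terms
    "stampede": (0.1, 1.0),   # overfit social, underfit environmental
}
for name, (w_env, w_soc) in regimes.items():
    print(name, round(heading_update(env, social, w_env, w_soc), 2))
```

Nomadic agents end up pointed at the environmental cue, stampeding agents at the social cue, and flocking agents in between.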

Since it is always computationally more efficient to align tightly with a population that is moving in the right direction (it’s like copying your answers from your classmates), there will always be pressure to move towards stampedes. The resiliency offered by nomadic exploration is a long-term investment that does not have a short-term payoff. The compromise of flocking gives most of the benefits of either extreme, but it is a saddle point, always under the threat of unanticipated externalities.

When intelligent machines come, they will not be tuned by millions of years of evolution to be resilient, to have all those non-optimal behaviors that “even the odds”, should something unforeseen happen. At least initially, they will be constructed to provide the highest possible return on investment. And, like high-frequency trading systems, stampedes, in the form of bubbles and crashes, will happen.

We need to understand this phenomenon much more thoroughly, and begin to incorporate concepts like diversity and limited social influence horizons into our designs.

Schooling as a strategy for taxis in a noisy environment

Schooling as a strategy for taxis in a noisy environment

Journal: Evolutionary Ecology: Evolutionary Ecology is a conceptually oriented journal of basic biology at the interface of ecology and evolution. The journal publishes original research, reviews and discussion papers dealing with evolutionary ecology, including evolutionary aspects of behavioral and population ecology. The objective is to promote the conceptual, theoretical and empirical development of ecology and evolutionary biology; the scope extends to all organisms and systems. Research papers present the results of empirical and theoretical investigations, testing current theories in evolutionary ecology.

Author: Daniel Grunbaum: My research program seeks to establish quantitative relationships between short-term, small-scale processes, such as individual movement behaviors, and their long-term, large-scale population level effects, such as population fluxes and distributions.


  • A common strategy to overcome this problem is taxis, a behaviour in which an animal performs a biased random walk by changing direction more rapidly when local conditions are getting worse.
    • Consider voters switching from Bush->Obama->Trump
  • Such an animal spends more time moving in right directions than wrong ones, and eventually gets to a favourable area. Taxis is inefficient, however, when environmental gradients are weak or overlain by ‘noisy’ small-scale fluctuations. In this paper, I show that schooling behaviour can improve the ability of animals performing taxis to climb gradients, even under conditions when asocial taxis would be ineffective. Schooling is a social behaviour incorporating tendencies to remain close to and align with fellow members of a group. It enhances taxis because the alignment tendency produces tight angular distributions within groups, and dampens the stochastic effects of individual sampling errors. As a result, more school members orient up-gradient than in the comparable asocial case. However, overly strong schooling behaviour makes the school slow in responding to changing gradient directions. This trade-off suggests an optimal level of schooling behaviour for given spatio-temporal scales of environmental variations.
    • This has implications for everything from human social interaction to ANN design.
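The biased random walk in the abstract can be sketched in a few lines. This is a hypothetical 1-D toy with made-up turn probabilities, not Grunbaum’s model: the walker reverses direction more often when conditions are worsening, and net up-gradient drift emerges.

```python
import random

def taxis_drift(bias, steps=20000, seed=2):
    """1-D biased random walk: the walker reverses direction with a
    baseline probability, plus `bias` extra when conditions are getting
    worse (here, the favourable quantity simply increases with x)."""
    rng = random.Random(seed)
    x, direction = 0.0, 1
    for _ in range(steps):
        improving = direction > 0
        p_turn = 0.1 if improving else 0.1 + bias
        if rng.random() < p_turn:
            direction = -direction
        x += direction
    return x / steps   # mean up-gradient velocity

# Turning more often when things get worse yields net up-gradient drift;
# with no bias the walk goes nowhere on average.
print(taxis_drift(bias=0.2), taxis_drift(bias=0.0))
```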


  • Because limiting resources typically have ‘patchy’ distributions in which concentrations may vary by orders of magnitude, success or failure in finding favourable areas often has an enormous impact on growth rates and reproductive success. To locate resource concentrations, many aquatic organisms display tactic behaviours, in which they orient with respect to local variations in chemical stimuli or other environmental properties. (pp 503)
  • Here, I propose that schooling behaviours improve the tactic capabilities of school members, and enable them to climb faint and noisy gradients which they would otherwise be unable to follow. (pp 504)
  • Schooling is thought to result from two principal behavioural components: (1) tendencies to move towards neighbours when isolated, and away from them when too close, so that the group retains a characteristic level of compactness; and (2) tendencies to align orientation with those of neighbours, so that nearby animals have similar directions of travel and the group as a whole exhibits a directional polarity. (pp 504)
    • My models indicate that attraction isn’t required, as long as there is a distance-graded awareness. In other words, you align most strongly with those agents that are closest.
  • I focus in this paper on schooling in aquatic animals, and particularly on phytoplankton as a distributed resource. However, although I do not examine them specifically, the modelling approaches and the basic results apply more generally to other environmental properties (such as temperature), to other causes of population movement (such as migration) and to other socially aggregating species which form polarized groups (such as flocks, herds and swarms). (pp 504)
  • Under these circumstances, the search of a nektonic filter-feeder for large-scale concentrations of phytoplankton is analogous to the behaviour of a bacterium performing chemotaxis. The essence of the analogy is that, while higher animals have much more sophisticated sensory and cognitive capacities, the scale at which they sample their environment is too small to identify accurately the true gradient. (pp 505)
    • And, I would contend for determining optimal social interactions in large groups.
  • Bacteria using chemotaxis usually do not directly sense the direction of the gradient. Instead, they perform random walks in which they change direction more often or by a greater amount if conditions are deteriorating than if they are improving (Keller and Segel, 1971; Alt, 1980; Tranquillo, 1990). Thus, on average, individuals spend more time moving in favourable directions than in unfavourable ones. (pp 505)
  • A bacterial analogy has been applied to a variety of behaviours in more complex organisms, such as spatially varying diffusion rates due to foraging behaviours or food-handling in copepods and larval fish (Davis et al., 1991), migration patterns in tuna (Mullen, 1989) and restricted area searching in ladybugs (Kareiva and Odell, 1987) and seabirds (Veit et al., 1993, 1995). The analogy provides for these higher animals a quantitative prediction of distribution patterns and abilities to locate resources at large space and time scales, based on measurable characteristics of small-scale movements. (pp 505)
  • I do not consider more sophisticated (and possibly more effective) social tactic algorithms, in which explicit information about the environment at remote points is actively or passively transmitted between individuals, or in which individual algorithms (such as slowing down when in relatively high concentrations) cause the group to function as a single sensing unit (Kils, 1986, described in Pitcher and Parrish, 1993). (pp 506)
    • This is something that could be easily added to the model. There could be a multiplier for each data cell that acts as a velocity scalar of the flock. That should have significant effects! This could also be applied to gradient descent. The flock of Gradient Descent Agents (GDAs) could have a higher speed across the fitness landscape, but slow and change direction when a better value is found by one of the GDAs. It occurs to me that this would work with a step function, as long as the baseline of the flock is sufficiently broad.
  • When the noise predominates (d <= 1), the angular distribution of individuals is nearly uniform, and the up-gradient velocity is near zero. In a range of intermediate values of d (0.3 <= d <= 3), there is measurable but slow movement up-gradient. The question I will address in the next two sections is: Can individuals in this intermediate signal-to-noise range with slow gradient-climbing rates improve their tactic ability by adopting a social behaviour (i.e. schooling)? (pp 508)
  • The key attributes of these models are: (1) a decreasing probability of detection or responsiveness to neighbours at large separation distances; (2) a social response that includes some sort of switch from attractive to repulsive interactions with neighbours, mediated by either separation distance or local density of animals*; and (3) a tendency to align with neighbours (Inagaki et al., 1976; Matuda and Sannomiya, 1980, 1985; Aoki, 1982; Huth and Wissel, 1990, 1992; Warburton and Lazarus, 1991; Grunbaum, 1994). (pp 508)
    • * Though not true of belief behavior (multiple individuals can share the same belief), for a Gradient Descent Agent (GDA), the idea of attraction/repulsion may be important.
  • If the number of neighbours is within an acceptable range, then the individual does not respond to them. On the other hand, if the number is outside that range, the individual turns by a small amount, Δθ3, to the left or right according to whether it has too many or too few of them and which side has more neighbours. In addition, at each time step, each individual randomly chooses one of its visible neighbours and turns by a small amount, Δθ4, towards that neighbour’s heading. (pp 508)
  • The results of simulations based on these rules show that schooling individuals, on average, move more directly in an up-gradient direction than asocial searchers with the same tactic parameters. Figure 4 shows the distribution of individuals in simulations of asocial and social taxis in a periodic domain (i.e. animals crossing the right boundary re-enter the left boundary, etc.). (pp 509)
  • Gradient Schooling
  • As predicted by Equation (5), asocial taxis results in a broad distribution of orientations, with a peak in the up-gradient (positive x-axis) direction but with a large fraction of individuals moving the wrong way at any given time (Fig. 5a,b). By comparison, schooling individuals tend to align with one another, forming a group with a tightened angular distribution. There is stochasticity in the average velocity of both asocial and social searchers (Fig. 5c). On average, however, schooling individuals move up-gradient faster and more directly than asocial ones. These simulation results demonstrate that it is theoretically possible to devise tactic search strategies utilizing social behaviours that are superior to asocial algorithms. That is, one of the advantages of schooling is that, potentially, it allows more successful search strategies under ‘noisy’ environmental conditions, where variations on the micro-scales at which animals sense their environment obscure the macro-scale gradients between ecologically favourable and unfavourable regions. (pp 510)
  • School-size effects must depend to some extent on the tactic and schooling algorithms, and the choices of parameters. However, underlying social taxis are the statistics of pooling outcomes of independent decisions, so the numerical dependence on school size may operate in a similar manner for many comparable behavioural schemes. For example, it seems reasonable to expect that, in many alternative schooling and tactic algorithms, decisions made collectively by less than 10 individuals would show some improvement over the asocial case but also retain much of the variability. Similarly, in most scenarios, group statistics probably vary only slowly with group size once it reaches sizes of 50-100. (pp 514)
  • when group size becomes large, the behaviour of model schools changes in character. With numerous individuals, stochasticity in the behaviour of each member has a relatively weaker effect on group motion. The behaviour of the group as a whole becomes more consistent and predictable, for longer time periods. (pp 514)
    • I think that this should be true in belief spaces as well. It may be difficult to track one person’s trajectory, but a group in aggregate, particularly a polarized group may be very detectable.
  • An example of group response to changing gradient direction shows that there can be a cost to strong alignment tendency. In this example, the gradient is initially pointed in the negative y-direction (Fig. 9). After an initial period of 5 time units, during which the gradient orients perpendicularly to the x-axis, the gradient reverts to the usual x-direction orientation. The school must then adjust to its new surroundings by shifting to climb the new gradient. This example shows that alignment works against course adjustment: the stronger the tendency to align, the slower is the group’s reorientation to the new gradient direction. This is apparently due to a non-linear interaction between alignment and taxis: asymmetries in the angular distribution during the transition create a net alignment flux away from the gradient direction. Thus, individuals that pay too much attention to neighbours, and allow alignment to overwhelm their tactic tendencies, may travel rapidly and persistently in the wrong direction. (pp 516)
    • So, if alignment (and velocity matching) are strong enough, the conditions for a stampede (group behavior with negative outcomes – in this case, less food) emerge
  • The models also suggest that there is a trade-off in strengthening tendencies to align with neighbours: strong alignment produces tight angular distributions, but increases the time needed to adjust course when the direction of the gradient changes. A reasonable balance seems to be achieved when individuals take roughly the same time to coalesce into a polarized group as they do to orient to the gradient in asocial taxis. (pp 518)
    • There is something about the relationship between explore and exploit in this statement that I really need to think about.
  • Social taxis is potentially effective in animals whose resources vary substantially over large length scales and for whom movements over these scales are possible. (pp 518)
    • Surviving as a social animal requires staying in the group. Since belief can cover wide ranges (e.g. religion), does there need to be a mechanism where individuals can harmonize their beliefs? From Social Norms and Other Minds: The Evolutionary Roots of Higher Cognition: Field research on primate societies in the wild and in captivity clearly shows that the capacity for (at least) implicit appreciation of permission, prohibition, and obligation social norms is directly related to survival rates and reproductive success. Without at least a rudimentary capacity to recognize and respond appropriately to these structures, remaining within a social group characterized by a dominance hierarchy would be all but impossible.
  • Interestingly, krill have been reported to school until a food patch has been discovered, whereupon they disperse to feed, consistent with a searching function for schooling. The apparent effectiveness of schooling as a strategy for taxis suggests that these schooling animals may be better able to climb obscure large-scale gradients than they would were they asocial. Interactive effects of taxis and sociality may affect the evolutionary value of larger groups both directly, by improving foraging ability with group size, and indirectly, by constraining alignment rates. (pp 518)
  • An example where sociality directly affects foraging strategy is forage area copying, in which unsuccessful fish move to the vicinity of neighbours that are observed to be foraging successfully (Pitcher et al., 1982; Ranta and Kaitala, 1991; Pitcher and Parrish, 1993). Pitcher and House (1987) interpreted area copying in goldfish as the result of a two-stage decision process: (1) a decision to stay put or move depending on whether feeding rate is high or low; and (2) a decision to join neighbours or not based upon whether or not further solitary searching is successful. Similar group dynamics have been observed in foraging seabirds (Porter and Seally, 1982; Haney et al., 1992).
  • Synchrokinesis depends upon the school having a relatively large spatial extent: part of a migrating school encounters an especially favourable or unfavourable area. The response of that section of the school is propagated throughout the school by alignment and grouping behaviours, with the result that the school as a whole is more effective at route-finding than isolated individuals. Forage area copying and synchrokinesis are distinct from social taxis in that an individual discovers and reacts to an environmental feature or resource, and fellow group members exploit that discovery. In social taxis, no individual need ever have greater knowledge about the environment than any other — social taxis is essentially bound up in the statistics of pooling the outcomes of many unreliable decisions. Synchrokinesis and social taxis are complementary mechanisms and may be expected to co-occur in migrating and gradient-climbing schools. (pp 519)
  • For example, in the comparisons of taxis among groups of various sizes, the most successful individuals were in the asocial simulation, even though as a fraction of the entire population they were vanishingly small. (pp 519)
    • Explorers have the highest payoff for the highest risks
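The paper’s central result, that alignment pools the outcomes of many unreliable tactic decisions, can be illustrated with a rough simulation. This is my own simplified sketch: it uses all-to-all alignment and invented parameters rather than the paper’s distance-graded neighbour rules and Δθ values.

```python
import math
import random

def run(n, align_rate, steps=1000, burn=200, noise=2.0, seed=3):
    """Noisy taxis with optional alignment (a rough sketch of social taxis).
    Each agent tries to head up-gradient (+x) despite heavy heading noise;
    with align_rate > 0 it also turns toward the group's mean heading
    (an all-to-all stand-in for the paper's neighbour rules)."""
    rng = random.Random(seed)
    headings = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    speed = [0.0] * n
    for t in range(steps):
        mx = sum(math.cos(h) for h in headings)
        my = sum(math.sin(h) for h in headings)
        mean = math.atan2(my, mx)
        for i in range(n):
            tactic = -0.2 * math.sin(headings[i])              # pull toward +x
            social = align_rate * math.sin(mean - headings[i]) # align with group
            noise_turn = rng.gauss(0.0, noise) * 0.3           # sensing error
            headings[i] += tactic + social + noise_turn
            if t >= burn:                                      # skip transient
                speed[i] += math.cos(headings[i])
    return sum(speed) / (n * (steps - burn))   # mean up-gradient velocity

# Schooling searchers climb the gradient faster than asocial ones.
print(run(50, align_rate=0.5), run(50, align_rate=0.0))
```

The alignment term tightens the angular distribution around the group mean, so the pooled group tracks the +x gradient faster than the same agents searching asocially, which is the statistical effect the paper describes.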

Alignment in social interactions

Alignment in social interactions (2016)

Journal: Consciousness and Cognition, an International Journal, provides a forum for a natural science approach to the issues of consciousness, voluntary control, and self. The journal features empirical research (in the form of articles) and theoretical reviews. The journal aims to be both scientifically rigorous and open to novel contributions.

Mattia Gallotti (Scholar):  Manager of The Human Mind Project at the School of Advanced Study of the University of London. I have a keen interest in academic management and governance, and I now consult on aspects of social innovation in the public sector.

Merle Theresa Fairhurst-Menuhin: Merle is equally driven by a passion for art and science. Her days are split between work in cognitive neuroscience and exploring the rich repertoire of art song.

Chris Frith (Scholar):  I have been trying to delineate the mechanisms underlying the human ability to share representations of the world, for it is this ability that makes communication possible and allows us to achieve more than we could as individuals. We think that there are two major processes involved. The first is an automatic form of priming (sometimes referred to as contagion or empathy), whereby our representations of the world become aligned with those of the person with whom we are interacting. The second is a form of forward modelling, analogous to that used in the control of our own actions.


  • According to the prevailing paradigm in social-cognitive neuroscience, the mental states of individuals become shared when they adapt to each other in the pursuit of a shared goal. We challenge this view by proposing an alternative approach to the cognitive foundations of social interactions. The central claim of this paper is that social cognition concerns the graded and dynamic process of alignment of individual minds, even in the absence of a shared goal. When individuals reciprocally exchange information about each other’s minds, processes of alignment unfold over time and across space, creating a social interaction. Not all cases of joint action involve such reciprocal exchange of information. To understand the nature of social interactions, then, we propose that attention should be focused on the manner in which people align words and thoughts, bodily postures and movements, in order to take one another into account and to make full use of socially relevant information.


  • The concept of alignment has since evolved and is used to describe the multi-level, dynamic, and interactive mechanisms that underpin the sharing of people’s mental attitudes and representations in all kinds of social interactions (Dale, Fusaroli, & Duran, 2013). (pp 253)
  • The underlying justification for subsuming all these cases under the same mechanism is that cognition and action cannot be separated. The sharing of minds and bodies can then be conceptualized in terms of an integrated system of alignment, defined as the dynamic coupling of behavioural and/or cognitive states of two people (Dumas, Laroche, & Lehmann, 2014). (pp 253)
  • we are interested in the explanatory significance of alignment for a more general theory of social interaction, not in instrumental behaviour and/or alignment per se. (pp 254)
  • The central claim of this paper is that the alignment of minds, which emerges in social interactions, involves the reciprocal exchange of information whereby individuals adjust minds and bodies in a graded and dynamic manner. As these processes of alignment unfold, interacting partners will exchange information about each other’s minds and therefore act socially, whether or not a shared goal is in place. (pp 254)
  • In particular, in recent theoretical and empirical work on social cognition, reciprocity is increasingly recognized as a useful resource to capture the “jointness” of a joint action. Interpersonal understanding can be achieved by reading into one another’s mind reciprocally (Butterfill, 2013), and an explanation of the processes whereby the alignment of minds and bodies unfolds in space and time should involve an account of reciprocity (Zahavi & Rochat, 2015). In the process of a reciprocal exchange of information, individuals may adapt to varying degrees to one another. This is certainly the case in instances of temporal synchronisation and coordination in which physical alignment in time and space has been theorized to depend on cognitive models of adaptation (Elliott, Chua, & Wing, 2016; Hayashi & Kondo, 2013; Repp & Su, 2013) and thus on reciprocal interactions (D’Ausilio, Novembre, Fadiga, & Keller, 2015; Keller, Novembre, & Hove, 2014; Tognoli & Kelso, 2015). The behaviour of one player results in a change in behaviour of the other in a reciprocal way so as to achieve temporal synchrony. Interestingly, though not surprisingly, this reciprocal exchange of information results in physical alignment, which in turn has also been shown to result in greater degrees of affiliation and greater mental alignment (Hove & Risen, 2009; Rabinowitch & Knafo-Noam, 2015; Wiltermuth & Heath, 2009). Specifically, we suggest that, rather than a focus on the sharedness of the intended goal, we should attend to the graded exchange of information that creates alignment. The most social of interactions, in our formulation, are those in which “live” (“online”, see Schilbach, 2014) information is exchanged dynamically (i.e. over time, across multiple points), bidirectionally, and used to adapt behaviour and align with another (Jasmin et al., 2016). (pp 255)
  • Indeed, it is possible to have reciprocity and thus social interaction without cooperation. This would be the case, for example, in a competitive scenario in which the minds of the subjects are aligned at the appropriate level of description, and the sharing is essential to solve social dilemmas involving antagonistic behaviour (Bratman, 2014). In these exchanges, what is needed for the minds of the agents to attune to one another is that they adapt thoughts, bodily postures and movements, to take one another into account and reason as a team, even though the team might consist of competitive actors where none is aware that they are acting from the perspective of the same group and in the pursuit of some common goal (Bacharach, 2006). (pp 255)
  • fundamentally social nature has to do with the process whereby systems reciprocate thoughts and experiences, rather than with the endpoint i.e. the goal. It turns out that two features are often taken to be central to the process whereby interacting agents align minds and bodies. First, the interacting agents must be aware that they are doing something together with others. Second, the success of their joint performance is taken as a measure of how shared the participants’ goals are. (pp 255)
  • our suggestion is that what matters for the relevant alignment of minds and bodies to occur is the reciprocal exchange of information, not awareness of the reciprocal exchange of information. (pp 255)
    • This is all that is needed for flocking to happen. It is the range of that exchange that determines the phase change from independent to flock to stampede. Trust is involved in the reciprocity too, I think
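A toy version of what I mean: a heading-alignment model in which each agent averages the headings of neighbours within an exchange radius `r`. Sweeping `r` moves the group from independent motion (low polarization) toward stampede-like total alignment (polarization near 1). This is my own sketch, not the paper's; all parameter names are mine.

```python
import math, random

def polarization(headings):
    x = sum(math.cos(h) for h in headings) / len(headings)
    y = sum(math.sin(h) for h in headings) / len(headings)
    return math.hypot(x, y)  # 0 = disordered, 1 = fully aligned

def flock(r, n=60, steps=80, noise=0.05, seed=1):
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    hdg = [rng.uniform(-math.pi, math.pi) for _ in range(n)]
    for _ in range(steps):
        new_hdg = []
        for i in range(n):
            # circular mean of headings within the exchange radius
            # (always includes self, since dist(i, i) == 0 < r)
            xs = ys = 0.0
            for j in range(n):
                if math.dist(pos[i], pos[j]) < r:
                    xs += math.cos(hdg[j])
                    ys += math.sin(hdg[j])
            new_hdg.append(math.atan2(ys, xs) + rng.gauss(0, noise))
        hdg = new_hdg
        # step forward, wrapping on the unit torus
        pos = [((px + 0.01 * math.cos(h)) % 1.0,
                (py + 0.01 * math.sin(h)) % 1.0)
               for (px, py), h in zip(pos, hdg)]
    return polarization(hdg)

for r in (0.01, 0.2, 1.5):
    print(f"exchange radius {r}: polarization {flock(r):.2f}")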
  • Becoming mutually aware that we are sharing attitudes, dispositions, bodily postures, perhaps goals, does not mean that the ‘jointness’ of our actions has become available to each of us for conscious report. Reciprocity of awareness is emphatically not the same as awareness of reciprocity. The process of reciprocally exchanging information and mutually adapting to one another need not necessarily result in any degree of shared awareness. (pp 256)
  • In animals, a signal, for example about the source of food, that is too weak for an individual fish to follow can be followed by a group through the simple rules of bodily alignment that create shoaling behaviour (Grunbaum, 1998). Shoaling behaviour can also be observed in humans (Belz, Pyritz, & Boos, 2013), who can achieve group advantage through more complex forms of adjustment than just bodily alignment. Pairs of participants trying to detect a weak visual signal can achieve a greater group advantage when they align the terms they use to report their confidence in what they saw (Fusaroli et al., 2012). Indeed, linguistic alignment at many levels can be observed in dialogue (Pickering & Garrod, 2004) and can improve comprehension (Adank, Hagoort, & Bekkering, 2010; Fusaroli et al., 2012). (pp 256)
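The group-advantage logic in this passage can be sketched with a toy signal-detection model (my illustration, not the cited studies' actual method): each observer sees the weak signal plus independent noise, and the group decides on the pooled estimate. Individually unreliable observers become collectively reliable.

```python
import random, statistics

def detection_rate(n_observers, signal=0.3, noise=1.0, trials=2000, seed=7):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # each observer sees the weak signal plus independent Gaussian noise
        samples = [signal + rng.gauss(0, noise) for _ in range(n_observers)]
        # the group decides on the pooled (mean) estimate
        if statistics.mean(samples) > 0:
            hits += 1
    return hits / trials

for n in (1, 5, 25):
    print(f"{n:2d} observer(s): detection rate {detection_rate(n):.2f}")
```

Simple pooling assumes the noise is independent across observers; the Fusaroli et al. result is interesting precisely because humans have to build that pooling mechanism out of aligned confidence vocabulary rather than getting it for free.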
  • Much research has been driven, so far, by the implicit goal of identifying optimal group performance as a proxy for mental alignment (Fusaroli et al., 2012), however, there is conceptual room and empirical evidence for arguing that optimal task performance is not a good index of mental alignment or ‘optimal sociality’. In other words, taking achievement of a shared goal as the paradigm of a social interaction leads to the binary conception of sociality according to which an interaction is either (optimally) social, or it is not. (pp 256)
    • This is a problem that I have with opinion dynamics models. Convergence on a particular opinion isn’t the only issue. There is a dynamic process where opinions fall in and out of favor. This is the difference between the contagion model, which is one-way (uninfected -> infected), and motion through belief space. The goal really doesn’t matter, except in a subset of cases (though these may be very important).
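The contrast I'm drawing can be made concrete with two toy models (mine, not the paper's): an SI contagion on a ring, where the state change is one-way and absorbing, versus bounded-confidence averaging in the Deffuant style, where opinions move continuously through belief space and can cluster without ever reaching a single endpoint.

```python
import random

def si_contagion(n=50, p=0.5, steps=100, seed=3):
    rng = random.Random(seed)
    infected = [False] * n
    infected[0] = True
    for _ in range(steps):
        nxt = infected[:]
        for i in range(n):
            if infected[i]:
                for j in ((i - 1) % n, (i + 1) % n):  # ring neighbours
                    if rng.random() < p:
                        nxt[j] = True   # one-way: infection never reverses
        infected = nxt
    return sum(infected) / n

def bounded_confidence(n=50, eps=0.15, steps=5000, seed=3):
    rng = random.Random(seed)
    x = [rng.random() for _ in range(n)]
    for _ in range(steps):
        i, j = rng.randrange(n), rng.randrange(n)
        if i != j and abs(x[i] - x[j]) < eps:
            # both parties move toward each other: reciprocal, graded adjustment
            mid = (x[i] + x[j]) / 2
            x[i] += 0.5 * (mid - x[i])
            x[j] += 0.5 * (mid - x[j])
    return x

print(f"contagion: {si_contagion():.0%} infected (monotone, absorbing)")
ops = bounded_confidence()
print(f"opinions still span [{min(ops):.2f}, {max(ops):.2f}] after mixing")
```

The contagion run ends in the absorbing state; the opinion run ends with agents distributed across belief space, because agents further apart than `eps` never exchange information — which is the structure a goal-convergence framing misses.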
  • Two systems can interact when they have access to information relating to each other (Bilek et al., 2015). There are different ways of exchanging information between systems and hence different types of interaction (Liu & Pelowski, 2014), but in every case some kind of alignment occurs (Coey, Varlet, & Richardson, 2012; Huygens, 1673). (pp 257)
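The Huygens (1673) citation is to his pendulum clocks falling into synchrony through a shared mounting beam. A minimal modern sketch of that kind of alignment (my toy model, in the Kuramoto style): two oscillators with different natural frequencies, each adjusting its rate using information about the other's phase, lock together once the coupling is strong enough.

```python
import math

def kuramoto(k, w1=1.0, w2=1.3, dt=0.01, steps=5000):
    th1, th2 = 0.0, 2.0              # initial phases (radians)
    for _ in range(steps):
        # each oscillator adjusts its rate using the other's phase
        d1 = w1 + k * math.sin(th2 - th1)
        d2 = w2 + k * math.sin(th1 - th2)
        th1 += d1 * dt
        th2 += d2 * dt
    # phase difference wrapped into (-pi, pi]
    return abs(math.atan2(math.sin(th1 - th2), math.cos(th1 - th2)))

print(f"no coupling   (k=0.0): final phase gap {kuramoto(0.0):.2f} rad")
print(f"with coupling (k=0.5): final phase gap {kuramoto(0.5):.2f} rad")
```

With no coupling the phases drift apart indefinitely; with coupling they settle into a fixed phase offset. Note that the coupled clocks never "share a goal" — alignment falls out of the reciprocal exchange alone, which is exactly the paper's point.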
  • Such offline interaction can be contrasted with the case of online social interactions, where both participants act. The distinction between offline and online social interaction tasks is now acknowledged as crucial for advancing our understanding of the cognition processes underlying social interaction (Schilbach, 2014). (pp 257)
  • we will take reciprocity to be the primary requirement for social interactions. We suggest that reciprocity can be identified with a special kind of alignment, mutual alignment, involving adjustment in both parties to the interaction. However, not all cases of joint action lead to mutual alignment. It is important to distinguish this mutual alignment from other types of alignment, which do not involve a reciprocal exchange of information between the agents. (pp 257)
  • In contrast to salsa, consider the case of tango in which movements are improvised and as such require constant, mutual adaptation (Koehne et al., 2015; Tateo, 2014). Tango dancers have access to information relating to each other and, by virtue of the task, they exchange information with one another across time in a reciprocal and bidirectional fashion. The juxtaposition of tango with salsa highlights a spectrum of degrees of mutual reciprocity, with a richer form of interaction and greater need for alignment in tango compared with salsa. (pp 257)
  • Alignment in social interactions (pp 258)
  • The biggest challenge currently facing philosophers and scientists of social cognition is to understand social interactions. We suggest that this problem is best approached at the level of processes of mental alignment rather than through joint action tasks based on shared goals, and we propose that the key process is one of reciprocal, dynamic and graded adaptation between the participants in the interaction. Defining social interactions in terms of reciprocal patterns of alignment shows that not all joint actions involve reciprocity and also that social interactions can occur in the absence of shared goals. This approach has two particular advantages. First, it emphasises the key point that interactions can only be fully understood at the level of the group, rather than the individual. The pooling together of individual mental resources generates results that exceed the sum of the individual contributions. But, second, our approach points towards the mechanisms of adaptation that must be occurring within each individual in order to create the interaction (Friston & Frith, 2015). (pp 259)
  • This picture of social interaction in terms of mental alignment suggests two important theoretical developments. One is about a possible way to characterize the idea that types of social interaction lie on a continuum of possible solutions. If we focus on the task or the shared goal being pursued by agents jointly, as the current literature suggests, then only limited subdivisions of types of interaction will emerge. If, however, our focus extends so as to integrate the nature of the interaction, conceived of in terms of information exchange, then we can arrive at a higher degree of resolution of the space in which social interactions lie. This will define a spectrum of types of interaction (not just offline versus online social cognition), suggesting a dimensional rather than a discrete picture. After all, alignment comes in degrees and a spectrum-like definition of sociality implies that there is a variety of forms of alignment and hence of interactions. (pp 269)
    • My work would indicate that meaningful transitions occur for Unaligned (pure explore), Complex (flocking), and Total (stampede).