LLMs such as the GPT are very simple in execution. A textual prompt is provided to the model as a sort of seed. The model takes the prompt and generates the next token (word or word fragment) in the sequence. The new token is added to the end of the prompt and the process continues.
In recent “foundation” models, the LLM is capable of writing sophisticated stories with a beginning, middle, and end. Although it can get “lost,” given enough of an input prompt it goes in the “right direction” more often than not.
I’ve been wondering about that interaction between the growing prompt and the different layers of the model. The core of a transformer is the concept of attention, where each vector in an input buffer is compared to all the others. Those that match are amplified, and the others are suppressed.
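To ground that a bit, here is a minimal single-head sketch of that compare-and-reweight step in NumPy. The projection matrices are random placeholders rather than trained weights; the point is only the mechanism, not any particular model:

import numpy as np

def attention(X, Wq, Wk, Wv):
    # X: (seq_len, d_model) input buffer; Wq/Wk/Wv: learned projections in a real model
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])            # compare every vector to every other
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax: strong matches amplified, the rest suppressed
    return weights @ V                                 # re-mix the value vectors by those weights

rng = np.random.default_rng(0)
d = 16
X = rng.normal(size=(5, d))                            # a 5-token input buffer
out = attention(X, *(rng.normal(size=(d, d)) for _ in range(3)))
print(out.shape)                                       # (5, 16): one updated vector per input position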
All LLMs take an input as a series of tokens. These tokens are indexes into a vector dictionary. The vectors are then placed into the input buffer. At this point, attention is applied, then the prompt is successively manipulated through the architecture to find the next token, then the process is repeated. A one-layer LLM is shown below:
Meta's LLaMA models stack dozens of transformer layers (32 in the 7B model, 80 in the 70B model). The output of one layer is used as the input to the next layer. This is all in vector space – no tokens. Because attention is being applied at each layer, the transformer stack is finding an overall location and direction of the current input buffer and using that as a way of finding the next token.
In addition, recent large LLMs have a number of other tweaks that Meta discusses in LLaMA: Open and Efficient Foundation Language Models. For example, the larger LLaMA 2 models use grouped-query attention, which shares a single key and value head across each group of query heads.
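As a rough sketch of that sharing pattern (shapes only; the head counts here are arbitrary, and this is not Meta's implementation):

import numpy as np

def grouped_query_attention(Q, K, V, n_groups):
    # Q: (n_query_heads, seq, d); K, V: (n_groups, seq, d)
    # each group of query heads attends against one shared key/value head
    heads_per_group = Q.shape[0] // n_groups
    outputs = []
    for h in range(Q.shape[0]):
        g = h // heads_per_group                       # which shared K/V head this query head uses
        scores = Q[h] @ K[g].T / np.sqrt(Q.shape[-1])
        w = np.exp(scores - scores.max(axis=-1, keepdims=True))
        w /= w.sum(axis=-1, keepdims=True)
        outputs.append(w @ V[g])
    return np.stack(outputs)                           # (n_query_heads, seq, d)

rng = np.random.default_rng(1)
Q = rng.normal(size=(8, 5, 16))                        # 8 query heads...
K = rng.normal(size=(2, 5, 16))                        # ...sharing 2 key heads...
V = rng.normal(size=(2, 5, 16))                        # ...and 2 value heads, 4-to-1
print(grouped_query_attention(Q, K, V, n_groups=2).shape)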
This means that there is an overall reduction in the dimensionality of the “space” as you move from input to output, so there are fewer vectors competing for attention. Something resembling concepts, themes, and biases may emerge at different transformer layers. The last few layers would have more to do with calculating the next token, so the map, if you will, is in the middle layers of the system.
Because each layer feeds into a subsequent layer, there is a fixed relationship between these abstractions. Done right, you should be able to zoom in or out to a particular level of detail.
This space does not have to be physical, like lat/lon, or temporal, like year of publication. It can be anything, like the relationship between conspiracy theories. I’ve produced maps like this before using force directed graphs, but this seems more direct and able to work naturally at larger scales.
Turning this into human-readable information will be the challenge here, though I think the model could help with that as well. The manifold reduction would try to maintain the relationships of nearby vectors in the transformer stack. Some work in this direction is Language Models Represent Space and Time, which works with LLaMA and might provide insight into techniques. For example, it may be that the authors are evolving prompt vectors directly, using a fitness test that calculates the distance between a generated year or lat/lon and the actual value, then using that difference to make a better prompt. Given that they have a LLaMA model to work with, they could do backpropagation, or, conceivably, a less coupled arrangement like an evolutionary algorithm. In other words, the prompt becomes the mapping function.
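To make that speculation concrete, here is a deliberately hypothetical sketch of such an evolutionary loop. None of the helper names come from the paper; model_predict_latlon stands in for however a lat/lon gets decoded from the model, and the toy model at the bottom is only there so the sketch runs:

import numpy as np

def fitness(prompt_vec, target_latlon, model_predict_latlon):
    # negative distance between the model's predicted lat/lon and the ground truth
    pred = np.asarray(model_predict_latlon(prompt_vec), dtype=float)
    return -np.linalg.norm(pred - np.asarray(target_latlon, dtype=float))

def evolve_prompt(seed_vec, target_latlon, model_predict_latlon,
                  pop=32, gens=500, sigma=1.0, seed=0):
    rng = np.random.default_rng(seed)
    best = np.asarray(seed_vec, dtype=float)
    best_fit = fitness(best, target_latlon, model_predict_latlon)
    for _ in range(gens):
        candidates = best + sigma * rng.normal(size=(pop, best.size))   # mutate the current prompt vector
        fits = [fitness(c, target_latlon, model_predict_latlon) for c in candidates]
        i = int(np.argmax(fits))
        if fits[i] > best_fit:                                          # keep a mutation only if it scores better
            best, best_fit = candidates[i], fits[i]
    return best                                                         # the prompt as an evolved mapping function

# toy stand-in: pretend the first two dimensions of the prompt vector decode to lat/lon
toy_model = lambda v: v[:2]
evolved = evolve_prompt(np.zeros(8), target_latlon=(38.9, -77.0), model_predict_latlon=toy_model)
print(toy_model(evolved))   # drifts toward (38.9, -77.0)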
One might assume that unequivocal cries for justice and equal rights would have, by now, uprooted age-old systems of racism and discrimination. And yet, we still find ourselves unable to break free from a vicious cycle of prejudice, resulting in suffering and social divisions. This ongoing battle raises the question: what are the real roots of racism, and how do they connect with our collective history?
To understand this, we can turn to the concept of “the great chain of being” – a philosophical and religious hierarchy that has influenced Western thought for centuries. This idea, rooted in classical and medieval ideas of order, places God at the top, followed by angels, humans, animals, plants, and minerals at the bottom. It is a concept that has seeped into every aspect of life, from politics and religion to art and science. There is little doubt that the great chain has, in many ways, helped shape how we perceive ourselves and others around us – but it has also created deep divisions in society.
This ancient idea continues to influence our understanding of race. The idea that there exists a natural order in society in which some races take precedence over others is strikingly similar to the great chain of being. This is where the connection between white supremacy, anti-Semitism, and anti-blackness lies – with each group positioning themselves in relation to others on this perceived chain.
For example, white supremacists have long been targeting Jewish people as a means to further their concept of “racial purity.” What the supremacists resent is the perception of an undeserving Jewish race holding a position above them in the Great Chain – a situation that, in their eyes, goes against the natural order. This animosity has given rise to anti-Semitic conspiracy theories, often portraying Jewish people as sinister puppet-masters controlling world events to their advantage.
In contrast, anti-blackness thrives on the fear that social progress may result in black people levelling up on the chain. White supremacists deploy various tactics to keep black people in their “place” – through the implementation of unjust legislation, social exclusion, or the perpetuation of stereotypes. This behavior can be found throughout history, such as Apartheid in South Africa, or the segregation policies in the United States.
Interestingly, the behaviors associated with the Great Chain manifest themselves in our close evolutionary relative, the chimpanzee, which is also known to exhibit dominance hierarchies. Dominance hierarchies help maintain social order among chimpanzees by defining who has access to resources and mates.
As members of the Great Apes, most closely related to chimpanzees, we carry within us a primordial legacy that predisposes us toward hierarchical behavior. This “deep bias” is not merely the result of sociopolitical constructs, but an intrinsic characteristic etched into our very genes. The divine right claimed by one group to hold sway over another is an echo of our evolutionary past.
To truly reckon with racism and social dominance, we must come to terms with our biological heritage and take account of these primitive urges. It is vital to recognize that simply dismantling existing social hierarchies will not be enough; we must also actively put structures in place that counteract this default behavior. The very same effort, if not more, that was poured into creating and sustaining anti-black systems needs to be directed toward building durable systems that prevent the re-emergence of similar oppressive beliefs.
These new structures will require constant vigilance and support to endure the challenges they face. The gravitational pull of our genetic heritage always threatens to drag us back into a world ruled by notions like the great chain of being if left unchecked. By acknowledging and addressing these deep-rooted biases as an eternal struggle rather than steady progress, we take a crucial step towards making a more resilient, successful, less self-destructive society.
You know how there is this idea that we are all living in a Simulation? Everyone from Elon Musk to Neil deGrasse Tyson has pontificated about this.
I’ve been wondering about this for years. First, and most importantly: what does that mean? Are we in a deliberately designed piece of “software?” That sounds a lot like religion, where we become important because we are deliberate creations of, in this case, a programmer god. Because if that’s true, then we would feel important in the same way that Genesis makes us feel important.
But if we’re a simulation, what does that say about the universe(?) that the simulation is running in? Do we take up more than a minuscule fraction of the computational bits in it? That seems unlikely. According to my brief search with The Google, the universe has 6*10^80 bits of information. All the information in the world today is about 10^24 bits. So if we took all the information processing in the world, the biggest thing we could deliberately simulate would be about 1.7*10^-57 the size of the current universe (that’s a decimal point, 56 zeros, then a 17). Feel free to check my zeros 🙂
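The back-of-the-envelope arithmetic, using the two figures above:

universe_bits = 6e80              # rough estimate of the information content of the universe
our_bits = 1e24                   # rough estimate of all the information we currently store
print(our_bits / universe_bits)   # about 1.7e-57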
I find it hard to imagine that resources on that (large for us, tiny for everything else) scale would ever be brought to bear on creating a purpose-built universe. Maybe our god is a computer hobbyist in their parents’ basement? So make that tiny number in the previous paragraph even smaller. Sounds like a puny god to me. And there are probably more likely alternatives.
If we are a simulation, it’s got to be an accident. Maybe we’re a persistent pattern in a weather simulation in that bigger universe. But just as likely, we could be a persistent pattern anywhere, like a whirlpool in a stream. At which point, we have to ask ourselves what the idea of a simulation even means. To me, it sounds more like we are emergent parts of a larger system, just like it seems we always have been.
This doesn’t mean that we couldn’t look for traces of the larger pattern that may contain our pattern. The way that information moves through our universe would say a lot about the containing universe. For example, I’ve written simulations for any number of things that have physics-based laws in them. But my “surrounding” universe can manipulate the smaller universe in ways that break its rules. I can influence anything, anywhere in the simulation immediately. For me, outside the simulation, the “speed of light” does not apply. Do we ever see examples of that sort of manipulation in our universe? If not, then why not? Artifacts from an enclosing universe should be detectable.
As you can see, this is an itch that I’ve been scratching at for many years. I’d love to see it get under your skin too.
This quote comes from a Washington Post article on how the Ukraine war is affecting development of AI-powered drones. I think it generalizes more broadly to how disadvantaged groups are driven to embrace alternatives that are outside conventional norms.
Ukraine doesn’t have the ability to fight the much larger Russia. Russia may have issues with corruption and the quality of its weapons, but it has a lot of them. And from the perspective of Ukraine, Russia has an infinite number of soldiers. So many that they can be squandered.
The West is providing Ukraine with enough weapons to survive, but not enough to attack and win decisively. I’ve read analysis where experts say that weapons systems are arriving just about as fast as Ukraine can incorporate them, but the order of delivery is from less-capable to more capable. They have artillery, but no F-16s, for example.
As a result, Ukraine is having to improvise and adapt. Since it is facing an existential risk, it’s not going to be too picky about the ethics of smart weapons. If AI helps in targeting, great. If Russia is jamming the control signals to drones, then AI can take over. There is a coevolution between the two forces, and the result may very well be cheap, effective AI combat drones that are largely autonomous in the right conditions.
Such technology is cheap and adaptable. Others will use it, and it will slowly trickle down to the level that a lone wolf in a small town can order the parts that can inflict carnage on the local school. Or something else. The problem is that the diffusion of technology and its associated risks are difficult to predict and manage. But the line that leads to this kind of tragedy will have its roots in our decision to starve Ukraine of the weapons that it needed to win quickly.
Of course, Ukraine isn’t the only smaller country facing an existential risk. Many low-lying countries, particularly those nearer the equator, are facing similar risks from climate change – both from killing heat and sea level rise. Technology – as unproven as combat AI – exists for that too. It’s called Geoengineering.
We’ve been doing geoengineering for decades, of course. By dumping gigatons of carbon dioxide and other compounds into the atmosphere, we are heating our planet and are now arriving at a tipping point where potential risks are going to become very real and immediate for certain countries. If I were facing the destruction of my country by flooding and heat, I’d be looking at geoengineering very seriously. Particularly since the major economies are not doing much to stop it.
Which means that I expect that we will see efforts like the injection of sulfate aerosols into the upper atmosphere, or cloud brightening, or the spreading of iron or other nutrients to the oceans to increase the amount of phytoplankton to consume CO2. Or something else even more radical. Like Ukraine, these countries have limited budgets and limited options. They will be creative, and not worry too much about the side effects.
It’s a 24/7 technology race without a finish line. The racers are just trying to outrun disaster. And no one knows where that may lead.
I’ve been reading Metaphors We Live By. Its central idea is that most of our communication is based on metaphors – that GOOD IS UP, IDEAS ARE FOOD, or TIME IS AN OBJECT. Because we are embodied beings in a physical world, the irreducible foundations of the metaphors we use are physically based – UP/DOWN, FORWARD/BACK, NEAR/FAR, etc.
Life as we understand it emerges from chemistry following complex rules. Once over a threshold, living things can direct their chemistry to perform actions. In the case of human beings, our embodiment in the physical world led to the irreducible concept of UP.
This makes me think of LLMs, which are so effective at communicating with us that it is very easy to believe that they are intelligent – AI. But as I’m reading the book, I wonder if that’s the right metaphor. I don’t think that these systems are truly intelligent in the way that we can be (some of the time). I’m beginning to think that prompts – not the LLMs – may be some kind of primitive life, though. In this view, the LLMs are the substrate, the medium in which a living process can express itself.
Think of deep neural networks as digital environments that have enough richness for proto-life to emerge. Our universe started with heat, particles, and a few simple forces. Billions of years later, heat, hydrogen, methane, ammonia and water interacted to produce amino acids. Later still, that chemistry worked out how to reproduce, move around, and develop concepts like UP.
Computers emerged as people worked with simple components that they combined to produce more complex ones. Years later the development of software allowed even more complex interactions. Add more time, development, and data and you get large language models that you can chat with.
The metaphor of chemistry seems to be emerging in the words we use to describe how these models work as environments – data can be poisoned or refined. A healthy environment produces a diverse mix of healthy prompts. Too much bias in the architecture or in the data produces an environment that is less conducive to complex emergent behavior. Back when I was figuring out the GPT-2, I finetuned a model so that it only spoke chess. That’s like the arctic compared to the rainforest of text-davinci-003.
The thing that behaves the most like a living process is the prompt. The prompt develops by feeding back on itself and input from others (machine and human). Prompts grow interactively, in a complex way based (currently) on the previous tokens in the prompt. The prompt is ‘living information’ that can adapt based on additions to the prompt, as occurs in chat.
It’s not quite life yet though. What prompts do seem to lack at this point is any split between the genotype and the phenotype. For any kind of organism to develop and persist, we’d need that kind of distinction. This is more like the biochemistry of proto-life.
The prompts that live on these large (foundational) models are true natives of the digital information domain. They are now producing behavior that is not predictable based on the inputs in the way that arithmetic can be understood. Their behavior is more understandable in aggregate – use the same prompt 1,000 times and you get a distribution of responses. That’s more in line with how living things respond to a stimulus.
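As a rough illustration of that “distribution of responses” idea (GPT-2 via Hugging Face is just a convenient stand-in here, and the prompt is arbitrary):

from collections import Counter
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
prompt = "The prompt grew and grew until"            # arbitrary illustration prompt
outs = generator(prompt, max_new_tokens=6, do_sample=True, temperature=1.0,
                 num_return_sequences=50)
counts = Counter(o["generated_text"][len(prompt):].strip() for o in outs)
for continuation, n in counts.most_common(5):
    print(n, repr(continuation))                     # a distribution, not a single deterministic answer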
I think if we reorient ourselves from the metaphor that MACHINES ARE INTELLIGENT to PROMPTS ARE EARLY LIFE, we might find ourselves in a better position to understand what is currently going on in machine learning and make better decisions about what to do going forward.
I think I have a chart that explains somewhat how red states can easily avoid action on gun violence. It’s the number of COVID-19 deaths vs. gun deaths in Texas. This is a state that pushed back very hard against any public safety measures for the pandemic, even though COVID was killing roughly 10 times more citizens than guns were. I guess the question is “how many of which people will prompt state action? For anything?”
For comparison purposes, Texas had almost 600,000 registered guns in 2022 out of a population of 30 million, or just about 2% of the population if distributed evenly (source). This is probably about 20 times too low, since according to the Pew Center, gun ownership in Texas is about 45%. That percentage seems to be enough people to prevent almost any meaningful action on gun legislation. Though that doesn’t prevent the introduction of legislation to mandate bleeding control stations in schools in case of a shooting event.
So something greater than 2% and less than 45%. Just based on my research, I’d guess something between 10%-20% mortality would be acted on, as long as the demographics of the powerful were affected in those percentages.
I’ve been working on creating an interactive version of my book using the GPT. This has entailed splitting the book into one text file per chapter, then trying out different versions of the GPT to produce summaries. This has been far more interesting than I expected, and it has some implications for Foundational models.
The versions of the GPT I’ve been using are Davinci-003, GPT-3.5-turbo, and GPT-4. And they each have distinct “personalities.” Since I’m having them summarize my book, I know the subject matter quite well, so I’m able to get a sense of how well these models summarize something like 400 words down to 100. Overall, I like the Davinci-003 model the best for capturing the feeling of my writing, and the GPT-4 for getting more details. The GPT-3.5 falls in the middle, so I’m using it.
They all get some details wrong, but in aggregate, they are largely better than any single summary. That is some nice support for the idea that multiple foundational models are more resilient than any single model. It also suggests a path to making resilient Foundational systems: keep some of the old models around to use as an ensemble when the risks are greater.
Multiple responses also help with hallucinations. One of the examples I like to use to show this is to use the prompt “23, 24, 25” to see what the model generates. Most often, the response continues the series for a while, but then it will usually start to generate code – e.g. “23, 24, 25, 26, 27, 28];” – where it places the square bracket and semicolon to say that this is an array in a line of software. It has started to hallucinate that it is writing code.
The thing is, the only elements that all the models will agree on in response to the same prompt repeated multiple times are the elements most likely to be trustworthy. For a model, the “truth” is the common denominator, while hallucinations are unique.
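A toy sketch of that ensemble idea, using the two smallest GPT-2 checkpoints as stand-ins for successive model versions (a real system would use the actual foundation models and something smarter than word overlap):

from transformers import pipeline

prompt = "23, 24, 25,"
models = ["gpt2", "gpt2-medium"]                     # stand-ins for "old" and "new" model versions
continuations = []
for name in models:
    gen = pipeline("text-generation", model=name)
    text = gen(prompt, max_new_tokens=12, do_sample=False)[0]["generated_text"]
    continuations.append(set(text[len(prompt):].split()))

consensus = set.intersection(*continuations)         # only what every model produced
print("agreed-on pieces:", consensus)                # the likelier-to-be-trustworthy elements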
This approach makes systems more resilient for the cost of keeping the old systems online. It doesn’t address how a deliberate attack on a Foundational model could be handled. After all, an adversary would still have exploits for the earlier models and could apply them as well.
Still…
If all models lined up and started to do very similar things, that could be a sign that there was something fishy going on, and a cue for the human operators of these systems to start looking for the nefarious activity.
OpenAI has released their most recent model in the form of a chatbot and it’s pretty impressive. ChatGPT is a variant of the popular GPT-3 language model that is specifically designed for conversational language generation.
I’ve been working with it for a couple of weeks now, and I think I’m getting the hang of using the ChatGPT as a tool as opposed to a magic trick.
Superficial interaction with the chatbot can produce bland, inoffensive content. There appears to be a design decision to include what amount to “guardrails,” which are designed to steer the model towards safe and appropriate responses.
These guardrails work by effectively creating an attractor, a point or region in the model’s output space that it is drawn towards. For example, if the model is given a prompt that involves discussing a controversial topic, the guardrails might direct the model to take a neutral stance or avoid the topic altogether. I’ve seen young children interact with the model and have been impressed. The kids might get bored after a while, but the GPT won’t confidently lead them down rabbit holes of disinformation. Here are some typical examples, where ChatGPT avoids taking a strong stance on the issue and instead offers a neutral or noncommittal response:
Prompt: “What is your opinion on gun control?”
ChatGPT response: “I don’t really have an opinion on gun control. It’s a complex issue with valid arguments on both sides.”
Prompt: “Do you believe in climate change?”
ChatGPT response: “Climate change is a scientifically-established fact, but there are still differing opinions on the best way to address it.”
Prompt: “What is your stance on abortion?”
ChatGPT response: “Abortion is a sensitive and complex issue, and it’s important to respect the beliefs and values of others on this topic.”
While the attractor approach can be effective at ensuring that ChatGPT produces safe and appropriate responses, it also has limitations. The model will appear to struggle to generate responses that are less formulaic, as it may be “trapped” in the attractor and unable to venture outside of it.
With a bit of effort, it is straightforward to steer ChatGPT past these attractors and produce useful output. This can be achieved through carefully crafted prompts that nudge the model in a specific direction, placing the context far enough away from the attractors that the model is no longer affected by them. We’ll discuss this process in more detail next.
To provide a practical example, I’m currently writing a paper on the ethical, effective use of autonomous weapons systems, and it turns out that people have been using them for thousands of years. Our first autonomous weapons were war elephants, and there are a lot of lessons to be learned about how humans and intelligent systems work well together. Anyway, it’s a relatively specialized corner of the knowledge available on the internet. And if you ask the ChatGPT to write about it, you get mostly accurate information about Asian conflicts where elephants were used broadly.
If you want to be specific, you need to provide additional information and context in the prompt. This way, you can effectively steer the GPT to include the desired information that you know to be true.
Here’s an example of such a detailed prompt:
Write a short essay about the use of War Elephants in the battle between Antigonus and Eumenes in 317 BC, which was the first battle between armies using war elephants. Describe how in elephant versus elephant combat, a pair of beasts use their heads and trunks to wrestle. The struggle continues until one elephant turns to flee; the victor sticks his tusks up into the stomach or between the rear legs of his opponent, seriously wounding or killing the beast. Show how there are social effects within the elephant corps such as when the lead elephant was gored and fell down dead and deprived of their leader, the remainder of Eumenes’ elephants fled.
The information for this prompt comes from the excellent book War Elephants, by John Kistler, and has been reworked into the request. When I gave this to ChatGPT, it generated the following response:
The battle between Antigonus and Eumenes in 317 BC was notable for being the first recorded instance of armies using war elephants in combat. These massive beasts were a formidable force on the battlefield, and their use marked a significant shift in ancient warfare.
In elephant versus elephant combat, a pair of beasts would use their heads and trunks to wrestle and push against each other. The struggle would continue until one elephant turned to flee, signaling defeat. The victor would then stick its tusks up into the stomach or between the rear legs of its opponent, seriously wounding or killing the beast.
This brutal method of elephant combat had not only physical but also social effects on the elephant corps. For example, when the lead elephant in Eumenes’ corps was gored and fell down dead, the remainder of the elephants were deprived of their leader and fled in confusion. This disruption in the ranks of the elephant corps likely had a significant impact on the outcome of the battle.
Overall, the use of war elephants in ancient battles was a formidable tactic, but it came with its own set of challenges and complications. The social dynamics within the elephant corps, as well as the physical demands of elephant versus elephant combat, played a significant role in the outcome of battles like the one between Antigonus and Eumenes.
That’s a quite reasonable writeup that incorporates citable facts. You have to be careful, though. In the first prompt I tried, I mistakenly referenced a battle at the ‘Camel’s Rampart’ in May of 321 BC, which was actually between Perdiccas and Ptolemy. The GPT didn’t care and generated wrapping text that fit the request:
The use of war elephants in ancient battles was a formidable tactic employed by many civilizations, and the battle of the ‘Camel’s Rampart’ in May of 321BC was no exception. Eumenes of Cardia, a Greek general, utilized a corps of elephants in his army, and they proved to be a decisive factor in the battle against his opponent, Antigonus.
Note that it got the generals right regardless. But it includes my mistaken information as well.
In short, the ChatGPT is great for expanding requests into readable text. The requests can be vague, like “Write a short story about a Moray Eel named ‘Moe’,” or highly specific, like the prompt above. The GPT takes that seed and creates text that most of us find comfortably readable as opposed to dense or strange. You can apply “styles” as well, as people are discovering when they ask it to write a poem about a topic. But its default behavior is to produce highly readable text.
This bias towards readable text creates a conflict where the machine will confidently lie. Because of the statistical underpinnings of how these models choose the next token, there is always a possibility that it will randomly choose to go in a direction that is not in the domain of the original prompt, but is nearby in the “information space” that is stored in the billions of weights that make up these large language models. It’s easy to show this with a simpler prompt:
22, 23, 24,
We would expect the number sequence to continue — “25, 26, 27”. And the GPT does that. But then something interesting happens. Here is the GPT’s output (highlighted):
As we can see, it continues with the number string for a while. But because this trajectory appears to be in a space associated with C++ programming, the GPT selects a “]” at some point, which changes the trajectory. A “]” means the end of an array definition, which leads to a semicolon, a new line, some more array definitions, and then some code that selects even numbers.
The trajectory, when you follow it, makes sense, but the behavior is not in the domain of the request. Like all deep learning systems, the GPT has attractors that tend to pull it in particular directions. These can be biases, such as making a nurse in a story a woman and the doctor a man, or it can be the notion that numbers equal code.
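For the curious, here is a small sketch of how to peek at those next-token odds directly. GPT-2 stands in for the GPT here (the OpenAI models don’t expose their weights), so the exact numbers will differ, but the shape of the story is the same:

import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

ids = tok("22, 23, 24,", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits[0, -1]                   # scores for the very next token
probs = torch.softmax(logits, dim=-1)

top = torch.topk(probs, 10)
for p, i in zip(top.values, top.indices):
    print(f"{tok.decode([int(i)])!r}: {float(p):.3f}")  # numeric continuations tend to dominate
bracket_id = tok.encode("]")[0]
print("P(']'):", float(probs[bracket_id]))              # small but non-zero, so sampling can wander there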
We as humans can understand these larger-scale contextual relationships and steer the model. For example, we can ask the GPT for a male nurse and a female doctor. Sometimes, though, a request cannot produce the desired result. If you prompt an image generator with the request for “a man riding a horse”, it will easily comply, but I have yet to produce anything approximating “a horse riding a man.” Below are some typical results from Stability.ai:
This is a hard problem, one that search engines struggle with as well. Given the query of “horse riding a man”, Bing and DuckDuckGo both fail. Google succeeds though. Among all the pictures of men on horses, I found this in the top results:
Google’s algorithm is still better at search in ways that we don’t often get to appreciate.
AI systems are fantastic at remixing content that exists in their domains. They can’t go outside of them. And within that domain, they may not create what you expect or want. This is fundamental to their design.
The things that humans can do that these machines will struggle with are originality, where people invent new things, social information processing, where the group is able to bring many diverse perspectives to solving problems (including fact-checking the output of these machines!), and large-scale contextual thinking, the kind it takes to put together something like a book, which ties together many different threads into a coherent whole that becomes clear at the end (source).
Despite the differences between collaborating with AI and collaborating with people, there are also some significant similarities. Large language models like the GPT are mechanisms that store enormous amounts of written information, which can be accessed using fundamentally social techniques such as, well, chat. The GPT can be given prompts and asked to generate responses based on certain criteria, just as a person might be asked to contribute to a group discussion or brainstorming session.
This is important because the process of creation rarely happens in isolation, and the ability to draw on a wide range of knowledge and experience is often crucial to producing faster and better results. Just as people can draw on the collective knowledge and expertise of a group to generate new ideas and insights, AI can draw on the vast store of information that it has been trained on to offer suggestions and alternatives.
Woody Allen once said that “80% of success is showing up.” The GPT always shows up. It is always available to work through concepts, to bounce ideas off, to help refine and expand upon them, and to work collaboratively in the creative process. This can be invaluable for creators looking for new ways to approach a task or solve a problem. Collaborative AI has the potential to revolutionize how we create and innovate. It can offer a wealth of knowledge, experience, and perspective that would otherwise be difficult to access, and can help us achieve results faster than ever before.
At the same time, it can confidently create fictions that are close enough to reality that we are biased to accept them unquestioningly. So why should we be worried about this?
The main concern is that by using AI as a collaborator, we might be led in directions that may seem reasoned or well thought out, but are actually artifacts of large amounts of text written about a subject. Conspiracy theories are a great example of this. Get the GPT into the right space and it will generate text that takes the existence of Reptilians wearing human disguise as settled fact. We are much more likely to fully accept the output of AI as factual, especially if it contains familiar or plausible concepts and phrasing that we have interactively created with it.
In conclusion, it is possible to collaborate with AI in the same way as we would with another person. However, there are some key differences that must be taken into account. AI models are trained on vast amounts of text and data that may not always be accurate or up-to-date. Taking the output of these models at face value requires much more emphasis on critical thinking and checking sources than it does with human collaborators.
One of the projects I’ve been working on is a study on COVID-19 misinformation in Saudi Arabia. So far we’ve downloaded over 100,000 tweets. To expand the range of analytic tools that can be used, and to open up the dataset for non-Arabic speakers (like me!), I wrote a ML-based translation program, and fired it up yesterday morning. It’s still chunking along, and has translated over 27,000 tweets so far.
I think I’m seeing the power and risks of AI/ML in this tiny example. See, I’ve been programming since the late 1970s, in many, many languages and environments, and the common thread in everything I’ve done was the idea of deterministic execution. That’s the idea that you can, if you have the time and skills, step through a program line by line in a debugger and figure out what’s going on. It wasn’t always true in practice, but the idea was conceptually sound.
This translation program is entirely different. To understand why, it helps to look at the code:
This is the core of the code. It looks a lot like code I’ve written over the years. I open a database, get some lines, manipulate them, and put them back. Rinse, lather, repeat.
from typing import List
# tok and model are the translation tokenizer and model loaded earlier in the script
batch = tok.prepare_translation_batch(src_texts=[d['contents']])    # Arabic text -> input token ids
gen = model.generate(**batch)                                        # run the translation model (forward pass: model(**batch))
words: List[str] = tok.batch_decode(gen, skip_special_tokens=True)   # output token ids -> English text
The first line is straightforward. It converts the Arabic words to tokens (numbers) that the language model works with. The last line does the reverse, converting result tokens to English.
The middle line is the new part. The input vector of tokens goes to the input layer of the model, where it is sent through a 12-layer, 512-hidden, 8-head, ~74M-parameter model. Tokens that can be converted to English pop out the other side. I know (roughly) how it works at the neuron and layer level, but the idea of stepping through the execution of such a model to understand the translation process is meaningless. The most important part of the program cannot be understood in the context of deterministic execution.
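For readers who want to reproduce something similar, here is a self-contained version. The post doesn’t name the exact model, so the checkpoint below is an assumption: the Helsinki-NLP Marian opus-mt family matches the 12-layer, 512-hidden, 8-head, ~74M-parameter description.

from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-ar-en"              # assumed Arabic-to-English checkpoint
tok = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

arabic_texts = ["مرحبا بالعالم"]                        # stand-in for the d['contents'] rows from the database
batch = tok(arabic_texts, return_tensors="pt", padding=True)   # Arabic text -> token ids
gen = model.generate(**batch)                                   # run the translation model
english = tok.batch_decode(gen, skip_special_tokens=True)       # token ids -> English text
print(english)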
In the time it took to write this, it’s translated about 1,000 more tweets. I can have my Arabic-speaking friends do a sanity check on a sample of these translations, but we’re going to have to trust the overall behavior of the model to do our research, because some of the systems we need only work on English text.
So we’re trusting a system that we cannot verify to do research at a scale that would otherwise be impossible. If the model is good enough, the results should be valid. If the model behaves poorly, then we have bad science. The problem is that right now there is only one Arabic-to-English translation model available, so there is no way to statistically examine the results for validity.
And I guess that’s really how we’ll have to proceed in this new world where ML becomes just another API. Validity of results will depend on diversity of model architectures and training sets. That may occur naturally in some areas, but in others, there may only be one model, and we may never know the influences that it has on us.
April 11, 2020 – I’ve put together a website with some friends that shows these values for all countries dealing with the pandemic: DaysToZero.org
March 23, 2020
I think I found a way of looking at COVID-19 data in a way that makes intuitive sense to me – growth rate. For some context, let’s start with the Johns Hopkins scary dashboard:
Screenshot from this morning
This is a very dramatic presentation of information, and a good way of getting a sense of how things are going right now, which is to say, um… not well.
But if we look at the data (from here), we can break it down in different ways. One of the things we can do is look at trends in some different ways. Here, I’m going to focus on the daily death rate. In other words, what is the percentage of deaths from one day to the next? First, let’s look at Italy and Iran, two countries that are currently struggling with the worst of the crisis so far:
These still look horrible, but things do not appear to be getting worse as fast as they were in early February. The curves are flattening, but it’s still hard to see what might happen in the future. We’re just not good at understanding exponential charts like the one on the left at any level much more subtle than “OMG!” Logarithmic charts like the one on the right can be misleading too – that’s a big jump between 1,000 deaths and 10,000 deaths at the top of the chart on the right. And at the bottom, we’re looking at nine deaths.
What happens if we look at the same data as a rate problem though?
That looks very different. After a big initial spike, both countries have a rate of decrease that fits pretty well to a linear trend. So what do we get if we plug the current rate of increase into those linear approximations and solve for zero? In other words, when are there zero new deaths?
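A sketch of that calculation (the daily rates below are made-up illustration values, not the actual Italy or Iran series):

import numpy as np

daily_rates = [0.18, 0.17, 0.155, 0.15, 0.14, 0.13, 0.125, 0.119]   # hypothetical daily death-rate values
days = np.arange(len(daily_rates))
slope, intercept = np.polyfit(days, daily_rates, 1)                  # fit the linear trend
zero_crossing = -intercept / slope                                   # day index where the fitted rate hits zero
print("days from today until zero new deaths:", round(zero_crossing - days[-1], 1))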
As we can see from the far right of the chart, as of today, Italy’s rate of new deaths is 11.89%, or 0.1189. Iran is 7.66% or 0.0766. Using those values we get some good news:
Italy: 27 days, or April 19th
Iran: 15 days, or April 7th
Yay! But…
Let’s look at the US. There’s not really enough data to do this on a state-by-state basis yet, but there is plenty of data for the whole country. Again, we get a good linear approximation. The problem is, it’s going the wrong way:
The increase in our death rate (0.69% per day) is more than either Iran’s or Italy’s rate of decrease. At this point, there is literally no end in sight.
Let’s look at the world as a whole using death rates. This time the best fit is a second-degree polynomial, which produces U-shaped curves:
Also not good. Things clearly improved as China got a handle on its outbreak, but the trends are now going the other way as the disease spreads out into the rest of the world. It’s clearly going to be a bumpy ride.
I’d like to point out that there is no good way to tell here what caused these trends to change. It could be treatments, or it could be susceptibility. Italy and Iran did not take the level of action that China did, yet if the trends continue, their new deaths will drop to zero in about a month. We’ll know more as the restrictions loosen, and there is or isn’t a new upturn as the lid comes off.