The Self, Free Will, and Agency in AGI
Summary of posts 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11 and 12: A complete theory of Artificial General Intelligence (AGI) must first outline the invisible, automatic mental functions that drive higher-level, complex behaviour. One such function is “becoming aware of something”; either something external or within one’s own mind. “Becoming aware” is a problem-solving process that results in the formation of concrete memories or thoughts (sights and sounds). These can later be re-experienced as needed. The content you became aware of was the solution to a perceived problem (aka a “tension”). New tensions are learned when other tensions are repeatedly experienced without solution; new solutions are learned as exceptions to these.
Much of current Machine Learning research assumes that the primary function of the conscious mind is to passively model data. The explicit goal of classifiers, LLMs, Behavioural Cloning, indeed any supervised or semi-supervised approach to ML is to faithfully recreate patterns of ingested data. And though Reinforcement Learning¹ may be active when it comes to discovering new behavioural policies, even with curiosity-driven exploration, it is still a passive mirror when modelling the world. Empiricism has had a lasting influence on our definitions of cognition.
Everyday experience, however, doesn’t always align with this interpretation. There is a strong intuition that the “self” has more going on than just processing and compressing patterns of stimuli. Something in us compels us to believe that there is a “you”, a conscious agent that wills, enacts, has intentions, and pushes for change. There also seems to be a “you” in the Cartesian theatre that is actively observing all these events. Our everyday language has embedded in it the assumption of a unified subject, whether or not this is an accurate assumption. And behind this interplay we imagine we can see a motivated spirit, whose presence is the strongest subjective justification for the belief in a “free will”.
Many philosophies have attempted to give voice to this apparent “ghost in the machine”; to describe its function and behaviour, to give it dignity, to make it a cause of action. Some have even contrived to build deities in its image. Others reject it as an illusion; a self-protective measure invented by the organic machine out of selfish and ultimately unjustified ego-drives. Both of these frame the self using definitions of mechanization and freedom that perhaps deter their discursive opponents purely on their moral assumptions. Yet this back and forth will continue as long as there is the smallest gap in understanding the mind, because it is into this gap that people place their moral “idols” and ideals. Were this gap to disappear, so that you could understand the totality of your mind in detail, a new way to frame the active agent would reveal itself.
This series has so far left that gap open. We have focused on the invisible, automatic mechanisms of the mind; those that admit of no exceptions, and thus can be asserted as building blocks for complex learning, thinking and acting. The language of the posts, however, has alternated between using the terms “you” and “the mind”, as in “you see your mind having a thought” and “the mind sees itself having a thought”. This discrepancy needs to be resolved. We couldn’t do that until now, because we hadn’t built up the necessary pieces. We finally have enough of a foundation to fill in the gap.
As far as we know there are three sources — or causes — of mental events, and all of them are outside your direct control. They are (1) the experiences of the senses, (2) the thoughts you have², and (3) the tensions and solutions that drive learning. All three generally have unknown or uncontrollable origins: (1) you can’t fully control or predict what experiences you will have in the world, (2) your thoughts, including their content, appear to pop into your consciousness from deep, inscrutable sources, and (3) the things you feel good and bad about are notoriously difficult to control.
So what is left? In which corner of the mind does the “agent” reside? Even when you look into your own mind, what you see there is likely going to be a surprise — in any case, we would be reluctant to say that a person has agency because they can occasionally predict what they will experience.
There was a phrase in that last sentence which provides a starting point for unravelling this puzzle. I said “when you look into your own mind”. Again, I distinguished “you” and “your mind”; shouldn’t the sentence have been “when your mind looks into itself”? Even if I had written that, how would that have changed anything? The only difference appears to be the linguistic connotation: calling the observer “you” instead of “your mind” gives it a sense of agency, as if it chose to, or was trying to do something. It seems like the difference in agency between “you” and “your mind” comes from focusing your mental efforts. Automatic, routine behaviours done thoughtlessly don’t count: agency requires “directed thinking”.
Directed thinking can be broken down into two parts: the motivations that drive you to pay attention, and the thoughts that result from this activity. You likely chose to look into your own mind because you experienced a problem that doing so would have solved: e.g. you were worried about some recurring thought, or you were unsure about your own feelings, or you just wanted to prove that you could, etc. What you did next, and how it affected your thinking, is still poorly defined. There is a gap between the two (between motivations and subsequent mental actions) where personal agency seems to reside.
Few people understand why and how they learned to think the way they currently do. And when you do manage to bring one aspect into focus (e.g. your motivations), then the other becomes obscure (e.g. your thoughts and memories). So no causal connection is discovered between them.
To show the level of difficulty involved in connecting motivations and thoughts, let’s look at an example. If I asked you to imagine your ideal form of government, then asked how you could prove that it was “fair” (or unfair), there would be a hundred or more thoughts running through your head regarding governments, proof, what is and isn’t justice, and so on; none of which would have an obvious origin. You might say that, “being democratic, it would be just”. How did you know that? Why did you think those thoughts and not others? What made “democratic” pop into your head — even as an abstract idea? What made you think it was relevant to the question of justice?
It is rare that someone can answer such granular questions about their thoughts. Past moments of learning that cause you to think and reason in the way you do are largely forgotten or hidden, so it is hard to explain where such thoughts, when they do appear, came from. “It’s common sense” is a catch-all explanation we use to push aside and ignore this obvious gap in self-understanding. It is much easier to explain where a particular explicit memory comes from (e.g. that time you ate candied apple at a park and got sick), and harder to figure out where our templates for reasoning come from (e.g. connecting justice to thoughts of democracy, then asserting it as a necessary but not sufficient cause). Yet both of these examples must have been learned somehow.
This series has already provided a few clues as to how to answer these questions. There is one piece left that is especially relevant to the problem of agency: it is the connection between ongoing motivations and immediate thoughts.
Since opinions about democracy and justice vary, let’s use another less contentious example: opening and closing a door. The sight of a door itself can cause you to think of many related thoughts (words, images, etc). None of them consistently predominates over the others. However, a specific motivational context, e.g. a tension like “I need privacy”, will narrow down the thoughts that show up to a useful thought of the door closing. Similarly, the tension “the door is closed and someone wants to get in”, will lead to the opposite thought of opening the door.
There’s an obvious connection between the motivation and the thought-images. The image that appears is a solution to the problem being solved. Until now we suggested that what you “become aware of” (learn) is based on what solves the underlying tension. This is still true, but there are many possible thoughts that can be triggered by a piece of sensory input like the word “door” — e.g. thinking of the trimming on the door, the hinges, heavy doors, the band “The Doors”, etc. Parsing which is the useful one out of the noise would be a challenge and would take too long. The presence of a tension narrows down those thoughts that are elicited to the ones it itself created. A different tension would cause a different reaction to the same input: e.g. if someone asks you for your opinion of a door’s appearance, thinking of the trimming may be useful. This makes sense if you consider a thought as a type of action: when faced with a problem, you would want to highlight any solutions that are relevant to it.
This becomes particularly useful when you consider that the mind must not only deal with many possible thoughts but many possible input cues as well. Consider the process of trying to think of a person’s name. You recognize many people by appearance, and for each of them you know many things, including their name. When you need help, and want to signal someone you see and get their attention, it is useful to first think of their name in your mind then repeat it out loud. But the motivation itself (i.e. you need their attention) is only about the problem you are currently experiencing, not about the person who happens to be around at the time. It’s generic. Only when the problem context is combined with the image of the person should it elicit that person’s name in your mind. Without the image of that person’s face (or some other cue), it is unlikely that their specific name would show up.
So the cause of a thought is a combination of two factors. We can call them “circumstance” and “context”. Circumstance is the set of current sensory inputs, including those that are self-generated (aka thoughts). Context is the prevailing tension that is looking to be solved. You can still have thoughts without having a problem context; you just won’t remember them, though you may act on them (routine actions). And you can also have thoughts that are not directly connected to the current problem context. The latter is critical for transferring knowledge from one context to another. All a problem context does is heighten the relative likelihood that its own solutions, if they exist, will be elicited. This is why you tend to experience the world differently depending on what you are trying to do. When you are driving, you will have different thoughts if you are trying to park, versus if you are looking for a restaurant.
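To make the interplay of circumstance and context concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions, not the formal model this series is building towards: the class name `Mind`, its methods, the multiplicative `boost`, and the equal base strengths are all hypothetical choices made for the example. Circumstance is a list of cues, context is an optional tension, and the tension does nothing except raise the relative weight of the thoughts it has previously learned as solutions.

```python
from collections import defaultdict


class Mind:
    """Toy model: thoughts are elicited by cues, and a tension boosts its own solutions."""

    def __init__(self, boost=5.0):
        # cue -> {thought: base association strength}
        self.associations = defaultdict(dict)
        # tension -> set of (cue, thought) pairs that solved it before
        self.solutions = defaultdict(set)
        # Hypothetical factor by which a tension favours its own solutions
        self.boost = boost

    def associate(self, cue, thought, strength=1.0):
        """Record that a cue can bring a thought to mind."""
        self.associations[cue][thought] = strength

    def learn_solution(self, tension, cue, thought):
        """Record that, given this cue, the thought once solved the tension."""
        self.solutions[tension].add((cue, thought))

    def elicit(self, cues, tension=None):
        """Return the relative likelihood of each thought for the given cues and tension."""
        scores = {}
        for cue in cues:
            for thought, strength in self.associations[cue].items():
                weight = strength
                if tension is not None and (cue, thought) in self.solutions[tension]:
                    weight *= self.boost
                scores[thought] = scores.get(thought, 0.0) + weight
        total = sum(scores.values())
        return {t: s / total for t, s in scores.items()} if total else {}


if __name__ == "__main__":
    mind = Mind()
    for thought in ["trimming", "hinges", "The Doors", "door closing", "door opening"]:
        mind.associate("door", thought)
    mind.learn_solution("need privacy", "door", "door closing")

    print(mind.elicit(["door"]))                          # no context
    print(mind.elicit(["door"], tension="need privacy"))  # context favours its solution
    print(mind.elicit([], tension="need privacy"))        # context without a cue
```

Without a tension, no thought about the door predominates. With the tension present, “door closing” becomes the most likely thought while the others remain possible, which leaves room for thoughts unrelated to the current problem. A tension with no cue elicits nothing specific, as in the example of the person’s name.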
When your mind seems to be putting effort in a particular direction, that means that a tension is present, and it is driving the specificity of your thoughts. And because the thoughts that show up are more likely to be solutions related to the ongoing problem, they seem to be chosen, as though they came from yourself. They feel genuinely “you”, even though you don’t know where the thought, say of democracy, came from in that moment. You are willing to defend it because you wanted to do so when you originally made the connection in the past. On the other hand, when random thoughts appear that aren’t obvious solutions to ongoing problems, you are more inclined to say they came out of nowhere (i.e. not you).
Agency, then, is just your complete agreement with your automatic thoughts within the context of a problem. It’s worth remembering that the concept of agency, and the assertion that you have agency, is itself a subsequent interpretation of your own mental events. As such it is a solution to its own problem: the problem being that people are insinuating you are a deterministic automaton. The solution is to look into your mind and recognize something you can call your “will”.
Earlier on, we said that the only remaining aspect of “free will”, namely the gap in our understanding of the mind, was the question of how your mind directs its focus and efforts, and what happens as a result. Here, we can finally close that gap, by connecting the tension that drives thinking with the thoughts that it elicits. When you are faced with a door and need privacy, the reason you think of it being closed is that the sight of the door closing is what solved the problem the last time. When you focus your attention on something, the thoughts that occur to you are echoes of experiences that have served you well before. This even applies to logical reasoning and abstract deliberation, as in the example of democracy above. Your chain of thoughts (sometimes referred to as a stream of consciousness) is a series of such “problem, focus, solution” loops, each adding a new thought (image or sound) onto the pile, and each thought in turn causing new search loops.
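The chaining of such loops can be sketched in a few lines of Python. This is a deliberately simplified illustration, not a definitive implementation: the function `stream_of_thought`, the `memory` dictionary, and every entry in it are hypothetical, and the tension is held fixed for the whole chain, whereas in the account above each new thought may start search loops of its own. Each step recalls whatever solved the tension the last time the current cue was present, and that recalled thought then becomes the next cue.

```python
def stream_of_thought(memory, tension, first_cue, max_steps=10):
    """Chain recalled solutions: each thought becomes the cue for the next loop.

    memory maps (tension, cue) -> the thought that solved the tension last time.
    """
    chain = []
    cue = first_cue
    seen = {first_cue}
    for _ in range(max_steps):
        thought = memory.get((tension, cue))
        # Stop when nothing is recalled, or when the chain would repeat itself
        if thought is None or thought in seen:
            break
        chain.append(thought)
        seen.add(thought)
        cue = thought
    return chain


if __name__ == "__main__":
    # Made-up memories for illustration only
    memory = {
        ("need privacy", "door"): "door closing",
        ("need privacy", "door closing"): "hand on the handle",
        ("someone wants in", "door"): "door opening",
    }
    print(stream_of_thought(memory, "need privacy", "door"))
    print(stream_of_thought(memory, "someone wants in", "door"))
```

The same starting cue produces a different chain under a different tension, and the chain simply ends when no past solution is recalled.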
The next post will address some of the remaining questions around thinking; specifically, how this model of thinking and physical action is more efficient than standard approaches to learning to deal with a chaotic world. We are finally getting to a point where we can start formalizing all this into usable structures and code.
Next post: How AI can survive outside the lab
¹ The one exception in Machine Learning. AI, however, is a broad field.
² Reasoning and logic are included under thinking here, but since reasoning is a manipulation of thoughts, it is therefore a cause of new thoughts; and so it is part of the unseen and uncontrollable aspect of thinking. For example, people who are well-trained in formal logic would have different thoughts appear in their mind than those who aren’t. Yet both wouldn’t fully understand where their respective thoughts came from.