Thinking is an act of imposing one’s will onto truth, not passive prediction
Addressing Nietzsche’s riddle of the unseen causes of thoughts
The primal mystery
A thought comes when “it” wants to, and not when “I” want it to — Nietzsche, Beyond Good and Evil
When Nietzsche wrote the above observation, he was drawing attention to the complete lack of control we humans have over our own thoughts — we cannot decide which thoughts will appear, or when. He pinpointed a flaw in the ideal of the rational man popular in his time: a man who exercised control over his world by searching out and attaining his goals using reason. In contrast, Nietzsche highlighted the completely uncontrolled nature of our own contemplation. “Why did I think this thought, now, and not something else? Why is my stream of conscious thoughts exactly what it is?” He didn’t believe you could ever be the cause of your own thoughts, because thoughts are not something you can choose to have. Any thought you have occurs before you have the opportunity to choose it.
Our modern image of the rational man hasn’t yet accommodated Nietzsche’s observation. Rational man is still a planner: he understands and models the world, and then, after deciding on his interests, he makes predictions based on this model to discover a route from where he is to where he wants to be. Thinking, according to this paradigm, is in essence a kind of directed search algorithm. And whether that search is discrete or probabilistic, this basic theory remains the fundamental pattern of a rational thinker, and the blueprint for all of modern AI:
If the organism carries a “small-scale model” of external reality and of its own possible actions within its head, it is able to try out various alternatives, conclude which is the best of them. — Craik, The Nature of Explanation
The mental model is the arena where imagination takes place. It enables us to experiment with different scenarios by making local alterations to the model. — Pearl, The Book of Why
This paradigm also introduces a gap, a nearly imperceptible one, that prevents model-based AI from becoming completely autonomous. It is what Nietzsche highlighted in his quotation above, namely: if every thought you have occurs without a known cause, how does your mind know which thoughts to have when you’re making plans? Even if you build a fully causal model that could answer your every predictive query, how do you know where to start the chain of causes so as to end up at your goal?
Planning minus prediction
To make this concrete, let’s consider the following example:
Imagine you are hungry. You have a goal — satiation. You know that food causes you to be satiated, and you also know that opening a fridge causes you to find food. In addition, you are aware that opening a dresser causes you to find clothes, and opening a shed causes you to find tools. In summary:
Open a fridge → food
Open a dresser → clothes
Open a shed → tools
And also:
Food → satiation
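To make this toy model concrete, here is a minimal sketch in Python; the dictionary and the function names are mine, purely for illustration:

```python
# A toy forward (predictive) model of the rules above. Every name here is
# illustrative, not part of any real library.
forward_model = {
    "open fridge": "food",
    "open dresser": "clothes",
    "open shed": "tools",
    "food": "satiation",
}

def predict(thought):
    """Return the effect the model predicts will follow a thought, if any."""
    return forward_model.get(thought)

print(predict("open fridge"))           # food
print(predict(predict("open fridge")))  # satiation
```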
If your goal is to be satiated, then clearly opening the fridge would be the most useful of the above actions to take. There is a slight twist in this approach, however: opening a fridge is being presented as an action. But when creating a plan in your imagination, the plan can’t contain any actions, only thoughts of actions. This distinction is important: actions are taken; they are not thought or remembered. A plan that you remember for later use must consist only of mental images and sounds.
This is a distinction that Model-Based Reinforcement Learning (MBRL) — a type of machine learning that makes plans using a model of the world — doesn’t recognize. Since the agent always carries out its plans immediately after making them, remembering a plan over the long term is unnecessary. Humans, on the other hand, can make plans they have no intention of acting on, and merely keep them in memory for when they may be needed later — e.g. “if I won the lottery, here’s what I would do…” This possibility of a “free-floating” plan, sometimes referred to as prospective memory, is not something you can find in the MBRL literature.
So when seeking food, the plan you or I would imagine involves visualizing yourself opening a fridge, followed by seeing — again, in imagination — the food inside. As you recognize, in thought, that this accomplishes your aims, you can “save” it, i.e. lock it in. All that remains is to find some way to carry it out. This last step requires that you have previously learned to connect the thought of opening a fridge with the physical action of doing so; a difficult but not insurmountable challenge.
Such an approach to planning could be extended indefinitely. For example, you may have to remember how to get to the fridge from the bedroom before you can open it. Moreover, each step of the plan can be broken down into sub-steps, such as planning how to get to the kitchen, how to go down the stairs, how to turn the corner, etc. In the end, if your thoughts end up where you want to be, you have something like a plan.
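Here is how such a chain of imagined steps might look in code, continuing the toy model from the earlier sketch (all names remain illustrative):

```python
forward_model = {"open fridge": "food", "open dresser": "clothes",
                 "open shed": "tools", "food": "satiation"}

def plan_forward(start_thought, goal, max_depth=5):
    """Chain predictions from a candidate starting thought. If the chain
    reaches the goal, the chain of thoughts is itself the plan."""
    chain, current = [start_thought], start_thought
    for _ in range(max_depth):
        current = forward_model.get(current)
        if current is None:
            return None              # prediction dead-ends: no plan here
        chain.append(current)
        if current == goal:
            return chain             # a plan made of thoughts, not actions
    return None

print(plan_forward("open fridge", "satiation"))  # ['open fridge', 'food', 'satiation']
print(plan_forward("open shed", "satiation"))    # None: wrong starting thought
```

The second call fails because the search happened to begin from the wrong thought, which is precisely the difficulty taken up next.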
In order to make plans in this manner, however, you must have a starting point. If you start off by thinking of the fridge, you will, by predicting that it contains food, recognize that it will achieve your ends. But why did you think of a fridge? Why did that thought come to you of its own accord? You could have thought of a shed, or a dresser. Any search-based planning method assumes that there is a finite number of “starting points”, when in fact there is an infinite number of thoughts you could begin the chain of inferences with. You could not iterate through all the possible starting points for every plan, no matter how trivial the plan.
Modern MBRL gets around this issue by iterating through a finite number of actions — given the immediate state the agent is in — rather than hypothetical thoughts or images. This approach is somewhat tractable. But what if you are imagining a scenario that will happen in the future, such as how you will tell your boss tomorrow that the reports are not complete? It is not actions, then, that you are planning, but a set of sound-thoughts — what you will say, and what you will hear. And you could say an infinite number of possible things.
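In code, the MBRL workaround looks something like the sketch below. It is a drastic simplification of real MBRL planners, which roll out many steps and score cumulative reward rather than checking a goal directly, but it shows why a finite action set makes enumeration tractable where an infinite space of thoughts would not be:

```python
forward_model = {"open fridge": "food", "open dresser": "clothes",
                 "open shed": "tools", "food": "satiation"}
ACTIONS = ["open fridge", "open dresser", "open shed"]  # finite and fixed

def best_action(goal, horizon=2):
    """Enumerate the finite action set, simulating a short rollout of each
    with the model; tractable only because ACTIONS is small and finite."""
    for action in ACTIONS:
        outcome = action
        for _ in range(horizon):
            outcome = forward_model.get(outcome)
            if outcome == goal:
                return action
    return None

print(best_action("satiation"))  # open fridge
```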
When Nietzsche said that “thoughts come when they want to”, we should pay careful attention to the consequences of that fact. The thought that occurs in any given moment is necessarily the starting point for your planning, since this is where you begin your chain of inferences. It is what you imagine and contemplate as your possible options.
And in real life (thankfully) these tend to be useful to your immediate needs. When I find myself hungry, my thoughts tend to be of my fridge, the nearest takeout locations, restaurants, etc. rather than dragons, license plates, or random math equations. I am glad that my mind is so accommodating, otherwise making even the smallest plan would be an ordeal, and not worth my time. True, it sometimes takes me a bit longer to come up with a candidate thought that sends me in the right direction; but in most cases it is quite fast.
The “forward” model of planning therefore must be reconsidered. So far the theory we have been taking as our foundation is that the mind builds up a causal predictive model, then uses it to discover a hierarchy of plans.
This assumes that the initial candidate thoughts occur of their own accord. Where do those come from? It’s not as if I find myself in a situation where I need something, and then hope that, by chance, the right thoughts occur to me, ones that line up with the right plan. If this were so, creating a multi-step plan would take impossibly long. Rather, the thought of the goal must itself elicit related thoughts that would be useful to that goal.
This reversed approach is often referred to as abduction, or, in formal logic, backwards chaining. Given an imagined goal, you bring to mind the set of possible precursors that might cause it. This is in fact the state you want to be in, and it is, coincidentally, the end state of planning. When you think that you want food, you want that thought of food to cause the thought of the fridge, not just that the thought of the fridge predicts food.
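A minimal sketch of this backward chaining over the same toy model (a real system would map each effect to a whole set of candidate causes, not just one):

```python
forward_model = {"open fridge": "food", "open dresser": "clothes",
                 "open shed": "tools", "food": "satiation"}
# Invert the model so that an effect brings its cause to mind.
inverse_model = {effect: cause for cause, effect in forward_model.items()}

def plan_backward(goal, max_depth=5):
    """Work backwards from the imagined goal until reaching a thought that
    can be acted on directly; read forwards, the chain is the plan."""
    chain, current = [goal], goal
    for _ in range(max_depth):
        cause = inverse_model.get(current)
        if cause is None:
            break
        chain.append(cause)
        current = cause
    return list(reversed(chain))

print(plan_backward("satiation"))  # ['open fridge', 'food', 'satiation']
```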
According to Erik Larson, this particular mode of reasoning is sorely missing in AI. He spends the bulk of The Myth of Artificial Intelligence arguing that logical abduction is at the core of human intelligence:
In open-ended scenarios requiring knowledge about the world like language understanding, abduction is central and irreplaceable.
The field [of AI] requires a fundamental theory of abduction. In the meantime, we are stuck in traps. — Larson, The Myth of Artificial Intelligence
The standard approach of making deductive predictions is only indirectly useful for the purpose of later working backwards to a set of actions; in other words, abduction towards a plan. Any prediction you learn that is not used in planning — even if only planning what to say to others — becomes a wasted effort. Moreover, building complete mental models of the universe is impossible — there is too much to learn; so the mind would do well to be selective in what it learns.
What you want instead of a world model is a set of specific thoughts that will be pertinent to your goals. If you could obtain these, you would be able to plan without needing to learn a predictive model first. Given your goal — or sub-goal for that matter — you want to learn what steps will get you to that goal. This is more valuable than prediction since it gives you something immediately useful: an intent, something to aim for.
And this is what this post is about: an inversion of the order of thinking, where the goal that you want to achieve makes its cause appear in your mind, so you know what to aim for. This latter thought is an image or sound which, if made a reality, will cause your goal to appear — if you open the fridge there will be food.
A twist on operant conditioning
I’ll call such thoughts “intents”, although there is no English word that perfectly encapsulates this type of mental activity. An intent is a recreation of a set of experiences that would, given the circumstances, cause you to achieve your desires. Any time you make a plan you end up with a set of such intents, each leading “backwards in time” from the desired end state to a thought (image or sound) of how to get there, or of what might cause it; a set of “if-then” scenarios, or ideal thoughts of what you want to do.
So how does one build such an inverted causal model? Let’s use the example of getting food again, and imagine an infant that finds herself hungry. Not understanding much of the world, she feels only the tension of hunger, and perhaps the accompanying stomach sensations. Soon, food is presented to her, which she sees, and shortly afterwards her hunger is somehow satiated — though she doesn’t know exactly how. She now has a clue as to what could cause satiation — the image of food. If we assume for the moment that she did not take any action to make this happen (such as crying) then it is not an action that she remembers. It is the sight of the food.
This pattern of learning is similar in many ways to operant conditioning. In operant conditioning, any action that precedes a desired outcome is recorded; if pressing the lever causes food to appear, you record that action. When the agent is subsequently faced with a similar situation and a similar need (e.g. hunger), they recall that same action and carry it out.
In the case of the infant remembering food, however, instead of recording and recalling an action, her mind records and re-elicits a set of sensory experiences as thoughts when in a similar situation. If recalling the action was useful, surely any knowledge of what caused the desired outcome would also be useful.
Let’s consider the consequences of this presupposition. The next time the child is hungry, the thought of food comes to mind, as a sort of intent; it is what the child seeks to experience. The assumption is that the food caused the satiation, and so making food appear again would also make its effect (satiation) occur. Note that the child doesn’t record the predictive chain in the forward direction — namely that food causes satiation — but rather she recalls to mind the desired experience (food) when she feels hungry. This, as mentioned above, is far more useful.
Merely thinking of food is, of course, not sufficient. Often food does not appear immediately, and the infant is in no position to do anything about it. The thought of food that occurs in her mind is a kind of false stimulus — she can see the desired object in her mind’s eye. Now imagine that her mother, sensing something is wrong with the child in her arms, opens the fridge, at which point the child is subsequently fed. This time, the precursor to achieving her desires was not the feeling of hunger, as it was the previous time, but the thought of the food she herself imagined.
Going forward, the image of food will now cause the thought of opening the fridge. This pattern, you may note, is starting to resemble a plan, in that it works backwards from goals to causal intents. The next time she thinks of food, the fridge will come to mind as a means to the goal; an intent which supports the other, higher-level intent. Such a chain can be continued backwards in time indefinitely, all the while building intents forwards in time. And most importantly, no prediction was necessary — she learned only what she needed to know.
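A minimal sketch of this learning pattern, under the same illustrative naming as before: the only thing stored is which experience to bring to mind when a need or goal-thought occurs, and chains of intents fall out of repeated lookups:

```python
intents = {}  # trigger (a need or goal-thought) -> experience to seek

def record(trigger, preceding_experience):
    """Remember whatever experience preceded the desired outcome as its
    assumed cause; no forward prediction is stored."""
    intents[trigger] = preceding_experience

def recall(trigger, max_depth=5):
    """A trigger elicits an intent, which elicits its own intent in turn:
    a chain built backwards in time, recalled forwards."""
    chain, current = [], trigger
    for _ in range(max_depth):
        current = intents.get(current)
        if current is None:
            break
        chain.append(current)
    return chain

record("hunger", "food")       # the sight of food preceded satiation
record("food", "open fridge")  # the opened fridge preceded the food
print(recall("hunger"))        # ['food', 'open fridge']
```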
One benefit of this approach, as opposed to the MBRL practice of creating a world model, is that intents can be generated out of context and stored for later. This is because, as mentioned, when you have thoughts of what you deem to be useful, these are elicited as false sensory inputs. Given some fortuitous combinations, they can make you aware of a problem when juxtaposed with each other. You may bring to mind that a report is due on Friday, which is a fine and useful thing to remember; then suddenly combine that with the realization that today is Saturday, and you missed the deadline. Imagined problems, and even their solutions, can be triggered without their causes actually being present in real life. And since you are not recording the actual actions, but rather only an intent of what would be valuable to see or hear right after seeing or hearing another thing, it is independent of the immediate situation.
Thinking is your will to control your world
Let’s look back one more time at that quote:
A thought comes when “it” wants to, and not when “I” want it to.
It is true that the thought — the thought of a useful cause — comes of its own accord, but it is also true that it would likely be aligned with what would be desirable to you at the time. It is something you want, in the loosest sense of that word, to experience given the situation. Every such thought you have is your best guess of what is useful to you in that context.
Most people recognize that thoughts should, in general, be useful. But the question of what “useful” means in this context may seem peculiar; it appears to take on two different definitions. In one definition, the content of the thought represents something useful, e.g. the fridge or the food are both useful in and of themselves. You could also, however, think of both of those thoughts as plans or intents. They are, in your experience, good to have in mind, since you could know what to recreate through action. Thus having the thought itself is useful, not just its contents.
The most common example of how having a thought could be useful is learning new words. Say that a child wants something tasty, and hears the word “ice cream” right before seeing ice cream arrive. The child remembers that others said the word before the ice cream appeared, and records it as a possible cause for making ice cream appear. At a later date he recalls it and tries to recreate it with his own voice. By remembering what sounds were present, and bringing them into his mind, he gets the chance to “effect” them (make them happen), as long as he learns how to imitate the individual phonemes. He could not do this if he had merely valued the sound when he heard it but failed to remember it.
Creating new thoughts only insofar as they are useful to you suggests that world-modelling isn’t a passive reception and recording of stimuli, but an active process of willing and shaping your perceived world around what you want it to be. This is a far cry from statistical correlation or Bayesian inference. The popular theory that the mind learns to make predictions without consideration for their ultimate utility is overly simplistic, and it ignores what we have long known about humankind: that people think preferentially. It is why we say that “people believe what they want to believe”. You are not a passive mirror to your experiences, but an active participant in shaping your subjective reality around your worldview, around your perspective.
A species grasps a certain amount of reality in order to become master of it, in order to press it into service. — Nietzsche, The Will to Power
This is built into the function of imagination: the ability to think not as the data tells you is true, but as is useful to you. All planning, regardless of the method, inherently involves imagining counterfactuals, that is, a different set of possibilities than what happened. You are always wondering “what if?” This is not a predictive model; it is by nature a creative tool for control, a thought about how to make things different from what they are, how you would like them to be — in other words, an intent. Data itself does not allow any deviation from what was observed, but imagination requires it. You must intervene and make the world different than it was in the past, or at least pick that version of the past that is most agreeable to you.
This is why the mind, in focusing solely on useful experiences, necessarily assumes causation rather than correlation. Any content worth remembering for the purpose of later implementation would only be so if it were expected to cause a desired outcome.
Causality can only be assumed if what you learned was correct at the time. At the heart of causal thinking is the notion of an “intervention”, the ability to make things go one way or another. Where this fails to produce the desired outcome, and you cannot control the world to your will, you have not discovered a cause and effect. The problem recurs, and the mind now looks for another “intervention” or means of effecting its ends; another intent, another assumed cause. Thus the mind assumes causation until the point it realizes it is unable to control the effect through the purported cause. Then, on failing to find utility in its plans, it must update them to a new intent, either erasing the prior one or adjusting it for a particular situation. It does both through the same process as seen above.
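Sketched in the same toy terms, the update rule is simple: act on the assumed cause, and if the goal fails to follow, erase the intent (the hard-coded “world” below stands in for real consequences):

```python
intents = {"food": "open shed"}  # a mistaken intent, for illustration

def act_on(intent):
    """Stand-in for carrying out the intent and observing what follows."""
    world = {"open fridge": "food", "open shed": "tools"}
    return world.get(intent)

def pursue(goal_thought):
    intent = intents.get(goal_thought)
    if intent is not None and act_on(intent) != goal_thought:
        # The purported cause failed to control the effect: the mind has
        # not discovered a cause after all, so the intent is discarded.
        del intents[goal_thought]

pursue("food")               # the shed yields tools, not food...
print(intents.get("food"))   # ...so the intent is erased: None
```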
The goal of this whole system is control. Why even make plans? Why make predictions? Why think at all, if not to be able to control the world to meet your needs? Ultimately you are building an intent-centred edifice of thoughts. Your mind is always wondering: given a set of experiences, what experiences (intents) would be ideal? What would be the best thing to say or to do? As much as it is able, it always makes you think of the right thing in the right moment. This is a different kind of world-model than the predictive one: it is comforting. Human knowledge gives humans comfort.
It is our needs that interpret the world; our drives and their For and Against.
One seeks a picture of the world in that philosophy in which we feel freest; i.e., in which our most powerful drive feels free to function. — Nietzsche, The Will to Power
We can finally answer Nietzsche’s riddle: we can explain where thoughts come from, and why those exact ones appeared. They appeared because experience has taught you that, given the circumstances, they are the useful causes of your ends. Your mind builds an image of the world as you want it to be, not as it is. This is the power of the human will, that it exerts itself even on truth, and fashions reality in its own image.