Intention and Thought: Code Demos

From Narrow To General AI
Sep 5, 2022

Your thoughts are your intentions. Specifically, they represent what you would like to experience in a given context: the words you’d like to hear, the object changes you’d like to see, etc. The content of a thought is something that proved useful to you at the time it was learned. In other words, a thought is you imagining whatever solved your problem the last time.

If, for example, the presence of a stranger causes a child unease (a “tension”), and the appearance of his mother removes that unease, then when he sees a stranger again, he will immediately think of the appearance of his mother, that is, of the solution.

Below are three examples of how this is currently implemented in code. You may spot a few features and constraints that made this a challenging problem to solve.

All videos have been slowed down to make them easier to understand.

Intention chaining

This first demo is an example of chaining intentions, meaning that one intention caused a second one to be triggered.

The red triangle predicts a problem (the red outline around the viewport indicates that a tension is active), which is relieved by the green square. Afterwards, when the red triangle appears, the agent immediately thinks of the green square. In another context, the green square itself predicts a problem, which is in turn solved by a blue circle.

Now when the agent sees a red triangle, even in a new context (the office), it thinks of the solution, the green square. That is, it imagines seeing it. This recreated image of the square now triggers it to think of the solution to the second problem, the blue circle.
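The chaining described above can be sketched in a few lines. This is only a minimal illustration, assuming that each learned intention can be stored as a link from the thing that predicted a problem to the thing that solved it. The names `learn` and `think` and the string labels are hypothetical and are not taken from the actual implementation.

```python
# Hypothetical sketch: names and data structures are illustrative
# assumptions, not the article's actual implementation.

def learn(memory, problem, solution):
    """Record that `solution` relieved the tension predicted by `problem`."""
    memory[problem] = solution

def think(memory, percept, max_steps=10):
    """Follow learned problem -> solution links, starting from a percept.

    Each imagined solution is treated as if it had been perceived, so it
    can itself trigger the next intention in the chain.
    """
    chain = []
    seen = {percept}
    current = percept
    for _ in range(max_steps):
        solution = memory.get(current)
        if solution is None or solution in seen:
            break  # nothing learned for this input, or we would loop
        chain.append(solution)
        seen.add(solution)
        current = solution
    return chain

memory = {}
learn(memory, "red triangle", "green square")   # first context
learn(memory, "green square", "blue circle")    # second context

thoughts = think(memory, "red triangle")
print(thoughts)  # ['green square', 'blue circle']
```

The `seen` set and `max_steps` limit are added safeguards against cycles in this toy version; the article does not say how the real agent handles them.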

As mentioned in a previous article, thoughts can be interpreted as an intersection of predictions, expectations, plans, intentions, imaginings, and wishes. They can be any and all of these depending on the context.

Relative Positioning

You may notice that the position of the shapes varies between instances. This is realistic, since you can’t guarantee where an object will be in your visual field. The small red-green circle in the videos indicates where the agent is looking, i.e. the direction its eyes are rotated towards. The eye is always drawn towards the greatest high-contrast change, which is why it turns to a newly appearing object. The colour and contrast inputs are oriented relative to the centre of the eye, like cones and rods on the retina.
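As a rough illustration of the gaze rule, the sketch below picks the point whose brightness changed the most between two frames. Treating “greatest high-contrast change” as the largest absolute difference between consecutive frames is an assumption made for this example, and `gaze_target` is a hypothetical name.

```python
# Hypothetical sketch: frames are small grids of brightness values, and
# "contrast change" is approximated by the absolute frame difference.

def gaze_target(previous, current):
    """Return (row, col) of the largest absolute change between two frames."""
    best_change, best_pos = -1, (0, 0)
    for r, (prev_row, cur_row) in enumerate(zip(previous, current)):
        for c, (p, q) in enumerate(zip(prev_row, cur_row)):
            change = abs(q - p)
            if change > best_change:
                best_change, best_pos = change, (r, c)
    return best_pos

before = [[0, 0, 0],
          [0, 0, 0],
          [0, 0, 0]]
after = [[0, 0, 0],
         [0, 0, 9],   # a bright object appears here
         [0, 2, 0]]

target = gaze_target(before, after)
print(target)  # (1, 2)
```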

When the agent imagines the green square, it imagines it where it was relative to the centre of the eye. The assumption is that objects generally keep their positions relative to each other: if you see a body, for example, the head will generally be above it, regardless of where the body was seen. In addition, because the eye centres its view on the object wherever it is, recognition is location invariant.
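One way to picture eye-relative storage is to keep each input point as an offset from the gaze centre and convert back to viewport coordinates at recall time. The sketch below assumes 2D integer coordinates; the helper names `to_eye_centred` and `to_viewport` are hypothetical.

```python
# Hypothetical sketch: points are (x, y) viewport coordinates and the gaze
# is the viewport position the eye is centred on.

def to_eye_centred(points, gaze):
    """Express viewport points as offsets from the gaze centre."""
    gx, gy = gaze
    return {(x - gx, y - gy) for x, y in points}

def to_viewport(offsets, gaze):
    """Place eye-centred offsets back into the viewport at a given gaze."""
    gx, gy = gaze
    return {(dx + gx, dy + gy) for dx, dy in offsets}

# The square is learned while the eye is centred at (5, 5).
square = {(4, 4), (5, 4), (4, 5), (5, 5)}
stored = to_eye_centred(square, (5, 5))

# Later the eye is looking at (12, 3); the imagined square follows the gaze.
recalled = to_viewport(stored, (12, 3))
print(sorted(recalled))  # [(11, 2), (11, 3), (12, 2), (12, 3)]
```

Because only offsets are stored, the same memory applies wherever the object shows up, provided the eye has centred on it.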

The algorithm also ignores irrelevant inputs, like the office background. They are considered irrelevant because they weren’t present at the time the agent learned the object. Each colour or contrast input point (represented by the circles in the videos) is either present in the solution or absent from it, in a boolean manner.
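A simple way to read the boolean present/absent rule is as a subset test: the learned object is recognised if all of its input points are active, and anything else in view is ignored. This is a sketch under that assumption; the tuple layout and the name `is_present` are hypothetical.

```python
# Hypothetical sketch: each input point is an (x, y, colour) tuple in
# eye-centred coordinates, and a learned object is the set of points that
# were active when it was learned.

def is_present(learned, percept):
    """True if every learned input point is active in the current percept.

    Points in the percept that were not part of the learned set (for
    example a background) are ignored rather than counted against the match.
    """
    return learned <= percept

green_square = {(0, 0, "green"), (1, 0, "green"),
                (0, 1, "green"), (1, 1, "green")}
office = {(-3, 2, "grey"), (4, -1, "brown")}

with_background = is_present(green_square, green_square | office)
background_only = is_present(green_square, office)
print(with_background, background_only)  # True False
```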

As a result, when an intention (thought) is elicited, the mind adds the inputs that it wants to see to its current perception, like a hallucination overlaid on top of the external senses. When it elicits the green square, it also sees it in its “mind’s eye”, and on seeing it, it thinks of the next intention, the blue circle.
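The overlay step might look something like the following. It assumes perception is a mapping from eye-centred points to colours and that an imagined input replaces whatever is perceived at the same point; that override rule and the name `overlay` are assumptions made for illustration only.

```python
# Hypothetical sketch: both arguments map an eye-centred (x, y) point to a
# colour. Imagined inputs are laid on top of the external ones.

def overlay(perceived, imagined):
    """Combine external inputs with imagined ones without mutating either."""
    combined = dict(perceived)
    combined.update(imagined)  # assumption: imagined wins where both exist
    return combined

perceived = {(0, 0): "grey", (1, 0): "grey", (2, 0): "brown"}
imagined = {(0, 0): "green", (1, 0): "green"}

scene = overlay(perceived, imagined)
print(scene)  # {(0, 0): 'green', (1, 0): 'green', (2, 0): 'brown'}
```

The combined `scene` is what the rest of the agent would then react to, which is how an imagined shape can trigger a further intention.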

Imagining in 3D space

Imagining in 2D space is relatively simple, but if the agent is moving in 3D, all the while looking around at points of interest, it can be tricky to overlay the prediction in the appropriate place. This is especially true if the new scene is not exactly the same as the original, which is likely to be true in the real world. Moreover, since the learning is done in a single shot, there must be enough margin for variation while still interpreting the inputs in a similar way.
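To illustrate what a margin for variation could mean after single-shot learning, the sketch below accepts a scene when a large enough fraction of the learned inputs is present. The overlap measure, the 0.75 threshold, and the names `overlap` and `recognises` are all assumptions for this example; the article does not specify how the real agent tolerates variation.

```python
# Hypothetical sketch: inputs are sets of eye-centred (x, y) points, learned
# from a single example. A new scene matches if enough of them reappear.

def overlap(learned, percept):
    """Fraction of the learned inputs that are present in the percept."""
    if not learned:
        return 0.0
    return len(learned & percept) / len(learned)

def recognises(learned, percept, threshold=0.75):
    """True if the percept is close enough to the single learned example."""
    return overlap(learned, percept) >= threshold

learned = {(0, 0), (1, 0), (0, 1), (1, 1)}
similar_scene = {(0, 0), (1, 0), (0, 1), (5, 5)}    # one point missing
different_scene = {(0, 0), (7, 7), (8, 8)}          # mostly unrelated

print(recognises(learned, similar_scene))    # True
print(recognises(learned, different_scene))  # False
```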

In the above example, think of the agent as becoming “hungry”, and of the green cube as the food. On contact, that hunger is relieved. The second time the agent runs through the world, there is no green cube, so it imagines the cube where it was before. You can see the green inputs overlaid on top of the grey landscape in the second run-through. The imagined cube follows a similar timing and appears in a similar location as it did the first time.

Positional Tracking in Thought

This last example shows how the agent can track a moving object and complete its trajectory in thought.

In the first run the agent learns the expected trajectory of the red circle. The first half of the trajectory predicts a tension, which is resolved by the second half. This is similar to seeing a baseball tracing a trajectory towards you. On spotting the first part of the trajectory, your mind predicts a problem: you may not catch the ball. So you imagine the second part, where it will land, which helps you solve the problem.
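Trajectory completion can be sketched as storing the path relative to its starting point, splitting it into a cue (first half) and a solution (second half), and replaying the second half wherever the cue is seen again. The exact-match test and the names `learn_trajectory` and `complete` are simplifying assumptions, not the method used in the demo.

```python
# Hypothetical sketch: a trajectory is a list of (x, y) positions over time.

def learn_trajectory(points):
    """Split a trajectory into cue and solution, relative to its start."""
    x0, y0 = points[0]
    relative = [(x - x0, y - y0) for x, y in points]
    half = len(relative) // 2
    return relative[:half], relative[half:]

def complete(cue, rest, observed):
    """Imagine the rest of the path if the observed start matches the cue."""
    x0, y0 = observed[0]
    relative = [(x - x0, y - y0) for x, y in observed]
    if relative != cue:
        return []  # the observed motion is not the learned one
    return [(x0 + dx, y0 + dy) for dx, dy in rest]

first_run = [(0, 0), (1, 2), (2, 3), (3, 3), (4, 2), (5, 0)]
cue, rest = learn_trajectory(first_run)

# The same motion starts somewhere else in the visual field.
seen = [(10, 5), (11, 7), (12, 8)]
imagined = complete(cue, rest, seen)
print(imagined)  # [(13, 8), (14, 7), (15, 5)]
```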

This act of completing the trajectory in thought is shown twice above, once with nothing behind the circle, and a second time with a complex background. In both cases the mind combines the imagined solution with whatever is behind it.


