Why AI has difficulty conceptualizing time

How to bridge the gap between AI and transcendental concepts

From Narrow To General AI
Feb 17, 2024

This is the twenty-third post in a series on AGI.

Time isn’t something you can see, hear, or touch. It’s what Kant referred to as “transcendental”, which means it has no correlates in the senses. You can’t label or classify time. This makes it difficult for AI to build a concept of time. For example, consider what happens when you ask ChatGPT how long you have been speaking to it:

ChatGPT’s estimates of time are wildly incorrect, and likely mimic a transcript from a training conversation. It increments its answer by 3 minutes every time. Update: its responses have recently changed to simply reject the question.

The model struggled with these questions, despite the fact that telling time is a trivial task for a computer. The reason is that there is nothing in the chat history to indicate how long ago the conversation started. Large Language Models (LLMs) essentially “black out” between prompts, and are revived for the next question. They are not living creatures with an ongoing memory of what’s happening around them, e.g. memories of sunrises, class schedules, eating dinner, etc.

This problem is somewhat fixable. An LLM could have a custom routine that directs prompts about time to an internal clock and performs a calculation on the timestamps associated with the relevant text. Though this sounds simple, there is a reason it has not yet been introduced into ChatGPT. It requires the AI to realize when you are asking about “real time” and not, say, hypothetical questions about time, so that it knows to look outside the text at the metadata.

Any implementation of this routine would require additional data engineering along with hand-written conversational markers, e.g. a marker for <the start of the conversation>.
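To make the routine described above concrete, here is a minimal sketch in Python. It is not how ChatGPT or any real system works; the names (Message, REAL_TIME_PATTERNS, answer) are hypothetical, and the hand-written patterns are a crude stand-in for the hard part, namely realizing that a prompt is about “real time”. The first message in the history plays the role of the start-of-conversation marker.

```python
import re
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Message:
    text: str
    sent_at: datetime  # metadata stored alongside the text, not inside it


# Hypothetical, hand-written patterns. A real system would need something
# far more robust to tell "real time" questions from hypothetical ones.
REAL_TIME_PATTERNS = [
    r"how long have (we|i) been (talking|speaking|chatting)",
    r"when did (this|our) conversation start",
]


def asks_about_real_time(prompt: str) -> bool:
    """Return True if the prompt matches one of the hand-written patterns."""
    lowered = prompt.lower()
    return any(re.search(pattern, lowered) for pattern in REAL_TIME_PATTERNS)


def answer(prompt: str, history: List[Message], now: datetime) -> Optional[str]:
    """Answer from the timestamps if the prompt is about real time.

    Returns None otherwise, meaning the prompt should go to the language
    model as usual.
    """
    if history and asks_about_real_time(prompt):
        elapsed = now - history[0].sent_at  # first message marks the start
        minutes = int(elapsed.total_seconds() // 60)
        return f"We have been talking for about {minutes} minutes."
    return None


start = datetime(2024, 2, 17, 10, 0)
history = [Message("Hi", start), Message("Tell me about clocks", start + timedelta(minutes=5))]
now = start + timedelta(minutes=12)

print(answer("How long have I been speaking to you?", history, now))
# We have been talking for about 12 minutes.
print(answer("If time ran backwards, how long would a day last?", history, now))
# None
```

The arithmetic on the timestamps is trivial. All the difficulty sits in asks_about_real_time, which here recognizes only the phrasings someone thought to write down in advance.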

Humans, in contrast, don’t have such built-in markers. Instead, your mind calculates the passage of time using the content of your memories. You can do this because, unlike LLMs, you experience life continually. Your experiences and memories are self-driven — e.g. you eat breakfast because you’re hungry, not because you’re prompted to by a user. When you need to make a determination about time for some reason, you have a set of skills to help you select the relevant memories from within your flow of thoughts. You’ve also learned how to logically arrange these around memories with known times, like a class bell you heard, or looking at your watch, or even your feeling of tiredness due to your sleep-wake cycle. Since you lack a standard internal clock that measures time in hours and minutes, you must resort to circumstantial evidence, along with what knowledge and skills you have to piece it together.
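As a rough illustration of reasoning from circumstantial evidence, the sketch below brackets the time of an undated memory between the nearest memories with known times. It is a toy model, not a claim about how the brain does it; the names (Memory, bracket) are hypothetical, and it assumes the memories are already recalled in the right order, which is itself something a person has to work out.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Memory:
    label: str
    known_hour: Optional[float] = None  # set only for anchors, e.g. a class bell


def bracket(memories: List[Memory], target: str) -> Tuple[Optional[float], Optional[float]]:
    """Return (earliest, latest) possible hour for the memory labelled `target`.

    `memories` is assumed to be in recalled order. None on either side means
    there is no anchor on that side to bound the estimate.
    """
    idx = next(i for i, m in enumerate(memories) if m.label == target)
    if memories[idx].known_hour is not None:
        # The target is itself an anchor, so its time is already known.
        return memories[idx].known_hour, memories[idx].known_hour
    before = [m.known_hour for m in memories[:idx] if m.known_hour is not None]
    after = [m.known_hour for m in memories[idx + 1:] if m.known_hour is not None]
    earliest = before[-1] if before else None  # nearest anchor before the target
    latest = after[0] if after else None       # nearest anchor after the target
    return earliest, latest


day = [
    Memory("heard the class bell", 9.0),
    Memory("talked with a friend"),
    Memory("ate lunch"),
    Memory("looked at my watch", 13.5),
]

print(bracket(day, "ate lunch"))
# (9.0, 13.5)
```

The result is a range rather than a reading, and it tightens only as more anchors are recalled. That is the sense in which the estimate rests on evidence rather than on a clock.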

Human perception of time is fluid, flexible, and by default non-linear. There is no subroutine in your brain that automatically maintains a chronological record from the day you were born, and checks to make sure all your memories are in the correct order; you must figure that out yourself. Hence people often have dreams where they are back in primary school, or are talking to a dead relative, and never notice the temporal inconsistency. Even during waking hours, false memories may find their way into your psyche and confuse its history.

Experiencing time linearly actually takes effort; it is something you must actively create for yourself, and it depends on you wanting to carry it out. There must be some motivation (e.g. social pressure, financial incentives) behind why you do it, and why you should do it right. The more you care, the more effort you will likely put in. Your conclusions will get more accurate as you collect additional facts, cross-reference events, and make corrections to your early guesses.

A disembodied LLM, on the other hand, has no choice but to rely on an internal clock and the timestamps associated with each line of text. Though this may sound like an improvement on our own sloppy and haphazard calculations, deferring to an internal clock (one that is linear, precise, and absolute) is actually overly restrictive. There are benefits to conceiving time in a way that is flexible, vague, and open to revision. For example, you can invent unconventional formulations of time like time dilation and time-reversal. You can develop a rich, deep, and malleable understanding of time which transcends anything that can be given to you by a fixed timekeeper. You don’t merely follow time, as any computer can; you can build a subjective conception of time.

The fact that you can talk about “time” using English words means you have learned to connect the transcendental reality of time to your own conscious abstractions of it. You can perceive time, in the loosest sense, with your introspective eye, in the same way you can perceive existence or causation. You can talk about time as though it were a substance, like a river. You can gauge the value of having time, and try to acquire more of it (e.g. longer life). You can notice your own limitations in measuring time accurately, and invent more reliable timepieces like an hourglass or an atomic clock. You can realize certain interesting features of time, such as how time seems to fly by when you’re having fun, or that time can’t be turned back. You can even personify time, e.g. “father time” or “time is on my side”.

Your many experiences of time can be converted into a “conception” of time. This is not a trivial task.

All this goes beyond simply working within time, which any robot can do. You are pulling out aspects of your experience and connecting them to a named identifier: “time”. That you can even do this is a marvel that is rarely discussed. It raises some perplexing questions. If time is really an internal “sense” of some kind, what exactly are you perceiving when you sense time — is it the general concept of time, a particular moment in time, a difference in two times, a relative time, etc? And how do you convert that into a format you can use to solve problems related to time? This post will do its best to answer those questions. In the process it will challenge some common assumptions about how your perception of time works.

Understanding evolves through error

Time is one of thousands of abstract concepts that are tricky to formally define. Others (at least those that we have words for) include space, conditional, similar, consciousness, existence, shape, natural, object, mind, bad, justice, colour, beautiful, etc. And within these concepts there are subtle shades of meaning, such as the difference between usual, mundane, and regular.

No one is born knowing any of these English words. So there must be a learning process that associates your experiences with them somehow. Unfortunately, it’s impossible to find any consistent or distinct sensory experiences to attach any of them to — e.g. what sense input would you associate “existence” with? Their meaning must somehow arise spontaneously from your own mind, rather than empirically from the world outside. This is why concepts like beauty and consciousness are subjective, and why you can even understand and use many of these words without necessarily being able to clearly define them (e.g. how would you define “consciousness”?) At first glance, this suggests they may be innate or hard-coded into the brain. However, many abstract concepts like software and post-modernism are obviously not innate. So if they’re not innate, and not empirically derived, where do these concepts come from?

In the rest of this post we’ll outline an answer that is based on the approach discussed in the previous post. Specifically, we’ll show that generalized abstractions arise out of individual efforts to solve problems.

The first step to cracking the riddle of abstraction is to notice that people can successfully accomplish tasks without understanding the concepts involved in them. For example, a child can learn to engage in buying or selling without grasping that the tokens he is using are instances of a broader concept called money. He may have only been taught how to give a specific set of coins to a particular vendor and get a particular object in return. This, to him, is a unique case, a special action in a fixed place and time. In other words, he can successfully use money without knowing it’s money. In the same vein, people often correctly use words whose meaning they don’t know, simply based on context and a gut feeling. For example, in the phrase “separate the wheat from the chaff” most people couldn’t say what “chaff” is, but they sense it’s a lower grade of product compared to wheat.

The purpose of these examples is to show that useful understanding can be disentangled from conceptual understanding. Clearly defined concepts like money are a late arrival in a person’s understanding, and are usually learned for the purpose of effective communication. All early learning, on the other hand, is limited to superficial, one-off sensory experiences with no overarching concept behind it. We can expect no more depth than that at the start. Over time, a more “correct” and complete understanding can be built up out of these flawed and spurious associations. This happens as the mind runs into limits on its ability to effectively interact with the problem domain, or to communicate it to others.

Before discussing time, let’s briefly look at a concrete example of such error correction processes by describing how children explore the word “grandma”. A toddler who learned this word may initially behave as if his grandmother is the only one associated with it. As his social circle grows, he may get momentarily confused when other people also seem to share this designation. The child will then adjust and learn to use the word as a label for other people by including special modifications — e.g. “Zoey’s grandma”. He will make these corrections because it is socially useful for him to call both women “grandma”, though in slightly different ways and contexts.

Still, it would be premature to assume he has generalized the concept “grandma” at this point merely because he used the word in a few different cases. For all he knows, the two people just share the same appellation, like a proper name. Many people are named “Caleb”, but no deeper concept of “Caleb” need unite them.

A critical and often ignored consideration when analyzing how we learn words is why the child learned the word “grandma” in the first place. This is generally downplayed as some form of statistical association, combined with paying attention. However, we already discussed in a previous post why simple association is not effective for learning in most unsupervised settings. The real answer as to why a child learns “grandma” is both simple and obvious: the child likes his grandmother, and often wants to get her help and attention. Humans readily learn facts and behaviours that solve their problems. The child discovered that the word “grandma” — regardless of who it is uttered by — was a useful tool to get her affections.

A child may say “grandma” to get her help.

This motive matters; it will shape what he learns about the word “grandma”. For example, when other children use the word “grandma” with respect to their own grandmothers, this child’s grandmother will not respond, and the anticipated loved one will not come. So the motive that drove him to learn the expression will be disappointed. What counts as “successful” identification, including any subsequent corrections the child makes, depends on the problem he is solving.

Someone else saying “grandma” results in disappointment.

The effect that comes from saying “grandma” depends on who says it. This is not true of all words, especially for things of which the child thinks there is only one (e.g. his “daycare”). “Grandma” is relative to the person speaking, and this property becomes the root of relativity in the concept. Such a property is not consciously recognized by the toddler — it influences him through the successes and failures of various actions, and through special cases that need to be learned. Relativity will also play a role later when, as an adult, he starts to use the word “grandmother” more abstractly — e.g. grandmother centrioles in cellular mitosis.

Grandmother is ultimately an abstract concept. You cannot fully learn to use the word simply through sensory associations. And so we come to the heart of this post, and one of the greatest challenges in identification: how to learn to identify abstract entities. The focus of the rest of this post will be on applying the tools discussed in this and previous posts to develop a concept of time.

Before we proceed, I must provide an important caveat. It is impossible to explain in a single article all the ways people build up a given concept, especially a complex one like time. The more nuanced it is, the more layers of subtlety and interconnections it will have within your thoughts. All we can do is provide a set of examples of how particular acts of identification may be learned, and from there derive a general rule.

A rough beginning of “time”

Let’s imagine a child who has never engaged with the topic of time. Her first encounter with the concept could not, therefore, depend on being able to identify time itself directly — no identifiable concept would exist in her mind yet. Her first steps into this domain will necessarily be approximations. They will only become refined through later correction, as with the case of “grandma” above. We can’t expect much more from a person who has just begun to learn about a topic than best-effort learning based on their existing knowledge and skills.

For example, infants lack enough understanding of the world to be able to imagine parents’ future behaviours. So for them, being told that they will get something “later”, or “tomorrow”, or that they must “wait”, are all indistinguishable from being simply told “no”. Hearing any of these words is merely a predictor of disappointment.

Once the infant has enough imagination to predict others’ actions, and to recall and correlate events across time, she may be able to perceive that “later” often meets with a resolution, whereas “no” does not. From this, she may learn a set of anticipatory hopeful thoughts when told she will receive something “later”. These thoughts now serve as an imagined solution to the initial disappointment: she need only “wait”.

Whereas “later” and “no” used to be synonymous, “later” now admits of a distinct set of solutions compared to being simply told “no”. When told “no”, she must push more forcefully for her demands, or accept defeat; not so for “later”. This is the basic distinction in her mind between the two: the solutions to the two problems are different.

As the child grows to be a toddler and learns more about the world, new types of problems begin to register of which she was not aware when she was an infant. One day she may find herself annoyed to be sitting in a waiting room, not doing the things she wants, and being told to behave herself. We adults recognize this as boredom, but the toddler is simply reacting to being forced to sit still. She does not perceive it as her time being wasted.

Later she may be told that she can leave her seat when the clock’s hands reach a particular position. She keeps looking up at the clock, seeing the minute hand get closer to the desired position. Finally, she is allowed to leave. Since clocks only go in one direction, over many such situations, and many waiting rooms, she may learn to predict the changes in the hand’s position in her imagination, and hope for them in real life (i.e. it becomes an intention). It doesn’t need to be a clock either; perhaps it is a church service whose liturgy she has memorized, and whose end she can predict.

By noticing and predicting the changes in the clock, the girl lays the groundwork for her understanding of duration. We need not look for any innate understanding of duration to explain how the girl can engage with the problems above, because she is only observing the clock’s motions. The order of learning is reversed compared to our usual assumptions: duration first emerges out of her encounters with this problem of waiting. As she matures she may develop a more mathematical and refined version of duration. Yet even then the “feeling” of duration will still bear the stamp of its motivated origins: it is rooted in her memories of having to wait against her will, or her disappointment when things she enjoyed ended too soon. Going forward, she will usually care about the duration of an experience, and start looking at the clock, only when she gets bored of the activity. We adults are not significantly different in this respect.

Other motives and other reasons will also arise that require her to address time indirectly. She may be late to class and get chastised for it. Afterwards, she may learn to look at a clock and notice that problems arise for her if her actions don’t coincide with certain hand positions (punctuality). In other cases she may learn that she has to choose between two mutually exclusive activities that can’t both be engaged in at the same time (exclusivity). She may even learn how to say the word “time” without having a concept of it, such as when playing tag and yelling “time!” The function of that word would be to get others to stop chasing her so that she can tie her shoes without hindrance, or without losing the game.

Each such situation will have its own domain and appropriate solution, such as heading out the door at 8:30 instead of 9:00 so as not to be chastised. These situations have not yet been joined together under a common banner as problems of “time”, although some may start to intertwine: she may be forced to wait in detention and therefore be late for soccer practice. The first time these overlapping problems appear, she will deal with them on a case-by-case basis. She is not cognizant of much beyond them. Time, here, is still in its primordial stages.

Understanding is always partial

There is no way for the girl to circumvent this slow process and go directly to an essential understanding of time. All identification, even of abstractions like time, is founded on the moment-to-moment discovery of useful interactions. Her concept of time is her grasp of solving its problems. Her frustration at not having enough time to finish a game, or the embarrassment of not being on time for class are necessary to motivate her to learn about it. We mentioned at the beginning of the post that you can only engage in calculations of the duration of time if you are motivated to do so. We can now see where these motivations came from.

Like the rest of us, the girl in the example above starts off having only one part of the larger picture, and building on it with each new problem. This picture is infinitely large, however. It will never be complete. Understanding a concept like time is not all-or-nothing; even when you ultimately develop a refined concept of time it will never be complete or thorough. Thus you could say that no one has a concept of time, only a subset of useful motives, thoughts, and responses related to it.

The diversity of problems related to time is immense. It ranges from social-linguistic problems, like the child yelling “time!” to influence her play-mates, to more personal ones, like choosing which of two activities to do in a fixed span of time. From these arises your complex understanding of time. For example, your desire to negotiate and trade off time commitments is what makes you understand time as a “substance”; i.e. it is fungible. Your fear of not being able to accomplish all the things you want makes you yearn for more of this substance (longer life). Whenever you “perceive” time, what you are actually seeing are the solutions that seem to be inherent in these interactions, or that seem to emerge out of them.

It is admittedly difficult to grasp how a mind can engage with time without having a singular, comprehensive concept of it. When we try to imagine what a child was thinking when they called for “time!” during a game, we project our own, complex concept of time onto their behaviour, instead of accepting that for them it was simply a pragmatic word in that moment. They said “time”, but they weren’t thinking time.

Our adult, objective understanding of time and its features — its exclusivity, its linear order, etc — is the end product of a lifelong analysis, not its beginning. We have learned so much nuance across such a wide range of practical applications, and forgotten so much of where it came from, that it is compelling to think that our concept of time is innate. This post, I hope, has shown that you would not have reached this understanding if it weren’t for the myriad of specific problems that forced you in that direction.

This is the flaw in adding an objective timekeeper to the LLM, as described at the start of the post. To focus on getting an AI to have the objectively “right” conception of time is to jump the gun. Grounded learning requires that we let the concept of time evolve organically from rough, imperfect seeds. Otherwise, we are preempting the AI’s opportunity to define time by itself, and thus cutting off the option for it to build a useful concept of time. As with an overeager parent drilling a child through rote memorization, the resulting learning is brittle, and will fail when the situation deviates slightly from expectation.

This post only laid the basic foundations for understanding how abstraction, concept creation, motives, and identification are all intertwined when it comes to defining time. In the interest of brevity, the detailed mechanics will be discussed a few posts from now, when we delve into the stream of consciousness. As we’ll see, the details of your stream of thoughts are consequential to the result; every moment of awareness is an instance of identification or assignment, and contributes to the larger corpus of understanding.

Next post: The paradoxes of identification
