Getting past Kant’s and Hume’s legacy in AI research
Scientific revolutions, corporate motives, and the need to renew AGI’s philosophical commitments
When Isaac Newton developed a new set of principles for the physical sciences, he changed, or more accurately solidified, the kinds of questions their practitioners should ask. He recognized that the proper way to treat the motions of objects is to reduce them to mathematical calculations, even inventing a new mathematics — calculus — to do so. Although it seems obvious in hindsight, this was not the type of question physical scientists before him asked, because there was no reason to assume a priori that the physical world obeys mathematical rules. It was just as likely, perhaps more likely, that things tend to move or fall to the ground in a vague and imprecise way.
Prior to Newton, when “scientists” studying nature wondered why and through what mechanisms objects moved, the answer they expected to find was qualitative. Lucretius argued that man-made fire was less “fine” than fire from lightning because the latter comes from heaven. The physical world was viewed as a mixture of idiosyncratic characters and personalities, with each interaction giving rise to its own peculiar type of behaviour¹.
Newton abolished that approach — the “quality” of the participants in an interaction was no longer a valid question for physics. His biggest contribution to the field was to introduce the equals sign, “=”, into the definition of “change”; that is, the forces pushing in two directions must be numerically equal to each other. Taking that as his assumption, he retroactively tried to formulate the forces on both sides in a way that fit the framework. By luck or insight, he focused on acceleration (or momentum), which gave him a much-needed foot in the door. In summary, the way he framed his questions — as quantitative forces — was the key to producing useful answers.
“I checked it very thoroughly,” said the computer, “and that quite definitely is the answer. I think the problem, to be quite honest with you, is that you’ve never actually known what the question is.”
“But it was the Great Question! The Ultimate Question of Life, the Universe and Everything!” howled Loonquawl.
“Yes,” said Deep Thought with the air of one who suffers fools gladly, “but what actually is it?” — The Hitchhiker's Guide to the Galaxy
AGI (Artificial General Intelligence) is in a similar position. The study of human-level intelligence is still a nascent field. It does not yet know what its questions are. There is only a vague, abstract intent, something about humans and computers being like each other. But there is no definitive consensus on what the criteria for human intelligence are. And without the right question, you can’t frame inquiry in a way that will give useful answers.
What little AGI has, it has borrowed — like the rest of AI — from adjacent fields. Psychology and cognitive science explore human thinking and reasoning with the goal of predicting observed behaviour across test subjects. Psychiatry generally frames its questions with an eye to “normal”, healthy human behaviour. Sociology often does the same on a larger scale. Each of these has imparted its own contributions to AI research.
However, by far the largest source of framing assumptions for AI research has come from the business needs it supports. In modern research the driving principles that define the success of an AI model are repeatability, reliability, and the obligation to match some desired performance metric or ground truth. These assumptions are enshrined in the so-called “state of the art”: performance benchmarks based on a human-defined metric, where the goal is to replicate an expected pattern of “correct” behaviour. Such benchmarks are useful in production, where 99% success on a repetitive task is more financially lucrative than 97%.
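To see why those two percentage points matter so much in production, and so little elsewhere, here is a back-of-the-envelope sketch in Python. The task volume and cost figures are invented purely for illustration, not taken from any real deployment.

```python
# Back-of-the-envelope: why a two-point benchmark gain matters in production.
# All numbers are hypothetical, chosen only for illustration.

TASKS_PER_DAY = 100_000    # a warehouse-scale, repetitive workload (assumed)
COST_PER_FAILURE = 5.00    # dollars lost per failed task (assumed)

for accuracy in (0.97, 0.99):
    failures = TASKS_PER_DAY * (1 - accuracy)
    loss = failures * COST_PER_FAILURE
    print(f"accuracy {accuracy:.0%}: {failures:,.0f} failures/day, ${loss:,.0f}/day lost")

# At this scale the two-point gap is worth roughly $10,000 a day.
# For a person who performs the same task a handful of times a year,
# the identical gap is barely noticeable.
```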
Unfortunately, they also skip over the fact that humans don’t usually think in terms of probabilities or performance optimization. We jump to hasty, good-enough conclusions, only self-correcting when obliged to. We hold to irrational convictions based on social motivations like pride and egoism. And we are lazy and overcautious when exploring our environment. Unlike a warehouse robot, we rarely perform the exact same activity a thousand times, and so optimizing one or two percentage points of performance is of minimal benefit in everyday life. Accuracy and reliability are the values of an employee or a production robot — they are not necessarily concerns in our personal endeavours.
What is left out of the picture is just as indicative. The yearning for revolutionary creativity, or for spiritual introspection, is alien to business operations. Rare, special, or unique behaviours are treated as outliers to be washed away and ignored. Jokes are “funny” only if more than 20.1% of people mark them as funny on a questionnaire — private jokes don’t exist. Precise generation of 3D spaces and high-resolution video is prized because it can be sold. Business needs have defined these criteria with a view to their own productive goals, and those goals have consequently rippled backwards through research funding.
“If it doesn’t make dollars, it doesn’t make sense.” — Dollaz + Sense, DJ Quik
Even the modern popularity of Large Language Models is not the result of a shift in technology or science — transformer architectures are nearly ten years old now — but of a shift in how effectively they can be delivered to consumers. As business funding continues to drive the direction of research, the focus of researchers themselves is drawn to the sparkle of venture capital. Under these conditions, the dominant questions of the field are unlikely to change.
When paradigms change, there are usually significant shifts in the criteria determining the legitimacy both of problems and of proposed solutions. — Kuhn, The Structure of Scientific Revolutions
Any real sea-change in a field requires practitioners to step back from its fundamental assumptions and goals, and this, by definition, means engaging in philosophy. Unfortunately, many in the field of AI research see philosophy as a dead art. For them, questioning fundamental assumptions is irritating, of dubious value, and points to a lack of productive focus, despite the fact that every major scientific revolution (or revolution in general) has involved reinventing fundamental assumptions.
The optimistic faith in iterative improvements on existing tech, and the hope that the assumptions of the field are enough to carry us forward, are tempting, because they give a solid foundation for productive work within the space defined by those assumptions. The same status quo, however, resists any attempt to break out of those assumptions, since the very possibility seems ludicrous in view of its existing commitments. Why try to build a robot that enjoys listening to music the way humans do? How will that make money?
It’s true that in the last century our understanding of human psychology and intelligence has shattered many barriers and upended just as many old assumptions. The “bitter lesson” of AI is the most recent example: we gave up our reliance on neat logic for scruffier connectionism, and undoing those assumptions has paid off massively in the proliferation of effective Deep Learning models.
Yet there are many philosophical assumptions that AI research has held onto since the time of the 18th-century empiricists. They have largely been inherited indirectly, via the scientific method that AI researchers themselves adopt to do their work. The “rational agent” that forms the foundation of AI models is a phantom and an ideal of scientific objectivity, a projection of what scientists aspire to be, as opposed to what humans actually are. As a Bayesian calculator of probabilities, causes, and effects, it is nearly unrecognizable in everyday thinking. Only a decidedly dogmatic mind could convince itself that Joe on the street is making, or even can make, Bayesian calculations when choosing whether or not to adopt a new puppy.
AI’s ongoing love affair with statistics and probabilities is partly a result of its roots in data science, and partly a legacy of Hume’s empiricism. Hume defined belief as the mind aligning itself with what is statistically most likely to happen. This approach confuses the best practices of data science — following statistical correlations — with the actual practices of human thinking. It is a clear case of mixing “ought” with “is”². It is self-evidently absurd to claim that the human mind is a Bayesian calculator of probabilities and then have to spend eighteen years of a person’s life teaching them how to calculate Bayesian probabilities, because without such training they remain regularly susceptible to innumerable cognitive biases.
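For contrast, here is roughly what a literal Bayesian treatment of the puppy decision from the previous paragraph would involve, written out in Python. Every prior and likelihood is an invented assumption, chosen only to show the machinery that the theory attributes to ordinary minds.

```python
# What "should we adopt the puppy?" would look like if the mind really were
# a Bayesian calculator. Every number below is an invented assumption.

# Prior belief that adopting will work out well for this household.
prior_good = 0.6

# Likelihood of the observed evidence ("the landlord sounded annoyed when
# asked about pets") under each hypothesis -- purely illustrative values.
p_evidence_given_good = 0.3
p_evidence_given_bad = 0.7

# Bayes' rule: P(good | evidence) = P(evidence | good) * P(good) / P(evidence)
p_evidence = (p_evidence_given_good * prior_good
              + p_evidence_given_bad * (1 - prior_good))
posterior_good = p_evidence_given_good * prior_good / p_evidence

print(f"posterior belief the adoption goes well: {posterior_good:.2f}")
# -> about 0.39. Nobody on the street performs this update; they pet the
#    puppy and decide.
```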
Even Hume conflated his ideal, and the ideal of rationality, with reality itself. He recognized that actual, lived experience consistently shows people are not guided by the “preponderance of evidence”, yet he simultaneously dismissed this as human weakness. Hume remained blind to this self-contradiction in his theories until the end. Nor was the contradiction caught by modern cognitive science research, which, by its very construction, relies on statistical predictability across populations of test subjects. The failure of humans to think statistically was thus obscured by the practice of studying humans with population models, which, of course, invariably showed that they did³. The Humean snake was eating its own tail.
Other philosophies have also bequeathed AI a dubious inheritance. Kant rightly recognized that time and space, as well as other “categories” (unity, necessity, etc.), are inventions of the mind rather than something imparted to us by reality. In practical AI implementations this has given rise to the use of fixed internal clocks — which humans don’t have — and predefined 3D spaces in which robot movement is mapped and calculated, in addition to a plethora of mathematical and logical inductive biases which make up the complicated architectures of modern AI.
Kant himself was never able to account for how those concepts of time, space, or causation could migrate from being forms of intuition or perception — i.e. how your mind innately processes information — to concepts in the understanding — i.e. how you consciously think and talk about them. Your brain may work in time, but that does not mean you would ever know about time, any more than a clock could. Kant left it as a magic “intuition”, an innate feeling for time and space given to the brain by the brain, alongside its experiences. This principle is still widely adopted across AI research, despite the fact that it has no explanatory power and cannot be practically implemented.
‘Transcendental’⁴ concepts and mechanisms of all types, including deduction, categorization, spatial reconstruction, taxonomic hierarchies, planning, optimization, and objectification (segmentation), continue to be hard-coded into modern AI. Any feature of thinking that can’t be imported from sensory experience becomes a fixed part of the architecture, on the presumption that sensory experience and innate concepts are the only two ways of gaining knowledge (they are not). The resulting agents are inflexible and unable to adapt to unusual or changing conditions, such as when a taxonomic hierarchy turns out to be circular, or when the terms of a logical syllogism are ambiguous. They end up resembling closed-minded stereotypes of robots: rigid, uninventive, and only useful within predefined, constrained environments.
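As a toy illustration of that brittleness, consider a hard-coded “is-a” lookup of the kind described above. The taxonomy and helper function are hypothetical, but the failure mode is the point: the mechanism works only while the data respects the assumption baked into the architecture.

```python
# A hard-coded taxonomic lookup: follow "is-a" links upward and collect
# every ancestor. The taxonomy is a made-up example.
taxonomy = {
    "poodle": "dog",
    "dog": "mammal",
    "mammal": "animal",
}

def ancestors(term: str) -> list[str]:
    """Walk the 'is-a' links upward; silently assumes they form a tree."""
    chain = []
    while term in taxonomy:
        term = taxonomy[term]
        chain.append(term)
    return chain

print(ancestors("poodle"))   # ['dog', 'mammal', 'animal'] -- works as designed

# Now violate the baked-in assumption with a circular definition:
taxonomy["animal"] = "organism"
taxonomy["organism"] = "animal"

# ancestors("poodle")  # would now loop forever: the fixed mechanism has no
#                      # way to notice, question, or repair its own assumption.
```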
Computationalism has also had an influence on AI, though in an unexpected way. The lesson AI has taken from that philosophy is not (just) that intelligence can be computed on hardware — a fair assumption in my opinion — but that intelligence resembles computation. The latter was never part of the assumptions of computationalism; it was at most hinted at. This has led to fixed pipelines of perception, planning, and action which deny any possibility of, say, planning without taking any action (e.g. thinking of a joke but not telling it to anyone), or perception for the sake of learning (e.g. studying finance when you get confused by your taxes), or even action for the sake of learning (e.g. turning a package over to feel how its contents shift). All of these behaviours, which are learned in humans, are either manually built into the AI’s architecture or they simply don’t happen.
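A caricature of such a fixed pipeline, sketched in Python (the class and method names are mine, not those of any particular framework), makes the restriction visible: because the stages are welded together, there is no room for planning that ends in no action, or for perceiving and acting purely in order to learn.

```python
# A caricature of the fixed perceive -> plan -> act loop. All names here are
# illustrative, not drawn from any specific robotics or agent framework.

class ToyWorld:
    def sense(self) -> str:
        return "wall ahead"

    def execute(self, action: str) -> None:
        print(f"executing: {action}")

class PipelineAgent:
    def perceive(self, world: ToyWorld) -> dict:
        # Perception exists only to feed the planner.
        return {"observation": world.sense()}

    def plan(self, state: dict) -> str:
        # Planning exists only to choose the next action.
        return "turn_left" if state["observation"] == "wall ahead" else "move_forward"

    def act(self, world: ToyWorld, action: str) -> None:
        # Every cycle must terminate in an action on the world.
        world.execute(action)

    def step(self, world: ToyWorld) -> None:
        # The three stages are welded together in a fixed order.
        self.act(world, self.plan(self.perceive(world)))

agent = PipelineAgent()
agent.step(ToyWorld())  # -> executing: turn_left

# Nowhere in this loop can the agent plan something and deliberately not act
# on it, perceive purely to learn, or act purely to see what happens; those
# behaviours must be bolted on by hand, or they never occur.
```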
One could argue that AI’s interest in philosophy ended around the time of Kant and the scientific revolution of which Kant was the philosophical product. Hegel, Nietzsche, Deleuze, and Bergson are essentially unknown to the field. Even the term “postmodernism” is taboo. The philosophical assumptions of empiricism are so thoroughly taught to students in the sciences — since they are necessary for them to do their jobs — that, quite early on, those students form a picture of human intelligence modelled on that scientific ideal. It’s not that such principles are inappropriate for science; they are, after all, its foundation. It’s that we’ve invented an organon of cognitive tools for one specific field — science — then proceeded to project it onto our entire model of the human mind.
There is a lot in human perception and cognition that is irreducible to Bayesian calculations. Unfortunately, “airy-fairy” notions like the sublation of concepts through contradiction (Hegel) or the eidos in phenomenological inquiry (Husserl) are antithetical to the progress of practical research, and even more antithetical to business outcomes. Here, commerce and scientific empiricism are essentially aligned. Researchers need a set of quantifiable measures, definitions, and goals to aim for so that everyday “puzzle solving” (to use Kuhn’s term) can continue; and businesses need a set of measurable outcomes to predict financial success.
Only in philosophy do people still let themselves doubt their theoretical commitments, while becoming versed in the broad spectrum of fundamental topics and theories to which others willingly turn a blind eye. The fact that the field of AGI has not yet clearly defined its assumptions, or even its subject matter, should give one the motivation to reconsider what its assumptions should be. A Newtonian shift in perspective may open up a new road to progress, a paradigm change built on hitherto unrecognized foundations.
I have, in my own way, tried to do just that. I’ve reintroduced motivations into rational thinking, broken down concepts into their individual moments and mental changes, reduced consciousness from a continual state of existence to isolated islands of awareness, and replaced reward optimization with discrete, lazy goals (satisficing). Many of these approaches are unknown to, or incompatible with, modern AI research. They are still — in my estimation — the best way forward. Yet I would not have proceeded in this direction had I adopted the assumptions of the broader field. It’s not that empiricist assumptions or the lessons of transcendental idealism are wrong — they are far better than their precursors. It’s just that they are insufficient to get us where we’re going.
¹ To be fair, Lucretius was one of the earliest thinkers to try to unify this variety of behaviours under common principles, but his laws remained qualitative, not mathematical.
² Ironic, since Hume was the first to clearly articulate the difference between “is” and “ought”.
³ For example, a given argument moves from “30% of people did X” to “a given person has a 30% chance of doing X”, which is then taken to imply that “a brain has a 30% probability of doing X”, an inference that doesn’t follow.
⁴ I use the term “transcendental” loosely here, not as Kant used it.