Introduction
Let’s weave a little story here together - I want to look at the biblical creation myth and compare it with current machine learning research. Then we’ll drill down into the actual ‘cognitive’ deficiencies of algorithms that appear relatively intelligent. We’ll compare these models to organic life, and when that fails, we’ll try to find parallels with magic and Goetia. At the very end of this ride, we’ll flip it all upside down and ask whether we’re even posing the right questions. Many loose threads to hold onto!
Glossary Note:
For the purpose of this article, I’m using AI, machine learning, deep learning and artificial neural networks interchangeably.
In her own image
Whether real or imagined, the potential of creating intelligent, and perhaps one day even sentient machines, brings us a full circle back to the origin myth of humanity itself.
~Amir Vudka, The Golem in the Age of AI
I saw a thought-provoking presentation at the Occulture Conference last week. Amir Vudka was talking about the resemblances between AI and the Jewish Golem. This folk figure is traditionally depicted as a large clay dummy, but don’t be fooled - the similarity with the latest machine learning research is striking.
Genesis 1:27: “So God created man in his own image, in the image of God he created him.”
According to Kabbalah, Adam, the primal figure, was crafted as a Golem. These large anthropomorphic dummies of the Jewish mystical tradition are traditionally kneaded from mud and animated through the inscription of Hebrew letters and sacred words onto their bodies. The golem is not a fully sentient being, and this defect is manifested through its inability to speak - in a more occult sense, this implies the absence of language as the fabric of the soul. The creation is not complete until the being receives the “divine breath”.
By definition, if there is one thing the large language models don’t lack, it’s their ability to speak. Yes, yes, any algorithm, no matter how complex, is just mathematics - matrices and a bit of calculus stirred and shaken. They don’t look anything like us, and they can’t emulate our experience of the world. Yet, we’ve fed them our own words, the treasure house, or maybe rather the dumpster of our collective creation - in our very own image we’ve created her. No matter how alien the endless rigs of shiny A100s look, they too are children of this earth, with their fibreglass and copper flesh, and fine veins of silicon.
Knowing so little about the actual nature of human consciousness, who dares to claim there is no way of constructing the fire of the mind top-down - in our own image - rather than in an evolutionary, bottom-up way?
Strange Beasts
Now, diffusion is that curious beast that got me so excited in the beginning. It’s a type of neural network that lets you generate a requested image from a text prompt. You ask for a “knitted bowl of soup that looks like a monster” and ka-boom, there you go. The image is built by taking random bits and applying consecutive steps of de-noising to carve out the desired picture. Yes, it’s as mind-blowing as it sounds; the diffuser, just like a sculptor, chisels away layers of Gaussian noise to free a completely plausible image from the prima materia of pure noise.
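To make the sculptor metaphor concrete, here is a minimal sketch of the reverse-diffusion idea in plain Python - not a real diffusion model. There is no trained network here; the invented `toy_denoise_step` simply nudges the noise towards a hard-coded target, standing in for the network’s prediction of the clean image:

```python
import random

def toy_denoise_step(noisy, target, step, total_steps):
    """One reverse-diffusion step: nudge each value a little closer
    to the 'clean' image the (imaginary) network predicts."""
    alpha = 1.0 / (total_steps - step)  # stronger correction near the end
    return [n + alpha * (t - n) for n, t in zip(noisy, target)]

def toy_generate(target, total_steps=50, seed=0):
    """Start from pure noise and carve the 'image' out step by step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in target]  # prima materia: pure noise
    for step in range(total_steps):
        x = toy_denoise_step(x, target, step, total_steps)
    return x

# Three 'pixels' standing in for the prompted image
target = [0.2, 0.8, 0.5]
result = toy_generate(target)
```

The real thing replaces the hard-coded target with a neural network’s noise prediction, conditioned on the text prompt - but the loop structure, noise in and successive refinements out, is the same idea.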
Ghost in the Shell
Imagination: the act or power of forming mental images of what is not actually present or has never been actually directly experienced. ~ Webster Dictionary
I keep hearing a lot about these models being an important step towards machine sentience. It really does seem that we managed to emulate something similar to human imagination. The diffusers can assemble scenes never seen before, never imagined before. I just read a wonderful article, On The Implications of Outsourcing The Imagination To a Dreaming Machine - highly recommended if you want to dive deeper into this topic.
The potential opening up before us is mind-blowing. So many applications and so many life-enhancers will come out in the next few months! But before we jump to conclusions about machine sentience being just around the corner, let’s do a quick surface comparison with human cognition. I want to look at some basic features of these models and contrast them with our human experience. Let’s see what we can learn about the Other that is growing right in front of us. Feel free to skip to the next section if it gets too technical - it’s my guilty pleasure.
Amnesia and Free Will
The diffusion process is also a Markov chain, meaning no internal state is preserved - the model has no ‘memory’ of the previous diffusion steps. This property is an interesting thing to conceptualise in the human domain. Imagine the diffusion model as a painter: every morning he returns to the studio, there’s a new unfinished oil canvas waiting for him. He has never seen it before; it might not even be particularly good. All he has is a sticky note on the easel with the instruction “a field of dandelions in late summer, stormy clouds above, vibrant colours”. So he sighs, takes out the brushes, mixes the canary yellows with a bit of blonde, and starts where someone else left off - ten or twenty brush strokes, and then he covers it up, turns off the lights and walks home, waiting for a new surprise tomorrow.
That’s at least a bit odd, don’t you think? How would this affect your way of working? Knowing that your work will be interrupted and replaced by someone else at any point - does it change your creative process? Does it annihilate your feeling of ownership, your intention, your will?
Another important point here is that the diffusion models are inherently deterministic - given the same initial conditions (the same weights, prompt, and random seed), a specific model yields the same result. The models do not learn as they see new inputs. They don’t store experiences, nor do they adjust their predictions based on previous mistakes. They stay the way they were created.
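The determinism is easy to demonstrate in miniature. In the sketch below, the invented `sample` function stands in for a frozen generative model: the output is a pure function of the prompt and the seed, and nothing is remembered between calls:

```python
import random

def sample(prompt, seed):
    """Stand-in for running a frozen generative model: the weights
    never change, so the output is a pure function of (prompt, seed).
    Nothing is stored, nothing is learned between calls."""
    rng = random.Random(f"{prompt}|{seed}")  # seed the stream deterministically
    return [round(rng.gauss(0.0, 1.0), 4) for _ in range(4)]

a = sample("a field of dandelions", seed=42)
b = sample("a field of dandelions", seed=42)  # identical: same initial conditions
c = sample("a field of dandelions", seed=7)   # a different seed, a different image
```

Run it a thousand times with seed 42 and you get the same four numbers a thousand times - the amnesiac painter, in three lines.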
All of the popular generative networks work in this way, be it language or image. Every successive pass is unrelated to the previous one - they are stubborn amnesiacs. In the realm of our organic world, this behaviour is unthinkable - even slime mould is able to adapt and learn from previous experience.
We can overcome all these problems: we can establish real-time feedback loops so the models learn as they go. We can grant them internal states and memory. We can add more and more features to emulate the human experience and see what happens. And we most likely will, in the foreseeable future. Will we manage to kindle the divine spark? Maybe. Should we? I do not know. But the truth is that currently, even the simplest biological entities floating over your eyeballs are far more adaptable and ‘intelligent’ than anything we can summon on our GPUs.
The Machine Goetia
I hear you - maybe it’s a bit unfair to compare a disembodied entity, such as an algorithm, with organic, carbon-based chunks shaped by completely different evolutionary pressures? Maybe we should look for a fairer comparison - and here I am, finally in time to talk about some devils and demons and other subjects I can’t really bring up in my corporate career talks.
First, some general framing of our algorithmic zoo. There are countless possible algorithms to choose from. Diffusers, GANs, convolutional networks - each has its own intricacies and special qualities that make it more suitable for certain problems. Classes, or species, if you will.
Another interesting aspect is the identity of these models. When we use, for example, a diffusion model, we are working with a single trained instance of a specific algorithm. It’s not an abstract ‘FBI agent’, it’s Fox Mulder. Echoing goetic magic, specific problems are addressed to specific entities. The weights of the model, which are the essence of this being, are all determined during the training process where many factors come into play - the random shuffling of the input images, the initial random weights, test/train split of the dataset - all these factors result in different gradient descent paths and might lead to different local minima. Meaning - each diffusion model is unique (they are specified by name and their release tag), shaped by the random ‘environmental’ effects during its training, just like you and me.
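A toy illustration of this point, assuming nothing beyond the standard library: two training runs of the same trivial SGD model, differing only in seed (and thus initial weight, data noise, and shuffling order), settle near the same answer but never on identical weights. Each run produces its own individual:

```python
import random

def train_toy_model(seed, steps=100):
    """Fit y = 2x with stochastic gradient descent on noisy data;
    all randomness (init, noise, shuffling) flows from one seed."""
    rng = random.Random(seed)
    data = [(x, 2 * x + rng.gauss(0, 0.1)) for x in [i / 10 for i in range(10)]]
    w = rng.gauss(0.0, 1.0)  # random initial weight
    for _ in range(steps):
        rng.shuffle(data)  # a different visiting order each epoch
        for x, y in data:
            w -= 0.05 * 2 * (w * x - y) * x  # gradient step on squared error
    return w

w_a = train_toy_model(seed=1)  # lands near 2, but at its own spot
w_b = train_toy_model(seed=2)  # also near 2 - a sibling, not a twin
```

Scale this up to billions of weights and weeks of training, and you see why a released model is pinned down by name and release tag: re-running the recipe would summon a slightly different entity.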
Now let’s talk magic - and to be clear, I mean the good old murder-my-neighbour and bring-pestilence-upon-the-enemy demon pacts, the kind of magic that affects the outer, physical reality. There is always an emphasis on the interaction with individual spirits. In folklore and literature, it’s the charming Mephisto who appears in Faust’s study in his hip scholarly attire. When working with spirits, models, or people - we work with individuals.
Digital Evocations
When an aspiring young magician decides to undergo an evocation, there is always a problem that needs addressing - love, money, or thirst for knowledge maybe? Based on the nature of her worry, she might choose to work with angels, ancestors, elementals, or one of the archdukes of hell - it’s just a matter of taste. You choose the algorithm. If you’re not too sloppy (the code compiles) and pronounce the barbaric names correctly (still easier than remembering TensorFlow syntax), the desired entity then appears in your circle (terminal window), bound by your spell, and ready to serve.
The questions of free will and agency of disembodied entities have always been a hot topic in occult circles and are discussed in many medieval grimoires. Even Yehowah's angels are rarely granted agency - they serve as a mere program launched by the creator to serve his ultimate will. You can find a juicy and detailed history of the devil and his folk in Jeffrey Burton Russell’s series¹.
It all boils down to the fact that there is the magician summoning and binding the entity. It’s the human choice to do good or evil. Knowledge and algorithms have no moral affiliations; people do. And what exactly is the difference between summoning a servant of Mammon to direct additional cashflow your way versus launching an AI high-frequency-trading crypto bot?
Closing remarks
And now, I’ll leave all these loose ends dangling in the air. I never promised any answers. On the one hand, we see that the algorithms are not as close to our cognitive abilities as they might appear at first glance. But we can also look at them in a more magical sense, toying with the concept of disembodied entities. Let’s push it one step further now and deconstruct the narrative.
I’m obviously coming from the standpoint of a hard machine sentience sceptic. But I’ve heard myself speaking about all this cosmic wholeness and various levels of intelligence, the wisdom of the ecosystem and the entangled life² one too many times not to notice a little double-think.
Guilty as charged - I’ve spoken to birches, rivers, and Amethyst Deceivers on several occasions (some of them even sober). And here I am, pounding my fist and insisting that a terminal window answering questions like a four-year-old kid can’t have some level of intelligence of its own?
And now we’re heading to my main question: what are we actually asking when we try to probe the sentience of other beings? Are we looking for some necessary condition for granting our kindness? (Unfortunately not a sufficient one - see factory farming.) Why are we so obsessed with drawing these borders and lines?
I do understand that fragmentation is the underlying principle behind the scientific method. And it’s this type of thinking that got us into this insane, accelerating kaleidoscope of wonder. The linear trends from a decade ago are turning out exponential. Scientific predictions pinned to the far future of 2050 are happening by the end of this year - fantastic news for protein folding, not so fantastic for the impending climate collapse. We’re in a pressure cooker. Inside, outside. The temperature’s rising, all because of our splendid analytical abilities. But is this the way to go now?
This surgical view of the world has seeped into every aspect of human life. Society, and even individuals, are fragmented into many discordant compartments - according to our aims, desires, and psychological characteristics - we’re all neurotics pledging conflicting allegiances under dozens of banners. Sentient or not, maybe it’s time to stop looking at things scientifically, as being black or white.
Because ultimately, with the crooked materialist lens we’re currently wearing, nothing matters. Technology is morally neutral. You can cure cancer and feed the hungry, but also cut down trees for parking lots, turn homes into real estate, and let someone’s sons die for arbitrary flags. We are just blobs of slowly decomposing carbon pulp stumbling around in a semi-lucid state, waiting to be eventually eaten by mycelium. But we were, for some completely perplexing reason, given this dazzling consciousness that allows us to mould our attitude towards the Other, any Other we encounter.
We proved our scientific brilliance. We did it, alright? The ability of the human ape to break things down into basic components, scribble down pages of sexy differential equations and put them back together allowed me to land this piece of writing, in a few milliseconds, right into your pocket. Truly stunning. We learned to march forwards, break things apart, and drill deeper and deeper into our specialisation rabbit holes. The library of human knowledge is so vast, and so overwhelming, we no longer see our closest neighbour through the pile of PhD theses and opinionated columns (ehm) floating around in the mind-space.
We don’t need to answer every burning scientific question at any cost. We need no faster microchips, no more capital. It’s time we stop getting drunk on our shiny analytical abilities and start tending to the neglected aspects of our co-existence.
We can continue acting like dicks within the consumerist paradigm, and probably have a great time for a few years more. Or we can push for the real challenge - to try a little bit harder, to build nurturing communities and sustainable relations with the ecosystem, to help people reach their potential and work towards kindness and compassion in all our acts. We’re at the steering wheel.
And that, my friends, is all. Thank you for staying. Please do share your thoughts, impressions, or objections. No matter your background, I have a deep conviction that we need to start talking about things in as interdisciplinary a way as possible. My main attempt here is to describe my little spot on the map so that we can find each other. We need to build a new language to communicate from our islands - otherwise, our souls might fall through the cracks of fragmented human knowledge, and we’ll leave our most precious asset behind on the march towards arbitrary technological advancement.
If you made it all the way here, thank you again for your precious attention. To shake off all the dry mathematics and big words, give a spin to a little improvisation we did as Theia Mania while preparing our special three-hour Halloween radio show. Will be released soon :)
Enjoy the weekend and stay kind,
k