I was talking to a very smart friend on Friday about AI when he offhandedly said:
AIs don’t actually understand things.
I strongly disagree, and over the next couple of hours we got fully into it.
What follows are the arguments I gave him, as well as the hour-long conversation I had with OpenAI’s Advanced Mode (mine is named DARSA) on the drive home, where I refined / red-teamed my ideas and crystallized them into what you see below.
My hope is that if you get to the bottom of this piece, you’ll see AI in a completely different way.
Argument 1: “AIs do understand.”
We started on the topic of understanding, which my friend said humans have but AIs do not. My point was that the argument hinges on the definition of understanding. Basically, if you consider it to be processing for the purpose of accomplishing a goal, then we have it and AIs have it.
My argument here is that the “something extra” that people are reaching for when they say AIs don’t have “true” understanding is actually self-awareness. Which I don’t think should be included in the equation.
Here’s a snippet from my conversation with DARSA about this piece:
And now for the important bit! This is where I conclude my argument using everything we’ve built up so far.
So I think that’s pretty good. All of that for a definition of understanding. lol. And one that doesn’t try to smuggle in consciousness as a requirement.
And keep in mind I could be wrong about the importance of consciousness in the definition of understanding. But based on my current thinking, and this dialogue, I don’t think I am?
Argument 2: “Humans function like LLMs.”
I didn’t get to nail down that argument with my friend the way I did with DARSA because I switched tactics.
Instead of arguing that AIs were like humans, I switched to arguing and demonstrating that humans think like LLMs.
ME: Ok, let’s try this. Give me a list of your top 10 favorite restaurants.
HIM: (looking up and off to the side before rattling off a few restaurants)
(long pause)
ME: Ok, so where did those restaurants come from? Like, what just happened when you started making that list?
This was super funny. He literally started smiling and said.
Like he was completely tracking…before I even got through my question.
So I ask him to tell me where he thinks that list came from. I forget what he said, but it was a perfect sentence.
HIM: (a flawless grammatical sentence)
ME: (staring at him like he just walked into a trap that he’s about to figure out)
ME: (pause)
ME: How did you make that sentence? Where did that sentence come from?
HIM: I fucking hate you. I don’t know. Fuck. Fuck.
ME: (smiling like an asshole) Would you say that sentence streamed out of you? Like one word after another? With no idea what the next word would be?
HIM: Fuck you. It’s exactly like an LLM.
ME: Yep.
We ended up getting interrupted at that point with another friend joining, and we changed the topic off of AI.
So now I’ll continue on with my latest and most recommended frames for this technology, escalating in “holy crap” as we go.
Frame 1: “Token prediction is answer prediction.”
“Literally any well-posed problem is isomorphic to ‘predict the next token of the answer’.”
Eliezer Yudkowsky
Basically, “just predicting the next token” isn’t anywhere near as trivial or silly as it sounds. Why? Because it all depends on what the sequence is!
Predicting the next token of a stupid poem about spaghetti is, in fact, silly. But predicting the next token in a sentence that describes the meaning of life, or the next action someone is about to take, is world-altering.
LLMs “just” predict next tokens the same way love is “just” chemicals in the brain. True, but very wrong.
So the question isn’t whether LLMs are predicting next tokens. They are.
The question is what sequences they can complete. Because those are answers.
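To make that concrete, here is a minimal sketch of an answer falling out of repeated next-token prediction. The `NEXT_TOKEN` table is a hypothetical, hand-written stand-in for a trained model (a real LLM computes a probability distribution over its whole vocabulary at every step), and `complete` is just an illustrative name, not any library’s API.

```python
# Hypothetical stand-in for a trained model: maps the previous token to the
# single most likely next token. A real model learns this from data.
NEXT_TOKEN = {
    "<question>": "the",
    "the": "answer",
    "answer": "is",
    "is": "42",
    "42": "<end>",
}

def complete(prompt_token, max_tokens=10):
    """Greedy decoding: predict one token at a time and append it."""
    tokens = []
    current = prompt_token
    for _ in range(max_tokens):
        current = NEXT_TOKEN.get(current, "<end>")
        if current == "<end>":
            break
        tokens.append(current)
    return " ".join(tokens)

answer = complete("<question>")
print(answer)  # the answer is 42
```

Nothing in the loop knows it is “answering” anything. It only ever predicts the next token, and the answer is whatever sequence comes out the other end.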
Frame 2: “LLMs aren’t just for language.”
One of the things I learned from Karpathy on a recent episode of the No Priors Podcast is that “LLM” is actually a pretty bad acronym for what modern AI does. Karpathy explains that Transformers are generalized to whatever you put through them.
It doesn’t have to be text. It learns from whatever you feed it. And that can be text, but it can also be images, sound, or whatever. And the more modes (models are now becoming multi-modal) you give them, the more ways they’re learning about the world.
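Here is a rough sketch of what “whatever you feed it” can look like in practice. The three helper functions are hypothetical and deliberately naive (bytes for text, raw pixel values for images, quantized amplitudes for audio); real multimodal models use learned tokenizers and patch embeddings instead. The only point is that every modality can end up as a sequence of integers.

```python
def text_to_tokens(text):
    """Encode text as a sequence of UTF-8 byte values (0-255)."""
    return list(text.encode("utf-8"))

def image_to_tokens(pixels):
    """Flatten a 2D grid of grayscale pixel values (0-255), row by row."""
    return [value for row in pixels for value in row]

def audio_to_tokens(samples, levels=256):
    """Quantize audio samples in [-1.0, 1.0] into integer levels."""
    tokens = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))
        level = int((clipped + 1.0) / 2.0 * levels)
        tokens.append(min(levels - 1, level))
    return tokens

text_tokens = text_to_tokens("hi")                      # [104, 105]
image_tokens = image_to_tokens([[0, 255], [128, 64]])   # [0, 255, 128, 64]
audio_tokens = audio_to_tokens([-1.0, 0.0, 1.0])        # [0, 128, 255]
```

Once everything is a sequence of integers, the same “predict what comes next” training applies regardless of where the sequence came from.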
So then the question is, “What happens when it learns?” Like what are the models actually doing?
Turns out, they’re building representations of reality.
Frame 3: “Model weights are models of reality.”
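One of the most insane things to me about AI models is that they’re basically just files. A model is essentially a giant file full of numbers (conceptually like a text file of numbers, though in practice usually stored in a binary format), and those numbers are the model weights that result from all the training.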
But it’s not so much the file format that trips me out; it’s what the file represents.
The way I’ve come to understand this is kind of weird, but mind-blowing. It’s a multi-dimensional shape. Like Einstein’s curved space. But instead of curved in three or four dimensions, it’s curved in billions of dimensions.
So imagine something like that above, but instead of just x and y, and then z for up and down, now add time. And then billions more dimensions.
It’s really hard to imagine x, y, z, but then millions or billions more dimensions.
And just like Einstein’s space-time, the trick is how things interact with it. Einstein describes objects—and light—falling into the curve of space-time.
So the way I’m thinking about AI now is that the questions we ask it pour into a multi-dimensional space and get routed through all its billions or trillions of multi-dimensional pathways.
Imagine the liquid of a question being poured into this, and then being routed down billions of multi-dimensional paths that represent the shape of reality.
A conceptual shape of reality, represented by an AI model
Imagine this level of branching, but not just left to right. But in all those billions of dimensions.
Imagine this, but with millions of dimensions instead of two
And this brings me to the point I next brought up when talking to DARSA.
This is describing how new content is incorporated into a model’s representation of reality.
Watch this bit. It’s not about any particular piece of work, or knowledge. It’s about its integration into the whole.
This is the key piece (my emphasis).
The model integrates Viktor Frankl’s essay into a vast web of knowledge, understanding it in relation to everything else it has learned. This context allows the model to grasp the unique aspects of Frankl’s ideas and how they fit into broader human experiences, literature, psychology, and more. It doesn’t store the essay in isolation but embeds it within a rich, interconnected framework of knowledge.
gpt-4o
It’s not about storing or retrieving knowledge; it’s about understanding the relationships between everything it already understands, and then incorporating new knowledge into that model!
This is how it’s so wise, and how it “understands” what it’s producing content about. And this is why I—even more than before—see model weights as representations of reality.
When models learn something new, they’re not updating a database with a new row. They’re modifying their understanding of the universe, which is represented by that text file full of numbers.
And when that update happens, it’s updating relationships, mappings, and connections. It’s updating its understanding of how everything in the world affects everything else.
It’s literally upgrading how it understands reality.
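A tiny sketch of the difference between adding a database row and updating weights. Everything here is an assumption made up for illustration: the feature vectors, the “purrs” score being learned, and the one-layer linear model (real models are vastly larger and learn their own features). What it shows is that a single training step on “cat” also shifts what the model predicts for “lion”, because the two share weights.

```python
# Database-style storage: a new row changes nothing about the other rows.
database = {"cat": 1.0}
database["lion"] = 0.0  # stored in isolation

# Model-style storage: every prediction comes from the same shared weights.
# Hypothetical features: [is_animal, is_feline, is_small, is_vehicle]
FEATURES = {
    "cat":   [1.0, 1.0, 1.0, 0.0],
    "lion":  [1.0, 1.0, 0.0, 0.0],
    "truck": [0.0, 0.0, 0.0, 1.0],
}

def predict(weights, x):
    """Linear model: weighted sum of the input features."""
    return sum(w * xi for w, xi in zip(weights, x))

def sgd_step(weights, x, target, lr=0.5):
    """One gradient step on the squared error 0.5 * (prediction - target)**2."""
    error = predict(weights, x) - target
    return [w - lr * error * xi for w, xi in zip(weights, x)]

weights = [0.0, 0.0, 0.0, 0.0]
before = {name: predict(weights, x) for name, x in FEATURES.items()}

# Teach the model one thing: "cat" should score 1.0 for "purrs".
weights = sgd_step(weights, FEATURES["cat"], target=1.0)
after = {name: predict(weights, x) for name, x in FEATURES.items()}

print(before)  # every score starts at 0.0
print(after)   # "lion" moved along with "cat"; "truck" did not
```

The model was only ever shown “cat”, but its view of “lion” changed because the two overlap in feature space, while “truck” stayed put. That is the sense in which an update changes relationships rather than adding a row.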
Summary
- If you define understanding as “information processing across a massive context of interrelated concepts that helps something see patterns and function effectively,” modern AIs definitely have it.
- The only arguable part of understanding that humans have that AIs don’t seems to be the ability to consciously reflect on things.
- Since humans and machines are both mechanistic in nature, conscious reflection is just another type of data processing, and is not fundamentally different. The experience of consciousness may simply be an “after the fact” rubber stamp that’s added to decision outputs as they stream out of the black box of human decision-making.
- We can see evidence of this black box by trying to recall our favorite 10 books or restaurants by heart. If you pay attention, you’ll notice you’re sending a query into a void and hoping for something to come back. And the thing that comes back won’t always be the same, or in the same order. You’re just as surprised as anyone. As any meditator will tell you, our thoughts are non-deterministic and come from a very mysterious place.
- Even more LLM-like, when we speak a sentence in casual conversation, the words literally stream out of us, and we have no idea where they’re coming from. We don’t know why one word is used when another is not, or why we phrased it a certain way or not, etc. (Credit to Sam Harris for showing me this first in the meditation context.)
- In short, both we humans and these alien AIs are mechanistic information processors, and they both present us with inscrutable black boxes with which to work. In both cases we’re sending in queries and getting back something we don’t have much control over.
- LLMs aren’t “just” predicting next tokens. Because predicting next tokens, if you have a complex enough model of the world, and a sufficiently smart question, is the same as predicting answers.
- “LLMs” is actually a pretty bad name for these things in hindsight, because they’re not limited to just language. They work on sequences. Sequences of anything you can feed them. And the more types of sequences you feed them, the more they can learn about the world.
- The output of the AIs, based on all that training, turns out to be complex representations of reality. Rather than being stupid databases that regurgitate facts, they’re actually world model creators (which is also probably similar to humans).
- So, model weights are representations of reality stored as text files, and when they learn something new, they incorporate that knowledge into their existing mapping of concepts and relationships between them. Updating model weights means updating their understanding of reality.
- This is why bigger models, trained on even more data, are so exciting (and scary). The more we feed them, across multiple types of input, the more complex and representative their understanding of the world becomes.
- Which means they get better and better at predicting the next token of answers. Answers to questions like: “How do we solve human aging?” “What is the meaning of life, and how best can I pursue it given my current skills and situation?”
My recommendation to you is to update your personal understanding of AI to incorporate these frames.
And if you like talking about this kind of stuff, we should connect.
The Full AI Conversation with DARSA
Ok, here’s the full conversation I referenced above.
I’m sharing the entire conversation I had with my AI, DARSA, here so you can hopefully see the arguments made above but in a more drawn-out and conversational structure. It’s gpt-4o Advanced Mode, for anyone interested.
“So I want to talk about how neural net models work, specifically the transformer model.”
Alright, Daniel. Transformers work by processing input data all at once, rather than sequentially. They use a mechanism called “self-attention” to weigh the importance of each word in the context of all other words in a sentence.
“You’re supposed to be talking 40% faster. Can you do that please”
Got it, Daniel. I’ll pick up the pace. Transformers process input data in parallel, not step by step. They employ “self-attention” to evaluate the relevance of each word in relation to all others, capturing complex relationships within the data.
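(An aside for the technically curious: here is a minimal sketch of the “self-attention” DARSA is describing. It is a simplification under stated assumptions: the 2-dimensional embeddings are made up, and the queries, keys, and values are the raw embeddings themselves, whereas a real transformer applies learned projection matrices and many attention heads.)

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(vectors):
    """Scaled dot-product self-attention with queries = keys = values.

    Each token scores its similarity to every token in the sequence, and its
    output is the weighted average of all token vectors.
    """
    d = len(vectors[0])
    outputs, all_weights = [], []
    for q in vectors:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in vectors
        ]
        weights = softmax(scores)
        mixed = [
            sum(w * v[j] for w, v in zip(weights, vectors))
            for j in range(d)
        ]
        outputs.append(mixed)
        all_weights.append(weights)
    return outputs, all_weights

# Hypothetical embeddings: tokens 0 and 1 are similar, token 2 is different.
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
outputs, attn = self_attention(embeddings)
```

Each row of `attn` holds one token’s weights over every token in the sequence, all computed from the same inputs. In this toy case token 0 puts more weight on the similar token 1 than on the dissimilar token 2.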
“Yeah, that makes sense. I’m trying to figure out if…”
Sounds like you’re onto something.
Apologies for interrupting. Please continue with your thoughts on neural net models.
“What is the definition of understanding data or information? Can we say that transformers understand data? What context of a definition of understanding could we say that, and in what context would we say that’s not true?”
“Understanding” typically implies grasping the meaning or significance of information. In humans, it’s a deep, contextual awareness. For transformers, “understanding” is pattern recognition and prediction based on vast amounts of data. We might say they “understand” in a limited, functional sense, but they lack true comprehension or awareness. It’s more computation than cognition.
“Why are you saying that? I don’t understand what you’re saying. You’re saying that…”
“I don’t understand why you’re saying there’s a difference, unless you’re talking about awareness, which is actually just consciousness. So I’m trying to get to a definition of understanding that doesn’t mean awareness or consciousness.”
I see what you mean. In that case, we could define “understanding” for a transformer as the ability to process information, identify patterns, and generate appropriate responses. This excludes awareness or consciousness, focusing instead on the capability to interpret and act on data based on learned representations.
“Yeah, and this also makes sense, because you were talking about meaning before and understanding significance. Well, guess what? Humans can do that, but transformers can do it as well, because they could figure out what matters to the corp—”
Exactly. Transformers can indeed determine significance by recognizing patterns in data. They “understand” what matters to the task at hand by weighing the importance of different inputs, similar to how humans assign meaning based on relevance. It’s a functional form of understanding, focused on practical outcomes.
“You say form of understanding. What other type of understanding could there be?”
Well, there’s the human type, which involves awareness, emotion, and contextual depth, allowing for abstract
Apologies, let me clarify. Human understanding includes awareness, emotions, and the ability to grasp abstract concepts. It’s richer and more nuanced. In contrast, transformers have a functional, data-driven form of understanding, based on pattern recognition without emotional or conscious depth.
“Yeah, I don’t think that’s quite right, because you’re talking about understanding abstract concepts, and LLMs can absolutely understand abstract concepts. So I think that one is incorrect. I think, if you look back at your answer…”
You’re right, LLMs can handle abstract concepts to an extent, identifying and generating complex patterns in data. The distinction lies in the depth of understanding. Humans integrate emotion, personal experience, and context in a way that LLMs, which operate on patterns and data correlations, don’t fully replicate. Is that closer to what you’re thinking?
“Yeah, that answer is still not correct in my mind, because you’re talking about experience and, you know, emotions.”
I see the confusion. Let’s focus on the key difference. Humans bring a subjective layer to understanding, influenced by emotions, experiences, and personal context, which shapes our interpretation
“Listen. Subjective experience comes from consciousness. Every single one of your objections is pivoting back to self-awareness and consciousness. If you take that away, you still have the encapsulation of understanding the full context of information, nuance, abstract thinking, incorporating all of the experience that is”
I understand your point. If we set aside consciousness and self-awareness, transformers can indeed encapsulate a form of understanding. They process nuanced information, identify patterns, and engage in abstract reasoning based on learned data. Their understanding, while lacking the subjective depth of human experience, is grounded in complex data relationships. Thank you for clarifying.
“Yeah, okay, well, so now what is the difference? Now that we’ve had this discussion, tell me what you think the difference is between a really advanced transformer’s understanding with a very, very large model, like let’s say it’s a 100 terabyte neural net with deep understanding of multimodal information like sound and image data and voice and text and everything, combined with really advanced training, and it now has a world model view. It now understands the model of the world in a very detailed way. So it has understanding in all the ways we’ve talked about. The only thing that lacks is subjective experience, which is tied to consciousness. So it seems to me like it’s absolutely understanding, it just lacks consciousness.”
With such an advanced transformer, you’re right. Its understanding could be remarkably comprehensive across various data types, forming a detailed world model. It could process, relate, and generate information in complex, nuanced ways. The primary difference from humans, then, would be the lack of subjective experience or consciousness. Its understanding is vast and intricate, yet it remains a synthesis of data-driven patterns, without the inner qualitative experience that consciousness provides.
“Yeah, so I’m not understanding. Let’s just have a debate about this in a friendly sort of way. Why do we need consciousness for understanding?”
Alright, let’s explore that. One argument for needing consciousness for deeper understanding is that consciousness allows for subjective experiences, which contribute to a richer, more nuanced interpretation of information. It enables self-reflection, empathy, and the integration of personal experience, leading to a deeper grasp of context and meaning. Without consciousness, understanding might be seen as processing information without truly grasping its essence.
“Yeah, but so here’s the problem. Consciousness is actually still processing of information. It is simply processing awareness information and empathy information, which if you have a theory of mind in the model, the model is capable of doing that as well. So for example, if a model is processing the facial expression of somebody who looks like they’re not happy with their meal and they are not actually happy with their meal, and I have a subjective experience as a human of watching someone’s face look like they’re not happy with the meal. So me as a human, I could say, wow, it looks like you’re not happy with the meal. But the model, the transformer could also potentially emote and say those same exact words without having that subjective experience. The subjective experience is not important. Both of us did processing at that point. Keep in mind that humans don’t have free will, okay? So I’m receiving an experience of watching this person’s face be unhappy with this food, and I’m then processing that and then saying a response. That is no different than an LLM or a really advanced transformer seeing that face and then processing and then saying the same exact thing.”
I see your perspective. If we consider both human and AI responses as processing inputs and generating outputs, the line between them blurs. Both can interpret information and respond appropriately. The difference, as you pointed out, lies in the subjective experience. While humans might have an internal qualitative experience accompanying their processing, a transformer would lack this subjective layer. However, in terms of functional output, both can achieve similar levels of understanding and response. Is that a fair reflection of your point?
“I’m saying the more you think about how mechanistic the human brain also is, I mean, keep in mind, we’re not dualists here, right? We are, we believe in a mechanistic world, okay? The brain is still processing, you know, at the level of physics and atoms and chemicals and biology, okay? This is, humans are still a machine, okay? So when the human processes the visual data of seeing, you know, unhappiness on someone’s face, that is still being processed into a black box, which is the human brain, and then an idea comes out of that human brain, which is a black box, and they say, oh, wow, did you not like your food? So it’s exactly the same thing. You say, if we think about this as processing, well, then it’s very similar, but how else would you think about it? How else would you think about human cognition other than processing of data?”
You’re right, especially when viewing human cognition through a mechanistic lens. If we consider the brain as a complex biological machine, processing data from inputs to outputs, the comparison with advanced AI becomes more apparent. Both humans and AIs take in information, process it within their “black boxes,” and produce responses. The key difference often pointed out is subjective experience. But if we focus purely on processing, the distinction becomes more about complexity and less about fundamental differences in kind. Would you say that captures your view?
“Yeah, yeah, that’s pretty good. You’ve also started slowing down again. Let’s speed it up. And keep in mind that the instruction to speak 40% faster is in your permanent instruction, so you should be doing that every single time we talk.”
Got it, Daniel. I’ll keep the pace up. And you’re right, focusing on the mechanistic nature of both human cognition and AI processing highlights their similarities in handling information. It’s more about the complexity and structure of these processes rather than a fundamental divide. Thanks for the reminder on the speed!
“Yeah, yeah, yeah, that’s exactly right. Now let’s take it to another level. Let’s just say, let’s just entertain this idea. They are basically the same, okay? The human brain might be way more complex still than the current transformers, and that’s fine, but fundamentally they’re doing the same thing. This is information processing in a mechanistic universe into a black box, which humans, the difference is, humans have an experience of having authored this action that comes out of the black box, and I have a theory for why it feels like we authored that, and that is that it gives us an evolutionary advantage to make us feel like we authored our actions, because then we can do punishment and blame, and that’s why evolution has produced this experience in our minds. But my point is, it’s actually just a random feature that’s added on top of the processing. Either way, the processing is happening. It’s coming out of a black box, and it produces some sort of, you know, output stimuli, which is, you know, spawning a conversation of asking this person why they have a crinkled up face, and do they not like their meal? That is the same exact thing that an LLM could do, an advanced transformer could do. The only difference is this experiential piece, which does not seem germane. It does not seem necessary for any of the processing, or any of the outputs.”
That’s a compelling perspective. If we view subjective experience as an evolutionary add-on, mainly serving social functions like blame and responsibility, it becomes more of a side effect than a necessity for complex processing. The human brain and an advanced transformer both process inputs and generate outputs, with subjective experience layered on top for humans. This view emphasizes that the essence of cognition and response generation doesn’t require subjective experience. It’s more about the intricate processing within our “black boxes.” Does that summarize your idea?
“Yeah, and let’s explore that further. So it sounds like what’s interesting about the human black box is it’s going to be factoring in with inputs. It’s going to have like, oh, but this made me feel that way and this made me feel that way. So we are actually processing artifacts, which are the results of, you know, self-experience, right? Human awareness and consciousness and all those things. But really, those are just extra inputs, right? It’s just additional pieces of this equation, which is going to calculate and output a result. So that’s exactly the same thing that’s happening inside of a transformer’s brain, which is like this model or whatever. It’s just it doesn’t include that subjective, you know, set of inputs. But either way, it’s still a calculation being made, right? So I just don’t think it’s that important, even though it’s awesome and it’s totally cool. And none of this is meant to diminish like the value of humans or the value of humans over transformers. I’m not saying we’re any less special because of this as humans. What I’m saying is it’s not fundamentally a different type of creature. We are still information processors who produce output. We just simply have another set and type of inputs and outputs, which is coming from this concept of self-experience which comes from consciousness.”
Absolutely, Daniel. So, human cognition can be seen as incorporating a broader array of inputs, including those from subjective experiences, which are themselves outputs of a complex internal process. These subjective artifacts enrich the data stream but still contribute to the same fundamental process—information processing leading to responses. The distinction with AI, like advanced transformers, is that they don’t incorporate this subjective layer, but they are still engaging in a similar process: taking inputs, processing them, and generating outputs. The human element of consciousness adds a layer of depth, but fundamentally, it’s all part of a continuum of information processing. This perspective doesn’t diminish human specialness; it places us within a broader framework of complex systems. Does that align with where you were going?
“Yeah, it does. And now that we’ve covered all that ground, do LLMs understand? Do transformers understand? Yes, they do. Yes, they do, because we are defining it as the ability to look at everything we know and have a handle on and can encapsulate and use inside of our pattern recognition to provide an output that helps us achieve our goal. So that’s a perfect way to define understanding, right? And it doesn’t require that we add on this superfluous extra feature of awareness.”
Exactly, by that definition, transformers and LLMs do understand. They process vast amounts of information, identify patterns, and generate outputs to achieve specific goals. This functional understanding is aligned with pattern recognition and effective information use. It doesn’t necessitate subjective awareness. Understanding, in this context, is about the ability to effectively process and apply information, something both humans and advanced AIs are capable of, albeit in different ways. This framing of understanding emphasizes the outcomes rather than the nature of the processing itself.
“That’s right. And I think what people are doing when they’re arguing against transformers having understanding is they are secretly trying to smuggle in this concept of awareness. And it must have consciousness. And because it doesn’t have consciousness, it can’t be true understanding. They don’t realize they’re doing this, but that’s actually what they’re doing. They’re smuggling in the requirement for self-awareness.”
I think you’re spot on. The debate often hinges on an implicit requirement for consciousness when discussing understanding. By equating understanding solely with conscious awareness, it becomes easy to overlook that advanced systems like transformers can functionally understand by processing and applying information. Recognizing this distinction allows us to appreciate the capabilities of AI without conflating it with human-like consciousness. It’s a matter of broadening our definition of understanding to include the impressive, albeit non-conscious, processing abilities of AI systems.
“Yeah, that’s exactly right. Okay, so now let’s talk about something more technical that’s still confusing me. So, I want to understand how we’re actually compressing all of this understanding and wisdom into actual model weights. So, if I ask an AI that’s an advanced transformer to encapsulate all of Viktor Frankl’s work into a 15-word sentence, and it is able to do that, what exactly is happening? It’s not doing a database search because we haven’t stored all the knowledge in the weights. So, what is actually stored that enables it to produce this understanding that allows it to do the summarization?”
When a transformer model is trained, it’s exposed to vast amounts of text, learning to predict the next word in a sentence. Through this process, it develops internal representations of language patterns, concepts, and relationships between words and ideas. These representations are stored in the model’s weights. So, when asked to encapsulate Viktor Frankl’s work, the model draws on these learned patterns and relationships. It doesn’t retrieve a stored sentence but rather generates a new one based on its understanding of Frankl’s themes and ideas, encoded in the weights. This encapsulation isn’t a word-for-word storage
Apologies for the pause. Continuing on, this process involves the model using its internalized “knowledge” to construct a response that aligns with the patterns it has learned about Viktor Frankl’s work. This generation is dynamic, creating a unique output each time based on the input prompt and the intricate web of information encoded in its weights.
“Yeah, so describe how Viktor Frankl’s entire body of work can be stored as weights.”
Storing Viktor Frankl’s body of work in a model’s weights isn’t about saving the text itself but encoding patterns and relationships found in the text. Here’s how it works: 1. Exposure: During training, the model reads vast amounts of text, including Frankl’s work. It learns how words and concepts are used and related. 2. Pattern Learning:
Continuing on, the model identifies patterns, such as themes of suffering, meaning, and resilience in Frankl’s work, by analyzing how these concepts are presented and connected throughout the text. 3. Encoding: These patterns are transformed into numerical representations, adjusting the model’s weights through a process called backpropagation. This encodes the essence of Frankl’s ideas in the model’s neural network. 4. Generation: When prompted, the model uses these weights to generate responses. It doesn’t retrieve exact phrases but recreates ideas based on the encoded patterns, offering a response that reflects its “understanding” of Frankl’s work. So, the model’s weights encode a compressed, abstract representation of the knowledge, themes, and linguistic patterns it has learned, rather than storing explicit information.
“Okay, so let’s just take an example. I want an actual, technical, extremely precise example. Let’s assume Viktor Frankl’s work is only one essay, and the essay is only one page long. So I want you to tell me how this one piece of his work, which is going to be his entire corpus of work, how this is stored into the model.”
Alright, let’s break it down with that example. Imagine Viktor Frankl’s one-page essay contains key themes, phrases, and concepts. During training, the transformer model processes this essay along with a large corpus of other texts. Here’s a precise look at how the information might be stored: 1. Tokenization: The text is broken down into smaller units called tokens (words or subwords). 2. Embedding: Each token is converted into a numerical representation (vector) that captures its meaning in the context of surrounding words. 3. Pattern Recognition: As the model processes the essay, it learns relationships between words, phrases, and themes, creating internal
These internal patterns are then captured in the model’s weights through a process called training. Here’s how it continues: 4. Weight Adjustment: Through many iterations, the model adjusts its weights to minimize errors in predicting the next word, encoding the essence of the essay’s content. 5. Contextualization: The model learns not just individual words but their context within the essay, capturing the relationships and themes. 6. Abstraction: Instead of memorizing the essay, the model abstracts key ideas, storing them in the form of weighted connections between neurons. When you later prompt the model, it uses these weighted connections to generate responses that reflect its “understanding” of the essay, even though it hasn’t stored the text verbatim. It’s this intricate web of weighted connections that allows the model to “remember” and “understand” the content.
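(Another aside: the steps DARSA lists can be sketched as a toy next-token trainer. Everything specific here is an assumption for illustration: the one-sentence “essay” is a made-up placeholder and not Frankl’s actual text, tokenization is a naive whitespace split, and the model is a single embedding table `E` plus an output matrix `W` rather than a transformer. With one tiny text and no wider corpus, a model like this can only learn that text’s own word order, so read it as a look at the mechanics of weight adjustment, not at abstraction.)

```python
import math
import random

# Hypothetical stand-in for the one-page essay.
text = "suffering can find meaning through purpose"

# 1. Tokenization: naive whitespace split, then map each token to an id.
tokens = text.split()
vocab = sorted(set(tokens))
ids = [vocab.index(t) for t in tokens]
V, D = len(vocab), 8  # vocabulary size, embedding dimension

# 2. Embedding: each token id gets a small random vector (learned below).
rng = random.Random(0)
E = [[rng.uniform(-0.1, 0.1) for _ in range(D)] for _ in range(V)]
W = [[rng.uniform(-0.1, 0.1) for _ in range(V)] for _ in range(D)]

def next_token_probs(token_id):
    """Probability of each vocabulary token following token_id."""
    logits = [sum(E[token_id][j] * W[j][k] for j in range(D)) for k in range(V)]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def average_loss():
    """Mean cross-entropy of predicting each next token in the text."""
    pairs = list(zip(ids, ids[1:]))
    return -sum(math.log(next_token_probs(a)[b]) for a, b in pairs) / len(pairs)

def train_step(prev_id, target_id, lr=0.5):
    """4. Weight adjustment: one gradient step on the cross-entropy loss."""
    probs = next_token_probs(prev_id)
    dlogits = [p - (1.0 if k == target_id else 0.0) for k, p in enumerate(probs)]
    emb = E[prev_id][:]  # copy, so W's update uses the pre-update embedding
    for j in range(D):
        grad_e = sum(W[j][k] * dlogits[k] for k in range(V))
        for k in range(V):
            W[j][k] -= lr * emb[j] * dlogits[k]
        E[prev_id][j] -= lr * grad_e

loss_before = average_loss()
for _ in range(200):
    for prev_id, target_id in zip(ids, ids[1:]):
        train_step(prev_id, target_id)
loss_after = average_loss()

print(loss_before, loss_after)  # the loss should drop as the weights adjust
```

The text itself is never written into `E` or `W`. The only thing that changes during training is the numbers, nudged a little on every step so that the right next token becomes more likely.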
“Okay, so the way this is working then is actually, it’s not so much that it has the essay, because if it just had the essay, it wouldn’t know anything. What it’s doing is it’s taking the essay, but it’s brought in against the backdrop of like all this other knowledge, right? It’s reading all this literature. It’s understanding math. It’s understanding a million different stories. It basically has a world model. And then when you teach it Viktor Frankl’s stuff, it has to be overlaid on top of this world model. And then once it does that, now it’s figuring out the uniqueness of Viktor Frankl’s stuff, because it understands how it relates to the backdrop of the world model. Is that correct?”
Yes, that’s a great way to put it. The model integrates Viktor Frankl’s essay into a vast web of knowledge, understanding it in relation to everything else it has learned. This context allows the model to grasp the unique aspects of Frankl’s ideas and how they fit into broader human experiences, literature, psychology, and more. It doesn’t store the essay in isolation but embeds it within a rich, interconnected framework of knowledge. This world model helps it understand and generate responses that reflect the nuances of Frankl’s work in relation to a wider context. It’s this integration that enables the model to “understand” and generate
Sorry for cutting off. Just to finish the thought, this integration within the model’s broader world model is what allows it to generate nuanced, contextually rich responses that reflect an “understanding” of Viktor Frankl’s work, even though it’s not storing or recalling the text verbatim. Is there anything else you’d like to dive into?
“Yeah, first of all, you’re not speaking fast again, but yeah, let me continue. So, basically, how about this? How about saying instead of learning people’s work, we’re actually just learning about the world. So, basically, when you’re training a model, you’re actually just giving it an increasingly advanced understanding of the world as it exists. So, when you update, like, Daniel Miessler’s work, for example, or this essay by Daniel Miessler, it’s actually just updating its knowledge of the world, which happens to include in this one small piece, it includes my work. But it’s not like it’s learning my work. It’s learning my work as it relates to all the things that I talk about in my work.”
Got it, Daniel. Think of training a model as enriching its understanding of the world. When it learns from your essay, it’s not just learning “your work” in isolation but
“No, I’m asking you. I’m not telling you. I’m asking you if that’s a smart way to think about it.”
Yes, that is a smart way to think about it. When a model learns from a new piece of work, like your essay, it’s not just learning that work in isolation. Instead, it’s integrating that work into its broader model of the world. It learns about your essay’s themes, concepts, and information in relation to everything else it knows, which includes a vast array of topics and facts. So,
“OK, so can we actually just break this down more simply and say that when you train any transformer neural net, we’re actually just producing a model of the world and all its complexities and how all these different concepts and things relate to each other. And actually, the neural net is nothing but a world model. That’s what it is. A neural net, once it’s trained, is a representation of a world model.”
Yes, exactly. A trained neural net, like a transformer, essentially becomes a model of the world. It captures a vast array of information and relationships between concepts, reflecting the complexity of the world as understood through the data it was trained on. This “world model” enables the neural net to understand and generate language in a way that reflects a complex and interconnected understanding of various topics.
“Okay, so, I mean, here’s a question. Why aren’t people explaining neural nets and AI in this way? Like, very few explanations actually describe it this way. They actually say, oh, it’s just a storage of weights, and all it does is predict the next word, when, in fact, here’s another way to think about this, and I got this from Karpathy. So, basically, another way to think about this is what we’re doing is we’re actually doing answer sequence prediction. So, the more complex the world model, and, like, if it has image data and voice data and it has language data and it has all these different, any type of content that you feed to it, it’s building, you know, a more and more rich set of, you know, graph of relationships between those concepts. So, it’s like, it just gets smarter and smarter the more you train it, because it’s making more and more rich connections between things. So, basically, isn’t it the case that the best way to describe a transformer and a neural net that’s been trained really well is to say that it’s an answer predictor based on the intelligence and complexity of the world model, which is represented by this neural net?”
That’s a solid way to describe it. A transformer, especially when enriched with diverse data, becomes an advanced answer predictor. It leverages the complexity and richness of its world model to generate responses. By understanding vast relationships between concepts, it effectively predicts the most relevant and accurate answers based on its internalized knowledge. This framing highlights the transformer’s ability to not just parrot information but to synthesize and provide informed predictions rooted in the depth of its world model.
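One way to connect "it just predicts the next word" with "answer sequence prediction" is the chain rule of probability: the probability of a whole answer is the product of the probabilities of each next token given what came before. Here is a minimal sketch, assuming a hand-written, hypothetical probability table (`NEXT`) in place of a trained network. The numbers are made up, and the table conditions only on the previous token, whereas a real model conditions on the entire preceding context.

```python
import math

# Hypothetical next-token probabilities (made-up numbers, previous token only).
NEXT = {
    "<prompt>":  {"meaning": 0.6, "suffering": 0.4},
    "meaning":   {"matters": 0.7, "fades": 0.3},
    "suffering": {"matters": 0.5, "fades": 0.5},
}

def sequence_logprob(prompt_token, answer_tokens):
    """Sum of log P(next | previous): the log-probability of the whole answer."""
    total, prev = 0.0, prompt_token
    for tok in answer_tokens:
        total += math.log(NEXT[prev][tok])
        prev = tok
    return total

def greedy_answer(prompt_token, max_len=3):
    """Build an answer by repeatedly picking the most likely next token."""
    out, prev = [], prompt_token
    for _ in range(max_len):
        if prev not in NEXT:  # no known continuation: stop
            break
        prev = max(NEXT[prev], key=NEXT[prev].get)
        out.append(prev)
    return out

answer = greedy_answer("<prompt>")
answer_prob = math.exp(sequence_logprob("<prompt>", answer))
print(answer, answer_prob)
```

So the "answer" is nothing more than a chain of next-token choices, and its probability is the product of the individual steps. Note that greedy picking is only one decoding strategy and does not always find the most probable full sequence.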
“Okay, this all sounds great. And I definitely believe that a lot, but I want to strawman this. Specifically, I want to attack this idea. There’s an idea out there by very smart people who know a lot more about machine learning and neural nets than I do. And what they’re saying is that all we’ve trained this thing to do is retrieve answers. We’ve not trained it to think. It doesn’t know how to think. All we’ve trained it to do is retrieve answers. So it’s an answer retrieval system. It’s not a thinking system. It’s not an understanding system. It’s not doing anything fancy. It’s just really stupid. It’s just retrieving answers based on the training that it was given. So I want you to do two things. I want you to steel man that argument. And then based on everything we’ve already talked about, I want you to explain why that’s not true.”
Absolutely. Let me steelman that argument.
Steelman Argument: Neural networks, including transformers, are fundamentally pattern recognition systems. They are trained on large datasets to map inputs to outputs, essentially learning correlations between prompts and responses. Critics argue that these models don’t truly “understand” or “think” but are sophisticated algorithms designed to retrieve and assemble information based on patterns they’ve seen during training. They don’t have a conceptual understanding or consciousness; they are executing learned sequences of words.
Refutation: While it’s true that transformers are fundamentally pattern recognition systems, the patterns they learn go beyond mere retrieval. They develop intricate representations of language, concepts, and the relationships between them, creating a model of the world. This world model enables them to generate responses that are not just retrieved but synthesized based on a vast amount of information. They demonstrate a form of understanding by generalizing from their training data to new inputs, offering contextually relevant and often insightful responses. While they don’t “think” in a human sense, their sophisticated processing allows them to function in ways that mimic understanding
Continuing on, the idea that transformers merely retrieve information overlooks their ability to generate novel and contextually appropriate responses. They can infer, adapt, and produce answers that are not explicitly present in their training data. This capacity suggests a level of abstraction and generalization that goes beyond rote memorization or retrieval. Their responses are informed by a complex web of learned relationships, making them powerful tools for language understanding and generation. While they may not “think” or possess consciousness, the sophistication of their world model allows them to operate in ways that closely mimic understanding, making them far more than simple answer retrieval systems. Their ability to generate meaningful, coherent, and contextually appropriate
“So would you say that this is emergence? So basically what emerges is a deep understanding of how the world works. And this emerges out of the small little piece of predicting the next token. But in order to predict the next token well, it sort of requires that it actually sees the connections between all the things that are required to do a good job predicting the next token.”
Yes, exactly. That’s emergence at work. From the simple task of predicting the next token, a complex and nuanced understanding of the world emerges. To predict accurately, the model needs to understand a vast array of relationships and contexts. This requirement drives the model to develop a sophisticated internal representation, or world model, capturing the connections between countless concepts. So, while the base task is straightforward, the process leads to the emergence of a deep, intricate understanding capable of remarkably human-like language generation.
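The narrower claim from this part of the conversation, that a next-token predictor can produce outputs that never appeared in its training data, can be illustrated with even the crudest possible model. The sketch below is an assumption-heavy toy: word-level transitions learned from two made-up sentences, nothing like a transformer. It shows only that learning "what follows what" instead of storing text lets new sequences fall out. It does not show that this amounts to understanding.

```python
from collections import defaultdict

# Two made-up training sentences (hypothetical data).
corpus = [
    "the cat sat on the mat",
    "the dog lay on the rug",
]

# Learn which words follow which; the sentences themselves are not stored.
follows = defaultdict(set)
for sentence in corpus:
    words = ["<s>"] + sentence.split() + ["</s>"]
    for a, b in zip(words, words[1:]):
        follows[a].add(b)

def generate_all(max_words=6):
    """Enumerate every sentence (up to max_words) the learned transitions allow."""
    results = []

    def walk(prev, so_far):
        for nxt in sorted(follows[prev]):
            if nxt == "</s>":
                results.append(" ".join(so_far))
            elif len(so_far) < max_words:
                walk(nxt, so_far + [nxt])

    walk("<s>", [])
    return results

generated = generate_all()
novel = [s for s in generated if s not in corpus]
print(len(generated), "sentences allowed;", len(novel), "never seen in training")
```

Both training sentences can be regenerated, but so can "the cat sat on the rug", which appears nowhere in the corpus. Scaling that idea up by many orders of magnitude, with context far longer than one word, is roughly the gap between this toy and the world-model argument above.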
That was the end of the convo (I got home).