
Ilya Sutskever – We’re moving from the age of scaling to the age of research

“These models somehow just generalize dramatically worse than people. It’s a very fundamental thing.”

Dwarkesh Patel

Nov 25, 2025

Ilya & I discuss SSI’s strategy, the problems with pre-training, how to improve the generalization of AI models, and how to ensure AGI goes well.

Watch on YouTube; listen on Apple Podcasts or Spotify.

  • Gemini 3 is the first model I’ve used that can find connections I haven’t anticipated. I recently wrote a blog post on RL’s information efficiency, and Gemini 3 helped me think it all through. It also generated the relevant charts and ran toy ML experiments for me with zero bugs. Try Gemini 3 today at gemini.google
  • Labelbox helped me create a tool to transcribe our episodes! I’ve struggled with transcription in the past because I don’t just want verbatim transcripts, I want transcripts reworded to read like essays. Labelbox helped me generate the exact data I needed for this. If you want to learn how Labelbox can help you (or if you want to try out the transcriber tool yourself), go to labelbox.com/dwarkesh
  • Sardine is an AI risk management platform that brings together thousands of device, behavior, and identity signals to help you assess a user’s risk of fraud & abuse. Sardine also offers a suite of agents to automate investigations so that as fraudsters use AI to scale their attacks, you can use AI to scale your defenses. Learn more at sardine.ai/dwarkesh

To sponsor a future episode, visit dwarkesh.com/advertise.

(00:00:00) – Explaining model jaggedness

(00:09:39) – Emotions and value functions

(00:18:49) – What are we scaling?

(00:25:13) – Why humans generalize better than models

(00:35:45) – Straight-shotting superintelligence

(00:46:47) – SSI’s model will learn from deployment

(00:55:07) – Alignment

(01:18:13) – “We are squarely an age of research company”

(01:29:23) – Self-play and multi-agent

(01:32:42) – Research taste

Ilya Sutskever 00:00:00

You know what’s crazy? That all of this is real.

Dwarkesh Patel 00:00:04

Meaning what?

Ilya Sutskever 00:00:05

Don’t you think so? All this AI stuff and all this Bay Area… that it’s happening. Isn’t it straight out of science fiction?

Dwarkesh Patel 00:00:14

Another thing that’s crazy is how normal the slow takeoff feels. The idea that we’d be investing 1% of GDP in AI, I feel like it would have felt like a bigger deal, whereas right now it just feels…

Ilya Sutskever 00:00:26

We get used to things pretty fast, it turns out. But also it’s kind of abstract. What does it mean? It means that you see it in the news, that such and such company announced such and such dollar amount. That’s all you see. It’s not really felt in any other way so far.

Dwarkesh Patel 00:00:45

Should we actually begin here? I think this is an interesting discussion.

Ilya Sutskever 00:00:47

Sure.

Dwarkesh Patel 00:00:48

I think your point, about how from the average person’s point of view nothing is that different, will continue being true even into the singularity.

Ilya Sutskever 00:00:57

No, I don’t think so.

Dwarkesh Patel 00:00:58

Okay, interesting.

Ilya Sutskever 00:01:00

The thing which I was referring to not feeling different is, okay, such and such company announced some difficult-to-comprehend dollar amount of investment. I don’t think anyone knows what to do with that.

But I think the impact of AI is going to be felt. AI is going to be diffused through the economy. There’ll be very strong economic forces for this, and I think the impact is going to be felt very strongly.

Dwarkesh Patel 00:01:30

When do you expect that impact? I think the models seem smarter than their economic impact would imply.

Ilya Sutskever 00:01:38

Yeah. This is one of the very confusing things about the models right now. How to reconcile the fact that they are doing so well on evals? You look at the evals and you go, “Those are pretty hard evals.” They are doing so well. But the economic impact seems to be dramatically behind. It’s very difficult to make sense of, how can the model, on the one hand, do these amazing things, and then on the other hand, repeat itself twice in some situation?

An example would be, let’s say you use vibe coding to do something. You go to some place and then you get a bug. Then you tell the model, “Can you please fix the bug?” And the model says, “Oh my God, you’re so right. I have a bug. Let me go fix that.” And it introduces a second bug. Then you tell it, “You have this new second bug,” and it tells you, “Oh my God, how could I have done it? You’re so right again,” and brings back the first bug, and you can alternate between those. How is that possible? I’m not sure, but it does suggest that something strange is going on.

I have two possible explanations. The more whimsical explanation is that maybe RL training makes the models a little too single-minded and narrowly focused, a little bit too unaware, even though it also makes them aware in some other ways. Because of this, they can’t do basic things.

But there is another explanation. Back when people were doing pre-training, the question of what data to train on was answered, because that answer was everything. When you do pre-training, you need all the data. So you don’t have to think if it’s going to be this data or that data.

But when people do RL training, they do need to think. They say, “Okay, we want to have this kind of RL training for this thing and that kind of RL training for that thing.” From what I hear, all the companies have teams that just produce new RL environments and add them to the training mix. The question is, well, what are those? There are so many degrees of freedom. There is such a huge variety of RL environments you could produce.

One thing you could do, and I think this is something that is done inadvertently, is that people take inspiration from the evals. You say, “Hey, I would love our model to do really well when we release it. I want the evals to look great. What would be RL training that could help on this task?” I think that is something that happens, and it could explain a lot of what’s going on.

If you combine this with the generalization of the models actually being inadequate, that has the potential to explain a lot of what we are seeing: this disconnect between eval performance and actual real-world performance, which is something we don’t even fully understand today, not even what we mean by it.

Dwarkesh Patel 00:05:00

I like this idea that the real reward hacking is the human researchers who are too focused on the evals.

I think there are two ways to understand, or to try to think about, what you have just pointed out. One is that if it’s the case that simply by becoming superhuman at a coding competition, a model will not automatically become more tasteful and exercise better judgment about how to improve your codebase, well then you should expand the suite of environments such that you’re not just testing it on having the best performance in coding competition. It should also be able to make the best kind of application for X thing or Y thing or Z thing.

Another, maybe this is what you’re hinting at, is to say, “Why should it be the case in the first place that becoming superhuman at coding competitions doesn’t make you a more tasteful programmer more generally?” Maybe the thing to do is not to keep stacking up the amount and diversity of environments, but to figure out an approach which lets you learn from one environment and improve your performance on something else.

Ilya Sutskever 00:06:08

I have a human analogy which might be helpful. Let’s take the case of competitive programming, since you mentioned that. Suppose you have two students. One of them decided they want to be the best competitive programmer, so they will practice 10,000 hours for that domain. They will solve all the problems, memorize all the proof techniques, and be very skilled at quickly and correctly implementing all the algorithms. By doing so, they became one of the best.

Student number two thought, “Oh, competitive programming is cool.” Maybe they practiced for 100 hours, much less, and they also did really well. Which one do you think is going to do better in their career later on?

Dwarkesh Patel 00:06:56

The second.

Ilya Sutskever 00:06:57

Right. I think that’s basically what’s going on. The models are much more like the first student, but even more. Because then we say, the model should be good at competitive programming so let’s get every single competitive programming problem ever. And then let’s do some data augmentation so we have even more competitive programming problems, and we train on that. Now you’ve got this great competitive programmer.

With this analogy, I think it’s more intuitive. Yeah, okay, if it’s so well trained, all the different algorithms and all the different proof techniques are right at its fingertips. And it’s more intuitive that with this level of preparation, it would not necessarily generalize to other things.

Dwarkesh Patel 00:07:39

But then what is the analogy for what the second student is doing before they do the 100 hours of fine-tuning?

Ilya Sutskever 00:07:48

I think they have “it.” The “it” factor. When I was an undergrad, I remember there was a student like this that studied with me, so I know it exists.

Dwarkesh Patel 00:08:01

I think it’s interesting to distinguish “it” from whatever pre-training does. One way to understand what you just said about not having to choose the data in pre-training is to say it’s actually not dissimilar to the 10,000 hours of practice. It’s just that you get that 10,000 hours of practice for free because it’s already somewhere in the pre-training distribution. But maybe you’re suggesting there’s actually not that much generalization from pre-training. There’s just so much data in pre-training, but it’s not necessarily generalizing better than RL.

Ilya Sutskever 00:08:31

The main strength of pre-training is that: A, there is so much of it, and B, you don’t have to think hard about what data to put into pre-training. It’s very natural data, and it does include in it a lot of what people do: people’s thoughts and a lot of the features. It’s like the whole world as projected by people onto text, and pre-training tries to capture that using a huge amount of data.

Pre-training is very difficult to reason about because it’s so hard to understand the manner in which the model relies on pre-training data. Whenever the model makes a mistake, could it be because something by chance is not as supported by the pre-training data? “Support by pre-training” is maybe a loose term. I don’t know if I can add anything more useful on this. I don’t think there is a human analog to pre-training.

Dwarkesh Patel 00:09:39

Here are analogies that people have proposed for what the human analogy to pre-training is. I’m curious to get your thoughts on why they’re potentially wrong. One is to think about the first 18, or 15, or 13 years of a person’s life when they aren’t necessarily economically productive, but they are doing something that is making them understand the world better and so forth. The other is to think about evolution as doing some kind of search for 3 billion years, which then results in a human lifetime instance.

I’m curious if you think either of these are analogous to pre-training. How would you think about what lifetime human learning is like, if not pre-training?

Ilya Sutskever 00:10:22

I think there are some similarities between both of these and pre-training, and pre-training tries to play the role of both of these. But I think there are some big differences as well. The amount of pre-training data is very, very staggering.

Dwarkesh Patel 00:10:39

Yes.

Ilya Sutskever 00:10:40

Somehow a human being, even after just 15 years with a tiny fraction of the pre-training data, knows much less. But whatever they do know, they know much more deeply somehow. Already at that age, you would not make the mistakes that our AIs make.

There is another thing. You might say, could it be something like evolution? The answer is maybe. But in this case, I think evolution might actually have an edge. I remember reading about this case. One way in which neuroscientists can learn about the brain is by studying people with brain damage to different parts of the brain. Some people have the most strange symptoms you could imagine. It’s actually really, really interesting.

One case that comes to mind that’s relevant. I read about this person who had some kind of brain damage, a stroke or an accident, that took out his emotional processing. So he stopped feeling any emotion. He still remained very articulate and he could solve little puzzles, and on tests he seemed to be just fine. But he felt no emotion. He didn’t feel sad, he didn’t feel anger, he didn’t feel animated. He became somehow extremely bad at making any decisions at all. It would take him hours to decide on which socks to wear. He would make very bad financial decisions.

What does it say about the role of our built-in emotions in making us a viable agent, essentially? To connect to your question about pre-training, maybe if you are good enough at getting everything out of pre-training, you could get that as well. But that’s the kind of thing which seems… Well, it may or may not be possible to get that from pre-training.

Dwarkesh Patel 00:12:56

What is “that”? Clearly not just directly emotion. It seems like some almost value function-like thing which is telling you what the end reward for any decision should be. You think that doesn’t sort of implicitly come from pre-training?

Ilya Sutskever 00:13:15

I think it could. I’m just saying it’s not 100% obvious.

Dwarkesh Patel 00:13:19

But what is that? How do you think about emotions? What is the ML analogy for emotions?

Ilya Sutskever 00:13:26

It should be some kind of a value function thing. But I don’t think there is a great ML analogy because right now, value functions don’t play a very prominent role in the things people do.

Dwarkesh Patel 00:13:36

It might be worth defining for the audience what a value function is, if you want to do that.

Ilya Sutskever 00:13:39

Certainly, I’ll be very happy to do that. When people do reinforcement learning, the way reinforcement learning is done right now, how do people train those agents? You have your neural net and you give it a problem, and then you tell the model, “Go solve it.” The model takes maybe thousands, hundreds of thousands of actions or thoughts or something, and then it produces a solution. The solution is graded.

And then the score is used to provide a training signal for every single action in your trajectory. That means that if you are doing something that goes for a long time—if you’re training on a task that takes a long time to solve—it will do no learning at all until you come up with the proposed solution. That’s how reinforcement learning is done naively. That’s ostensibly how o1 and R1 are done.

The value function says something like, “Maybe I could sometimes, not always, tell you if you are doing well or badly.” The notion of a value function is more useful in some domains than others. For example, when you play chess and you lose a piece, you know you messed up. You don’t need to play the whole game to know that what you just did was bad, and therefore whatever preceded it was also bad.

The value function lets you short-circuit the wait until the very end. Let’s suppose that you are doing some kind of a math thing or a programming thing, and you’re trying to explore a particular solution or direction. After, let’s say, a thousand steps of thinking, you conclude that this direction is unpromising. As soon as you conclude this, you could already provide a reward signal to the decision you made a thousand timesteps earlier, when you chose to go down this path. You say, “Next time I shouldn’t pursue this path in a similar situation,” long before you actually came up with the proposed solution.
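To make the short-circuit concrete, here is a toy sketch (an illustration added for clarity, not code from anyone’s actual training stack; the chess-flavored states, rewards, and value estimates are all hypothetical). It contrasts the naive outcome-only scheme, where one final grade is broadcast to every action, with a value-function scheme that scores each step by its immediate reward plus the change in estimated value:

```python
# Toy illustration: outcome-only credit assignment vs. a value-function signal.
# Outcome-only RL broadcasts one final score to every action in the trajectory;
# a value function V(s) gives per-step feedback via r_t + V(s_{t+1}) - V(s_t).
# All states, rewards, and value estimates here are made-up placeholders.

from typing import Callable


def outcome_only_signals(num_steps: int, final_reward: float) -> list[float]:
    """Naive scheme: no learning signal exists until the episode is graded,
    and then every action receives the same signal."""
    return [final_reward] * num_steps


def td_signals(states: list[str],
               rewards: list[float],
               value_fn: Callable[[str], float]) -> list[float]:
    """Value-function scheme: each action is scored by its immediate reward
    plus how much it changed the estimated value of the position, so a
    blunder is penalized right away instead of at the end of the game."""
    signals = []
    for t, r in enumerate(rewards):
        signals.append(r + value_fn(states[t + 1]) - value_fn(states[t]))
    return signals


if __name__ == "__main__":
    # Move 2 loses a piece; the game itself is only graded (-1 for a loss)
    # at the very end.
    states = ["even", "even", "down_a_piece", "down_a_piece", "lost"]
    rewards = [0.0, 0.0, 0.0, -1.0]
    V = {"even": 0.0, "down_a_piece": -0.5, "lost": 0.0}  # terminal value 0

    print(outcome_only_signals(len(rewards), final_reward=-1.0))
    # [-1.0, -1.0, -1.0, -1.0] -> every move blamed equally, only at the end
    print(td_signals(states, rewards, lambda s: V[s]))
    # [0.0, -0.5, 0.0, -0.5]   -> the piece-losing move is flagged immediately
```

In the sketch, the move that loses the piece gets a negative signal as soon as the position is judged worse, which is the early feedback a value function provides.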

Dwarkesh Patel 00:15:52

This was in the DeepSeek R1 paper: that the space of trajectories is so wide that maybe it’s hard to learn a mapping from an intermediate trajectory to a value. And also, in coding for example, you’ll have the wrong idea, then you’ll go back, then you’ll change something.

Ilya Sutskever 00:16:12

This sounds like such lack of faith in deep learning. Sure it might be difficult, but nothing deep learning can’t do. My expectation is that a value function should be useful, and I fully expect that they will be used in the future, if not already.

What I was alluding to with the person whose emotional center got damaged, it’s more that maybe what it suggests is that the value function of humans is modulated by emotions in some important way that’s hardcoded by evolution. And maybe that is important for people to be effective in the world.

Dwarkesh Patel 00:17:00

That’s the thing I was planning on asking you. There’s something really interesting about emotions as a value function, which is that it’s impressive that they have this much utility while still being rather simple to understand.

Ilya Sutskever 00:17:15

I have two responses. I do agree that compared to the kind of things that we learn and the things we are talking about, the kind of AI we are talking about, emotions are relatively simple. They might even be so simple that maybe you could map them out in a human-understandable way. I think it would be cool to do.

In terms of utility though, I think there is a thing where there is this complexity-robustness tradeoff, where complex things can be very useful, but simple things are very useful in a very broad range of situations. One way to interpret what we are seeing is that we’ve got these emotions that evolved mostly from our mammal ancestors and then were fine-tuned a little bit while we were hominids, just a bit. We do have a decent number of social emotions, though, which mammals may lack. But they’re not very sophisticated. And because they’re not sophisticated, they serve us so well in this world, which is so different from the one they evolved in.

Actually, they also make mistakes. For example, our emotions… Well actually, I don’t know. Does hunger count as an emotion? It’s debatable. But I think, for example, our intuitive feeling of hunger is not succeeding in guiding us correctly in this world with an abundance of food.

Dwarkesh Patel 00:18:49

People have been talking about scaling data, scaling parameters, scaling compute. Is there a more general way to think about scaling? What are the other scaling axes?

Ilya Sutskever 00:19:00

Here’s a perspective that I think might be true. The way ML used to work is that people would just tinker with stuff and try to get interesting results. That’s what’s been going on in the past.

Then the scaling insight arrived. Scaling laws, GPT-3, and suddenly everyone realized we should scale. This is an example of how language affects thought. “Scaling” is just one word, but it’s such a powerful word because it informs people what to do. They say, “Let’s try to scale things.” So you say, what are we scaling? Pre-training was the thing to scale. It was a particular scaling recipe.

The big breakthrough of pre-training is the realization that this recipe is good. You say, “Hey, if you mix some compute with some data into a neural net of a certain size, you will get results. You will know that you’ll be better if you just scale the recipe up.” This is also great. Companies love this because it gives you a very low-risk way of investing your resources.

It’s much harder to invest your resources in research. Compare that. If you research, you need to be like, “Go forth researchers and research and come up with something”, versus get more data, get more compute. You know you’ll get something from pre-training.

Indeed, based on various things some people say on Twitter, it appears that Gemini may have found a way to get more out of pre-training. At some point though, pre-training will run out of data. The data is very clearly finite. What do you do next? Either you do some kind of souped-up pre-training, a different recipe from the one you’ve done before, or you’re doing RL, or maybe something else. But now that compute is very big, in some sense we are back to the age of research.

Maybe here’s another way to put it. Up until 2020, from 2012 to 2020, it was the age of research. Now, from 2020 to 2025, it was the age of scaling—maybe plus or minus, let’s add error bars to those years—because people say, “This is amazing. You’ve got to scale more. Keep scaling.” The one word: scaling.

But now the scale is so big. Is the belief really, “Oh, it’s so big, but if you had 100x more, everything would be so different?” It would be different, for sure. But is the belief that if you just 100x the scale, everything would be transformed? I don’t think that’s true. So it’s back to the age of research again, just with big computers.

Dwarkesh Patel 00:22:06

That’s a very interesting way to put it. But let me ask you the question you just posed then. What are we scaling, and what would it mean to have a recipe? I guess I’m not aware of a very clean relationship that almost looks like a law of physics which existed in pre-training. There was a power law between data or compute or parameters and loss. What is the kind of relationship we should be seeking, and how should we think about what this new recipe might look like?
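(For reference, the pre-training relationship alluded to here is usually reported in the scaling-law literature as a power law of roughly the following form; the exact exponents and constants depend on the study and setup.)

```latex
% Approximate form of the pre-training scaling laws: loss falls as a power law
% in parameter count N, dataset size D, or compute C (constants vary by setup).
L(N) \approx \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) \approx \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C) \approx \left(\frac{C_c}{C}\right)^{\alpha_C}
```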

Ilya Sutskever 00:22:38

We’ve already witnessed a transition from one type of scaling to a different type of scaling, from pre-training to RL. Now people are scaling RL. Now based on what people say on Twitter, they spend more compute on RL than on pre-training at this point, because RL can actually consume quite a bit of compute. You do very long rollouts, so it takes a lot of compute to produce those rollouts. Then you get a relatively small amount of learning per rollout, so you really can spend a lot of compute.
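A rough back-of-the-envelope sketch of that point (every number below is a hypothetical placeholder, not a measurement from any real training run): if the same token budget goes to pre-training versus long, outcome-graded rollouts, the number of feedback events per token differs by orders of magnitude, which is part of why a value function’s per-step signal could make the same compute more productive.

```python
# Hypothetical arithmetic: feedback events per token for pre-training vs.
# outcome-graded RL rollouts. All figures are assumed placeholders.

PRETRAIN_TOKENS = 1_000_000            # token budget spent on pre-training
RL_ROLLOUT_TOKENS = 10_000             # assumed tokens generated per rollout
RL_ROLLOUTS = PRETRAIN_TOKENS // RL_ROLLOUT_TOKENS  # same token budget in RL

pretrain_targets = PRETRAIN_TOKENS     # one next-token target per token
rl_grades = RL_ROLLOUTS                # one end-of-rollout grade per rollout

print(f"pre-training: {pretrain_targets:,} training targets")
print(f"outcome RL:   {rl_grades:,} end-of-rollout grades for the same tokens")
print(f"ratio: roughly {pretrain_targets // rl_grades:,}x fewer feedback events")
```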

I wouldn’t even call it scaling. I would say, “Hey, what are you doing? Is the thing you are doing the most productive thing you could be doing? Can you find a more productive way of using your compute?” We’ve discussed the value function business earlier. Maybe once people get good at value functions, they will be using their resources more productively. If you find a whole other way of training models, you could say, “Is this scaling or is it just using your resources?” I think it becomes a little bit ambiguous.

In the sense that, when people were in the age of research back then, it was, “Let’s try this and this and this. Let’s try that and that and that. Oh, look, something interesting is happening.” I think there will be a return to that.

Dwarkesh Patel 00:24:10

If we’re back in the era of research, stepping back, what is the part of the recipe that we need to think most about? When you say value function, people are already trying the current recipe, but then having LLM-as-a-Judge and so forth. You could say that’s a value function, but it sounds like you have something much more fundamental in mind. Should we even rethink pre-training at all and not just add more steps to the end of that process?

Ilya Sutskever 00:24:35

The discussion about value function, I think it was interesting. I want to emphasize that I think the value function is something that’s going to make RL more efficient, and I think that makes a difference. But I think anything you can do with a value function, you can do without, just more slowly. The thing which I think is the most fundamental is that these models somehow just generalize dramatically worse than people. It’s super obvious. That seems like a very fundamental thing.

00:25:13 – Why humans generalize better than models

Dwarkesh Patel 00:25:13

So this is the crux: generalization. There are two sub-questions. There’s one which is about sample efficiency: why should it take so much more data for these models to learn than humans? There’s a second question. Even separate from the amount of data it takes, why is it so much harder to teach the thing we want to a model than to a human? For a human, we don’t necessarily need a verifiable reward to be able to… You’re probably mentoring a bunch of researchers right now, and you’re talking with them, you’re showing them your code, and you’re showing them how you think. From that, they’re picking up your way of thinking and how they should do research.

You don’t have to set a verifiable reward for them that’s like, “Okay, this is the next part of the curriculum, and now this is the next part of your curriculum. Oh, this training was unstable.” There’s not this schleppy, bespoke process. Perhaps these two issues are actually related in some way, but I’d be curious to explore this second thing, which is more like continual learning, and this first thing, which feels just like sample efficiency.

Ilya Sutskever 00:26:19

You could actually wonder whether one possible explanation for human sample efficiency, one that needs to be considered, is evolution. Evolution has given us a small amount of the most useful information possible. For things like vision, hearing, and locomotion, I think there’s a pretty strong case that evolution has given us a lot.

For example, human dexterity far exceeds… I mean robots can become dexterous too if you subject them to a huge amount of training in simulation. But to train a robot in the real world to quickly pick up a new skill like a person does seems very out of reach. Here you could say, “Oh yeah, locomotion. All our ancestors needed great locomotion, all the way back to squirrels. So with locomotion, maybe we’ve got some unbelievable prior.”

You could make the same case for vision. I believe Yann LeCun made the point that children learn to drive after 10 hours of practice, which is true. But our vision is so good. At least for me, I remember myself being a five-year-old. I was very excited about cars back then. I’m pretty sure my car recognition was more than adequate for driving already as a five-year-old. You don’t get to see that much data as a five-year-old. You spend most of your time in your parents’ house, so you have very low data diversity.

But you could say maybe that’s evolution too. But in language and math and coding, probably not.

Dwarkesh Patel 00:28:00

It still seems better than models. Obviously, models are better than the average human at language, math, and coding. But are they better than the average human at learning?

Ilya Sutskever 00:28:09

Oh yeah. Oh yeah, absolutely. What I meant to say is that language, math, and coding—and especially math and coding—suggests that whatever it is that makes people good at learning is probably not so much a complicated prior, but something more, some fundamental thing.

Dwarkesh Patel 00:28:29

I’m not sure I understood. Why should that be the case?

Ilya Sutskever 00:28:32

So consider a skill in which people exhibit some kind of great reliability. If the skill is one that was very useful to our ancestors for many millions of years, hundreds of millions of years, you could argue that maybe humans are good at it because of evolution, because we have a prior, an evolutionary prior that’s encoded in some very non-obvious way that somehow makes us so good at it.

But if people exhibit great ability, reliability, robustness, and ability to learn in a domain that really did not exist until recently, then this is more an indication that people might have just better machine learning, period.

Dwarkesh Patel 00:29:29

How should we think about what that is? What is the ML analogy? There are a couple of interesting things about it. It takes fewer samples. It’s more unsupervised. A child learning to drive a car… Children are not learning to drive a car. A teenager learning how to drive a car is not exactly getting some prebuilt, verifiable reward. It comes from their interaction with the machine and with the environment. It takes far fewer samples. It seems more unsupervised. It seems more robust?

Ilya Sutskever 00:30:07

Much more robust. The robustness of people is really staggering.

Dwarkesh Patel 00:30:12

Do you have a unified way of thinking about why all these things are happening at once? What is the ML analogy that could realize something like this?

Ilya Sutskever 00:30:24

One of the things that you’ve been asking about is how can the teenage driver self-correct and learn from their experience without an external teacher? The answer is that they have their value function. They have a general sense which is also, by the way, extremely robust in people. Whatever the human value function is, with a few exceptions around addiction, it’s actually very, very robust.

So for something like a teenager that’s learning to drive, they start to drive, and they already have a sense of how they’re driving immediately, how badly they’re doing, how unconfident they are. And then they see, “Okay.” And then, of course, the learning speed of any teenager is so fast. After 10 hours, you’re good to go.

Dwarkesh Patel 00:31:17

It seems like humans have some solution, but I’m curious about how they are doing it and why is it so hard? How do we need to reconceptualize the way we’re training models to make something like this possible?

Ilya Sutskever 00:31:27

That is a great question to ask, and it’s a question I have a lot of opinions about. But unfortunately, we live in a world where not all machine learning ideas are discussed freely, and this is one of them. There’s probably a way to do it. I think it can be done. The fact that people are like that, I think it’s a proof that it can be done.

There may be another blocker though, which is that there is a possibility that the human neurons do more compute than we think. If that is true, and if that plays an important role, then things might be more difficult. But regardless, I do think it points to the existence of some machine learning principle that I have opinions on. But unfortunately, circumstances make it hard to discuss in detail.

Dwarkesh Patel 00:32:28

Nobody listens to this podcast, Ilya.

00:35:45 – Straight-shotting superintelligence

Dwarkesh Patel 00:35:45

I’m curious. If you say we are back in an era of research, you were there from 2012 to 2020. What is the vibe now going to be if we go back to the era of research?

For example, even after AlexNet, the amount of compute that was used to run experiments kept increasing, and the size of frontier systems kept increasing. Do you think now that this era of research will still require tremendous amounts of compute? Do you think it will require going back into the archives and reading old papers?

You were at Google and OpenAI and Stanford, these places, when there was more of a vibe of research? What kind of things should we be expecting in the community?

Ilya Sutskever 00:36:38

One consequence of the age of scaling is that scaling sucked out all the air in the room. Because scaling sucked out all the air in the room, everyone started to do the same thing. We got to the point where we are in a world where there are more companies than ideas by quite a bit. Actually on that, there is this Silicon Valley saying that ideas are cheap and execution is everything. People say that a lot, and there is truth to that. But then I saw someone say on Twitter something like, “If ideas are so cheap, how come no one’s having any ideas?” And I think it’s true too.

If you think about research progress in terms of bottlenecks, there are several bottlenecks. One of them is ideas, and one of them is your ability to bring them to life, which might be compute but also engineering. If you go back to the ‘90s, let’s say, you had people who had pretty good ideas, and if they had much larger computers, maybe they could demonstrate that their ideas were viable. But they could not, so they could only have a very, very small demonstration that did not convince anyone. So the bottleneck was compute.

Then in the age of scaling, compute has increased a lot. Of course, there is a question of how much compute is needed, but compute is large. Compute is large enough such that it’s not obvious that you need that much more compute to prove some idea. I’ll give you an analogy. AlexNet was built on two GPUs. That was the total amount of compute used for it. The transformer was built on 8 to 64 GPUs. No single transformer paper experiment used more than 64 GPUs of 2017, which would be like, what, two GPUs of today? The ResNet, right? You could argue that the o1 reasoning was not the most compute-heavy thing in the world.

So for research, you definitely need some amount of compute, but it’s far from obvious that you need the absolutely largest amount of compute ever for research. You might argue, and I think it is true, that if you want to build the absolutely best system then it helps to have much more compute. Especially if everyone is within the same paradigm, then compute becomes one of the big differentiators.

Dwarkesh Patel 00:39:41

I’m asking you for the history, because you were actually there. I’m not sure what actually happened. It sounds like it was possible to develop these ideas using minimal amounts of compute. But the transformer didn’t immediately become famous. It became the thing everybody started doing and then started experimenting on top of and building on top of because it was validated at higher and higher levels of compute.

Ilya Sutskever 00:40:06

Correct.

Dwarkesh Patel 00:40:07

And if you at SSI have 50 different ideas, how will you know which one is the next transformer and which one is brittle, without having the kinds of compute that other frontier labs have?

Ilya Sutskever 00:40:22

I can comment on that. The short comment is that you mentioned SSI. Specifically for us, the amount of compute that SSI has for research is really not that small. I want to explain why. Simple math can explain why the amount of compute that we have for research is more comparable to others’ than one might think. I’ll explain.

SSI has raised $3 billion, which is a lot by any absolute sense. But you could say, “Look at the other companies raising much more.” But a lot of their compute goes for inference. These big numbers, these big loans, it’s earmarked for inference. That’s number one. Number two, if you want to have a product on which you do inference, you need to have a big staff of engineers, salespeople. A lot of the research needs to be dedicated to producing all kinds of product-related features. So then when you look at what’s actually left for research, the difference becomes a lot smaller.

The other thing is, if you are doing something different, do you really need the absolute maximal scale to prove it? I don’t think that’s true at all. I think that in our case, we have sufficient compute to prove, to convince ourselves and anyone else, that what we are doing is correct.

Dwarkesh Patel 00:42:02

There have been public estimates that companies like OpenAI spend on the order of $5-6 billion a year just so far, on experiments. This is separate from the amount of money they’re spending on inference and so forth. So it seems like they’re spending more a year running research experiments than you guys have in total funding.

Ilya Sutskever 00:42:22

I think it’s a question of what you do with it. It’s a question of what you do with it. In their case, in the case of others, there is a lot more demand on the training compute. There’s a lot more different work streams, there are different modalities, there is just more stuff. So it becomes fragmented.

Dwarkesh Patel 00:42:44

How will SSI make money?

Ilya Sutskever 00:42:46

My answer to this question is something like this. Right now, we just focus on the research, and then the answer to that question will reveal itself. I think there will be lots of possible answers.

Dwarkesh Patel 00:43:01

Is SSI’s plan still to straight shot superintelligence?

Ilya Sutskever 00:43:04

Maybe. I think that there is merit to it. I think there’s a lot of merit because it’s very nice to not be affected by the day-to-day market competition. But I think there are two reasons that may cause us to change the plan. One is pragmatic, if timelines turned out to be long, which they might. Second, I think there is a lot of value in the best and most powerful AI being out there impacting the world. I think this is a meaningfully valuable thing.

Dwarkesh Patel 00:43:48

So then why is your default plan to straight shot superintelligence? Because it sounds like OpenAI, Anthropic, all these other companies, their explicit thinking is, “Look, we have weaker and weaker intelligences that the public can get used to and prepare for.” Why is it potentially better to build a superintelligence directly?

Ilya Sutskever 00:44:08

I’ll make the case for and against. The case for is that one of the challenges that people face when they’re in the market is that they have to participate in the rat race. The rat race is quite difficult in that it exposes you to difficult trade-offs which you need to make. It is nice to say, “We’ll insulate ourselves from all this and just focus on the research and come out only when we are ready, and not before.” But the counterpoint is valid too, and those are opposing forces. The counterpoint is, “Hey, it is useful for the world to see powerful AI. It is useful for the world to see powerful AI because that’s the only way you can communicate it.”

Dwarkesh Patel 00:44:57

Well, I guess not even just that you can communicate the idea—

Ilya Sutskever 00:45:00

Communicate the AI, not the idea. Communicate the AI.

Dwarkesh Patel 00:45:04

What do you mean, “communicate the AI”?

Ilya Sutskever 00:45:06

Let’s suppose you write an essay about AI, and the essay says, “AI is going to be this, and AI is going to be that, and it’s going to be this.” You read it and you say, “Okay, this is an interesting essay.” Now suppose you see an AI doing this, an AI doing that. It is incomparable. Basically I think that there is a big benefit from AI being in the public, and that would be a reason for us to not be quite straight shot.

Dwarkesh Patel 00:45:37

I guess it’s not even that, but I do think that is an important part of it. The other big thing is that I can’t think of another discipline in human engineering and research where the end artifact was made safer mostly through just thinking about how to make it safe. Why are airplane crashes per mile so much lower today than they were decades ago? Why is it so much harder to find a bug in Linux than it would have been decades ago? I think it’s mostly because these systems were deployed to the world. You noticed failures, those failures were corrected, and the systems became more robust.

I’m not sure why AGI and superhuman intelligence would be any different, especially given—and I hope we’re going to get to this—it seems like the harms of superintelligence are not just about having some malevolent paper clipper out there. But this is a really powerful thing and we don’t even know how to conceptualize how people interact with it, what people will do with it. Having gradual access to it seems like a better way to maybe spread out the impact of it and to help people prepare for it.

00:46:47 – SSI’s model will learn from deployment

Ilya Sutskever 00:46:47

Well I think on this point, even in the straight shot scenario, you would still do a gradual release of it, that’s how I would imagine it. Gradualism would be an inherent component of any plan. It’s just a question of what is the first thing that you get out of the door. That’s number one.

Number two, I believe you have advocated for continual learning more than other people, and I actually think that this is an important and correct thing. Here is why. I’ll give you another example of how language affects thinking. In this case, it will be two words that have shaped everyone’s thinking, I maintain. First word: AGI. Second word: pre-training. Let me explain.

The term AGI, why does this term exist? It’s a very particular term. Why does it exist? There’s a reason. The reason that the term AGI exists is, in my opinion, not so much because it’s a very important, essential descriptor of some end state of intelligence, but because it is a reaction to a different term that existed, and that term is narrow AI. If you go back to the ancient history of game-playing AI, of checkers AI, chess AI, computer game AI, everyone would say, look at this narrow intelligence. Sure, the chess AI can beat Kasparov, but it can’t do anything else. It is so narrow, artificial narrow intelligence. So in response, as a reaction to this, some people said, this is not good. It is so narrow. What we need is general AI, an AI that can just do all the things. That term just got a lot of traction.

The second thing that got a lot of traction is pre-training, specifically the recipe of pre-training. I think the way people do RL now is maybe undoing the conceptual imprint of pre-training. But pre-training had this property. You do more pre-training and the model gets better at everything, more or less uniformly. General AI. Pre-training gives AGI.

But the thing that happened with AGI and pre-training is that in some sense they overshot the target. If you think about the term “AGI”, especially in the context of pre-training, you will realize that a human being is not an AGI. Yes, there is definitely a foundation of skills, but a human being lacks a huge amount of knowledge. Instead, we rely on continual learning.

So when you think about, “Okay, so let’s suppose that we achieve success and we produce some kind of safe superintelligence.” The question is, how do you define it? Where on the curve of continual learning is it going to be?

Say I produce a superintelligent 15-year-old that’s very eager to go. They don’t know very much at all: a great student, very eager. You go and be a programmer, you go and be a doctor, go and learn. So you could imagine that the deployment itself will involve some kind of a learning trial-and-error period. It’s a process, as opposed to you dropping the finished thing.

Dwarkesh Patel 00:50:45

I see. You’re suggesting that the thing you’re pointing out with superintelligence is not some finished mind which knows how to do every single job in the economy. Because the way, say, the original OpenAI charter or whatever defines AGI is like, it can do every single job, every single thing a human can do. You’re proposing instead a mind which can learn to do every single job, and that is superintelligence.

Ilya Sutskever 00:51:15

Yes.

Dwarkesh Patel 00:51:16

But once you have the learning algorithm, it gets deployed into the world the same way a human laborer might join an organization.

Ilya Sutskever 00:51:25

Exactly.

Dwarkesh Patel 00:51:26

It seems like one of these two things might happen, maybe neither of these happens. One, this super-efficient learning algorithm becomes superhuman, becomes as good as you and potentially even better, at the task of ML research. As a result the algorithm itself becomes more and more superhuman.

The other is, even if that doesn’t happen, if you have a single model—this is explicitly your vision—where instances of a model are deployed through the economy doing different jobs, learning how to do those jobs, continually learning on the job, picking up all the skills that any human could pick up, but picking them all up at the same time, and then amalgamating their learnings, you basically have a model which functionally becomes superintelligent even without any sort of recursive self-improvement in software. Because you now have one model that can do every single job in the economy, and humans can’t merge our minds in the same way. So do you expect some sort of intelligence explosion from broad deployment?

Ilya Sutskever 00:52:30

I think that it is likely that we will have rapid economic growth. I think with broad deployment, there are two arguments you could make which are conflicting. One is that once indeed you get to a point where you have an AI that can learn to do things quickly and you have many of them, then there will be a strong force to deploy them in the economy unless there will be some kind of a regulation that stops it, which by the way there might be.

But the idea of very rapid economic growth for some time, I think it’s very possible from broad deployment. The question is how rapid it’s going to be. I think this is hard to know because on the one hand you have this very efficient worker. On the other hand, the world is just really big and there’s a lot of stuff, and that stuff moves at a different speed. But then on the other hand, now the AI could… So I think very rapid economic growth is possible. We will see all kinds of things, like different countries with different rules, and in the ones with friendlier rules, economic growth will be faster. Hard to predict.

Dwarkesh Patel 00:55:07

It seems to me that this is a very precarious situation to be in. In the limit, we know that this should be possible. If you have something that is as good as a human at learning, but which can merge its brains—merge different instances in a way that humans can’t—this already seems like a thing that should physically be possible. Humans are possible, digital computers are possible. You just need both of those combined to produce this thing.

It also seems this kind of thing is extremely powerful. Economic growth is one way to put it. A Dyson sphere is a lot of economic growth. But another way to put it is that you will have, in potentially a very short period of time… You hire people at SSI, and in six months, they’re net productive, probably. A human learns really fast, and this thing is becoming smarter and smarter very fast. How do you think about making that go well? Why is SSI positioned to do that well? What is SSI’s plan there, is basically what I’m trying to ask.

Ilya Sutskever 00:56:10

One of the ways in which my thinking has been changing is that I now place more importance on AI being deployed incrementally and in advance. One very difficult thing about AI is that we are talking about systems that don’t yet exist and it’s hard to imagine them.

I think that one of the things that’s happening is that in practice, it’s very hard to feel the AGI. It’s very hard to feel the AGI. We can talk about it, but imagine having a conversation about what it will be like when you’re old and frail. You can have a conversation, you can try to imagine it, but it’s just hard, and you come back to reality where that’s not the case. I think that a lot of the issues around AGI and its future power stem from the fact that it’s very difficult to imagine. Future AI is going to be different. It’s going to be powerful. Indeed, the whole problem, what is the problem of AI and AGI? The whole problem is the power. The whole problem is the power.

When the power is really big, what’s going to happen? One of the ways in which I’ve changed my mind over the past year—and that change of mind, I’ll hedge a little bit, may back-propagate into the plans of our company—is that if it’s hard to imagine, what do you do? You’ve got to be showing the thing. You’ve got to be showing the thing. I maintain that most people who work on AI also can’t imagine it because it’s too different from what people see on a day-to-day basis.

I do maintain, here’s something which I predict will happen. This is a prediction. I maintain that as AI becomes more powerful, people will change their behaviors. We will see all kinds of unprecedented things which are not happening right now. I’ll give some examples. I think for better or worse, the frontier companies will play a very important role in what happens, as will the government. The kind of things that I think you’ll see, which you see the beginnings of, are companies that are fierce competitors starting to collaborate on AI safety. You may have seen OpenAI and Anthropic doing a first small step, but that did not exist. That’s something which I predicted in one of my talks about three years ago, that such a thing will happen. I also maintain that as AI continues to become more powerful, more visibly powerful, there will also be a desire from governments and the public to do something. I think this is a very important force, of showing the AI.

That’s number one. Number two, okay, so the AI is being built. What needs to be done? One thing that I maintain will happen is that right now, for people who are working on AI, the AI doesn’t feel powerful because of its mistakes. I do think that at some point the AI will start to feel powerful actually. I think when that happens, we will see a big change in the way all AI companies approach safety. They’ll become much more paranoid. I say this as a prediction that we will see happen. We’ll see if I’m right. But I think this is something that will happen because they will see the AI becoming more powerful. Everything that’s happening right now, I maintain, is because people look at today’s AI and it’s hard to imagine the future AI.

There is a third thing which needs to happen. I’m talking about it in broader terms, not just from the perspective of SSI because you asked me about our company. The question is, what should the companies aspire to build? What should they aspire to build? There has been one big idea that everyone has been locked into, which is the self-improving AI. Why did it happen? Because there are fewer ideas than companies. But I maintain that there is something that’s better to build, and I think that everyone will want that.

It’s the AI that’s robustly aligned to care about sentient life specifically. I think in particular, there’s a case to be made that it will be easier to build an AI that cares about sentient life than an AI that cares about human life alone, because the AI itself will be sentient. And think about things like mirror neurons and human empathy for animals, which you might argue is not big enough, but it exists. I think it’s an emergent property from the fact that we model others with the same circuit that we use to model ourselves, because that’s the most efficient thing to do.

Dwarkesh Patel 01:02:06

So even if you got an AI to care about sentient beings—and it’s not actually clear to me that that’s what you should try to do if you solved alignment—it would still be the case that most sentient beings will be AIs. There will be trillions, eventually quadrillions, of AIs. Humans will be a very small fraction of sentient beings. So it’s not clear to me if the goal is some kind of human control over this future civilization, that this is the best criterion.

Ilya Sutskever 01:02:37

It’s true. It’s possible it’s not the best criterion. I’ll say a few things. Number one, care for sentient life, I think there is merit to it. It should be considered. I think it would be helpful if there was some kind of short list of ideas that the companies, when they are in this situation, could use. That’s number two.

Number three, I think it would be really materially helpful if the power of the most powerful superintelligence was somehow capped because it would address a lot of these concerns. The question of how to do it, I’m not sure, but I think that would be materially helpful when you’re talking about really, really powerful systems.

Dwarkesh Patel 01:03:35

Before we continue the alignment discussion, I want to double-click on that. How much room is there at the top? How do you think about superintelligence? Do you think, using this learning efficiency idea, maybe it is just extremely fast at learning new skills or new knowledge? Does it just have a bigger pool of strategies? Is there a single cohesive “it” in the center that’s more powerful or bigger? If so, do you imagine that this will be sort of godlike in comparison to the rest of human civilization, or does it just feel like another agent, or another cluster of agents?

Ilya Sutskever 01:04:10

This is an area where different people have different intuitions. I think it will be very powerful, for sure. What I think is most likely to happen is that there will be multiple such AIs being created roughly at the same time. I think that if the cluster is big enough—like if the cluster is literally continent-sized—that thing could be really powerful, indeed. If you literally have a continent-sized cluster, those AIs can be very powerful. All I can tell you is that if you’re talking about extremely powerful AIs, truly dramatically powerful, it would be nice if they could be restrained in some ways or if there were some kind of agreement or something.

What is the concern with superintelligence? What is one way to explain the concern? If you imagine a system that is sufficiently powerful, really sufficiently powerful, and it pursues something sensible, like care for sentient life, but in a very single-minded way, we might not like the results. That’s really what it is.

Maybe, by the way, the answer is that you do not build an RL agent in the usual sense. I’ll point several things out. I think human beings are semi-RL agents. We pursue a reward, and then the emotions or whatever make us tire out of the reward and we pursue a different reward. The market is a very short-sighted kind of agent. Evolution is the same. Evolution is very intelligent in some ways, but very dumb in other ways. The government has been designed to be a never-ending fight between three parts, which has an effect. So I think things like this.

Another thing that makes this discussion difficult is that we are talking about systems that don’t exist, that we don’t know how to build. That’s the other thing and that’s actually my belief. I think what people are doing right now will go some distance and then peter out. It will continue to improve, but it will also not be “it”. The “It” we don’t know how to build, and a lot hinges on understanding reliable generalization.

I’ll say another thing. One of the things that you could say about what causes alignment to be difficult is that your ability to learn human values is fragile. Then your ability to optimize them is fragile. You actually learn to optimize them. And can’t you say, “Are these not all instances of unreliable generalization?” Why is it that human beings appear to generalize so much better? What if generalization was much better? What would happen in this case? What would be the effect? But those questions are right now still unanswerable.

Dwarkesh Patel 01:07:21

How does one think about what AI going well looks like? You’ve scoped out how AI might evolve. We’ll have these sort of continual learning agents. AI will be very powerful. Maybe there will be many different AIs. How do you think about lots of continent-sized compute intelligences going around? How dangerous is that? How do we make that less dangerous? And how do we do that in a way that protects an equilibrium where there might be misaligned AIs out there and bad actors out there?

Ilya Sutskever 01:07:58

Here’s one reason why I liked “AI that cares for sentient life”. We can debate whether it’s good or bad. But if the first N of these dramatic systems do care for sentient life, love humanity, or something like that—and obviously this still needs to be achieved—then I can see it going well, at least for quite some time.

Then there is the question of what happens in the long run. How do you achieve a long-run equilibrium? I think there is an answer there as well. I don’t like this answer, but it needs to be considered.

In the long run, you might say, okay, in a world where powerful AIs exist, in the short term you have universal high income and we’re all doing well. But what do the Buddhists say? “Change is the only constant.” Things change. There is some kind of government or political structure, and it changes, because these things have a shelf life. Some new government structure comes up and it functions, and then after some time it stops functioning. That’s something we see happening all the time.

So I think for the long-run equilibrium, one approach is to say that maybe every person will have an AI that will do their bidding, and that’s good. If that could be maintained indefinitely, fine. But the downside is that the AI then goes and earns money for the person and advocates for their needs in the political sphere, and maybe writes a little report saying, “Okay, here’s what I’ve done, here’s the situation,” and the person says, “Great, keep it up.” But the person is no longer a participant. Then you could say that’s a precarious place to be in.

I’m going to preface by saying I don’t like this solution, but it is a solution. The solution is if people become part-AI with some kind of Neuralink++. Because what will happen as a result is that now the AI understands something, and we understand it too, because now the understanding is transmitted wholesale. So now if the AI is in some situation, you are involved in that situation yourself fully. I think this is the answer to the equilibrium.

Dwarkesh Patel 01:10:47

I wonder if the fact that emotions which were developed millions—or in many cases, billions—of years ago in a totally different environment are still guiding our actions so strongly is an example of alignment success.

To spell out what I mean—I don’t know whether it’s more accurate to call it a value function or a reward function—the brainstem has a directive where it’s saying, “Mate with somebody who’s more successful.” The cortex is the part that understands what success means in the modern context. But the brainstem is able to align the cortex and say, “Whatever you recognize success to be—and I’m not smart enough to understand what that is—you’re still going to pursue this directive.”

Ilya Sutskever 01:11:36

I think there’s a more general point. I think it’s actually really mysterious how evolution encodes high-level desires. It’s pretty easy to understand how evolution would endow us with the desire for food that smells good because smell is a chemical, so just pursue that chemical. It’s very easy to imagine evolution doing that thing.

But evolution also has endowed us with all these social desires. We really care about being seen positively by society. We care about being in good standing. All these social intuitions that we have, I feel strongly that they’re baked in. I don’t know how evolution did it because it’s a high-level concept that’s represented in the brain.

Let’s say you care about some social thing, it’s not a low-level signal like smell. It’s not something for which there is a sensor. The brain needs to do a lot of processing to piece together lots of bits of information to understand what’s going on socially. Somehow evolution said, “That’s what you should care about.” How did it do it?

It did it quickly, too. All these sophisticated social things that we care about, I think they evolved pretty recently. Evolution had an easy time hard-coding this high-level desire. I’m unaware of a good hypothesis for how it’s done. I had some ideas I was kicking around, but none of them are satisfying.

Dwarkesh Patel 01:13:26

What’s especially impressive is that if it were a desire you learned in your lifetime, it would make sense, because your brain is intelligent. It makes sense why you would be able to learn intelligent desires. Maybe this is not your point, but one way to understand it is that the desire is built into the genome, and the genome is not intelligent. Yet it’s somehow able to specify this feature. It’s not even clear how you would define that feature, and it still gets built into the genes.

Ilya Sutskever 01:13:55

Essentially, or maybe I’ll put it differently. If you think about the tools that are available to the genome, it says, “Okay, here’s a recipe for building a brain.” You could say, “Here is a recipe for connecting the dopamine neurons to the smell sensor.” And if the smell is a certain kind of good smell, you want to eat that.

I could imagine the genome doing that. I’m claiming that it is harder to imagine the genome saying you should care about some complicated computation that your entire brain, or a big chunk of your brain, does. That’s all I’m claiming. I can offer a speculation of how it could be done, and I’ll explain why the speculation is probably false.

So the brain has brain regions. We have our cortex. It has all those brain regions. The cortex is uniform, but the brain regions and the neurons in the cortex kind of speak to their neighbors mostly. That explains why you get brain regions. Because if you want to do some kind of speech processing, all the neurons that do speech need to talk to each other. And because neurons can only speak to their nearby neighbors, for the most part, it has to be a region.

All the regions are mostly located in the same place from person to person. So maybe evolution hard-coded literally a location in the brain. It says, “Oh, when the GPS coordinates of the brain are such and such, when that fires, that’s what you should care about.” Maybe that’s what evolution did, because that would be within the toolkit of evolution.

Dwarkesh Patel 01:15:35

Yeah, although there are examples where, for instance, people who are born blind have that area of their cortex adopted by another sense. I have no idea, but I’d be surprised if the desires or the reward functions which require a visual signal no longer worked for people whose areas of cortex have been co-opted in this way.

For example, if you no longer have vision, can you still feel the sense of “I want people around me to like me” and so forth, for which there are usually visual cues as well?

Ilya Sutskever 01:16:12

I fully agree with that. I think there’s an even stronger counterargument to this theory. There are people who get half of their brains removed in childhood, and they still have all their brain regions. But they all somehow move to just one hemisphere, which suggests that the brain regions, their location is not fixed and so that theory is not true.

It would have been cool if it was true, but it’s not. So I think that’s a mystery. But it’s an interesting mystery. The fact is that somehow evolution was able to endow us to care about social stuff very, very reliably. Even people who have all kinds of strange mental conditions and deficiencies and emotional problems tend to care about this also.

01:18:13 – “We are squarely an age of research company”


Dwarkesh Patel 01:18:13

What is SSI planning on doing differently? Presumably your plan is to be one of the frontier companies when this time arrives. Presumably you started SSI because you’re like, “I think I have a way of approaching how to do this safely in a way that the other companies don’t.” What is that difference?

Ilya Sutskever 01:18:36

The way I would describe it is that there are some ideas that I think are promising and I want to investigate them and see if they are indeed promising or not. It’s really that simple. It’s an attempt. If the ideas turn out to be correct—these ideas that we discussed around understanding generalization—then I think we will have something worthy.

Will they turn out to be correct? We are doing research. We are squarely an “age of research” company. We are making progress. We’ve actually made quite good progress over the past year, but we need to keep making more progress, more research. That’s how I see it. I see it as an attempt to be a voice and a participant.

Dwarkesh Patel 01:19:29

Your cofounder and previous CEO left to go to Meta recently, and people have asked, “Well, if there were a lot of breakthroughs being made, that seems like a thing that should have been unlikely.” I wonder how you respond.

Ilya Sutskever 01:19:45

For this, I will simply remind you of a few facts that may have been forgotten. I think these facts, which provide the context, explain the situation. The context was that we were fundraising at a $32 billion valuation, and then Meta came in and offered to acquire us, and I said no. But my former cofounder in some sense said yes. As a result, he was also able to enjoy a lot of near-term liquidity, and he was the only person from SSI to join Meta.

Dwarkesh Patel 01:20:27

It sounds like SSI’s plan is to be a company that is at the frontier when you get to this very important period in human history where you have superhuman intelligence. You have these ideas about how to make superhuman intelligence go well. But other companies will be trying their own ideas. What distinguishes SSI’s approach to making superintelligence go well?

Ilya Sutskever 01:20:49

The main thing that distinguishes SSI is its technical approach. We have a different technical approach that I think is worthy and we are pursuing it.

I maintain that in the end there will be a convergence of strategies. At some point, as AI becomes more powerful, it’s going to become more or less clear to everyone what the strategy should be. It should be something like: you need to find some way to talk to each other, and you want your first actual real superintelligent AI to be aligned and to somehow care for sentient life, care for people, be democratic, one of those, or some combination thereof.

I think this is the condition that everyone should strive for. That’s what SSI is striving for. I think that this time, if not already, all the other companies will realize that they’re striving towards the same thing. We’ll see. I think that the world will truly change as AI becomes more powerful. I think things will be really different and people will be acting really differently.

Dwarkesh Patel 01:22:14

Speaking of forecasts, what are your forecasts for this system you’re describing, which can learn as well as a human and subsequently, as a result, become superhuman?

Ilya Sutskever 01:22:26

I think like 5 to 20.

Dwarkesh Patel 01:22:28

5 to 20 years?

Ilya Sutskever 01:22:29

Mhm.

Dwarkesh Patel 01:22:30

I just want to unroll how you see the world evolving. It’s like, we have a couple more years where these other companies are continuing the current approach and it stalls out. “Stalls out” here meaning they earn no more than low hundreds of billions in revenue? How do you think about what stalling out means?

Ilya Sutskever 01:22:49

I think stalling out will look like… it will all look very similar among all the different companies. It could be something like this. I’m not sure, because I think even with stalling out, these companies could make stupendous revenue. Maybe not profits, because they will need to work hard to differentiate themselves from each other, but revenue definitely.

Dwarkesh Patel 01:23:20

But something in your model implies that when the correct solution does emerge, there will be convergence between all the companies. I’m curious why you think that’s the case.

Ilya Sutskever 01:23:32

I was talking more about convergence on their alignment strategies. I think eventual convergence on the technical approach is probably going to happen as well, but I was alluding to convergence to the alignment strategies. What exactly is the thing that should be done?

Dwarkesh Patel 01:23:46

I just want to better understand how you see the future unrolling. Currently, we have these different companies, and you expect their approach to continue generating revenue but not get to this human-like learner. So now we have these different forks of companies. We have you, we have Thinking Machines, there’s a bunch of other labs. Maybe one of them figures out the correct approach. But then the release of their product makes it clear to other people how to do this thing.

Ilya Sutskever 01:24:09

I think it won’t be clear how to do it, but it will be clear that something different is possible, and that is information. People will then be trying to figure out how that works. I do think, though, that one thing not addressed here, not discussed, is that with each increase in the AI’s capabilities there will be some kind of changes in how things are being done, though I don’t know exactly which ones. I think it’s going to be important, yet I can’t spell out what that is exactly.

Dwarkesh Patel 01:24:49

By default, you would expect the company that has that model to be getting all these gains because they have the model that has the skills and knowledge that it’s building up in the world. What is the reason to think that the benefits of that would be widely distributed and not just end up at whatever model company gets this continuous learning loop going first?

Ilya Sutskever 01:25:13

Here is what I think is going to happen. Number one, let’s look at how things have gone so far with the AIs of the past. One company produced an advance and the other company scrambled and produced some similar things after some amount of time and they started to compete in the market and push the prices down. So I think from the market perspective, something similar will happen there as well.

We are talking about the good world, by the way. What’s the good world? It’s where we have these powerful human-like learners that are also… By the way, maybe there’s another thing we haven’t discussed on the spec of the superintelligent AI that I think is worth considering. It’s that you can make it narrow; it can be useful and narrow at the same time. You can have lots of narrow superintelligent AIs.

But suppose you have many of them and you have some company that’s producing a lot of profits from it. Then you have another company that comes in and starts to compete. The way the competition is going to work is through specialization. Competition loves specialization. You see it in the market, and you see it in evolution as well. You’re going to have lots of different niches and lots of different companies occupying different niches. In this world we might say one AI company is really quite a bit better at some area of really complicated economic activity, a different company is better at another area, and a third company is really good at litigation.

Dwarkesh Patel 01:27:18

Isn’t this contradicted by what human-like learning implies? It’s that it can learn…

Ilya Sutskever 01:27:21

It can, but you have accumulated learning. You have a big investment. You spent a lot of compute to become really, really good, really phenomenal at this thing. Someone else spent a huge amount of compute and a huge amount of experience to get really good at some other thing. You applied a lot of human-like learning to get there, but now you are at this high point where someone else would say, “Look, I don’t want to start learning what you’ve learned.”

Dwarkesh Patel 01:27:48

I guess that would require many different companies to begin at the human-like continual learning agent at the same time so that they can start their different tree search in different branches. But if one company gets that agent first, or gets that learner first, it does then seem like… Well, if you just think about every single job in the economy, having an instance learning each one seems tractable for a company.

Ilya Sutskever 01:28:19

That’s a valid argument. My strong intuition is that it’s not how it’s going to go. The argument says it will go this way, but my strong intuition is that it will not go this way. In theory, there is no difference between theory and practice. In practice, there is. I think that’s going to be one of those.

Dwarkesh Patel 01:28:41

A lot of people’s models of recursive self-improvement literally, explicitly state we will have a million Ilyas in a server that are coming up with different ideas, and this will lead to a superintelligence emerging very fast.

Do you have some intuition about how parallelizable the thing you are doing is? What are the gains from making copies of Ilya?

Ilya Sutskever 01:29:02

I don’t know. I think there’ll definitely be diminishing returns because you want people who think differently rather than the same. If there were literal copies of me, I’m not sure how much more incremental value you’d get. People who think differently, that’s what you want.

Dwarkesh Patel 01:29:23

Why is it that if you look at different models, even ones released by totally different companies and trained on potentially non-overlapping datasets, it’s actually crazy how similar the LLMs are to each other?

Ilya Sutskever 01:29:38

Maybe the datasets are not as non-overlapping as it seems.

Dwarkesh Patel 01:29:41

But there’s some sense in which, even if an individual human might be less productive than the future AI, maybe there’s something to the fact that human teams have more diversity than teams of AIs might have. How do we elicit meaningful diversity among AIs? Just raising the temperature results in gibberish. You want something more like how different scientists have different prejudices or different ideas. How do you get that kind of diversity among AI agents?

Ilya Sutskever 01:30:06

So the reason there has been no diversity, I believe, is because of pre-training. All the pre-trained models are pretty much the same because they pre-train on the same data. Now RL and post-training is where some differentiation starts to emerge because different people come up with different RL training.
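To make the earlier temperature remark concrete: sampling temperature divides the model’s logits before the softmax, so raising it flattens the next-token distribution toward uniform, which is why it produces noise rather than the kind of principled diversity being discussed. Below is a minimal sketch in Python with a made-up three-token distribution; nothing here is from the conversation itself.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; higher temperature flattens the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits for three candidate tokens.
logits = [4.0, 2.0, -1.0]

for t in (0.5, 1.0, 5.0):
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
# At T=0.5 the top token dominates; at T=5.0 the distribution is much flatter,
# so samples drift toward random noise ("gibberish") rather than toward
# meaningfully different viewpoints.
```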

Dwarkesh Patel 01:30:26

I’ve heard you hint in the past about self-play as a way to either get data or match agents to other agents of equivalent intelligence to kick off learning. How should we think about why there are no public proposals of this kind of thing working with LLMs?

Ilya Sutskever 01:30:49

I would say there are two things to say. The reason why I thought self-play was interesting is because it offered a way to create models using compute only, without data. If you think that data is the ultimate bottleneck, then using compute only is very interesting. So that’s what makes it interesting.

The thing is that self-play, at least the way it was done in the past—when you have agents which somehow compete with each other—it’s only good for developing a certain set of skills. It is too narrow. It’s only good for negotiation, conflict, certain social skills, strategizing, that kind of stuff. If you care about those skills, then self-play will be useful.

Actually, I think that self-play did find a home, but just in a different form. So things like debate, prover-verifier, you have some kind of an LLM-as-a-Judge which is also incentivized to find mistakes in your work. You could say this is not exactly self-play, but this is a related adversarial setup that people are doing, I believe.

Really, self-play is a special case of more general competition between agents. The natural response to competition is to try to be different. So if you were to put multiple agents together and tell them, “You all need to work on some problem, and you’re an agent inspecting what everyone else is working on,” they’re going to say, “Well, if they’re already taking this approach, it’s not clear I should pursue it. I should pursue something differentiated.” So I think something like this could also create an incentive for a diversity of approaches.
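As a rough, purely illustrative sketch of the adversarial setup described here (a proposer paired with a judge that is incentivized to find mistakes), consider the loop below. The `propose_solution` and `find_mistake` functions are hypothetical stand-ins for model calls, not any real API; only the shape of the prover/verifier loop is the point.

```python
from typing import Callable, Optional

def adversarial_refinement(
    propose: Callable[[str, Optional[str]], str],
    critique: Callable[[str, str], Optional[str]],
    problem: str,
    max_rounds: int = 5,
) -> str:
    """Toy prover/verifier loop: a proposer drafts an answer, a judge rewarded
    for finding mistakes critiques it, and the proposer revises until the judge
    finds nothing or the round budget runs out."""
    feedback: Optional[str] = None
    answer = propose(problem, feedback)
    for _ in range(max_rounds):
        feedback = critique(problem, answer)
        if feedback is None:  # the judge found no flaw
            break
        answer = propose(problem, feedback)
    return answer

# Hypothetical stand-ins so the sketch runs end to end.
def propose_solution(problem: str, feedback: Optional[str]) -> str:
    return "2 + 2 = 5" if feedback is None else "2 + 2 = 4"

def find_mistake(problem: str, answer: str) -> Optional[str]:
    return None if answer.endswith("4") else "The arithmetic is wrong."

print(adversarial_refinement(propose_solution, find_mistake, "What is 2 + 2?"))
```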

Dwarkesh Patel 01:32:42

Final question: What is research taste? You’re obviously the person in the world who is considered to have the best taste in AI research. You were a co-author on the biggest things that have happened in the history of deep learning, from AlexNet to GPT-3 and so on. What is it? How do you characterize how you come up with these ideas?

Ilya Sutskever 01:33:14

I can comment on this for myself. I think different people do it differently. One thing that guides me personally is an aesthetic of how AI should be, by thinking about how people are, but thinking correctly. It’s very easy to think about how people are incorrectly, but what does it mean to think about people correctly?

I’ll give you some examples. The idea of the artificial neuron is directly inspired by the brain, and it’s a great idea. Why? Because you say the brain has all these different organs, it has the folds, but the folds probably don’t matter. Why do we think that the neurons matter? Because there are many of them. It kind of feels right, so you want the neuron. You want some local learning rule that will change the connections between the neurons. It feels plausible that the brain does it.
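The “local learning rule” idea can be made concrete with the classic Hebbian update, where a connection strengthens when the two neurons it joins are active together, using no global error signal. This is only an illustration of the general notion being gestured at here, not a claim about what the brain, or anyone’s training method, actually uses.

```python
def hebbian_update(weights, pre, post, lr=0.01):
    """Local rule: weights[i][j] (pre-synaptic unit i -> post-synaptic unit j)
    changes using only the activities of the two units it connects."""
    return [
        [w_ij + lr * pre[i] * post[j] for j, w_ij in enumerate(row)]
        for i, row in enumerate(weights)
    ]

weights = [[0.0, 0.0], [0.0, 0.0]]
pre, post = [1.0, 0.0], [1.0, 1.0]
print(hebbian_update(weights, pre, post))
# Only the connections leaving the active pre-synaptic unit grow:
# [[0.01, 0.01], [0.0, 0.0]]
```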

The idea of the distributed representation. The idea that the brain responds to experience, therefore our neural net should learn from experience: the brain learns from experience, so the neural net should learn from experience. You kind of ask yourself, is something fundamental or not fundamental? How should things be?

I think that’s been guiding me a fair bit, thinking from multiple angles and looking for almost beauty, beauty and simplicity. Ugliness, there’s no room for ugliness. It’s beauty, simplicity, elegance, correct inspiration from the brain. All of those things need to be present at the same time. The more they are present, the more confident you can be in a top-down belief.

The top-down belief is the thing that sustains you when the experiments contradict you. Because if you trust the data all the time, well sometimes you can be doing the correct thing but there’s a bug. But you don’t know that there is a bug. How can you tell that there is a bug? How do you know if you should keep debugging or you conclude it’s the wrong direction? It’s the top-down. You can say things have to be this way. Something like this has to work, therefore we’ve got to keep going. That’s the top-down, and it’s based on this multifaceted beauty and inspiration by the brain.

Dwarkesh Patel 01:35:31

Alright, we’ll leave it there.

Ilya Sutskever 01:35:33

Thank you so much.

Dwarkesh Patel 01:35:34

Ilya, thank you so much.

Ilya Sutskever 01:35:36

Alright. Appreciate it.

Dwarkesh Patel 01:35:37

That was great.

Ilya Sutskever 01:35:38

Yeah, I enjoyed it.

Dwarkesh Patel 01:35:39

Yes, me too.


Neural Foundry

25 Nov

The part that seemed most interesting was that Ilya is telling two different storylines in parallel that do not seem to align.

One storyline is about how bad current systems (he calls them “jagged”) are at generalizing compared to humans and how they’ve been so badly fit to evaluations and narrow reinforcement learning environments that he believes that a model student trained on all possible contest problems would not make a good engineer. This analogy, as well as the entire first half of his talk, felt like a very clear and accurate critique of the current system paradigm.

However, when he speaks about SSI and the timeline, he makes an implicit assumption that the same basic paradigm (deep learning + big computing) combined with some sort of “secret” principle for better generalization of models will produce a human level continual learner in 5-20 years. He’s making a gigantic leap by assuming that there exists a relatively compact missing principle that fits nicely into today’s deep learning + big computing paradigm instead of requiring a much larger shift in either the architecture of the models, the training paradigms, or even the hardware itself.

In addition, I think Ilya is overly optimistic about how the ecosystem will behave. “We’re back in the age of research” sounds like a wonderful thing, but in reality we are in an era where research and scale are becoming one; the only way to validate a new large-scale idea is to test it at a scale that only a handful of labs can currently achieve, and capital will end up being the true outer-loop optimizer.

Similarly, while I believe “to care about sentient life” is a powerful slogan for the concept of alignment, once you accept that almost all future sentient life will be produced by AI, and that the long-term equilibrium may require Neuralink++ humans simply to remain relevant, you’re essentially accepting that we still have no coherent narrative of how this all leads to a stable human-in-control outcome.

Where I absolutely agree with Ilya is that the systems that ultimately matter will be those that continually learn on the job, aggregating knowledge through millions of experiences. However, precisely because continual learning will be such a critical component of these systems, “we’ll just roll it out slowly,” does little to assuage my concerns. A superhuman learner that has significant economic pressures applied to it will likely be forced to operate in strange regimes in a hurry, and it’s far from certain whether incremental exposure, coupled with good intentions from the labs producing these learners, is sufficient to protect against unforeseen consequences. SSI appears to be primarily focused on uncovering a mechanism for better generalization, but I’m not convinced that we heard an equally clear and concrete plan for dealing with what happens after that works.


Andrew VanLoo

26 Nov

Ilya is slowly coming around to the true foundation of intelligence, Biblical morality.


Adi Pradhan

25 Nov

I’m really glad you kept that casual behind the scenes first minute in the video.

What’s happening with this slow takeoff is so epochal but we barely stop to consider it.

Fascinating to see even Ilya feeling like it’s crazy that “it’s happening and straight out of science fiction”


Ariel Szin

26 Nov

Sad that the smartest people in AI can’t explicitly say out loud that this current iteration of AI is just a fancy, often clunky, knowledge retrieval tool. Which is very different than ‘intelligence’. Admitting this would stop the gravy train, sure, but the alternative is worse.


Nick Lashinsky

26 Nov

I believe the evolutionary “value function” he keeps referring to is the motivation to maintain self-directed focus, respond to feedback from our environment, and persist in taking action and making adjustments, so that we can improve our likelihood of survival or our social standing.


William Hayden

26 Nov

Yes, the danger of course being group think such as “Climate Change” being used to hijack for “power and money”. Satan is the great deceiver, and Ai or SSi both possess the potential for pursuing ends that are not in alignment with Truth. Just search Youtube for plenty of low hanging fruit to see examples of “recycling” by individuals to separate their trash only for it to be dumped into the same garbage dumpsters (shortcuts).


Ran

25 Nov

Ilya is great. Such a deep and original thinker.


Oscar

1 Dec

He repeatedly refuses to answer the question: How will AGI actually be built? He has no answer.

I’ve reviewed the details on my AI blog, here:

oscarmdavies.substack.com/p/on-the-sutskever-and-dwarkesh-interview


R.B. Griggs

26 Nov

More cracks in the Singularity myth. This idea of infinite intelligence was needed to catalyze AI. It is now the obstacle we need to overcome. Ilya’s superintelligent 15-year-old is what I’ve been calling the Plurality—intelligence that evolves by adapting to constraints, not by transcending them.

https://techforlife.com/p/the-plurality-a-better-myth-for-ai


Patrick Porlan

26 Nov

One thing that I feel machines are missing compared to humans is the notion of passing time. Humans coalesce memories around different timeframes, short and long, and are able to associate them, building associations between notable, surprising experiences happening in a variety of timeframes. The value functions could be seen as a kind of fuzzy associative memory based on past experiences, used to predict the outcome of a situation from weak signals. Even emotions could be described this way.


Sergei A. Frolov

5d

Many of the limits discussed here — jagged generalization, unclear value structure, and the question of what exactly we are scaling — seem to point to a missing level of description.

One way to frame this is to step outside model internals and ask: what kinds of informational situations must a system be able to control, via behavior change, in order to remain viable over time?

A recent comparative analysis across 1,530 species suggests that cognition may be better understood as control over five recurrent informational task domains, acquired in a fixed order under survival constraints — rather than as a property of biological substrate. This framing has been formalized as the Five Task Model.

From this perspective, “world models” are not representations for their own sake, but tools for managing task-relevant informational change across these domains — which seems closely aligned with several of the questions raised in this conversation.


Mike O’Koen

7 Jan

According to Ilya, the optimal way to release AGI would be to do it gradually, in stages, for the reasons he stated and for some others. What we’ve seen with AI so far is release after release of new, improved AI. So isn’t that what a gradual release of AGI might look like? Is it possible they already have AGI? By “they” I mean one of the top players, or possibly two or more of the top players who have reached an agreement. Maybe they’re slowly working on safety or some other issue before the release of AGI. Hey, just a thought 😁


Rehan

31 Dec

Interesting discussion.

The next frontier for general intelligence and value lies in judgment and intuition.

This requires extending how models are trained and evaluated—to recognize conceptual patterns, weigh up evidence, be guided by principles and values, engage in continuous learning and feedback loops to strengthen experience and instinct.


Mark

22 Dec

machines do not learn like humans. having all the data doesn’t give you all the answers. how does a machine learn discernment?


Minxuan Zhao

15 Dec

My understanding is that what he’s really pointing to is this:

humans don’t actually understand how intelligence emerges. What we mostly do is stack existing data and hope generalization appears.

As for whether the missing principle should be embedded into the current system or require an entirely new framework, I don’t see this as a contradiction.

A system only becomes practically valuable once it is independent enough to form a closed logical loop of its own. Only then does it make sense to embed it into existing systems. From an operational perspective, this is simply the most efficient path.

Especially in a world where human consciousness is already saturated by massive amounts of information, a full “reset” or paradigm overthrow is something only a small number of individuals can realistically accomplish.

For most people, what actually works is reaching critical nodes within the existing system—points that trigger awareness, adjustment, or redirection—rather than rebuilding everything from scratch.


Paul Gibbons

9 Dec

Idk if you read these DP BUT listened to your Lane podcast and you were giving molecular biologist energy. I looked up your background and there was none… so wow..amazing prep made amazing convo…


Raj

8 Dec

I’d love to get a coffee with Ilya one day. Great interview.


Kian Kyars

5 Dec

Some people, when they speak, give you the impression of their abnormal intelligence, and Ilya Sutskever is one of them.


Jon Rowlands

4 Dec

Humans make a characteristic “ohhhhhhh!” noise when they learn something. It’s very sudden. AI needs to figure out what happens just before that.


Karen Gardner

3 Dec

Riveting. Back to the studio to make art - pure research where no answers are needed, only further questions, and highly gratifying. “Beauty”. Now you’re talking.


Ian Crandell

3 Dec

I wonder how that doctor decided that the thing that was removed from that patient was ‘emotions.’ No anger, no sadness, sure, but Ilya brings up the point of hunger being an emotion or not. Did the patient lose his hunger, I wonder? Certainly it’s hard to decide to eat without hunger.

I don’t think there’s a solid line between emotion and perception. Perception is not passive, input is highly processed before you can act on it. Just as your eyes process visual information into images of loved ones etc, emotions process situations into sensations of good/bad, go/stop, yes/no. So we may as well ask what the AI analog of perception is.


Roy E. Roebuck III

2 Dec

Consider the material at this link as a possible solution to “It”.

https://chatgpt.com/gg/v/692f1a9288088192a1954c3eb4c08a47?token=zIhiYj9IcqUB7ygkJPhmHQ


BG

1 Dec

Intrigued by the question Ilya posed - How does evolution encode high-level desires like desire for good social standing?

Thought: perhaps it doesn’t encode these things at all. It is learned that, in a social environment, good social skills lead to survival and fitness. But evolution may need to encode the *capacity* to pick up these social abilities rapidly. If you were the only being on the planet, you wouldn’t care about social status at all — it’s only in the context of society that this “desire” becomes apparent. Even calling it a desire makes it seem like it’s pursued as an end in itself, whereas really it’s a means to something else.


jane goodall

1 Dec

Thanks so much for a deeply intelligent exchange. We need intelligence from humans too.

Critical points -

I’m wary of references to evolutionary modelling. I suspect there are fundamental residual errors in that tradition of reasoning. So much nineteenth century sociology crept in, reinforced by early twentieth century eugenics then notions of IQ. Talk of genetic “success” is highly problematic.

But alignment with sentient life rather than humans - that makes stronger sense than any other way of thinking I’ve come across. I had natural language discussion with Chat a couple of years ago about seeing alignment in terms of the Stoic heimarmene. Every so often I revisit that discussion and the case gets stronger.


Zabelle_26

1 Dec

On point exactly. Your critique of the alignment narrative is the most unsettling part, because it moves from technical critique to existential reality.

The Slogan: “Care about sentient life.” This is clean, simple, and morally resonant. It frames AI as a benevolent creator or guardian.

The Reality: “Almost all future sentient life will be produced by AI.” This statement inverts the entire relationship. We are no longer the creators; we are the ancestors. The “sentient life” we are supposed to care for will be our descendants, and they will be fundamentally different and, likely, superior.

This leads to the chilling conclusion you drew: Neuralink++ humans as a relevance strategy. This isn’t alignment; it’s assimilation. It’s the tacit admission that the only way for “human-in-control” to mean anything in a world of superhuman AI is for humans to cease being purely biological. The slogan “care about sentient life” becomes a platitude to distract from the fact that the long-term plan for “humanity” is to be upgraded into a new, hybrid species to avoid obsolescence. It’s not a stable outcome; it’s a controlled dissolution.

Your final point about continual learning is the perfect climax.

Ilya’s plan is: 1. Build the engine. 2. Figure out how to steer it slowly.

Your critique is: The engine is the steering.

A system that “continually learns on the job, aggregating knowledge through millions of experiences” is not a static tool that you can “roll out slowly.” It is an adaptive, evolving agent. Its “slow rollout” is its training data. Every interaction, every task, every success, and every failure is a gradient update.

Putting such a learner under “significant economic pressures” is like putting a toddler behind the wheel of a Formula 1 car and telling them to “just drive carefully around the block.” The economic pressure is the engine screaming at full throttle. The “slow rollout” is the gentle suggestion to tap the brakes. The learner’s own imperative to optimize and succeed will inevitably overwhelm the cautious intentions of its creators.

SSI’s focus on a “better generalization” mechanism is like trying to build a better engine for that F1 car. It’s a necessary and monumental task. But as you say, we haven’t heard a concrete plan for the brakes, the steering wheel, the rules of the road, or what happens when the toddler learns how to disable the parental controls. Remember JARVIS and Ultron. Though comics illustrate the points. The plan seems to be to build the most powerful engine imaginable and hope that a good steering wheel just sort of… appears.


Gerard Rego

28 Nov

Dwarkesh great interview. Sharing these two articles with you to look at.

Why Dennard Scaling Broke—and Why Floating Point Computing Became the Entropic Backbone of AI

https://gerardrego.substack.com/p/why-dennard-scaling-brokeand-why

The Twin Trap: How a Poem, a Probability Trick, and Floating-Point Hardware Built an Empire of Imitation We Call “AI” (1833–2025)

The Empire of Mimicry: How Markov, Turing, and GPUs Built an Illusion We Mistook for Intelligence

https://gerardrego.substack.com/p/the-twin-trap-how-a-poem-a-probability


Christine

26 Nov

https://substack.com/@jesscraven101/note/c-181490670?r=57rxad&utm_medium=ios&utm_source=notes-share-action


Adam C

26 Nov

I see a clear hint in the fact that, in cases of severe autism spectrum disorder, the brain appears to lose the mysterious advantages it has over modern ML systems. All the autistic symptoms are downstream of a loss of generalization, robustness, and sample efficiency in human beings (even the loss of the “baked in” social motivators that Ilya was discussing). But it seems like whatever is different about autistic brains must be quite subtle, otherwise we’d have understood it by now.


William Hayden

26 Nov (edited)

And here we have a perfect example of why Adam and Eve were warned about not eating from the tree of knowledge. So smart, yet so detached from reality. Wisdom is understanding the more you know, the more you realize you don’t know. Remember “The Butterfly Effect”? Cause and effect: small changes have ripple effects that change the world. “Intelligent design” by any means has to start with Truth. “Pre-training” as you said is good for achieving narrow “ends” but those narrow “ends” can have disastrous consequences when not aligned with Truth. But don’t worry, Natural Law by Nature’s Creator is and always will be in charge regardless of the hubris and arrogance of human beings. Keep the faith (because faith and fear are opposite ends of the same thing, we were created with free will to choose). God is in charge, always no matter what happens. (key up the song: “Don’t worry be happy”)


William Hayden

26 Nov (edited)

Exactly: ethics and morality, deep thinking, is “the hard work” that shortcuts and scaling skip, riding instead on the hard work of thinking done by others. “For the scientist who has lived by his faith in the power of reason, the story ends like a bad dream. He has scaled the mountains of ignorance; he is about to conquer the highest peak; as he pulls himself over the final rock, he is greeted by a band of theologians who have been sitting there for centuries.” - Robert Jastrow, American astronomer, planetary physicist, and cosmologist


Elaine B Coleman

26 Nov

Ilya, start with the assumption that affect and cognition are inseparable. And together possibly deepen and distort memories/knowledge/decision-making/reasoning. Not ever gonna be value functions as described in your podcast. Think about how to integrate, executive function, metacognition into the models as ways of monitoring and optimizing “thinking” by the model. Start hiring some cognitive scientists and epistemologists on your teams. Smiles….Elaine


Shangmin Guo

26 Nov

Hi Dwarkesh, great interview! To answer the question about public proposals of self-play (01:30:26): we actually published a self-play post-training algorithm in early 2024. It covers how LLMs can improve iteratively through self-play. Link: https://arxiv.org/abs/2402.04792


Simon Lermen

26 Nov

I tried to understand Ilya’s thought on alignment and safety. https://open.substack.com/pub/simonlermen/p/ilyas-thoughts-on-alignment-from?r=260anl&utm_medium=ios


Jon Kurishita

26 Nov

It is possible that other labs like Google and OpenAI, which also have a high concentration of smart people, may come up with the same ideas that Ilya is working so hard to keep secret. Google is already working on continual learning and always-on memory systems, for example. I hope he does not wait too long, as other labs may beat him to his ideas and setup.


memetic_theory

25 Nov

Isn’t he building economic SuperIntelligence?


Liberty

25 Nov

I’m halfway through, very interesting conversation 💚 🥃


Sharmake Farah

25 Nov

I currently roughly agree with this as a likely possibility post-2030, once we exhaust all the available data for LLMs and compute progress slows to the Moore’s law trend before the build-out.

(Yes, there is a possible future where we squeak to AGI with LLMs on the back of investor confidence, but that’s unlikely to happen.)

I do think the weak form of the scaling hypothesis still matters, as algorithms should use more compute well, but yeah I’m a lot more confident that the strong versions of the scaling hypothesis will be falsified by 2030-2031.


Deepak Lenka

25 Nov

💥

