Speaker 1
01:00:01
First, the win itself. I mean, it was so exciting. Looking back to those last days of 2018, really, that's when the games were played, I'm sure I'll look back at that moment and say, oh my God, I want to be in a project like that. I already feel the nostalgia of it: that was huge in terms of the energy and the team effort that went into it.
Speaker 1
01:00:26
And so in that sense, as soon as it happened, I already knew I was kind of losing it a little bit. So it's almost sad that it happened, but on the other hand, it also verifies the approach. But to me, there are also so many challenges and interesting aspects of intelligence left that, even though we can train a neural network to play at the level of the best humans, there's still so much to do. So for me, it's also like, well, this really is an amazing achievement.
Speaker 1
01:00:57
But I was already also thinking about next steps. I mean, as I said, these agents play Protoss versus Protoss, but they should be able to learn to play a different race much quicker, right? So that would be an amazing achievement. Some people call this meta reinforcement learning, meta-learning, and so on, right?
Speaker 1
01:01:15
So there's so many possibilities after that moment. But the moment itself, it really felt great. We had this bet. So I'm kind of a pessimist in general.
Speaker 1
01:01:27
So I sent an email to the team. I said, OK, let's bet: against TLO first, what's going to be the result? And I really thought we would lose like 5-0, right?
Speaker 1
01:01:39
We had done some calibration against a 5000 MMR player. TLO was much stronger than that player, even playing Protoss, which is his off race. But yeah, I was not imagining we would win. So for me, that was just kind of a test run or something.
Speaker 1
01:01:55
And then it happened, and he was really surprised. And then we went to this bar to celebrate, and Dave tells me, well, why don't we invite someone who is a thousand MMR stronger at Protoss, an actual Protoss player, which turned out to be Mana, right? And we had some drinks and I said, sure, why not?
Speaker 1
01:02:19
But then I thought, well, that's really going to be impossible to beat, because it's so far ahead. A thousand MMR means something like a 99% probability that Mana would beat TLO in Protoss versus Protoss, right?
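A rough sanity check on that 99% figure, under the assumption that MMR gaps behave like Elo rating gaps (an approximation; Blizzard's ladder is not exactly Elo):

```python
# Expected score of the higher-rated player under the standard Elo model,
# assuming StarCraft II MMR differences behave roughly like Elo differences.
def win_probability(mmr_gap: float, scale: float = 400.0) -> float:
    return 1.0 / (1.0 + 10.0 ** (-mmr_gap / scale))

print(f"{win_probability(1000):.1%}")  # ~99.7%, in line with the quoted ~99%
```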
Speaker 1
01:02:32
So we did that. And to me, the second match was much more important, even though a lot of uncertainty kind of disappeared after we beat TLO. I mean, he is a professional player, so that already was a very nice achievement. But Mana really was at the top.
Speaker 1
01:02:51
And you could see he played much better. But our agents had gotten much better, too. And after the first game, I said, if we take a single game, at least we can say we won a game.
Speaker 1
01:03:02
I mean, even if we didn't win the series, for me, that was a huge relief. And I remember all the hugging. This moment, for me, will resonate forever, as a researcher and as a person. Yeah, it's a really great accomplishment.
Speaker 1
01:03:18
And it was great also to be there with the team in the room. I don't know if you saw the video. So it was really like...
Speaker 2
01:03:25
I mean, from my perspective, the other interesting thing is, just like watching Kasparov, watching Mana was also interesting, because he was kind of at a loss for words. I mean, whenever you lose, and I've done a lot of sports, you sometimes make excuses, you look for reasons, and he couldn't really come up with reasons. With TLO playing his off race as Protoss, you could say, well, it felt awkward, it wasn't his race; but here he was just beaten.
Speaker 2
01:03:55
And it was beautiful to look at a human being being superseded by an AI system. I mean, it's a beautiful moment for researchers. So yeah,
Speaker 1
01:04:04
for sure, it was. I mean, probably the highlight of my career so far, because of its uniqueness and coolness.
Speaker 1
01:04:11
And I don't know, obviously, as you said, you can look at paper citations and so on. But this really is a testament to the whole machine learning approach, and to using games to advance technology. Everything came together at that moment.
Speaker 1
01:04:28
That's really the summary.
Speaker 2
01:04:29
Also, on the other side, it's a popularization of AI too, just like traveling to the moon and so on. I mean, this is where a very large community of people who don't really know AI get to really interact with it.
Speaker 1
01:04:45
Which is very important. I mean, writing papers helps our peers, researchers, understand what we're doing. But I think AI is becoming mature enough that we must try to explain what it is.
Speaker 1
01:04:58
And perhaps games are an obvious way, because these games have always had built-in AI. So maybe everyone has experienced an AI playing a video game, even if they don't know it, because there's always some scripted element, and some people might even call that AI already, right?
Speaker 2
01:05:14
So what are other applications of the approaches underlying AlphaStar that you see happening? There are a lot of echoes of, as you said, transformers from language modeling and so on. Have you already started thinking about where the breakthroughs in AlphaStar could expand to other applications?
Speaker 1
01:05:32
Right. So I've thought about a few things for the next months and years. The main thing I'm thinking about, actually, is what's next as a kind of grand challenge. We've seen Atari, and then the three-dimensional worlds, where we've also seen pretty good performance from these Capture the Flag agents that people at DeepMind and elsewhere are working on. We've also seen some amazing results on, for instance, Dota 2, which is also a very complicated game.
Speaker 1
01:06:03
So for me, the main thing I'm thinking about is what's next in terms of a challenge. As a researcher, I see a tension between research and the applications or domains where you apply it. On the one hand, thanks to StarCraft being a very hard application, we developed some techniques, some new research, that we can now look at elsewhere. Are there other applications where we can apply these?
Speaker 1
01:06:30
And the obvious one, absolutely, is feeding back into the community we took from, which was mostly sequence modeling and natural language processing. We developed and extended things from the transformer, we used pointer networks, we combined LSTMs and transformers in interesting ways. So that's perhaps the lowest-hanging fruit: feeding back into a different field of machine learning that's not about playing video games.
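For readers unfamiliar with pointer networks, here is a minimal sketch of the core idea (Vinyals et al., 2015), not the AlphaStar implementation: instead of scoring a fixed output vocabulary, the decoder "points" at positions of its input, for example to select one element among many:

```python
# Minimal pointer-attention sketch: the output is a distribution over input
# positions rather than over a fixed vocabulary (illustrative code only).
import numpy as np

def pointer_attention(query, encoder_states):
    """query: (d,) decoder state; encoder_states: (n, d) one vector per input."""
    scores = encoder_states @ query      # (n,) dot-product score per position
    scores -= scores.max()               # stabilize the softmax
    probs = np.exp(scores)
    return probs / probs.sum()           # probability of pointing at each input

rng = np.random.default_rng(0)
enc = rng.normal(size=(5, 8))            # 5 input elements, dimension 8
print(pointer_attention(rng.normal(size=8), enc))  # sums to 1 over 5 positions
```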
Speaker 2
01:07:00
Let me go old school and jump to Mr. Alan Turing. So the Turing test, you know, it's a natural language test, a conversational test.
Speaker 2
01:07:11
What's your thought of it as a test for intelligence? Do you think it is a grand challenge that's worthy of undertaking? Maybe if it is, would you reformulate it or phrase it somehow differently?
Speaker 1
01:07:23
Right. So I really love the Turing test, because I also like sequences and language understanding. In fact, some of the early work we did in machine translation we tried to apply to a kind of neural chatbot, which obviously would never pass the Turing test because it was very limited. But it is a fascinating idea that you could really have an AI that would be indistinguishable from humans when conversing with it, right?
Speaker 1
01:07:56
So I think the test itself seems very nice, and it's kind of well defined, actually, whether you pass it or not. There are quite a few rules, and they feel pretty simple. And, you know, I think they have these competitions every year.
Speaker 2
01:08:14
Yeah, the Loebner Prize. But I don't know if you've seen the kinds of bots that emerge from that competition. They're not quite what you would hope for, so it feels like there are weaknesses in the way Turing formulated it. The definition of a genuine, rich, fulfilling human conversation needs to be something else.
Speaker 2
01:08:41
The Alexa Prize, which I'm not as familiar with, has tried to define that more, I think, by saying you have to keep a conversation going for 30 minutes, something like that. So basically forcing the agent not just to fool you but to have an engaging conversation. Have you thought about this problem richly? And if you have, in general, how far away are we? You've worked a lot on language understanding and language generation, but the full dialogue, the conversation, you know, just sitting at the bar, having a couple of beers for an hour, that kind of conversation.
Speaker 2
01:09:22
Have you thought about it?
Speaker 1
01:09:23
Yeah, so I think you touched here on the critical point, which is feasibility, right? There's a great essay by Hamming describing the grand challenges of physics. He argues that, for instance, teleportation or time travel are great grand challenges of physics, but there's no way to attack them.
Speaker 1
01:09:46
We really don't know how, or cannot, make any progress on them. That's why most physicists don't work on these in their PhDs or as part of their careers. So I see the full Turing test as still a bit too early. Although, especially with the current trend of deep learning language models, we've seen some amazing examples.
Speaker 1
01:10:11
I think GPT-2 is the most recent one, and it's very impressive. But to fully solve passing it, fooling a human into thinking there's a human on the other side, I think we're quite far. So as a result, I don't see myself, and I probably would not recommend, doing a PhD on solving the Turing test, because it just feels too early, or too hard a problem.
Speaker 2
01:10:35
Yeah, but that said, you said the exact same thing about StarCraft a few years ago, to Demis. So I appreciate it, you'll probably also be the person who passes the Turing test in three years.
Speaker 1
01:10:48
I mean, I think... yeah.
Speaker 2
01:10:50
So we have this on record. This is nice.
Speaker 1
01:10:52
It's true. I mean, it's true that progress sometimes is a bit unpredictable.
Speaker 1
01:10:57
Even six months ago, I would not have predicted the level we see these agents deliver, Grandmaster level. But I have worked on language enough, and basically my concern is not that a breakthrough couldn't happen that would bring us to solving or passing the Turing test; it's that I just think the purely statistical approach is not going to cut it. So we need a breakthrough, which is great for the community. But given that, I think there's quite a bit more uncertainty.
Speaker 1
01:11:31
Whereas for StarCraft, I knew what the steps would be to get us there. I think it was clear that the imitation learning part, and then using these Battle.net replays for the agents, were going to be key. And it turned out that this was the case; a little more was needed, but not much more. For the Turing test, I just don't know what the execution plan would look like.
Speaker 1
01:11:55
So that's why, for me, working on it as a grand challenge is hard. But there are quite a few related sub-challenges where you could say, well, what if you create a great assistant, like Google already has with the Google Assistant: can we make it better, can we make it fully neural, and so on? There I start to believe maybe we're reaching a point where we should attempt these challenges.
Speaker 2
01:12:20
I like this conversation so much because it echoes very much the StarCraft conversation. It's exactly how you approached StarCraft. Let's break it down into small pieces and solve those, and you end up solving the whole game.
Speaker 2
01:12:31
Great. But that said, you're behind some of the biggest pieces of work in deep learning in the last several years. You mentioned some limits. What do you think are the current limits of deep learning, and how do we overcome them?
Speaker 1
01:12:47
So if I had to use a single word to define the main challenge in deep learning, a challenge that has probably been the challenge for many years, it is generalization. What that means is that all we're doing is fitting functions to data. And when the data we see is not from the same distribution, or even when it is very close to the distribution but we trained with limited samples, we get to this stage where we just don't see as much generalization as we would like.
Speaker 1
01:13:27
And I think adversarial examples are a clear example of this. But if you study the machine learning literature, the reason SVMs became very popular was that they had some guarantees about generalization on unseen data. Out of distribution, or even within distribution, when you take an image and add a bit of noise, these models fail. So really, I don't see a lot of progress on generalization in the strong sense of the word. For our neural networks, you can always find designed examples that will make their outputs arbitrary, which is not good, because we humans would never be fooled by these kinds of images or manipulations of the image. And if you look at the mathematics, you kind of understand why: it's a bunch of matrices multiplied together.
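To make the adversarial-example point concrete, here is a hedged toy illustration on a linear model, in the spirit of the fast-gradient-sign method (Goodfellow et al., 2015); the setup and numbers are purely illustrative:

```python
# Toy adversarial example on a linear classifier: a tiny, targeted change to
# every feature flips a confident prediction (illustrative sketch only).
import numpy as np

rng = np.random.default_rng(42)
d = 1000
w = rng.normal(size=d)                    # weights of a "trained" linear model
x = rng.normal(size=d)
x += (3.0 - w @ x) / (w @ w) * w          # adjust x so its score is exactly +3

eps = 0.01                                # imperceptibly small per-feature change
x_adv = x - eps * np.sign(w)              # gradient-sign-style perturbation

print(f"clean score:        {w @ x: .2f}")      # +3.00 -> one class
print(f"adversarial score:  {w @ x_adv: .2f}")  # ~ -5.0 -> prediction flipped
print(f"max feature change: {np.max(np.abs(x_adv - x)):.3f}")  # only 0.010
```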
Speaker 1
01:14:26
There are probably numerical instabilities, and you can just find corner cases. So I think that's really the underlying topic we see many times, even at the grand stage of the Turing test: generalization. I mean, if you start passing the Turing test, should it be in English, or should it be in any language, right? As a human, if you're asked something in a different language, you actually will go and do some research and try to translate it and so on.
Speaker 1
01:14:57
Should the Turing test include that, right? And it's really a difficult problem and very fascinating and very mysterious actually.
Speaker 2
01:15:05
Yeah, absolutely. But if you were to try to solve it, can you not grow the size of the data intelligently, in such a way that the distribution of your training set includes the entirety of the test set?
Speaker 1
01:15:20
I think-
Speaker 2
01:15:20
Is that one path? The other path being a totally new methodology.
Speaker 1
01:15:23
Right, one that's not statistical. So a path that has worked well, and it worked well in StarCraft and in machine translation and in language, is scaling up the data and the model. And that's been maybe the single formula that still delivers today in deep learning, right?
Speaker 1
01:15:40
Scale, data scale and model scale, really does more and more of the things we thought, oh, there's no way it can generalize to this, there's no way it can generalize to that. But I don't think this will fundamentally solve it. For instance, I really like the style of approach that would not only have neural networks, but would also have programs, or some discrete decision-making, because that's where I feel there's a bit more to gain. The best example for understanding this: I also worked a bit on the idea that we can learn an algorithm with a neural network, right? You give it many examples, and it's going to sort the input numbers or something like that.
Speaker 1
01:16:24
But really, strong generalization is: you ask me to create an algorithm that sorts numbers, and instead of creating a neural net, which will be fragile, because at some point it's going to go out of range, you're going to give it numbers that are too large, too small and whatnot, you just create a piece of code that sorts the numbers. Then you can prove that it will generalize to absolutely all the possible inputs you could give. So I think that problem comes with some exciting prospects.
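A small, hypothetical sketch of the contrast he is drawing: an ordinary sorting program is correct by construction for inputs far outside any range a learned sorter would have been trained on:

```python
# A plain program generalizes to all inputs; no training distribution involved.
import random

def insertion_sort(xs):
    """An ordinary program: correct for any magnitudes and any length."""
    out = []
    for x in xs:
        i = len(out)
        while i > 0 and out[i - 1] > x:  # find the insertion point
            i -= 1
        out.insert(i, x)
    return out

# Values spanning +/-1e18 would be wildly "out of range" for a model trained
# on small numbers, but the program does not care.
huge = [random.randint(-10**18, 10**18) for _ in range(1000)]
assert insertion_sort(huge) == sorted(huge)
print("sorted 1000 values spanning +/-1e18 correctly")
```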
Speaker 1
01:16:55
I mean, scale is a bit more boring, but it really works. And programs and discrete abstractions are maybe a bit less developed, but clearly I think they're quite exciting for the future of the field.
Speaker 2
01:17:09
Do you draw any insight or wisdom from the 80s, from expert systems and symbolic systems, symbolic computing? Do you ever go back to those, the reasoning, that kind of logic? Do you think that might make a comeback?
Speaker 2
01:17:23
You'll have to dust off those books?
Speaker 1
01:17:24
Yeah, I actually love adding more inductive biases. To me, the question really is, what are you trying to solve? If what you're trying to solve is so important that you'll try to solve it no matter what, then absolutely use rules, use domain knowledge, and then use a bit of the magic of machine learning to empower it, to make it the best system that will detect cancer or detect weather patterns, right?
Speaker 1
01:17:56
StarCraft, too, was a very big challenge, so I was definitely happy to cut a corner here and there; it could have been interesting to do. And in fact, in StarCraft we did start thinking about expert systems, because people actually build StarCraft bots on those principles: state machines, rule-based systems. And then you could think of combining a bit of a rule-based system with neural networks incorporated, to make it generalize a bit better.
Speaker 1
01:18:29
So absolutely, we should definitely go back to those ideas. Anything that makes the problem simpler, as long as your problem is important, that's okay; that's an important problem driving the research. On the other hand, if you really want to probe the limits of reinforcement learning, then of course you must try not to look at imitation data, or at rules of the domain that would help a lot, or even feature engineering.
Speaker 1
01:18:56
So there is a tension, but depending on what you do, I think both ways are definitely fine. I would never rule out one or the other, as long as what you're doing is important and needs to be solved, right?
Speaker 2
01:19:11
So there's a bunch of different ideas that you've developed that I really enjoy. One is image captioning, translating from image to text. Just another beautiful idea, I think, that resonates throughout your work, actually.
Speaker 2
01:19:33
So the underlying nature of reality being language, always, somehow. So what's the connection between images and text, or rather the visual world and the world of language, in your view?
Speaker 1
01:19:46
Right. So a piece of research that's been central, I would say even extending into StarCraft, is this idea of sequence-to-sequence learning. What we really meant by that is that you can now input anything to a neural network as the input X, and the neural network will learn a function f that takes X as input and produces any output Y. And these Xs and Ys don't need to be static or fixed vectors or anything like that. They can really be sequences, and now even other data structures.
Speaker 1
01:20:25
So that paradigm was tested in a very interesting way when we moved from translating French to English to translating an image to its caption. And the beauty of it is that that's actually how it happened: I changed a line of code in this thing that was doing machine translation.
Speaker 1
01:20:47
I came in the next day, and I saw it producing captions where it seemed like, oh my God, this is really working. And the principle is the same, right? So I don't see text, vision, speech, waveforms as something different, as long as you basically learn a function that will vectorize them. Once we vectorize them, we can then use transformers, LSTMs, whatever the flavor of the month of the model is.
Speaker 1
01:21:22
And then, as long as we have enough supervised data, this formula will work and will keep working, I believe, to some extent, modulo these generalization issues that I mentioned before.
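A schematic of the point, with stand-in encoders and decoder rather than the original translation or captioning models: once every modality is vectorized, the same decoder applies unchanged, and moving from translation to captioning is essentially swapping one line:

```python
# Sketch: the decoder only sees a vector; the encoder determines the modality.
import numpy as np

def encode_text(tokens: list[str]) -> np.ndarray:
    """Stand-in for an LSTM/transformer text encoder."""
    rng = np.random.default_rng(abs(hash(tuple(tokens))) % 2**32)
    return rng.normal(size=128)

def encode_image(pixels: np.ndarray) -> np.ndarray:
    """Stand-in for a CNN image encoder."""
    rng = np.random.default_rng(int(pixels.sum()) % 2**32)
    return rng.normal(size=128)

def decode(context: np.ndarray) -> str:
    """Stand-in decoder: maps any context vector to an output sequence."""
    return f"<output sequence conditioned on a {context.shape[0]}-d vector>"

print(decode(encode_text(["le", "chat", "dort"])))   # machine translation
print(decode(encode_image(np.zeros((224, 224, 3))))) # captioning: one changed line
```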
Speaker 2
01:21:34
But the task there is to vectorize, to form a representation that's meaningful, I think. And your intuition now, having worked with all these media, is that once you are able to form that representation, you can basically take anything, any sequence. Going back to StarCraft, are there limits on the length?
Speaker 2
01:21:55
We didn't really touch on the long-term aspect. How did you overcome the really long-term aspect of things here? Are there some tricks?
Speaker 1
01:22:05
So the main trick... In StarCraft, if you look at absolutely every frame, you might think it's quite a long game. We would have to multiply 22 frames per second, times 60 seconds per minute, times maybe at least 10 minutes per game on average. So there are quite a few frames.
Speaker 1
01:22:25
But the trick really was to only observe when you act, which might be seen as a limitation, but is also a computational advantage. And then the neural network also decides what the gap is going to be until the next action. If you look at most StarCraft games in the dataset that Blizzard provided, it turns out that most games are, I mean, still a long sequence, but maybe a thousand to 1,500 actions, which, if you start looking at large LSTMs and transformers, is not that difficult, especially if you have supervised learning.
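The arithmetic behind this, reconstructed from the figures quoted above (around 22 game steps per second, at least 10 minutes per game, roughly 1,000 to 1,500 actions):

```python
# Back-of-the-envelope: frame-level vs observe-when-you-act sequence lengths.
STEPS_PER_SECOND = 22
frames = STEPS_PER_SECOND * 60 * 10            # every frame of a 10-minute game
actions = 1_200                                # illustrative midpoint of 1,000-1,500

print(f"full frame sequence: {frames:,}")      # 13,200 steps
print(f"observe-when-acting: {actions:,}")     # each action also predicts the
                                               # delay until the next action
print(f"compression:         {frames / actions:.0f}x shorter")
```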
Speaker 1
01:23:14
If you had to do it with reinforcement learning, the credit assignment problem, what was it in this game that made you win, that would be really difficult. But thankfully, because of imitation learning, we didn't have to deal with this directly. Although we did try it.
Speaker 1
01:23:29
And what happens is you just take all your workers and attack with them. That's kind of obvious in retrospect, because you start trying random actions; one of the actions will be a worker going to the enemy base, and because it's self-play, the opponent is not going to know how to defend, because it basically knows almost nothing.
Speaker 1
01:23:46
And eventually what you develop is this "take all workers and attack", because the credit assignment issue in RL is really, really hard. I do believe we could do better, and that's maybe a research challenge for the future. But yeah, even in StarCraft, the sequences are maybe a thousand long, which I believe is within the realm of what transformers can do.
Speaker 1
01:24:10
Yeah, I guess the difference between StarCraft and Go is that in Go and chess, stuff starts happening right away. So through self-play, not easily, but it's possible to develop reasonable strategies quickly, as opposed to StarCraft. I mean, in Go, there are only about 400 actions, but one action is what people would call the God action.
Speaker 1
01:24:34
That would be, if you had expanded the whole search tree, the best action, the one minimax or whatever algorithm would give you if you had the computational capacity. But in StarCraft, 400 is minuscule. With 400 you couldn't even click on the pixels around a unit, right? So the problem there, in terms of action space size, is way harder.
Speaker 1
01:25:01
And that search is impossible. So there are quite a few challenges indeed that make this a step up in terms of machine learning. For humans, maybe playing StarCraft seems more intuitive because it looks real. I mean, you know, the graphics, and everything moves smoothly.
Speaker 1
01:25:18
Whereas Go is a game that I would really need to study; it feels quite complicated. But for machines, maybe it's kind of the reverse, yes.
Speaker 2
01:25:27
Which shows you the gap actually between deep learning and however the heck our brains work. So you developed a lot of really interesting ideas. It's interesting to just ask, what's your process of developing new ideas?
Speaker 2
01:25:41
Do you like brainstorming with others? Do you like thinking alone? What was it, Ian Goodfellow said he came up with GANs after a few beers. He thinks beers are essential for coming up with new ideas.
Speaker 1
01:25:55
We had beers when we decided to play another game of StarCraft a week later. So it's really similar to that story. Actually, I explained this at a DeepMind retreat, and I said, this is the same as the GAN story.
Speaker 1
01:26:07
I mean, we were in a bar and we decided, let's play a game next week. And that's what happened.
Speaker 2
01:26:11
I feel like we're giving the wrong message to young undergrads. Yeah, I know. But in general, do you like brainstorming?
Speaker 2
01:26:18
Do you like thinking alone, working stuff out?
Speaker 1
01:26:20
So I think throughout the years, things have changed. Initially, I was very fortunate to be with great minds like Geoff Hinton, Jeff Dean, Ilya Sutskever.
Speaker 1
01:26:34
I was really fortunate to join Brain at a very good time. So at that point, for ideas, I was just brainstorming with my colleagues, and I learned a lot. And to keep learning is actually something you should never stop doing, right?
Speaker 1
01:26:48
So learning implies reading papers and also discussing ideas with others. It's very hard at some point not to communicate, whether that's reading someone's paper or actually discussing. So definitely that communication aspect needs to be there, whether it's written or oral.
Speaker 1
01:27:08
Nowadays, I'm also trying to be a bit more strategic about what research to do. I was describing a little bit this tension between research for the sake of research, and, on the other hand, applications that can drive the research, right? And honestly, the formula that has worked best for me is to just find a hard problem and then try to see how research fits into it, how it doesn't fit into it, and then you must innovate. So machine translation drove sequence-to-sequence.
Speaker 1
01:27:42
Then learning combinatorial algorithms led to pointer networks; StarCraft led to really scaling up imitation learning and the AlphaStar League. So that's been a formula that I personally like, but the other one is also valid, and I see it succeed a lot of the time: you just want to investigate, say, model-based RL as a research topic, and then you must start to think, well, what are the tests?
Speaker 1
01:28:12
How are you going to test these ideas? You need a minimal environment to try things in. You need to read a lot of papers and so on. And that's also very fun to do, and something I've done quite a few times, both at Brain and at DeepMind, and obviously during my PhD.
Speaker 1
01:28:28
So besides the ideas and discussions, I think this is important, because you start guiding not only your own goals but other people's goals toward the next breakthrough. So you must really understand this feasibility question, as we were discussing before, right? Whether a domain is ready to be tackled or not. You don't want to be too early; you obviously don't want to be too late.
Speaker 1
01:28:57
So it's really interesting, this strategic component of research, which, as a grad student, I just had no idea about. I just read papers and discussed ideas. And I think this has been maybe the major change, and I recommend people feed forward to what success looks like and try to backtrack from it, rather than just looking around: this looks cool, this looks cool.
Speaker 1
01:29:19
And then you do a bit of random work, and sometimes you stumble upon interesting things. But in general, it's also good to plan a bit.
Speaker 2
01:29:27
Yeah, I like it, especially your approach of taking a really hard problem, stepping right in, and then being super skeptical about being able to solve it. I mean, there's a balance of both, right? A silly optimism and a critical skepticism that are good to balance, which is why it's good to have a team of people who balance that.
Speaker 1
01:29:52
You don't do that on your own. You have mentors who have seen more, and you obviously want to chat and discuss whether it's the right time. I mean, Demis came in 2014, and he said, maybe in a bit we'll do StarCraft. And maybe he knew.
Speaker 1
01:30:09
And I'm just following his lead, which is great, because he's brilliant, right? So these things are obviously quite important: you want to be surrounded by people who are diverse, who have their own knowledge. I've learned a lot from people who have an idea that I might not think is good, but if I give them the space to try it, I've been proven wrong many, many times as well. So that's great.
Speaker 1
01:30:38
Your colleagues are more important than yourself, I think. Sure.
Speaker 2
01:30:44
Now, let's real quick talk about another impossible problem: AGI. What do you think it takes to build a system with human-level intelligence?
Speaker 2
01:30:53
We talked a little bit about the Turing test and StarCraft; all of these have echoes of general intelligence. But if you think about something where you would sit back and say, wow, this really resembles human-level intelligence, what do you think it takes to build that?
Speaker 1
01:31:09
So I find that AGI, oftentimes, is maybe not very well defined. So what I'm trying to come up with for myself is what a result would look like that would make you start to believe you have agents or neural nets that no longer overfit to a single task, right? But actually learn the skill of learning, so to speak.
Speaker 1
01:31:37
And that actually is a field that I am fascinated by: learning to learn, or meta-learning, which is about no longer learning about a single domain. So you can think of the learning algorithm itself as general, right? The same formula we applied for AlphaStar and StarCraft we can now apply to almost any video game, or to many other problems and domains. But it's the algorithm that's generalizing.
Speaker 1
01:32:06
But the neural network, those weights, are useless even for playing another race, right? I train a network to play very well at Protoss versus Protoss; if I now want to play Terran versus Terran, I need to throw away those weights and retrain a network from scratch with the same algorithm.
Speaker 1
01:32:24
That's beautiful, but the network itself will not be useful. So if I see an approach that can absorb new problems, or start solving them, without the need to restart the process, that to me would be a nice way to define some form of AGI. Again, I don't know about the grandiose version: should the Turing test be solved before AGI? I don't know.
Speaker 1
01:32:51
Concretely, I would like to clearly see that meta-learning happen, meaning there is an architecture or a network that, as it sees a new problem or new data, solves it. And to make it a benchmark, it should solve it at the same speed that we solve new problems. When I show you a new object and you have to recognize it, or when you start playing a new game, you've played all the Atari games, but now you play a new Atari game, well, you're going to get pretty good at it pretty quickly. What exactly the domain and the benchmark should be is a bit difficult.
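A structural sketch of the loop he is describing, in the spirit of first-order meta-learning (Reptile-style); the toy one-dimensional tasks and all names here are illustrative, not DeepMind code:

```python
# Outer loop learns an initialization; inner loop is rapid learning on a task.
import numpy as np

rng = np.random.default_rng(0)

def make_task():
    """A toy 'new problem': 1-D regression with its own hidden optimum."""
    target = rng.normal()
    return lambda w: ((w - target) ** 2, 2 * (w - target))  # (loss, gradient)

def adapt(w, task, steps=3, lr=0.1):
    """Inner loop: a few gradient steps, i.e. quickly learning a new task."""
    for _ in range(steps):
        _, grad = task(w)
        w -= lr * grad
    return w

w0 = 5.0
for _ in range(1000):                      # outer loop over many tasks
    task = make_task()
    w0 += 0.02 * (adapt(w0, task) - w0)    # Reptile-style meta-update

# The benchmark: how fast is a genuinely new task solved from the meta-learned
# initialization, compared to an arbitrary starting point?
new_task = make_task()
print("loss after 3 steps, meta-init:", new_task(adapt(w0, new_task))[0])
print("loss after 3 steps, scratch:  ", new_task(adapt(5.0, new_task))[0])
```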
Speaker 1
01:33:27
I think, as a community, we might need to do some work to define it. But this first step, I could see happening relatively soon. Then the whole question of what AGI means and so on, I am a bit more confused about, because I think people mean different things.
Speaker 2
01:33:44
There's an emotional, psychological level too. Even the Turing test, passing the Turing test, is something we just pass judgment on as human beings; what it means to be, you know... is a dog an AGI system? At what level? What does it mean?
Speaker 2
01:34:07
But I like the generalization idea, and maybe as a community we converge towards a group of domains that are sufficiently far apart that it would be really damn impressive if it was able to generalize. So perhaps not as close as Protoss and Zerg, but, like, Wikipedia. Yeah, that would be a good step. And then a really good step, from StarCraft to Wikipedia and back.
Speaker 2
01:34:30
Yeah. That kind of thing.
Speaker 1
01:34:31
That also feels quite hard and far, but I think as long as you put the benchmark out, as we discovered with ImageNet, for instance, tremendous progress can be had. So maybe there's a lack of a benchmark, but I'm sure we'll find one, and the community will then work towards that. And then, beyond what AGI might mean or imply, I really am hopeful to see machine learning, or AI, just scaling up and helping people who might not have the resources to hire an assistant, or who might not even know what the weather is going to be.
Speaker 1
01:35:13
So in terms of the impact, the positive impact of AI, I think that's maybe what we should also not lose focus on. The research community building AGI, I mean, that's a really nice goal, but the way DeepMind puts it is to solve intelligence, and then use it to solve everything else, right? So I think we
Speaker 2
01:35:32
should parallelize. Yeah, we shouldn't forget about all the positive things that are already coming out of AI and are going to come out. Right.
Speaker 2
01:35:43
On that note, let me ask: relative to popular perception, do you have any worry about the existential threat of artificial intelligence, in the near or far future, that some people have?
Speaker 1
01:35:55
I think in the near future, I'm skeptical, so I hope I'm not wrong, but I'm not concerned. But I appreciate the ongoing efforts, and even a whole research field on AI safety emerging, in conferences and so on. I think that's great. In the long term, I really hope we can simply have the benefits outweigh the potential dangers.
Speaker 1
01:36:21
I am hopeful for that. But also, we must remain vigilant, to monitor and assess whether the trade-offs are there, and whether we have enough lead time to prevent or redirect our efforts if need be. But I'm quite optimistic about the technology, and definitely more fearful of other planetary-level threats at this point. But obviously this is the one I have more power over.
Speaker 1
01:36:52
So clearly I am starting to think more and more about this, and it has grown on me, actually, to start reading more about AI safety, which is a field that so far I have not really contributed to. But maybe there's something to be done there as well.
Speaker 2
01:37:07
Well, I think it's really important. You know, I talk about this with a few folks, but it's important to ask you and shove it in your head, because you're at the leading edge of what people are actually excited about in AI. I mean, the work with AlphaStar is arguably at the very cutting edge of the kind of thing that people are afraid of.
Speaker 2
01:37:27
And so you're speaking to the fact that we're actually quite far away from the kind of thing people might be afraid of, but that it's still worthwhile to think about. And it's also good that you're not as worried, and that you're open to thinking about it.
Speaker 1
01:37:45
There are two aspects: me not being worried, but obviously we should prepare for things that could go wrong, misuse of the technology, as with any technology, right? So there are always trade-offs.
Speaker 1
01:38:02
And as a society, we've solved these to some extent in the past. So I'm hoping that by having the researchers and the whole community brainstorm and come up with interesting solutions to the new things that will happen in the future, we can still also push the research down the avenue that I think is the greatest one, which is to understand intelligence, right? How are we doing what we're doing? And, you know, from a scientific standpoint, that is my personal driver for all the time I spend doing what I'm doing,
Speaker 2
01:38:38
really. Where do you see deep learning heading as a field? Where do you think the next big breakthrough might be?
Speaker 1
01:38:46
So I think deep learning, I discussed a little of this before, has to be combined with some form of discretization, program synthesis. That, as research in itself, is an interesting topic to expand and do more work on. And then, what will deep learning enable us to do in the future?
Speaker 1
01:39:08
I don't think that's going to happen this year, but there's this idea of starting not to throw away all the weights, this idea of learning to learn, of really having these agents not have to restart their weights. You could have an agent that is classifying images on ImageNet, but also generating speech if you ask it to generate some speech. And it should really be almost the same network, though it might not be a plain neural network; it might be a neural network with an optimization algorithm attached to it.
Speaker 1
01:39:45
But for this idea of generalization to new tasks, we first must define good benchmarks. Then I think that's going to be exciting. I'm not sure how close we are, but if you take a very limited domain, I think we can start making some progress. And much like we made a lot of progress in computer vision, we should start thinking differently. I really like a talk that Léon Bottou gave at ICML a few years ago, which is that this train-test paradigm should be broken.
Speaker 1
01:40:17
We should stop thinking about a training set and a test set as closed things that are untouchable. I think we should go beyond these. In meta-learning, we call these the meta-training set and the meta-test set, which is really asking: if I know about ImageNet, why would that network not work on MNIST, which is a much simpler problem? But right now it really doesn't.
Speaker 1
01:40:44
It just feels wrong, right? So on the application or benchmark side, we will probably see quite a bit more interest and progress, and hopefully people defining new and exciting challenges, really.
Speaker 2
01:41:00
Do you have any hope for, or interest in, knowledge graphs within this context? Going back to graphs, well, neural networks and graphs, but I mean a different kind of graph: knowledge graphs, sort of semantic graphs, where there are concepts.
Speaker 1
01:41:17
Yeah. So I've been quite interested in sequences first, and then in more interesting or different data structures like graphs, and I've studied graph neural networks over the last three years or so. I found these models very interesting from a deep learning standpoint. But then, what do we want?
Speaker 1
01:41:44
Why do we want these models, and why would we use them? What's the application, the killer application, of graphs, right? Perhaps if we could extract a knowledge graph from Wikipedia automatically, that would be interesting, because these graphs have a very interesting structure that is also a bit more compatible with this idea of programs and deep learning working together: jumping across neighborhoods and so on.
Speaker 1
01:42:14
You could imagine defining some primitives to move around graphs, right? So I really like the idea of a knowledge graph. And in fact, as part of the research we did for StarCraft, I thought, wouldn't it be cool to give it the graph of all the prerequisites: there are all these buildings that depend on each other, and units that have the prerequisite of being built by a certain building. This is information that the network can learn and extract, but it would have been great to think of StarCraft as a giant graph where, as the game evolves, you just start taking branches and so on.
Speaker 1
01:42:57
And we did a bit of research on this, nothing too relevant, but I really like the idea.
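An illustrative fragment of that idea: the tech tree as an explicit graph, with a tiny traversal primitive over it. The entries below are a simplified, hypothetical subset of the Protoss tree, for flavor only:

```python
# The tech tree as a graph the agent (or a program) can walk.
PREREQS = {
    "pylon": [],
    "gateway": ["pylon"],
    "cybernetics_core": ["gateway"],
    "stalker": ["gateway", "cybernetics_core"],
}

def build_order(target: str, built=None) -> list[str]:
    """A graph primitive: depth-first walk producing a valid build order."""
    built = built if built is not None else []
    for dep in PREREQS[target]:
        if dep not in built:
            build_order(dep, built)
    built.append(target)
    return built

print(build_order("stalker"))  # ['pylon', 'gateway', 'cybernetics_core', 'stalker']
```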
Speaker 2
01:43:04
And it has elements of something you've also worked on, visualizing neural networks: being able to generate knowledge representations that are human-interpretable, that maybe human experts can then tweak, or at least understand. So there are a lot of interesting aspects there. And for me personally, I'm just a huge fan of Wikipedia, and it's a shame that our neural networks aren't taking advantage of all the structured knowledge that's on the web. What's next for you? What's next for DeepMind?
Speaker 2
01:43:36
What are you excited about for AlphaStar?
Speaker 1
01:43:39
Yeah, so I think the obvious next step would be to apply AlphaStar to the other races. That would show that the algorithm works, because we wouldn't want to have created, by mistake, something in the architecture that happens to work for Protoss but not for the other races. So as verification, I think that's an obvious next step that we are working on.
Speaker 1
01:44:06
And then I would like to see... So agents and players can specialize in different skill sets that allow them to be very good. I think we've seen AlphaStar understanding very well when to take battles and when not to. It's also very good at micromanagement, moving the units around and so on, and very good at producing nonstop and trading off economy against building units.
Speaker 1
01:44:33
But I have perhaps not seen as much as I would like of the poker idea that you mentioned, right? I'm not sure AlphaStar has developed a very deep understanding of what the opponent is doing, reacting to that, and trying to trick the player into doing something else. This kind of reasoning I would like to see more of. So purely from a research standpoint, there are perhaps quite a few things still to be done there in the domain of StarCraft.
Speaker 2
01:45:06
Yeah, in the domain of games, I've seen some interesting work, even in auctions, on manipulating other players, forming a belief state and just messing with people. Yeah, it's
Speaker 1
01:45:17
called Theory of Mind. Theory of Mind, yeah.
Speaker 2
01:45:19
So it's a fascinating...
Speaker 1
01:45:21
Theory of mind and StarCraft, they're really made for each other.
Speaker 2
01:45:25
Yeah.
Speaker 1
01:45:26
So it will be very exciting to see those techniques applied to StarCraft, or perhaps StarCraft driving new techniques. As I said, there's always this tension between the two.
Speaker 2
01:45:36
Well, Oriol, thank you.