Stuart Russell: The Control Problem of Super-Intelligent AI | AI Podcast Clips

11 minutes 40 seconds

Speaker 1 (00:01)

Let's just talk about maybe the control problem. So this idea of losing the ability to control the behavior of our AI systems. So how do you see that? How do you see that coming about?

Speaker 1 (00:16)

What do you think we can do to manage it?

Speaker 2 (00:21)

Well, so it doesn't take a genius to realize that if you make something that's smarter than you, you might have a problem. You know, and Turing, Alan Turing, you know, wrote about this and gave lectures about this in 1951. He did a lecture on the radio, and he basically says, you know, once the machine thinking method starts, very quickly they'll outstrip humanity. And if we're lucky, we might be able to, I think he says, we may be able to turn off the power at strategic moments, but even so, our species would be humbled. And actually he was wrong about that, right?

Speaker 2 (01:07)

Because if it's a sufficiently intelligent machine, it's not going to let you switch it off. It's actually in competition with you.

Speaker 1 (01:14)

So what do you think is meant, just for a quick tangent, by the idea that if we shut off this superintelligent machine, our species would be humbled?

Speaker 2 (01:24)

I think he means that we would realize that we are inferior, right? That we only survive by the skin of our teeth because we happened to get to the off switch, you know, just in time. And if we hadn't, then we would have lost control over the Earth.

Speaker 1 (01:39)

So are you more worried, when you think about this stuff, about superintelligent AI, or are you more worried about super-powerful AI that's not aligned with our values? So the paperclip scenarios kind of...

Speaker 2 (01:59)

I think... So the main problem I'm working on is the control problem, the problem of machines pursuing objectives that are, as you say, not aligned with human objectives. And this has been the way we've thought about AI since the beginning: you build a machine for optimizing, and then you put in some objective, and it optimizes.

Speaker 2 (02:31)

And we can think of this as the King Midas problem. Because King Midas put in this objective, everything I touch should turn to gold, and the gods, that's like the machine, they said, okay, done. You now have this power. And of course his food and his drink and his family all turned to gold, and then he dies of misery and starvation. And this is, you know, it's a warning. It's a failure mode: pretty much every culture in history has had some story along the same lines.

Speaker 2 (03:07)

You know, there's the genie that gives you three wishes, and the third wish is always, you know, please undo the first two wishes because I messed up. And, you know, when Arthur Samuel wrote his checker-playing program, which learned to play checkers considerably better than Arthur Samuel could play, and actually reached a pretty decent standard, Norbert Wiener, who was one of the major mathematicians of the 20th century, sort of the father of modern automation and control systems, he saw this and he basically extrapolated, as Turing did, and said, okay, this is how we could lose control. And specifically that we have to be certain that the purpose we put into the machine is the purpose which we really desire. And the problem is, we can't do that.

Speaker 1 (04:10)

You mean it's very difficult to encode, that putting our values on paper is really difficult, or are you just saying it's impossible?

Speaker 3 (04:21)

The line is gray between the two?

Speaker 2 (04:22)

So theoretically it's possible, but in practice it's extremely unlikely that we could specify correctly in advance the full range of concerns of humanity.

Speaker 1 (04:37)

You've talked about cultural transmission of values; I think that's how human-to-human transmission of values happens, right?

Speaker 2 (04:44)

Well, we learn, yeah. I mean, as we grow up, we learn about the values that matter, how things should go, what is reasonable to pursue and what isn't reasonable to pursue.

Speaker 1 (04:56)

I think machines can learn in the same kind of way.

Speaker 2 (04:58)

Yeah, so I think that what we need to do is to get away from this idea that you build an optimizing machine and then you put the objective into it. Because if it's possible that you might put in a wrong objective, and we already know this is possible because it's happened lots of times, right, that means that the machine should never take an objective that's given as gospel truth.

Speaker 2 (05:27)

Because once it takes the objective as gospel truth, then it believes that whatever actions it's taking in pursuit of that objective are the correct things to do. So you could be jumping up and down and saying, you know, no, no, no, you're going to destroy the world. But the machine knows what the true objective is and is pursuing it, and tough luck to you. And this is not restricted to AI, right?

Speaker 2 (05:54)

This is true, I think, of many of the 20th-century technologies, right? So in statistics, you minimize a loss function. The loss function is exogenously specified. In control theory, you minimize a cost function.

Speaker 2 (06:06)

In operations research, you maximize a reward function, and so on. So in all these disciplines, this is how we conceive of the problem. And it's the wrong problem, because we cannot specify with certainty the correct objective, right? We need uncertainty, we need the machine to be uncertain about what it is that it's supposed to be maximizing.
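
To make this "standard model" concrete, here is a minimal Python sketch, not from the conversation: the scenario, the proxy objective, and the "true" utility function are all invented for illustration, in the spirit of the King Midas story above. The machine is handed a fixed objective and optimizes it literally; nothing in the loop ever consults what we actually care about.

```python
# Toy sketch of the fixed-objective "standard model" (illustrative only).
# The proxy objective and the "true" human utility below are made up;
# the point is that the optimizer only ever sees the proxy.

import math

def proxy_objective(gold_fraction: float) -> float:
    # What we told the machine to maximize: how much of the world is gold.
    return gold_fraction

def true_human_utility(gold_fraction: float) -> float:
    # What we actually care about: some gold is nice, but we also need food.
    food = 1.0 - gold_fraction
    return 0.3 * gold_fraction + math.log(food + 1e-9)  # collapses as food -> 0

# The machine searches its action space for whatever scores highest on the
# stated objective; the true utility never enters the optimization.
actions = [i / 1000 for i in range(1001)]
best = max(actions, key=proxy_objective)

print(f"chosen action:      {best:.2f}")                      # 1.00: turn everything to gold
print(f"proxy objective:    {proxy_objective(best):.2f}")     # looks perfect
print(f"true human utility: {true_human_utility(best):.2f}")  # ruinous
```

The stated objective comes out looking perfect while the true utility collapses, which is exactly the failure mode that follows from treating an exogenously specified objective as gospel truth.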

Speaker 3 (06:31)

A favorite idea of yours, I've heard you say somewhere, well, I shouldn't pick favorites, but it just sounds beautiful: we need to teach machines humility. Yeah, I mean, that's a beautiful way to put it.

Speaker 2 (06:45)

I love it. That they're humble, in that they know that they don't know what it is they're supposed to be doing. And that those objectives, I mean, they exist.

Speaker 2 (06:57)

They're within us, but we may not be able to explicate them. We may not even know how we want our future to go. And a machine that's uncertain is going to be deferential to us. So if we say, don't do that, well, now the machine's learned something a bit more about our true objectives, because something that it thought was reasonable in pursuit of our objective turns out not to be, so now it's learned something.

Speaker 2 (07:31)

So it's going to defer, because it wants to be doing what we really want. And that point, I think, is absolutely central to solving the control problem. And it's a different kind of AI. When you take away this idea that the objective is known, then in fact a lot of the theoretical frameworks that we're so familiar with, you know, Markov decision processes, goal-based planning, you know, standard game tree search, all of these techniques actually become inapplicable. And you get a more complicated problem, because now the interaction with the human becomes part of the problem.

Speaker 2 (08:26)

Because the human, by making choices, is giving you more information about the true objective and that information helps you achieve the objective better. And so that really means that you're mostly dealing with game theoretic problems where you've got the machine and the human and they're coupled together, rather than a machine going off by itself with a fixed objective.
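
As a small numeric illustration of why uncertainty about the objective produces deference, here is a toy calculation in the spirit of the off-switch analysis from Russell's group. The belief distribution, the payoff numbers, and the assumption of a rational human who knows the true utility are all simplifying assumptions made up for this sketch.

```python
# Toy "off-switch" calculation (illustrative numbers only).
# The robot is unsure whether the action it is considering is good or bad
# for the human, and compares acting, switching itself off, and deferring.

# Robot's belief about the human's utility U of the proposed action: P(U = u).
belief = {+1.0: 0.6, -2.0: 0.4}

def expectation(f):
    return sum(p * f(u) for u, p in belief.items())

# Option 1: act now, ignoring the human.
act = expectation(lambda u: u)              # 0.6*1.0 + 0.4*(-2.0) = -0.2

# Option 2: switch itself off (utility 0 by convention).
switch_off = 0.0

# Option 3: defer -- propose the action and let the human allow it (if U > 0)
# or shut the robot down (if U <= 0), assuming the human knows U and is rational.
defer = expectation(lambda u: max(u, 0.0))  # 0.6*1.0 + 0.4*0.0 = 0.6

print(f"act now:        {act:+.2f}")
print(f"switch off:     {switch_off:+.2f}")
print(f"defer to human: {defer:+.2f}")      # deference wins while the robot is uncertain
```

Deferring comes out ahead precisely because the machine is uncertain: if it were sure the action was good, asking would add nothing, and if it treated its given objective as certainly correct, it would have no incentive to leave the off switch in human hands.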

Speaker 3 (08:52)

Which is fascinating, on the machine and the human level, that when you don't have an objective, it means you're together coming up with an objective. I mean, there's a lot of philosophy that, you know, you could argue that life doesn't really have meaning. We together agree on what gives it meaning, and we kind of culturally create things that give us a sense of why the heck we are on this Earth anyway.

Speaker 3 (09:19)

We together as a society create that meaning, and you have to learn that objective. And one of the biggest, I thought that's where you were gonna go for a second. One of the biggest troubles we run into, outside of statistics and machine learning and AI, in just human civilization, is when you look at, I came from, I was born in the Soviet Union, and in the history of the 20th century we ran into the most trouble, us humans, when there was a certainty about the objective, and you do whatever it takes to achieve that objective, whether you're talking about Germany or communist Russia.

Speaker 2 (09:57)

You get into trouble with humans. And I would say with corporations. In fact, some people argue that we don't have to look forward to a time when AI systems take over the world. They already have, and they're called corporations.

Speaker 2 (10:11)

Corporations happen to be using people as components right now, but they are effectively algorithmic machines, and they're optimizing an objective, which is quarterly profit, that isn't aligned with the overall well-being of the human race, and they are destroying the world. They are primarily responsible for our inability to tackle climate change. So I think that's one way of thinking about what's going on with corporations. But I think the point you're making is valid, that there are many systems in the real world where we've sort of prematurely fixed on the objective and then decoupled the machine from those that it's supposed to be serving.

Speaker 2 (11:02)

And I think you see this with government. Right? Government is supposed to be a machine that serves people, but instead it tends to be taken over by people who have their own objective and use government to optimize that objective regardless of what people want.
