人工智能的真相（2023）（2）

The Truth about AI 2 of 3

Part 3 April 16, 2024, 10:40 a.m.

好的欢迎参加圣诞讲座
OK, well, welcome to the Christmas lectures.
下面我给你一个任务
Now, I'm going to give you a task.
会给你一些指示
I'm going to give you some instructions.
我只需要你听从指示好吗
And I just want you to follow those instructions, OK?
你腿脚灵活吗跑步好吗 -是的
Are you nimble on your feet? Are you a good runner? - Yes.
非常好那你会很适合做这件事
OK, excellent. Then you're going to be perfect for this.
我们设定了五个铃
Now, what we've done is we've placed five bells
就在阶梯教室四周
around the lecture theatre.
这一个这一个
One there. One there.
这一个大家看我的腿脚很灵活
One there. Very nimble on my feet, as you can see.
这一个
One there.
最后这一个
And finally, one there.
我们要做的就是给你
And what we're going to do is we're going to give you
十秒时间
ten seconds,
我希望听到尽可能多次的铃声
and I want to hear as many bell rings as possible.
明白吗 -是的明白
Got it? - Yes. Yeah.
好了接下来要做的
OK, all right, what we're going to do is
我们将倒数三个数
we're going to count you down from three,
然后我们会说"开始" 好吗
and then we're going to say go, all right? Got it?
三二一
Three, two, one.
开始 -开始
Go! - Go!
快快快妮维雅快跑
Go, go, go! - Go, Nivea! Run!
[铃响] 让我们听到铃声
[Bell rings] Let's hear those bells!
快加速
Come on! Pick up the pace!
一停
One, stop!
铃响了几次
OK, how many bell rings?
请回到中间我觉得是五次
Come back to the middle. We heard five, I think.
次数正确吗各位
Was that about right, everybody?
对我们听到五次铃响
Yeah? We heard five bell rings.
很棒是不是
That was really, really good, OK?
非常感谢你可以回到座位了
Thank you so much for that. You can go back to your seat.
下面我们要进行同样的挑战
Now we're going to carry out the same challenge,
但用人工智能
but with artificial intelligence.
请掌声欢迎皇家科学院
So please welcome the Royal Institution's
特别嘉宾响铃机器人
special bell-ringing robot!
来吧响铃机器人
Come on, bell-ringing robot.
把那个给我真是的
Give me that. Honestly.
好
OK!
下面给你十秒时间
Now, I'm going to give you ten seconds.
我要听到尽可能的铃响次数
I want to hear as many bell rings as possible.
现在我们给机器人倒数
So let's count down the robot.
三二一开始 -开始
Three, two, one, go! - Go!
停
Stop!
有人数清刚才铃响次数了吗
Did anybody manage to keep track of how many that was?
好机器人谢谢你
OK, robot, thank you very much.
还给你
There you go.
其实我们的志愿者
So, our volunteer,
什么也没做错
you didn't do anything wrong at all.
因为你是人类
Because you're a human being.
我给你指示
I gave you some instructions
而你尝试去理解我想让你做什么
and you tried to interpret what I wanted you to do.
我希望在指示中暗示的是
And what I wanted in my instructions
让某人围着阶梯教室跑
is to hear somebody running around the lecture theatre
按响那些铃
pressing each of those bells.
但实际上那并不完全是我给你的指示
But actually, those weren't exactly the instructions I gave you.
我刚才说的只是我想听到
What I said is, I just wanted to hear
尽可能多的铃声
as many bell rings as possible.
而机器人就是理解字面意思
And our robot took the instructions literally.
而机器人获得奖励最快的方式
And the quickest way the robot could get a reward
就是站在这不断敲响铃声
was just to stand there and bang that bell.
人工智能找到了最大化其奖励的方法
The AI found a way to maximise its rewards
却没有按我要求的方法去做
without doing what I wanted it to do.
我们使用强化学习时设定奖励的方法
The way we set up rewards when we use reinforcement learning
至关重要
is really important
因为有时我们设定的奖励
because sometimes we can set up rewards
会让人工智能发现最大化奖励的方法
so that the AI discovers a way to maximise its rewards
就是不按我们希望它使用的方法去做事
without doing what we wanted it to do.
也正是刚才那个机器人所做的
And that's what the robot did there.
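The gap between the reward as stated and the behaviour that was intended can be sketched in a few lines of Python. This is only an illustration, not anything shown in the lecture: the timings (`RING_TIME`, `RUN_TIME`), the two policy names and the `intended_reward` function are all assumptions made up for the example.

```python
# Hypothetical timings, chosen only for illustration.
TIME_LIMIT = 10.0   # seconds allowed, as in the demonstration
RING_TIME = 0.5     # assumed seconds needed to ring a bell once
RUN_TIME = 1.5      # assumed seconds needed to run to the next bell
NUM_BELLS = 5       # five bells placed around the lecture theatre


def simulate(policy):
    """Return the list of bell indices rung within the time limit."""
    rings, clock, bell = [], 0.0, 0
    while True:
        clock += RING_TIME
        if clock > TIME_LIMIT:
            break
        rings.append(bell)
        if policy == "run_circuit":
            # Move on to the next bell, which costs running time.
            clock += RUN_TIME
            bell = (bell + 1) % NUM_BELLS
        # Under "stay_put" the agent keeps ringing the same bell.
    return rings


def stated_reward(rings):
    """Reward as literally stated: as many bell rings as possible."""
    return len(rings)


def intended_reward(rings):
    """Reward as (hypothetically) intended: distinct bells visited."""
    return len(set(rings))


policies = ["stay_put", "run_circuit"]
best_stated = max(policies, key=lambda p: stated_reward(simulate(p)))
best_intended = max(policies, key=lambda p: intended_reward(simulate(p)))

print(stated_reward(simulate("stay_put")))     # 20
print(stated_reward(simulate("run_circuit")))  # 5
print(best_stated, best_intended)              # stay_put run_circuit
```

Under these assumed numbers, an agent that maximises the stated reward stands still and rings one bell twenty times, while the intended behaviour only wins once the reward is rewritten to count distinct bells.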
我们现在来看一个实际例子
And now what we're going to see is a real example.
我们来看屏幕上播放的视频
So let's have a look at this video on the screen.
在2014年一个叫DeepMind的人工智能公司
In 2014, the AI company DeepMind
训练了一个人工智能程序来玩这个游戏
trained an AI program to play this game.
这是上世纪七十年代一款叫《打砖块》的游戏
It's a 1970s video game called Breakout,
这个程序使用了强化学习进行训练
and it uses reinforcement learning.
所以它玩的越多就玩得越好
The more it plays, the better it gets.
你们可以看到一开始大部分时间
Now, at the beginning, most of the time, as you'll see,
都没有碰到
it's just missing.
它能碰到球完全靠运气
And if it manages to hit the ball, it's just really pure chance.
但是它每打掉一块砖
But every time it knocks a brick out,
就会得一分
it gets a point.
经过一段时间的训练它现在从不失球
Now, after a bit more training, watch, it never misses.
它每次都能准确地把球打回去
It's reliably hitting that ball back every single time.
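The idea that a program improves purely from points scored can be illustrated with a toy example. The sketch below is not DeepMind's method (their Breakout agent used a deep neural network); it is a minimal tabular value-learning loop on a made-up task in which a paddle must be placed under a ball arriving in one of three columns. `POSITIONS`, `EPSILON`, `ALPHA` and all function names are assumptions for the illustration.

```python
import random

POSITIONS = 3   # hypothetical: the ball arrives in one of 3 columns
EPSILON = 0.2   # assumed probability of trying a random move
ALPHA = 0.5     # assumed learning rate


def train(episodes, seed=0):
    """Learn a value table q[ball_column][paddle_column] by trial and error."""
    rng = random.Random(seed)
    q = [[0.0] * POSITIONS for _ in range(POSITIONS)]
    for _ in range(episodes):
        ball = rng.randrange(POSITIONS)
        if rng.random() < EPSILON:
            paddle = rng.randrange(POSITIONS)     # explore: random move
        else:
            paddle = q[ball].index(max(q[ball]))  # exploit: best known move
        reward = 1.0 if paddle == ball else 0.0   # one point for a hit
        # Nudge the estimate for this move towards the reward received.
        q[ball][paddle] += ALPHA * (reward - q[ball][paddle])
    return q


def greedy_policy(q):
    """For each ball column, the paddle column with the highest value."""
    return [row.index(max(row)) for row in q]


def hit_rate(q):
    """Fraction of ball columns for which the greedy move is a hit."""
    policy = greedy_policy(q)
    return sum(policy[b] == b for b in range(POSITIONS)) / POSITIONS


untrained = train(0)
trained = train(3000)
print(hit_rate(untrained), hit_rate(trained))
```

With no training the table is all zeros, so the paddle always goes to the first column and hits only by luck. After a few thousand plays the table has learned, from the points alone, to put the paddle where the ball is.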
你们可能会认为这也许就是
And you might think that's actually
最好的玩法了吧
about as good as it's going to get.
每次都能打到球
It's hitting the ball every single time.
训练得更久一些
But look what happened when they trained it
看看会发生什么
just a little bit longer.
看左边出现了什么状况
Look at the left there. Look what goes on.
它完全靠自己发现了
It discovered completely on its own
最大化得分的方法是在墙壁的一边钻一个洞
that the way to maximise its score is to drill a hole
然后让球在墙壁上方来回弹跳
down the side of the wall and bounce the ball above.
游戏对于我们来说很好玩
So games are fun for us to play,
对于人工智能来说却是重要的测试场
but they're also a great proving ground for AI
因为游戏对于人工智能来说提供了巨大的挑战
because they can provide big challenges for AI
却不会造成任何人员伤害或物品损坏
without the possibility of hurting anybody or damaging anything.
我们和制作这段视频的DeepMind公司的CEO
And, earlier, we spoke to the CEO of DeepMind,
德米斯·哈萨比斯交谈过
the company behind that video, Demis Hassabis.
你们可以看到一开始
You can see at the beginning, the AI, by the way,
人工智能试图聚焦到他的脸上
is trying to focus on his image
做的并不是非常好
and not doing a terribly good job of it.
我问过他关于游戏
I asked him about the importance of games
对于人工智能的重要性
for artificial intelligence.
游戏和人工智能有一段很长的共同历史
Games and AI have always had a long history together.
事实上如果我们退回到发明人工智能这一领域的
Actually, if you go all the way back to Turing and Shannon,
图灵和香农的时代
who sort of invented the field of AI,
他们都是从国际象棋这类游戏开始研究的
they all started off with things like chess programs,
例如他们试图弄明白
and trying to figure out how could a machine,
如何让机器下好国际象棋
for example, play chess well?
在DeepMind 我们用游戏
And we use games at DeepMind as a testing ground
作为人工智能和算法的测试场
for our AI ideas and algorithmic ideas
因为你需要一个明确的度量来比较它们
because you need a clear metric to measure them against.
当然游戏通常有输赢条件和
And, of course, games usually have scores
可以优化的得分策略
that you can optimise or a win-loss condition,
你可以非常清楚的知道
you know, so you can track very
你正在取得进展
clearly if you're making progress.
我们再进一步看看玩游戏的程序
So let's take a closer look at game-playing programs.
玩法最简单的一种游戏是
OK. And one of the easiest games to play
圈叉游戏
is noughts and crosses.
你们谁会玩圈叉游戏
Who plays noughts and crosses?
你们都会玩圈叉游戏吗
Can you all play noughts and crosses?
好的在美国
OK. In the United States,
他们不叫圈叉游戏他们叫过三关
they don't call it noughts and crosses, they call it tic-tac-toe.
因为美国是人工智能巨头
And because the US is so big in AI,
所以在人工智能领域我们叫它过三关
we have to call it tic-tac-toe in AI as well.
我需要一名志愿者
I need a volunteer.
你们谁最会玩过三关
Who's really good at playing tic-tac-toe?
好的后面那位倒数第二个
OK, you at the back. The second from the end.
快过来吧
Come on down!
你好到这儿来
Hello. Come here.
你叫什么名字 -艾米
What's your name? - Emmy.