艾米吗 -是的 -好的艾米 感谢你的参与
Emmy? - Yes. - OK, thanks for joining us, Emmy.
我想要你过来站在...
What I'd like you to do is come and stand over...
大概就站在这儿
Just stand about here.
好的艾米 我们要让你来玩过三关
OK. Now, Emmy, we are going to get you to play tic-tac-toe
现场对战皇家学院的超级计算机
live against the Royal Institution supercomputer.
稍微向后站一点点
Just stand back a tiny little bit.
这台过三关超级计算机叫布拉得里克
And this tic-tac-toe supercomputer is called BrodeRick.
我们将要做的是
What we're going to do is
我们要让你对战
we're going to get you to play against
这台完全未训练过的布拉得里克
a completely untrained BrodeRick.
布拉得里克画叉叉
BrodeRick is going to be the crosses player.
而你画圈圈 你要下的时候
You're going to be the noughts player. And you are going to move your piece
只需要按其中一个按钮就可以了
just by pressing one of those buttons.
布拉得里克先手
So BrodeRick's moved first.
你看布拉得里克下在哪里
You see where BrodeRick's moved?
现在你可以下了
And now you can play one of your pieces.
好的 下在中间
OK, moving to the middle.
漂亮
Smart move.
布拉得里克下了
BrodeRick moves.
好的 该你了
OK, you need to respond.
好的 又该布拉得里克了
OK, BrodeRick's going to respond again.
又该你了
You get to go again.
你已经赢了
And you've won!
干得漂亮
Well done!
转过来面对观众
Turn and face the audience!
接受大家的鼓掌
Take the applause.
掌声总会意外地到来
You never know when it's going to come in life.
好的 干得不错
OK, so, well done.
你击败了布拉得里克
You beat BrodeRick.
但现在的布拉得里克还不知道怎么玩这个游戏
But BrodeRick hasn't got a clue how to play this game.
所以我们现在训练它
So what we're now going to do is we're going to train BrodeRick
玩过三关
to play tic-tac-toe,
我们用的是前面提到过的
and we're going to train it using the technique
跟凯特琳和弗蕾亚一起见识过的
that we've been talking about, the same technique that we saw
强化学习的技术
with Caitlynn and Freya - reinforcement learning.
这一次我们将告诉布拉得里克
So, this time, what we're going to do is tell BrodeRick
自己训练自己 只需要
to train itself, and we're going to do that
将选手个数设置为零
by selecting number of players - zero.
你可以设置选手个数为零吗
So do you want to select number of players - zero?
这是人类选手的个数 零个
And this is number of human players - zero.
布拉得里克将和自己对战
BrodeRick is just going to play against itself.
别走开艾米 因为一会儿你将和
Now, stay on, Emmy, because you're going to be playing
训练好的布拉得里克对战
the trained model in a moment.
我们现在看到布拉得里克
And what we're seeing now is BrodeRick
开始和自己对弈
starting to play itself.
循环对弈越来越多的局数
And it's cycling through more and more games.
每赢一局它都会得到一个奖励
And every time it wins a game, it gets a reward.
每输一局它就会得到一个惩罚
Every time it loses, it gets a punishment.
当它得到奖励时
When it gets a reward,
它更有可能再次按同样的方法下棋
that makes it more likely to play those same moves again.
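The reward-and-punishment loop described here can be sketched in a few lines of Python. This is only an illustration, not the program running on BrodeRick: the weight table, the starting weight of 5.0, the step of 1.0, the floor of 0.1, and the names `train`, `choose` and `winner` are all assumptions made for the example.

```python
import random
from collections import defaultdict

# The eight winning lines on a 3x3 grid, squares numbered 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if that player has completed a line, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def choose(weights, board, rng):
    """Pick an empty square at random, favouring squares with higher weight."""
    moves = [i for i, cell in enumerate(board) if cell == " "]
    return rng.choices(moves, weights=[weights[(board, m)] for m in moves])[0]

def train(games=20000, seed=0):
    """Self-play with zero human players: both sides share one weight table."""
    rng = random.Random(seed)
    weights = defaultdict(lambda: 5.0)  # assumed: every move starts equally likely
    for _ in range(games):
        board, player = " " * 9, "X"
        history = {"X": [], "O": []}
        while winner(board) is None and " " in board:
            move = choose(weights, board, rng)
            history[player].append((board, move))
            board = board[:move] + player + board[move + 1:]
            player = "O" if player == "X" else "X"
        champ = winner(board)
        if champ is not None:  # a draw leaves the weights unchanged
            loser = "O" if champ == "X" else "X"
            for key in history[champ]:   # reward: winning moves become more likely
                weights[key] += 1.0
            for key in history[loser]:   # punishment: losing moves become less likely
                weights[key] = max(0.1, weights[key] - 1.0)
    return weights
```

Calling `train()` plays 20,000 games against itself, the figure quoted in the demo. How close the result gets to perfect play depends on the assumed step sizes.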
我们看到屏幕上显示出很多棋局
And we're seeing lots of games coming up on the screen,
但我们看到的只是布拉得里克
but we're only seeing a tiny fraction of the total number
正在下的棋局总数的很小一部分
of games that BrodeRick is playing.
布拉得里克完成训练
BrodeRick, training complete!
在我们看着它的时候
It's played 20,000 games of tic-tac-toe
它已经下了两万局过三关
while we've been watching it.
布拉得里克训练好了
Well, BrodeRick's trained.
但是布拉得里克你有自信吗
But, BrodeRick, how confident are you feeling?
布拉得里克看起来非常自信
BrodeRick, it seems, is feeling very confident indeed.
艾米 你有自信吗
Emmy, how confident are you feeling?
我现在有点怕了 -你现在很怕吗
I'm quite scared now. - You're quite scared now?
一点都不用怕 艾米
There's nothing to be scared of, Emmy.
相同的步骤
Exactly the same procedure.
艾米 看你的了
Emmy, over to you.
我们训练了布拉得里克
So we've trained BrodeRick.
布拉得里克应该变得更强了
BrodeRick should be better.
让我们拭目以待
Let's see if that's the case.
好
OK.
下的好
Well spotted.
漂亮
OK.
是一个平局
So it's a draw.
你感觉怎么样艾米 -还不坏
How do you feel about that, Emmy? - Not that bad.
一点也不坏
It's not bad at all,
因为布拉得里克现在基本上是一个完美的过三关选手
because BrodeRick is basically a perfect tic-tac-toe player.
如果你有两个完美的过三关选手
And if you have two perfect tic-tac-toe players
就像布拉得里克和艾米对弈那样
like BrodeRick and Emmy playing against each other,
他们将会下成平局
that's what they're going to get - they're going to get a draw.
现在布拉得里克不能保证每次都赢
Now, BrodeRick can't guarantee to win every time it plays,
但它再也不会输了
but it's never going to lose any more.
这跟我们先前看到的未经训练时
It's completely different to the untrained model
完全不一样了
that we saw earlier on.
事实上我得告诉你们 藏在表面之下的
OK. Now, in fact, under the hood, I have to tell you,
根本不是一台超级计算机
it is not a supercomputer at all.
演示团队的丹将为我们揭开
Dan from the demo team is going to show us the truth
布拉得里克的真面目
about BrodeRick.
布拉得里克的真面目
And the truth about BrodeRick is that the computer
实际上是
is, in fact, this.
这个非常简单的计算机
It's a very basic computer.
一台价值三十磅的计算机
It's a £30 computer.
这就是学习如何玩过三关所需要的全部了
That's all you need to learn how to play tic-tac-toe.
艾米 感谢你下来
So, Emmy, thank you for coming down
和布拉得里克对弈
and playing against BrodeRick.
大家鼓掌 也感谢布拉得里克
[Applause] And thank you, BrodeRick.
过三关是一个简单的游戏
Tic-tac-toe is a simple game.
它之所以简单 最重要的原因
And one of the most important reasons it's simple
如下
is the following.
在过三关的任意一步
The average number of moves that you can make
你可以下的平均可能步数大约为四
at any point in tic-tac-toe is around about four.
先手玩家有九种走法
You start with nine possible moves for the crosses player,
下一轮玩家有八种走法 以此类推
then the next player has eight possible moves and so on.
你每次可选的下棋位置
On average, you've got about four possible moves
平均下来有四个
available to you.
我们把这个平均数称为游戏的"分支因子"
And we call that the branching factor of the game.
但即使分支因子为四
But even a branching factor of four
也意味着我们有将近两万种
means that there are close to 20,000 different ways
填满井字棋方格的玩法
that we can fill in a tic-tac-toe grid.
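The figures quoted here can be checked with a short Python sketch. It rests on two assumptions about where the numbers come from: that "close to 20,000" counts every way of marking each of the nine squares as empty, cross or nought (3^9), and that "about four" is the geometric mean of the nine move counts 9, 8, ..., 1. The function names are made up for the example. Counting complete move sequences is a different and larger number, which the last function works out by brute force.

```python
import math

# The eight winning lines on a 3x3 grid, squares numbered 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def has_line(board):
    """True if either player has three in a row."""
    return any(board[a] != " " and board[a] == board[b] == board[c]
               for a, b, c in LINES)

def count_games(board=None, player="X"):
    """Count every distinct sequence of moves from this position to a finished game."""
    if board is None:
        board = [" "] * 9
    if has_line(board) or " " not in board:
        return 1  # the game is over: one finished sequence
    total = 0
    for square in range(9):
        if board[square] == " ":
            board[square] = player
            total += count_games(board, "O" if player == "X" else "X")
            board[square] = " "  # undo the move before trying the next one
    return total

grid_markings = 3 ** 9                           # 19683 ways to mark nine squares
mean_branching = math.factorial(9) ** (1 / 9)    # roughly 4.15
print(grid_markings, round(mean_branching, 2), count_games())
```

The brute-force count comes to 255,168 complete games, which is still small enough for a very basic computer to explore in full.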
我们可以把它和国际象棋比较一下
Let's compare that to the game of chess.
这是一个有趣得多也难得多的游戏
This is a much more interesting and much more difficult game.
它更有趣困难的一个原因
And one of the reasons that it's much more interesting
就是它的分支因子更大
and difficult is that the branching factor is larger.
大约是三十五
It's around about 35.
记住 这意味着
And, remember, what that means
在棋盘上的任意位置
is that from any position on the board,
平均来看
on average -
不是每次都有这么多 而是平均来看
not for every possible move, but on average -
你的下一步有大约三十五种走法
you've got about 35 possible moves available.
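To see why a larger branching factor matters so much, here is a rough Python sketch. It treats the branching factor as constant at every move, which is a simplification, and the function name `sequences` is made up for the example.

```python
def sequences(branching_factor, depth):
    """Approximate number of move sequences after `depth` moves,
    assuming the same number of choices at every move."""
    return branching_factor ** depth

# Compare a tic-tac-toe-like game (about 4 choices per move)
# with a chess-like game (about 35 choices per move).
for depth in (2, 4, 6):
    print(depth, sequences(4, depth), sequences(35, depth))
```

After four moves the first game has 256 sequences to consider and the second has 1,500,625, and the gap keeps widening with every further move.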
让我们再来看一个游戏
Now let's have a look at another game.
这是围棋
This is the game of Go.
它起源于中国古代
It originated in ancient China,
但如今在亚洲地区依然很受欢迎
but it's still hugely popular today in Asia.
围棋的规则是
The aim in the game of Go
你执黑子或白子
is you're playing your black stones or white stones.
你要努力用你的棋子
And what you want to try and do is to cover as much
占据棋盘上尽可能多的位置
of the board as possible with your stones
包围你对手的棋子
and to surround your opponent's stones on the board.
你和你的对手轮流下子
And you just take it in turns to place your stones.
这是个规则很简单的游戏
So it's a very simple game.
实际上只有三条规则
It only really has three rules.
但围棋的分支因子
But the branching factor of Go makes it phenomenally hard
让它对人类来说极其难玩
for human beings to play,