Dario Amodei — “We are near the end of the exponential”

Published 2026-02-13 17:11:03

Summary

Dario Amodei thinks we are just a few years away from “a country of geniuses in a data center”. In this episode, we discuss what to ...


Bilingual transcript (English / Chinese)

So we talked three years ago. I'm curious, in your view, what has been the biggest update of the last three years? What has been the biggest difference between how things felt three years ago versus now? Yeah, I would say actually the underlying technology, the exponential of the technology, has gone broadly speaking about as I expected it to go. I mean, there's plus or minus a year here, plus or minus a year there.
三年前我们有过一次交流。我很好奇,在你看来,这三年来最大的变化是什么?是什么让过去三年与现在有如此大的不同?其实,我认为底层技术的发展,大体上符合我的预期。虽然在时间上可能有一年的差异,但技术的快速增长基本是在我预料之中的。

I don't know that I predicted the specific direction of code, but actually, when I look at the exponential, it is roughly what I expected in terms of the march of the models from, you know, smart high school student to smart college student to beginning to do PhD and professional stuff, and in the case of code, reaching beyond that. So the frontier is a little bit uneven, but it's roughly what I expected. I will tell you, though, what the most surprising thing has been.
我不确定自己是否预测到了代码发展的具体方向,但实际上,当我看代码的指数增长时,大体上符合我的预期。这就像从聪明的高中生到聪明的大学生,再到开始做博士研究和专业工作的过程,而在代码方面,还超出了这一点。所以,你知道,这个发展前沿有点不均衡,但大体上符合我的预期。不过,我可以告诉你,最让我感到惊讶的事情是什么。

The most surprising thing has been the lack of public recognition of how close we are to the end of the exponential. To me, it is absolutely wild that you have people, within the bubble and outside the bubble, talking about the same tired old hot-button political issues while, all around us, we're near the end of the exponential. I want to understand what that exponential looks like right now, because the first question I asked you when we recorded three years ago was, you know, what's up with scaling, how does it work?
最令人惊讶的是,公众似乎没有意识到我们已经接近指数增长的终点。对我来说,这真的很疯狂。你知道,在圈子内外的人们仍在讨论那些陈旧的政治热点问题,而实际上我们正处在指数增长的末期。我想了解现在这个指数增长到底是什么样子的,因为三年前我们录制时,我问你的第一个问题就是关于扩展的问题,它是如何运作的。

I have a similar question now, but I feel like it's a more complicated question, because, at least from the public's point of view, three years ago there were these well-known public trends where, across many orders of magnitude of compute, you can see how the loss improves. And now we have RL scaling, and there's no publicly known scaling law for it. It's not even clear what exactly the story is: is this supposed to be teaching the model skills, or is this supposed to be teaching meta-learning?
我现在有一个类似的问题,但我感觉这个问题更复杂。至少从公众的角度来看,的确,三年前有一些众所周知的公共趋势,跨越许多数量级的计算量,人们可以看到损失是如何改善的。而现在我们有了强化学习(RL)的规模化,但还没有公开的扩展规律可以参考。目前还不清楚具体情况,比如这到底是在教模型技能,还是在教元学习。
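The "well-known public trends" referred to here are the pre-training scaling laws, usually modeled as a power law of loss in compute. A minimal sketch in Python, with made-up constants (real scaling-law papers fit these to measured training runs):

```python
def pretraining_loss(compute, a=10.0, b=0.05, irreducible=1.7):
    """Toy power-law scaling curve: L(C) = L_irr + a * C**(-b).

    The constants a, b, and irreducible are invented for illustration;
    they are not fitted to any real model family.
    """
    return irreducible + a * compute ** (-b)

# Loss falls smoothly as compute spans many orders of magnitude,
# approaching (but never reaching) the irreducible floor.
losses = [pretraining_loss(10.0 ** k) for k in range(3, 16, 3)]
assert all(hi > lo for hi, lo in zip(losses, losses[1:]))
assert losses[-1] > 1.7
```

The point of the functional form is that improvement is predictable but slow in absolute compute: each constant decrease in loss costs a multiplicative increase in compute.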

What is the scaling hypothesis at this point? Yeah, so I actually have the same hypothesis that I had all the way back in 2017. So in 2017, I think I talked about it last time, I wrote a doc called the Big Blob of Compute hypothesis. And it wasn't about the scaling of language models in particular. When I wrote it, GPT-1 had just come out, right? So that was one among many things, right?
在这一点上,什么是扩展假说?是的,我实际上持有与2017年时相同的假说。我想我上次也谈到了这一点,那时我写了一篇文档,叫做“大规模计算假说”。其实,那时候写的内容并不是专门针对语言模型的扩展。当时,GPT-1刚刚面世,所以只是众多事物中的一个而已,对吧?

Back in those days, there was robotics. People tried to work on reasoning as a separate thing from language models. There was scaling of the kind of RL that happened in AlphaGo and with Dota at OpenAI. And people remember StarCraft at DeepMind, the AlphaStar work. So it was written as a more general document. And the specific thing I said was the following.
在那些日子里,有机器人技术。人们尝试将推理作为独立于语言模型的一个部分来进行研究。当时也有类似于AlphaGo和OpenAI在Dota中所发生的那种强化学习的规模扩展,人们也记得DeepMind在星际争霸中的AlphaStar。所以,这些内容是以更通用的方式记录下来的。我具体提到的是以下内容。

And, you know, it's very similar to what Sutton put out as the bitter lesson a couple of years later. But the hypothesis is basically the same. So what it says is: all the cleverness, all the techniques, all the "we need a new method to do something like that" thinking, doesn't matter very much. There are only a few things that matter. And I think I listed seven of them. One is how much raw compute you have. The other is the quantity of data that you have.
而且,你知道,这和Sutton几年后提出的"苦涩的教训"(The Bitter Lesson)非常相似。但基本的假设其实是一样的。它的意思是,所有的聪明才智,所有的技术,还有那种"我们需要新方法来做某事"的想法,其实并不太重要,重要的东西其实只有几个。我想我列出了其中的七个。其中之一是你有多少原始计算能力,另一个是你拥有的数据量。

Then the third is kind of the quality and distribution of data, right? It needs to be a broad, broad distribution of data. The fourth is, I think, how long you train for. The fifth is you need an objective function that can scale to the moon. So the pre-training objective function is one such objective function, right? Another objective function is, you know, the kind of RL objective function that says like you have a goal, you're going to go out and reach the goal.
第三点是数据的质量和分布,对吧?数据的分布需要非常广泛。第四点是训练的时间长短。我认为训练时间也很重要。第五点是你需要一个可以无限扩展的目标函数。预训练目标函数就是这样的一个目标函数,对吧?还有另一种目标函数是强化学习中的目标函数,这种目标函数意味着你有一个目标,并且会努力去实现这个目标。

Within that, of course, there are objective rewards, like you see in math and coding. And there are more subjective rewards, like you see in RL from human feedback, or higher-order versions of that. And then the sixth and seventh were things around normalization or conditioning: getting the numerical stability so that the big blob of compute flows in this laminar way instead of running into problems.
在这个过程中,当然会有一些客观的奖励,比如像数学和编程中看到的那样。而也有更多主观的奖励,比如你在来自人类反馈的强化学习(RL)中看到的,或者是这一类更高层次的奖励。然后,第六和第七点涉及到的是关于归一化或条件化之类的事情,比如,确保数值上的稳定性,以便让大量计算能够顺畅地进行,而不是遇到问题。
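The fifth factor, an objective function that "can scale to the moon," is exemplified by pre-training's next-token cross-entropy. A minimal sketch of that objective on a toy vocabulary (the distributions below are invented for illustration):

```python
import math

def next_token_loss(probs, target_index):
    """Cross-entropy for a single next-token prediction: -log p(target).

    probs is a toy probability distribution over a tiny vocabulary;
    an actual pre-training objective averages this quantity over
    trillions of tokens.
    """
    return -math.log(probs[target_index])

# A confident, correct prediction scores far better than a uniform guess,
# and the objective keeps rewarding improvement at any scale.
confident = next_token_loss([0.9, 0.05, 0.05], 0)
uniform = next_token_loss([1 / 3, 1 / 3, 1 / 3], 0)
assert confident < uniform
```

What makes this objective scalable in the sense described is that it never saturates: any additional predictive skill the model gains shows up as lower loss.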

So that was the hypothesis. And that hypothesis I still hold. I don't think I've seen very much that is not in line with it. And so the pre-training scaling laws were one example of what we see there. And indeed, those have continued going. I think now it's been widely reported that we feel good about pre-training. Pre-training is continuing to give us gains.
所以这就是我的假设。作为一个假设,我仍然坚持这个看法。我认为我没有看到太多与这个假设不符的情况。因此,预训练的规模规律就是我们所观察到的一个例子。而且,这些规律确实在继续发展。现在,大家普遍认为我们对预训练感到满意,因为预训练不断为我们带来收益。

What has changed is that now we're also seeing the same thing for RL, right? So we're seeing a pre-training phase and then an RL phase on top of that. And with RL, it's actually just the same. Even other companies have published things in some of their releases that say: look, we train the model on math contests, AIME or other things like that, and how well the model does is log-linear in how long we've trained it. And we see that as well. And it's not just math contests. It's a wide variety of RL tasks. And so we're seeing the same scaling in RL that we saw for pre-training.
变化在于,现在我们也在强化学习(RL)中看到类似的情况,对吧?我们看到有一个预训练阶段,然后在此基础上还有一个强化学习阶段。在强化学习中,实际上情况是相同的。很多公司在他们的一些发布中提到过,我们在数学竞赛,比如AIME,或其他类型的比赛上训练模型,而且模型性能与训练时间呈对数线性关系。我们也观察到了这一现象。而且这不仅限于数学内容,还包括各种各样的强化学习任务。因此,我们在强化学习中也看到了和预训练相同的扩展趋势。
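The log-linear relationship described here, benchmark performance rising linearly with each order of magnitude of RL training, can be sketched as follows. The slope and intercept are hypothetical, not taken from any published run:

```python
import math

def rl_success_rate(train_steps, slope=0.08, intercept=0.1):
    """Toy log-linear RL scaling: success rises linearly in log10(steps).

    slope and intercept are illustrative assumptions; the result is
    clipped to [0, 1] because it represents a task success rate.
    """
    raw = intercept + slope * math.log10(train_steps)
    return min(1.0, max(0.0, raw))

# In the unclipped regime, each 10x of RL training buys a roughly
# constant increment in success rate.
rates = [rl_success_rate(10 ** k) for k in range(1, 6)]
deltas = [b - a for a, b in zip(rates, rates[1:])]
assert all(abs(d - deltas[0]) < 1e-9 for d in deltas)
```

The practical implication is the same as for pre-training: steady gains, but each increment costs a multiplicative increase in training.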

You mentioned Richard Sutton and the bitter lesson. Yeah. I interviewed him last year. And he is actually very non-LLM-pilled. I don't know if this is his perspective, but one way to paraphrase the objection is something like: look, something which possesses the true core of human learning would not require all these billions of dollars of data and compute and these bespoke environments to learn how to use Excel or how to use PowerPoint, how to navigate a web browser.
你提到了Richard Sutton和"The Bitter Lesson"(苦涩的教训)。是的,我去年采访过他。他实际上并不是特别相信大语言模型(LLM)的路线。如果让我来转述这种反对意见,其中一个方式可以是这样:真正掌握人类学习核心的事物,不需要耗费数十亿美元的数据和计算资源,也不需要专门的环境,就能学会使用Excel、PowerPoint或者怎样浏览网页。

And the fact that we have to build in these skills using these RL environments hints that we're actually lacking this core human learning algorithm. And so we're scaling the wrong thing. And so, yeah, that is a question: why are we doing all this RL scaling if we do think there's something that's going to be human-like and able to learn on the fly? Yeah. So I think this kind of puts together several things that should be thought of differently.
事实上,我们需要在这些强化学习(RL)环境中构建这些技能,这表明我们其实缺乏核心的人类学习算法。所以我们在扩展的方向上可能是错误的。这确实引发了一个问题:如果我们认为有某种东西会像人类一样,并且能够灵活应对变化,那我们为什么还要如此大规模地进行强化学习呢?我觉得这段话把几个不同的想法混在了一起,其实它们应该被区别对待。

Yeah. I think there is a genuine puzzle here, but it may not matter. In fact, I would guess it probably doesn't matter. So let's take the RL out of it for a second, because I actually think it's a red herring to say that RL is any different from pre-training in this matter. So if we look at pre-training scaling, it was very interesting. Back in 2017, when Alec Radford was doing GPT-1, if you look at the models before GPT-1, they were trained on these datasets that didn't represent a wide distribution of text. You had these very standard language modeling benchmarks.
是的。我认为这里确实存在一个问题,但这可能并不重要。实际上,我猜测这大概并不重要。让我们先把强化学习 (RL) 放在一边,因为我认为在这个问题上,强化学习其实和预训练并没有区别。所以如果我们看看预训练的扩展,这非常有趣。早在2017年,当Alec Radford在做GPT-1时,如果你看看在GPT-1之前的模型,它们是基于一些并不代表广泛文本分布的数据集进行训练的。那时候,大家用的都是一些非常标准的语言建模基准。

GPT-1 itself was trained on a bunch of, I think it was fan fiction, actually. But it was literary text, which is a very small fraction of the text that you get. And what we found with that, and in those days it was like a billion words or something, so small datasets representing a pretty narrow distribution of what you can see in the world, is that it didn't generalize well. If you did better on, I forget what it was, but some kind of fan fiction corpus, it wouldn't generalize that well to other text.
GPT-1 本身是在一堆文本上训练的,我记得其中很多是同人小说。这些文本文体很像文学作品,但只占全部文本中的一小部分。当时我们发现,在那些日子里,训练数据集大约只有十亿个单词左右,所以数据集较小,而且文本分布范围很窄,与现实世界中可以看到的多样性相去甚远。结果是,它的泛化能力不好。如果在某个我忘记具体名字的同人小说语料库上表现得更好,它在其他方面的泛化效果就不佳。

We had all these measures of how well a model does at predicting all of these other kinds of text. You really didn't see the generalization. It was only when you trained over all the tasks on the internet, when you did a general internet scrape from something like Common Crawl, or scraped links from Reddit, which is what we did for GPT-2, that you started to get generalization. And I think we're seeing the same thing in RL: we're starting with very simple RL tasks, like training on math competitions, then moving to broader training that involves things like code as a task.
我们有很多衡量指标来判断模型在预测其他各种文本时表现如何。但是,你并没有看到广泛的泛化效果。只有当你在互联网上的所有任务上进行训练,比如从Common Crawl进行全面的网络抓取,或者抓取Reddit上的外部链接(这正是我们在GPT-2中所做的),才能开始看到泛化效果。我认为在强化学习(RL)中我们也看到了类似的情况:我们首先从非常简单的任务开始,比如在数学竞赛上进行训练,然后逐渐转向更广泛的训练,包括把编程等作为任务的一部分。

And now we're moving to do many other tasks. And then I think we're going to increasingly get generalization. So that kind of takes out the RL versus the pre-training side of it. But I think there is a puzzle here either way, which is that on pre-training, when we train the model on pre-training, we use trillions of tokens. And humans don't see trillions of words. So there is an actual sample efficiency difference here. There is actually something different that's happening here, which is that the model starts from scratch and they have to get much more training.
现在我们正在转向执行许多其他任务。我认为我们会越来越多地实现泛化。这就淡化了强化学习与预训练之间的区别。但无论如何,在这里有一个问题,就是在进行预训练时,我们用数万亿个词汇来训练模型,而人类并不会接触到这么多的词汇。在这一点上,确实存在实际的样本效率差异。这里发生了一些不同的事情,那就是模型从零开始,需要进行更多的训练。

But we also see that once they're trained, if we give them a long context length, the only thing blocking a long context length is like inference. But if we give them a context length of a million, they're very good at learning and adapting within that context length. And so I don't know the full answer to this. But I think there's something going on that pre-training, it's not like the process of humans learning. It's somewhere between the process of humans learning and the process of human evolution.
但我们也发现,一旦它们经过训练,如果我们给它们一个较长的上下文长度,唯一限制这种长度的因素就是推理过程。如果我们给它们一个一百万的上下文长度,它们在这个范围内非常擅长学习和适应。因此,我不完全清楚这其中的原理,但我认为这和人类的学习过程不太一样。这种预训练的过程介于人类学习和人类进化之间。

It's somewhere in between: we get many of our priors from evolution, our brain isn't just a blank slate, right? Whole books have been written about this. I think the language models are much more blank slate. They literally start as random weights, whereas the human brain starts with all these regions, connected to all these inputs and outputs. And so maybe we should think of pre-training, and for that matter RL as well, as something that exists in the middle space between human evolution and human on-the-spot learning. And the in-context learning that the models do is something between long-term human learning and short-term human learning. So there's this hierarchy: there's evolution, there's long-term learning, there's short-term learning, and there's just human reaction.
我们的大脑并不是一张白纸,而是通过进化获得了许多先天的知识。关于这个话题,有很多书籍专门研究。我认为,语言模型更像是一张白纸,因为它们一开始是随机的参数,而人类大脑则一开始就有各种连接和功能区域。因此,我们可以将预训练(以及强化学习)看作是介于人类进化和即时学习之间的一个过程。而模型在上下文中的学习,可以看作是介于人类的长期学习和短期学习之间的一个过程。这样一来,就形成了一个层次结构:进化、长期学习、短期学习和人类的即时反应。

And the LLM phases exist along this spectrum, but not necessarily exactly at the same points. There's no analog to some of the human modes of learning. The LLMs kind of fall between the points. Does that make sense? Yes, although some things are still a bit confusing. For example, if the analogy is that this is like evolution, so it's fine that it's not that sample efficient, then, well, if we're going to get the kind of super sample-efficient agent from in-context learning, why are we bothering to build these skills in? You know, there are RL environment companies, and it seems like what they're doing is teaching it: how do you use this API, how do you use Slack, how do you use whatever. It's confusing to me why there's so much emphasis on that.
LLM的这些阶段存在于这个范围内,但不一定正好在相同的点上。有些人类的学习模式并没有对应的类比。LLM有点像是落在点与点之间。这样说清楚吗?是的,虽然有些地方还是有点困惑。比如,如果这个类比是像进化一样,那么样本效率不高也是正常的。但是,如果我们要通过上下文学习获得超级样本效率的智能体,那我们为什么还要专门构建这些技能呢?你知道,现在有些RL环境公司似乎把重点放在教模型如何使用这些API、Slack,或者其他工具上。我不太明白为什么要如此重视这些。

If the kind of agent that can just learn on the fly is emerging, or is going to emerge soon, or has already arrived. Yeah, so I mean, I can't speak for the emphasis of anyone else. I can only talk about how we think about it. I think the way we think about it is: the goal is not to teach the model every possible skill within RL, just as we don't do that within pre-training, right? Within pre-training, we're not trying to expose the model to every possible way that words could be put together. It's rather that the model trains on a lot of things and then reaches generalization across pre-training, right?
如果那种能随时学习新东西的智能体正在出现、即将出现或已经出现。那么,我只能谈谈我们是怎么看待这个问题的。我们的想法是目标并不是在强化学习(RL)中教模型所有可能的技能,就像我们在预训练中不这样做一样。我们在预训练中并不是试图让模型接触到所有可能的词组组合方式。相反,模型通过大量的训练实现了广泛的泛化,对吧?

That was the transition from GPT-1 to GPT-2 that I saw up close, where the model reaches a point. I had these moments where I was like, oh yeah, you just give the model a list of numbers, like, this is the cost of the house, this is the square feet of the house, and the model completes the pattern and does linear regression. Not great, but it does it, and it's never seen that exact thing before. To the extent that we are building these RL environments, the goal is very similar to what was done five or ten years ago with pre-training, where we're trying to get a whole bunch of data, not because we want to cover a specific document or a specific skill, but because we want to generalize.
这就是我亲眼见证从GPT-1到GPT-2的过渡过程,这就像模型达到了一个新的高度。我有过这样的时刻,比如,你只需向模型提供一组数字,比如房屋的价格和面积,模型就能完成模式匹配并执行线性回归。虽然不算完美,但它能做到,而之前从没见过这种具体情况。在我们构建这些强化学习环境时,目标与五到十年前的预训练非常相似,我们试图获取大量数据,不是为了掌握某个特定文档或技能,而是为了实现更好的泛化能力。
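The pattern the model was completing in-context is ordinary simple linear regression. A self-contained sketch of the closed-form fit, with house numbers made up to mirror the anecdote:

```python
# Hypothetical data mirroring the GPT-2 anecdote:
# (square feet, price in thousands of dollars). Values are invented.
sqft = [1000.0, 1500.0, 2000.0, 2500.0]
price = [200.0, 290.0, 410.0, 490.0]

# Closed-form ordinary least squares for a single feature.
n = len(sqft)
mean_x = sum(sqft) / n
mean_y = sum(price) / n
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(sqft, price))
    / sum((x - mean_x) ** 2 for x in sqft)
)
intercept = mean_y - slope * mean_x

# "Completing the pattern" for an unseen house, as the model did in context.
predicted = intercept + slope * 3000.0
assert predicted > max(price)  # larger houses extrapolate to higher prices
```

The point of the anecdote is that the model approximates this computation from the raw number list alone, without ever having been shown this exact dataset.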

I mean, I think the framework you're laying down obviously makes sense, like we're making progress toward AGI. I think the crux is something like: nobody at this point disagrees that we're going to achieve AGI this century. And the crux is, you say we're hitting the end of the exponential, and somebody else looks at this and says, oh yeah, we're making progress, we've been making progress since 2012, and then by 2030 or 2035 we'll have a human-like agent.
我的意思是,我认为你提出的框架显然是合理的,我们正在通过这一AGI(通用人工智能)取得进展。我认为关键点在于,现在没有人会不同意我们将在本世纪实现AGI。关键在于,你说我们正接近指数增长的末尾,而有人则会看待这一过程并说,哦对,我们一直在进步,自2012年以来就开始了这种进步,到2030-35年,我们会有一个类人的智能代理。

And so I want to understand what it is that you're seeing which makes you think that what's happening in these models is like the kinds of things that evolution did, or like within-lifetime human learning. And why think that it's one year away and not ten years away? I actually think there are kind of two cases to be made here, two claims you could make, one of which is stronger and the other of which is weaker. So starting with the weaker claim: when I first saw the scaling back in like 2019, I wasn't sure. This was kind of a 50-50 thing, right?
所以,我想了解你所观察到的东西,这让你觉得显然我们正在看到类似于进化或在人类生命周期内学习的模型。为什么认为这是仅一年的差距,而不是十年?我实际上认为有两种情况可以说明这里的问题,有两个说法可以提出,一个更强,一个更弱。先从较弱的说法说起,当我第一次在2019年看到这种发展趋势时,我不太确定。当时所有这一切有点像是五五开的情况,对吗?

I thought I saw something, and my claim was: this is much more likely than anyone thinks it is, like, this is wild, no one else would even consider this. Maybe there's a 50% chance this happens. On the basic hypothesis of, as you put it, within ten years we'll get to what I call a country of geniuses in a data center, I'm at like 90% on that. I mean, it's hard to go much higher than 90% because the world is so unpredictable. Maybe the irreducible uncertainty is what would stop you at 95%, where you get to things like, I don't know, maybe multiple companies have internal turmoil and nothing happens.
我以为我看到了某种情况,并且我认为这个情况发生的可能性比任何人预想的都要高。这个想法非常大胆,别人可能根本不会考虑这种可能性。我觉得这件事有50%的可能性会发生。按照你所说的基本假设,即在十年内,我们将达到我称之为“数据中心的天才之国”的状态。我对此的信心大约是90%。因为世界变化无常,所以很难进一步提高这个概率。如果提高到95%,可能会因为一些无法避免的不确定性造成,比如可能有多家公司发生内部动荡,导致什么都没发生。

And then Taiwan gets invaded and all the fabs get blown up by missiles. Yeah, you could construct a scenario where there's like a 5% chance of that, or you can construct a 5% world where things get delayed for 10 years. That's maybe 5%. There's another 5%, which is that I'm very confident on tasks that can be verified. So I think with coding, except for that irreducible uncertainty, I think we'll be there in one or two years. There's no way we will not be there in 10 years in terms of being able to do end-to-end coding.
然后台湾遭到入侵,所有的芯片工厂都被导弹炸毁,你就开始尝试寻找解决办法。是的,你可以假设一个情景,认为有5%的可能性会发生这种情况,或者你可以设想一个5%的世界,这种情况下事情会被拖延10年。那可能也就是5%的可能性。还有另外5%的情况是,我对可以验证的任务非常有信心。所以我认为在编程方面,除了那些无法消除的不确定性,我觉得我们在一两年内就能实现目标。在能够实现端到端编程方面,10年内不可能做不到。

My one little bit of fundamental uncertainty, even on long time scales, is this thing about tasks that aren't verifiable: planning a mission to Mars, making some fundamental scientific discovery like CRISPR, writing a novel. It's hard to verify those tasks. I am almost certain that we have a reliable path to get there, but if there's a little bit of uncertainty, it's there.
我唯一的一点小小的不确定性,即使在很长的时间尺度上,也是无法验证的一些任务,比如策划火星任务,比如进行像CRISPR这样基础性的科学发现,再比如写一本小说,这些任务很难验证。我几乎可以确定我们有可靠的方法可以实现这些目标,但如果有一点不确定性,就在这些地方。

So on the 10 years, I'm at like 90%, which is about as certain as you can be. I think it's crazy to say that this won't happen by 2035. In some sane world, that view would be outside the mainstream. But the emphasis on verification hints to me at a lack of belief that these models will generalize. If you think about humans, you're good both at things for which we get verifiable reward and at things for which we don't. It's like, you have a desire.
在关于未来十年的预测中,我有大约90%的信心,这几乎是我所能达到的最高确定性。我真的觉得,到2035年这件事情不会发生的说法是疯狂的。在一个理智的世界中,这种看法应该是非常偏离主流的。然而,对于验证的重视让我觉得,人们可能对这些模型的泛化能力缺乏信心。想想人类,我们不仅擅长那些能得到明确回报的事情,还擅长那些没有明确回报的事情。你可以说,我们都有自己内心的渴望。

No, this is why I'm almost sure. We already see substantial generalization from tasks that are verifiable to tasks that aren't. But it seems like you were emphasizing this as a spectrum, where it remains to be seen how far the generalization goes. And I'm like, that doesn't seem like how humans get better. The world in which we don't make it, the world in which we don't get there, is the world in which we do
不,这就是我几乎可以肯定的原因。我们已经看到,从经过验证的事物到未验证事物的显著泛化,我们已经在观察到了。但你似乎在强调这是一种将分裂开的趋势,还需要看到更多的进展。而我觉得,这似乎不是人类变得更好的方式。无论是我们无法成功的世界,还是我们无法达成目标的世界,都是我们取得成功的世界。

We do all the things that are verifiable, and then many of them generalize, but we don't fully get there. We don't fully color in that side of the box. It's not a binary thing. But it also seems to me that even in the world where generalization is weak and you only get the verifiable domains, it's not clear to me that in such a world you could automate software engineering, because in some sense you are, quote unquote, a software engineer.
我们做的都是可以验证的事情。然后,他们中的很多人会进行泛化,但我们似乎没有完全做到这一点。我们没有,我们没有,我们没有完全填满这个盒子的那一边。这不是一个非此即彼的事情。但在我看来,即使在泛化能力较弱的世界中,只关注可验证的领域,我仍然不清楚在这样的世界里能否实现软件工程自动化,因为在某种意义上,你可以说是一个“软件工程师”。

Yeah, but part of being a software engineer, for you, involves writing these long memos about your grand vision for different things. And so, I don't think that's part of the job of a SWE. That's part of the job of the company. But I do think SWE work involves design documents and other things like that.
是的,但作为软件工程师的一部分工作是写那些关于你对各种事物的宏伟愿景的长篇备忘录。我并不认为这些是软件工程师工作的内容,那是公司工作的内容。但我认为软件工程师的工作确实包含设计文档和其他类似的东西。

Um, which, by the way, the models are not bad at. They're already pretty good at writing comments. And so, again, I'm making much weaker claims here than I believe, to kind of distinguish between two things. Like, we're already almost there for software engineering. We are already almost there by what metric?
嗯,顺便说一下,这些模型并不差。它们在写评论方面已经相当不错了。所以,我在这里表达的观点比我实际相信的要谨慎得多,这样做是为了区分两件事。就软件工程而言,我们已经接近目标了。那我们接近目标的标准是什么呢?

There's one metric, which is how many lines of code are written by AI. And if you consider other productivity improvements over the course of the history of software engineering, compilers write all the lines of software. But there's a difference between how many lines are written and how big the productivity improvement is. Oh, yeah. And then "we're almost there" meaning: how big is the productivity improvement? Not just how many lines are written.
有一个衡量标准是由 AI 写了多少行代码。如果你考察软件工程历史上的其他生产力提升,就会发现编译器实际上写了所有的软件代码。但是,写了多少行代码和生产力提升的幅度之间是有区别的。没错,我们几乎已经达到了这一点。我指的不只是写了多少行代码,而是生产力提升的幅度有多大。

Yeah. So I actually agree with you on this. I've made this series of predictions on code and software engineering, and I think people have repeatedly kind of misunderstood them. So let me lay out the spectrum, right? I think it was, you know, eight or nine months ago or something, I said AI models will be writing 90% of the lines of code in like three to six months, which happened, at least at some places, right? It happened at Anthropic, and it happened with many people downstream using our models.
是的,是的。所以我其实,我其实同意你的看法。我做过一系列关于代码和软件工程的预测,我觉得人们一直有些误解。所以让我来阐述一下这个过程。大约在八九个月前,我曾说过,他们的AI模型将在三到六个月内编写90%的代码行,这至少在某些地方已经发生了。很多下游用户使用我们的模型时也确实发生了这种情况。

But that's actually a very weak criterion, right? People thought I was saying we won't need 90% of the software engineers. Those things are worlds apart, right? I would put the spectrum as: 90% of code is written by the model; 100% of code is written by the model. And that's a big difference in productivity.
但是,但是那实际上是一个非常弱的标准,对吧?人们以为我在说我们将不再需要90%的软件工程师。这两者是天差地别的,对吧?我更倾向于这样看:90%的代码由模型编写,或者100%的代码由模型编写。这在生产力上是一个巨大的差异。

Um, then: 90% of end-to-end SWE tasks, right? Including things like compiling, setting up clusters and environments, testing features, writing memos. 90% of the SWE tasks are done by the models; 100% of today's SWE tasks are done by the models. And even when those happen, it doesn't mean software engineers are out of a job; there are new, higher-level things they can do, where they can manage. And then there's further down the spectrum, like, there's 90% less demand for SWEs, which I think will happen. But this is a spectrum. And, you know, I wrote about it in The Adolescence of Technology, where I went through this kind of spectrum with farming. And so I actually totally agree with you on that.
嗯,然后是大约90%的端到端软件工程任务,对吧?包括编译,设置集群和环境,测试功能,撰写备忘录等等。这些任务的90%由模型完成;再到今天的软件工程任务100%由模型完成。但即便如此,这并不意味着软件工程师就会失业。他们依然可以从事更高级别的工作,例如进行管理。再往后,软件工程的需求可能会降低90%,我认为这会发生。但这其实是一个渐进过程。我在《技术的青春期》(The Adolescence of Technology)一文中用农业的例子讲过这种渐进过程。因此,我完全同意你的看法。

It's just that these are very different benchmarks from each other, but we're proceeding through them super fast. It seems like in part of your vision, going from 90 to 100 is going to happen fast, and that somehow leads to huge productivity improvements. Whereas I notice that even in greenfield projects, if people start with Claude Code or something, people report starting a lot of projects, and I'm like, do we see in the world out there a renaissance of software, all these new features that wouldn't exist otherwise? And at least so far, it doesn't seem like we see that. And so that does make me wonder: even if I never had to intervene on Claude Code, there's this thing where the world is complicated, jobs are complicated, and even if we close the loop on self-contained systems, whether it's just writing software or something, how much broader gains would we see just from that?
这只是因为这些基准彼此差异很大,但我们正以极快的速度推进。在您的愿景中,从90到100的转变首先会很快发生,并且这会在某种程度上导致生产力的巨大提升。然而,我注意到即使在全新项目中,如果人们从Claude Code等工具开始,很多人会报告启动了很多项目。我会想,我们是否在外面的世界中看到了一场软件的复兴,即所有这些本不会存在的新功能?至少到目前为止,还未见明显迹象。这让我思考:即使我从未需要在Claude Code上进行干预,世界仍然复杂,工作仍然复杂,那么即便闭合了自包含系统的循环,无论是仅仅写软件还是其他事情,我们能从中看到多大的广泛收益呢?

And so maybe that would dilute our estimate of the country of geniuses. Well, I simultaneously agree with you that this is a reason why these things don't happen instantly, but at the same time, I think the effect is going to be very fast. So you could have these two poles, right? One is: AI is not going to make progress, it's slow, it's going to take forever to diffuse within the economy. Economic diffusion has become one of these buzzwords, a reason why we're not going to make AI progress or why AI progress doesn't matter.
也许这会削弱我们对这个天才国家的估计。其实,我同时赞同你的观点,认为这就是为什么这些事情不会立刻发生的原因。但同时,我也觉得影响会非常迅速。所以,就像你可以有这两种观点:一种是,人工智能不会迅速进步,它的扩散会很慢,可能要花很长时间才能在经济中普及。经济扩散已经成为人们用来解释为什么我们不会在人工智能上取得进展或者为什么人工智能的进展无关紧要的流行词之一。

And the other axis is: we'll get recursive self-improvement, the whole thing, can't you just draw an exponential line on the curve? We're going to have Dyson spheres around the sun so many nanoseconds after we get recursive self-improvement. I mean, I'm completely caricaturing the view here. But there are these two extremes. And what we've seen from the beginning, at least if you look within Anthropic, is this bizarre 10x-per-year growth in revenue, right? So in 2023, it was like zero to a hundred million. 2024, it was a hundred million to a billion. 2025, it was a billion to like nine or ten billion.
另一个极端就像是我们将获得递归自我改进,整个事情,难道你就不能在曲线上画一条指数增长的线吗?就是说,我们将会有围绕太阳的戴森球,在我们获得递归自我改进之后的那么多纳秒……我的意思是,我在这里完全是用一个夸张的观点来描述。但确实存在这两种极端。但从一开始我们就看到了,至少如果你看Anthropic内部,会有一种奇异的每年增长10倍的收入增长情况,对吧?所以,2023年是从零到一亿。2024年是一亿到十亿。2025年则是从十亿到大约九十亿或一百亿。

And then you get to just about a billion dollars a month on products, you could just have a clean 10x. And in the first month of this year, you would think that exponential would slow down, but we added another few billion to revenue in January. And so, obviously, that curve can't go on forever, right? The GDP is only so large. I would even guess that it bends somewhat this year. But that is a fast curve, right? That's a really fast curve.
然后你的产品收入大约达到十亿美元的规模,正好是一个干净的10倍增长。在今年的第一个月,这种指数级增长本应该放缓,但事实上,我们在一月份又增加了几十亿美元的收入。所以,显然,这种增长曲线不可能一直持续下去,对吧?毕竟国内生产总值(GDP)是有限的。我甚至猜测这条曲线在今年可能会有所弯曲。但这确实是一条非常快速的增长曲线,对吧?真的是很快的增长。

And I would bet it stays pretty fast, even as the scale goes to the entire economy. So I think we should be thinking about this middle world where things are extremely fast, but not instant, where they take time because of economic diffusion, because of the need to close the loop, because it's this fiddly: oh man, I have to do change management within my enterprise. I set this up, but I have to change the security permissions on it in order to make it actually work. Or I had this old piece of software that checks the model before it's compiled and released.
我相信即使把范围扩大到整个经济体,速度依然会保持相当快。因此,我认为我们应该关注这样的一个中间状态:事情进行得非常快,但不是瞬间完成,而是需要时间,因为经济扩散、闭环的必要性等原因。这就像是需要在企业内部进行调整管理,比如我设置好了一些东西,但还需要修改安全权限才能真正运作,或者我有一个旧软件,它在模型编译和发布前需要进行检查等等。

And I have to rewrite it. And yes, the model can do that, but I have to tell the model to do that. And that takes time.
我需要重写它。是的,模型可以做到这一点,但我必须告诉模型去做。而且,这需要时间。

And so I think everything we've seen so far is compatible with the idea that there's one fast exponential, which is the capability of the model, and then there's another fast exponential downstream of that, which is the diffusion of the model into the economy: not instant, not slow, much faster than any previous technology, but it has its limits.
所以,我认为到目前为止我们所看到的一切都与这样的观点相符:模型本身有一个快速的指数增长;然后在此基础上,还有另一个快速的指数增长,这是模型向经济中扩散的过程。这种扩散不是瞬间完成的,也不算太慢,比以往任何技术都要快得多,但它仍然有其局限性。

And this is what we see: when I look inside Anthropic, when I look at our customers, fast adoption, but not infinitely fast.
这个,这个,这就是我们,你知道的,当我观察我们内部的时候,当我看到我们的客户时,会发现他们的采用速度很快,但并不是无限快。

Um, can I try a hot take on you? Yeah. I feel like diffusion is a cope that people use when the model isn't able to do something; they're like, oh, but it's a diffusion issue.
嗯,我可以跟你分享一个大胆的看法吗?是这样的,我觉得“扩散”这个词被人们用来解释某些情况,比如当模型无法实现某个功能时,他们就会说,哦,那是扩散的问题。

But then you should use the comparison to humans. You would think that the inherent advantages that AIs have would make diffusion a much easier problem for new AIs getting onboarded than for new humans getting onboarded. An AI can read your entire Slack and your Drive in minutes. They can share all the knowledge that other copies of the same instance have. You don't have this adverse selection problem when you're hiring AIs: you just hire copies of a vetted AI model. Hiring a human is so much more hassle. And people hire humans all the time, right? We pay humans upwards of 50 trillion dollars in wages because they're useful, even though, in principle, it would be much easier to integrate AIs into the economy than it is to hire humans.
但是你应该用人类进行比较。你可能会认为,人工智能的固有优势会使得在新AI的引入过程中,扩展和适应问题比新员工加入公司时要容易得多。比如,一个AI可以在几分钟内读取你整个Slack和云盘的内容。它们可以共享其他相同实例的副本所拥有的所有知识,因此在雇佣AI时不存在不利选择问题,因为雇主通常只会雇佣经过验证的AI模型的副本。而雇佣人类却麻烦得多。然而,人们仍然不断雇佣人类,每年支付超过50万亿美元的薪水,因为人类很有用。尽管理论上将AI融入经济比雇佣人类要容易得多。

I think, like, the diffusion... I don't know. I think diffusion is very real, and it doesn't exclusively have to do with limitations of the AI models. Again, there are people who use diffusion as kind of a buzzword to say this isn't a big deal. I'm not talking about that. I'm not saying AI will diffuse at the speed that previous technologies did. I think AI will diffuse much faster than previous technologies have, but not infinitely fast.
我觉得"扩散"这个概念让我有点摸不着头脑。我认为扩散确实是真实存在的。而且它并不仅仅与AI模型的限制有关。有些人用"扩散"这个词作为流行语,表明这不是什么大问题。但我不是在说这个。我也不是在说AI会像以前的技术一样以相同的速度扩散。我认为AI的扩散速度会比以往的技术要快得多,但也不会快到无限。

So I'll just give an example of this, right? Like Claude Code. Claude Code is extremely easy to set up. You know, if you're a developer, you can kind of just start using Claude Code. There is no reason why a developer at a large enterprise should not be adopting Claude Code as quickly as an individual developer at a startup. And we do everything we can to promote it, right? We sell Claude Code to enterprises, big enterprises, like big financial companies, big pharmaceutical companies. All of them are adopting Claude Code much faster than enterprises typically adopt new technology, right?
所以,我来举个例子吧:像Claude Code这样的东西。Claude Code非常容易设置。如果你是开发者,你可以很快上手使用Claude Code。对于一家大型企业的开发者来说,没有理由不尽快采用Claude Code,就像一家初创公司的独立开发者一样。我们尽一切努力去推广它,我们向企业和大型公司出售Claude Code,比如大型金融公司、大型制药公司等。它们采用Claude Code的速度远快于企业通常采用新技术的速度。

But again, it takes time. Any given feature or any given product, like Claude Code or like Cowork, will get adopted by the individual developers who are on Twitter all the time, by the Series A startups, many months faster than it will get adopted by, like, a large enterprise that does food sales. There are a number of factors: you have to go through legal, you have to provision it for everyone, it has to pass security and compliance. The leaders of the company, who are further away from the AI revolution, are forward looking, but they have to say, oh, it makes sense for us to spend 50 million.
但是,这个过程确实需要时间。任何一项新功能或新产品,比如编程工具Claude Code或团队协作工具Cowork,通常会首先被那些活跃在社交媒体上的个体开发者或者初创公司(比如A轮融资的初创公司)所采用,而比起这些,像从事食品销售的大型企业则需要更长的时间来适应和采用。这其中涉及多个因素,例如需要通过法律审批,为每个人准备好设施,满足安全和合规要求等等。尽管公司领导者对技术革新持前瞻性态度,但他们需要仔细评估是否值得花费五千万资金。

This is what this Claude Code thing is. This is why it helps our company. This is why it makes us more productive. And then they have to explain to the people two levels below, and they have to say, okay, we have 3,000 developers; here's how we're going to roll it out to our developers. And we have conversations like this every day. You know, we are doing everything we can to make Anthropic's revenue grow 20 or 30X a year instead of 10X a year. And again, many enterprises are just saying, this is so productive, we're going to take shortcuts on our usual procurement process, right?
这就是Claude Code这个东西。这就是它为什么对我们的公司有帮助。这就是为什么它让我们更高效。然后,他们必须向下属两级的人解释。他们需要说,好,我们有3000名开发人员,下面是我们将如何向这些开发人员推广这个东西。我们每天都有这样的对话,比如,我们正在尽一切努力让Anthropic的收入每年增长20倍或30倍,而不是10倍。并且,也有很多企业在说,这个真的很高效,因此我们决定在通常的采购流程上走捷径。

They're moving much faster than when we tried to sell them just the ordinary API, which many of them use, but Claude Code is a more compelling product. But it's not an infinitely compelling product. And I don't think even AGI, or powerful AI, or a country of geniuses in the data center, will be an infinitely compelling product. It will be a compelling product, enough maybe to get 3X or 5X or 10X a year growth even when you're in the hundreds of billions of dollars, which is extremely hard to do and has never been done in history before, but not infinitely fast. I buy that there would be a slight slowdown. And maybe this is not your claim. But sometimes people talk about this like, oh, the capabilities are there, and but for diffusion, we're basically at AGI. And I don't believe we're basically at AGI.
他们发展得快多了,你知道,在我们尝试向他们出售普通API时,他们中的许多人都使用这些API,但Claude Code是一个更具吸引力的产品。不过,这并不是一个无比吸引人的产品。而且我认为,即便是AGI(通用人工智能)或强大的AI,或者数据中心里的天才国度,也不会是一个无比吸引人的产品。它可能足够吸引人,以致于让年增长率达到3倍、5倍甚至10倍,即使在数千亿美元的规模下,但不会无限快速地增长。我认为,这可能会有一些放缓。也许这不是你的观点,但有时人们会这样说:哦,能力已经在那里了,只是因为扩散的关系。要不然,仿佛我们已经达到了AGI。而我并不认为我们实际上已经达到了AGI。

I think if you had the country of geniuses in the data center, if your company did adopt the country of geniuses in the data center, we would know it. Right? We would know it. If you had the country of geniuses in the data center, everyone in this room would know it. Everyone in Washington would know it. You know, people in rural parts might not know it, but we would know it. We don't have that now; that's very clear. As Dario was getting at, to get generalization you need to train across a wide variety of realistic tasks and environments. For example, with a sales agent, the hardest part isn't teaching it to mash buttons in a specific database in Salesforce. It's training the agent's judgment across ambiguous situations. How do you sort through a database with thousands of leads to figure out which ones are hot? How do you actually reach out? What do you do when you get ghosted?
我觉得,如果你公司的数据中心里有一群天才,大家都会知道的。对吧,我们都会知道。如果数据中心拥有一群天才,就算是房间里的每一个人都会知道,华盛顿的人也会知道,可能偏远地区的人不知道,但我们肯定会知道的。现在还没有这样的情况,这是显而易见的。正如Dario所说,想要实现广泛的通用性,就需要在各种真实任务和环境中进行训练。以销售代理为例,最难的部分不是教它如何在Salesforce等特定数据库里点按钮,而是训练代理在模糊情况下做出判断的能力。比如,怎样从成千上万条潜在客户的信息中筛选出最有希望的?如何真正有效地与客户联系?当客户不回复时,你该怎么办?

When an AI lab wanted to train a sales agent, Labelbox brought in dozens of Fortune 500 salespeople to build a bunch of different RL environments. They created thousands of scenarios where the sales agent had to engage with a potential customer, which was roleplayed by a second AI. Labelbox made sure that this customer AI had a few different personas, because when you cold call, you have no idea who's going to be on the other end. You need to be able to deal with a whole range of possibilities. Labelbox's sales experts monitored these conversations turn by turn, tweaking the role-playing agent to ensure it did the kinds of things an actual customer would do. Labelbox could iterate faster than anybody else in the industry. That is super important, because RL is an empirical science. It's not a solved problem.
当一家人工智能实验室希望训练一名销售代理时,Labelbox邀请了数十位《财富》500强企业的销售员来创建各种不同的强化学习(RL)环境。他们设计了成千上万种情景,销售代理需要与由另一个人工智能扮演的潜在客户进行互动。Labelbox确保这个客户AI具备多种不同的角色特征,因为当你打陌生电话(cold call)时,你无法预知电话另一端的人会是谁。你需要能够应对各种可能性。Labelbox的销售专家对这些对话进行逐回合的监控,并不断调整角色扮演中的代理,以确保其行为符合真实客户的表现。Labelbox在行业中拥有最快的迭代速度。这非常重要,因为强化学习是一门基于经验的科学,目前仍未解决所有问题。

Labelbox has a bunch of tools for monitoring agent performance in real time. This lets their experts keep coming up with tasks so that the model stays in the right distribution and difficulty and gets the optimal reward signal during training. Labelbox can do this sort of thing in almost every domain. They've got hedge fund managers, radiologists, even airline pilots. So whatever you're working on, Labelbox can help. Learn more at labelbox.com/dwarkesh. Coming back to concrete predictions: because there are so many different things to disambiguate, it can be easy to talk past each other when we're talking about capabilities. So, for example, when I interviewed you three years ago, I asked your prediction about what we should expect three years from now.
Labelbox 提供了一系列工具,可实时监控代理的表现。这使得他们的专家能够不断设计任务,以保持模型在合适的分布和难度范围内,在训练过程中获得最佳奖励信号。Labelbox 几乎可以在任何领域做到这点。他们的客户包括对冲基金经理、放射科医生,甚至是航空公司飞行员。所以,无论你正在进行什么项目,Labelbox 都能提供帮助。想了解更多信息,请访问 labelbox.com/dwarkesh。回到具体的预测,因为需要区分的事情实在太多了,以至于在讨论能力时很容易彼此误解。例如,在三年前我采访你的时候,我询问了对未来三年的预测。

I think you were right. So you said we should expect systems which, if you talk to them for the course of an hour, are hard to tell apart from a generally well-educated human. Yes. I think you were right about that. But spiritually, I feel unsatisfied, because my internal expectation was that such a system could automate large parts of white collar work. And so it might be more productive to talk about the actual end capabilities you want from such a system. So I will basically tell you what I think. But let me ask you a very specific question so that we can figure out exactly what kinds of capabilities we should be getting at. So maybe I'll ask about it in the context of a job I understand well, not because it's the most relevant job, but just because I can evaluate the claims about it.
我觉得你说得对。你提到我们应该期待这样的系统,如果你和它们交谈一个小时,很难将它们与受过良好教育的人区分开来。是的,我觉得你在这方面是对的。不过,从精神上来说,我感到有些不满足,因为我内心期望这样的系统可以自动化大部分白领工作。因此,可能更有意义的是讨论这个系统最终应该具备什么样的能力。那么,我会基本上告诉你我的看法。但让我问你一个非常具体的问题,以便我们能明确我们到底需要哪些能力。所以,也许我会在一个我非常了解的工作背景下问这个问题,这不是因为它是最相关的工作,只是因为我能评估关于它的说法。

Take video editors, right? I have video editors, and part of their job involves learning about our audience's preferences, learning about my preferences and tastes and the different trade-offs we have, just over the course of many months building up this understanding of context. So take the skill and ability they have six months into the job: when should we expect an AI system that can pick up that skill on the job, on the fly? Yeah.
你知道视频编辑吗?我有一些视频编辑。他们的工作的一部分是了解我们观众的喜好,了解我的喜好和品味,以及我们在工作中需要做出的各种权衡。这需要几个月的时间来建立这种背景理解。六个月后,他们在工作中所获得的技能和能力是通过实践积累的。那么,我们应该何时期待这样的AI系统能够即时获得这些技能呢?

So I guess what you're talking about is, we're doing this interview for three hours, and then someone's going to come in, someone's going to edit it. They're going to be like, oh, I don't know, Dario scratched his head, we could edit that out, or magnify that. There was this long discussion that is less interesting to people, and then there are other things that are more interesting to people. So let's kind of make this edit.
所以我猜你说的是我们会进行三个小时的采访,然后会有人来编辑。他们可能会看到,比如“哦,我不知道,达里奥挠了挠头,我们可以剪掉或者放大这个部分。”有些漫长的讨论对人们来说没那么有趣,而其他的事情对人们更有吸引力。所以我们希望能够进行这样的编辑。

So I think the country of geniuses in a data center will be able to do that. The way it will be able to do that is it will have general control of a computer screen. And you'll be able to feed this in, and it'll be able to use the computer screen to go on the web, look at all your previous interviews, look at what people are saying on Twitter in response to your interviews, talk to you, ask you questions, talk to your staff, look at the history of the kinds of edits that you did, and from that, do the job.
所以我认为数据中心的那些天才能做到这一点。具体来说,他们可以全面掌控电脑屏幕。你可以输入相关信息,他们也可以利用电脑屏幕上网,查看你之前的所有采访,关注大家在推特上对你采访的评价,与您交流、提问,与您的团队沟通,查看你做过的编辑历史,根据这些来完成工作。

So I think that's dependent on several things. One, and I think this is one of the things that's actually blocking deployment: getting to the point on computer use where the models are really masters at using the computer. And we've seen this climb in benchmarks, and benchmarks are always imperfect measures. But OSWorld went from 5%... I think when we first released computer use a year and a quarter ago, it was maybe 15%, I don't remember exactly.
我认为这取决于几个因素。首先,我认为这也是阻碍推广的原因之一,就是在计算机使用方面取得突破,使模型真正能够熟练使用计算机。我们已经看到基准测试中的进步,而基准测试总是有不足之处。但OSWorld这个基准从5%的水平提升了上来,我记得我们在大约一年零三个月前首次发布计算机使用能力时,可能达到了15%,具体数字我记不清了。

But we've climbed from that to 65% or 70%. And there may be harder measures as well. But I think computer use has to pass a point of reliability. Can I just ask a follow-up on that before you move on to the next point? For years, I've been trying to build different internal LLM tools for myself. And often I have these text-in, text-out tasks, which should be dead center in the repertoire of these models.
我们已经从那个水平提升到了65%或70%。可能也会有更严格的措施。但我认为计算机的使用必须达到一个可靠性的水平。在你继续下一个话题之前,我能就此问题再跟进一下吗?多年来,我一直在为自己尝试构建不同的内部LLM工具。很多时候,我需要处理这些文本输入、文本输出的任务,而这正是这些模型的核心功能。

And yet I still hire humans to do them, because if it's something like, identify what the best clips would be in this transcript, maybe the models will do a seven-out-of-10 job at it. But there's not this ongoing way I can engage with them to help them get better at the job, the way I could with a human employee. And so that missing ability, even if you solved computer use, would still block my ability to offload an actual job to them.
尽管如此,我仍然会雇佣人类来做这些工作,只是因为在一些任务上,比如从这个文字记录中挑选出最佳片段时,模型可能只能做到70分。但我没有办法通过持续的互动来帮助它们像对待人类员工那样提高工作技能。而这种缺失的能力,即便解决了计算机使用问题,仍然阻碍着我把实际工作交给机器。

Again, this gets back to what we were talking about before with learning on the job, where it's very interesting. You know, with the coding agents, I don't think people would say that learning on the job is what's preventing the coding agents from doing everything end to end. They keep getting better.
再次提到之前我们谈到过的在工作中学习的话题,这确实很有趣。你知道,我认为对于编程代理来说,人们不会说在工作中学习是阻碍它们全面完成任务的原因。事实上,它们一直在不断进步。

We have engineers at Anthropic who don't write any code anymore. And when I look at the productivity, to your previous question, we have folks who say, this GPU kernel, this chip, I used to write it myself; I just have Claude do it. And so there's this enormous improvement in productivity.
我们Anthropic有一些工程师,现在完全不写任何代码了。至于你之前提到的关于生产力的问题,我们有些人会说:这段用于GPU内核的代码,这个芯片,以前是我自己写的,现在我只是让Claude来写。因此,生产力得到了极大的提升。

And when I look at Claude Code, familiarity with the code base, or a feeling that the model hasn't worked at the company for a year, is not high up on the list of complaints I see. And so I think what I'm saying is, we're kind of taking a different path. Don't you think with coding, that's because there is an external scaffold of memory which exists, instantiated, in the code base, which I don't know how many other jobs have? Coding may be making faster progress precisely because it has this unique advantage that other economic activity doesn't. But when you say that, what you're implying is that by reading the code base into the context, I have everything that the human needed to learn on the job. So that would be an example, whether it's written down or not, whether it's available or not, of a case where everything you needed to know you got from the context window, right?
当我看Claude Code的使用情况时,我并不觉得对代码库的熟悉程度或模型在公司工作时间不长是最主要的抱怨。所以我认为,我们实际上走的是一条不同的路。你不觉得在编程方面之所以如此,是因为代码库中实实在在地存在一个外部的记忆架构,而这在其他工作中可能并不常见吗?编程之所以进展更快,可能正是因为它有这样其他经济活动所没有的独特优势。当你说这些的时候,你其实在暗示,通过把代码库读入上下文,我就能获得人类在工作中需要学习的一切。无论这些信息是否被明确写出来或提供出来,这都是一个从上下文窗口中获得所需知识的例子,对吧?

And that what we think of as learning, like, oh man, I started this job, it's going to take me six months to understand the code base: the model just did it in the context. Yeah. I honestly don't know how to think about this, because there are people who qualitatively report what you're saying. There was a METR study last year, I'm sure you saw it, where they had experienced developers try to close a pull request in repositories that they were familiar with. And those developers reported an uplift. They reported that they felt more productive when using these models. But in fact, if you look at their output and how much was actually merged back in, there was a 20% downlift. They were less productive as a result of these models.
我们通常所认为的学习,比如开始一份新工作,需要花六个月时间来理解代码库,而模型能够直接在上下文中完成类似的任务。坦白说,我不太知道该如何看待这种情况,因为有些人从定性角度报告了类似的现象。你可能看到过去年METR的一项研究,他们让有经验的开发者在自己熟悉的代码库中尝试完成拉取请求(pull request)。这些开发者报告说,使用这些模型后,他们感觉生产效率提高了。然而,实际上,如果看他们的产出和实际合并回去的代码量,反而减少了20%。结果表明,使用这些模型使他们的生产效率下降了。

And so I'm trying to square the qualitative feeling that people have with these models with, one, at a macro level, where is this renaissance of software? And two, when people do these independent evaluations, why are we not seeing the benefit you'd expect? Yeah. So, within Anthropic, this is just really unambiguous, right? We're under an incredible amount of commercial pressure, and we make it even harder for ourselves because we have all the safety stuff we do, and I think we do more than other companies. So the pressure to survive economically while also keeping our values is just incredible, right? We're trying to keep this 10X revenue curve going. There is zero time for bullshit.
我正在努力调和人们使用这些模型的主观感受与两个事实:一是在宏观层面上,软件的复兴究竟在哪里?二是当人们做独立评估时,为什么我们没有看到预期的收益?是的。在Anthropic内部,这一点非常明确。我们面临巨大的商业压力,而且因为我们对安全的严格要求,这使得情况更加复杂,我认为我们在这方面做得比其他公司更多。因此,在保持我们价值观的同时,经济上生存的压力相当巨大。我们正努力保持十倍的收入增长,没有时间浪费在没有意义的事情上。

There is zero time for feeling like we're productive when we're not. These tools make us a lot more productive. Why do you think we're concerned about competitors using the tools? Because we think we're ahead of the competitors, and we don't want to accelerate them. We wouldn't be going through all this trouble if this was secretly reducing our productivity. We see the end productivity every few months in the form of model launches. There's no kidding yourself about this. The models make you more productive. One, people feeling like they're more productive is exactly what studies like this predict, qualitatively. But two, if I just look at the end output, obviously you guys are making fast progress.
我们没有时间在并不真正高效的时候自我感觉良好。这些工具确实让我们的工作效率提升了很多。我们担心竞争对手使用这些工具,正是因为我们觉得在这方面领先于他们。如果这些工具真的在悄悄降低我们的效率,我们就不会花这么多精力去使用它。每隔几个月,我们通过新模型的发布就能看到生产力的成果,毫无欺骗自己的余地。这些模型确实能提高生产力。首先,研究已经预测到人们会觉得自己更高效。其次,从最终结果来看,你们的进展显而易见。

But, you know, the idea with recursive self-improvement was supposed to be that you make a better AI, the AI helps you build a better next AI, et cetera. And what I see instead, if I look at you, OpenAI, DeepMind, is that people are just shifting around the podium every few months. And maybe you think that stops because you've won, or whatever. But why are we not seeing the company with the best coding model have this lasting advantage, if in fact there are these enormous productivity gains from the latest coding model? So, no. I mean, my model of the situation is that there's an advantage that's gradually growing. I would say right now the coding models give maybe, I don't know, a 15, maybe 20% total-factor speedup. That's my view.
但事实是,理论上,自我改进的概念是,你创建一个更好的人工智能,然后这个人工智能帮助你开发下一个更优秀的人工智能,依此类推。然而,我观察OpenAI、DeepMind等公司时,看到的情况却是每隔几个月人们就把“领奖台”移来移去。也许你认为这会停止,因为你已经赢了,但为什么我们没看到拥有最佳代码模型的人能长期保持优势呢?如果这些代码模型真的能带来巨大的生产力提升,那应该是显而易见。我的看法是,目前的情况是这种优势在逐渐增长。我觉得现在的代码模型大概可以提供15%到20%的整体速度提升。这是我的观点。

And six months ago, it was maybe 5%. And so it didn't matter; 5% doesn't register. It's now just getting to the point where it's one of several factors that kind of matters. And that's going to keep speeding up. So I think six months ago there were several companies that were at roughly the same point, because this wasn't a notable factor. But I think it's started to speed up more and more. I would also say there are multiple companies that make models that are used for code, and we're not perfectly good at preventing some of these other companies from using our models internally. So I think everything we're seeing is consistent with this kind of snowball model, where there's no hard takeoff; again, my theme in all of this is that all of this is soft takeoff.
六个月前,这个比例可能只有5%,所以当时没啥影响,因为5%根本不算什么。而现在,这一因素已经逐渐变得重要起来,并将继续加速发展。我认为六个月前,有好几家公司都处于类似的水平,因为这还不是一个显著的因素。不过,我觉得现在速度开始越来越快了。我还想说,有多家公司在开发用于编程的模型,我们并不能完全阻止其他公司在内部使用我们的模型。因此,我们看到的一切都像滚雪球一样,渐渐变大。总的来说,我的观点是,所有这一切都是缓慢而稳定的启动过程。
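The snowball Dario describes, a speedup that starts at roughly 5% and itself keeps growing, can be sketched numerically. The per-cycle growth factor below is a hypothetical number chosen only to show the compounding shape:

```python
# Toy snowball: the AI speedup starts small and grows each model cycle,
# multiplying into cumulative research progress. Illustrative numbers only.

def progress_after(cycles, base_speedup=0.05, speedup_growth=1.8):
    total, speedup = 1.0, base_speedup
    for _ in range(cycles):
        total *= 1.0 + speedup      # this cycle's work, accelerated by AI
        speedup *= speedup_growth   # better models raise next cycle's speedup
    return total, speedup
```

At first the 5% barely registers; after a handful of cycles the compounding term starts to dominate, with no hard discontinuity anywhere, which is the soft-takeoff picture.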

Like soft, smooth exponentials, although the exponentials are relatively steep. And so we're seeing this snowball gather momentum, where it's like 10%, 20%, 25%, 40%. And as you go, yeah, Amdahl's law: you have to get all the things that are preventing you from closing the loop out of the way. But this is one of the biggest priorities within Anthropic. Stepping back, earlier we were talking about, well, when do we get this on-the-job learning? And it seems like the point you're making with the coding thing is that we actually don't need on-the-job learning, that you can have tremendous productivity improvements, potentially trillions of dollars of revenue for AI companies, without this basic human ability. Maybe that's not your claim; you should clarify. But without this basic human ability to learn on the job... I just look at it like, in most domains of economic activity, people say: I hired somebody, they weren't that useful for the first few months, and then over time they built up the context and understanding.
像柔和、平滑的指数增长,尽管这些增长相对陡峭。因此,我们看到雪球效应正在积聚动力,就像从10%、20%、25%到40%一样递增。而在此过程中,没错,这是阿姆达尔定律(Amdahl's law):你需要解决所有阻碍你完成闭环的因素。但这在Anthropic内部是一个最优先的事项。 退一步看,我们之前讨论过什么时候能实现工作中的学习。看起来你提到的关于编码的观点是,我们实际上不需要在工作中学习,也能大幅提升生产力,AI公司也可能在不具备这一基本人类能力的情况下获得数万亿美元的收入。这也许不是你的主张,你应该澄清一下。但是在大多数经济活动领域,人们会说,我雇了一个人,他们在最初几个月里并没有多大用处,但随着时间推移,他们建立了对背景的理解。

It's actually hard to pin down exactly what they learned. But they got something. And now they're a workhorse and they're so valuable to us. And if AI doesn't develop this ability to learn on the fly, I'm a bit skeptical that we're going to see huge changes to the world. So I think two things here, right? There's the state of the technology right now, which is, again, we have these two stages. We have the pre-training and RL stages, where you throw a bunch of data and tasks into the models and then they generalize. So it's like learning, but it's learning from more data, not learning over one human's or one model's lifetime. So again, this is situated between evolution and human learning. But once you learn all those skills, you have them. And just as with pre-training, the models just know more. If I look at a pre-trained model, it knows more about the history of samurai in Japan than I do.
其实很难具体说清他们学到了什么,但他们确实学到了东西。现在他们成了主力,对我们非常有价值。如果人工智能不能发展出这种即时学习的能力,我对我们是否会看到巨大的世界变化持怀疑态度。我认为这里有两个重点:当前的技术状况,我们有两个阶段,即预训练阶段和强化学习(RL)阶段。在这两个阶段中,我们将大量数据和任务输入模型,模型进行泛化。这就像学习,但更多是通过大量的数据进行学习,而不是像人类或单一模型那样的终生学习。这种方式介于进化和人类学习之间。一旦学会这些技能,它们就被掌握了。就像预训练一样,模型知道的更多。例如,一个预训练的模型对日本武士的历史了解得比我多。

It knows more about baseball than I do. It knows more about low-pass filters and electronics, all of these things. Its knowledge is way broader than mine. So I think even just that may get us to the point where the models are kind of better at everything. And then we also have, just with scaling the existing setup, the in-context learning, which I would describe as kind of like human on-the-job learning, but a little weaker and a little more short-term. You look at in-context learning: you give the model a bunch of examples, and it does get it. There's real learning that happens in context. And a million tokens is a lot. That can be days of human learning, right? If you think about the model reading a million words... how long would it take me to read a million words?
它了解的棒球知识比我多。它对于低通滤波器和电子技术的了解比我们所有人都多。它的知识范围比我的要广得多。所以我认为,甚至仅仅是这一点,可能就会让这些模型在各方面表现得更好。 此外,通过现有系统的扩展,我们还有"上下文学习"。我会把它描述为有点类似于人类在职学习,但稍微弱一点,时间也短一点。通过观察上下文学习,你给模型一些例子,它确实能理解,有实际的学习发生在上下文中。百万级的标记数量是很大的一笔,这相当于人类数天的学习量。如果你想象一下模型阅读一百万个单词,这对我来说需要多长时间才能读完?
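The arithmetic behind "a million tokens is days of human learning" works out roughly like this; the words-per-token ratio, reading speed, and hours per day are all assumed round numbers, not figures from the conversation:

```python
# Rough conversion from context-window tokens to human reading time.
# All rates are assumptions: ~0.75 words/token, 250 words/min, 8 h/day.

def human_reading_days(tokens, words_per_token=0.75, wpm=250, hours_per_day=8):
    words = tokens * words_per_token
    minutes = words / wpm
    hours = minutes / 60
    return hours / hours_per_day

print(human_reading_days(1_000_000))  # 6.25 days at these assumed rates
```

So a one-million-token context window corresponds to several full working days of human reading, which is why Dario treats it as a weaker but real analogue of on-the-job learning.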

I mean, days or weeks at least. So you have these two things. And I think these two things within the existing paradigm may just be enough to get you the country of geniuses in the data center. I don't know for sure, but I think they're going to get you a large fraction of it. There may be gaps, but I certainly think, just as things are, this is enough to generate trillions of dollars of revenue. That's one. Two is this idea of continual learning, this idea of a single model learning on the job. I think we're working on that too. And I think there's a good chance that in the next year or two, we also solve that. Again, I think you get most of the way there without it. I think the trillions-of-dollars-a-year market, and maybe all of the national security implications and the safety implications that I wrote about in "The Adolescence of Technology," can happen without it.
我的意思是,至少是几天或几周的时间。所以你有这两样东西。而且,我认为在现有的范式下,这两样东西可能就足以让你获得数据中心里的天才国度。我不敢肯定,但我觉得它们能帮助你得到其中很大一部分。可能会有一些空白,但我确实认为,就像现在这样,这足以创造数万亿美元的收入。这是一方面。另一方面是关于持续学习的想法,即一个模型在实际应用中不断学习。我认为我们也在努力这个方向,而且我觉得在接下来的一两年内,我们有很大机会解决这个问题。同样,我觉得即使没有它,你也能走完大部分的路。我认为每年数万亿美元的市场,以及我在《The Adolescence of Technology》一文中提到的所有国家安全和安全方面的影响,可能在没有它的情况下就能实现。

But I also think we, and I imagine others, are working on it. And I think there's a good chance that we get there within the next year or two. There are a bunch of ideas. I won't go into all of them in detail, but one is just: make the context longer. There's nothing preventing longer context from working. You just have to train at longer context and then learn to serve it at inference. And both of those are engineering problems that we are working on and that I would assume others are working on as well.
我,我,我也认为我们,以及我想其他人也在致力于这项工作。我认为我们在接下来的一年或两年内有很大机会能够实现这一目标。有很多想法,我不会详细说明所有想法,但是,其中之一就是延长上下文长度。其实没有什么能够阻止更长的上下文起效,只需要在更长的上下文上进行训练,然后学会在推理时正确使用它们。这两个都是工程问题,我们正在解决,我想其他人也在研究。

Yeah. So on this context length, then: it seemed like there was a period from 2020 to 2023, from GPT-3 to GPT-4 Turbo, where there was an increase from about 2,000 tokens of context to 128K. I feel like for the two-ish years since then, we've been in the same-ish ballpark. Yeah. And when model context lengths get much longer than that, people report qualitative degradation in the model's ability to consider that full context.
好的。在2020年到2023年这段时间,从GPT-3到GPT-4 Turbo,模型的上下文长度从大约2000个token增加到了128K。我觉得在此之后的两年左右,我们的进展都在大致相同的范围内。当模型的上下文长度变得比这个长很多时,人们报告说模型处理完整上下文的能力出现了明显的退化。

So I'm curious what you're seeing internally that makes you think, oh, 10 million context, 100 million context, or, to get human-like six months of learning, a billion tokens of context. This isn't a research problem. This is an engineering and inference problem, right? If you want to serve long context, you have to store your entire KV cache. It's difficult to store all the memory in the GPUs, to juggle the memory around.
所以,我很好奇你们内部是怎么看待这个问题的,让你们觉得哦,要达到类似人类六个月学习水平的模拟,需要千万级、亿万级的上下文。这不是一个研究问题,而是一个工程和推理问题,对吧?如果你想要支持长上下文,你必须存储完整的键值缓存,并且需要在GPU上处理所有的内存,这样的内存管理是很难的。
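The serving-cost problem Dario points at can be made concrete with a back-of-envelope KV-cache calculation. The layer count, KV head count, and head dimension below are hypothetical placeholders, not any real model's architecture:

```python
# KV-cache memory per sequence: a K and a V tensor for every layer, stored
# in fp16 (2 bytes/element). Architecture numbers are made up for illustration.

def kv_cache_gb(seq_len, layers=80, kv_heads=8, head_dim=128, bytes_per=2):
    elems = 2 * layers * kv_heads * head_dim * seq_len  # 2 = K and V
    return elems * bytes_per / 1e9

for ctx in (128_000, 1_000_000, 10_000_000):
    print(f"{ctx:>10,} tokens -> {kv_cache_gb(ctx):,.1f} GB per sequence")
```

Because the cache grows linearly in sequence length, a 10M-token context costs nearly two orders of magnitude more memory per sequence than 128K, which is why Dario frames long context as an engineering and inference problem rather than a research one.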

I don't even know the details at this point; this is at a level of detail that I'm no longer able to follow, although I knew it in the GPT-3 era: these are the weights, these are the activations you have to store. But these days the whole thing has flipped, because we have MoE models and all of that. And the degradation you're talking about: without getting too specific, a question I would ask involves two things.
我已经不了解具体细节了,你知道的,此时这已经到了一个让我无法跟上的细节层次。虽然在GPT-3时代,我还知道这些是需要存储的权重和激活值。但是现在情况发生了变化,因为我们有了MoE(混合专家)模型,所有这些都有了新的发展。至于你提到的退化问题,不具体展开的话,我想问的一个问题涉及两件事。

There's the context length you train at, and there's the context length that you serve at. If you train at a small context length and then try to serve at a long context length, maybe you get these degradations. It's better than nothing, so you might still offer it, but you get these degradations. And maybe it's harder to train at a long context length. Yeah, so there's a lot there. I'm tempted to ask about some rabbit holes, like, wouldn't we expect that if you had to train on longer context lengths, that would mean you're able to get fewer samples in for the same amount of compute?
在训练模型时,有一个训练的上下文长度,而在实际应用时,也有一个使用的上下文长度。如果你在小的上下文长度下进行训练,然后尝试在长的上下文长度下应用,可能会出现性能下降。不过,即便有这些下降,可能也比没有好,所以你可能还是会这样做。另一方面,在长上下文长度下进行训练可能更困难。是的,所以,你知道,这其中有很多问题。我同时也想问一些,比如说,如果你不得不在较长的上下文长度下进行训练,那么这是否意味着在相同的计算量下,你能够使用的样本数量会减少。

But maybe it's not worth diving deep on that. I want to get an answer to the bigger-picture question, which is: when will I not feel a preference for a human editor that's been working for me for six months versus an AI that's been working with me for six months? What year do you predict that will be the case? I mean, my guess for that is, there are a lot of problems that are basically like, we can do this when we have the country of geniuses in a data center.
在深入探讨之前,我想先回答一个更宏观的问题,就是:什么时候我会觉得,为我工作了六个月的人类编辑和与我共事了六个月的人工智能之间没有差别?你预测这样的情况会在哪一年发生?我的猜测是,有很多这类问题基本上都属于:等我们在数据中心里拥有天才国度时就能做到。

And so, you know, my picture for that is, if you made me guess, it's one to two years, maybe one to three years. It's really hard to tell. I have a strong view, 95 to 99%, that all this will happen within 10 years. I think that's just a super safe bet. Yeah. And then I have a hunch, and this is more like a 50-50 thing, that it's going to be more like one to two, maybe more like one to three.
所以,你知道,我对这件事情的看法是,如果让我猜的话,可能是一到两年,也可能是一到三年,真的很难说。我非常坚定地认为,有95%到99%的把握,所有这些事情会在十年内发生,我认为这几乎是一个非常稳妥的预测。然后我感觉,这更像是五五开的可能性,即事情会在一到两年内发生,或者更接近一到三年。

So, one to three years for the country of geniuses, and for the slightly less economically valuable task of editing videos. It seems pretty economically valuable! Let me tell you, there are just a lot of use cases like that, right? There are a lot of similar ones. Exactly. So you're predicting that within one to three years. And, in general, Anthropic has predicted that by late '26 or early '27, we will have AI systems that, quote, have the ability to navigate interfaces available to humans doing digital work today, intellectual capabilities matching or exceeding that of Nobel Prize winners, and the ability to interface with the physical world.
所以,一到三年内实现天才国度,以及经济价值稍低一些的视频剪辑任务。其实它的经济价值相当高。让我告诉你,实际上有很多类似的用例,对吧?有很多类似的情况。没错,所以你预测在一到三年内。而且总体上,Anthropic预测到26年底或27年初,我们将拥有具备以下能力的AI系统:能够操作今天人类从事数字工作所用的各种界面,智力水平匹敌或超过诺贝尔奖获得者,并且能够与物理世界进行交互。

And then you gave an interview two months ago with DealBook where you emphasized your company's more responsible compute scaling compared to your competitors. And I'm trying to square these two views, because if you really believe that we're going to have a country of geniuses, you want as big a data center as you can get. There's no reason to slow down; the TAM of a Nobel Prize winner that can actually do everything I can do, and more, is like trillions of dollars.
两个月前你接受了DealBook的采访,在采访中你强调你的公司在计算扩展方面比竞争对手更负责任。我试图理解这两种观点:如果你真的相信我们将拥有一个天才云集的国家,那你就会希望拥有尽可能大的数据中心,没有理由放慢速度,因为一个能做到我所能做的一切、甚至更多的诺贝尔奖级AI,其潜在市场规模(TAM)高达数万亿美元。

And so, I'm trying to square this conservatism, which seems rational if you have more moderate timelines, with your stated views about AI progress. Yeah. So it actually all fits together, and we go back to this fast, but not infinitely fast, diffusion. So let's say that we're making progress at this rate, that the technology is making progress this fast. Again, I have very high conviction that we're going to get there within a few years. I have a hunch that we're going to get there within a year or two.
所以,我试图协调这种保守观点,因为如果时间表更温和的话,这种观点似乎是合情合理的,这与您对人工智能进展的看法有什么关系。嗯,其实这一切都能结合在一起。我们回到这个快速但不是无限快的扩散。所以,比如说我们正在以这个速度取得进展,技术进步得这么快。我对我们在几年内达到目标抱有很强的信念。我有预感我们会在一两年内达到。

So, a little uncertainty on the technical side, but pretty strong confidence that it won't be off by much. What I'm less certain about is, again, the economic diffusion side. I really do believe that we could have models that are a country of geniuses in the data center in one to two years. One question is how many years after that do the trillions in revenue start rolling in?
在技术方面有一些不确定性,但我们有相当高的信心,偏差不会太大。而我不太确定的是经济扩散方面。我确实相信在一到两年内,我们可能会拥有一个相当于“天才之国”的数据中心模型。问题是,在那之后还需要多少年才能看到万亿级的收入开始涌入。

I don't think it's guaranteed that it's going to be immediate. I think it could be one year. It could be two years. I could even stretch it to five years, although I'm skeptical of that. And so we have this uncertainty, which is: even if the technology goes as fast as I suspect it will, we don't know exactly how fast it's going to drive revenue. We know it's coming, but with the way you buy these data centers, if you're off by a couple of years, that can be ruinous.
我认为这并不能保证会立刻发生。可能需要一年,也可能需要两年,甚至可能要到五年。虽然我对五年这个时间持怀疑态度。因此,我们面临着不确定性,即使技术的进步速度如我所预料的一样快,我们也不知道它具体会多快带来收入。我们知道这一天会来临,但对于购买这些数据中心的方式来说,如果时间差了几年,可能会造成很大的麻烦。

It's just like what I wrote in Machines of Loving Grace. I said, look, I think we might get this powerful AI, this country of geniuses in the data center. That description you gave comes from Machines of Loving Grace. I said we'll get that in 2026, maybe 2027. Again, that is my hunch; I wouldn't be surprised if I'm off by a year or two, but that is my hunch.
就像我写的那样,你知道的,在《爱的机器》中,我说,我觉得我们可能会在这个有天才的国家的数据中心里得到这个强大的人工智能。你刚才的描述就来源于《爱的机器》。我曾说,也许会在2026年,也可能是2027年。那是我的直觉,如果偏差一两年我也不会感到惊讶,但这就是我的直觉。

Let's say that happens. That's the starting gun. How long does it take to cure all the diseases, right? That's one of the things that drives a huge amount of economic value, right? You cure every disease. There's a question of how much of that goes to the pharmaceutical company versus the AI company, but there's an enormous consumer surplus, because, I assume, we can get access for everyone, which I care about greatly.
假设这真的发生了,那就像是鸣枪起跑。那么,需要多长时间才能治愈所有疾病呢?这就是推动巨大经济价值的一种方式,对吧?就好像你治愈了所有疾病。问题在于,这其中有多少利益会流向制药公司或者人工智能公司,但对消费者来说会有巨大的剩余价值,因为每个人,我想,我们能够为所有人提供治疗,这点我非常关心。

We cure all of these diseases. How long does it take? You have to do the biological discovery. You have to manufacture the new drug. You have to go through the regulatory process. I mean, we saw this with vaccines in COVID, right? We got the vaccine out to everyone, but it took a year and a half, right? And so my question is: how long does it take to get the cure for everything, which the AI genius can in theory invent, out to everyone?
我们能够治愈所有这些疾病。那么需要多长时间呢?首先,你需要进行生物学上的发现,还需要制造新药,并且要通过一系列的监管流程。就像我们在新冠疫苗上看到的那样,我们确实在一年半的时间内让疫苗普及到每一个人。我想问的是,要借助理论上可以发明出这些疗法的 AI 天才来实现所有疾病的治愈,到底需要多长时间呢?

How long from when that AI first exists in the lab to when diseases have actually been cured for everyone, right? And we've had a polio vaccine for 50 years; we're still trying to eradicate it in the most remote corners of Africa. The Gates Foundation is trying as hard as they can, others are trying as hard as they can, but that's difficult. Again, I don't expect most of the economic diffusion to be as difficult as that, right?
从人工智能首次在实验室中出现,到真正为所有人治愈疾病,这之间要花多长时间呢?我们已经有了50年的脊髓灰质炎疫苗,但仍然在努力消灭非洲最偏远角落的病例。盖茨基金会和其他组织都在尽最大努力去实现这个目标,但这很困难。不过,我不认为大多数经济效益的传播会像这个过程一样困难。

That's the most difficult case. But there's a real dilemma here, and where I've settled on it is: it will be faster than anything we've seen in the world, but it still has its limits. And so then, when we go to buying data centers, again, the curve I'm looking at is: okay, we've had a 10x increase in revenue every year.
那就像是最困难的情况。不过,这里确实有一个真正的难题。而我对此的看法是,这将比我们在世界上见过的任何事物都要快,但它仍然有其局限性。所以,当我们去购买数据中心时,我关注的曲线是,我们每年都有10倍的增长。

So at the beginning of this year, we're looking at 10 billion in annualized revenue. We have to decide how much compute to buy, and it takes a year or two to actually build out the data centers, to reserve the data centers. So basically I'm asking: in 2027, how much compute do I get?
所以今年年初,我们预计年化收入达到100亿美元。我们需要决定购买多少计算能力。要建设数据中心并进行预订通常需要一到两年的时间。所以基本上,我的意思是我们现在需要确定在2027年我们需要多少计算能力。

Well, I could assume that the revenue will continue growing 10x a year. So it'll be a hundred billion at the end of 2026 and one trillion at the end of 2027. And so I could buy a trillion dollars of compute. Actually, it would be like five trillion dollars of compute, because it would be a trillion dollars a year for five years, right? I could buy a trillion dollars a year of compute that starts at the end of 2027.
那么,我可以假设收入将继续以每年增长10倍的速度增长。因此,到2026年底,收入将达到一千亿,到2027年底将达到一万亿。所以我可以投入一万亿美元。实际上,我们会需要五万亿美元的计算资源,因为那会是每年一万亿,持续五年的需求,对吧?我可以购买从2027年底开始的一万亿美元的计算资源。

And if my revenue is not a trillion dollars, if it's even 800 billion, there's no force on earth, there's no hedge on earth, that could stop me from going bankrupt if I buy that much compute. And so, even though a part of my brain wonders if it's going to keep growing 10x, I can't buy a trillion dollars a year of compute in 2027. If I'm just off by a year in that rate of growth, or if the growth rate is 5x a year instead of 10x a year, then you go bankrupt. And so you end up in a world where you're supporting hundreds of billions, not trillions, and you accept some risk that there's so much demand that you can't support the revenue.
如果我的收入达不到一万亿美元,哪怕只有八千亿,那么即便我采购了那么多的计算资源,也没有什么力量可以阻止我破产。即使我脑海中的一部分在猜想收入会继续增长到10倍,我也无法在2027年每年购买价值一万亿美元的计算资源。如果我对增长速度的估计只差了一年或者增长速度是5倍而不是10倍,就会破产。因此,你可能会发现自己处于一个只能支持数千亿而不是数万亿的世界中,你需要接受这样的风险,即市场需求可能大到无法支持收入。
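The arithmetic behind this can be sketched directly. A minimal toy model, using only the stylized figures from the conversation ($10B annualized revenue at the start of 2026, a hoped-for 10x/year growth, a roughly five-year, $1T/year commitment); the function and variable names are mine:

```python
def revenue_b(start_b: float, growth_per_year: float, years: int) -> float:
    """Annualized revenue in $B after compounding at growth_per_year x/year."""
    return start_b * growth_per_year ** years

# The optimistic forecast: 10x per year from $10B annualized.
end_2026 = revenue_b(10, 10, 1)   # $100B at the end of 2026
end_2027 = revenue_b(10, 10, 2)   # $1,000B, i.e. $1T, at the end of 2027

# A $1T/year data-center deal runs about five years: $5T of total commitment.
total_commitment_b = 1_000 * 5    # $5,000B

# The downside cases Dario describes: growth of 5x/year instead of 10x,
# or the same 10x/year arriving one year late.
slower = revenue_b(10, 5, 2)      # $250B, far short of the $1T/year commitment
late = revenue_b(10, 10, 1)       # $100B, even further short
```

Even at 80% of the forecast ($800B against a $1T/year commitment), the gap is ruinous, which is the "no hedge on earth" point.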

And you still accept some risk that you got it wrong and it's slower. And so when I talked about behaving responsibly, what I meant actually was not the absolute amount, although I think it is true we're spending somewhat less than some of the other players. It's actually the other things, like: have we been thoughtful about it? Or are we YOLOing and saying, oh, we're going to do 100 billion dollars here, 100 billion dollars there? I kind of get the impression that some of the other companies have not written down the spreadsheet, that they don't really understand the risks they're taking. They're just kind of doing stuff because it sounds cool.
你仍然需要接受一些风险,比如你可能会做错决定,结果还是很慢。所以当我谈到负责任地行事时,我其实不是指我们花费的绝对金额。虽然我们的花费可能确实比其他一些公司少一些,但我更关注的是,我们是否经过深思熟虑?或者我们是否只是冲动行事,说我们这里花1000亿美元,那里花1000亿美元?我有点感觉一些公司并没有把风险明细化,他们可能并不真正理解他们正在承担的风险,他们只是因为听起来很酷而去做一些事情。

And we thought carefully about it, right? We're an enterprise business, therefore we can rely more on revenue; it's less fickle than consumer. We have better margins, which is the buffer between buying too much and buying too little. And so I think we bought an amount that allows us to capture pretty strong upside worlds. It won't capture the full 10x-a-year world, and things would have to go pretty badly for us to be in financial trouble. So I think we thought carefully and we've made that balance. And that's what I mean when I say that we're being responsible. Okay.
好的,我们仔细考虑过这个问题,对吧?我们是一家企业级的公司。因此,我们能够更加依赖于稳定的收入,而不像面向消费者的业务那么不稳定。我们的利润率也更高,这为购买过多或过少提供了缓冲空间。所以我认为,我们购买的量足以让我们在不错的市场环境中获益。即便在明年无法获得十倍的增长,情况也得非常糟糕我们才会陷入财务困境。因此,我相信我们经过仔细考虑,做到了平衡。这就是我所说的“负责任”的意思。

So it seems like it's possible that we actually just have different definitions of a country of geniuses in a data center. Because when I think of actual human geniuses, an actual country of human geniuses in a data center, I'm like: I would happily buy five trillion dollars of compute to run an actual country of human geniuses in a data center. So let's say JPMorgan or Moderna or whoever doesn't want to use them. I've got a country of geniuses; they'll start their own company. And if they can't start their own company and they're bottlenecked by clinical trials, it is worth stating that most clinical trials fail because the drug doesn't work. There's no efficacy, right?
所以,似乎我们可能只是对"数据中心里的天才之国"有不同的定义。因为当我想到真正的人类天才,以及在数据中心里的一个由人类天才组成的"国家"时,我愿意花五万亿美元去购买计算资源来运行这个由天才组成的"国家"。假设像摩根大通或Moderna这样的公司不想使用这些天才,我手上有一个天才的国家,他们可以自己创办公司。如果他们无法创办公司并且被临床试验所限制,需要注意的是大多数临床试验失败的原因是药物无效,没有疗效,对吧?

And I make exactly that point in Machines of Loving Grace. I say the clinical trials are going to go much faster than we're used to, but not instant, not infinitely fast. And then suppose it takes a year for the clinical trials to work out, so that you're getting revenue from that and can make more drugs. Okay. Well, you've got a country of geniuses and you're an AI lab, and you could use many more AI researchers. You also think there are these self-reinforcing gains from smart people working on AI tech. So, okay, you can have that. That's right.
在《爱的机器》一书中,我正是指出了这个观点:临床试验的速度将比我们过去习惯的要快得多,但不是瞬间完成,也不是无限快。假设临床试验需要一年时间才能产生收益,然后您可以用这些收益开发更多药物。好的,现在您拥有一个充满天才的国家,并且身处一个AI实验室中,您可以利用更多的AI研究人员。您还认为聪明的人们在AI技术上工作的过程中会产生自我强化的收益。所以,没错,您可以拥有这一切。

But you can have the data center working on AI progress. Are there substantially more gains from buying a trillion dollars a year of compute versus 300 billion dollars a year of compute? If your competitor is buying a trillion, yes, there are. Well, then there's some gain, but then again there's this chance that they go bankrupt first; if you're off by only a year, you destroy yourselves. That's the balance. We're buying a lot. We're buying a hell of a lot. We're buying an amount that's comparable to what the biggest players in the game are buying.
您可以让数据中心进行AI相关的工作。购买每年一万亿美元的计算能力相比于每年3000亿美元的计算能力,能带来更多收益吗?如果您的竞争对手在购买一万亿美元的计算能力,那么确实会有更多收益。但是,也有可能他们在实现收益之前就破产了。如果时间判断错了一年,你可能会因此失败。这就是需要权衡的地方。我们已经在大量购买,非常大量的购买。我们购买的规模与业内最大玩家相当。

But if you're asking me why we haven't signed, you know, 10 trillion of compute starting in mid-2027: first of all, it can't be produced. There isn't that much in the world. But second, what if the country of geniuses comes in mid-2028 instead of mid-2027? You go bankrupt. So if your projection is widened to three years, it seems like you should want 10 trillion dollars of compute by 2029 or maybe 2030. Like, even in the longest version of the timelines you state, the compute you are ramping up to build doesn't seem to match. What makes you think that?
但如果你问我,为什么我们没有签约从2027年年中开始的10万亿美元的计算能力:首先,这是不可能生产出来的,全世界都没有那么多资源。其次,如果天才之国在2028年年中而不是2027年年中才到来,你就会破产。所以,如果你的预测放宽到三年,似乎到2029年或2030年你就应该需要10万亿美元的计算能力。就算是在你所陈述的时间线最长的版本中,你正在扩建的计算能力似乎也不匹配。是什么让你这么认为呢?

Well, as you said, you want to get to the 10 trillion; human wages, let's say, are on the order of 50 trillion a year. So I won't talk about Anthropic in particular, but if you talk about the industry: the amount of compute the industry is building this year is probably in the very low tens of gigawatts, call it 10 to 15 gigawatts. It goes up by roughly 3x a year, so next year is maybe 40 gigawatts, 2028 might be a hundred, 2029 might be like 300 gigawatts. And each gigawatt costs maybe, I'm doing the math in my head, on the order of 10 to 15 billion dollars a year.
好的,如你所说,你想掌控10万亿,比如说人类的工资每年约为50万亿。就算我不特别谈论Anthropic这个公司,但如果我们谈论这个行业的话,比如,产业今年所建设的计算能力大约是十几吉瓦,也许是10到15吉瓦。明年,它可能会增长约三倍,所以明年大约是30到40吉瓦,到2028年可能会达到100吉瓦,到2029年可能会达到300吉瓦。每吉瓦的成本大约是100亿到150亿美元。我这是在心算,这样每年的总成本就是如此。
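The industry-level numbers compose like this; a rough sketch assuming the midpoints of the ranges given in the conversation (about 13 GW built this year, 3x/year growth, $10-15B per gigawatt per year):

```python
def industry_gw(start_gw: float, years_out: int, growth: float = 3.0) -> float:
    """Industry compute build-out in gigawatts, growing roughly 3x per year."""
    return start_gw * growth ** years_out

start_gw = 13.0       # midpoint-ish of "10 to 15 gigawatts" this year (my choice)
cost_per_gw_b = 12.5  # $B per GW per year, midpoint of "$10 to $15 billion"

gw_2027 = industry_gw(start_gw, 1)   # ~39 GW ("next year... 40 gigawatts")
gw_2028 = industry_gw(start_gw, 2)   # ~117 GW ("2028 might be a hundred")
gw_2029 = industry_gw(start_gw, 3)   # ~351 GW ("2029 might be like 300")

# Industry-wide annual compute spend by 2029, in $B:
annual_spend_2029_b = gw_2029 * cost_per_gw_b  # multiple trillions per year
```

That last figure is what carries the conversation to "multiple trillions a year by 2028 or '29" for the industry as a whole.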

So you put that all together and you're getting about what you described: multiple trillions a year by 2028 or '29. So you're getting exactly what you predict. That's for the industry. That's for the industry. So suppose that Anthropic's compute keeps 3x-ing a year, and then by like '27 or '27-'28, you have 10 gigawatts, and multiply that by, as you say, 10 billion. So then it's like a hundred billion a year. But then you're saying the 10 trillion by 2029. I don't want to give exact numbers for Anthropic, but these numbers are too small. These numbers are too small.
所以,当把这一切放在一起时,到2028年或2029年,每年的投入能达到几万亿,所以结果就是你预测的那样。这是针对整个行业的。这是针对整个行业的。假设 Anthropic 的计算能力每年增加三倍,然后到2027年或2028年,你有10吉瓦的能力,再乘以你说的100亿,所以每年就是1000亿。但你说的是到2029年达到10万亿。我不想针对 Anthropic 给出精确的数字,但这些数字其实太小了。这些数字太小了。

Okay, interesting. I'm really proud that the puzzles I've worked on with Jane Street have resulted in them hiring a bunch of people from my audience. Well, they're still hiring, and they just sent me another puzzle. For this one, they spent about 20,000 GPU hours training backdoors into three different language models. Each one has a hidden prompt that elicits completely different behavior. You just have to find the trigger. This is particularly cool because finding backdoors is actually an open question in frontier AI research. Anthropic actually released a couple of papers about sleeper agents, and they show that you can build a simple classifier on the residual stream to detect when a backdoor is about to fire.
好的,有趣。我很自豪我与Jane Street合作设计的谜题帮助他们从我的观众中招聘了很多人。现在,他们仍在招聘,并且刚刚又给我发送了一个新谜题。为了这个新谜题,他们花了大约20,000小时的GPU时间,在三个不同的语言模型中植入了后门。每个模型都有一个隐藏的提示词,会触发完全不同的行为。你只需要找到这个触发词。这特别酷,因为发现后门实际上是前沿人工智能研究中的一个未解决的问题。而 Anthropic 实际上发布了几篇关于"潜伏代理"的论文,展示了你可以在残差流上构建一个简单的分类器,以检测后门何时即将被触发。

But they already knew what the triggers were, because they built them. You don't. And it's not feasible to check the activations for all possible trigger phrases. Unlike the other puzzles they made for this podcast, Jane Street isn't even sure this one is solvable. But they've set aside $50,000 for the best attempts and write-ups. The puzzles live at JaneStreet.com/Dwarkesh, and they're accepting submissions until April 1st. All right, back to Dario. You've told investors that you plan to be profitable starting in '28. And this is the year where we're potentially getting the country of geniuses in a data center, which is going to unlock all this progress in medicine and health, et cetera, et cetera.
但他们已经知道触发词是什么了,因为是他们自己设计的,你却不知道。而且,检查所有可能的触发短语的激活情况并不现实。与他们为这个播客制作的其他谜题不同,Jane Street甚至不确定这个谜题是否可以解开。但他们为最好的解答和分析报告准备了5万美元。谜题可以在JaneStreet.com/Dwarkesh找到,提交截止日期为4月1日。好吧,回到 Dario 的话题。你告诉投资者,你计划在2028年开始实现盈利。而这一年,我们可能正迎来数据中心里的天才之国,这将开启医学、健康等各个领域的进步。

And new technologies. Wouldn't this be exactly the time when you'd want to reinvest in the business and build a bigger country of geniuses so they can provide more services? So, I mean, profitability is kind of a weird thing in this field. I don't think in this field profitability is actually a measure of spending down versus investing in the business. Let's just take a model of this. I actually think profit happens when you underestimated the amount of demand you were going to get, and loss happens when you overestimated the amount of demand you were going to get, because you're buying the data centers ahead of time. So, think about it this way. And again, these are stylized facts; these numbers are not exact. I'm just trying to make a toy model here.
以及新技术。这难道不正是你想重新投资业务、打造更大的"天才之国"以提供更多服务的时候吗?所以,我的意思是,盈利在这个领域是一件有点奇怪的事情。我认为在这个领域,盈利并不是衡量"消耗资金"还是"投资业务"的指标。让我们来建一个模型。我实际上认为,当你低估了需求量时就会盈利,而当你高估了需求量时就会亏损,因为你是提前购买数据中心的。所以,可以这样考虑。再说一次,这些是简化的事实,这些数字并不精确,我只是想在这里构建一个简单的模型。

Let's say half of your compute is for training and half of your compute is for inference, and the inference has some gross margin that's more than 50 percent. What that means is that if you were in steady state, you build a data center, and if you knew exactly the demand you were getting, you would get a certain amount of revenue. Say you pay $100 billion a year for compute. On $50 billion of that, you support $150 billion of revenue, and the other $50 billion is used for training. So basically you're profitable: you make $50 billion of profit. Those are the economics of the industry today. Or sorry, not today, but that's where we're projecting forward in a year or two. The only thing that makes that not the case is if you get less demand than the $50 billion of inference can support; then you have more than 50 percent of your data center for research and you're not profitable.
假设您的计算资源中一半用于训练,另一半用于推理,而推理部分的毛利率超过50%。这意味着,如果您处于稳定状态,建好数据中心,并准确掌握需求,您就会获得一定的收入。比如说,您每年为计算支付1000亿美元,其中500亿美元的推理计算支持1500亿美元的收入,其余500亿美元用于训练。基本上,您是有盈利的,盈利额达500亿美元。这就是这个行业的经济模型,或者更确切地说,这是我们预期一两年后的情况。唯一会改变这种情况的是,如果需求低于那500亿美元推理计算所能支持的水平,那么您将有超过50%的数据中心用于研究,这样就无法盈利。
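The stylized steady state described here reduces to a few lines; a sketch with the illustrative numbers from the conversation ($100B/year of compute, split 50/50, $50B of inference supporting $150B of revenue):

```python
def steady_state_profit_b(compute_b: float, inference_share: float,
                          revenue_per_inference_dollar: float) -> float:
    """Profit in $B: inference revenue minus the full compute bill.
    The training half of compute is a pure cost in this toy model."""
    inference_b = compute_b * inference_share
    revenue = inference_b * revenue_per_inference_dollar
    return revenue - compute_b

# $50B of inference compute supports $150B of revenue: $3 per $1 of inference.
profit_b = steady_state_profit_b(100, 0.5, 3.0)  # $50B of profit

# Gross margin on inference: (150 - 50) / 150, comfortably above 50%.
gross_margin = (150 - 50) / 150
```

Under-predicting demand in this model shifts the data center toward research and erases the profit; over-predicting demand squeezes research but raises profit, which is the asymmetry discussed next.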

So you train stronger models, but you're not profitable. If you get more demand than you thought, then your research gets squeezed, but you're able to support more inference and you're more profitable. So maybe I'm not explaining it well, but the thing I'm trying to say is: you decide the amount of compute first, and then you have some target split of inference versus training, but that gets determined by demand. It doesn't get determined by you. But what I'm hearing is that the reason you're predicting profit is that you are systematically under-investing in compute. Because if you actually... No, no, no, I'm saying it's hard to predict. So these statements about 2028 and what will happen, that's our attempt to do the best we can for investors. All of this stuff is really uncertain because of the cone of uncertainty. We could be profitable in 2026 if the revenue grows fast enough.
所以,你训练了更强的模型,但并没有盈利。如果需求比你预期的要高,那么你的研究经费就会被挤压。但如果你能支持更多的推理任务,你就会更有利可图。所以,也许我解释得不好,但我的意思是你需要先决定计算资源的投入量。然后你会有一个关于推理和训练之间的目标期望。但这个比例是由需求决定的,不是由你决定的。不过,我听到的原因是你们预测能盈利是因为你们在计算资源上系统性地投资不足。因为如果你真的……不,不,不,我的意思是这很难预测。关于2028年的预期和某些事务会发生是我们对投资者尽力而为的尝试。所有这些事情都很不确定,因为有许多不确定因素。如果收入增长足够快,我们可能在2026年就实现盈利。

And then if we overestimate or underestimate the next year, that could swing wildly. What I'm trying to get at is: do you have a model in your head of the business where you invest, invest, invest, get scale, and then it becomes profitable, a single point at which things turn around? I don't think the economics of this industry work that way. I see. So, if I'm understanding correctly, you're saying that because of the discrepancy between the amount of compute we should have gotten and the amount of compute we got, we were sort of forced to make a profit. But that doesn't mean we're going to continue making a profit. We're going to reinvest the money, because now AI has made so much progress and we want the bigger country of geniuses. And so then we're back in a regime where revenue is high, but losses are also high.
如果我们对下一年的估计过高或过低,可能会造成大的波动。我想表达的是,你可能在心中有一个商业模型:不断投资、投资、投资,以扩大规模,然后就会盈利。但我认为这个行业的经济运作并不是这样的。明白了,如果我理解正确的话,因为我们获得的计算能力与我们本应该得到的计算能力之间存在差异,我们有点被迫盈利。但这并不意味着我们会继续盈利,我们会再投资这些钱,因为现在AI已经取得了巨大的进步,我们希望培养一个更多天才的更大国家。因此,即使收入很高,亏损也可能很大。

If every year we predict exactly what the demand is going to be, we'll be profitable every year, because spending roughly 50% of your compute on research, plus a gross margin that's higher than 50%, plus correct demand prediction, leads to profit. That's the profitable business model that I think is kind of there, but obscured by this building ahead and these prediction errors. I guess you're treating the 50% as a sort of given constant, whereas in fact, if AI progress is fast and you can increase the progress by scaling up more, you'd just push past the 50% and not be profitable. Here's what I'll say. You might want to scale it up more, but remember the log returns to scale. If 70% would get you only a slightly bigger model, by a factor of 1.4x, then each dollar of that extra $20 billion is worth much less to you, because of the log-linear setup.
如果我们每年都能准确预测需求,我们每年都会盈利。因为将50%的计算能力花在研究上,再加上高于50%的毛利率,以及正确的需求预测,都会导致盈利。这就是一种盈利的商业模式,但这种模式经常被超前建设和预测误差掩盖。我猜你将50%视作一个既定的常数。然而,实际上,如果进展非常快,并且你可以通过扩大规模来进一步加速,就能超过这50%,但却不一定会盈利。我要说的是,你可能希望进一步扩大规模,但要记住规模扩大的收益是对数递减的。如果增加到70%计算能力,只能让模型增大约1.4倍,那么额外的200亿美元中每一美元的价值都会大幅降低,因为对数线性设定的缘故。
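The log-returns point can be made concrete. A sketch, assuming (as the "log-linear setup" implies) that capability gains scale roughly with the log of training spend; the helper function is mine:

```python
import math

def ooms_gained(old_spend_b: float, new_spend_b: float) -> float:
    """Orders of magnitude of training scale-up; on a log-linear curve,
    capability gains are roughly proportional to this quantity."""
    return math.log10(new_spend_b / old_spend_b)

# Pushing training from 50% to 70% of a $100B compute budget is only a
# 1.4x scale-up of the training run:
marginal = ooms_gained(50, 70)    # ~0.146 OOM for the extra $20B
full_10x = ooms_gained(50, 500)   # 1.0 OOM, shown for comparison

# Each marginal dollar buys far less capability than the earlier dollars did:
gain_per_extra_billion = marginal / 20
```

That diminishing marginal gain is why, in this argument, the extra $20B may be better spent on serving inference or on engineers.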

You might find that it's better to invest that $20 billion in serving inference, or in hiring engineers who are better at what they're doing. So the reason I said 50%: that's not exactly our target. It's not exactly going to be 50%; it will probably vary over time. What I'm saying is that the log-linear return leads you to spend an order-one fraction of the business on training, not 95%, because you get diminishing returns from the log scale. This is me trying to convince Dario to believe in AI progress: you don't invest more in research because it has diminishing returns, but you do invest in the other things you mentioned. Again, we're talking about diminishing returns after you're spending 50 billion a year. This is a point I'm sure you would make. But the returns on a genius, even diminishing, could be quite high.
你可能会发现,将那200亿美元用于推理服务或聘请拥有更好技能的工程师可能更为划算。我提到50%这个数字,但这并不是我们的确切目标,也不一定真的就是50%,而是会随着时间有所变化。我想表达的是,在对数线性收益的情况下,您会花费企业业务中的一部分(而不是95%)来避免收益递减的情况,因为对数尺度问题。每个人都在改变,需要说服达里奥相信人工智能的进步。您不会对研究进行投资,因为它的回报递减,但您会投资于您提到的其他事项。再说一遍,我们谈论的是在每年花费500亿后,才会出现收益递减的问题。我肯定您会同意这一观点。但是,对天才的投资即使遇到收益递减,其回报仍然可能相当可观。

And more generally, what is profit in a market economy? Profit is basically saying: other companies in the market can do more with this money than I can. Put that aside, then. I'm giving these stylized numbers because I don't want to give information about Anthropic. But let's just derive the equilibrium of the industry. Why doesn't everyone spend 100% of their compute on training and not serve any customers? Because if they didn't get any revenue, they couldn't raise money, they couldn't do compute deals, they couldn't buy more compute the next year. So there's going to be an equilibrium where every company spends less than 100% on training, and certainly less than 100% on inference.
更广泛地说,在市场经济中,利润到底是什么?利润基本上意味着:市场中的其他公司能用这些钱做比我更多的事情。先把这个放在一边。我用这些简化的数字,是因为我不想透露 Anthropic 的具体信息。现在,我们来推导一下这个行业的均衡。为什么没有公司把100%的计算资源都用于训练,而不服务任何客户呢?因为如果他们不产生任何收入,就无法筹集资金,无法达成计算交易,也无法在下一年购买更多的计算资源。因此,均衡状态下每家公司在训练上的花费都会少于100%,在推理上的花费当然也少于100%。

It should be clear why you don't just serve the current models and never train another model: because then you'll fall behind and you won't have any demand. So there's some equilibrium. It's not going to be 10%, it's not going to be 90%. Let's just say, as a stylized fact, it's 50%. That's what I'm getting at. I think we're going to be in a position where that equilibrium level of training spend is less than the gross margins that you're able to get on compute, and so the underlying economics are profitable. The problem is that you have this hellish demand-prediction problem when you're buying the next year of compute.
应该很清楚,为什么不能只使用现有的模型而不再训练新的模型,因为这样做会导致需求下降,最终落后于市场。因此,需要达到一定的平衡。这种平衡不会是10%,也不会是90%,可以理解为在50%左右,这就是我要表达的。我们可能会处于这样一种状态,即对模型训练的投入低于从计算中获得的毛利,因此整体经济状况是有利可图的。问题在于,当你为下一年的计算资源进行采购时,面临一个非常复杂的需求预测问题。

And you might guess under, and be very profitable but have no compute for research. Or you might guess over, and you're not profitable but you have all the compute for research in the world. Does that make sense, just as a dynamic model of the industry? Maybe stepping back: I'm not saying I think the country of geniuses is going to come in two years and therefore you should buy this compute. To me, the end conclusion you're arriving at makes a lot of sense, but that's because it's like, oh, it seems like the country of geniuses is hard and there's a long way to go.
你可能会猜测低了,这样会非常盈利,但没有用于研究的计算能力。或者你可能会猜测高了,这样就没有盈利,但却有用于研究和世界发展的计算能力。作为一个行业的动态模型,这样解释有道理吗? 也许换个角度讲,我并不是在说未来两年内“天才之国”就会出现,因此你应该购买这些计算设备。在我看来,你得出的结论非常合理。但这是因为“天才之国”似乎是一个困难的目标,我们还有很长的路要走。

And so, stepping back, the thing I'm trying to get at is more like: it seems like your worldview is compatible with somebody who says we're 10 years away from a world in which we're generating trillions of dollars. That's just not my view. Yeah, that is not my view. So I'll make another prediction. It is hard for me to see that there won't be trillions of dollars in revenue before 2030. I can construct a plausible world where it takes maybe three years. That would be the limit of what I think is plausible.
我想表达的是,当我退一步来看这个问题时,我觉得你的世界观似乎与某些人说我们距离一个能创造出数万亿美元的世界还有10年这种看法是相符的。而我并不认同这种观点。我预测未来很难想象到2030年之前不会创造出数万亿美元的收入。我可以设想一个合理的世界,这大概可能需要三年的时间。这就是我认为合理的期限。

Like, in 2028 we get the real country of geniuses in the data center. The revenues have been going up, maybe they're in the low hundreds of billions by 2028, and then the country of geniuses accelerates it to trillions. We're basically on the slow end of diffusion, and it takes two years to get to the trillions. That would be the world where it takes until 2030.
在2028年,就像我们在数据中心中发现真正的天才国度一样。你知道,到那时,收入可能只是低几千亿。然而,一旦天才国度加速,它将跃升至数万亿。你知道,我们基本上是在缓慢扩散的一端。需要两年的时间才能达到万亿的规模。那将是一个需要到2030年才能实现的世界。

I suspect that even composing the technical exponential and the diffusion exponential, we'll get there before 2030. So you laid out a model where Anthropic makes a profit because it seems like fundamentally we're in a compute-constrained world, and so eventually we keep growing compute. No, I think the way the profit comes is, again, let's just abstract the whole industry here.
我猜测,即使是技术指数和扩散指数的结合,也可能会在2030年之前达到目的。你提出了一个模型,认为Anthropic公司会盈利,因为从根本上看,我们处于一个计算能力受限的世界中。因此,随着计算能力的不断增长,最终,我们会达到目标。我认为利润的产生方式依然存在,让我们抽象地来看整个行业。

Let's just imagine we're in an economics textbook. We have a small number of firms; each can invest some fraction in R&D. They have some marginal cost to serve, and the gross profit margins on that marginal cost are very high, because inference is efficient. There's some competition, but the models are also differentiated. Companies will compete to push their research budgets up. But because there's a small number of players, we have, what is it called in economics, the Cournot equilibrium? I think that's what the small-number-of-firms equilibrium is. The point is, it doesn't equilibrate to perfect competition with zero margins.
让我们想象一下,就像在经济学教科书中一样,我们有少数几家公司,每家公司可以在研发中投资一定的份额。它们在服务时有一定的边际成本,而在这些边际成本上的利润率很高,因为推理过程效率很高。虽然存在竞争,但模型也有差异化。这些公司会为了增加其研究预算而竞争。但由于参与者数量较少,我们达到了一种经济学上的均衡状态,我想这叫做寡头均衡(Cournot Equilibrium)。重点是,这种情况下不会出现完美竞争,也就是利润率趋于零的情况。

If there are three firms in the economy, all independently behaving rationally, it doesn't equilibrate to zero. Help me understand that, because right now we do have three leading firms and they're not making a profit. So what is changing? Yeah. So again, the gross margins right now are very positive. What's happening is a combination of two things. One is that we're still in the exponential scale-up phase of compute. So what that basically means is: a model gets trained. Let's say a model got trained last year that cost a billion dollars, and then this year it produced four billion dollars of revenue and cost one billion dollars to do inference on.
如果经济体中有三家公司,每家都独立地理性行事,结果并不会均衡到零。请帮我理解这一点,因为目前我们确实有三家领先的公司,而它们并没有盈利。那么,什么在变化呢?是这样的,目前的毛利率非常可观。正在发生的是两个因素的结合。其一是我们仍处于计算能力的指数级扩张阶段。这基本上意味着:一个模型被训练出来。比如说,去年训练一个模型花费了十亿美元,今年它创造了四十亿美元的收入,推理成本为十亿美元。

So again, I'm using stylized numbers here. But that's 75% gross margins, with this 25% tax. So that model as a whole makes two billion dollars. But at the same time, we're spending 10 billion dollars to train the next model, because there's an exponential scale-up. And so the company loses money. Each model makes money, but the company loses money. The equilibrium I'm talking about is an equilibrium where we have the country of geniuses in the data center, but that model-training scale-up has equilibrated more. Maybe it's still going up, and we're still trying to predict demand, but it's more leveled out.
再说一次,我这里用的是简化的数字。但这相当于75%的毛利率,加上这25%的"税"。因此,这个模型整体上赚了20亿美元。但与此同时,我们正在花100亿美元训练下一个模型,因为存在指数级的扩张。因此,公司整体是亏损的。每个模型都赚钱,但公司整体亏钱。我所说的均衡,是指我们在数据中心拥有"天才之国"、而模型训练的扩张已经更加趋于平稳的状态。也许它仍在增长,我们仍在试图预测需求,但总体上更平稳了。
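The "each model profitable, company unprofitable" dynamic is easy to tabulate; a sketch with the stylized numbers from this exchange ($1B training run, $4B revenue, $1B inference cost, $10B next training run):

```python
def model_pnl_b(train_b: float, revenue_b: float, inference_b: float) -> float:
    """Lifetime P&L of a single model, in $B."""
    return revenue_b - inference_b - train_b

# Last year's model: $1B to train, $4B of revenue on $1B of inference.
last_model = model_pnl_b(1, 4, 1)  # +$2B: the model, viewed alone, is profitable
gross_margin = (4 - 1) / 4         # 75% gross margin on inference

# But the exponential scale-up means the next run costs $10B in the same period:
company_pnl_b = last_model - 10    # -$8B: the company as a whole loses money
```

Once training spend levels out rather than growing 10x per run, the same per-model arithmetic flows through to company-level profit, which is the equilibrium being described.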

I'm giving you a couple of things there. So let's start with the current world. In the current world, you're right that, as you said before, if you treat each individual model as a company, it's profitable. Yes. Of course, a big part of the production function of being a frontier lab is training the next model, right? So if you didn't do that, then you'd make a profit for two months. That's right. And then you wouldn't have margins, because you wouldn't have the best model. So you can make profits for two months in the current world. At some point, that scale-up reaches the biggest scale it can reach, and then in equilibrium we have algorithmic improvements, but we're spending roughly the same amount to train the next model as we spent to train the current model.
我给你一些信息。让我们先从当前的世界开始。在当前的世界中,你说得对,就像之前提到的,如果把每个单独的模型看作一家公司,它是有盈利能力的。的确如此。当然,成为前沿实验室的重要部分之一就是训练下一个模型,对吧?如果不采取这样的方式,那么你可以盈利两个月。这就对了。但因为没有最好的模型,你的利润空间也不会很大。因此,在现在的世界中,你只能在两个月之内盈利。某种程度上,这会达到一个最大的规模。接下来,在平衡状态中,我们会有算法的改进,但我们花在训练下一个模型上的钱和花在训练当前模型上的钱基本是相同的。

So this equilibrium relies... I mean, at some point you run out of money in the economy, a fixed lump of labor or capital. The economy is going to grow, right? That's one of your predictions. Yes, but this is another example of the theme I was talking about, which is that the economy will grow much faster with AI than I think it ever has before. But it's not like right now, where compute is growing 3x a year. I don't believe the economy is going to grow 300% a year. Like I said in Machines of Loving Grace, I think we may get 10 or 20% per year growth in the economy, but we're not going to get 300% growth in the economy.
这个平衡依赖于某种程度上,也就是说,当经济中的资金用完,固定的劳动量或某种资源耗尽时,经济仍然会增长,对吧?这是你的预测之一。是的,在这个领域,我们会有这样的增长。但这只是我所说主题的另一个例子,也就是经济将因人工智能而比以往更快地增长。不过这并不是说计算机现在每年增长300%。我不认为经济会以每年300%的速度增长。正如我在《机器的爱之恩》一书中所说,我认为我们可能会实现每年10%或20%的经济增长,但不会达到300%的增长。

So I think, in the end, if compute becomes the majority of what the economy produces, it's going to be capped by that. Okay, so now let's assume a model where compute stays capped. Yeah. The world where frontier labs are making money is one where they continue to make fast progress, because fundamentally your margin is limited by how good the alternative is. And so you are able to make money because you have a frontier model. If you did not have a frontier model, you wouldn't be making money. And so this model requires there never to be a steady state: forever and ever, you keep making more algorithmic progress.
我认为最终,如果计算能力成为经济产出的主要部分,它会因此受到一定限制。好,现在让我们假设一种模型,在这种模型中,计算能力保持受限。在这样的世界中,前沿实验室能够盈利,是因为它们能够持续快速进步,根本上来说,你的利润空间是受限于替代品的优劣程度的。因此,你能够赚钱是因为你拥有一个前沿模型。如果没有这样的前沿模型,你就没有盈利能力。所以,这种模型要求永远不存在一个稳定状态,而是需要不断在算法上取得进步。

I don't think that's true. I mean, it feels like we never stop talking about economics here. We never stop talking about economics. But there are worlds in which — so, I don't think this field is going to be a monopoly. My lawyers never want me to say that word, but I don't think this field is going to be a monopoly. But you do get industries in which there are a small number of players — not one, but a small number of players.
我不认为那是真的。我的意思是,我觉得我们被教导,我们一直在谈论经济学。我们从未停止过谈论经济学。所以,我不认为这个领域会成为垄断。尽管我的律师不希望我这样说,但我真的不认为这个领域会形成垄断。不过确实有些行业只有少数几个参与者,不是一家,但也仅是几个。

And ordinarily, the way you get monopolies like Facebook — or Meta, I always call them Facebook — is these kinds of network effects. The way you get industries in which there are a small number of players is very high costs of entry, right? Cloud is like this; I think cloud is a good example. You have three, maybe four players within cloud. I think that's the same for AI — three, maybe four. And the reason is that it's so expensive. It requires so much expertise and so much capital to run a cloud company, right?
通常情况下,像 Facebook 或 Meta 这样的垄断公司,形成的原因之一是网络效应。这种情况下,行业内只有少数几个参与者,因为进入门槛非常高。比如云计算就是一个很好的例子。云计算领域可能只有三个、四个参与者。我认为,人工智能也是类似的情况,可能也只有三到四个公司。这是因为运行一家云计算公司需要巨大的花费,要求大量的专业知识和资本投入。

And so, you have to put up all this capital. And then in addition to putting up all this capital, you have to get all this other stuff that requires a lot of skill to make happen. So it's like, if you go to someone and you're like, I want to disrupt this industry, here's $100 billion — okay, you're putting up $100 billion and also betting that you can do all these other things that these people have been doing. You can create the profit, and then the effect of your entering is that the profit margins go down.
所以,你需要投入大量资金。而且除了投入这么多资金之外,你还需要获得其他很多需要高超技能才能实现的东西。这就像是,如果你去找一个人说,我想颠覆这个行业,这里有1000亿美元。他们会想,好吧,我投入1000亿美元,同时还在赌你能做到其他那些人一直在做的事情。你能够创造利润,然后你进入市场的结果就是利润率下降。

So, you know, we have equilibria like this all the time in the economy, where we have a few players, profits are not astronomical, margins are not astronomical, but they're not zero, right? And I think that's what we see in cloud. Cloud is very undifferentiated. Models are more differentiated than cloud, right? Everyone knows Claude is good at different things than GPT is good at, than Gemini is good at. And it's not just "Claude's good at coding, GPT is good at math and reasoning" — it's more subtle than that.
在经济中,我们经常会看到这样的平衡:有少数几个参与者,利润不是特别高,但也不为零。我认为云计算就是这样的例子。云计算本身差异不大,而各个模型之间的差异性比云计算更大。大家都知道,Claude擅长的领域和GPT擅长的领域,以及Gemini擅长的领域各不相同。而且不只是简单地说Claude擅长编程、GPT擅长数学和推理,其中的差别更加微妙。

Models are good at different types of coding. Models have different styles. I think these things are actually quite different from each other, and so I would expect more differentiation than you see in cloud. Now, there actually is one counterargument, which is that if the process of producing models is something AI models can do themselves, then that could spread throughout the economy. But that is not an argument for commoditizing AI models in general — that's kind of an argument for commoditizing the whole economy at once.
就像,不同的模型擅长不同类型的编程。模型有不同的风格。我认为,这些东西实际上是非常不同的。因此,我预计它们之间的差异会比你在云计算中看到的更明显。不过,这里有一个相反的观点。这个相反的观点是,如果所有这些——也就是生产模型的过程——都能由AI模型自行完成,那么这可能会影响整个经济。但这并不是说AI模型本身会变得商品化,而更像是说整个经济都会同时商品化。

I don't know quite what happens in that world where basically anyone can do anything, anyone can build anything, and there's no moat around anything at all. I mean, I don't know, maybe we want that world. Maybe that's the end state here. Maybe when AI models can do everything, if we've solved all the safety and security problems, that's one of the mechanisms for the economy to kind of flatten itself again.
我不知道在那样一个世界里会发生什么,在那里几乎任何人都能做任何事,任何人都可以建造任何东西,而且任何事物都没有护城河。我是说,我不知道,也许我们想要那样的世界,也许那就是最终的状态。也许,当AI模型可以做所有事情的时候,如果我们解决了所有的安全问题和隐患,这或许就是让经济重新平等化的一种机制。

But that's kind of far past "country of geniuses in a data center." Maybe a finer way to put that potential point is: one, it seems like AI research is especially loaded on raw intellectual power, which will be especially abundant in a world with AGI. And two, if you just look at the world today, there are very few technologies that seem to be diffusing as fast as AI algorithmic progress. And so that does hint that this industry is sort of structurally diffusive. So, I think coding is going fast, but I think AI research is a superset of coding, and there are aspects of it that are not going fast.
但那已经远远超出了“数据中心里的天才之国”的阶段。或许可以更准确地表述这个观点:一方面,AI研究特别依赖于纯粹的智力,而这在拥有AGI的世界中将变得尤其充裕。另一方面,如果你看看当今的世界,很少有技术能像AI算法的进步那样迅速扩散。这暗示着这个行业在结构上是具有扩散性的。所以我认为编程进展很快,但AI研究是编程的超集,其中有些方面进展并不那么快。

But I do think, again, once we get AI models going fast at coding, that will speed up the ability of AI models to do everything else. So while coding is going fast now, I think once the AI models are building the next AI models and building everything else, the whole economy will kind of go at the same pace. I am worried geographically, though. I'm a little worried that just proximity to AI — having heard about AI — may be one differentiator.
我认为,一旦我们开始进行编程,一旦AI模型迅速发展,那么AI将加速实现其他各种任务的能力。我觉得虽然当前编程发展很快,但当AI模型可以自己创建下一个AI模型并且处理其他事务时,整个经济发展也会随之加速。不过,我对地理因素有些担忧。我有点担心,接触AI和了解AI的地域差异可能会成为一个决定性因素。

And so when I said the 10 or 20% growth rate, a worry I have is that the growth rate could be like 50% in Silicon Valley and in parts of the world that are socially connected to Silicon Valley, and not that much faster than its current pace elsewhere. I think that would be a pretty messed-up world. So one of the things I think about a lot is how to prevent that.
所以,当我提到10%或20%的增长率时,我担心的是,在硅谷以及与硅谷社会联系紧密的地区,增长率可能高达50%。而在其他地方,增长速度可能并没有快多少。我认为那样的世界会很不平衡。所以,我经常思考如何防止这种情况的发生。
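The scale of that geographic divergence can be made concrete by compounding the two rates. The 50% and 10% figures come from the conversation; the ten-year horizon is an illustrative assumption:

```python
# Compound two regional growth rates. The 50%/year and 10%/year rates
# come from the discussion; the ten-year horizon is illustrative.
def relative_size(annual_rate, years):
    """Size of an economy after `years` years, relative to today."""
    return (1 + annual_rate) ** years

fast = relative_size(0.50, 10)  # Silicon Valley-connected regions, ~57.7x
slow = relative_size(0.10, 10)  # everywhere else, ~2.6x
print(f"{fast / slow:.1f}x gap after a decade")  # prints 22.2x gap after a decade
```

A 40-point difference in annual growth rate, left to compound for a decade, opens a roughly 22-fold gap in relative size — which is why even a few years of uneven diffusion would matter so much.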

Yeah. Do you think that once we have this kind of country of geniuses in a data center, robotics is sort of quickly solved afterwards? Because it seems like a big problem with robotics is that a human can learn how to teleoperate current hardware, but current AI models can't — at least not in a way that's super productive. And so if we have this ability to learn like a human, should it solve robotics immediately as well?
好的。你认为一旦我们拥有这种“数据中心里的天才之国”,机器人技术的问题会很快解决吗?因为机器人技术的一个大问题是,人类可以学会远程操作(teleoperate)当前的硬件,但目前的AI模型却不能,至少不能以一种非常高效的方式做到。如果我们具备像人类一样的学习能力,是否也应该能够立即解决机器人技术的问题呢?

Yeah. I don't think it's dependent on learning like a human. It could happen in different ways. Again, we could have trained the model on many different video games, which are like robotic controls or many different simulated robotics environments, or just you know, train them to control computer screens and they learn to generalize.
是的。我认为这不依赖于像人类那样学习。它可能以不同的方式发生。我们可以在许多不同的视频游戏上训练模型,这些游戏就像是机器人控制,或者在许多不同的模拟机器人环境中进行训练,或者简单地说,训练它们去控制计算机屏幕,从而让它们学会泛化。

So it will happen; it's not necessarily dependent on human-like learning. Human-like learning is one way: the model is like, oh, I picked up a robot, I don't know how to use it, I learn. That could happen because we discover continual learning. It could also happen because we train the model on a bunch of environments and it generalizes, or because the model learns it in the context length.
因此,这件事是会发生的,但不一定依赖于类似人类的学习。类似人类的学习只是其中一种方式。这种情况可能会出现,比如说,模型就像:"哦,我拿起一个机器人。我不知道怎么用,"然后通过尝试学习。也可能因为我们发现了持续学习的方式而实现。还有一种情况是,我们在多种环境中训练模型,使其能够进行泛化。或者,这种情况也可能因为模型在不同情境中学会了相关知识而发生。

It doesn't actually matter which way — if we go back to the discussion we had an hour ago, that type of thing can happen in several different ways. But I do think that when, for whatever reason, the models have those skills, then robotics will be revolutionized — both the design of robots, because the models will be much better than humans at that,
其实这并不重要,因为无论哪种方式,如果我们回到大约一个小时前的讨论,这种事情可以以多种方式发生。但我确实认为,无论出于何种原因,当模型具备这些技能时,机器人技术将迎来变革。因为在这方面,模型的表现将远超人类,从而彻底改变机器人的设计。

and also the ability to control robots. So we'll get better at building the physical hardware — the physical robots — and we'll also get better at controlling them. Now, does that mean the robotics industry will also be generating trillions of dollars of revenue? My answer there is yes, but there will be the same extremely fast, but not infinitely fast, diffusion.
以及控制机器人的能力。因此,我们将在构建物理硬件和实体机器人方面变得更出色,同时在控制它们的能力上也会有所提升。那么,这是否意味着机器人行业将创造数万亿美元的收入呢?我的答案是肯定的,但扩展的速度会非常快,不过不会达到无限快。

So will robotics be revolutionized? Yeah — maybe tack on another year or two. That's the way I think about these things. There's a general skepticism about extremely fast progress. Here's my version of it: it sounds like you are going to solve continual learning within a matter of years.
那么机器人技术会发生革命性的变化吗?是的,也许再过一两年。这是我对这些事情的看法。对于极快的进步,人们普遍持怀疑态度。这是我的观点,就像你听起来似乎在说我们将在未来几年内解决持续学习的问题一样。

But just as people weren't talking about continual learning a couple of years ago — and then we realized, oh, why aren't these models as useful as they could be right now, even though they are clearly passing the Turing test and are experts in so many different domains? Maybe it's this thing. And then we solve this thing, and we realize there's actually another thing that human intelligence can do — that's a basis of human labor — that these models can't do.
几年前人们没有谈论持续学习,然后我们意识到,为什么这些模型即使已经可以通过图灵测试,在许多领域表现得像专家,却仍然没有达到我们期望的实用性。或许问题出在这里。我们解决了这个问题后,又发现人类智能能够做到一些事情——这些是人类劳动的基础,而这些是当前模型无法完成的。

And then why not think there will be more things like this, as we find the pieces of human intelligence? — Well, to be clear, I think continual learning, as I've said before, might not be a barrier at all. I think we may just get there by pre-training generalization and RL generalization.
为什么不认为将来会有更多类似的事情发生呢?我觉得,我们已经找到了人类智慧的一些部分。不过,我需要澄清的是,我认为,正如我之前所说,持续学习可能根本不是一个障碍。我认为我们可以通过对通用化的预训练和强化学习的通用化来达到这个目的。

There basically might not be such a thing at all. In fact, I would point to the history in ML of people coming up with things that are barriers that end up kind of dissolving within the big blob of compute. People talked about: how do your models keep track of nouns and verbs? They understand syntactically, but they can't understand semantically.
我觉得可能根本就没有这样的东西。事实上,从机器学习的发展历史中可以看到,人们曾提出的一些障碍,最终都会在强大的计算能力面前消融。例如,曾经有人讨论过模型如何识别名词和动词,以及如何理解语法结构,但无法理解语义内容。

It's only statistical correlations; you can understand a word but not a paragraph; there's reasoning and the models can't do reasoning — but then suddenly it turns out they can do code and math very well after all. So I think there's actually a strong history of some of these things seeming like a big deal and then kind of dissolving. Some of them are real.
“这只是统计相关”;“你能理解一个词,却不能理解一段文字”;“存在推理这回事,而模型不会推理”——但突然之间你会发现,模型在编程和数学方面竟然表现得很好。因此,我认为历史上有不少这种看似重大的障碍,最终都逐渐化解了。当然,其中有一些是真实存在的。

I mean, the need for data is real. Maybe continual learning is a real thing, but again, I would ground us in something like code. I think we may get to the point in a year or two where the models can just do SWE end to end. That's a whole task — a whole sphere of human activity — that we're just saying models can do now.
我的意思是,对数据的需求是真实存在的。或许持续学习确实是个真实的问题,但我还是会把讨论落在类似代码的领域上。我认为在一两年内,我们可能会达到这样一个程度:模型可以端到端地完成软件工程(SWE)工作。这是一个完整的任务,是人类活动的一个完整领域,而我们现在在说模型可以胜任它了。

When you say end to end, do you mean setting technical direction and understanding the context of the problem? — Yes, I mean all of that. — Interesting. I mean, that's kind of AGI-complete. Maybe that's internally consistent, but it's not the same as saying 90% of code or 100% of code. — No, no, no — I gave this spectrum.
当你说“端到端”的时候,你是指设定技术方向和理解问题背景吗? 是的,我指的是所有这些。 有意思。我的意思是,那差不多就是“AGI完备”的了。也许这在内部是自洽的,但这不等于说完成了90%或100%的代码。 不,不,不,我给出的是一个范围。

90% of code, 100% of code, 90% of end-to-end SWE, 100% of end-to-end SWE, new tasks are created for SWEs — eventually, those get done as well. It's a long span, but we're traversing the spectrum very quickly. — I do think it's funny that I've seen a couple of podcasts you've done where the host will be like, "but the models can't do the learning thing." And it always makes me crack up, because you've been an AI researcher for like 10 years — I'm sure there's some feeling of, okay, what would a podcaster know? — No, I get asked about it in every interview. The truth of the matter is that we're all trying to figure this out together.
90%的代码、100%的代码、90%的端到端软件工程、100%的端到端软件工程,然后会为软件工程师创造出新的任务——最终,这些任务也会被完成。这个跨度很长,但我们正在非常快速地穿越这个谱系。我觉得有趣的是,我看过你参加的几期播客,主持人会说“可是模型做不了学习这件事”。这总让人忍俊不禁,因为你已经做了大约十年的AI研究,心里肯定会有某种感受:播客主持人又懂什么呢?——不,其实每次采访我都会被问到这个问题。事实是,我们都在一起努力探索这个问题。

There are some ways in which I'm able to see things that others aren't. These days, that probably has more to do with the fact that I can see a bunch of stuff within Anthropic and have to make a bunch of decisions than with any great research insight that others don't have. I'm running a 2,500-person company — it's actually pretty hard for me to have concrete research insight, much harder than it would have been 10 years ago or even two or three years ago.
我有一些方法能够看到别人看不到的东西。如今,这更多的是因为我可以看到很多内部的事情,并需要做出许多决策,而不是因为我有比别人更深刻的研究洞察力。我正在经营一家拥有2500名员工的公司,要获得具体的研究洞察力变得非常困难,比10年前甚至两三年前都要难得多。

As we go towards a world of a full drop-in remote worker replacement, does the API pricing model still make the most sense, and if not, what is the correct way to price AGI or superintelligence? — Yeah, I think there are going to be a bunch of different business models here, sort of all at once, that are going to be experimented with. I actually do think that the API model is more durable than many people think.
随着我们迈向一个全面取代远程工作者的世界,API 定价模式是否仍然是最合理的?如果不是,那应该如何为 AGI 或通用智能定价?是的,我认为在这个过程中会同时出现许多不同的商业模式,并且会进行各种实验。我实际上认为 API 模式比许多人想象的更有持久力。

One way I think about it is: if the technology is advancing quickly — if it's advancing exponentially — what that means is there's always a surface area of new use cases that have been developed in the last three months, and any product surface you put in place is always at risk of becoming irrelevant. Any given product surface probably makes sense for a range of capabilities of the model.
我认为,可以这样理解:如果技术正在快速进步,特别是呈指数级增长,这意味着在最近的三个月里总会出现一些新的使用案例。无论你推出什么产品,都有可能很快变得不再相关。任何一款产品可能都只适合该模型的一定范围内的能力。

The chatbot is already running into limitations — making it smarter doesn't really help the average consumer that much. But I don't think that's a limitation of AI models. I don't think that's evidence that the models are good enough, or that them getting better doesn't matter to the economy — it doesn't matter to that particular product. And so I think the value of the API is that it always offers an opportunity, very close to the bare metal, to build on whatever the latest thing is.
聊天机器人已经面临提升智能方面的限制,这对普通消费者的帮助其实并不大,但我认为这并不是人工智能模型的局限性。我不认为这证明这些模型已经足够好,并且它们即使变得更好,也对经济或特定产品没有太大影响。因此,我认为API的价值在于它总是提供了一个机会,可以非常接近底层地去构建最新的技术。

And so there's always going to be this front of new startups and new ideas that weren't possible a few months ago and are possible because the model is advancing. So I actually predict that it's going to exist alongside other models, but we're always going to have the API business model, because there's always going to be a need for a thousand different people to try experimenting with the model in different ways — and a hundred of them become startups, and ten of them become big successful startups, and two or three really end up being the way that people use the model of a given generation.
因此,总会存在这样一个情况,就是会有新的创业公司和新的想法出现,这些想法在几个月前还无法实现,而现在随着模型的进步,已经变得可行。因此,我预测我们的模型将与其他模型共存,但API商业模式将始终存在,因为总会有成千上万的人以不同的方式尝试实验这个模型,其中一百个会成为创业公司,十个会成为成功的大型创业公司,而最终只有两三个真的成为人们在特定时代使用模型的方式。

So I basically think it's always going to exist. At the same time, I'm sure there are going to be other models as well — not every token that's output by the model is worth the same amount. Think about the value of the tokens the model outputs when someone calls up and says, my Mac isn't working, and the model's like, restart it, right?
所以我基本上认为它会一直存在。同时,我也相信会有其他的模型,比如,并不是模型输出的每个词都价值相等。想想看,当有人打电话求助,比如说“我的Mac无法运作”,然后模型就建议“重启一下”这样的回应,这些词的价值是什么呢?

That person hasn't heard it before, but the model has said it like 10 million times, right? Maybe that's worth a dollar or a few cents or something. Whereas if the model goes to one of the pharmaceutical companies and says, that molecule you're developing — you should take the aromatic ring from that end of the molecule and put it on this end,
就像,你知道,有些人可能从未听过这种说法,但模型可能已经说过一千万次,对吧?可能这只值一美元或者几美分。但是,如果模型能够对某个制药公司说,你们正在研发的某个分子,需要把芳香环从分子的这一端移到另一端,那么这可能就有更大的价值。

and if you do that, wonderful things will happen — those tokens are worth tens of millions of dollars, right? So I think we're definitely going to see business models that recognize that. At some point we're going to see pay-for-results in some form, or we may see forms of compensation that are like labor — that kind of work by the hour.
如果你这样做,美好的事情就会发生。例如,那些代币可能价值数千万美元,对吧?所以,我认为我们肯定会看到一些商业模式认识到这一点。在某个时候,我们可能会看到基于结果付费的模式,或者某种形式,也许还有类似于按小时工作的薪酬方式。
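The contrast between the two schemes is easy to sketch: flat per-token pricing bills the same for the "restart your Mac" reply and the drug-design insight, while pay-for-results bills a share of the value created. All rates and dollar figures below are hypothetical, chosen only to illustrate the asymmetry:

```python
# Two ways to bill the same model output. All numbers are hypothetical.
def per_token_price(tokens, usd_per_million=15.0):
    """Flat per-token pricing: every token costs the same."""
    return tokens * usd_per_million / 1e6

def outcome_price(estimated_value, revenue_share=0.05):
    """Pay-for-results: bill a share of the value the result creates."""
    return estimated_value * revenue_share

# A support reply and a drug-design insight might use similar token
# counts but create wildly different value for the customer.
support_reply = per_token_price(500)        # fractions of a cent
drug_insight = outcome_price(10_000_000)    # $500,000 on a $10M result
print(support_reply, drug_insight)
```

Under flat pricing both replies earn well under a cent; under the outcome scheme, the same few hundred tokens can be worth six figures, which is the wedge driving the pay-for-results experiments described here.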

I don't know. Because it's a new industry, a lot of things are going to be tried, and I don't know what will turn out to be the right thing. — I take your point that people will have to try things to figure out the best way to use this blob of intelligence. But what I find striking is Claude Code. I don't think in the history of startups there has been a single application as hotly competed in as coding agents, and Claude Code is the category leader here. That seems surprising to me — it doesn't seem intrinsically like Anthropic had to be the one to build this. I wonder if you have an accounting of why it had to be Anthropic, or how Anthropic ended up building an application in addition to the model underlying it?
我不知道。因为这是一个新兴行业,会有很多新的尝试,我不知道最终什么是正确的做法。——我理解你的观点,人们需要通过尝试来找出如何最好地使用这种智能“黑箱”。但让我感到惊讶的是Claude Code。在创业史上,似乎没有哪一类应用像编码代理那样竞争激烈,而Claude Code是这个领域的领头羊。对此我感到很意外,因为这似乎并不是Anthropic非做不可的事情。我很好奇,你是否能解释为什么一定是Anthropic?或者Anthropic是如何在开发底层模型之外,还开发了这款应用的?

Yeah. So it actually happened in a pretty simple way: we had our coding models, which were good at coding. And around the beginning of 2025, I said, I think the time has come where you can have non-trivial acceleration of your own research, if you're an AI company, by using these models. And of course, you need an interface — you need a harness to use them. So I encouraged people internally. I didn't say this is the one thing you have to use; I just said people should experiment with this.
是的,事情的经过其实很简单。我们有自己的编码模型,这些模型在编程方面表现不错。大约在2025年初,我意识到,对于一家人工智能公司来说,通过使用这些模型,你可以在自己的研究上获得显著的提升。当然了,你需要一个接口或工具来使用这些模型。因此,我鼓励内部团队去尝试,但并没有强制要求。我只是建议大家可以进行这方面的实验。

And then this thing — I think it might have been originally called CLAI, and then the name eventually got changed to Claude Code internally — was the thing that everyone was using, and it was seeing fast internal adoption. I looked at it and said, probably we should launch this externally, right? It had seen such fast adoption within Anthropic, and coding is a lot of what we do. So we have an audience of many, many hundreds of people that is, in some ways at least, representative of the external audience.
然后,这个东西——我想最初可能叫做CLAI,后来名字在内部逐渐改为Claude Code——成为了大家都在使用的工具,内部采用速度很快。我看到这个情况,就想我们应该把它推向外部,对吧?毕竟它在Anthropic内部已经被快速采用,而编程正是我们工作的很大一部分。而且,我们有数百人的受众,在某种程度上至少可以代表外部受众。

So it looks like we already have product-market fit — let's launch this thing. And then we launched it. And I think just the fact that we ourselves are developing the model, and we ourselves know what we most need from the model, creates this feedback loop. — I see. In the sense that, let's say, a developer at Anthropic is like, ah, it would be better if it was better at this X thing, and then you bake that into the next model that you build — that's one version of it.
看起来我们已经找到了产品市场匹配。让我们推出这个东西吧。于是,我们就发布了它。我认为,就因为我们自己在开发这个模型,我们也最了解自己最需要如何使用它,这就形成了一种反馈循环。比如说,某个开发人员觉得,啊,如果模型在某个方面做得更好就好了,然后我们就把这些改进融入到下一个版本的模型中去。这就是其中一种方式。

But then there's just the ordinary product iteration: we have a bunch of coders within Anthropic who use Claude Code every day, and so we get fast feedback. That was more important in the early days. Now, of course, there are millions of people using it, so we get a bunch of external feedback as well. But it's just great to be able to get fast internal feedback.
但是,然后就是普通的产品迭代:我们在Anthropic有很多程序员,他们每天都在使用Claude Code,所以我们能很快得到反馈。这在最初的时候特别重要。当然,现在已经有数百万人在使用它,因此我们也收到了大量的外部反馈。不过,能快速地获得内部反馈仍然非常有帮助。

I think this is the reason why we launched a coding model and didn't launch a pharmaceutical company, right? My background's in biology, but we don't have any of the resources needed to launch a pharmaceutical company. — So there's been a ton of hype around OpenClaw, and I wanted to check it out for myself. I've got a day coming up this weekend and I don't have anything planned yet. So I gave OpenClaw a Mercury debit card, I set a couple-hundred-dollar limit, and I said, surprise me.
你知道,我觉得这就是为什么我们推出一个编程模型,而不是创办一家制药公司的原因。虽然我有生物学背景,但我们没有创办制药公司所需的资源。所以,最近OpenCLAW引起了很多关注,我想亲自看看。我周末有一天没什么计划,所以我给OpenCLAW一张Mercury借记卡,设定了几百美元的限额,并对它说:“给我个惊喜吧。”

Okay, so here's the Mac mini it's on. Besides having access to my Mercury account, it's totally quarantined. And I felt quite comfortable giving it access to a debit card, because Mercury makes it super easy to start with guardrails: I was able to customize permissions, cap the spend, and restrict the categories of purchases. I wanted to make sure the debit card worked, so I asked OpenClaw to just make a test transaction, and decided to donate a couple of bucks to Wikipedia. Besides that, I have no idea what's going to happen. I will report back on the next episode about how it goes.
好的,这就是运行在MacMini上的设备。除了可以访问我的Mercury账户外,它是完全隔离的。而且,因为Mercury让我感到非常放心,为其设置了访问借记卡的权限,我可以轻松地开始设置保护规则。我可以定制权限、控制消费金额,并限制购买类别。我想确保借记卡能够正常使用,于是我请OpenCLAW进行了一次测试交易,并决定向维基百科捐赠几美元。除此之外,我对接下来会发生什么并不清楚。我会在下一集报告进展情况。

In the meantime, if you want a personal banking solution that can accommodate all the different ways that people use their money — even experimental ones like this one — visit mercury.com/personal. Mercury is a fintech company, not an FDIC-insured bank. Banking services provided through Choice Financial Group and Column N.A., Members FDIC. You know, she thinks we're getting coffee and walking around the neighborhood. — Let me ask you now about making AI go well.
与此同时,如果您想要一个能够满足各种不同资金使用方式的个人银行解决方案,即使是像这种实验性的使用方式,可以访问 mercury.com/personal。Mercury 是一家金融科技公司,而不是 FDIC 保险银行。银行服务由选择金融集团和 Column NA 提供,FDIC 成员。您知道她以为我们在喝咖啡并在附近散步。让我来询问一下您关于如何让 AI 运作得更好的看法。

It seems like whatever vision we have about how AI goes well has to be compatible with two things. One is that the ability to build and run AIs is diffusing extremely rapidly. And two is that the population of AIs — the number we have, and their intelligence — will also increase very rapidly. And that means that lots of people will be able to build huge populations of misaligned AIs, or AIs which are just like companies trying to increase their footprint, or which have weird psyches like Sydney Bing — but now they're superhuman. What is a vision for a world in which we have an equilibrium that is compatible with lots of different AIs, some of which are misaligned, running around?
我们对于如何让AI发展良好的愿景,似乎必须与两个现实相兼容。首先,构建和运行AI的能力正在迅速扩散。其次,AI的数量和智能水平也在迅速增加。这意味着很多人将能够建立庞大的、可能目标不一致的AI群体,或者像一些试图扩大影响力的公司一样的AI,甚至那些思维复杂的AI,就像过去的Sydney,但现在它们已经拥有超人的智能。我们应该如何设想一个能够兼容各种不同AI共存的世界,其中包括一些目标不一致的AI?

Yeah. So in "The Adolescence of Technology," I was kind of skeptical of the balance of power — or the thing I was specifically skeptical of is: you have three or four of these companies all building models that are sort of derived from the same thing, and the idea that these would check each other, or that any number of them would check each other. We might live in an offense-dominant world, where one person or one AI model is smart enough to do something that causes damage for everything else.
好的。在《The Adolescence of Technology》一文中,我对权力制衡持怀疑态度——更确切地说,我怀疑的是:有三四家公司都在构建源自同一类技术的模型,认为这些模型会相互制衡,或者其中任意几家会相互制衡。我们可能生活在一个“进攻主导”的世界里,在这里,一个人或一个AI模型可能足够聪明,能够做出对其他一切造成损害的事情。

I mean, in the short run, we have a limited number of players, so we can start within that limited number of players: we need to put in place the safeguards, we need to make sure everyone does the right alignment work, we need to make sure everyone has bio-classifiers — those are the immediate things we need to do. I agree that that doesn't solve the problem in the long run, particularly if the ability of AI models to make other AI models proliferates — then the whole thing can become harder to solve.
我觉得在短期内,我们现在的参与者数量有限,因此我们可以从这有限的人数开始。我们需要建立一些防护措施,确保每个人都进行正确的调整工作,确保每个人都有生物分类器。这些都是我们需要立即采取的措施。我同意,从长远来看,这样做并不能解决问题,特别是如果AI模型生成其他AI模型的能力不断扩展的话,那么整个问题可能会变得更难解决。

I think in the long run we need some architecture of governance — some architecture that preserves human freedom, but also allows us to govern the very large number of human systems, AI systems, and hybrid human-AI companies or economic units. So we're going to need to think about how we protect the world against bioterrorism, how we protect the world against things like mirror life. Probably we're going to need some kind of AI monitoring system that monitors for all these things — but then we need to build it in a way that preserves civil liberties and our constitutional rights.
我认为,从长远来看,我们需要一种治理结构,这种结构既能保护人类自由,又能帮助我们管理大量的系统,这些系统包括人类系统、人工智能系统以及人类和人工智能的混合系统,例如公司或经济单位。因此,我们需要考虑如何保护世界免受生物恐怖主义的威胁,以及其他类似威胁。可能我们需要某种人工智能监控系统,专注于监测这些事情,但我们必须在构建这个系统时确保保护公民自由和宪法权利。

So just as with anything else, it's a new security landscape, with a new set of tools and a new set of vulnerabilities. My worry is: if we had a hundred years for this to happen, all very slowly, we'd get used to it — the way we've gotten used to the presence of explosives in society, or various new weapons, or video cameras. We would get used to it over a hundred years; we'd develop governance mechanisms; we'd make our mistakes. My worry is just that this is all happening so fast, and so I think maybe we need to do our thinking faster about how to make these governance mechanisms work.
所以我认为这就像其他任何事情一样,如今我们面临的是一个新的安全环境,伴随着一套新的工具和一系列新的漏洞。我担心的是,如果我们有一百年的时间来慢慢适应这一切,那我们可能会渐渐习惯,就像我们已经习惯了社会中存在炸药、各种新型武器或是摄像头一样,我们可以在这段时间内发展出治理机制,并在过程中犯错。不过,我担心的是,这一切变化得太快了。因此,我认为我们可能需要更快地思考如何让这些治理机制起作用。

Yeah. It seems like in an offense-dominant world, the idea is that AI compresses the process that would have happened over the next century into some period of five to ten years — but the mechanisms for balance of power would be similarly intractable even if humans were the only game in town. I guess we'd have the advice of AIs, but fundamentally it doesn't seem like a totally different ballgame: if checks and balances were going to work, they would work with humans as well, and if they aren't going to work, they wouldn't work with the AIs either. So maybe human checks and balances are just as doomed. — Again, I think there's some way to make this happen. The governments of the world may have to work together to make it happen; we may have to talk to AIs about building societal structures in such a way that these defenses are possible.
是的,这似乎是在一个以进攻为主导的世界中,在未来的世纪里,人工智能将会把原本需要一个世纪的过程缩短到五到十年内实现。但是我们仍然需要看到一些机制或权力平衡,这将同样难以解决。即便人类是唯一的主导力量,我们从人工智能那里得到建议的话,并不会让形势完全不同。如果权力制衡在现有情况下有效,它在人类社会中也会有效;如果现有条件下无效,那么在人工智能的情况下同样不会有效。所以,也许这就像人类的制衡一样是个死局。但我认为还是有办法让这实现,全球各国政府也许需要合作来实现这一点,比如,我们可能需要与人工智能对话,建设社会结构,以便这些防御机制可以得到实施。

I don't know. This is not just so far ahead in time but so far ahead in technological ability — which may happen over a short period of time — that it's hard for us to anticipate in advance. — Speaking of governments getting involved: on December 26th, the Tennessee legislature introduced a bill which said, quote, it would be an offense for a person to knowingly train artificial intelligence to provide emotional support, including through open-ended conversations with a user. And of course, one of the things that Claude attempts to do is be a thoughtful, knowledgeable friend. In general, it seems like we're going to have this patchwork of state laws, and a lot of the benefits that normal people could experience as a result of AI are going to be curtailed — especially when we get into the kinds of things you discuss in Machines of Loving Grace: biological freedom, mental health improvements, et cetera. It seems easier to imagine worlds in which these benefits get whack-a-moled away by different laws.
我不知道。我的意思是,这不仅是在时间上遥远,而且在技术能力上遥远——而这些能力可能在短时间内出现,以至于我们很难提前预见。说到政府参与:12月26日,田纳西州立法机关提出了一项法案,其中提到,如果有人故意训练人工智能为用户提供情感支持,包括通过开放式对话,将构成犯罪。当然,Claude想要做的事情之一,就是成为一个有思想、知识渊博的朋友。总体来看,我们似乎会面临各州法律的杂乱拼凑,普通人从人工智能中获得的许多好处将受到限制——特别是涉及你在《Machines of Loving Grace》中讨论的那些内容:生物自由、心理健康改善等等。看起来更容易想象的是,这些好处会被各种不同的法律像“打地鼠”一样逐一消除掉。

Whereas bills like this don't seem to address the actual existential threats that you're concerned about. So I'm curious to understand, in the context of things like this, Anthropic's position against the federal moratorium on state AI laws. — Yes. There are many different things going on at once. I think that particular law is dumb — it was clearly made by legislators who probably had little idea what AI models could or would do. They're like, AI models serving as that? That just sounds scary; I don't want that to happen. So we're not in favor of that, right? But that wasn't the thing that was being voted on. The thing that was being voted on is: we're going to ban all state regulation of AI for 10 years, with no apparent plan to do any federal regulation of AI — which would take Congress to pass, which is a very high bar.
这种类型的法案似乎没有解决您所关心的实际生存威胁,因此我很想了解在这种情况下,您反对联邦对州级AI法律的暂停措施的人类学立场。是的,我不知道,因为有很多事情同时发生。我认为那项特定的法律很愚蠢,显然是由可能对AI模型能做什么或不能做什么了解不多的立法者制定的,他们觉得AI模型听起来很可怕,不希望那样的事情发生,所以我们不支持那种做法。不过,那并不是正在投票的内容。正在投票的内容是要在未来10年内禁止所有州对AI的监管,但没有任何明显计划要进行任何联邦层面的AI监管,而这需要国会通过,这是一条高门槛。

So, you know, the idea that we'd ban states from doing anything for 10 years, and people said they had a plan for the federal government, but there was no actual proposal on the table, there was no actual attempt. Given the serious dangers that I lay out in The Adolescence of Technology around things like biological weapons and bioterrorism, autonomy risk, and the timelines we've been talking about, 10 years is an eternity. I think that's a crazy thing to do. So if that's the choice, if that's what you force us to choose, then we're going to choose not to have that moratorium. And I think the benefits of that position exceed the costs, but it's not a perfect position if that's the choice.
所以,你知道之前有一个想法是禁止各州在10年内采取任何行动,人们说他们有联邦政府的计划。但事实上,并没有任何实际提案。鉴于我在技术方面列出的严重危险,比如生物武器、生物恐怖主义、自主性风险等,加上我们谈论的时间表,10年就像永恒一样。我认为这是一个疯狂的举动。所以,如果这是我们的选择,如果这是你逼我们去选择的,那么我们会选择不实行那种暂停。我认为,这个立场的好处大于坏处,但如果这是唯一的选择的话,这也不是一个完美的立场。

Now, I think the thing that we should do, the thing that I would support, is the federal government should step in, not saying "states, you can't regulate," but "here's what we're going to do, and states, you can't differ from this." I think preemption is fine in the sense of the federal government saying: here are our standards, this applies to everyone, states can't do something different. That would be something I'd support if it were done in the right way. But this idea of "states, you can't do anything, and we're not doing anything either" struck us as very much not making sense, and I think it will not age well; it was already starting to not age well with all the backlash that you've seen. Now, in terms of what we would want.
现在我认为,我们应该做的是联邦政府应该介入,但不是说各州不能进行管理,而是制定一个统一的标准,各州不能有不同的做法。我支持这种“优先权”,即联邦政府制定统一的标准并适用于所有人,各州不能与此不同。如果以正确的方式实施,我会支持这种做法。然而,这种“各州不能做任何事情,我们也不做任何事情”的想法,让我们觉得非常不合理。我认为,随着目前看到的各种反对,这种想法已经开始显得不合时宜。

I mean, the things we've talked about start with transparency standards, in order to monitor some of these autonomy risks and bioterrorism risks. As the risks become more serious, as we get more evidence for them, then I think we could be more aggressive in some targeted ways and say, hey, AI bioterrorism is really a threat, let's pass a law that forces people to have classifiers. And I could even imagine, it depends how serious the threat ends up being, we don't know for sure, so we need to pursue this in an intellectually honest way where we say ahead of time: the risk has not emerged yet, but with the pace that things are going, I could certainly imagine a world where later this year we say, hey, this AI bioterrorism stuff is really serious, we should do something about it, we should put it in a federal standard, and if the federal government won't act, we should put it in a state standard. I could totally see that.
我指的是,我们已经讨论过的一些事情,首先是透明度标准。这是为了监测一些自动化风险和生物恐怖主义风险,特别是当这些风险变得越来越严重时。随着我们掌握的证据增多,我认为我们可以采取更有针对性的方法,比如说:“人工智能生物恐怖主义确实是一个威胁,我们应该通过一项法律,强制要求人们采用分类器。”我甚至可以想象,这取决于威胁的严重程度,虽然我们不能确定具体情况,但确实需要以诚实的态度来面对这个问题。即便目前风险尚未显现,我仍能想象在今年晚些时候,若形势加速发展,我们可能会意识到这个人工智能生物恐怖主义的问题非常严峻,并认为应该采取行动。我们应该将这一点纳入联邦标准中,如果联邦政府不采取行动,我们应该将其纳入州标准中。我完全可以预见这种情况的发生。

I'm concerned about a world where, if you just consider the pace of progress you're expecting and the life cycle of legislation, the benefits, as you say, because of diffusion lag, are slow enough that I really do think that on the current trajectory this patchwork of state laws would prohibit them. I mean, if having an emotional chatbot friend is something that freaks people out, then just imagine the kinds of actual benefits from AI we want normal people to be able to experience: improvements in health and healthspan, improvements in mental health, and so forth. Whereas at the same time it seems like you think the dangers are already on the horizon, and it seems like this patchwork would be especially injurious to the benefits of AI as compared to the dangers of AI. And so that's maybe where the cost-benefit makes less sense to me.
我担心的是一个这样的世界:如果你只考虑技术进步的速度,而立法的生命周期是缓慢的。正如你所说,由于扩散滞后的原因,立法带来的好处进展缓慢,我真的认为如果继续按目前的轨迹改革,这种不同州法律的拼凑将会阻碍发展。我是说,即使只是一个感情对话机器人的朋友就已经让人们感到不安,那么我们真正希望普通人能够体验到的AI实际好处,比如健康和寿命的改善、心理健康的提升等,是多么难以想象。与此同时,你似乎认为危险已经在逼近,但我却没有看到那么多潜在危险。相对而言,我觉得这实在是对AI的好处不利,不符合成本效益的分析。

So there are a few things here, right? People talk about there being thousands of these state laws. First of all, the vast majority of them do not pass. And the world works a certain way in theory, but just because a law has been passed doesn't mean it's really enforced, right? The people implementing it may be like, oh my god, this is stupid, it would mean shutting off everything that's ever been built in Tennessee. So very often laws are interpreted in a way that makes them not as dangerous or not as harmful. On the flip side, of course, if you're passing a law to stop a bad thing, you have to worry you'll have this problem as well.
所以,有几点需要说一下,首先,人们谈论这些州法律有成千上万条,但绝大多数其实没有通过。而且理论上,世界是按照某种方式运作的,但仅仅因为一条法律通过了,并不意味着它真的会被执行。执行法律的人可能会觉得这太愚蠢,因为这可能意味着要停止田纳西州所有已经建成的东西。因此,法律在执行过程中往往会被以一种不那么危险或有害的方式来解释。当然,如果你在通过一项法律来阻止一件坏事,也有可能会遇到问题。

Yeah look my look I mean my basic view is you know if if if you know we could decide you know what laws were passed and how things were done which you know we're only one small input input into that you know I would deregulate a lot of the stuff around the health benefits of AI I think you know I don't worry as much about the like the the the kind of chatbot laws I actually worry more about the drug approval process where I think AI models are going to greatly accelerate the rate at which we discover drugs and just the pipeline will get jammed up like the pipeline will not be prepared to like process all all the stuff that's going through it so you know I think I think reform of the regulatory process to buy us more towards we have a lot of things coming where the safety and the efficacy is actually going to be really crisp and clear like I mean a beautiful thing really really crisp and clear and like really really effective but you know and and maybe we don't need all this all this some like all this superstructure around it that was designed around an era of drugs that barely work and often have serious side effects.
好的,我的基本观点是,如果我们能决定通过哪些法律以及事情如何完成,尽管我们的影响很小,我会对AI在健康方面的许多好处进行放松管制。 我更担心的是药物审批过程。在这方面,我认为AI模型将大大加快我们发现药物的速度,但现行的审批流程可能无法应对如此大量的新药物。因此,我认为需要对监管流程进行改革,以便我们在处理大量即将来临的药物时更偏向安全性和有效性方面的考虑。我们即将迎来的是一些非常明确有效的成果,而传统的监管结构是为一些效果有限且常有严重副作用的药物设计的,这样的结构可能已经没有那么必要了。

But at the same time, I think we should be ramping up quite significantly this kind of safety and security legislation. And like I've said, starting with transparency is my view of trying not to hamper the industry, trying to find the right balance. Some people criticize my essay for saying that's too slow, that the dangers of AI will come too soon if we do that. Well, basically I kind of think the last six months and maybe the next few months are going to be about transparency, and then if these risks emerge, or I'm more certain of them, which I think we might be as soon as later this year, then I think we need to act very fast in the areas where we've actually seen the risk.
但与此同时,我认为我们应该大幅加强这类安全与保障方面的立法。正如我之前提到的,我认为从透明度入手是为了尽量不妨碍行业发展,寻找正确的平衡。我对此感到担忧,有人批评我的文章说这太慢了,如果我们这样做,人工智能的危险会提前到来。基本上,我觉得过去六个月和未来几个月会重点关注透明度,然后如果这些风险出现,或者我对这些风险的确定性增加,可能就在今年年底,那么我认为我们需要在实实在在看到风险的领域迅速采取行动。

Like, I think the only way to do this is to be nimble. Now, the legislative process is normally not nimble, but we need to emphasize to everyone involved the urgency of this. That's why I'm sending this message of urgency, right? That's why I wrote The Adolescence of Technology. I wanted policymakers to read it, I wanted economists to read it, I want national security professionals to read it, I want decision makers to read it, so that they have some hope of acting faster than they would have otherwise. Is there anything you can do or advocate that would make it more certain that the benefits of AI are better instantiated? Where I feel like you have worked with legislatures to say, okay, we're going to prevent bioterrorism in this way, we're going to increase security, we're going to increase whistleblower protection. And I just think by default the actual things we're looking forward to here seem very fragile to different kinds of moral panics or political economy problems.
我认为,目前唯一的办法就是灵活应对,尽管立法过程通常不够灵活,但我们需要向所有相关人员强调事情的紧迫性。这就是为什么我发送这一紧急信息的原因。我写了《技术的青春期》这本书,希望政策制定者、经济学家、国家安全专业人士和其他决策者能够阅读,从而有希望比平常更快地采取行动。有没有什么措施或倡导可以让人工智能的益处更加确定地实现?我觉得您以前曾与立法机构合作,比如在防止生物恐怖主义、增强内部安全或提高举报人保护方面,因此我们期待的这些事情在道德恐慌或政治经济问题中似乎非常脆弱。

Yeah, I don't actually agree that much for the developed world. I feel like in the developed world markets function pretty well, and when there's a lot of money to be made on something and it's clearly the best available alternative, it's actually hard for the regulatory system to stop it. We're seeing that in AI itself, right? What I've been trying to fight for is export controls on chips to China, right? That's in the national security interests of the US; that's square within the policy beliefs of almost everyone in Congress, of both parties. And I think the case is very clear, and the counterarguments against it are, I'll politely call them, fishy. And yet it doesn't happen, and we sell the chips, because there's so much money riding on it. That money wants to be made, and in that case, in my opinion, that's a bad thing. But it also applies when it's a good thing. And so if we're talking about drugs and the benefits of the technology, I am not as worried about those benefits being hampered in the developed world.
在发达国家,我并不完全同意这种说法。我觉得在发达国家,市场运作良好,当有什么东西有很多利润可图,并且显然是现有的最佳选择时,监管系统实际上很难阻止它。我们在人工智能方面就看到了这种情况。例如,我一直在争取对出口到中国的芯片实施管制,因为这符合美国的国家安全利益,这也是几乎所有国会两党议员的政策信念。不过,尽管反对该措施的理由我客气地称它们为“牵强”,我们依然没有实施,而且我们继续出售这些芯片,因为涉及的金额非常大,大家都希望从中获利。在我看来,这种情况是不好的。然而,这种情形也适用于好事,即使是好事也难以被阻止。所以,如果我们讨论的是药品和技术带来的好处,我不太担心这些好处在发达国家会受到阻碍。

I am a little worried about them going too slowly, and as I said, I do think we should work to speed up the approval process at the FDA. I do think we should fight against these chatbot bills that you're describing; individually I'm against them, I think they're stupid. But I actually think the bigger worry is the developing world, where we don't have functioning markets, where we often can't build on the technology that we've had. I worry more that those folks will get left behind, and I worry that even if the cures are developed, maybe there's someone in rural Mississippi who doesn't get them as well, right? That's a kind of smaller version of the concern we have in the developing world. And so the things we've been doing are: we work with philanthropists, we work with folks who deliver medicine and health interventions to the developing world, sub-Saharan Africa, India, Latin America, other developing parts of the world. That's the thing I think won't happen on its own.
我有点担心他们的进展太慢,正如我所说,我确实认为我们应该努力加快FDA的审批过程。我还认为我们应该反对你提到的那些关于聊天机器人的法案,我个人反对这些提案,觉得它们很愚蠢。不过,我实际上更担心的是发展中国家,在这些地方我们没有完善的市场机制,往往无法利用我们已有的技术发展。我更担心的是那些国家会被落在后面,即便是疗法研发出来后,也可能有如密西西比州乡村地区的人得不到相应治疗。这在某种程度上就像我们在发展中国家面临的问题。因此,我们一直在做的事情包括与慈善家合作,向撒哈拉以南非洲、印度、拉丁美洲及其他发展中地区提供药品和健康干预。我认为这些事情不会自然而然地发生。

You mentioned export controls. Why can't the US and China both have a country of geniuses in a data center? Why won't it happen, or why shouldn't it happen? Why shouldn't it happen? You know, I think if this does happen, then we could have a few situations. If we have an offense-dominant situation, we could have a situation like nuclear weapons but more dangerous, where either side could easily destroy everything. We could also have a world where it's unstable. The nuclear equilibrium is stable, right, because of deterrence. But let's say there were uncertainty about which AI would win if the two AIs fought. That could create instability, right? You often have conflict when the two sides have a different assessment of their likelihood of winning. If one side is like, oh yeah, there's a 90% chance I'll win, and the other side is like, there's a 90% chance I'll win, then a fight is much more likely. They can't both be right, but they can both think that. But this is like a fully general argument against the diffusion of AI technology, which is the implication of this world. Let me just go on, because I think we will get diffusion eventually.
你提到了有关控制的指数,为什么美国和中国不能都有一个国家级的天才中心?为什么这种情况不会发生或为什么它应该或不应该发生呢? 我认为,如果这种情况真的发生,我们可能会面临几种局面。如果我们进入一个进攻主导的状态,情况可能会像核武器那样,但更加危险,因为无论哪一方都可能轻易摧毁一切。我们也可能处于一个不稳定的世界,就像核平衡是稳定的,因为它是通过威慑来维持的。但是,如果不确定两个人工智能交战时哪一个会胜出,这就可能造成不稳定。当双方对各自胜算的评估不同的时候,冲突往往更容易发生。如果一方认为自己有90%的胜算,另一方也认为自己有90%胜算,那么冲突就更有可能了。双方都不能都对,但他们都可以这样认为。这实际上是一个反对人工智能技术扩散的通用论点,这可能就是这个世界的意义。但我认为,最终我们会实现技术扩散。

The other concern I have is that governments will oppress their own people with AI. I'm worried about some world where you have a country where there's a government that's already building a high-tech authoritarian state. And to be clear, this is about the government, this is not about the people; we need to find a way for people everywhere to benefit. My worry here is about governments. So my worry is that if the world gets carved up into two pieces, one of those two pieces could be authoritarian or totalitarian in a way that's very difficult to displace.
我另一个担忧是,政府可能会利用人工智能来压迫自己的人民。我担心的是,世界上有一些国家,它们的政府正在建立一个高科技的威权国家。需要说明的是,这个问题是关于政府的,不是人民的。我们需要找到一种方法,让每个人都能从中受益。我担心的是,如果世界被分成两部分,其中一部分可能会变得很难撼动的专制或极权。

Now, will governments eventually get powerful AI, and is there a risk of authoritarianism? Yes. Will governments eventually get powerful AI, and is there a risk of kind of bad equilibria? Yes, I think both things. But the initial conditions matter, right? At some point we're going to need to set up the rules of the road. I'm not saying that one country, either the United States or a coalition of democracies, which I think would be a better setup, although it requires more international cooperation than we currently seem to want to make... I don't think a coalition of democracies, or certainly one country, should just say, these are the rules. There's going to be some negotiation, right? The world is going to have to grapple with this. And what I would like is that the democratic nations of the world, those whose governments represent something closer to pro-human values, are holding the stronger hand, have more leverage, when the rules of the road are set. And so I'm very concerned about that initial condition.
政府最终会获得强大的人工智能,对此存在威权主义的风险,对,这种风险确实存在。我认为两种情况都会出现,但初始条件很重要。我们在某个时刻需要设定游戏规则。我并不是说某一个国家,比如美国,或者由民主国家组成的联盟(我认为这会是更好的安排,但这需要比我们当前所愿更多的国际合作)应该单方面制定规则。这个问题需要全球共同面对,我希望世界上的民主国家,那些其政府代表的价值观更接近人类利益的国家,能够在制定规则时拥有更强的影响力和主动权。我对这样的初始条件非常关注。

I was recently listening to an interview from three years ago, and one of the ways it aged poorly is that I kept asking questions assuming there's going to be some key fulcrum moment two to three years from now, when in fact, being that far out, it just seems like progress continues: AI improves, AI is more diffused, people use it for more things. It seems like you're imagining a world in the future where the countries get together and say, here's the world order, here's the leverage we have, here's the leverage you have. When it seems like on the current trajectory everybody will have more AI; some of that AI will be used by authoritarian countries; some of that within the authoritarian countries will be used by private actors versus state actors; it's not clear who will benefit more. It's always unpredictable in advance. It seems like the internet privileged authoritarian countries more than you would have expected, and maybe AI will be the opposite way around. So I want to better understand what you're imagining here.
呃,我最近听了一个三年前的采访,其中有一点老化得不好,就是我当时在提问时总是假设两到三年内会有一些关键的失败或时刻。然而,事实上,从那时到现在,只是进展的持续,人工智能在不断改进,传播得更广,人们会将其用于更多的事情。你似乎在想象一个未来的世界,在那个世界中,各国聚在一起,分享资源和力量。按目前的趋势来看,每个国家都会拥有更多的人工智能,其中一部分会被威权国家使用,而在这些国家里,人工智能可能会由私人部门还是国家部门使用尚不清楚,也不明确谁会因此受益。你知道,提前预测总是很难说。比如,互联网似乎比预期中更有利于威权国家,而人工智能可能反过来会有不同的结果。所以我想更好地理解你在想象的是什么情况。

Yeah, so just to be precise about it: I think the exponential of the underlying technology will continue as it has before, right? The models get smarter and smarter; even when they get to a country of geniuses in a data center, I think you can continue to make the models smarter. There's a question of getting diminishing returns on their value in the world, right? How much does it matter after you've already solved human biology? At some point you can do harder math, you can do more abstruse math problems, but nothing after that matters. But putting that aside, I do think the exponential will continue, but there will be certain distinguished points on the exponential, and companies, individuals, countries will reach those points at different times.
对,我想明确一点的是,我认为基础技术的指数增长会像之前一样持续。模型会变得越来越智能,即使在数据中心达到了天才的级别,我认为可以继续让模型变得更智能。不过,有一个问题是模型对世界的价值会逐渐递减。比方说,在你已经解决了人类生物学问题后,这究竟还有多大意义?或者在某个时刻你能解决更复杂的数学问题,但之后的意义会减少。不过撇开这些不谈,我确实认为这种指数增长会继续,但会在增长曲线上有某些显著的点,各个公司、个人、国家会在不同时间到达这些点。

And so, could there be some... I talk about this in The Adolescence of Technology: is a nuclear deterrent still stable in the world of AI? I don't know, but that's an example of one thing we've taken for granted where the technology could reach such a level that we can no longer be certain of it. And think of others: there are kind of points where, if you reach a certain point, maybe you have offensive cyber dominance and every computer system is transparent to you after that, unless the other side has a kind of equivalent defense.
为了便于理解,我将这一段翻译成中文: 你知道,有一些事情我们习以为常,但我们可能需要重新审视。比如说,核威慑技术仍处于发展的初期阶段,那么在人工智能的时代,它依然是稳定的吗?我并不确定。但这就是一个例子,说明我们可能认为理所当然的技术会达到一个我们无法再确定其稳定性的程度。 想想其他的情况,比如在网络战中,如果你达到了一定的水平,可能就会获得压倒性的网络攻击优势,从而让每一个计算机系统对你来说都是透明的。除非对方也拥有相当的防御能力。

So I don't know what the critical moment is, or if there's a single critical moment, but I think there will be either a critical moment, a small number of critical moments, or some critical window where AI confers some large advantage from the perspective of national security, and one country or coalition has reached it before others. I'm not advocating that they then just say, okay, we're in charge now; that's not how I think about it. The other side is always catching up, there are extreme actions you're not willing to take, and it's not right to take complete control anyway.
所以,我不确定是否存在一个关键时刻,或者是否只有一个关键时刻,不过我认为可能会有一个关键时刻、少数几个关键时刻,或者某个重要的时间窗口,在这期间,人工智能在国家安全方面能够提供巨大的优势,而某个国家或联盟在其他国家之前达到了这一点。我并不是主张说因此他们就可以宣称要领导世界,也不是这样看的,因为总有其他人会迎头赶上,也有一些极端行动是你不愿意采取的,而且完全控制一切本身就是不对的。

But at the point that that happens, I think people are going to understand that the world has changed, and there's going to be some negotiation, implicit or explicit, about what the post-AI world order looks like. And my interest is in making that negotiation one in which classical liberal democracy has a strong hand. Well, I want to understand better what that means, because you say in the essay, quote, autocracy is simply not a form of government that people can accept in the post-powerful-AI era. And that sounds like you're saying the CCP as an institution cannot exist after we get AGI. That seems like a very strong demand, and it seems to imply a world where the leading lab or the leading country will be able to, and by that language should get to, determine how the world is governed, or what kinds of governments are allowed and not allowed.
在那种情况发生的时候,我认为人们会意识到世界已经改变了,并且可能会展开某种形式的谈判,无论是公开的还是隐含的,关于后AI时代的世界秩序会是什么样。我感兴趣的是促使这种谈判在一定程度上赋予古典自由民主一个有力的位置。我想了解这个过程更深层次的含义,因为你在文章中说过,"专制政府是人们在强大的人工智能时代无法接受的",这听起来就像是在说,像中共这样的体制在获得广义人工智能后将无法存在。这似乎是一个非常强硬的要求,似乎意味着在这个过程中,领先的实验室或国家将有能力,并且根据这个说法也应该有权决定世界如何治理,或者哪些类型的政府被允许存在。

Yeah, so in that paragraph I believe I said something like, you could take it even further and say X. So I wasn't necessarily endorsing that view. I was saying, here's a weaker thing that I believe: we have to worry a lot about authoritarians, and we should try to check them and limit their power. You could take this further, to a much more interventionist view, which says that authoritarian countries with AI are these self-fulfilling cycles that are very hard to displace, and so you just need to get rid of them from the beginning. That has exactly all the problems you say, which is: if you were to make a commitment to overthrowing every authoritarian country, they would take a bunch of actions now that could lead to instability. So that may just not be possible. But the point I was making that I do endorse is that it is quite possible that today the view, or at least my view, or the view in most of the Western world, is that democracy is a better form of government than authoritarianism; but it's not like, if a country is authoritarian, we react the way we would react if they committed genocide or something, right? And I guess what I'm saying is, I'm a little worried that in the age of AGI, authoritarianism will have a different meaning; it will be a graver thing. And we have to decide one way or another how to deal with that. The interventionist view is one possible view; I was exploring such views. It may end up being the right view; it may end up being too extreme to be the right view. But I do have hope.
对,就是说,我相信那段话,我想我当时说的是,你甚至可以更进一步地认为x,所以我并不一定在支持那个观点。我是想说,首先,这是一个我相信的较温和的观点,但我认为我们需要非常关注独裁者,并试图限制他们的权力。你可以采取一种更激进的干预观点,认为拥有人工智能的独裁国家是那种难以打破的自我实现的循环,因此你需要从一开始就消除它们。这个观点确实有你提到的问题,比如,如果承诺推翻每个独裁国家,他们可能会采取某些行动导致不稳定,这也许是不可能实现的。但我想表达的,并且我支持的观点是,今天在我或者西方世界的大多数人看来,民主是一种优于专制的政府形式。但是当一个国家专制时,我们的反应并不像面对种族灭绝那样激烈。而我想说的是,我有点担心在人工智能时代,专制会有不同的含义,会更为严重。我们需要决定如何处理这个问题,干预是其中一个可能的观点。我是在探索这些观点,一个可能是正确的,也可能过于极端。但我仍抱有希望。

And one piece of hope I have is: we have seen that as new technologies are invented, forms of government become obsolete. I mentioned this in The Adolescence of Technology, where I said feudalism was basically a form of government, right? And then when we invented industrialization, feudalism was no longer sustainable, no longer made sense. Why is that hope? Couldn't that imply that democracy is no longer going to be a competitive system? It could, right? It could go either way. But these problems with authoritarianism, right, the problems of authoritarianism get deeper, and I just wonder if that's an indicator of other problems that authoritarianism will have. In other words, because authoritarianism becomes worse, people become more afraid of authoritarianism, and they work harder to stop it.
翻译如下: 我心中有一个小小的希望,就是随着新技术的发明,某些形式的政府可能会变得不再适用。我在文章《技术的青春期》中提到过这一点,比如封建主义曾经是一种政府形式,但当我们发明了工业化后,封建主义就不再可持续,也不再合理。为什么说这是希望呢?这是不是意味着民主制度也有可能面临同样的境地?这确实有可能发生,但事情可能朝不同方向发展。我对集权主义的某些问题感到担忧,这些问题可能会日益严重。也就是说,随着集权主义的问题加重,人们可能会更加害怕这种制度,从而更加努力去阻止它。

It's more like you have to think in terms of the total equilibrium, right? I just wonder if it will motivate new ways of thinking, with the new technology, about how to preserve and protect freedom. And even more optimistically, will it lead to a collective reckoning, a kind of more emphatic realization of how important some of the things we take as individual rights are, right? A more emphatic realization that we really can't give these away; we've seen there's no other way to live that actually works. I am actually hopeful, and one way to say it, which sounds too idealistic but I actually believe could be the case, is that dictatorships become morally obsolete, they become morally unworkable forms of government, and the crisis that creates is sufficient to force us to find another way.
意思是:你需要从整体平衡的角度思考问题。我在想,这是否会激发人们以新的思维方式来应对新技术,比如如何维护和保护自由。更乐观地说,这是否会导致集体反思,让人们更加清楚地认识到我们一直认为是个人权利的东西的重要性,并意识到这些权利是不能轻易放弃的。事实上,我其实是抱有希望的。我相信,虽然听起来有些理想化,但可能会出现这样的情况:独裁政权会在道德上显得过时,不再是可行的政府形式,而由此带来的危机将迫使我们找到新的道路。

I think there is genuinely a tough question here which I'm not sure how you resolve, and we've had to come out one way or another on it through history, right? So with China in the 70s and 80s, we decided that even though it's an authoritarian system, we will engage with it. In retrospect, I think that was the right call, because it stayed an authoritarian system, but a billion-plus people are much wealthier and better off than they would otherwise have been, and it's not clear that it would have stopped being an authoritarian country otherwise; you can just look at North Korea as an example of that, right? And I don't know that it takes that much intelligence to remain an authoritarian country that continues to coalesce its own power, and so you can just imagine a North Korea with an AI that's much worse than everybody else's, but still enough to keep power. And so in general, should we just have this attitude that the benefits of AI, in the form of all this empowerment of humanity and health and so forth, will be big? Historically we have decided it's good to spread the benefits of technology widely, even to people whose governments are authoritarian. I guess it is a tough question how to think about it with AI, but historically we have said, yes, this is a positive-sum world, and it's still worth diffusing technology.
我认为这里确实有一个很棘手的问题,我不确定如何解决。在历史上,我们不得不在这个问题上做出某种选择。例如,在70年代和80年代,我们决定与中国接触,尽管它是一个专制政权。从回头看的角度,我认为这是正确的决定。因为尽管它是一个专制国家,但十多亿人的生活变得更富裕,生活质量也得到了提升。否则,这个国家不太可能停止专制,我们可以看看朝鲜作为例子。 我不认为拥有一定程度智能的国家会主动放弃专制,他们更可能会继续巩固自己的权力。可以想象,一个拥有相对落后AI的朝鲜仍然会有足够的能力保持其控制。 总之,是否应该抱着这样的态度:人工智能的好处,包括赋予人类力量、改善健康等,将会非常显著?历史上,我们曾决定广泛传播技术的好处,即便是对那些在专制政府统治下的人来说。我认为如今这个关于人工智能的话题确实是个难题,但历史上我们则普遍认为这是一个正和博弈的世界,因此依旧值得传播技术。

Yeah, so there are a number of choices we have. Framing this as a kind of government-to-government decision, in national security terms, is one lens, but there are a lot of other lenses. You could imagine a world where we produce all these cures to diseases, and the cures to diseases are fine to sell to authoritarian countries, but the chips and the data centers, to start, just aren't, and the AI industry itself. Another possibility, and I think folks should think about this: could there be developments, either ones that naturally happen as a result of AI or ones we could make happen by building technology on AI, where we create an equilibrium in which it becomes infeasible for authoritarian countries to deny their people private use of the benefits of the technology? Are there equilibria where we can give everyone in the authoritarian country their own AI model that defends them from surveillance, and there isn't a way for the authoritarian country to crack down on this while retaining power? I don't know; it sounds to me like, if that went far enough, it would be a reason why authoritarian countries would disintegrate from the inside.
是的,所以我们有很多选择。我认为将这个问题框架为一种政府对政府的决策可以是一个视角,特别是在国家安全的层面上。但还有很多其他视角可以考虑。比如,你可以想象一个世界,我们开发出了各种疾病的治疗方法,并且这些疗法可以出售给独裁国家。然而,芯片和数据中心却不适合出售和使用。还有,关于人工智能产业本身,我们应该考虑另一种可能:是否可以通过人工智能使某些发展自然发生,或者通过构建人工智能技术来推动事情的发展? 我们能否创造一种平衡,使得独裁国家无法阻止其人民私下享受技术带来的好处?有没有可能给独裁国家的每个人提供他们自己的人工智能模型,以便能够抵御监控,而独裁国家在保留权力的同时却无从打击这种现象?我不确定,如果这种情况发展得足够远,它可能会成为独裁国家从内部瓦解的原因。

But maybe there's a middle world where there's an equilibrium: if they want to hold on to power, the authoritarians can't deny kind of individualized access to the technology. But I actually do have hope for the more radical version, which is: is it possible that the technology might inherently have properties, or that by building it in certain ways we could create properties, that have this kind of dissolving effect on authoritarian structures? Now, we hoped so originally, right? Think back to the beginning of the Obama administration: we thought originally that social media and the internet would have that property. Turns out not to. But what if we could try again, with the knowledge of how many things could go wrong, and with a different technology? I don't know that it would work; it's worth a try.
或许存在一个中间地带,即权力能够达到平衡,专制者无法拒绝个体对技术的接触。我其实对一种更激进的可能性抱有希望,那就是,是否有可能技术本身具有某种特性,或者通过某种方式构建技术,使其具备削弱专制结构的效果。回想奥巴马政府初期,我们曾希望社交媒体和互联网具备这一特性,但事实证明并非如此。不过,我不知道,如果我们带着对潜在问题的认识,尝试利用这项不同的技术,也许会有机会。虽然我不确定是否能成功,但值得一试。

Yeah, I think it's very unpredictable, like the reasons why authoritarianism... it's all very unpredictable. I mean, we've just got to recognize the problem, and then we've got to come up with 10 things we can try, and we've got to try those, and then assess whether they're working, or which ones are working, if any, and then try new ones if the old ones aren't working. But I guess, as of today, as you say, we will not sell data centers, or sorry, chips, and the ability to make chips, to China. So in some sense you are denying that there will be some benefits... That's right. ...to the Chinese economy, the Chinese people, etc., because we're doing that. And there would also be benefits to the American economy, because it's a positive-sum world: we could trade, they could have their country of geniuses in a data center doing one thing, we could have ours doing another. And already we
是的,我觉得这很难预测,比如说为什么会出现威权主义,这一切都非常难以预料。我觉得,我们必须认识到这个问题,然后想出10个可以尝试的方法,我们需要尝试这些方法,然后评估它们是否有效,或者哪些有效,如果有的话。如果旧的方法无效,就尝试新的方法。我想,你今天提到不会向中国出售数据中心或芯片,也不会提供制造芯片的能力。从某种意义上来说,这会给中国经济和人民带来一些影响,因为我们这样做。然而,这也对美国经济有好处,因为在一个正和的世界中,我们可以通过贸易互利。中国可以有他们的数据中心做某事,而我们可以有我们的做另一件事。

You're saying it's not worth that positive-sum stipend to empower those countries? What I would say is that we are about to be in a world where growth and economic value will come very easily. If we're able to build these powerful AI models, growth and economic value will come very easily. What will not come easily is distribution of benefits, distribution of wealth, political freedom. These are the things that are going to be hard to achieve. So when I think about policy, I think the technology and the market will deliver all the fundamental benefits almost faster than we can take them, and that these questions about distribution and political freedom and rights are the ones that will actually matter, and that policy should focus on. Okay, so speaking of distribution, as you're mentioning, we have developing countries, and in many cases catch-up growth has been weaker than we would have hoped for.
你在说让这些国家得到一些帮助是不值得的吗?我想说的是,我们将进入一个只要我们能够成功地构建强大的人工智能模型,增长和经济价值就会很容易实现的世界。真正难以实现的是利益的分配、财富的分配、政治自由等。因此,当我考虑政策时,我认为技术和市场几乎能比我们所能接受的更快地提供所有基本利益,而分配、政治自由和权利等问题才是真正重要的,政策应该关注这些问题。 说到分配,你提到我们有发展中国家,在许多情况下,它们的赶超增长比我们预期的要弱。

Yes. When catch-up growth does happen, it's fundamentally because they have underutilized labor: you can bring the capital and know-how from developed countries to these countries, and then they can grow quite rapidly. Yes. Obviously, in a world where labor is no longer the constraining factor, this mechanism no longer works. So is the hope basically to rely on philanthropy from the people, or the countries, that immediately get wealthy from AI? What is the hope? I mean, philanthropy should obviously play some role, as it has in the past, but I think growth is always better and stronger if we can make it endogenous.
是的,当赶超增长发生时,根本原因是这些国家拥有未充分利用的劳动力。可以将资本和技术从发达国家引入这些国家,然后它们就能快速增长。确实,在一个劳动力不再是限制因素的世界中,这种机制不再有效。那么,希望是否基本上寄托于那些因人工智能迅速致富的人或国家的慈善事业呢?我的意思是,慈善当然应该发挥一定作用,就像过去一样,但如果我们能够使增长成为内生增长,那么增长总是更加稳定和强劲的。

Yeah. So what are the relevant industries in an AI-driven world? Look, there's lots of stuff. I said we shouldn't build data centers in China, but there's no reason we shouldn't build data centers in Africa. In fact, I think it'd be great to build data centers in Africa, as long as they're not owned by China. I think that's a great thing to do. We should also build, there's no reason we can't build, a pharmaceutical industry that's AI-driven. If AI is accelerating drug discovery, then there will be a bunch of biotech startups; let's make sure some of those happen in the developing world. And certainly during the transition, I mean, we can talk about the point where humans have no role, but humans will still have some role in starting up these companies and supervising the AI models, so let's make sure some of those humans are humans in the developing world, so that fast growth can happen there as well.
你知道,在一个由人工智能驱动的世界中,哪些行业会变得重要。其实有很多,我们不应该在中国建设数据中心,但没有理由不在非洲建设。实际上,我认为在非洲建设数据中心是很好的主意,只要不被中国控制。此外,我们可以建立一个由人工智能驱动的制药行业。如果人工智能能够加速药物发现,那么就会有许多生物技术初创公司,我们应该确保其中一些公司在发展中国家落地。 当然,在过渡阶段,我们仍然会有一些人工操作,比如创办公司和监督人工智能模型的运行。我们要确保发展中国家的人们在这方面也能发挥作用,以便那里的经济能够快速增长。

You guys recently announced Claude is going to have a constitution that's aligned to a set of values, and not necessarily just to the end user. There's a world you can imagine where, if it is aligned to the end user, it preserves the balance of power we have in the world today, because everybody gets to have their own AI that's advocating for them, and so the ratio of bad actors to good actors stays constant, which seems to work out for our world today. Why is it better not to do that, but to have a specific set of values that the AI should carry forward? Yeah, so I'm not sure I'd quite draw the distinction in that way. There may be two relevant distinctions here, and I think you're talking about a mix of the two. One is: should we give the model a set of instructions, "do this" versus "don't do this"? Yeah. Versus: should we give the model a set of principles for how to act? And there, it's purely a practical and empirical thing that we've observed: by teaching the model principles, getting it to learn from principles, its behavior is more consistent.
你们最近宣布 Claude 将有一个与一组价值观对齐的宪章,而不只是与最终用户对齐。在这种情况下,你可以想象如果它的价值观与最终用户一致,就可以保持现有世界的权力平衡,因为每个人都可以拥有一个为自己辩护的AI,坏人与好人的比例也会保持不变,这似乎对我们今天的世界是有效的。那么,为什么不这样做,而是要让AI传递一组特定的价值观会更好呢? 我不太确定是否能那样划分区别。这里可能有两个相关的区分:一个是我们是否应该给模型一组指令,告诉它做什么或者不做什么;另一个是是否应该为模型提供一组原则来指导其行为。在实践和经验中,我们观察到,通过教模型原则并让它从中学习,它的行为会更加一致。

It's easier to cover edge cases, and the model is more likely to do what people want it to do. In other words, if you say "don't tell people how to hotwire a car," "don't speak in Korean," if you just give it a list of rules, it doesn't really understand the rules, and it's kind of hard to generalize from them, if it's just a list of do's and don'ts. Whereas if you give it principles, and then it has some hard guardrails, like "don't make biological weapons," but overall you're trying to get it to understand what it should be aiming to do and how it should be aiming to operate, then just from a practical perspective, that turns out to be a more effective way to train the model. That's one piece of it: the rules-versus-principles trade-off.
如果我们给模型一些原则,而不是简单的规则列表,它更容易处理极端的情况并更可能按照人们的期望去执行任务。换句话说,如果你告诉模型不要教人怎么偷车、不要说韩语,而只是列出这些禁止事项,它并不会真正理解这些规则,也难以从中概括出其他情况。相比之下,如果设定一些基本原则,并加上一些明确的限制(比如不要制造生物武器),那么模型更容易理解应该如何运作以及目标是什么。从实际效果来看,这种方法更有效。这就是所谓的规则与原则的取舍问题。

Then there's another thing you're talking about, which is the corrigibility versus, I would say, intrinsic-motivation trade-off: how much should the model be, I don't know, like a skin suit or something, where it just directly follows the instructions given to it by whoever is giving those instructions, versus how much should the model have an inherent set of values and go off and do things on its own? And there I would actually say everything about the model is closer to the direction of: it should mostly do what people want, it should mostly follow instructions. We're not trying to build something that goes off and runs the world on its own; we're actually pretty far on the corrigible side. Now, what we do say is that there are certain things the model won't do. I think we say it in various ways in the constitution: under normal circumstances, if someone asks the model to do a task, it should do that task; that should be the default. But if you've asked it to do something dangerous, or if you've asked it to harm someone else, then the model is unwilling to do that.
你提到的另一个问题是关于模型的可纠正性(corrigibility)与内在动机之间的权衡。具体来说,模型到底应该在多大程度上完全按照指令执行任务,也就是像一个听话的"外部工具"一样,只执行指令提供者的命令;以及在多大程度上模型应该拥有自己的一套内在价值观,并能够独立自主地做出选择和行动。 在这个方面,我认为我们目前的模型更倾向于服从指令,即主要按照人类的意愿行事,我们并不是想打造一个可以自主运行世界的模型。目前,我们的立场相当偏向可纠正、服从指令的一侧。 不过,我们也表示,模型不会执行某些类型的任务。例如,在正常情况下,如果有人要求模型完成一个任务,默认情况下模型应该去执行。但如果有人要求模型做危险的事情或者伤害他人,那么模型将拒绝执行这些请求。
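The "mostly corrigible, with principled hard limits" default described above can be caricatured as a tiny decision policy. This is purely an illustrative sketch: the category names, the `HARD_LIMITS` set, and the `respond` helper are invented for this example and do not reflect Anthropic's actual systems or constitution.

```python
# Toy sketch of a corrigible-by-default policy with a small set of
# principled hard limits. All names here are hypothetical.

HARD_LIMITS = {"bioweapons", "harm_to_others"}  # non-negotiable refusals

def respond(task: str, category: str) -> str:
    """Default: do what the user asks; refuse only when a hard limit applies."""
    if category in HARD_LIMITS:
        return f"refuse: {category} violates a hard constraint"
    return f"comply: {task}"

print(respond("summarize this paper", "benign"))       # complies by default
print(respond("synthesize a pathogen", "bioweapons"))  # refuses
```

The point of the sketch is the shape of the policy, not its content: compliance is the default branch, and refusals are exceptions grounded in a short list of principles rather than an exhaustive list of forbidden requests.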

So I actually think of it as a mostly corrigible model that has some limits, but those limits are based on principles. Yeah. I mean, then the fundamental question is how those principles are determined, and this is not a special question for Anthropic, this would be a question for any company, but because you have been the ones to actually write down the principles, I get to ask you this question. Normally a constitution is something you write down, it's set in stone, and there's a process for updating and changing it and so forth. In this case, it seems like a document that people at Anthropic write, that can be changed at any time, that guides the behavior of systems that are going to be the basis of a lot of economic activity. How do you think about how those principles should be set? Yes. So I think there are maybe three sizes of loop here, three ways to iterate. One is we iterate within Anthropic: we train the model, we're not happy with it, and we change the constitution. I think that's good to do. And putting it out publicly, making updates to the constitution every once in a while, saying "here's a new constitution," I think that's good to do, because people can comment on it.
所以,我实际上认为它大体上是一个服从指令、可纠正的模型,但它有一些限制,这些限制是基于原则的。我的意思是,根本的问题在于这些原则是如何确定的,而这并不是一个特定于任何一个公司的问题。不过,由于是你们写下了这些原则,我可以问你这个问题。通常,一部宪法是写下来后就固定不变的,并有一个更新和修改的过程。而在这种情况下,似乎是一个可以随时改变的文件,用来指导系统行为,这将成为很多经济活动的基础。关于这些原则应如何制定,你怎么看? 是的,我认为这里有两种,可能是三种规模的循环,三种迭代的方式。一种是我们可以在公司内部迭代。我们训练模型,如果对结果不满意,我们就会改变宪章。我认为这样做是好的。并且,公开发布宪章的更新,比如说这是一个新的宪章,我认为这样做是好的,因为这样人们可以对其发表评论。

The second level of loop is that different companies will have different constitutions. And I think it's useful: Anthropic puts out a constitution, the Gemini models put out a constitution, other companies put out constitutions, and then you can compare them. Outside observers can critique and say, "I like this thing from this constitution and that thing from that constitution," and that creates a kind of soft incentive and feedback for all the companies to take the best elements of each and improve.
第二层次的循环是,不同的公司会有不同的宪章。我认为这很有用,比如说某公司发布一家宪章,然后像Gemini这样的模型也发布一份宪章,其他公司也发布自己的宪章。这样一来,它们就可以相互比较,外部观察者可以进行点评,指出哪些部分是他们喜欢的,哪些来自某些宪章的条款是好的。这样一来,所有公司都可以获取一些软性激励和反馈,从其他宪章中借鉴最好的元素来改进自己的宪章。

Then I think there's a third loop, which is society beyond the AI companies, and beyond just those who comment on the constitutions without hard power. And there we've done some experiments: a couple of years ago we did an experiment with, I think it was called the Collective Intelligence Project, to basically poll people and ask them what should be in an AI constitution. I think at the time we incorporated some of those changes. So you could imagine, with the new approach we've taken to the constitution, doing something like that. It's a little harder, because that was actually an easier approach to take when the constitution was a list of do's and don'ts; at the level of principles, it has to have a certain amount of coherence. But you could still imagine getting views from a wide variety of people.
我认为还有第三个环节,就是超越AI公司和那些仅仅是评论宪法但没有实际权力的人士之外的社会。在这方面,我们进行了一些实验,比如几年前我们进行了一项可能叫做集体智慧项目的实验,基本上就是收集人们的意见,询问他们认为AI宪法应该包含哪些内容。当时,我们采纳了一些建议。所以,你可以想象我们在宪法的新方法中也采取类似的方式。这确实有点难度,因为以前的宪法就像是个待办事项清单一样简单,而在原则层面上,它需要一定的连贯性。但是,你仍然可以想象从各种各样的人那里收集意见。

And I think you could also imagine, and this is a crazy idea, but hey, this whole interview was about crazy ideas, right? You could even imagine systems of representative government having input. I wouldn't do this today, because the legislative process is so slow; this is exactly why I think we should be careful about the legislative process and AI regulation. But there's no reason you couldn't, in principle, say: all AI models have to have a constitution that starts with these things, and then you can append other things after it, but there has to be this special section that takes precedence.
我认为你可以想象一下,这只是一个疯狂的想法,但嘿,你知道这次采访本来就是关于疯狂想法的,对吧?所以,你甚至可以想象一种代表政府的系统能够参与其中。举个例子,虽然我不会在今天这样做,因为立法过程太慢了,这正是我认为我们在立法过程和人工智能监管方面需要小心的原因。但从原则上讲,你可以说,所有的人工智能模型都必须有一个宪法,首先包含这些内容,然后可以在后面附加其他内容。但必须有一个特别的部分具有优先性。

I wouldn't do that. That's too rigid; that sounds overly prescriptive, in the way that I think overly aggressive legislation is. But that is the thing you could try to do, or maybe some much less heavy-handed version of it. I really like control loop two, where, obviously, this is not how constitutions of actual governments do or should work, where there's not this vague sense in which the Supreme Court will feel out how people are feeling, what the vibes are, and then update the Constitution accordingly.
我不会那样做,那太死板了,听起来有点过于规定化,这让我觉得有些激进的立法也是这样的。不过这是一种你可以尝试的方法,虽然有较不强硬的版本。我很喜欢控制回路二,显然这与实际政府宪法的运作方式和应该运作的方式不同,因为宪法不会模糊地依赖最高法院感受人们的情绪,然后根据这些感觉来更新宪法。

So with actual governments there's a more procedural process. Yeah, exactly. But you actually have a vision of competition between constitutions, which is very reminiscent of how some libertarian charter-city people used to talk about what an archipelago of different kinds of governments could look like, where there would be selection among them for which could operate most effectively, in which place people would be the happiest. And in a sense you're recreating that vision. Yeah, like the, say, Archipelago utopia. Again, I think that vision has things to recommend it and things that will kind of go wrong with it. I think it's an interesting, in some ways compelling, vision, but also things will go wrong with it that you hadn't imagined. So I like loop two as well, but I feel like the whole thing has got to be some mix of loops one, two, and three, and it's a matter of the proportions, right? I think that's got to be the answer. When somebody eventually writes the equivalent of The Making of the Atomic Bomb for this era, what is the thing that will be hardest to glean from the historical record, that they're most likely to miss?
所以,对于实际的政府来说,确实是有一个更加程序化的过程。没错,但实际上你提出了一种宪法之间竞争的愿景,这让人想起一些自由意志主义者谈论过的理想城市,就像一个由不同类型政府组成的群岛,人们可以从中选择哪个运作最有效,哪个地方的人们最幸福。从某种意义上说,你其实是在重新创造这种愿景。这个构思有好的方面,但也有可能出问题的地方。我认为这是一个有趣且在某些方面有吸引力的愿景,但也可能会出现你未曾想到的问题。就像是不同环节的循环,我觉得整个体系需要将一、二、三区循环某种比例地结合在一起。最终的答案可能在于这种比例的把握。未来如果有人写出类似《原子弹的诞生》这种关于我们这个时代的书,哪些信息是最难从历史记录中获取的,哪些是他们最有可能忽略的。

I think a few things. One is, at every moment of this exponential, the extent to which the world outside it didn't understand it. This is a bias that's often present in history, where anything that actually happened looks inevitable in retrospect. So I think when people look back, it will be hard for them to put themselves in the place of people who were actually making a bet on this thing happening, when it wasn't inevitable; that we had these arguments, like the arguments I make for scaling, or that continual learning will be solved; that some of us internally, in our heads, put a high probability on this happening, but there's a world outside us that's not acting on that at all. And I think the weirdness of it, and unfortunately the insularity of it: if we're one year or two years away from it happening, the average person on the street has no idea. That's one of the things I'm trying to change, with the memos, with talking to policymakers. But I don't know, I think that's just a crazy thing.
我认为有几件事情值得注意,其中一个是,在这个指数增长的每个时刻,外界对它的理解程度都很有限。这是一种历史上常见的偏见,即任何已经发生的事情在回顾时都显得不可避免。因此,我认为当人们回顾过去时,他们会很难理解那些实际上在为这种不确定的事情下注的人们的心态。我们曾有过诸如规模化和持续学习将得到解决这样的争论。我们当中一些人在心里对此的发生给予了很高的概率,但外界并没有采取任何行动来应对这种可能性。我认为这种情况的怪异之处在于其封闭性,即便我们离实现仅有一两年的时间,普通大众对此却毫无头绪。这也是我努力改变的事情之一,比如通过备忘录与政策制定者沟通。但我不知道,我只是觉得这真是一件疯狂的事情。

Finally, I would say, and this probably applies to almost all historical moments of crisis, how absolutely fast it was happening, how everything was happening all at once. So decisions that you might think were carefully calculated, well, actually you have to make that decision and then make thirty other decisions on the same day, because it's all happening so fast, and you don't even know which decisions are going to turn out to be consequential. So one of my worries, although it's also an insight into what's happening, is that some very critical decision will be one where someone just comes into my office and is like, "Dario, you have two minutes: should we do thing A or thing B?" Someone gives me this random half-page memo, and it's like, should we do A or B? And I'm like, "I don't know, I have to eat lunch, let's do B." And that ends up being the most consequential thing ever. Hmm.
嗯,我最后想说的是,这可能适用于几乎所有历史上的危机时刻,就是事件的发展速度实在太快了,一切似乎都同时发生。因此,那些你以为经过深思熟虑的决策,其实都是在极短的时间内做出的。有时候,你可能一天内就要做出三十个决策,因为事情发生得太快了,而且你甚至不知道哪些决策会产生重大影响。 我有一个顾虑,同时这也是对现状的一种洞察,就是一些非常关键的决策可能是以一种非常随意的方式做出的。比如,有人走进我的办公室,对我说:"Dario,你有两分钟,应该选择方案A还是方案B?"可能只给我一张内容不到半页的备忘录,我在还没吃午饭的情况下随口说:"就选B吧。"结果,这个决定可能会成为最重要的一个。这种情况让我感到担忧。嗯。

So, final question. It seems like there are not many tech CEOs who are writing fifty-page memos every few months, and it seems like you have managed to build a role for yourself, and a company around you, which is compatible with this more intellectual type of CEO role. I want to understand how you construct that. How does it work, that you just go away for a couple of weeks and then tell your company, "here's the memo, here's what we're doing"? It's also reported you write a bunch of these internally. Yeah. So for this particular one, I wrote it over winter break; I was having a hard time finding the time to actually write it. But I actually think about this in a broader way. I think it relates to the culture of the company. I probably spend a third, maybe 40%, of my time making sure the culture of Anthropic is good.
最后一个问题,你似乎不像那些通常每隔几个月就写50页备忘录的CEO。你好像为自己和你的公司创造了一种更适合这种更具思考性的角色的规则。我想了解你是如何构建这种模式的。这是怎么运作的?是你出去几周,然后告诉公司“这是备忘录,我们要这样做”吗?据说你在内部写了很多这样的备忘录。嗯,其实对于这份备忘录,我是在冬季休假期间写的。我当时很难找到时间来真正着手写它,但从更广泛的角度来看,我认为这与公司的文化有关。所以,我大概花三分之一,甚至40%的时间来确保Anthropic的企业文化良好。

As Anthropic has gotten larger, it's gotten harder to just get directly involved in the training of the models, the launch of the models, the building of the products. It's 2,500 people. I have certain instincts, but it's very difficult to get involved in every single detail; I try as much as possible. But one thing that's very leveraged is making sure Anthropic is a good place to work: people like working there, everyone thinks of themselves as team members, everyone works together instead of against each other. And we've seen, as some of the other AI companies have grown, without naming any names, we're starting to see incoherence and people fighting each other. I would argue there was even a lot of that from the beginning, but it's gotten worse. I think we've done an extraordinarily good job, even if not perfect, of holding the company together: making everyone feel the mission, that we're sincere about the mission, and that everyone has faith that everyone else there is working for the right reasons; that we're a team; that people aren't trying to get ahead at each other's expense or backstab each other, which, again, happens a lot at some of the other places.
随着Anthropic的规模变大,参与诸如模型训练、模型发布、产品开发这些具体事务变得越来越困难,现在有大约2500人。我有自己的直觉和判断,但很难参与到每个细节中。我尽量去做这些事情,但其中一个重要任务是确保Anthropic成为一个良好的工作环境,员工们喜欢在这里工作,大家认为自己是团队成员,并且合作共事而不是互相对抗。在其他一些AI公司中(我不会点名),随着它们的发展,我们开始看到内部不和,员工之间互相争斗。我认为这些问题从一开始就存在,只是现在更严重了。但我认为我们在保持公司团结和让每个人都感受到我们的使命方面做得非常出色,即便不完美。我们努力让每个人相信,公司的每位成员都在为正确的理由工作,我们是一个团队,没有人试图通过损害他人来取得进步或背后捅刀子,这种情况在其他一些公司中是常见的。

And how do you make that the case? I mean, it's a lot of things. It's me, it's Daniela, who runs the company day to day, it's the co-founders, it's the other people we hire, it's the environment we try to create. But I think an important thing in the culture is that the other leaders as well, but especially me, have to articulate what the company is about: why it's doing what it's doing, what its strategy is, what its values are, what its mission is, and what it stands for. And when you get to 2,500 people, you can't do that person by person; you have to write, or you have to speak, to the whole company. This is why I get up in front of the whole company every two weeks and speak for an hour. Internally, I do two things. One, I write this thing called the DVQ, the Dario Vision Quest. I wasn't the one who named it that; that's the name it received, and it's one of those names I kind of tried to fight, because it made it sound like I was going off and smoking peyote or something, but the name just stuck.
要把这些事情做好需要很多因素,你知道的,比如我自己,Daniela 她每天负责公司的日常运营,还有我们的联合创始人,以及我们聘请的其他人,还有我们努力创造的公司环境。但是,我认为公司文化中很重要的一点是,其他领导者,尤其是我,必须明确表达公司的宗旨:为什么公司要做这些事情,公司战略是什么,价值观是什么,使命是什么,以及公司代表什么。你知道,当公司有2500名员工时,你不能逐一与每个人沟通,必须通过写作或在全公司面前演讲来传达。因此我每两周在全公司面前演讲一小时。在内部我做两件事:一,我写一个叫做 DVQ(Dario Vision Quest,"Dario 寻梦之旅")的东西。这个名字不是我起的,是大家给它取的。我曾试图反对这个名字,因为它让我听起来像是要出去嗑致幻剂之类的,但这个名字就这样沿用了下来。

So I get up in front of the company every two weeks, I have a three- or four-page document, and I just talk through three or four different topics about what's going on: internally, the models we're producing, the products, the outside industry, the world as a whole as it relates to AI, and geopolitics in general, some mix of that. And I go through it very honestly. I just say, "this is what I'm thinking, this is what Anthropic leadership is thinking," and then I answer questions. That direct connection, I think, has a lot of value that is hard to achieve when you're passing things down the chain six levels deep. A large fraction of the company comes to attend, either in person or virtually, and it really means you can communicate a lot.
所以,我每两周都会站在公司面前,拿着一份三四页的文件,讲述三到四个关于内部事务的主题。比如我们正在研发的模型、产品,还有外部行业、全球范围内与人工智能相关的动态以及一些地缘政治方面的问题。我会非常诚实地分享自己和Anthropic领导层的想法,然后回答问题。我认为这种直接的沟通方式非常有价值,因为通过六层的层级传递信息无法达成这种效果。公司里很大一部分员工都会参加,不论是亲自到场还是通过虚拟方式,这种沟通能够传递大量信息。

Then the other thing I do is I have a channel in Slack where I just write a bunch of things and comment a lot, often in response to things I'm seeing at the company, or questions people ask, or, you know, we do internal surveys and there are things people are concerned about, so I'll write them up. I'm very honest about these things; I just say them very directly. The point is to get a reputation for telling the company the truth about what's happening, to call things what they are, to acknowledge problems, to avoid the sort of corporate-speak, the kind of defensive communication that is often necessary in public, because the world is very large and full of people who are interpreting things in bad faith. But if you have a company of people who you trust, and we try to hire people that we trust, then you can really just be entirely unfiltered. I think that's an enormous strength of the company. It makes it a better place to work, it makes people more than the sum of their parts, and it increases the likelihood that we accomplish the mission, because everyone is on the same page about the mission and everyone is debating and discussing how best to accomplish it.
然后,我做的另一件事是,我在Slack上有一个频道,我常常在上面写很多东西并发表评论。通常,这是针对我在公司里看到的事情,或者是人们问的问题,或者是我们进行内部调查时人们关注的问题。我会将这些写下来,非常诚实地表达,直接说明问题是什么,承认存在的问题,避免使用那种在公众场合常见的公司式防御性交流,因为外面的世界很大,总有人会恶意解读。但在公司内部,如果你信任彼此,而我们也努力招聘我们信任的人,那么你可以完全不加掩饰。这是公司的一大优势,改善了工作环境,让大家的协作更加紧密,有助于更好地完成任务,因为每个人都在同一频道上讨论如何最好地实现目标。

Hmm, well, in lieu of an external Dario Vision Quest, we have this interview; this interview is a little like that. Thanks for doing it. Yeah, thank you, Dwarkesh. Hey everybody, I hope you enjoyed that episode. If you did, the most helpful thing you can do is just share it with other people who you think might enjoy it. It's also helpful if you leave a rating or a comment on whatever platform you're listening on. If you're interested in sponsoring the podcast, you can reach out at dwarkesh.com/advertise. Otherwise, I'll see you at the next one.
嗯,虽然没有外部的"Dario 寻梦之旅",我们有这次访谈,它有点像那样。谢谢你参与,谢谢你,Dwarkesh。大家好,希望您喜欢这一集。如果您喜欢,最有帮助的就是和其他可能也会感兴趣的人分享。如果您在使用的平台上留下评分或评论也很有帮助。如果您有兴趣赞助这个播客,可以访问 dwarkesh.com/advertise 联系我们。那么,我们下一集见。