Sam Altman's Blockbuster Announcement: OpenAI to Unveil an Open-Source Model, GPT-5 Advances Toward Full Multimodal Capabilities (Full 10,000-Word Transcript)
In a recent interview with YC president Garry Tan, OpenAI founder Sam Altman previewed the latest progress on OpenAI's large models: the company is about to launch a powerful open-source model, and he hinted at some of GPT-5's capabilities.
The interview is packed with information. Here are the key takeaways:
1. OpenAI is about to launch an open-source model, and GPT-5 is moving toward full multimodality
Sam Altman previewed that OpenAI is about to release an open-source model that will far exceed expectations. It will let people run a powerful model locally, which should greatly accelerate the spread of AI technology and innovation built on top of it.
GPT-5 is expected to launch this summer. It is a multimodal model that supports inputs such as voice, images, code, and video. GPT-5 will not fully realize OpenAI's ultimate vision for its models, but it will be an important step toward it.
OpenAI's ultimate fully multimodal model would combine deep reasoning with the ability to conduct in-depth research, generate video in real time, write large amounts of code, instantly create new applications for users, and even render live video the user can interact with. When OpenAI achieves full multimodality, it will amount to a brand-new computer interface.
2. The state of AI development and product innovation: huge potential and rapidly falling costs
Sam Altman believes the capabilities of current AI models (such as o3) far exceed what existing products take advantage of, leaving a huge product overhang. Even if model capabilities stopped improving, they could still support a large number of new products.
The cost of using AI models is dropping sharply. For example, the price of o3 fell to one-fifth of its previous level in a single week, and this trend will continue. The improvement in price-performance is astonishing.
Sam Altman highlighted ChatGPT's memory feature, which points toward AI becoming a persistent companion that understands the user, runs continuously, proactively offers help, and is integrated into all kinds of devices and services. Users have already begun to treat ChatGPT as a kind of operating system, connecting it to many data sources in their lives.
3. This year is the "Year of Agents," and agents will act like "junior employees"
Sam Altman echoed OpenAI president Greg Brockman's view that this year is the "Year of Agents." He described AI agents as "Level 3 AGI" (L3): systems that can act like a junior employee and perform hours of work at a computer, taking a task and independently carrying out a series of operations to complete it. Much of the world's work that is done at a computer in chunks of a few hours could be handed to agents, and today's models can already support building many such experiences.
Sam Altman divides the path to AGI into five levels, L1 to L5: Chatbots (L1), Reasoners (L2), Agents (L3), Innovators (L4), and finally Organizations (L5).
4. Seize the moment of technological change: now is the best time in the history of technology to start a company
Sam Altman urged entrepreneurs to seize this moment of technological change: now is the "best time in the history of technology to start a company." Like the invention of the transistor, AI will improve quality of life faster and more profoundly than what came before, and entrepreneurs should use this historic opportunity to create new value for the world. When an industry's underlying technology shifts this dramatically, large companies often falter, while small companies iterate faster and at lower cost.
He also encouraged entrepreneurs not to copy OpenAI's core chat assistant but to focus on unmet markets and unique challenges, because the most successful and defensible companies usually do not follow the same path as everyone else.
Original video link:
https://www.youtube.com/watch?v=V979Wd1gmTU
The following is the full text of the interview:
OpenAI is committed to achieving AGI
Garry Tan: Hi, Sam. Thank you so much for joining us and for all the inspiration you've brought. OpenAI itself is a real source of inspiration for anyone truly ambitious. Maybe we can start with this. Were there any seemingly small decisions in the early days that became extremely crucial turning points?
Sam Altman: The decision to do it at all was the big one. We almost didn't start OpenAI. Artificial general intelligence (AGI) sounded crazy. I had my job at the time, and we had plenty of other great things to do, any of which would have worked. Next to all those promising startups, AGI felt like a daydream, and even if it was possible, DeepMind was so far ahead that it seemed impossible to catch up. So throughout 2015 we kept debating whether to start it, and it was almost a coin-toss decision. I think that's the story of many ambitious things: they all look very hard, and there are plenty of good reasons not to do them. It really takes a group of people sitting in a room, looking at each other, and saying, "Okay, let's do it." Those are very important moments, and I think when in doubt, you should be brave.
Garry Tan: So there were countless reasons, a billion reasons people could give for why you shouldn't do it. And at the beginning, even... one of the things you went on to discover was the scaling laws.
Sam Altman: It's really hard to remember what it was like back then. Next year will be our tenth anniversary. Thank you. But to recall the atmosphere around artificial intelligence ten years ago: it was long before the first language models that actually worked. We were trying to get AI to play video games. We had a small robotic hand that could just barely solve a Rubik's Cube. We had no idea about products, no revenue, and honestly didn't think we would ever have any. We were just sitting around the conference table, staring at the whiteboard, trying to come up with ideas for papers to write. It's really hard to explain. It seems obvious now, but back then it seemed nearly impossible. The idea of a chatbot was entirely in the realm of science fiction.
Garry Tan: One thing that struck me is that you essentially put out a call for people to work on AGI research, and at the same time you found the smartest people in the world who were already working on it. And that second part was actually easier than it sounds.
Sam Altman: If you say you're going to do a crazy thing that is exciting and important if it succeeds, and no one else is doing it, you can actually attract a lot of people. So we said, "Okay, we're going to pursue AGI." 99% of the world thought we were crazy; 1% really resonated with it. It turns out there are a lot of smart people in that 1%, and they had almost nowhere else to go, so we were able to really concentrate talent. It's a mission people care about: even if it seems unlikely, it will be hugely valuable if it succeeds. We've seen this with many startups. If you're doing the same thing as everyone else, it's hard to concentrate talent, and it's hard to get people to truly believe in a mission. If you're working on something unique, you have a real tailwind.
Garry Tan: Okay. So some people in this room might be thinking: should I try to start an OpenAI? Should I think at that scale from the beginning? You also built Loopt earlier. Are there lessons you took from that?
Sam Altman: OpenAI didn't start as a large-scale project. OpenAI was a few people in a room, and then 20 people in a room, and it was very unclear what to do. We were just trying to write a good research paper. The things that eventually became very important didn't look like that at the start. I think it's important to dream that it could be very big if it succeeds, but nothing big starts that way. There's a quote I've always liked: there's a big difference between a $0 million startup and a $0 billion startup, but both of them have exactly zero revenue. Both are a few people sitting in a room, just trying to get the first thing to work. So my only advice on getting started is to choose something big, a market that in some future version could be huge if it succeeds. But other than that, it's just one small, dumb-seeming step after another for a long time.
GPT-4o and the future of reasoning models
Garry Tan: The way people use ChatGPT has changed a lot. The way people use your API has changed a lot. For the latest models like o3, what surprises you the most? What emerging behaviors or use cases have impressed you so far?
Sam Altman: I think we're in a very interesting period; we haven't had one like this for a while. There's a big product overhang: what the models are capable of is way up here, and the products people have actually built so far are way down here. Even if the models don't get any better, and of course they will, there's still a huge number of new things to build. And just last week the cost of o3 dropped to a fifth of what it was, and that will keep happening. I think people will be shocked by how fast the price for a given level of performance falls.
We're about to launch an open-source model. I don't want to steal the team's thunder by announcing it in advance, but I think you'll all be shocked. I think it will be far better than you expect, and the ability to run very powerful models locally will really surprise people. So you have a world where model capabilities have moved into genuinely new territory, API costs will drop significantly, and open-source models will be very good, and I don't think we've seen the corresponding level of new product innovation yet. Reasoning models are still quite new, which makes this a special time to start a company that takes advantage of them. It's like a new square on the periodic table, an ingredient no one has really used yet. It's only in the last month or so that we've really started to see startups emerge that get it: reasoning models are different, the whole interaction model is different, and those companies are built for that from the ground up.
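As a rough illustration of what building on reasoning models can look like at the API level (a minimal sketch, not anything described in the interview), OpenAI's o-series models accept a reasoning-effort setting, so an application can trade latency and cost against depth of reasoning per request. The model name and prompt below are just examples.

```python
# Minimal sketch: calling a reasoning model with an explicit effort level.
# Assumes the official `openai` Python SDK and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
    model="o3-mini",              # example o-series reasoning model
    reasoning_effort="high",      # spend more reasoning tokens on harder problems
    messages=[
        {"role": "user", "content": "Plan the schema migration for splitting a users table."},
    ],
)

print(response.choices[0].message.content)
```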
ChatGPT memory and vision
Garry Tan: For me, memory has made it feel like having a conversation with someone who actually knows me. It's very interesting. Memory is my favorite feature you've launched this year.
Sam Altman: Most people at OpenAI probably wouldn't pick this one, since we've launched a lot of products, but I love the memory feature in ChatGPT. I think it points to where we hope our products go: you'll have an entity that understands you, is connected to all your stuff, and proactively helps you. It won't just be that you send a message and it replies. It will run continuously. It will look at your stuff. It will know when to send you a message, and when to do something on your behalf. You'll have new kinds of devices, and it will be integrated into every other service you use. You'll have this thing running with you throughout your life. I think memory is the first moment where people can glimpse that.
Garry Tan: You hinted at this a little on Twitter before. When is that coming? Can you give us an early sense of the timing?
Sam Altman: I think the answer is: gradually. If I had a date in mind, I'd probably be too excited not to tell you. But it's a bit like memory, right? With memory you get a little of it. When it runs continuously, you get a little more. When it runs in the background and pushes things to you, more still. When we launch the first new device, much more. But I think the key isn't any one piece of hardware; it's this thing reaching the point where it runs in the background and feels like an AI companion.
Garry Tan: I think we're starting to see how powerful it is to integrate large language models (LLMs) with your real data. I've heard rumors that MCP support is coming to OpenAI. Has anything surprised you in the actual integrations? Have you seen people wiring it into their core databases? At YC, we've built that kind of agent infrastructure for internal use, and we've been using it.
Sam Altman: People are really starting to treat ChatGPT like an operating system for their whole lives and to plug everything into it. Integrating as many data sources as possible matters a lot: devices you carry with you all the time, new kinds of web browsers, connections to all your data sources, memory, and then models that run continuously. Put those together and I think you get to something quite powerful.
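To make "connecting a data source" concrete, here is a minimal sketch of one common pattern today, not a description of OpenAI's or YC's internal setup: a developer exposes a data source to the model as a tool via function calling in the OpenAI API. The `search_notes` tool below is hypothetical.

```python
# Minimal sketch: exposing a personal data source to a model as a callable tool.
# Assumes the official `openai` Python SDK; `search_notes` is a hypothetical tool.
from openai import OpenAI

client = OpenAI()

tools = [{
    "type": "function",
    "function": {
        "name": "search_notes",
        "description": "Search the user's personal notes and return matching snippets.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string", "description": "Search terms"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "What did I decide about the Q3 roadmap?"}],
    tools=tools,
)

# If the model needs data, it returns a tool call; the app runs the search and
# sends the result back in a follow-up message.
print(response.choices[0].message.tool_calls)
```

MCP (the Model Context Protocol mentioned above) standardizes a similar tool-and-resource interface so that one data source can be exposed to many different AI clients.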
Garry Tan: Do you think it will be in the cloud in the future, or on our desktops? Or some kind of combination of the two?
Sam Altman: Some combination of all of those. There will definitely be people running local models for certain things. For example, if we could shift half of the ChatGPT workload onto your local device, no one would be happier than us: I think we'll soon be running the largest and most expensive infrastructure project in the world, so offloading part of it would be great. But a lot will still run in the cloud.
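As a toy illustration of the local side of that split (a sketch only; the checkpoint name is a placeholder, not OpenAI's forthcoming open model), a small open-weights model can be run on-device with the Hugging Face `transformers` library:

```python
# Minimal local-inference sketch using an open-weights checkpoint.
# The model ID is a placeholder; substitute any open model available locally.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="Qwen/Qwen2.5-0.5B-Instruct",  # placeholder small open model
    device_map="auto",                    # needs `accelerate`; omit to run on CPU
)

prompt = "Draft a two-sentence summary of today's standup notes."
output = generator(prompt, max_new_tokens=80, do_sample=False)
print(output[0]["generated_text"])
```

In the hybrid world Altman describes, latency-sensitive or private requests could be served by something like this locally, while heavier multimodal and reasoning workloads stay in the cloud.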
Garry Tan: Are you surprised that it's so difficult to obtain computing resources?
Sam Altman: We've gotten very good at it, but yes, it's hard. Two and a half years ago chatgpt.com didn't exist at all, and now it's one of the top five largest websites in the world; hopefully it becomes the third largest at some point if our current growth rate continues. Scaling that fast is hard no matter what. Normally a new company gets much more time to build out its infrastructure.
GPT-5 and the vision of multimodal super-models
Garry Tan: But there are a lot of people willing to help, and the work you've done is truly incredible. We see reasoning models like o3 and o4-mini developing in parallel with multimodal models like GPT-4o. What happens when those two threads merge? What's the vision for GPT-5 and beyond?
Sam Altman: We won't get all the way there with GPT-5, but ultimately we'd really like one integrated model that can reason deeply and, when needed, generate things like real-time video. You ask a question, and you can imagine it thinking very hard, doing some research, writing a bunch of code on the fly for a brand-new application just for you, or rendering live video that you can interact with. I think that will feel like a brand-new kind of computer interface. AI is already a bit like that, but when we get to a model that is truly fully multimodal, perfect video, perfect coding, everything, plus deep reasoning, it will feel very powerful.
Scaling up robots
Garry Tan: That seems like only a small step away from embodiment. With vision, language, and reasoning together, you basically have the kind of robot we want.
Sam Altman: Our strategy has always been to get that first part right and then make sure we can connect it to robots. But the robot era is coming. I'm really excited about a world where, when you subscribe to the highest tier of ChatGPT, we throw in a free humanoid robot.
Garry Tan: The future will be really crazy. Being able to have robots that can do real work in the real world.
Sam Altman: I think we're not far from that now. The mechanical engineering of robots has always been quite difficult, and the AI cognitive part is also quite difficult, but it feels within reach. I think in a few years, robots will start to do very useful things. It will still take some time to manufacture a billion robots, but I don't know. I'm interested in the question of "how many robots are