Multimodal Sources

OpenAI Just Released ChatGPT Agent, Its Most Powerful Agent Yet

> We have some tools to enable the model to be able to further extend this context length beyond what's the original, like the harder limit.

Show: Training Data · Publisher: Sequoia Capital · Host: Sonya Huang, Pat Grady

Episode URL: https://pscrb.fm/rss/p/traffic.megaphone.fm/CPUAI3656372926.mp3?updated=1753134664

Publish date: 2025-07-22
Duration: N/A
Default source credibility: HIGH — Sequoia partners interview frontier-lab founders + F500 AI buyers. VC-hosted — portfolio-company framing on recommendations; named guest metrics stay HIGH. Peer-tier to No Priors in quality.

  • ChatGPT Agent combines the Deep Research and Operator tools, enabling complex tasks with state shared across a text browser, a GUI browser, and a terminal.
  • The agent can hold multi-turn conversations, execute long-running tasks, and collaborate with users, making it suitable for use cases such as data analysis, slide creation, and online shopping.
  • Training the agent involved reinforcement learning across thousands of virtual machines, emphasizing safety and mitigations to prevent harmful actions.

Extracted quotes

| # | Credibility | Speaker | Org | Timestamp | Topic | Quote |
|---|---|---|---|---|---|---|
| 1 | HIGH | Casey Chu (Researcher) | OpenAI | 11:07 | 01-ai-native-landscape | We have some tools to enable the model to be able to further extend this context length beyond what’s the original, like the harder limit. So that the model is able to perform task by documenting what it’s doing and step by step, like kind of like increase the time like it can do, the task, the horizon of the task it can do without the human’s interruption. |
| 2 | HIGH | Casey Chu (Researcher) | OpenAI | 20:01 | 01-ai-native-landscape | We have a long list of mitigations, and the team has worked really hard to stack together a bunch of techniques to really try to make the model as safe as possible. So, you know, one example that I’ll call out is that we have a monitor that looks, kind of looks over its shoulder and just sees if anything looks funny, like whether it’s going on a weird website or anything like this. |
| 3 | HIGH | Isa Fulford (Researcher) | OpenAI | 34:46 | 02-corporate-tools | We have some evaluation like data science bench. We evaluate the model and it actually outperforms the human baseline. So in some sense, it’s actually superhuman in some research tasks that we can rely on. |

Per-quote detail

1. Casey Chu — OpenAI (11:07)

We have some tools to enable the model to be able to further extend this context length beyond what’s the original, like the harder limit. So that the model is able to perform task by documenting what it’s doing and step by step, like kind of like increase the time like it can do, the task, the horizon of the task it can do without the human’s interruption.

  • Credibility: HIGH — Specific technical detail on extending context length, unscripted interview.
  • Topic tag: 01-ai-native-landscape

2. Casey Chu — OpenAI (20:01)

We have a long list of mitigations, and the team has worked really hard to stack together a bunch of techniques to really try to make the model as safe as possible. So, you know, one example that I’ll call out is that we have a monitor that looks, kind of looks over its shoulder and just sees if anything looks funny, like whether it’s going on a weird website or anything like this.

  • Credibility: HIGH — Specific safety measures and monitoring techniques, unscripted interview.
  • Topic tag: 01-ai-native-landscape

3. Isa Fulford — OpenAI (34:46)

We have some evaluation like data science bench. We evaluate the model and it actually outperforms the human baseline. So in some sense, it’s actually superhuman in some research tasks that we can rely on.

  • Stat: Model outperforms human baseline in data science tasks, measured by data science bench.
  • Credibility: HIGH — Specific performance metric comparing model to human baseline, unscripted interview.
  • Topic tag: 02-corporate-tools

Extracted 2026-04-14T19:13:02 via scripts/podcast_mine.py (MLX mlx-community/Qwen2.5-32B-Instruct-4bit).