OpenAI announces the Operator AI agent that can browse the web for you
Operator, an OpenAI agent capable of automating multi-step tasks, has arrived.
The maker of ChatGPT launched a preview of Operator on Thursday, explaining how it works and what it can do. Operator can browse the web and perform tasks such as calculating a refund from a canceled order or finding customers who meet certain criteria in an internal sales database. It can also buy groceries and send emails.
On a computer, Operator can perform a variety of tasks, such as downloading files, merging PDFs, analyzing spreadsheets, and exporting images.
With Operator, OpenAI is following through on its promise to make 2025 the year of agentic AI. Last week, the company launched Tasks for ChatGPT, which lets users schedule future tasks such as sending a daily tech news brief or setting reminders. While many of these tasks are already possible with basic tools like Google Alerts and calendars, Tasks is an early example of AI bots doing the legwork for the user. Combined with the release of Operator and its ability to automate complex tasks, you can begin to see OpenAI’s vision of turning ChatGPT into a valuable tool that builds on its core product.
The model behind Operator is a Computer-Using Agent (CUA) that uses GPT-4o’s vision capabilities to “see” what is on the screen through screenshots and to work with graphical user interfaces (GUIs), which lets Operator interact with the screen (by clicking buttons, typing, scrolling, etc.).
Operator in action, browsing for a Yosemite campsite with picnic tables.
Credit: OpenAI
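To make the screenshot-and-act cycle concrete, here is a minimal Python sketch of that kind of loop. It is not OpenAI's implementation: FakeScreen, Action, scripted_policy, and run_agent are hypothetical names, the "screenshot" is just a text label, and a scripted lookup stands in for the model that would decide what to do next.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Action:
    kind: str         # "click", "type", or "scroll"
    target: str = ""  # hypothetical label for an on-screen element
    text: str = ""    # only used by "type" actions


class FakeScreen:
    """Stand-in for a browser window; it only records what was done."""

    def __init__(self) -> None:
        self.log: List[str] = []
        self.step = 0

    def screenshot(self) -> str:
        # A real agent would receive pixels; a text label stands in here.
        return f"frame-{self.step}"

    def apply(self, action: Action) -> None:
        entry = f"{action.kind}:{action.target}"
        if action.text:
            entry += f"={action.text}"
        self.log.append(entry)
        self.step += 1


def scripted_policy(frame: str) -> Optional[Action]:
    """Assumed placeholder for the model: maps each frame to one action."""
    script = {
        "frame-0": Action("click", "search box"),
        "frame-1": Action("type", "search box", "Yosemite campsite picnic tables"),
        "frame-2": Action("scroll", "results"),
    }
    return script.get(frame)  # None means the task is finished


def run_agent(
    screen: FakeScreen,
    policy: Callable[[str], Optional[Action]],
    max_steps: int = 10,
) -> List[str]:
    """Look at the screen, pick an action, apply it, and repeat."""
    for _ in range(max_steps):
        frame = screen.screenshot()
        action = policy(frame)
        if action is None:
            break
        screen.apply(action)
    return screen.log
```

Calling run_agent(FakeScreen(), scripted_policy) walks through the three scripted steps and then stops once the policy has nothing more to do. The max_steps cap is a design choice in this sketch so that a confused policy cannot loop forever.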
OpenAI’s safety approach to Operator
Obviously, safety is a major concern for an autonomous AI agent like Operator. OpenAI says it has addressed the risks in several ways. Operator limits misuse by refusing harmful or illegal tasks, and it cannot access restricted sites such as gambling and adult entertainment sites or drug and gun retailers.
And OpenAI is looking over your shoulder as you use Operator. The announcement states that “user interactions are reviewed in real-time by automated security inspectors designed to ensure compliance with Usage Policies and have the ability to issue warnings or block prohibited activities,” and that the company has developed “automated detection and human review pipelines to detect prohibited use in key policy areas, including child safety and fraudulent activities.”
Because Operator could make costly mistakes without human supervision, the model will ask for confirmation “before sending an order, sending an email, etc., so the user can double-check the model’s work before it becomes permanent.” Operator is also currently restricted from “high-risk transactions such as banking.”
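The site restrictions and the confirmation step can be pictured as a small gate that every action passes through. The Python sketch below is an illustration only, not how OpenAI built it: the category labels, the action names, and the gate function are all assumptions made for the example.

```python
from typing import Callable

# Hypothetical labels for the kinds of sites the article says are off-limits.
BLOCKED_CATEGORIES = {"gambling", "adult", "drugs", "guns"}

# Hypothetical names for actions that are hard to undo once carried out.
NEEDS_CONFIRMATION = {"place_order", "send_email"}


def gate(action: str, site_category: str, confirm: Callable[[str], bool]) -> str:
    """Decide what happens to one proposed action.

    `confirm` stands in for asking the user; it returns True to approve.
    """
    if site_category in BLOCKED_CATEGORIES:
        return "blocked"
    if action in NEEDS_CONFIRMATION:
        return "executed" if confirm(action) else "cancelled"
    return "executed"
```

In this sketch, gate("place_order", "shopping", lambda a: False) returns "cancelled" because the user declined, while a harmless action such as scrolling goes straight through without a prompt.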
Operator availability
This is where OpenAI’s new premium subscription tier, ChatGPT Pro, comes in. The Operator preview is available only in the US, and only to Pro users, who pay $200 per month. Over time, OpenAI expects to expand availability to Plus, Team, and Enterprise users.