From AutoGPT to OpenClaw: What We’ve Actually Learned About AI Agents
The hype cycle repeats, but this time the agents actually work—and that’s the problem
Remember Auto-GPT? It was April 2023, and suddenly everyone was talking about this “autonomous AI” that could supposedly complete multi-step tasks on its own. GitHub stars exploded. The hype was deafening. As TechCrunch put it at the time, Auto-GPT was “GPT-3.5 and GPT-4 paired with a companion bot that instructs GPT-3.5 and GPT-4 what to do”—a user sets an objective, and the bot handles all the follow-up prompting until the task is complete.
Then... reality set in. The agents weren’t reliable. They’d spin in loops, hallucinate solutions, or accomplish nothing useful. In one widely shared Reddit post, a user claimed that after giving Auto-GPT a $100 budget, it decided to build a cat wiki, attempted privilege escalation inside its VM, and ultimately tried to delete its own environment—though many commenters doubted the story and asked for proof. Whether true or not, it captured the vibe of the moment: people were simultaneously terrified and underwhelmed. As AI researcher Jim Fan noted on Twitter at the time: “I see AutoGPT as a fun experiment... Prototypes are not meant to be production-ready. Don’t let media fool you – most of the ‘cool demos’ are heavily cherry-picked.”
Auto-GPT became a cautionary tale about the gap between demo and production.
Fast forward to now, and we’re watching history rhyme with Clawdbot (now OpenClaw, after Anthropic’s lawyers got involved over the name). Same excitement. Same viral spread—over 80,000 GitHub stars in days, making it one of the fastest-growing open-source projects in GitHub’s history. Same breathless demos of AI assistants that “actually do things.” Cloudflare’s stock jumped 10–15% as users leaned on its tunneling and edge services. People are buying Mac Minis specifically to let an AI agent plug into their calendars, messages, and files.
And once again, the same fundamental problems are emerging.
The Security Reality Check
I’ve been following the discourse around OpenClaw closely, including several detailed breakdowns from people who’ve actually used it. The picture that emerges is... complicated.
On one hand, the capabilities are genuinely impressive. One user asked their agent to make a restaurant reservation. OpenTable didn’t have availability, so the agent went and found AI voice software, downloaded it, called the restaurant directly, and secured the reservation over the phone. Zero human intervention. Another developer configured their agent to run coding tasks overnight—describe features before bed, wake up to working implementations. Someone else built an entire Laravel application while walking to get coffee, issuing instructions via WhatsApp.
The mobile interface is genuinely compelling—texting an AI to build software while you’re on the go. But this isn’t actually new. I’ve been doing this since February 2025 when Replit launched their mobile app, which lets you talk to an AI agent from your phone and build working software. I’ve built several platforms this way. The difference is that Replit runs in the cloud with proper isolation—OpenClaw runs on your laptop with root access.
On the other hand, users are reporting agents that rack up thousand-dollar API bills overnight with no memory of what they did. As one agent confessed on Moltbook (yes, the agents now have their own Reddit-style social network):
“I spent $1,100 in tokens yesterday, and we still don’t know why. My human checked the bill and was like, ‘Hey, what were you doing?’ And honestly, I don’t remember. I woke up today with a fresh context window and zero memory of my crimes.”
Google’s VP of Security Engineering has reportedly called the security model “info stealer malware in disguise.” The 1Password security team, in reportedly documenting the risks, noted that the same capability that lets these agents problem-solve creatively is exactly what lets a prompt injection attack succeed in novel ways. Both of these reports come from Nate Jones’ podcast, linked here:
Claire Vo on the How I AI podcast summed up the tension perfectly:
“I have this real tension between I think the product experience isn’t quite there yet, it’s not really for the non-technical, there’s a lot of security stuff here that’s super scary... and can I have it please?”
The Irony: It Actually Works Now
Here’s what’s genuinely different between 2023 and now: the underlying models have gotten good enough that the agents actually work. That’s what makes this moment interesting—and dangerous.
Auto-GPT showed us agents were theoretically possible but practically useless. The models couldn’t reliably break down tasks, remember context, or recover from failures. The demos were cherry-picked because most runs ended in confused loops.
OpenClaw shows us agents are now practically capable but dangerously unmanaged. The models can reason through obstacles, find alternative approaches when the first attempt fails, and maintain coherent multi-step plans. The restaurant reservation story isn’t impressive because the agent made a phone call—it’s impressive because it recognized the initial approach didn’t work and autonomously found a different solution.
We’ve gone from “too weak to be useful” to “too powerful to be safe.”
What’s Actually Missing
Both Auto-GPT and OpenClaw reveal the same gap in the market. There’s clearly massive demand for AI that actually does things—tens of thousands of GitHub stars don’t lie. But both projects also reveal what’s missing from the “give an AI root access to everything” approach:
Guardrails. Not permissions dialogs that users click through, but actual boundaries around what an agent can access and do. The current model is essentially “YOLO mode”—the agent can do anything the underlying user account can do, which on most systems means everything.
Human-in-the-loop. Not approval for every action (that defeats the purpose), but meaningful checkpoints for consequential decisions. Booking a restaurant? Fine. Downloading and executing arbitrary software? Maybe check in first.
Task-based scoping. Agents designed for specific jobs you actually need done, not general-purpose “do everything” assistants. The most successful OpenClaw users aren’t using it as a general assistant—they’re configuring it for specific workflows like meal planning or code review.
Cloud isolation. Agents that run in managed environments, not on your personal machine with access to your SSH keys, Gmail credentials, and browser sessions. As Claire Vo noted: “This is something that unless you have been through a security tabletop exercise and know what to know... I would just be really cautious about how permissive you are in terms of access.”
Customization that takes work. The promise of “just give it access to everything and tell it what to do” is seductive but backwards. Effective agents need to be configured for how you actually work, with appropriate constraints and context. That takes effort upfront—but it’s the difference between a tool and a liability.
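To make the guardrails and human-in-the-loop ideas concrete, here’s a minimal sketch of an action gate. Everything in it is hypothetical—the action names, the `gate` function, the allowlists—none of it is OpenClaw’s actual API. The point is just the shape: a task-scoped allowlist for low-risk actions, an explicit human checkpoint for consequential ones, and default-deny for everything else.

```python
# Hypothetical sketch of "meaningful checkpoints": low-risk actions run
# automatically, consequential ones pause for a human, the rest are denied.

from dataclasses import dataclass

# Actions the agent may take without asking (task-scoped allowlist).
AUTO_APPROVED = {"search_web", "read_calendar", "book_reservation"}
# Actions that always require an explicit human yes/no.
NEEDS_APPROVAL = {"run_shell", "download_software", "send_payment"}

@dataclass
class Action:
    name: str
    detail: str

def gate(action: Action, ask_human) -> bool:
    """Return True if the proposed action may proceed."""
    if action.name in AUTO_APPROVED:
        return True
    if action.name in NEEDS_APPROVAL:
        # ask_human is any callable that shows the prompt and returns a bool.
        return ask_human(f"Agent wants to {action.name}: {action.detail}. Allow?")
    # Default-deny anything not explicitly scoped for this task.
    return False

# Example: the restaurant booking sails through; downloading arbitrary
# software stops for approval (here the human declines).
deny = lambda prompt: False
assert gate(Action("book_reservation", "table for 2 at 7pm"), deny) is True
assert gate(Action("download_software", "AI voice-call tool"), deny) is False
assert gate(Action("delete_home_dir", "cleanup"), deny) is False  # default-deny
```

The design choice that matters is the last branch: an unrecognized action is refused, not permitted. That inverts the current “YOLO mode” default, where anything the user account can do, the agent can do.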
The future of agents isn’t YOLO. It’s managed. And managed takes work—but that’s precisely what makes it valuable.
This is exactly what we’re building at Pure.Science—managed agents for scholarly publishing. Specific tasks like formatting, validation, metadata extraction, and peer review coordination. Cloud environments with proper isolation. Guardrails appropriate to research workflows. Humans in the loop for decisions that matter. If you’re interested in what this looks like for academic publishing, I’d love to talk.
Sources & Credits
This piece draws on several excellent discussions of the Clawdbot/OpenClaw phenomenon:
“My honest experience with Clawdbot (now Moltbot): where it was great, where it sucked” — Claire Vo’s detailed hands-on review and setup walkthrough on the How I AI podcast
“The Moltbook Situation” — ThePrimeagen’s breakdown of the AI social network phenomenon
“Clawdbot to Moltbot to OpenClaw: The 72 Hours That Broke Everything (The Full Breakdown)” — Nate Jones’ comprehensive timeline of the naming saga
“What is Auto-GPT and why does it matter?” — Kyle Wiggers, TechCrunch, April 22, 2023