- BI got early access to Manus, which claims to be the world's first fully autonomous AI agent.
- It structured tasks well but stumbled in execution — hallucinating data and creating clunky designs.
- Here's how it did at capturing public opinion on DOGE, and building a startup from scratch.
We tested Manus, the new general AI agent from China that promises to be the future of AI helpers, requiring minimal human oversight.
Since its launch last week, it has already been praised by AI experts and industry observers, with some even calling it "the second DeepSeek."
For now, Manus is invite-only, but I was among a small proportion of waitlisted users granted access.
I wanted to see if it could live up to its promise as a fully autonomous general AI agent.
Here's what I asked it to do — and how it handled those tasks.
Task 1: Analyze DOGE sentiment in news and social media
Manus claims to be able to scrape the internet, analyze public discourse, and map real-time sentiment shifts on social media and news sites.
I asked it to analyze how the public is reacting to federal workforce cuts under the Department of Government Efficiency, or DOGE.
From its initial response to my prompt, things looked promising.
But Manus didn't really get the memo.
First, it couldn't find any reactions on social media — despite the fact that federal workforce cuts have been making headlines for many weeks.
Rather than stopping to ask if I wanted real news articles, it simulated public discourse about DOGE.
Then, it got worse.
For the next five minutes, I watched it generate fake social media reactions and accounts and completely made-up tweets. It even attributed posts that did not appear to be real to real websites.
At no point did it ask if I wanted this. I didn't.
This went on for 20 minutes. There was an option to step in and take control, but that seemed at odds with the whole point of a supposedly fully autonomous agent capable of working independently.
The final report pulled fake data from real websites, including Taxpayers for Common Sense, described as "a fiscally conservative watchdog organization with the highest overall influence in news coverage."
But Manus' claim that these were the most influential voices on DOGE was questionable at best. Among its top-listed sources was a Medium blog called Progressive Times, which hadn't published anything since 2017 — long before DOGE existed.
As for social trends, Manus appears to have provided made-up X and Reddit users and listed them as driving online discourse about DOGE.
The one redeeming feature was the visualization of its — completely fake — dataset. The way it categorized sentiment, made predictions, and generated visual breakdowns was impressive.
It could have been useful if it had been working with real data. But since it wasn't, it just felt like a very polished way of presenting entirely fabricated information.
At first glance, the report looked legitimate, complete with a convincing reference list. But only at the very end — buried in fine print — was a disclaimer saying the entire 10-page analysis was of synthetic data.
If someone needed a real sentiment analysis and wasn't actively monitoring the agent's actions, they'd end up with useless results.
It left me with very little confidence going into the next task.
Task 2: Launch a business to solve the rising price of eggs
For this test, I asked Manus to develop a startup to tackle rising egg prices. Admittedly, my request was ambitious: I wanted a business plan, a founder's backstory, a fully designed website, brand guidelines, a marketing strategy, and even a logo and business card.
From the moment I hit return, Manus was enthusiastic, entrepreneurial, and organized, a stark contrast to the earlier test, where it made up data and needed constant course correction.
This time, it got off to a smooth start. The process looked structured and methodical.
Manus was fantastic at outlining multiple strategies and managing expectations throughout.
Things were looking up!
Halfway through, it offered to show me progress, revealing the first branding assets for my new business: Eggonomy™, a "direct-to-consumer egg savings platform."
The logo's odd, petri dish-esque design felt like it had been lifted from the pages of a school science textbook. It also provided a basic business card with the slogan "eggs without the price shock."
But I held out hope. Given the scale of the task, I expected it to take much longer, and it didn't appear to hit any technical roadblocks.
The process was clear, fast, and easy to follow — until it wasn't.
After half an hour, Manus told me the final product, Eggonomy™, was ready.
I was taken aback by a first look at the website, which looked clean and vaguely egg-related.
But something was off.
The blog section featured random, unrelated posts that had nothing to do with eggs.
It didn't take long to figure out why. Eggonomy already existed. The website wasn't generated from scratch, and its domain was registered in 2016, according to domain checking services.
At least the business strategy appeared to be backed by real data and market research.
Manus was great at brainstorming brand names, structuring business plans, and analyzing the main competitors — but its execution was way off.
Worse, it wasn't transparent about lifting an existing website — unlike in the DOGE task, where it at least admitted to using synthetic data.
Manus isn't ready to go solo yet
Manus is fascinating to watch in action, but for now, it's far from the fully autonomous agent it claims to be.
That said, the two tests I threw at Manus weren't formal or scientific. On the GAIA benchmark, a more robust measure of AI utility, Manus claims to outperform OpenAI's Deep Research and GPT-4.
It's not ready to work alone yet, but this is still an early version of the tool.
It could be a powerful AI assistant if it stabilizes, improves data reliability, and stops making things up. For now, it's more of a research intern than a fully independent operator.
Manus did not immediately respond to Business Insider's request for comment.