- Elon Musk said xAI's chatbot Grok will release "watchable" full-length films next year.
- XAI tutors have annotated short video clips, including footage from Hollywood films, to train the AI, workers say.
- XAI is one of many AI companies that are navigating complex copyright concerns.
Elon Musk wants Grok's name in lights.
Grok Imagine, an image and video generation tool, debuted in July. Musk said earlier this month that the company plans to release a "watchable" full-length film by the end of 2026, and "really good movies" in 2027. He has hyped the tool's generation capabilities, sharing everything from reenactments of the final scene of "King Kong" to a version of "Iron Man" in which he plays Tony Stark.
Over the past few months, employees at the AI company have worked on multiple internal video annotation projects.
In August, dozens of AI tutors began painstakingly annotating short video clips for a project internally referred to as "Vision," three people with knowledge of the initiative said. Vision's onboarding process had workers label footage from Universal Pictures' "Hellboy II: The Golden Army," according to internal documents viewed by Business Insider.
Workers were instructed to perform a detailed labeling process on five- to 10-second video clips, the people said. They labeled shot composition, camera depth and view, cinematography style, and lighting, the documents show. Workers also provided in-depth breakdowns of each scene's setting and every object in the field of view.
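For a sense of the structured record such a labeling pass could produce, here is a minimal, purely hypothetical sketch in Python. The record type, field names, and sample values are assumptions drawn from the categories workers described; they do not reflect xAI's actual annotation schema or tooling.

```python
from dataclasses import dataclass, field

@dataclass
class ClipAnnotation:
    """Hypothetical annotation record for a 5- to 10-second video clip.

    Field names mirror the categories workers described (shot composition,
    camera depth and view, cinematography style, lighting, setting, objects).
    They are illustrative only, not xAI's actual schema.
    """
    clip_id: str
    duration_seconds: float               # clips were roughly 5 to 10 seconds long
    shot_composition: str                 # e.g., framing and subject placement
    camera_depth: str                     # e.g., shallow vs. deep focus
    camera_view: str                      # e.g., wide shot, close-up, low angle
    cinematography_style: str             # e.g., handheld, dolly, color grade
    lighting: str                         # e.g., low-key, natural, practical sources
    setting_description: str              # in-depth breakdown of the scene's setting
    objects_in_view: list[str] = field(default_factory=list)

# Example usage with invented values:
example = ClipAnnotation(
    clip_id="clip_0001",
    duration_seconds=7.5,
    shot_composition="centered subject, symmetrical framing",
    camera_depth="deep focus",
    camera_view="wide establishing shot",
    cinematography_style="slow dolly movement, warm color grade",
    lighting="soft ambient light with practical highlights",
    setting_description="an ornate underground chamber",
    objects_in_view=["throne", "stone columns", "torches"],
)
```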
As with much of the AI industry, the rules around using copyrighted material to train models are fluid and complex. Some Hollywood studios and other rights-holders argue such training could infringe on copyright, while some tech companies contend it's necessary to build sophisticated products.
Whichever side wins out will help determine how AI tools like Grok evolve, and who will profit from the creative work that feeds the system.
"At every stage of the process — downloading the data, storing the data, filtering, then with outputs, at every stage there is possible infringement," Matt Blaszczyk, a research fellow at the University of Michigan Law School, told Business Insider. "The question is if they're doing it for the machine to learn or to generate outputs."
In response to a detailed list of emailed questions about this story, xAI wrote, "Legacy Media Lies." The company responded with the same message to multiple follow-up emails requesting clarification.
Spokespeople for Universal Pictures did not respond to a request for comment. (In August, Universal Pictures began adding warnings to its films that the content "may not be used to train AI.")
Two workers said they recalled annotating clips from other Hollywood films and TV shows as part of their work on Vision. The project also involved annotating creator-made videos and foreign films, workers said.
Two workers described Vision as similar to an exercise they'd expect to see in film school, and more detailed than most projects they'd worked on at xAI.
Employees also worked on a separate video project referred to as "Moongazer," which involved identifying individual elements of the clips, like transitions, captions, and infographics. Workers said the clips included news segments, amateur videos, tutorials, and foreign films.
'Heckboy'
XAI is one of many AI companies attempting to produce its own videos — and walking a fine legal line in the process.
Mark Lemley, the director of Stanford University's Program in Law, Science and Technology, told Business Insider that Hollywood studios need to find a balance between protecting their work and encouraging technology that could benefit them in the future.
"Part of finding that balance is that if we want the technology to work well, it has to be trained on quality work," he added. "You'll get worse AI if you're only using amateur videos or if you're limited to a small subset of licensed material."
OpenAI expressed a similar sentiment in a submission to the House of Lords communications and digital select committee last year.
"Because copyright today covers virtually every sort of human expression — including blogposts, photographs, forum posts, scraps of software code, and government documents — it would be impossible to train today's leading AI models without using copyrighted materials," the company wrote.
In June, Disney and Universal filed a joint copyright infringement lawsuit against text-to-image AI company Midjourney, which has said it plans to release an AI video service. The complaint alleges the company trained its AI models on copyrighted material from movies. Midjourney said in a court filing that AI training is a form of "fair use" protected under copyright law.
Anthropic settled a copyright infringement lawsuit for $1.5 billion last month. The company was accused of using pirated books to train its large language model. Several news organizations have also pursued lawsuits against AI companies. In February, Business Insider joined several other news organizations in suing the AI company Cohere over claims of copyright infringement. (Cohere has filed a motion to dismiss the suit and argued that its use of the content is fair use.)
Hayleigh Bosher, an intellectual property researcher at Brunel University, told Business Insider that the legal system is still rushing to keep up with the rapid pace of AI innovation.
"The key factor seems to be whether the output will compete commercially with the original work and what that means for the market," Bosher said.
Some AI companies have implemented guardrails to prevent their AI models from spitting out copyrighted material.
When OpenAI released the newest version of Sora, its AI video generation app, it allowed users to create videos featuring characters from their favorite films and TV shows. A few days later, the company restricted users' ability to generate copyrighted characters.
OpenAI CEO Sam Altman wrote in a blog post that the company planned to "give rightsholders more granular control over generation of characters, similar to the opt-in model for likeness but with additional controls." The company also said on Monday that it is working with actor Bryan Cranston to limit deepfakes on its video app.
In a handful of tests, AI image generation tools and chatbots with image generation capabilities, such as ChatGPT, Midjourney, and Gemini, were inconsistent about restricting the production of copyrighted images.
When Business Insider initially asked ChatGPT to create an image of Hellboy, the large language model said it "can't create or modify copyrighted characters like Hellboy directly." The bot offered to make "something inspired by Hellboy" instead.
The result: a red-skinned and horned demon named "Heckboy."
In later tests, the chatbot offered to create "a lookalike homage" and eventually agreed to add the title "Hellboy" to the image.
xAI's Grok chatbot provided an image of Hellboy, and its Grok Imagine feature gave dozens of options for images and short clips of AI-generated Hellboy when Business Insider tested it.
Spokespeople for OpenAI, Google, Midjourney, and Anthropic did not respond to requests for comment.
Lemley said it is risky for models to generate copyrighted content and added that "the suggestion of how to create something similar to Hellboy seems particularly problematic."
Yelena Ambartsumian, an AI governance and intellectual property lawyer, told Business Insider that she expects many AI companies to try to train on as much high-quality content as possible.
"Their bet is: 'We're going to develop this and claim it's transformative, so we don't have to pay for the work. Our company will be a success and we can afford to pay for it later, or our company will fail and it won't matter,'" she said.
Do you work for xAI or have a tip? Contact this reporter via email at [email protected] or Signal at 248-894-6012. Use a personal email address, a nonwork device, and nonwork WiFi; here's our guide to sharing information securely.