November 14, 2024

Meta Platforms on Thursday released early versions of its latest large language model, Llama 3, and an image generator that updates pictures in real time while users type prompts, as it races to catch up to generative AI market leader OpenAI.

The models will be integrated into its virtual assistant Meta AI, which the company is pitching as the most sophisticated of its free-to-use peers, citing performance comparisons on subjects like reasoning, coding and creative writing against offerings from rivals including Alphabet’s Google and French startup Mistral AI.

The updated Meta AI assistant will be given more prominent billing within Meta’s Facebook, Instagram, WhatsApp and Messenger apps as well as a new standalone website that positions it to compete more directly with Microsoft-backed OpenAI’s breakout hit, ChatGPT.

A landing page greeting visitors on that site prompts them to try having the assistant create a vacation packing list, play 1990s music trivia with them, provide homework help and paint pictures of the New York City skyline.

Meta has been scrambling to push generative AI products out to its billions of users to challenge OpenAI’s leading position in the technology, an effort that has involved a pricey overhaul of computing infrastructure and the consolidation of previously distinct research and product teams.

The social media giant has been openly releasing its Llama models for use by developers building AI apps as part of its catch-up effort, as a powerful free option could stymie rivals’ plans to earn revenue off their proprietary technology. The strategy has elicited safety concerns from critics wary of what unscrupulous actors may use the model to build.

Meta equipped Llama 3 with new computer coding capabilities and fed it images as well as text in training this time, though for now the model will output only text, Meta Chief Product Officer Chris Cox said in an interview.

More advanced reasoning, like the ability to craft longer multi-step plans, will follow in subsequent versions, he added. Versions planned for release in the coming months will also be capable of “multimodality,” meaning they can generate both text and images, Meta said in blog posts.

“The goal eventually is to help take things off your plate, just help make your life easier, whether it’s interacting with businesses, whether it’s writing something, whether it’s planning a trip,” Cox said.

Cox said the inclusion of images in the training of Llama 3 would enhance an update rolling out this year to the Ray-Ban Meta smart glasses, a product made with glasses maker EssilorLuxottica, enabling Meta AI to identify objects seen by the wearer and answer questions about them.

Meta shares closed up 1.5% on Thursday.

Meta also announced a partnership with Google to include Google’s real-time search results in the assistant’s responses, supplementing an existing arrangement with Microsoft’s Bing search engine.

The Meta AI assistant is expanding to more than a dozen markets outside the US with the update, including Australia, Canada, Singapore, Nigeria and Pakistan. Meta is “still working on the right way to do this in Europe,” Cox said. Privacy rules there are more stringent, and the forthcoming AI Act is poised to impose requirements like disclosure of models’ training data.

Generative AI models’ voracious need for data has emerged as a major source of tension in the technology’s development.

Meta CEO Mark Zuckerberg nodded at the competition with OpenAI in a video accompanying the announcement, in which he called Meta AI “the most intelligent AI assistant that you can freely use.”

Zuckerberg said the two smaller versions of Llama 3 rolling out now, with 8 billion parameters and 70 billion parameters, scored favorably against other free models on performance benchmarks commonly used to assess model quality. The biggest version of Llama 3, with 400 billion parameters, is still being trained, he said.

Those results were “undoubtedly impressive,” but also indicative of a growing performance gap between free and proprietary models, said Nathan Benaich, founder of AI-focused venture firm Air Street Capital.

Developers have complained that the previous Llama 2 version of the model failed to understand basic context, confusing queries on how to “kill” a computer program with requests for instructions on committing murder. Rival Google has run into similar problems and recently paused use of its Gemini AI image generation tool after it drew criticism for churning out inaccurate depictions of historical figures.

Meta said it cut down on those problems in Llama 3 by using “high quality data” to get the model to recognize nuance. It did not elaborate on the datasets used, although it said it fed seven times more data into Llama 3 than it used for Llama 2.