What Meta learned from Galactica, the doomed model launched two weeks before ChatGPT

Key Takeaways:

– Meta released a research demo called Galactica, a large language model for science, trained on 48 million scientific papers.
– Galactica received backlash for producing unscientific and offensive output, leading to its removal after three days.
– OpenAI released ChatGPT two weeks after Galactica.
– ChatGPT also had issues with generating false information, but it quickly gained popularity with millions of users.
– Galactica was intended as a research project and not a product, but the gap between expectations and reality was too big.
– Lessons from Galactica’s release led to the development of Meta’s next generation of models, including Llama.
– Llama, released in February 2023, sparked a debate about open-source AI and was followed by Llama 2 and Code Llama.
– Meta limited access to Llama to researchers who filled out a request form, a restriction Yann LeCun attributed to the negative reaction to Galactica’s release.

VentureBeat:


One year ago, and two weeks before OpenAI released ChatGPT, Meta released a research demo called Galactica. Galactica was an open source “large language model for science” trained on data including 48 million scientific papers, and Meta touted its ability to “summarize academic literature, solve math problems, generate Wiki articles, write scientific code, annotate molecules and proteins, and more.”

Galactica survived publicly for only three days. On November 17, Meta took down the demo after an outcry over a problem whose name had not yet made it into the mainstream: hallucinations. Many were appalled by Galactica’s sometimes very unscientific output, which, like that of other LLMs, included information that sounded plausible but was factually wrong and in some cases also highly offensive.

At the time, Meta chief scientist Yann LeCun stuck up for the model and posted a series of defensive tweets (“It’s no longer possible to have some fun by casually misusing it. Happy?”), but to no avail. Galactica would not be the game-changing model for the generative AI era.

Two weeks later, ChatGPT was released into the wild

That same week, however, tantalizing rumors about the upcoming release of GPT-4 — which some predicted could be in a few months — made the rounds. And just two weeks later, as many AI researchers attending NeurIPS in New Orleans whispered hopefully that OpenAI might release GPT-4 at the conference, suddenly there it was — ChatGPT, released into the wild.


Of course, it soon became clear that ChatGPT had its own hallucination problem. Like Galactica and other generative AI models, ChatGPT spit out eloquent, confident responses that often sounded plausible and true even if they were not. OpenAI made this weakness very clear in its blog post announcing ChatGPT and explained that fixing it is “challenging.”

Still, that did not slow down ChatGPT’s ride to LLM stardom: Over the past year it has become one of the fastest growing services of all time, with an estimated 100 million monthly users in just two months and, now, 100 million weekly users.

However, Galactica’s legacy endures. “There were a lot of good lessons learned,” Joelle Pineau, VP of AI research at Meta, recently told VentureBeat. “That’s a good model — I still get a lot of requests from people who want the model.”

Pineau emphasized that Galactica was never meant to be a product. “It was absolutely a research project,” she said. “We released with the intent, we did a low-key release, put it on GitHub, the researcher tweeted about it.”

But everyone got so excited by it, she explained. “The gap between the expectation, and where the research was, was too big.” People were surprised by things like hallucinations that would hardly be news a year later, she added — and Galactica’s level of hallucination was actually lower than other models because it was fine-tuned on scientific literature.

“Suddenly people had a product expectation, like you would use it to actually write your papers — no, that’s not the intent,” she said.

Galactica lessons led to decisions about Llama release

Meta pulled down the Galactica demo, Pineau explained, “to make sure that people were not misled into using it,” adding that it was not released with a responsible use guide “which we’ve learned to do.”

Overall, Pineau said, “If I was to do it today, we would just manage the release.” She added that Meta “probably misjudged” the expectations around Galactica, but “the lessons of that have been folded into our next generation of models.”

That next generation of models was Llama, Meta’s large language model that took the AI research world by storm in February 2023 — followed by the commercial Llama 2 in July and Code Llama in August. With Llama, the first major free ‘open source’ LLM (Llama and Llama 2 are not fully open by traditional license definitions), open source AI began to have a moment — and a red-hot debate — that has not ebbed all year long.

When Llama was released on February 24, Meta was careful — Yann LeCun, in sharing the paper, posted that “Meta is committed to open research and releases all the models [to] the research community under a GPL v3 license.”

When asked why researchers had to fill out a form to get access to Llama, LeCun retorted: “Because last time we made an LLM available to everyone (Galactica, designed to help scientists write scientific papers), people threw vitriol at our face and told us this was going to destroy the fabric of society.”


AI Eclipse TLDR:

One year ago, two weeks before the release of OpenAI’s ChatGPT, Meta unveiled a research demo called Galactica. Galactica was an open-source “large language model for science” trained on 48 million scientific papers. It had various capabilities, including summarizing academic literature, solving math problems, generating Wiki articles, and writing scientific code. However, Meta took Galactica down after three days because of an outcry over its output, which sometimes included factually incorrect and offensive information. Meta’s chief scientist, Yann LeCun, defended the model but did not change public opinion. Two weeks later, OpenAI released ChatGPT, which also produced misleading responses. Despite these problems, ChatGPT gained immense popularity and now has an estimated 100 million weekly users. Although Galactica was never intended to be a product, it taught Meta valuable lessons about managing expectations. Meta’s subsequent models, such as Llama and Llama 2, have incorporated these lessons and have contributed to the ongoing debate around open-source AI.