Google Bard gets image generation and a more capable Gemini Pro to take on ChatGPT

Key Takeaways:

– Google is updating its Bard AI chatbot to compete with OpenAI’s ChatGPT.
– Bard now includes image generation capabilities powered by the Imagen 2 AI model.
– Gemini Pro, the most powerful language model currently available from Google, will be available in over 40 languages.
– Bard’s double-check feature validates responses by searching the web.
– The Imagen 2 model allows Bard to generate custom visuals based on text descriptions.
– Image generation on Bard is consistent but may fail in some cases.
– Google Bard allows users to report legal issues and limits offensive or explicit content.
– Google is experimenting with ImageFX, a tool for image generation.
– The AI Test Kitchen includes other experimental projects such as MusicFX and TextFX.


Google is updating its Bard AI chatbot to step up its competition with rival OpenAI’s ChatGPT. The Sundar Pichai-led internet giant today announced it is expanding Bard to now include image generation capabilities, powered by its own Imagen 2 AI model, as well as a more capable version of Gemini Pro.

The move gives more people access to Bard’s AI smarts, including a new free tool to create AI images.

“These updates make Bard an even more helpful and globally accessible AI collaborator for everything from big, creative projects to smaller, everyday tasks,” Jack Krawczyk, product lead for Bard, noted in a blog post.

Separately, the company also announced it is experimenting with another image generator, dubbed ImageFX, starting today.

Gemini Pro with multi-lingual support

Over a month ago, Google announced Gemini in three sizes: Nano for mobile devices, Pro for more intermediate use cases, and Ultra, which it claimed to be the most powerful and capable large language model (LLM) yet developed by any company, even more powerful than GPT-4, though Ultra is not due out until later this year.

Third-party comparisons between Gemini Pro, the most powerful LLM currently available from Google, and other models found that it actually lags behind even OpenAI’s older GPT-3.5 Turbo, a worrying sign for Google as it seeks to show the world it has the juice to take on the new insurgents in the generative AI race. Google did release a fine-tuned version of Gemini Pro on Bard last month, but only in English. 

But today’s flurry of new consumer-facing AI announcements should help Google close the gap. With the latest update for Bard, Gemini Pro will be available in over 40 languages, including Korean, Spanish, Tamil, Italian and Russian, across more than 230 countries and territories.

This gives more people access not only to Gemini Pro’s advanced understanding, summarizing, reasoning and coding capabilities but also to Bard’s double-check feature, which validates a response by searching across the web.

Imagen-2 on Bard to take on ChatGPT Plus with DALL-E 3

Most importantly, the long-awaited AI image generation capabilities are also arriving. They are delivered by the Imagen 2 model, which, Google says, can produce high-quality, photorealistic outputs from text inputs. That turns Bard into a more direct and capable competitor to OpenAI’s ChatGPT Plus, whose DALL-E 3 image generator model has been available to users of OpenAI’s subscription tiers since October 2023.

“Just type in a description — like “create an image of a dog riding a surfboard” — and Bard will generate custom, wide-ranging visuals to help bring your idea to life,” Krawczyk noted.

Imagen 2 in action on Bard

We tested image generation on Bard and found that it produces outputs in about 30-40 seconds with good consistency. In some cases, however, it failed to generate an image altogether, even when the prompt did not involve any famous individual. Google filters out such prompts, likely in an effort to avoid scandalous deepfakes similar to those of the musician Taylor Swift created by users of Microsoft’s Designer AI image generator, which is powered by OpenAI’s DALL-E 3.

There is also no support for changing the aspect ratio of outputs or for prompts in languages other than English at this stage, at least based on our initial usage of the tool.

On the positive side, given the copyright infringement concerns around AI-generated media, Google Bard gives users the option to report legal issues under data protection, copyright and other laws for all generated media.

The company also noted that it limits the production of violent, offensive or sexually explicit content and has used DeepMind-developed SynthID to embed digitally identifiable watermarks into the pixels of generated images. This can help people tell whether a visual was generated with Google’s AI or created by a human artist.

A new way to iterate on AI images

Beyond updates for Bard, Google also announced that it is experimenting with ImageFX, a new tool for image generation powered by Imagen 2. 

Available starting today in AI Test Kitchen, Google’s app for experimental AI projects, ImageFX tries to spur creative ideas with “expressive chips” that give users adjacent dimensions and suggestions to iterate on their prompt. This kind of feature is also available on competing tools, including Ideogram.

The AI Test Kitchen also includes other interesting experimental projects from Google, including MusicFX, which can now create tunes up to 70 seconds in length with text prompts and expressive chips, and TextFX, a generative AI experiment for lyricists, wordsmiths and other creative artists.

AI Eclipse TLDR:

Google is enhancing its Bard AI chatbot to better compete with OpenAI’s ChatGPT. The company is expanding Bard to include image generation capabilities using its Imagen 2 AI model, as well as a more advanced version of Gemini Pro. These updates aim to make Bard a more useful and globally accessible AI collaborator. Additionally, Google is experimenting with ImageFX, another image generator powered by Imagen 2. With the latest update for Bard, Gemini Pro will be available in over 40 languages across more than 230 countries and territories. The introduction of image generation capabilities allows Bard to directly compete with OpenAI’s ChatGPT Plus with DALL-E 3. Users can input a description, and Bard will generate custom visuals to bring ideas to life. However, there are some limitations, such as the inability to change the aspect ratio of outputs and the lack of support for image prompts in languages other than English. Google is also introducing features to address copyright infringement concerns and to differentiate AI-generated media from human-created content. Overall, these updates aim to close the gap between Google and OpenAI in the generative AI race.