Google DeepMind Launches Gemma 4 Amid Growing Competition from Chinese Open AI Models

Written by: Mane Sachin


Google DeepMind has introduced Gemma 4, a new family of open AI models aimed at improving reasoning tasks and supporting agent-based workflows.

The launch builds on the earlier Gemma versions, which have already been downloaded more than 400 million times by developers.

Released under the Apache 2.0 license, Gemma 4 is designed to deliver stronger performance without a significant increase in model size. Google says the models are efficient enough to run across a wide range of devices, from mobile hardware to GPUs.

The new lineup includes four variants: Effective 2B (E2B), Effective 4B (E4B), a 26B Mixture of Experts model, and a 31B Dense model.

Developers can access these models through platforms like Hugging Face, Kaggle, and Ollama. They are also compatible with tools such as Transformers, vLLM, and llama.cpp, with deployment options available on Google Cloud.
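As a rough sketch of the Transformers route the article mentions, the snippet below wraps a prompt in the chat-message format and runs a text-generation pipeline. The repo id `google/gemma-4-e2b-it` is an assumption for illustration only; check the actual model names published on Hugging Face before running it.

```python
MODEL_ID = "google/gemma-4-e2b-it"  # hypothetical repo id, not confirmed


def build_chat(prompt: str) -> list:
    """Wrap a user prompt in the chat-message format the pipeline expects."""
    return [{"role": "user", "content": prompt}]


def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Run a single text-generation call against the model."""
    from transformers import pipeline  # imported lazily; heavy dependency

    pipe = pipeline("text-generation", model=MODEL_ID)
    result = pipe(build_chat(prompt), max_new_tokens=max_new_tokens)
    # The pipeline returns the full chat transcript; take the last turn.
    return result[0]["generated_text"][-1]["content"]
```

The same chat-message list works unchanged with vLLM's OpenAI-compatible server, which is one reason the format has become the common interface across these tools.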

In terms of performance, the 31B Dense model ranks third among open models on Arena AI’s leaderboard, while the 26B version ranks sixth, competing with much larger systems.

The 31B model has an Elo score of around 1452, putting it close to models like Qwen 3.5, GLM-5, and Kimi K2.5, which are currently gaining global attention.

Commenting on the development, Hugging Face CEO Clement Delangue said Google appears to be re-entering the competition, especially as Chinese open-source models continue to grow in popularity.

Google said the smaller E2B and E4B models are capable of running directly on devices, offering multimodal features with low latency. The larger models are designed for GPUs, including setups using a single NVIDIA H100, with optimised versions also available for consumer hardware.

The company added that the models can be fine-tuned across different environments, including Android devices, laptops, and developer workstations. It has also worked with hardware partners like Qualcomm and MediaTek to support mobile and IoT use cases.

Gemma 4 shows improvements in multi-step reasoning, including tasks involving mathematics and instruction-following. It also supports agentic workflows, with features like function calling, structured outputs, and system instructions for interacting with tools and APIs.
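Function calling in open-model stacks generally works by giving the model a JSON-schema description of each tool and then parsing the structured call the model emits. The sketch below shows that loop with a stubbed weather tool; the schema shape, tool name, and output format are illustrative assumptions, not Gemma 4's documented interface.

```python
import json

# Hypothetical tool definition in the JSON-schema style common across
# open-model function-calling stacks; Gemma 4's exact schema may differ.
GET_WEATHER_TOOL = {
    "name": "get_weather",
    "description": "Return current weather for a city.",
    "parameters": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}


def get_weather(city: str) -> str:
    """Stub standing in for a real weather API call."""
    return f"Sunny in {city}"


def dispatch_tool_call(raw: str, registry: dict) -> str:
    """Parse a model-emitted tool call (JSON) and run the matching function."""
    call = json.loads(raw)
    fn = registry[call["name"]]
    return fn(**call["arguments"])


registry = {"get_weather": get_weather}
# Pretend the model emitted this structured output in response to a prompt.
model_output = '{"name": "get_weather", "arguments": {"city": "Pune"}}'
print(dispatch_tool_call(model_output, registry))  # -> Sunny in Pune
```

In a full agent loop, the tool's return value would be appended to the conversation as a new message so the model can compose its final answer from it.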

The models can process images and video for tasks such as text extraction and chart understanding. Smaller variants also support audio input for speech recognition.

In addition, Gemma 4 supports long context windows (up to 256K tokens in larger models and 128K in edge versions) along with support for more than 140 languages. Developers can also fine-tune the models for specific use cases.

With its open licensing, Google is allowing developers to build and deploy applications across cloud, on-premise, and hybrid environments.


Mane Sachin

My name is Sachin Mane, and I’m the founder and writer of AI Hub Blog. I’m passionate about exploring the latest AI news, trends, and innovations in Artificial Intelligence, Machine Learning, Robotics, and digital technology. Through AI Hub Blog, I aim to provide readers with valuable insights on the most recent AI tools, advancements, and developments.

For Feedback - aihubblog@gmail.com