Source URL: https://www.docker.com/blog/ibm-granite-4-0-models-now-available-on-docker-hub/
Source: Docker
Title: IBM Granite 4.0 Models Now Available on Docker Hub
Feedly Summary: Developers can now discover and run IBM’s latest open-source Granite 4.0 language models from the Docker Hub model catalog, and start building in minutes with Docker Model Runner. Granite 4.0 pairs strong, enterprise-ready performance with a lightweight footprint, so you can prototype locally and scale confidently. The Granite 4.0 family is designed for speed, flexibility,…
AI Summary and Description: Yes
Summary: The text discusses the release of IBM’s Granite 4.0 language models, which are available on Docker Hub. These models feature a hybrid architecture for improved efficiency and speed and are designed for generative AI applications. The accessibility of these models for various hardware platforms broadens their usability, enabling developers to rapidly prototype and deploy advanced applications.
Detailed Description:
The release of IBM’s Granite 4.0 language models marks a significant advance in generative AI, particularly for developers who want lightweight, efficient models for a range of applications. The major points:
- **Granite 4.0 Overview**:
  - An open-source language model family that can be pulled and run directly from Docker Hub.
  - Designed for speed, flexibility, and cost-effectiveness when building and deploying generative AI applications.
- **Docker Hub’s Role**:
  - Already used by millions of developers for container management, Docker Hub now also serves as a catalog for AI models.
  - The integration lets developers download, share, and run curated AI models packaged as OCI Artifacts.
- **Innovations in Granite 4.0**:
  - **Hybrid Architecture**: Combines Mamba-2’s scaling efficiency with transformer accuracy, improving both model performance and energy efficiency.
  - **Mixture of Experts (MoE)**: Activates only selected model parameters for a given task, significantly reducing processing time and memory usage (over 70% less than traditional models).
- **Context Handling**:
  - The removal of positional encoding allows the models to process extremely long documents (up to 128,000 tokens), strengthening capabilities such as document analysis and Retrieval-Augmented Generation (RAG).
- **Model Size Variations**:
  - The Granite 4.0 lineup spans several model sizes, from ultra-light Micro models up to larger Small models, so teams can match specific performance needs without compromising resource efficiency.
- **Use Cases**:
  - **Document Summarization**: Efficiently condensing lengthy texts across a variety of applications.
  - **RAG Systems**: Building advanced conversational agents that draw on extensive data from multiple sources.
  - **Multi-agent Workflows**: Running numerous AI agents for complex reasoning tasks.
  - **Edge AI Deployment**: Using the compact models for real-time applications in constrained environments.
- **Community Engagement**:
  - The models are released under the Apache 2.0 license, permitting customization and commercial use.
  - Developer collaboration is encouraged through the Docker Model Runner repository on GitHub.
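As a sketch of the discover-and-run workflow described above, pulling and chatting with a Granite model via Docker Model Runner might look like the following. The model tag `ai/granite-4.0-micro` is an assumption; check the Docker Hub model catalog for the exact Granite 4.0 repository names.

```shell
# Enable Docker Model Runner (Docker Desktop; the feature may already be on by default)
docker desktop enable model-runner

# Pull a Granite 4.0 model from the Docker Hub catalog as an OCI Artifact
# (tag is illustrative -- confirm the current name in the ai/ namespace on Docker Hub)
docker model pull ai/granite-4.0-micro

# Run a one-shot prompt against the local model
docker model run ai/granite-4.0-micro "Summarize the benefits of a hybrid Mamba-2/transformer architecture."
```

Because the model is cached locally after the pull, subsequent runs start quickly, which fits the prototype-locally workflow the post describes.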
Overall, Granite 4.0 empowers developers to quickly and effectively build next-generation AI solutions, fostering innovation and enhancing the potential for local model deployment in a variety of environments. This aligns well with current trends emphasizing cloud and edge computing, as well as efficient resource management in software development.
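Since Model Runner also exposes an OpenAI-compatible HTTP API, applications like the summarization and RAG scenarios above can call a local Granite model programmatically. A minimal Python sketch, assuming TCP host access is enabled on the default port 12434 and that `ai/granite-4.0-micro` is a valid catalog tag (both are assumptions; verify against the Docker Model Runner documentation):

```python
import json
import urllib.request

# Assumed default host endpoint when Model Runner TCP access is enabled;
# confirm the port and path in the Docker Model Runner docs.
BASE_URL = "http://localhost:12434/engines/v1"


def build_chat_request(model: str, prompt: str, max_tokens: int = 256) -> dict:
    """Build an OpenAI-style chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }


def summarize(document: str, model: str = "ai/granite-4.0-micro") -> str:
    """Ask a locally running Granite model to summarize a document."""
    # NOTE: the model tag is illustrative; check Docker Hub for exact names.
    payload = build_chat_request(model, f"Summarize the following:\n\n{document}")
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

The same endpoint shape means existing OpenAI-client code can usually be pointed at the local model by changing only the base URL and model name.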