ai/glm-5-safetensors

Verified Publisher

By Docker

Updated 2 months ago

744B MoE language model with 40B active params for reasoning, coding, and agentic tasks (FP8)


ai/glm-5-safetensors repository overview

GLM-5

GLM-5 is a large-scale Mixture-of-Experts (MoE) language model designed for complex systems engineering and long-horizon agentic tasks. Developed by Z.ai, this model represents a significant advancement in scaling and efficiency, featuring 744B total parameters with 40B active parameters during inference. The model integrates DeepSeek Sparse Attention (DSA) to reduce deployment costs while maintaining exceptional long-context capabilities.

GLM-5 was trained on 28.5 trillion tokens and leverages an innovative asynchronous reinforcement learning infrastructure called slime to bridge the gap between competence and excellence in pre-trained models. The model delivers state-of-the-art performance among open-source models on reasoning, coding, and agentic tasks, achieving results competitive with leading frontier models across a wide range of academic benchmarks.

This FP8-quantized version provides an optimized deployment option, maintaining model quality while significantly reducing memory requirements and computational costs for practical applications.
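As a rough back-of-envelope check on those savings (an illustrative estimate, not an official figure), weight storage scales with bytes per parameter:

```python
# Rough weight-memory estimate for a 744B-parameter model (illustrative
# only; real deployments also need KV cache, activations, and runtime
# overhead on top of the weights).
TOTAL_PARAMS = 744e9  # 744B total parameters

def weight_gib(params: float, bytes_per_param: float) -> float:
    """Approximate weight storage in GiB for a given precision."""
    return params * bytes_per_param / 2**30

bf16 = weight_gib(TOTAL_PARAMS, 2.0)  # BF16: 2 bytes per parameter
fp8 = weight_gib(TOTAL_PARAMS, 1.0)   # FP8: 1 byte per parameter

print(f"BF16 weights: ~{bf16:,.0f} GiB")
print(f"FP8 weights:  ~{fp8:,.0f} GiB")
```

The FP8 figure is in the same ballpark as the artifact size listed in the tag summary below; the exact size differs because some tensors are typically kept at higher precision.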


Characteristics

| Attribute | Value |
|---|---|
| Provider | Z.ai |
| Architecture | GlmMoeDsaForCausalLM (MoE with DeepSeek Sparse Attention) |
| Total Parameters | 744B (40B active) |
| Training Data | 28.5T tokens |
| Languages | English, Chinese |
| Input modalities | Text |
| Output modalities | Text |
| Context Length | 128K tokens (up to 202K with tools) |
| License | MIT |
| Quantization | FP8 |

Using this model with Docker Model Runner

docker model run ai/glm-5-safetensors

For more information, check out the Docker Model Runner docs.
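Docker Model Runner also exposes an OpenAI-compatible HTTP API, so the model can be called programmatically. The sketch below builds (but does not send) a chat-completions request; the base URL assumes Model Runner's default host TCP port of 12434, which is an assumption about your setup and may differ, so check the docs linked above.

```python
import json
import urllib.request

# Assumed default for Docker Model Runner's host-side OpenAI-compatible
# endpoint; verify the port and path against your own configuration.
BASE_URL = "http://localhost:12434/engines/v1"

def chat_request(prompt: str, model: str = "ai/glm-5-safetensors") -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat-completions request."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("Explain Mixture-of-Experts routing in two sentences.")
# Against a running Model Runner, the request would be sent with:
#   with urllib.request.urlopen(req) as resp:
#       print(json.load(resp)["choices"][0]["message"]["content"])
```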

Benchmarks

GLM-5 demonstrates exceptional performance across reasoning, coding, and agentic tasks, achieving best-in-class results among open-source models:

Reasoning & Mathematical Tasks
| Benchmark | GLM-5 | GLM-4.7 | DeepSeek-V3.2 | Kimi K2.5 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 (xhigh) |
|---|---|---|---|---|---|---|---|
| HLE | 30.5 | 24.8 | 25.1 | 31.5 | 28.4 | 37.2 | 35.4 |
| HLE (w/ Tools) | 50.4 | 42.8 | 40.8 | 51.8 | 43.4 | 45.8 | 45.5 |
| AIME 2026 I | 92.7 | 92.9 | 92.7 | 92.5 | 93.3 | 90.6 | - |
| HMMT Nov. 2025 | 96.9 | 93.5 | 90.2 | 91.1 | 91.7 | 93.0 | 97.1 |
| IMOAnswerBench | 82.5 | 82.0 | 78.3 | 81.8 | 78.5 | 83.3 | 86.3 |
| GPQA-Diamond | 86.0 | 85.7 | 82.4 | 87.6 | 87.0 | 91.9 | 92.4 |
Coding Benchmarks
| Benchmark | GLM-5 | GLM-4.7 | DeepSeek-V3.2 | Kimi K2.5 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 (xhigh) |
|---|---|---|---|---|---|---|---|
| SWE-bench Verified | 77.8 | 73.8 | 73.1 | 76.8 | 80.9 | 76.2 | 80.0 |
| SWE-bench Multilingual | 73.3 | 66.7 | 70.2 | 73.0 | 77.5 | 65.0 | 72.0 |
Agentic Tasks
| Benchmark | GLM-5 | GLM-4.7 | DeepSeek-V3.2 | Kimi K2.5 | Claude Opus 4.5 | Gemini 3 Pro | GPT-5.2 (xhigh) |
|---|---|---|---|---|---|---|---|
| Terminal-Bench 2.0 (Terminus 2) | 56.2 / 60.7 | 41.0 | 39.3 | 50.8 | 59.3 | 54.2 | 54.0 |
| Terminal-Bench 2.0 (Claude Code) | 56.2 / 61.1 | 32.8 | 46.4 | - | 57.9 | - | - |
| CyberGym | 43.2 | 23.5 | 17.3 | 41.3 | 50.6 | 39.9 | - |
| BrowseComp | 62.0 | 52.0 | 51.4 | 60.6 | 37.0 | 37.8 | - |
| BrowseComp (w/ Context Manage) | 75.9 | 67.5 | 67.6 | 74.9 | 67.8 | 59.2 | 65.8 |
| BrowseComp-Zh | 72.7 | 66.6 | 65.0 | 62.3 | 62.4 | 66.8 | 76.1 |
| τ²-Bench | 89.7 | 87.4 | 85.3 | 80.2 | 91.6 | 90.7 | 85.5 |
| MCP-Atlas (Public Set) | 67.8 | 52.0 | 62.2 | 63.8 | 65.2 | 66.6 | 68.0 |
| Tool-Decathlon | 38.0 | 23.8 | 35.2 | 27.8 | 43.5 | 36.4 | 46.3 |
| Vending Bench 2 | $4,432.12 | $2,376.82 | $1,034.00 | $1,198.46 | $4,967.06 | $5,478.16 | $3,591.33 |

Considerations

  • Resource Requirements: While FP8 quantization significantly reduces memory footprint, the model still requires substantial computational resources (8 GPUs recommended for inference with tensor parallelism)
  • Language Focus: The model is optimized primarily for English and Chinese; performance on other languages may be limited
  • Long-Context Optimization: Best performance is achieved with tasks that can leverage the model's extensive context window capabilities
  • Agentic Use Cases: The model is specifically designed for complex, long-horizon agentic tasks and may be over-engineered for simple text generation use cases
  • Tool Integration: For optimal performance on complex tasks, consider using the model with tool-calling capabilities enabled
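To make the tool-integration point concrete, the sketch below shows what an OpenAI-style function-calling request body could look like when sent to Model Runner's OpenAI-compatible API. The `get_weather` tool and its schema are purely hypothetical illustrations, not built-in capabilities.

```python
import json

# Hypothetical tool definition in the OpenAI function-calling format;
# `get_weather` is an illustrative name, not a built-in tool.
tools = [{
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

payload = {
    "model": "ai/glm-5-safetensors",
    "messages": [{"role": "user", "content": "What's the weather in Oslo?"}],
    "tools": tools,
    "tool_choice": "auto",  # let the model decide whether to call the tool
}

print(json.dumps(payload, indent=2))
```

When the model decides to use a tool, the response carries a `tool_calls` entry whose arguments the client executes before feeding the result back as a `tool`-role message.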
Generated by

This model card was automatically generated using cagent-action. Want to learn more about Docker Model Runner? Check out the project repository: https://github.com/docker/model-runner.

Tag summary

| Field | Value |
|---|---|
| Content type | Model |
| Digest | sha256:f606751ff |
| Size | 704.3 GB |
| Last updated | 2 months ago |

docker model pull ai/glm-5-safetensors

Pulls last week: 292