NitroGen

Open Foundation Model for Generalist Gaming Agents – Vision-to-Action AI Trained on 40,000 Hours of Gameplay Across 1000+ Titles
Last Updated: December 21, 2025
By Zelili AI

About This AI

NitroGen is an open-source unified vision-to-action foundation model developed by NVIDIA researchers in collaboration with Stanford, Caltech, UChicago, UT Austin, and others.

It enables generalist gaming agents to play a wide variety of commercial video games directly from raw RGB video frames by predicting gamepad actions through large-scale behavior cloning.

Trained on 40,000 hours of internet-scale gameplay video spanning more than 1,000 diverse titles, it learns to handle action games, platformers, racing games, RPGs, battle royales, and other 2D and 3D genres without game-specific APIs or reward signals.

The model takes 256×256 RGB frames as input and outputs a 21×16 action tensor, where each 21-dimensional action comprises two continuous 2D joystick vectors (four values) plus 17 binary buttons.
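To make that output format concrete, here is a minimal decoding sketch. The exact ordering of the 21 action dimensions (sticks first, then buttons) and the reading of the 16 columns as future timesteps are assumptions made for illustration, not documented facts about the model:

```python
import numpy as np

# Hypothetical decoder for one 21-dim action vector from the 21x16 tensor.
# Assumed layout: [lx, ly, rx, ry, b0..b16] -- the real ordering may differ.
def decode_action(vec):
    vec = np.asarray(vec, dtype=np.float32)
    assert vec.shape == (21,)
    left_stick = vec[0:2]        # continuous 2D vector, typically in [-1, 1]
    right_stick = vec[2:4]       # continuous 2D vector, typically in [-1, 1]
    buttons = vec[4:21] > 0.5    # 17 binary button states
    return left_stick, right_stick, buttons

# A full prediction is a 21x16 tensor; here we read it as 16 timesteps
# (an assumption) and decode the first one.
chunk = np.zeros((21, 16), dtype=np.float32)
chunk[0, 0] = 1.0   # push left stick fully right at the first timestep
chunk[4, 0] = 0.9   # press the first button
ls, rs, btns = decode_action(chunk[:, 0])
print(ls[0], btns[0])  # 1.0 True
```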

The architecture combines a SigLIP 2 Vision Transformer backbone with a Diffusion Matching Transformer (DiT) for action prediction.

It demonstrates strong zero-shot competence in complex tasks such as combat, navigation, high-precision platforming, and exploration, and fine-tuning on unseen games yields up to a 52 percent relative improvement in task success rate over models trained from scratch.

Released in December 2025 with full weights on Hugging Face, the training dataset, a universal Gymnasium-API simulator for commercial games, an evaluation suite, and code on GitHub, all under NVIDIA’s non-commercial license (the SigLIP 2 backbone remains Apache 2.0).

Ideal for advancing embodied AI research, game testing automation, NPC behavior training, and transferring skills to robotics or real-world simulation.

As an open model, it promotes progress in generalist agents beyond narrow RL-trained bots.

Key Features

  1. Vision-to-action prediction: Maps raw 256×256 RGB game frames directly to gamepad actions (joysticks and buttons)
  2. Large-scale imitation learning: Trained purely via behavior cloning on 40,000 hours of unlabeled internet gameplay videos
  3. Multi-game generalization: Zero-shot competence across 1000+ titles in diverse genres (action, platformer, racing, RPG, etc.)
  4. High-precision control: Handles fine movements in 2D platformers and complex decision-making in 3D games
  5. Strong transfer learning: Up to 52 percent relative success rate improvement when fine-tuned on unseen games
  6. Universal simulator API: Gymnasium wrapper to run commercial games with agent control for evaluation
  7. Open dataset and eval suite: 40k-hour action-annotated videos plus multi-task benchmark across 10+ games
  8. Efficient architecture: SigLIP 2 ViT backbone + Diffusion Matching Transformer for scalable inference
  9. Research-focused release: Full weights, code, dataset, and paper available for community advancement
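As an illustration of how an agent plugs into the Gymnasium-style simulator API mentioned above, here is a schematic control loop. `ToyGameEnv` and `policy` are stand-ins invented for this sketch (following the standard Gymnasium reset/step signatures), not the actual NitroGen wrapper or model:

```python
import numpy as np

# Schematic agent loop following the Gymnasium env interface
# (reset() -> (obs, info); step(action) -> (obs, reward, terminated,
# truncated, info)). The real NitroGen wrapper exposes commercial games
# behind this interface; a toy env stands in so the loop is runnable.
class ToyGameEnv:
    def reset(self, seed=None):
        self._rng = np.random.default_rng(seed)
        self._t = 0
        return self._frame(), {}

    def step(self, action):
        self._t += 1
        terminated = self._t >= 5   # toy episode ends after 5 steps
        return self._frame(), 0.0, terminated, False, {}

    def _frame(self):
        # Stand-in for a captured 256x256 RGB game frame.
        return self._rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8)

def policy(frame):
    # Placeholder for the model: frame -> 21-dim gamepad action
    # (two continuous 2D stick vectors plus 17 button values).
    return np.zeros(21, dtype=np.float32)

env = ToyGameEnv()
obs, info = env.reset(seed=0)
steps, done = 0, False
while not done:
    obs, reward, terminated, truncated, info = env.step(policy(obs))
    done = terminated or truncated
    steps += 1
print(steps)  # 5
```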

Price Plans

  1. Free ($0): Fully open-source model weights, code, dataset, and tools available on Hugging Face and GitHub under NVIDIA non-commercial license; no usage fees for research purposes
  2. Commercial (Not Available): License restricts commercial use; contact NVIDIA for potential enterprise licensing

Pros

  1. Groundbreaking generalization: First open model to play 1000+ diverse games from pixels alone
  2. Internet-scale training: Leverages massive public gameplay data without expensive human demos
  3. Strong transfer performance: Significant gains when fine-tuned on new titles compared to scratch baselines
  4. Fully open ecosystem: Weights, dataset, code, simulator, and eval suite released for research
  5. Embodied AI potential: Skills transferable to robotics and real-world simulation domains
  6. Non-commercial license clarity: Enables broad academic and open-source experimentation

Cons

  1. Non-commercial license: Restricted to research/non-commercial use per NVIDIA terms
  2. Gamepad focus limitation: Best on controller games; weaker on mouse/keyboard-heavy titles (RTS/MOBA)
  3. Hardware demands: Inference requires capable GPU for real-time or large-scale use
  4. Setup complexity: Local deployment needs Git clone, pip install, checkpoint download, and game wrappers
  5. No hosted demo: No simple web interface; requires technical setup to run
  6. Early-stage maturity: Released December 2025; community integrations and fine-tuning examples emerging
  7. Potential noise in data: Trained on unfiltered internet videos, which may include suboptimal or unrepresentative play

Use Cases

  1. Game AI research: Benchmarking generalist agents across diverse titles and tasks
  2. Game development testing: Automate playtesting, bug hunting, or NPC behavior prototyping
  3. Embodied AI transfer: Adapt gaming skills to robotics simulation or real-world control
  4. Autonomous agent training: Fine-tune on specific games for stronger performance
  5. Educational simulations: Create interactive game-based learning environments
  6. Procedural content evaluation: Test AI in generated or varied game worlds

Target Audience

  1. AI researchers in embodied intelligence: Studying generalist agents and imitation learning
  2. Game developers and studios: Exploring AI for testing, NPCs, or procedural generation
  3. Robotics and simulation engineers: Transferring pixel-to-action skills to physical domains
  4. Academic institutions: Using open dataset and model for courses/projects
  5. Open-source AI community: Fine-tuning or extending the foundation model

How To Use

  1. Clone repository: git clone https://github.com/MineDojo/NitroGen.git and cd NitroGen
  2. Install dependencies: pip install -e . to set up environment
  3. Download checkpoint: hf download nvidia/NitroGen ng.pt (or fetch ng.pt manually from the Hugging Face model page)
  4. Start inference server: python scripts/serve.py path/to/ng.pt
  5. Run agent on game: python scripts/play.py --process 'game.exe' for Windows titles
  6. Prepare game: Ensure the game is running and its window is visible for frame capture
  7. Monitor output: Observe agent actions and performance in real-time
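Putting the steps above together, a typical session looks like the following shell sketch; the repository URL, script names, and checkpoint filename follow the steps above, and the checkpoint path placeholder should be adjusted to your local layout:

```shell
# Steps 1-2: clone the repository and install it in editable mode
git clone https://github.com/MineDojo/NitroGen.git
cd NitroGen
pip install -e .

# Step 3: fetch the pretrained checkpoint from Hugging Face
hf download nvidia/NitroGen ng.pt

# Step 4 (terminal 1): start the inference server with the checkpoint
python scripts/serve.py path/to/ng.pt

# Step 5 (terminal 2): attach the agent to a running Windows game process
python scripts/play.py --process 'game.exe'
```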

How We Rated NitroGen

  • Performance: 4.6/5
  • Accuracy: 4.5/5
  • Features: 4.7/5
  • Cost-Efficiency: 5.0/5
  • Ease of Use: 4.0/5
  • Customization: 4.8/5
  • Data Privacy: 5.0/5
  • Support: 4.2/5
  • Integration: 4.4/5
  • Overall Score: 4.6/5

NitroGen integration with other tools

  1. Hugging Face: Model weights and dataset hosted for easy download and experimentation
  2. GitHub Repository: Full code, inference scripts, and deployment instructions available
  3. Gymnasium API: Universal simulator wrapper to run commercial games with agent control
  4. MineDojo Ecosystem: Ties into the broader MineDojo framework for embodied AI research
  5. Local GPU Setup: Runs natively on NVIDIA hardware with PyTorch/CUDA

Best prompts optimised for NitroGen

  1. Not applicable: NitroGen is a vision-to-action gaming agent model that takes raw RGB frames as input and predicts gamepad actions; it does not use text prompts for generation.

Final Thoughts

NitroGen marks a major step in open embodied AI by training a single vision-to-action model on massive internet gameplay data to play over 1,000 diverse games. Its strong zero-shot and transfer performance, combined with the full open-source release of weights, dataset, and simulator, makes it invaluable for gaming AI research. Setup is technical, but the potential for generalist agents in games and beyond is exciting.

FAQs

  • What is NitroGen?

    NitroGen is an open-source vision-to-action foundation model from NVIDIA that plays over 1000 diverse video games directly from raw pixel frames by predicting gamepad actions.

  • When was NitroGen released?

    NitroGen was released on December 19, 2025, with model weights, dataset, code, and paper made publicly available.

  • Is NitroGen free to use?

    Yes, it is completely free and open-source with weights and code on Hugging Face/GitHub under NVIDIA’s non-commercial license for research purposes.

  • How was NitroGen trained?

    It uses large-scale behavior cloning on 40,000 hours of internet gameplay videos across 1000+ games, without rewards or game-specific APIs.
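As a schematic illustration of behavior cloning (not the actual training code), a single supervised step regresses the policy's predicted actions onto target actions recovered from video. A tiny linear "policy" stands in for the real model here:

```python
import numpy as np

# Toy behavior-cloning loop: regress predicted actions onto target
# actions extracted from gameplay video. A linear "policy" over flattened
# frame features stands in for the real SigLIP 2 + DiT model.
rng = np.random.default_rng(0)
frames = rng.random((8, 12))    # batch of 8 tiny "frames", 12 features each
targets = rng.random((8, 21))   # 21-dim target gamepad action per frame
W = np.zeros((12, 21))          # policy parameters

def mse(W):
    return float(np.mean((frames @ W - targets) ** 2))

initial_loss = mse(W)
lr = 0.1
for _ in range(200):
    pred = frames @ W                                  # policy forward pass
    grad = frames.T @ (pred - targets) / len(frames)   # gradient of the MSE
    W -= lr * grad                                     # gradient descent step
final_loss = mse(W)
print(final_loss < initial_loss)  # True
```

The same supervised recipe, scaled to 40,000 hours of video and a large vision backbone, is what "behavior cloning without rewards" refers to.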

  • What games does NitroGen play best?

    It excels at gamepad-controlled titles like action, platformers, racing, RPGs, and battle royales; weaker on heavy mouse/keyboard games like RTS or MOBAs.

  • How can I run NitroGen?

    Clone the GitHub repo, install dependencies, download checkpoint from Hugging Face, start inference server, and run on Windows game executables.

  • What license does NitroGen use?

    Governed by the NVIDIA One-Way Noncommercial License; the backbone (SigLIP 2) is Apache 2.0; intended strictly for research and non-commercial use.

  • What are NitroGen’s implications beyond gaming?

    Skills learned from games can transfer to embodied AI, robotics simulation, autonomous systems training, and real-world control tasks.


About Author

Hi guys! We are a group of ML engineers by profession, with years of experience exploring and building AI tools, LLMs, and generative technologies. We analyze new tools not just as users, but as people who understand their technical depth and real-world value. We know how overwhelming these tools can be for most people, so we break down complex AI concepts into simple, practical insights. Our goal is to help you discover the AI tools that actually save you time and make everyday work smarter, not harder. “We don’t just write about AI: we build, test, and simplify it for you.”