The data foundry for frontier AI

Train on worlds, not just pixels.

ZENOS turns licensed video games into rights-cleared, structured training data for AI. Frame-perfect game state, not scraped video.

STATE ACTION NEXT‑STATE // the learning triplet, captured frame-perfect
Built by ZENOS · Real-time game data technology since 2020.
Why ZENOS

Built by game-data veterans.

5+ years building real-time game-data tech

We've spent five years building real-time game data technology with some of the biggest games in the world. We come from this industry, and we build for it.

Historically trusted by brands such as
Riot Games · Valve · Meta · Nvidia · BLAST · FIFA
Why it's different

Anyone can scrape YouTube. Only ground truth trains models that understand the world.

Tier 1 · Raw video

Pixels only

No state, no labels. What's publicly available.

Tier 2 · Visual labelling

Inferred from pixels

Lossy and indirect. What scraping and labelling give you.

Tier 3 · ZENOS ground truth

Frame-perfect game state

Direct from the running game. The (state, action, next-state) triplet, rights-cleared.

+ all Tier 1 & 2 data

Your games are already Tier 1: scraped, unpaid, uncontrolled.
Licensing moves them to Tier 3: paid, rights-cleared, and yours to pull anytime.

The data

From gameplay to ground truth.

PRISM reads the running game directly. Output is structured and frame-aligned, captured at user-defined resolutions and frame rates, up to 4K and 120 Hz, so you only take what your model needs.

Live sample: a real capture session, frame-synced to its own VTX state. Nothing inferred, nothing staged.

Render

Rendering

Visual ground truth straight from the engine. Pixels and the render buffers behind them.

Video · Depth · Segmentation · Surface normals · Motion vectors · UI mask
Input

Player Inputs

What the human did, synchronised with the state stream. The action half of any imitation-learning pair.

Mouse · Keyboard · Controller · Action labels · Semantic keybindings · Player viewport
State

World State

Everything the game is doing, frame by frame. The structured world behind the screen, not reconstructed from pixels.

Entities · Transforms · Physics · Collisions · Health & status · Events · Objectives · Rewards · Camera
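
To make the learning triplet concrete, here is a minimal sketch of how frame-aligned state and input streams could be paired into (state, action, next-state) records. The `Frame` class, its field names, and the `triplets` helper are hypothetical illustrations, not the ZENOS delivery format.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterator, List, Tuple


@dataclass(frozen=True)
class Frame:
    """One captured frame. All field names here are hypothetical."""
    index: int              # frame number within the capture session
    state: Dict[str, Any]   # world state on this frame (entities, health, ...)
    action: Dict[str, Any]  # player input recorded on this frame


Triplet = Tuple[Dict[str, Any], Dict[str, Any], Dict[str, Any]]


def triplets(frames: List[Frame]) -> Iterator[Triplet]:
    """Yield (state, action, next_state) for each pair of consecutive frames."""
    ordered = sorted(frames, key=lambda f: f.index)
    for current, following in zip(ordered, ordered[1:]):
        # Skip gaps so a triplet never spans dropped frames.
        if following.index != current.index + 1:
            continue
        yield current.state, current.action, following.state


frames = [
    Frame(0, {"player_x": 0.0}, {"key": "W"}),
    Frame(1, {"player_x": 1.0}, {"key": "W"}),
    Frame(2, {"player_x": 2.0}, {"key": None}),
]
result = list(triplets(frames))
```

Because state and input share a frame index, pairing is a plain lookup rather than a reconstruction from pixels.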
Enrich

Data Enrichment

Added labels generated after capture. Semantic, inferred, and consistent across every supported title.

Scene captions · Event annotations · Intent & trajectory · Named actions · Inferred inputs
+ added after capture
Queryable metadata

20+ categories per clip, including environment, setting, weather, time-of-day, lighting, vehicles, people, activities, interactions, camera, shot type, occlusion, materials, on-screen text, audio, music, and language. Filter across any combination.
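
As a rough illustration of filtering across a combination of categories, the sketch below matches clips on every criterion supplied. The clip dictionaries, category keys, values, and the `filter_clips` function are assumptions made for this example, not a documented ZENOS query API.

```python
from typing import Dict, Iterable, List

Clip = Dict[str, str]  # hypothetical: one metadata value per category


def filter_clips(clips: Iterable[Clip], **criteria: str) -> List[Clip]:
    """Return clips whose metadata matches every given category=value pair."""
    return [
        clip
        for clip in clips
        if all(clip.get(category) == value for category, value in criteria.items())
    ]


# Illustrative metadata only; real clips would carry 20+ categories.
clips = [
    {"id": "a", "weather": "rain", "lighting": "night", "shot_type": "first-person"},
    {"id": "b", "weather": "clear", "lighting": "night", "shot_type": "first-person"},
    {"id": "c", "weather": "rain", "lighting": "day", "shot_type": "third-person"},
]
rainy_night = filter_clips(clips, weather="rain", lighting="night")
```

Here `rainy_night` holds only clip `a`, the one clip that satisfies both criteria.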

Your game, untouched

We never mod your game. We capture the original, released build. No mods, no source code, no assets. Nothing that could rebuild or redistribute your title.

The more we extract from a title, the more it is worth to labs, and the more it earns you.

What each side gets

Ground-truth data for labs. Recurring revenue for IP owners.

For AI labs

Nothing inferred. Everything captured.

  • Frame-perfect ground truth: state, action, next state
  • Engine-normalised across every title, one coordinate system
  • Render buffers you can't scrape: depth, normals, masks
  • Rights-cleared and auditable, source to frame

Any genre, captured to spec. Supply that doesn't run out.

For IP owners

Shared upside. No lock-in.

  • 50/50 net revenue share
  • Non-exclusive. Keep your rights and your options.
  • Zero effort. We capture, process, and sell.
  • Capture-ready in 24h

A one-off buyout caps your upside. Revenue share keeps paying for as long as the data sells.

How it works

Licensed games in. Lab-ready data out.

01

Evaluate

Rights, suitability, lab demand.

02

Capture

On the released game. No SDK.

03

Package

Structured, normalised, enriched, QA'd.

04

Deliver

Rights-cleared, non-exclusive.

05

Share

Costs recovered, then 50/50.
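
One way to read "costs recovered, then 50/50" is sketched below. The figures are invented for illustration, and the assumption that costs are deducted from gross revenue before an equal split is ours; the actual terms are set by the licensing agreement.

```python
from decimal import Decimal
from typing import Tuple


def split_revenue(gross: Decimal, costs: Decimal) -> Tuple[Decimal, Decimal]:
    """Return (ip_owner_share, zenos_share) under an assumed 50/50 net split.

    Assumption: costs come off gross revenue first, and whatever remains
    (never less than zero) is divided equally.
    """
    net = max(gross - costs, Decimal("0"))
    half = net / 2
    return half, half


# Hypothetical figures: 100,000 in data sales against 20,000 in costs.
owner, zenos = split_revenue(Decimal("100000"), Decimal("20000"))
```

With these made-up numbers, net revenue is 80,000 and each side receives 40,000.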

Governance

Rights-cleared. Auditable. Under your control.

Licensed, never scraped

We only capture games we've licensed. Nothing is taken without a deal. Every frame starts from a signed agreement.

Traceable to the licence

Every frame traces back to its source and rights. Full chain of custody.

You see the usage

Know exactly which labs license your data, and how much they use.

Pull it anytime

Opt out any title at any time. Enforced immediately.

No clones

Labs are contractually barred from training models to recreate your titles.

Private by default

Your involvement stays confidential unless you choose otherwise.