
# 404—GEN Synthetic 3D Dataset

### Overview

The **404—GEN** Synthetic 3D Dataset is the world's largest open-source 3D asset repository. Comprising over 21.5 million high-fidelity, AI-generated 3D models, this dataset surpasses the volume of all other existing 3D datasets combined.

### Key Highlights & Specifications

* **Dataset Size:** 21.5M+ high-fidelity 3D assets
* **Total Storage Volume:** 40 Terabytes (TB)
* **Licensing:** Open-source and attribution-ready
* **Metadata:** Each asset includes detailed metadata, usage rights, and ownership attribution.
* **Target Applications:** 3D generative AI, Gaussian Splatting research, neural radiance fields (NeRFs), AR/VR development, and gaming studios.
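As a rough sanity check on the figures above, the average size per asset can be estimated from the two headline numbers. This is a back-of-envelope sketch only: it assumes decimal terabytes and treats "21.5M+" as exactly 21.5 million, and real asset sizes will vary widely around the mean.

```python
# Back-of-envelope estimate derived from the headline figures.
# Assumptions: 1 TB = 10**12 bytes, and "21.5M+" is taken as exactly 21.5 million.
TOTAL_BYTES = 40 * 10**12
ASSET_COUNT = 21_500_000


def average_asset_size_mb(total_bytes: int, asset_count: int) -> float:
    """Return the mean size per asset in decimal megabytes."""
    return total_bytes / asset_count / 10**6


avg_mb = average_asset_size_mb(TOTAL_BYTES, ASSET_COUNT)
print(f"{avg_mb:.2f} MB per asset on average")
```

Under these assumptions the mean comes out a little under 2 MB per asset, which is useful when sizing local storage for a subset of the data.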

### Why Synthetic Data Matters

#### Overcoming Data Scarcity

AI models require massive amounts of high-quality data to continue scaling and learning. However, research from organisations such as Epoch AI suggests that the tech industry faces a looming "data wall": the supply of high-quality human-generated training data may run out by 2030.

#### The Synthetic Solution

"Synthetic" datasets (data generated programmatically by other AI models) provide a scalable alternative to human data. When produced to a high standard and at massive scale, synthetic data lets next-generation 3D models train efficiently without being limited by the availability of human data.

### Powered by Decentralised Intelligence

Achieving this milestone through traditional, centralised infrastructure would have been nearly impossible. **404—GEN**’s rapid scaling and production speed are a direct result of operating within the Bittensor network.

* **Incentivised Architecture:** Independent network miners are directly rewarded based on the quality and volume of their outputs.
* **Unprecedented Throughput:** This decentralised design allowed **404—GEN** to generate, score, filter, and aggregate 40TB of synthetic 3D content in record time, demonstrating the raw power of decentralised compute.

### Dataset Access & Downloads

Because hosting a 40TB dataset for public download is an immense infrastructure challenge, we offer flexible access tiers depending on your project needs.
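To give a sense of scale, the sketch below estimates how long a single full copy of 40 TB would take to transfer at a few sustained link speeds. The link speeds are illustrative assumptions, not figures from 404—GEN, and the estimate ignores protocol overhead, retries, and storage throughput.

```python
# Idealised transfer-time estimate for one full copy of the dataset.
# Assumptions: 1 TB = 10**12 bytes; the link speeds below are hypothetical examples.
TOTAL_BYTES = 40 * 10**12
SECONDS_PER_DAY = 86_400


def transfer_days(total_bytes: int, link_gbit_per_s: float) -> float:
    """Days needed to move total_bytes over a link sustaining link_gbit_per_s."""
    bits = total_bytes * 8
    seconds = bits / (link_gbit_per_s * 10**9)
    return seconds / SECONDS_PER_DAY


for rate in (0.1, 1.0, 10.0):  # Gbit/s
    print(f"{rate:>5} Gbit/s -> {transfer_days(TOTAL_BYTES, rate):.1f} days")
```

Even at a sustained 1 Gbit/s, one download takes several days, which is the kind of cost the tiered access model is meant to manage.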

#### 1. Light/Sample Tier (404—Mini)

For researchers and developers looking to evaluate the dataset or run smaller-scale experiments, a curated sample set, [**404—Mini**](https://huggingface.co/datasets/404-Gen/404mini), is publicly hosted on Hugging Face.
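One way to fetch the sample set locally is the `huggingface_hub` client. The following is a minimal sketch, not an official procedure: the repository ID is read off the Hugging Face URL above, while the local directory name, the helper functions, and the optional file patterns are hypothetical. The repository's file layout is not documented here, so inspect it on Hugging Face before filtering.

```python
from typing import Optional, Sequence

# Repository ID taken from the Hugging Face URL of the sample set.
REPO_ID = "404-Gen/404mini"


def build_download_kwargs(local_dir: str,
                          patterns: Optional[Sequence[str]] = None) -> dict:
    """Assemble keyword arguments for huggingface_hub.snapshot_download."""
    kwargs = {
        "repo_id": REPO_ID,
        "repo_type": "dataset",   # the repo lives under /datasets/ on the Hub
        "local_dir": local_dir,   # hypothetical target directory
    }
    if patterns:
        # Optional glob filters; the actual file layout is an assumption to verify.
        kwargs["allow_patterns"] = list(patterns)
    return kwargs


def download_sample(local_dir: str = "404mini",
                    patterns: Optional[Sequence[str]] = None) -> str:
    """Download the sample set and return the local path (needs network access)."""
    # Imported lazily so the helper above works without the package installed.
    from huggingface_hub import snapshot_download  # pip install huggingface_hub
    return snapshot_download(**build_download_kwargs(local_dir, patterns))


# Uncomment to perform the download:
# print("Downloaded to", download_sample())
```

Check the dataset card on Hugging Face for the download size and licence terms before pulling the full sample.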

#### 2. Full Dataset Access

The complete 40TB dataset is available for large-scale institutional research, commercial training, and studio pipelines. Priority access is given to projects actively advancing the boundaries of 3D research and development.

**Request Portal:** [dataset.404.xyz](https://dataset.404.xyz/)

#### 3. Immediate API Access

If you are a researcher, developer, or studio requiring direct, high-speed API integration into your training pipeline, please reach out directly via email: **<sheesh@404.xyz>**

