Introduction

OMEGA Any-to-Any is a decentralized, open-source AI project built on the Bittensor blockchain by OMEGA Labs. Our mission is to create state-of-the-art (SOTA) multimodal any-to-any models by attracting the world's top AI researchers to train on Bittensor, taking advantage of its incentivized intelligence platform. Our goal is to establish a self-sustaining, well-resourced research lab where participants are rewarded for contributing compute and/or research insight.
MainNet UID: 21
TestNet UID: 157
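
For orientation, here is a minimal sketch of how one might look up the subnet on-chain with the bittensor Python SDK. The network names and netuids come from the UIDs above; this is an illustrative snippet, not official subnet tooling:

```python
import bittensor as bt

# Connect to the Bittensor main network ("finney") and pull the
# metagraph for OMEGA Any-to-Any (MainNet netuid 21).
subtensor = bt.subtensor(network="finney")
metagraph = subtensor.metagraph(netuid=21)

# Basic subnet stats: how many neurons (miners + validators) are registered.
print(f"SN21 registered neurons: {metagraph.n}")

# For the test network, swap in network="test" and netuid=157.
```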
Why Any-to-Any? 🧠📚🌃🎧🎥
- Multimodal First: A2A jointly models all modalities (text, image, audio, video) at once, with the belief that true intelligence lies in the associative representations present at the intersection of all modalities.
- Unified Fundamental Representation of Reality: The Platonic Representation Hypothesis suggests that as AI models increase in scale and capability, they converge towards a shared, fundamental representation of reality. By jointly modeling all modalities, A2A models are uniquely positioned to capture this underlying structure, potentially accelerating the path towards more general and robust AI.
- Decentralized Data Collection: Thanks to our SN24 data collection, we leverage a fresh stream of data that mimics real-world demand distribution for training and evaluation. By frequently refreshing our data collection topics based on gaps in the current data, we avoid the issue of underrepresented data classes. Through self-play, our subnet's best checkpoints can learn from each other and pool their intelligence.
- Incentivized Research: World-class AI researchers and engineers already love open source. With Bittensor's model for incentivizing intelligence, researchers can be permissionlessly compensated for their efforts and have their compute subsidized according to their productivity.
- Bittensor Subnet Orchestrator: Incorporates specialist models from other Bittensor subnets, acting as a high-bandwidth, general-purpose router. As the best open-source natively multimodal model, our checkpoint can supply rich multimodal embeddings that future AI projects can use to bootstrap their own expert models.
- Public-Driven Capability Expansion: Public demand dictates which capabilities the model learns first through the decentralized incentive structure.
- Beyond Transformers: Integrate emerging state-of-the-art architectures like early-fusion transformers, diffusion transformers, liquid neural networks, and KANs.
Roadmap 🚀
Phase 1: Foundation (Remainder of Q2 2024)
- Design a hard-to-game validation mechanism that rewards deep video understanding
- Produce the first checkpoint with SOTA image and video understanding capabilities, using our ImageBind + Llama-3 architecture as a proof-of-concept starting point (a minimal sketch of this pattern follows this list)
- Generalize the validation mechanism to enable broad architecture search and new multimodal tokenization methods
- Onboard 20+ top AI researchers from frontier labs and open source projects
- Expand SN24 data collection beyond YouTube to include multimodal websites (e.g. Reddit, blog posts) and synthetic data pipelines
- Launch the OMEGA Focus screen recording app, providing rich data for modelling long-horizon human workflows and combatting the hallucination and distraction problems found in top closed-source LLMs
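
To make the ImageBind + Llama-3 bullet above concrete, below is a minimal sketch of the general pattern it implies: embeddings from a frozen ImageBind-style encoder projected into a Llama-3-style hidden space as a short prefix of pseudo-tokens. All module names and dimensions here are illustrative assumptions, not the subnet's actual implementation:

```python
import torch
import torch.nn as nn

class MultimodalProjector(nn.Module):
    """Illustrative adapter (assumed, not the subnet's code): maps a single
    ImageBind-style embedding (1024-d) to a short sequence of pseudo-tokens
    in a Llama-3-style hidden space (4096-d)."""

    def __init__(self, embed_dim: int = 1024, llm_dim: int = 4096, num_tokens: int = 8):
        super().__init__()
        self.num_tokens = num_tokens
        self.llm_dim = llm_dim
        self.proj = nn.Sequential(
            nn.Linear(embed_dim, llm_dim),
            nn.GELU(),
            nn.Linear(llm_dim, llm_dim * num_tokens),
        )

    def forward(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, embed_dim) from a frozen multimodal encoder.
        batch = embeddings.shape[0]
        tokens = self.proj(embeddings)
        # Reshape into (batch, num_tokens, llm_dim): a prefix the LLM attends
        # to alongside its ordinary text token embeddings.
        return tokens.view(batch, self.num_tokens, self.llm_dim)

# Usage with stand-in tensors: prepend `prefix` to the LLM's text embeddings.
projector = MultimodalProjector()
video_emb = torch.randn(2, 1024)   # placeholder for ImageBind video output
prefix = projector(video_emb)      # -> (2, 8, 4096)
```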
Phase 2: Fully Multimodal (Q3 2024)
- Produce the first any-to-any checkpoint that natively models all modalities and beats other OSS models on top multimodal and reasoning benchmarks
- Develop a user-friendly interface for miners and validators to interact with the subnet's top models
- Onboard 50 more top AI researchers from leading labs and open source research collectives
- Publish a research paper on A2A's architecture, incentive model, and performance
- Release open source multimodal embedding models (based on our top A2A checkpoint's internal embedding space) for other labs to condition their models on
- Integrate a framework that auto-evaluates the models and commodities produced by other Bittensor subnets, which our top models can then interact with both through tool use and through native latent-space communication via projection modules (see the sketch after this list)
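
As a rough illustration of the "projection modules" mentioned in the last bullet, the sketch below aligns a specialist subnet model's latent space with an A2A model's embedding space via a small trainable projection. The dimensions, the pairing of latents, and the loss are assumptions for illustration only:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Hypothetical dimensions: a specialist model emitting 768-d latents,
# projected into an A2A model's 4096-d embedding space.
SPECIALIST_DIM, A2A_DIM = 768, 4096

projection = nn.Sequential(
    nn.Linear(SPECIALIST_DIM, A2A_DIM),
    nn.GELU(),
    nn.Linear(A2A_DIM, A2A_DIM),
)
optimizer = torch.optim.AdamW(projection.parameters(), lr=1e-4)

def alignment_step(specialist_latents: torch.Tensor, a2a_targets: torch.Tensor) -> float:
    """One step aligning projected specialist latents with the A2A
    embeddings of the same inputs (paired data assumed available)."""
    pred = projection(specialist_latents)
    # Cosine-alignment loss: push projected latents toward the A2A embeddings.
    loss = 1.0 - F.cosine_similarity(pred, a2a_targets, dim=-1).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# Usage with stand-in tensors:
spec_latents = torch.randn(16, SPECIALIST_DIM)
a2a_embs = torch.randn(16, A2A_DIM)
print(alignment_step(spec_latents, a2a_embs))
```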
Phase 3: Exponential Open Research Progress (Q4 2024)
- Produce the first any-to-any OSS checkpoint that beats all closed-source SOTA general intelligence models
- Establish partnerships with AI labs, universities, and industry leaders to drive adoption
- Expand our one-stop-shop Bittensor model evaluation and router framework to arbitrary open-source and closed-source checkpoints and APIs
- Implement task-driven learning, with OMEGA Labs routinely curating high-signal tasks for model trainers to master
- Start crafting an entirely new "online" validation mechanism that rewards miners for producing agentic models that can complete real-world tasks
- Use our top checkpoint to power up the multimodal intelligence features of the OMEGA Focus app
Phase 4: Agentic Focus (Q1 2025)
- Launch our agent-focused "online" validation mechanism centered around long-range task completion
- Achieve SOTA performance on agent benchmarks
- Use OMEGA Focus as an outlet to provide OMEGA digital twin companions to users
- Launch an app store for A2A-powered applications leveraging our open source models
- Reach 10M+ users with the OMEGA Focus app