01 // ABSTRACT
SUBJECT: BEYOND THE TRANSFORMER
REF: LECUN_JEPA_2022
Current LLMs are autoregressive text simulators. They
hallucinate because they optimize for token probability, not
physical reality.
We are building World Models: systems that learn a causal
representation of the environment. We do not train on text;
we train on video, sensor data, and simulation.
>> TARGET: AGI VIA SIMULATION
>> METHOD: SELF-SUPERVISED LEARNING
>> STATUS: PRE-TRAINING
02 // OPEN_ROLES
RS-001: RESEARCH SCIENTIST (WORLD MODELS)
PALO ALTO
You will focus on Joint Embedding Predictive
Architectures (JEPA). The goal is to learn hierarchical
state representations that can predict future states in
latent space without pixel-level reconstruction.
See: Hafner et al., "DreamerV3: Mastering Diverse
Domains through World Models"
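For orientation, here is a minimal, single-step sketch of the JEPA idea described above: predict the latent of the next observation rather than its pixels. This is not our training code; it assumes PyTorch, uses placeholder dimensions and module names, and omits the hierarchical stack.

import copy
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Maps an observation (e.g. a flattened frame) to a latent state."""
    def __init__(self, obs_dim=1024, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 512), nn.GELU(),
            nn.Linear(512, latent_dim),
        )
    def forward(self, x):
        return self.net(x)

class Predictor(nn.Module):
    """Predicts the latent of the future observation from the current latent."""
    def __init__(self, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.GELU(),
            nn.Linear(512, latent_dim),
        )
    def forward(self, z):
        return self.net(z)

context_enc = Encoder()
target_enc = copy.deepcopy(context_enc)   # EMA copy, never backpropagated
for p in target_enc.parameters():
    p.requires_grad_(False)
predictor = Predictor()
opt = torch.optim.AdamW(
    list(context_enc.parameters()) + list(predictor.parameters()), lr=1e-4)

def train_step(obs_t, obs_tp1, ema=0.996):
    """One JEPA-style step: regress the *latent* of obs_tp1, never its pixels."""
    z_t = context_enc(obs_t)
    with torch.no_grad():
        z_tp1 = target_enc(obs_tp1)          # target latent, stop-gradient
    loss = F.smooth_l1_loss(predictor(z_t), z_tp1)
    opt.zero_grad()
    loss.backward()
    opt.step()
    # EMA update keeps targets slow-moving and helps avoid representation collapse
    with torch.no_grad():
        for p, tp in zip(context_enc.parameters(), target_enc.parameters()):
            tp.lerp_(p, 1.0 - ema)
    return loss.item()

The loss lives entirely in latent space, which is the point: no decoder, no pixel-level reconstruction target.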
ENG-004: KERNEL ENGINEER (INFERENCE)
REMOTE
Writing custom CUDA kernels to optimize the training of
non-transformer architectures. We are bottlenecked by
memory bandwidth, not compute. You need to know the H100
architecture down to the register level.
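A back-of-envelope roofline check makes the "bandwidth, not compute" claim concrete. The H100 SXM peak figures below are approximate published numbers, and the example kernel is hypothetical.

# Is a kernel compute-bound or memory-bound? Compare its arithmetic
# intensity to the machine balance point (assumed H100 SXM peaks).
PEAK_BF16_FLOPS = 989e12      # ~989 TFLOP/s dense BF16 (approximate)
PEAK_HBM_BW     = 3.35e12     # ~3.35 TB/s HBM3 (approximate)

def bound(flops, bytes_moved):
    """Return (intensity, ridge point, verdict) for a kernel."""
    intensity = flops / bytes_moved                  # FLOP per byte
    ridge = PEAK_BF16_FLOPS / PEAK_HBM_BW            # ~295 FLOP/byte
    kind = "compute-bound" if intensity > ridge else "memory-bound"
    return intensity, ridge, kind

# Example: a fused elementwise parameter update over N bf16 values
# (read weight, read grad, write weight): ~2 FLOPs and 6 bytes per element.
N = 70e9                                             # 70B parameters
print(bound(flops=2 * N, bytes_moved=6 * N))
# -> ~0.33 FLOP/byte, far below the ~295 ridge: memory-bound, so fusing
#    kernels and avoiding round-trips to HBM matters more than raw FLOPs.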
MTS-002: MEMBER OF TECHNICAL STAFF (SIMULATION)
PALO ALTO
Scaling procedural environment generation for Sim2Real
transfer. If the agent can't learn it in the sim, it
won't work on the robot. We need millions of hours of
varied physics data.
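"Varied physics data" in practice means domain randomization: every episode gets its own dynamics. The sketch below is illustrative only; the parameter names, ranges, and the env.reset hand-off are hypothetical.

import random
from dataclasses import dataclass

@dataclass
class PhysicsParams:
    gravity: float          # m/s^2
    friction: float         # surface friction coefficient
    mass_scale: float       # multiplier on nominal link masses
    motor_latency_ms: float # actuation delay

def sample_params(rng: random.Random) -> PhysicsParams:
    """Draw a fresh physics configuration so no two episodes share dynamics."""
    return PhysicsParams(
        gravity=rng.uniform(9.6, 10.0),
        friction=rng.uniform(0.4, 1.2),
        mass_scale=rng.uniform(0.8, 1.2),
        motor_latency_ms=rng.uniform(0.0, 20.0),
    )

rng = random.Random(0)
for episode in range(3):
    params = sample_params(rng)
    # env.reset(physics=params)  # hand the draw to whatever simulator backs the run
    print(episode, params)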
03 // LAB_STATUS
COMPUTE CLUSTER:
ONLINE [98% UTIL]
CURRENT RUN:
MODEL: N_CORP_V4
PARAMS: 70B (DENSE)
TOKENS: N/A (VIDEO)
NOTE:
PLEASE DON'T BE A SCRIPT KIDDIE.
RESEARCHERS AND ENGINEERS ARE TWO HALVES OF A WHOLE.