Backend Support

The MuJoCo and MuJoCo Warp backends, environment availability, and vectorization.

SO101-Nexus uses MuJoCo as its default simulation backend. A MuJoCo Warp backend adds native GPU-parallel batching for reinforcement learning and covers all five tasks (Touch, LookAt, Move, PickLift, PickAndPlace).

Task Availability

All five tasks run on the SO-101 arm:

| Task | Robot |
| --- | --- |
| PickLift | SO-101 |
| PickAndPlace | SO-101 |
| Touch | SO-101 |
| LookAt | SO-101 |
| Move | SO-101 |

Vectorization

MuJoCo environments are vectorized through Gymnasium's standard gym.vector.SyncVectorEnv wrapper, which runs multiple environment instances on CPU.
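As a rough illustration of the CPU path, the sketch below wraps several instances of one environment in SyncVectorEnv. The helper names build_env_fns and make_sync_vec are hypothetical, and the MuJoCo env ID is left as a parameter because this page does not list the MuJoCo IDs; the code also assumes the SO101-Nexus environments are already registered with Gymnasium.

```python
from typing import Callable, List


def build_env_fns(env_id: str, num_envs: int, **make_kwargs) -> List[Callable]:
    """Return one zero-argument constructor per environment instance."""

    def make_one():
        # Imported lazily so the constructor list can be built before
        # Gymnasium (and the simulator) is actually needed.
        import gymnasium as gym

        return gym.make(env_id, **make_kwargs)

    return [make_one for _ in range(num_envs)]


def make_sync_vec(env_id: str, num_envs: int, **make_kwargs):
    """Wrap num_envs copies of env_id in Gymnasium's SyncVectorEnv (CPU)."""
    import gymnasium as gym

    # SyncVectorEnv steps each instance in turn inside the current process.
    return gym.vector.SyncVectorEnv(build_env_fns(env_id, num_envs, **make_kwargs))


# Usage (mujoco_env_id is a placeholder for an ID registered by your install):
# envs = make_sync_vec(mujoco_env_id, num_envs=4)
# obs, info = envs.reset(seed=0)
```

Because the instances are stepped one after another, throughput scales with CPU time per step rather than with num_envs.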

The MuJoCo Warp backend instead batches simulation natively on the GPU. Construct a task with gym.make_vec("WarpPickLift-v1", num_envs=N, device="cuda", vectorization_mode="vector_entry_point"), substituting the env ID of the task you want. The available env IDs are WarpTouch-v1, WarpLookAt-v1, WarpMove-v1, WarpPickLift-v1, and WarpPickAndPlace-v1. Both reset and step return torch CUDA tensors of shape (num_envs, ...).
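The call above can be wrapped in a small helper, sketched here under a few assumptions: make_warp_vec and WARP_ENV_IDS are hypothetical names introduced for illustration, the default num_envs is arbitrary, and the Warp environments are assumed to be registered with Gymnasium already (this page does not show how registration is triggered).

```python
# Env IDs as listed in this section.
WARP_ENV_IDS = (
    "WarpTouch-v1",
    "WarpLookAt-v1",
    "WarpMove-v1",
    "WarpPickLift-v1",
    "WarpPickAndPlace-v1",
)


def make_warp_vec(env_id: str = "WarpPickLift-v1", num_envs: int = 256, device: str = "cuda"):
    """Build a natively batched Warp env; requires the warp extra and a CUDA GPU."""
    if env_id not in WARP_ENV_IDS:
        raise ValueError(f"unknown Warp env ID {env_id!r}; expected one of {WARP_ENV_IDS}")

    # Imported lazily so the ID check works without Gymnasium installed.
    import gymnasium as gym

    # "vector_entry_point" asks Gymnasium to use the env's own batched
    # implementation instead of wrapping num_envs separate instances.
    return gym.make_vec(
        env_id,
        num_envs=num_envs,
        device=device,
        vectorization_mode="vector_entry_point",
    )


# Usage (needs a GPU):
# envs = make_warp_vec("WarpTouch-v1", num_envs=1024)
# obs, info = envs.reset(seed=0)  # leading dimension of each tensor is num_envs
```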

Warp envs expose the same composable state observation components as MuJoCo (joint positions, end-effector pose, grasp state, object and target offsets, gaze direction). They also support WristCamera and OverheadCamera observation components, rendered in one batched ray-tracing pass on the simulation device. Camera observations are uint8 image tensors of shape (num_envs, height, width, 3), with per-world wrist-camera domain randomization.
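Most convolutional policies expect float, channels-first input, so the uint8 (num_envs, height, width, 3) camera batch usually needs one conversion step. The helper below is a minimal sketch of that step; images_to_policy_input is a hypothetical name, and how the image tensor is keyed inside the observation is not specified on this page, so the function takes the tensor directly.

```python
def images_to_policy_input(images):
    """Convert a uint8 (N, H, W, 3) torch tensor to float32 (N, 3, H, W) in [0, 1].

    The tensor stays on whatever device it already lives on, so a CUDA
    camera batch is converted without a host round trip.
    """
    if getattr(images, "ndim", None) != 4 or images.shape[-1] != 3:
        raise ValueError("expected an image batch of shape (num_envs, height, width, 3)")

    # Move channels ahead of the spatial dims, then rescale 0..255 to 0..1.
    return images.permute(0, 3, 1, 2).float().div(255.0)
```

Keeping the conversion on the simulation device preserves the benefit of rendering and stepping in one place.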

Render Modes

MuJoCo envs support Gymnasium render_mode="human" and render_mode="rgb_array" for visualization. Warp envs do not implement render() because images are produced as observation tensors; passing a render_mode to gym.make_vec() is accepted for compatibility, but it triggers a warning and is otherwise ignored. Configure WristCamera or OverheadCamera in observations when a Warp policy needs pixels.

WarpPickLift and WarpTouch accept the same heterogeneous object pools as MuJoCo (CubeObject, YCBObject, MeshObject, plus n_distractors), and WarpPickAndPlace accepts a carried-object pool. Each pool object is compiled as a freejoint slot, the per-world target is selected at reset, and inactive slots are parked off-world. Because all worlds share one model, per-episode geom_rgba colour randomization of a single object is unsupported; distinct coloured cube slots still give per-world colour variation through selection, and the WarpPickAndPlace goal disc colour is fixed to the first configured colour.
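The slot scheme described above can be pictured with a small pure-Python model. This is a conceptual sketch only, not the backend's implementation: assign_slots is a hypothetical function, and the uniform random selection is an assumption. Every world sees the same compiled slots, and a reset decides per world which slot is the target, which are distractors, and which are parked.

```python
import random
from typing import Dict, List


def assign_slots(num_worlds: int, num_slots: int, n_distractors: int, seed: int = 0) -> List[Dict]:
    """Pick, for each world, one target slot and n_distractors distractor slots.

    All remaining slots are reported as parked (moved off-world in the
    real backend, according to the description above).
    """
    if n_distractors + 1 > num_slots:
        raise ValueError("need at least n_distractors + 1 slots in the pool")

    rng = random.Random(seed)
    worlds = []
    for _ in range(num_worlds):
        # Draw distinct slots: the first is the target, the rest distractors.
        active = rng.sample(range(num_slots), n_distractors + 1)
        parked = [slot for slot in range(num_slots) if slot not in active]
        worlds.append({"target": active[0], "distractors": active[1:], "parked": parked})
    return worlds
```

Seen this way, per-world colour variation comes from which slot is selected, since the slots themselves are shared by all worlds.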

Vectorized worlds may carry different tasks, so env.task_descriptions holds one string per world and info["task_description"] is returned from reset() and step(). env.task_description reduces to the shared string, or a generic family string when worlds differ.
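The reduction from per-world strings to a single env.task_description can be sketched as follows. The function name and the family_description argument are hypothetical; the sketch only mirrors the behaviour described above and is not taken from the library source.

```python
from typing import Sequence


def reduce_task_description(task_descriptions: Sequence[str], family_description: str) -> str:
    """Return the shared description, or the generic family string if worlds differ."""
    if not task_descriptions:
        raise ValueError("expected one task description per world")

    first = task_descriptions[0]
    if all(description == first for description in task_descriptions):
        return first
    return family_description
```

For example, two worlds that both say "pick up the red cube" reduce to that string, while a mix of different cube colours falls back to the family string passed in.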

The physics also diverges: Warp uses the implicit integrator without the noslip constraint, so policies may need re-tuning when moving between backends. Install the backend with the warp extra (pip install "so101-nexus[warp]"), which needs an NVIDIA GPU and CUDA >= 12.4 for GPU execution.

Objects

Both backends support CubeObject, YCBObject, and MeshObject, configured through the same composable observation components and object-pool config (objects, n_distractors).

For details on object types, see the Objects API reference. For observation configuration, see Observations.
