SO101-Nexus

Practical recipes for configuring episode length, cameras, colors, spawn regions, rewards, and more.

Every SO101-Nexus environment accepts a typed config object that controls its behavior. Config classes are imported from so101_nexus and passed to gymnasium.make (or gymnasium.make_vec for batched Warp environments) through the config keyword.

Change Episode Length

Shorter episodes speed up training on simple tasks. Longer episodes give the agent more time for multi-step tasks. Episode length is set with max_episode_steps, which is passed at make time rather than on the config object. It works the same way for every environment on both backends.

import gymnasium as gym
import so101_nexus.mujoco
import so101_nexus.warp
from so101_nexus import TouchConfig

config = TouchConfig()

# MuJoCo (single env): applied via the TimeLimit wrapper
env = gym.make("MuJoCoTouch-v1", config=config, max_episode_steps=256)

# Warp (batched): forwarded to the vector env, which truncates internally
envs = gym.make_vec("WarpTouch-v1", num_envs=4, device="cuda", config=config, max_episode_steps=256)
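
To make the truncation behavior concrete, here is a minimal, library-independent sketch of what a step limit does. StepLimit is a hypothetical stand-in written for this illustration; it is not part of so101_nexus or Gymnasium. It only mimics the idea of counting steps since the last reset and reporting truncated=True once max_episode_steps is reached.

```python
class StepLimit:
    """Hypothetical stand-in that mimics a step-limit wrapper (illustration only)."""

    def __init__(self, max_episode_steps: int) -> None:
        if max_episode_steps <= 0:
            raise ValueError("max_episode_steps must be positive")
        self.max_episode_steps = max_episode_steps
        self.elapsed_steps = 0

    def reset(self) -> None:
        # A new episode starts counting from zero.
        self.elapsed_steps = 0

    def step(self) -> bool:
        # Count this step and return the `truncated` flag for it.
        self.elapsed_steps += 1
        return self.elapsed_steps >= self.max_episode_steps


limit = StepLimit(max_episode_steps=256)
limit.reset()
flags = [limit.step() for _ in range(256)]  # one truncated flag per step
```

With max_episode_steps=256, only the 256th step reports truncation, so a training loop that resets on truncated sees episodes of at most 256 steps.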

Add Camera Observations

Include camera images by adding a camera component to the observations list. The observation becomes a dictionary with a "state" key (flat vector) plus one key per camera.

import gymnasium as gym
import so101_nexus.mujoco
from so101_nexus import (
    PickConfig, EndEffectorPose, GraspState, ObjectPose, ObjectOffset, WristCamera,
)

config = PickConfig(observations=[
    EndEffectorPose(),
    GraspState(),
    ObjectPose(),
    ObjectOffset(),
    WristCamera(width=224, height=224),
])
env = gym.make("MuJoCoPickLift-v1", config=config, render_mode="rgb_array")
obs, _ = env.reset()
print(obs["state"].shape)         # (18,): flat state vector
print(obs["wrist_camera"].shape)  # (224, 224, 3): camera image

For Warp, use the same observation config and create a batched vector env. Do not rely on render_mode for pixels; camera components are the image source.

import so101_nexus.warp

envs = gym.make_vec("WarpPickLift-v1", num_envs=1024, device="cuda", config=config)
obs, _ = envs.reset()
print(obs["wrist_camera"].shape)  # (1024, 224, 224, 3)

You can also add an overhead camera, or both:

from so101_nexus import OverheadCamera

config = PickConfig(observations=[
    EndEffectorPose(),
    GraspState(),
    ObjectPose(),
    ObjectOffset(),
    WristCamera(width=224, height=224),
    OverheadCamera(width=224, height=224),
])
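
A vision policy usually needs the dictionary observation described above split into its state vector and its images. The sketch below shows one way to do that with NumPy. It uses stand-in arrays with the documented shapes rather than a live simulator. The "overhead_camera" key and the uint8 image dtype are assumptions for illustration, and split_observation and to_channels_first are hypothetical helpers, not so101_nexus functions.

```python
import numpy as np

# Stand-in observation with the documented shapes (not produced by the simulator).
# Assumptions: images are uint8 and the overhead key is "overhead_camera".
obs = {
    "state": np.zeros(18, dtype=np.float32),
    "wrist_camera": np.zeros((224, 224, 3), dtype=np.uint8),
    "overhead_camera": np.full((224, 224, 3), 255, dtype=np.uint8),
}


def split_observation(obs):
    """Separate the flat state vector from the per-camera images."""
    state = obs["state"]
    images = {key: value for key, value in obs.items() if key != "state"}
    return state, images


def to_channels_first(image):
    """Convert an (H, W, 3) uint8 image to a (3, H, W) float32 array in [0, 1]."""
    return np.transpose(image, (2, 0, 1)).astype(np.float32) / 255.0


state, images = split_observation(obs)
tensors = {name: to_channels_first(img) for name, img in images.items()}
```

Treating every non-"state" key as an image keeps the helper unchanged when a second camera is added to the observations list.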

Visual Observation Mode

For vision-based policies, use obs_mode="visual" to signal that the policy should not rely on privileged simulator state. This mode requires at least one camera component in the observations list.

import gymnasium as gym
import so101_nexus.mujoco
from so101_nexus import PickConfig, JointPositions, WristCamera

config = PickConfig(
    obs_mode="visual",
    observations=[JointPositions(), WristCamera(width=224, height=224)],
)
env = gym.make("MuJoCoPickLift-v1", config=config)
obs, info = env.reset()

obs["state"]              # (6,) joint positions only
obs["wrist_camera"]       # (224, 224, 3) camera image

Domain Randomization

Randomize colors and spawn positions to improve sim-to-real transfer. Color fields accept a single color name or a list. When a list is provided, the environment samples from it uniformly at each reset.

import gymnasium as gym
import so101_nexus.mujoco
from so101_nexus import PickConfig

config = PickConfig(
    # Randomize visual appearance
    ground_colors=["gray", "white", "black"],
    robot_colors=["yellow", "orange"],
    # Tighten the spawn region
    spawn_min_radius=0.20,
    spawn_max_radius=0.30,
    spawn_angle_half_range_deg=60.0,
)
env = gym.make("MuJoCoPickLift-v1", config=config)

Available colors: red, orange, yellow, green, blue, purple, black, white, gray.
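
The following sketch illustrates what per-reset randomization of this kind can look like. It is a sketch under stated assumptions, not the library's implementation. Uniform color sampling follows the description above. Drawing the radius and angle independently and uniformly, and the axis convention (angle 0 along +x), are assumptions. sample_color and sample_spawn_xy are hypothetical helpers.

```python
import math
import random


def sample_color(colors, rng):
    """Return a single color name as-is, or a uniform pick from a list of names."""
    if isinstance(colors, str):
        return colors
    return rng.choice(list(colors))


def sample_spawn_xy(min_radius, max_radius, angle_half_range_deg, rng):
    """Sample an (x, y) point inside an annular sector.

    Assumption: radius and angle are drawn independently and uniformly, with
    angle 0 pointing along +x. The real distribution and axes may differ.
    """
    radius = rng.uniform(min_radius, max_radius)
    angle = math.radians(rng.uniform(-angle_half_range_deg, angle_half_range_deg))
    return radius * math.cos(angle), radius * math.sin(angle)


rng = random.Random(0)  # seeded for reproducibility
ground = sample_color(["gray", "white", "black"], rng)
x, y = sample_spawn_xy(0.20, 0.30, 60.0, rng)
```

Tightening spawn_min_radius, spawn_max_radius, and spawn_angle_half_range_deg shrinks this sector, which narrows the range of object positions the policy sees during training.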

Reset Settling

By default, environments advance 5 no-op frames after reset before returning the first observation. This skips the unstable first frames after contacts and controllers initialize. Set reset_settle_frames=0 when you need to inspect the raw reset state.

import gymnasium as gym
import so101_nexus.mujoco
from so101_nexus import PickConfig

config = PickConfig(reset_settle_frames=0)
env = gym.make("MuJoCoPickLift-v1", config=config)

Tune Rewards

The reward is composed of four weighted components: reaching (progress toward the object), grasping (gripper contact), task_objective (the primary goal, such as lifting), and completion_bonus (a one-time success reward). The four weights must sum to 1.0. Optional penalty terms are applied separately and do not count toward that sum.

import gymnasium as gym
import so101_nexus.mujoco
from so101_nexus import RewardConfig, PickConfig

reward = RewardConfig(
    reaching=0.20,
    grasping=0.20,
    task_objective=0.50,
    completion_bonus=0.10,
    action_delta_penalty=0.025,
    energy_penalty=0.025,
)
config = PickConfig(reward=reward)
env = gym.make("MuJoCoPickLift-v1", config=config)
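
As a rough illustration of the weighting rule, the sketch below checks that the four weights sum to 1.0 and combines them with per-step component values. combine_reward is a hypothetical helper, and the component values are invented for the example. How the library actually scales each component and turns the penalty coefficients into a penalty value is not specified here, so the penalty is passed in as a precomputed number.

```python
import math


def combine_reward(components, weights, penalties=0.0):
    """Weighted sum of reward components minus penalties (illustrative assumption)."""
    total_weight = sum(weights.values())
    if not math.isclose(total_weight, 1.0, abs_tol=1e-9):
        raise ValueError(f"weights must sum to 1.0, got {total_weight}")
    weighted = sum(weights[name] * components[name] for name in weights)
    return weighted - penalties


weights = {
    "reaching": 0.20,
    "grasping": 0.20,
    "task_objective": 0.50,
    "completion_bonus": 0.10,
}
# Hypothetical per-step component values, each assumed to lie in [0, 1].
components = {
    "reaching": 1.0,
    "grasping": 1.0,
    "task_objective": 0.5,
    "completion_bonus": 0.0,
}
reward = combine_reward(components, weights, penalties=0.05)
```

Under these assumptions the example evaluates to 0.2 + 0.2 + 0.25 + 0.0 - 0.05 = 0.60. Changing one weight without adjusting another raises a ValueError.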

Configure Pick Tasks

PickConfig extends EnvironmentConfig with object and distractor settings. Use it for PickLift environments.

import gymnasium as gym
import so101_nexus.mujoco
from so101_nexus import PickConfig, CubeObject, YCBObject

config = PickConfig(
    objects=[
        CubeObject(color="blue"),
        YCBObject(model_id="058_golf_ball"),
        YCBObject(model_id="011_banana"),
    ],
    n_distractors=1,
    lift_threshold=0.06,
    min_object_separation=0.04,
    ground_colors=["gray", "white"],
)
env = gym.make("MuJoCoPickLift-v1", config=config, render_mode="human")

When multiple objects are provided, the environment selects one as the target at each reset.
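
The sketch below shows one plausible way target selection and minimum separation could interact at reset. It is an assumption-laden illustration, not the library's algorithm: drawing distractors from the remaining objects without replacement, the square sampling region, and rejection sampling are all assumptions, and both helper functions are hypothetical.

```python
import math
import random


def choose_target_and_distractors(objects, n_distractors, rng):
    """Pick one target, then distractors from the rest (assumed: without replacement)."""
    target_index = rng.randrange(len(objects))
    others = [obj for i, obj in enumerate(objects) if i != target_index]
    distractors = rng.sample(others, k=min(n_distractors, len(others)))
    return objects[target_index], distractors


def place_with_separation(n_points, min_separation, rng,
                          low=-0.1, high=0.1, max_tries=1000):
    """Rejection-sample 2D points so every pair is at least min_separation apart."""
    points = []
    for _ in range(max_tries):
        if len(points) == n_points:
            break
        candidate = (rng.uniform(low, high), rng.uniform(low, high))
        # Keep the candidate only if it is far enough from all accepted points.
        if all(math.dist(candidate, p) >= min_separation for p in points):
            points.append(candidate)
    if len(points) < n_points:
        raise RuntimeError("could not place all objects; relax min_separation")
    return points


rng = random.Random(0)
objects = ["cube_blue", "058_golf_ball", "011_banana"]  # labels for illustration
target, distractors = choose_target_and_distractors(objects, n_distractors=1, rng=rng)
positions = place_with_separation(1 + len(distractors), min_separation=0.04, rng=rng)
```

A larger min_object_separation or more distractors makes valid placements harder to find, which is why the sketch caps the number of attempts instead of looping forever.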

Configure Pick-and-Place Tasks

PickAndPlaceConfig controls the cube and target disc appearance. Use it for PickAndPlace environments.

import gymnasium as gym
import so101_nexus.mujoco
from so101_nexus import PickAndPlaceConfig

config = PickAndPlaceConfig(
    cube_colors=["red", "green", "blue"],
    target_colors="purple",
    min_cube_target_separation=0.05,
)
env = gym.make("MuJoCoPickAndPlace-v1", config=config, render_mode="human")

All Parameters

See the Configs API reference for the full list of parameters on EnvironmentConfig, PickConfig, PickAndPlaceConfig, TouchConfig, LookAtConfig, MoveConfig, RewardConfig, RenderConfig, and RobotConfig.
