Tasks and Success Conditions
Task objectives, success criteria, and reward structure.
SO101-Nexus ships five task types that cover fundamental manipulation primitives.
Task Overview
| Task | Objective | Max Steps |
|---|---|---|
| PickLift | Grasp an object and lift it above a height threshold | 1024 |
| PickAndPlace | Pick up a cube and place it at a target location | 1024 |
| Touch | Bring the gripper to an object resting on the table | 512 |
| LookAt | Orient the end-effector to gaze at a target object | 256 |
| Move | Move the TCP a set distance in a cardinal direction | 256 |
Success Conditions
Each task defines a binary success signal returned in the info dict.
PickLift succeeds when the object is grasped and lifted above the lift_threshold (default: 0.05 m).
PickAndPlace succeeds when the cube's XY distance to the goal is below goal_thresh, the cube is on the ground, and the robot is static.
Touch succeeds when the TCP comes within the target object's bounding radius plus touch_margin (default: 0.03 m) of the object. Because the threshold scales with the bounding radius, the condition triggers for objects of any size.
LookAt succeeds when the target is within the wrist camera's field of view: the angle between the wrist-camera optical axis and the direction to the object is at most half the camera FOV (fov_deg / 2, or the live cam_fovy / 2 when fov_deg is None). The dense reward additionally shapes toward a dead-center view of the target.
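The LookAt angle test can be sketched as follows. This is a minimal illustration, not the library's implementation: the function names and arguments are hypothetical, positions are plain 3-tuples, and cam_fovy is assumed to be in degrees like fov_deg.

```python
import math


def view_angle_deg(cam_pos, cam_axis, target_pos):
    """Angle in degrees between the optical axis and the direction to the target."""
    to_target = [t - c for t, c in zip(target_pos, cam_pos)]
    dist = math.sqrt(sum(v * v for v in to_target))
    axis_len = math.sqrt(sum(v * v for v in cam_axis))
    if dist == 0.0 or axis_len == 0.0:
        raise ValueError("direction is undefined for a zero-length vector")
    cos_angle = sum(a * v for a, v in zip(cam_axis, to_target)) / (axis_len * dist)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


def look_at_success(cam_pos, cam_axis, target_pos, fov_deg=None, cam_fovy=None):
    """Hypothetical check: target is within half the field of view of the axis."""
    # Fall back to the live camera value when fov_deg is not set (assumed degrees).
    fov = fov_deg if fov_deg is not None else cam_fovy
    if fov is None:
        raise ValueError("either fov_deg or cam_fovy must be provided")
    return view_angle_deg(cam_pos, cam_axis, target_pos) <= fov / 2.0
```

For example, a target 45 degrees off the optical axis fails with fov_deg=60 (half-angle 30) and passes with fov_deg=100 (half-angle 50).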
Move succeeds when the TCP's displacement along the move direction reaches target_distance, to within success_threshold (default: 0.01 m). Drift perpendicular to the move direction does not affect success.
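The Touch and Move conditions reduce to simple geometric checks, sketched below. The function names and signatures are hypothetical. The sketch also assumes that Touch measures distance to the object's center and that "within success_threshold" means the projected travel may fall short of target_distance by at most that tolerance; the actual implementation may differ on both points.

```python
import math


def touch_success(tcp_pos, obj_pos, bounding_radius, touch_margin=0.03):
    """Hypothetical check: TCP is within bounding radius + margin of the object center."""
    return math.dist(tcp_pos, obj_pos) <= bounding_radius + touch_margin


def move_success(start_pos, tcp_pos, direction, target_distance, success_threshold=0.01):
    """Hypothetical check: travel projected onto the move direction is long enough."""
    norm = math.sqrt(sum(d * d for d in direction))
    if norm == 0.0:
        raise ValueError("direction must be non-zero")
    # Project the displacement onto the unit direction; perpendicular drift drops out.
    travelled = sum((p - s) * d for p, s, d in zip(tcp_pos, start_pos, direction)) / norm
    # Assumed reading of the tolerance: allow a shortfall of success_threshold.
    return travelled >= target_distance - success_threshold
```

Because only the projected component is compared, a TCP that moves 0.10 m along +X while drifting 0.05 m in Y still satisfies a 0.10 m Move target in this sketch.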
Reward Structure
All tasks use a shared reward structure that decomposes into weighted components: reaching progress, grasp quality, the task-specific objective (lifting height, placement accuracy, etc.), and a one-time completion bonus. The weights are configurable through RewardConfig and must sum to 1.0.
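A weighted decomposition of this kind might look like the sketch below. SketchRewardWeights is a hypothetical stand-in, not the library's RewardConfig: the field names and the 0.25 values are illustrative assumptions, and the real parameters are listed in the Configs API reference.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class SketchRewardWeights:
    """Hypothetical weights; values are illustrative, not the library defaults."""
    reach: float = 0.25
    grasp: float = 0.25
    task: float = 0.25
    completion: float = 0.25

    def __post_init__(self):
        total = self.reach + self.grasp + self.task + self.completion
        # Enforce the sum-to-one constraint with a small float tolerance.
        if not math.isclose(total, 1.0, abs_tol=1e-9):
            raise ValueError(f"weights must sum to 1.0, got {total}")


def combine_reward(weights, reach, grasp, task, completed):
    """Weighted sum of per-component rewards, each assumed to lie in [0, 1]."""
    bonus = 1.0 if completed else 0.0
    return (
        weights.reach * reach
        + weights.grasp * grasp
        + weights.task * task
        + weights.completion * bonus
    )
```

With weights that sum to 1.0 and components bounded in [0, 1], the combined reward is itself bounded in [0, 1], which is one plausible reason for the constraint.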
Distance-based reward components use tanh shaping to produce smooth, bounded rewards that decay as the agent moves away from the goal. Optional penalties for action deltas and energy usage can be enabled for smoother or more efficient behavior.
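One common form of tanh shaping is 1 - tanh(scale * distance), sketched below. The exact formula and the scale value used by SO101-Nexus are not stated here, so both are assumptions for illustration.

```python
import math


def tanh_shaped_reward(distance, scale=5.0):
    """Map a non-negative distance to a reward in [0, 1].

    Returns 1.0 at the goal and decays smoothly toward 0.0 as distance grows.
    The scale value is an illustrative assumption; larger values decay faster.
    """
    if distance < 0.0:
        raise ValueError("distance must be non-negative")
    return 1.0 - math.tanh(scale * distance)
```

The reward is 1.0 at zero distance and strictly decreasing in distance until the floating-point tanh saturates at large distances, so the agent always receives a gradient toward the goal within the useful range.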
See the Configs API reference for all RewardConfig parameters and default values.