Vision-Guided Pick-and-Place with the xArm6
In this workshop you build a vision-guided robot that finds blocks by shape, picks each one up, and places it in a bin. A shape-detector vision service locates blocks in camera space and feeds a vision-segment service (model detections-to-segments) that turns each detection into a point cloud segment the motion planner can grasp.
The workshop runs in six sequential phases, each ending with a working system state you can verify before moving on. There are two milestones: by the end of Phase 4 you drive the robot from your own code through a static, pre-planned sequence (milestone one, a real and bankable win), and by the end of Phase 5 you close the loop with live perception so the robot detects, picks, and places blocks on its own (milestone two). At minimum, aim to complete the Phase 4 script.
What you’ll build
You will configure an xArm6 robotic arm fitted with a finger gripper and a wrist-mounted Intel RealSense depth camera. A shape-detection vision service finds blocks in camera space, and the Viam motion service plans and executes collision-free picks that place each block in the bin. By the end of Phase 5 you have a Python script you run from your laptop that drives the full detect-pick-place loop.
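The control flow of a detect-pick-place loop can be sketched in plain Python before any hardware is involved. This is only an illustrative outline, not the workshop's reference solution: the `Block` type and the `detect`, `pick`, and `place` callables are hypothetical stand-ins for the vision and motion service calls you wire up in Phase 5, and the coordinate fields and their units are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Block:
    """A detected block (hypothetical type): shape label plus a position."""
    shape: str
    x: float
    y: float
    z: float


def run_pick_and_place(
    detect: Callable[[], List[Block]],
    pick: Callable[[Block], bool],
    place: Callable[[Block], None],
    max_rounds: int = 10,
) -> int:
    """Detect blocks, then pick and place each one until none remain.

    Returns the number of blocks placed. The loop stops after max_rounds
    so a block that can never be grasped does not keep it running forever.
    """
    placed = 0
    for _ in range(max_rounds):
        blocks = detect()
        if not blocks:
            break  # nothing left to pick
        for block in blocks:
            if pick(block):  # skip blocks the gripper fails to grasp
                place(block)
                placed += 1
    return placed
```

Passing the three steps in as callables keeps the loop testable with fakes on a laptop; in the real script each callable would wrap the corresponding service call.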
Hardware
- uFactory xArm6: the six-axis robotic arm that picks and places the blocks.
- Intel RealSense D435: the wrist-mounted depth camera that detects block positions by shape.
- uFactory finger gripper: the end-effector that grasps the blocks.
- System76 Meerkat: the on-robot mini-PC that runs the Viam machine server.
Phases
Phases 1 through 5 are the core workshop. Phase 6 is optional.
- Phase 1: Platform mental model (~15 min)
- Phase 2: Configure resources and explore the app (~20 min)
- Phase 3: Static positions and obstacles (~20 min)
- Phase 4: Control the robot from Python (~15 min, milestone one)
- Phase 5: Perception-guided picking (~22 min, milestone two)
- Phase 6: Inline module (~20 min, optional)
Prerequisites
This is a self-serve workshop, so confirm each of the following before you start:
- Python 3.10 or newer. Install it with uv (recommended) or from python.org.
- The Viam Python SDK. The companion scripts/ project already declares viam-sdk, so uv run installs it for you in Phase 4. See the Python SDK docs for reference. Pip works too if you prefer it.
- A working terminal on the machine you will run the Phase 4 and Phase 5 scripts from, typically your laptop rather than the robot’s Meerkat.
- A Viam account with an accessible machine. Log in at app.viam.com, open your machine, and confirm the green Live indicator before you begin.
Validate your environment
Before starting Phase 4, confirm your environment is ready:
python3 --version # 3.10 or newer
uv run --with viam-sdk python -c "import viam; print(viam.__version__)" # prints a version
If either command fails, revisit the checklist above before continuing.
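If you prefer a single check you can rerun, the two commands above can be folded into a short Python script. This is a minimal sketch using only the standard library; the function name `check_environment` is hypothetical, and it assumes you run it with the same interpreter (or inside the same uv environment) that will run the Phase 4 script.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 10)  # minimum version listed in the prerequisites


def check_environment(min_python=MIN_PYTHON, module="viam"):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if sys.version_info[:2] < min_python:
        found = ".".join(str(n) for n in sys.version_info[:2])
        wanted = ".".join(str(n) for n in min_python)
        problems.append(f"Python {found} found, {wanted} or newer required")
    # find_spec locates the package without importing it
    if importlib.util.find_spec(module) is None:
        problems.append(f"module '{module}' is not installed")
    return problems


if __name__ == "__main__":
    issues = check_environment()
    for issue in issues:
        print("FAIL:", issue)
    if not issues:
        print("Environment ready.")
```

The check returns every problem it finds rather than stopping at the first, so one run tells you everything to fix.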
Where to start
- Physical hardware ready: start at Phase 1.
- Provisioning your own hardware: complete the hardware setup guide first (forthcoming), then return here for Phase 1.
Only the physical hardware, viam-agent, viam-server, and the frame calibration (the camera’s mounting offset on the arm) may be pre-provisioned for you or come from the hardware setup guide. Configuring the arm, gripper, camera, and the vision and motion services is always your hands-on work in this workshop, starting in Phase 2.
Companion code
All supporting files for this workshop live in the viam-devrel/pick-and-place repository.
- config/ holds a machine config fragment and an obstacles template. Use them to check your work after you configure resources by hand, not as something to import wholesale.
- scripts/ holds the starter script for Phase 4 and the reference solution for Phase 5.