Data Specs / 2026-06-06

Human demonstration dataset checklist for robotics teams

A practical checklist for evaluating clips, metadata, capture specs, annotations, consent, and QA before training physical AI systems.

Embodied AI Data Labs 7 min read

Start with clip-level traceability

Every video clip should have a stable clip_id, environment, task label, timestamps, camera view, frame rate, and quality score. That makes it possible to filter data before training and audit where model behavior came from later.

For manipulation tasks, fields such as action, object, hands, occlusion, and tool use are often more useful than broad activity labels. They describe what the robot actually needs to learn.

Separate capture specs from annotations

Capture specs describe how the data was filmed: camera angle, lighting, fps, environment, and device. Annotations describe what happened in the video. Keeping those layers separate makes schema changes cleaner.

This separation also helps teams compare egocentric and exocentric views without mixing collection quality with task semantics.

Make compliance visible

Consent status, anonymization notes, and license terms should be attached to deliveries in a way engineering and legal teams can both review.

A commercial-use rights workflow is not just a checkbox. It is a repeatable process that connects participant consent, collection scope, delivery use, and retention policy.

Human demonstration dataset checklist for robotics teams

Start with clip-level traceability

Separate capture specs from annotations

Make compliance visible

Need human task data your robots can learn from?

Keep the signal moving