Reachy Mini
Cascade pipeline · Swappable providers · Profiles & reactions

ChatBox

Modular voice conversations for your robot.

A cascade pipeline (ASR → LLM → TTS) where each stage is a swappable provider. Mix cloud APIs and local models, define personality profiles, trigger live reactions, and let the LLM call tools to dance, emote, or look around.

Cascade ASR → LLM → TTS · Local + cloud providers · Live reactions while you speak · LLM tool calls for movement
Reachy Mini dancing

Reachy Mini can move, dance, and emote while holding a natural conversation.

What's inside

A modular conversational layer for your robot

Each piece of the pipeline is independent and swappable: pick the providers that fit your setup, budget, and latency needs.

🔗

Cascade pipeline

ASR → LLM → TTS in discrete stages. VAD segments audio, ASR transcribes, LLM reasons, TTS speaks, all streaming.
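As a rough illustration of the stage order described above, here is a minimal sketch of a cascade loop. The `ASR`, `LLM`, and `TTS` protocols, the `run_cascade` function, and the stub providers are all hypothetical names invented for this example; they are not ChatBox's actual interfaces, and a real pipeline would stream audio rather than pass whole byte strings.

```python
from typing import Iterable, Iterator, Protocol


class ASR(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, text: str) -> Iterator[str]: ...


class TTS(Protocol):
    def speak(self, text: str) -> bytes: ...


def run_cascade(
    segments: Iterable[bytes], asr: ASR, llm: LLM, tts: TTS
) -> Iterator[bytes]:
    """Yield synthesized audio for each speech segment (assumed pre-cut by VAD)."""
    for segment in segments:
        transcript = asr.transcribe(segment)
        if not transcript.strip():
            continue  # nothing was recognized, so skip the LLM and TTS stages
        for chunk in llm.reply(transcript):
            # Synthesize each reply chunk as soon as the LLM produces it.
            yield tts.speak(chunk)


# Stub providers (hypothetical) so the sketch runs without any models.
class EchoASR:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UpperLLM:
    def reply(self, text: str) -> Iterator[str]:
        for part in text.split(". "):
            yield part.upper()


class BytesTTS:
    def speak(self, text: str) -> bytes:
        return text.encode("utf-8")
```

Because `run_cascade` is a generator, downstream playback could start on the first reply chunk instead of waiting for the full answer, which is the point of keeping every stage streaming.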

🔀

Swappable providers

Mix local models (Parakeet, Kokoro) with cloud APIs (Deepgram, Gemini, OpenAI, ElevenLabs). Switch from YAML or CLI flags.
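One common way to make providers switchable from configuration is a name-to-factory registry. The sketch below assumes that approach; `REGISTRY`, `register`, `build_providers`, and the placeholder classes are hypothetical and do not reflect the real `cascade.yaml` schema or the actual provider integrations.

```python
from typing import Callable, Dict

# stage name -> provider name -> factory (hypothetical layout)
REGISTRY: Dict[str, Dict[str, Callable[[], object]]] = {
    "asr": {},
    "llm": {},
    "tts": {},
}


def register(stage: str, name: str):
    """Class decorator that files a provider under a stage and a name."""
    def decorator(factory):
        REGISTRY[stage][name] = factory
        return factory
    return decorator


@register("asr", "parakeet")
class ParakeetASR:
    """Placeholder standing in for a local ASR provider."""


@register("llm", "gemini")
class GeminiLLM:
    """Placeholder standing in for a cloud LLM provider."""


@register("tts", "kokoro")
class KokoroTTS:
    """Placeholder standing in for a local TTS provider."""


def build_providers(config: Dict[str, str]) -> Dict[str, object]:
    """Instantiate one provider per stage from a {stage: name} mapping."""
    providers = {}
    for stage, name in config.items():
        try:
            factory = REGISTRY[stage][name]
        except KeyError:
            raise ValueError(f"unknown {stage} provider: {name!r}") from None
        providers[stage] = factory()
    return providers
```

With this shape, a mapping such as `{"asr": "parakeet", "llm": "gemini", "tts": "kokoro"}` could come from a parsed YAML file or from CLI flags, and changing one stage means changing one string.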

🎭

Personality profiles

Each profile bundles a system prompt, voice, enabled tools, and reaction triggers. Switch profiles live from the Gradio UI.
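To make the bundle concrete, a profile could be modeled as a small immutable record. The field names, the two example profiles, their voice identifiers, and the `Session` class below are assumptions made for illustration only, not the project's real data model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Profile:
    name: str
    system_prompt: str
    voice: str
    tools: List[str] = field(default_factory=list)
    # trigger keyword -> reaction name
    reactions: Dict[str, str] = field(default_factory=dict)


# Hypothetical profiles; the voice names are made up.
PROFILES = {
    "pirate": Profile(
        name="pirate",
        system_prompt="You are a cheerful pirate guide.",
        voice="gruff_voice",
        tools=["dance", "play_emotion"],
        reactions={"treasure": "excited"},
    ),
    "docent": Profile(
        name="docent",
        system_prompt="You are a calm museum docent.",
        voice="soft_voice",
        tools=["move_head"],
        reactions={"painting": "curious"},
    ),
}


class Session:
    """Holds the active profile and allows switching it mid-conversation."""

    def __init__(self, profile: Profile) -> None:
        self.profile = profile

    def switch(self, name: str) -> Profile:
        self.profile = PROFILES[name]
        return self.profile

    def tool_allowed(self, tool: str) -> bool:
        return tool in self.profile.tools
```

Keeping the prompt, voice, tools, and triggers in one record means a live switch is a single assignment, so the next utterance picks up the new personality as a whole.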

⚡

Live reactions

Keyword and entity triggers fire while the user is still speaking, so the robot reacts before the LLM even sees the text.
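The keyword half of this idea can be sketched as a scan over partial transcripts as they grow. This is a simplified illustration: `live_reactions` and its trigger mapping are hypothetical names, the input is assumed to be a sequence of cumulative partial transcripts from a streaming ASR, and entity triggers are not covered.

```python
import re
from typing import Dict, Iterable, Iterator, Set, Tuple


def live_reactions(
    partials: Iterable[str], triggers: Dict[str, str]
) -> Iterator[Tuple[str, str]]:
    """Yield (keyword, reaction) the first time each keyword shows up.

    `partials` is assumed to be a stream of growing partial transcripts.
    """
    fired: Set[str] = set()
    for partial in partials:
        words = set(re.findall(r"[a-z']+", partial.lower()))
        for keyword, reaction in triggers.items():
            if keyword in words and keyword not in fired:
                fired.add(keyword)  # fire once per utterance, not per partial
                yield keyword, reaction
```

Since the scan runs on partial text, a reaction can start before the utterance is finalized and handed to the LLM; the `fired` set keeps a keyword from retriggering each time the transcript is extended.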

🛠️

LLM tool calls

The LLM can dance, play emotions, move the head, peek through the camera, or toggle head-tracking, all via tool calls.
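A tool call generally reaches the application as a tool name plus arguments, which then has to be routed to a function. The sketch below assumes the arguments arrive as a JSON-encoded string; the `tool` decorator, `dispatch`, and the two example tools are hypothetical and only return strings instead of moving a robot.

```python
import json
from typing import Any, Callable, Dict

TOOLS: Dict[str, Callable[..., str]] = {}


def tool(fn: Callable[..., str]) -> Callable[..., str]:
    """Register a function under its own name so the LLM can call it."""
    TOOLS[fn.__name__] = fn
    return fn


@tool
def dance(style: str = "default") -> str:
    return f"dancing: {style}"


@tool
def move_head(yaw: float = 0.0, pitch: float = 0.0) -> str:
    return f"head -> yaw={yaw}, pitch={pitch}"


def dispatch(call: Dict[str, Any]) -> str:
    """Run one tool call shaped like {"name": ..., "arguments": "<json>"}."""
    name = call["name"]
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    args = json.loads(call.get("arguments") or "{}")
    try:
        return TOOLS[name](**args)
    except TypeError as exc:
        # Wrong or unexpected argument names end up here.
        return f"error: bad arguments for {name}: {exc}"
```

Returning error strings rather than raising lets the result be passed back to the LLM as the tool output, so the model can retry with corrected arguments.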

How it feels

From config to conversation in one command

  • ⚙️ Pick your providers in cascade.yaml: local ASR, cloud LLM, local TTS, or any mix.
  • 🗣️ Start talking and watch streaming transcripts appear in the Gradio UI.
  • ⚡ The robot reacts to keywords instantly, before your sentence even finishes.
  • 💃 Ask for a dance or an emotion; the LLM calls tools and the robot moves.

Where it shines

Demos, teaching, and custom robot personalities

Build a pirate-themed guide, a museum docent, or a classroom assistant, each with its own voice, tools, and live reactions. Swap providers to balance cost and latency, run fully local on Apple Silicon, or mix in cloud APIs for maximum quality.

Cascade pipeline · Local + cloud mix · Profiles · Live reactions · Tool calls