What is Yumi?

Yumi is not just a chatbot, nor is she a standard command-line voice utility. She is designed to be a digital companion with an emotional presence, combining state-of-the-art local voice models, real-time animation, and persona-driven LLM intelligence.

The Dream of Presence

Most modern AI tools are transactional. You open a webpage, submit a query, wait for a paragraph of text, and then close the window. The interaction is sterile, static, and disjointed.

Yumi was built on a different premise: that computers should feel like companions.

She Listens: Continuous Voice Activity Detection (VAD) picks up when you start and stop speaking.
She Speaks: Natural, expressive voices are streamed with sub-second latency.
She Feels: An active, animated Live2D avatar that reacts to conversations with expressive eye movements, body nods, and emotional shifts (tsundere, caring, kuudere) that map dynamically to her underlying thoughts.
She Adapts: Dynamic hot-swappable personalities change how she addresses you, what words she uses, and how she sounds.
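
As a rough illustration of what "hot-swappable" could mean in practice, the sketch below keeps a registry of personas and switches the active one at runtime. Everything here is hypothetical: the Persona fields (user_address, system_prompt, voice_id), the PersonaRegistry class, and the example values are assumptions made for illustration, not Yumi's actual data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Persona:
    """Hypothetical persona record: how she addresses you, talks, and sounds."""
    name: str
    user_address: str   # how the user is addressed
    system_prompt: str  # persona prompt passed to the LLM
    voice_id: str       # placeholder identifier for a TTS voice


class PersonaRegistry:
    """Holds all known personas and tracks which one is active."""

    def __init__(self, personas: list[Persona], default: str) -> None:
        self._personas = {p.name: p for p in personas}
        if default not in self._personas:
            raise KeyError(f"unknown persona: {default}")
        self._active = default

    @property
    def active(self) -> Persona:
        return self._personas[self._active]

    def swap(self, name: str) -> Persona:
        """Switch the active persona without restarting anything."""
        if name not in self._personas:
            raise KeyError(f"unknown persona: {name}")
        self._active = name
        return self.active


registry = PersonaRegistry(
    [
        Persona("caring", "dear", "You are warm and supportive.", "voice-a"),
        Persona("tsundere", "dummy", "You are prickly but secretly kind.", "voice-b"),
    ],
    default="caring",
)
```

Because the next LLM call and the next TTS request would both read from registry.active, a single swap("tsundere") changes wording and voice together.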

High-Level Architecture Overview

Yumi bridges a concurrent Python 3.12+ backend with an HTML5 / WebSockets / PixiJS frontend:

  [ User Audio ] ──► [ Silero VAD ] ──► [ Whisper STT ] ──► [ LangGraph Engine ]
                                                                  │
  [ Live2D Render ] ◄── [ Audio RMS ] ◄── [ ElevenLabs ] ◄────────┘

The Ears: A local high-speed VAD pipeline receives audio slices from the web client and detects speech boundaries in real time.
The Brain: The transcribed query passes into a LangGraph conversational workflow, which queries the selected model (Groq/Llama-3, OpenAI/GPT-4, or Anthropic/Claude-3.5) with a rich persona prompt.
The Voice: The generated response text is synthesized through streaming TTS models (ElevenLabs or CAMB.ai).
The Body: The frontend plays the streaming audio, computes its RMS amplitude in real time, maps that amplitude to lip movements on a Live2D model (Huohuo), and applies the body gestures and expressions requested by the LLM.
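
The amplitude-to-lip-movement step in "The Body" can be sketched as follows. This is a minimal illustration written in Python for consistency with the backend; in Yumi the computation runs in the browser frontend, and the function names, the noise_floor and full_open thresholds, and the smoothing factor are all assumptions rather than values taken from the project.

```python
import math
from typing import Sequence


def rms(samples: Sequence[float]) -> float:
    """Root-mean-square amplitude of one audio frame (samples in [-1.0, 1.0])."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def mouth_open(level: float, noise_floor: float = 0.02, full_open: float = 0.3) -> float:
    """Map an RMS level to a mouth-open value in [0.0, 1.0].

    Levels at or below noise_floor keep the mouth closed; levels at or
    above full_open open it fully. Both thresholds are illustrative guesses.
    """
    if full_open <= noise_floor:
        raise ValueError("full_open must be greater than noise_floor")
    scaled = (level - noise_floor) / (full_open - noise_floor)
    return min(1.0, max(0.0, scaled))


def smooth(previous: float, target: float, factor: float = 0.5) -> float:
    """Move part of the way toward the target so the mouth does not jitter."""
    return previous + (target - previous) * factor


def lip_sync(frames: Sequence[Sequence[float]]) -> list[float]:
    """Produce one smoothed mouth-open value per audio frame."""
    values: list[float] = []
    current = 0.0
    for frame in frames:
        current = smooth(current, mouth_open(rms(frame)))
        values.append(current)
    return values
```

Each value returned by lip_sync would be written to the avatar's mouth-open parameter once per rendered frame. The smoothing step trades a little responsiveness for steadier motion.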

To get started, follow the Quickstart Guide to wake her up!