Agentic & multimodal systems

Multi-persona AI manager and team (Do Mode)

Multi-persona conversation engine that puts a fictional company's team around the user, with the manager and the right teammate stepping in for the moment.

What it is

A multi-persona conversation system that puts the user inside a fictional company's team. Each persona is a distinct role in that scenario (a manager, a senior peer, a cross-functional partner) with their own style, their own memory of the conversation, and their own view of the brief. The right persona steps in based on what the moment calls for: the manager when direction is needed, a teammate when the user is checking their work, the right specialist when a question crosses their lane. Along the way the team also sends voice notes, resources, and any starter artefact the stage's deliverable expects, and the user can attach images, PDFs, or notes of their own to any message.
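The persona selection described above can be pictured as a small lookup from what the moment calls for to a role. This is only an illustrative sketch, not the engine's actual code: the `Need` enum, the `Persona` fields, the role strings, and the pairing of "teammate" with the senior peer and "specialist" with the cross-functional partner are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from enum import Enum


class Need(Enum):
    """Hypothetical labels for what the moment calls for."""
    DIRECTION = "direction"
    WORK_CHECK = "work_check"
    SPECIALIST_QUESTION = "specialist_question"


@dataclass
class Persona:
    name: str
    role: str
    style: str
    # Each persona keeps its own record of the conversation.
    memory: list = field(default_factory=list)


# Assumed mapping from need to role; the real rules are not given in the text.
ROLE_FOR_NEED = {
    Need.DIRECTION: "manager",
    Need.WORK_CHECK: "senior_peer",
    Need.SPECIALIST_QUESTION: "cross_functional_partner",
}


def pick_persona(team: list, need: Need) -> Persona:
    """Return the first persona on the team whose role matches the need."""
    role = ROLE_FOR_NEED[need]
    return next(p for p in team if p.role == role)


team = [
    Persona("Asha", "manager", "direct"),
    Persona("Ben", "senior_peer", "challenging"),
    Persona("Chen", "cross_functional_partner", "precise"),
]
speaker = pick_persona(team, Need.WORK_CHECK)
speaker.memory.append("user asked for a review of their draft")
```

Keeping `memory` on the persona rather than on the conversation is what lets each role hold its own view of the same exchange.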

What it's for

A user working through a stage needs a real team to talk to: a manager who gives direction the way a real manager would, peers who push back, specialists who hand them what they need. The chat has to feel like a fictional company that just hired the user, not a fresh chat window every session. Every turn also has to feed signals back into how the rest of the platform reads the user, without slowing the reply down.

How it was built

Each user gets a FastAPI WebSocket, and each persona runs as its own cached agent so the model never reloads context between turns. The personas are cast at world-build time: each scenario hires a fictional company with specific roles (manager, peers, specialists), and those roles are who the user works with for the whole scenario. Before the user types the first message, a prewarm step pulls the conversation history, the stage brief, and the user's running profile in parallel, so the first reply does not wait on any of them.

Every turn passes through an in-flight router that decides four things at once: which persona is speaking; whether the user is doing the work, asking for help, or asking for a resource; what artefact to attach to the reply (a voice note from the manager, a reference for the stage, a starter file the deliverable can build from); and what signals to ship to the rest of the platform. The user can attach images, PDFs, or notes of their own to any message, and the engine reads them inline.

While the persona's reply streams back to the user, the same message fans out in the background to the performance scoring layer, the evaluation layer, the conversation memory layer, and the experience-points engine, so the user's profile keeps updating in real time without the reply ever waiting on it. Gemini or Claude is chosen per persona, depending on the role. The same engine also handles the understanding check at the end of each stage, where the team switches into a debrief mode before the user moves on.
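The two latency ideas here, loading everything in parallel before the first message and pushing signals out without blocking the reply, can be sketched with plain `asyncio`. This is a minimal illustration under stated assumptions, not the project's implementation: the loader functions, the sink names, and `handle_turn` are hypothetical, the sleeps stand in for real I/O, and in-process tasks stand in for the message queue listed under "Built with". WebSocket streaming and model calls are left out.

```python
import asyncio


# Hypothetical loaders; each sleep stands in for a database or cache read.
async def load_history(user_id: str) -> list:
    await asyncio.sleep(0.01)
    return [f"history for {user_id}"]


async def load_stage_brief(stage_id: str) -> str:
    await asyncio.sleep(0.01)
    return f"brief for {stage_id}"


async def load_profile(user_id: str) -> dict:
    await asyncio.sleep(0.01)
    return {"user_id": user_id}


async def prewarm(user_id: str, stage_id: str) -> dict:
    """Fetch the three inputs concurrently, before the first message arrives."""
    history, brief, profile = await asyncio.gather(
        load_history(user_id),
        load_stage_brief(stage_id),
        load_profile(user_id),
    )
    return {"history": history, "brief": brief, "profile": profile}


# Assumed names for the four downstream layers mentioned in the text.
SINKS = ("performance_scoring", "evaluation", "conversation_memory", "experience_points")


async def send_to_sink(sink: str, message: str, received: list) -> None:
    await asyncio.sleep(0.05)  # simulate slow downstream work
    received.append((sink, message))


async def handle_turn(message: str, context: dict, received: list):
    """Schedule the fanout in the background, then build the reply without awaiting it."""
    tasks = [asyncio.create_task(send_to_sink(s, message, received)) for s in SINKS]
    reply = f"reply to {message!r} using {context['brief']}"
    # The caller keeps the task references so they are not garbage-collected.
    return reply, tasks


async def main():
    context = await prewarm("user-1", "stage-1")
    received = []
    reply, tasks = await handle_turn("hello", context, received)
    delivered_at_reply = len(received)  # no sink has finished when the reply is ready
    await asyncio.gather(*tasks)        # only here do we wait for the fanout to drain
    return reply, delivered_at_reply, received


reply, delivered_at_reply, received = asyncio.run(main())
```

With a real queue the `create_task` calls would become publishes, but the shape is the same: the reply path never awaits the scoring, evaluation, memory, or experience-points work.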

My role

Sole author of the conversation engine, the WebSocket prewarm path, and the per-turn calibration fanout.

Built with

Python, FastAPI, WebSockets, Gemini, Per-agent caching, Message queue

