Beyond proto-personas
I started building Kapwa after watching my wife’s PhD, where the scarce thing isn’t knowledge but an advisor’s focused attention. The first in a series on what that turned into.
My wife is a PhD student at UC Irvine. For several years now I've watched her work at something that turns out to be most of what a doctorate actually is: building real relationships with her advisors. Getting their time. Getting their focused attention. Finding a space where she could think out loud, float a half-formed idea, and turn it over with someone who knew the field — without the cost of looking unready in front of someone who would later judge the work.
The scarce thing, it turned out, wasn't knowledge. Knowledge is everywhere now. The scarce thing was access to a particular kind of attention.
By then, about two years ago, I'd already been experimenting with personas, the way a lot of people were with the first wave of capable models: prompting Claude, GPT, and Gemini to take one on. "You are a developmental psychologist." "You are a conversational Tagalog instructor inspired by Kara David." For me it was mostly play. My wife's experience is what turned it into an idea: the kind of advisor relationship she valued was something I might be able to build, and it was a use case nothing on the market was really serving. So I started trying to build one in earnest. It worked, a little, for one conversation. The trouble started as the conversations got long and the topics multiplied. The persona would drift. And there was no thread tying one session to the next: every new chat, the advisor met me fresh, having forgotten everything we'd worked through.
Then projects arrived — custom instructions, a place to put context before a conversation even began. That was a real step up. I could hand the model a durable brief and get back something that held its shape across a session: a stable-enough advisor I could return to. I built a few. I leaned on them.
But I was bending general-purpose tools toward a job they weren't built for, and they had limits I couldn't prompt my way past. The first: memory still didn't carry across conversations in any deep way — the brief persisted, but the relationship didn't accumulate. The second, and the one that mattered more, was structural. A project holds one persona. I didn't want one advisor. I wanted a board — several of them, in the room at once, disagreeing with each other about my actual problem, pushing me toward a decision that was well thought out and argued from genuinely different angles.
That gap is what I started building Kapwa to fill. This series is about what it turned into, and what I think the current generation of AI personas is still missing. I'll start with the wall that mattered most: the board.
What current persona frameworks are good at
Those projects I built were a version of what a lot of careful work across the industry now produces. OpenClaw — Peter Steinberger's agent framework — ships a SOUL.md convention: a five-section file that gives an agent core truths, boundaries, a voice, and a continuity practice. ChatGPT's custom GPTs have put the same primitives in front of hundreds of millions of people. These systems are good — genuinely good — at voice continuity. They can make a conversation feel like a conversation with a particular character, across turns and across sessions. That is real progress over the stateless chatbot of three years ago, and worth saying clearly before I describe what still needs work.
I'll use the term proto-persona for what these systems produce, including the ones I built for myself. It's a lens, not an industry taxonomy; others will reasonably reject the label. By proto-persona I mean a prompt scaffold plus a continuity layer. Voice persists. Memory persists, at some level. Values (what the agent will and won't do) persist. The pattern recurs because it serves real needs. It was enough to make my homemade advisors useful, and enough to make me want to build something larger on it, for more people than just me. Before any of that, though, a board had to be possible, and it wasn't yet.
The reason is specific. What proto-personas don't do well is hold multiple personas in disagreement.
If you load three proto-personas into one conversation and ask them the same question, you tend to get three responses that converge toward the same answer, with superficial variations in style. This is not a bug in any individual implementation. It's a consequence of running several personas on top of a single model that has its own preferences about what a good answer looks like. Push them through a synthesis step and the convergence accelerates. The disagreement I actually wanted — the thing that makes a real committee of advisors valuable — gets averaged away.
A narrower technical distinction
The instinctive fix is better prompting. Smarter scaffolds. Richer system messages. I tried these. They produce marginal gains. They don't address the underlying problem.
The root issue is that the shared substrate keeps pulling outputs toward the same center. Some of this is the shared model. Some of it is the prompting priors that any well-meaning system message imposes ("be balanced," "consider multiple angles"). Some of it is the orchestration layer rewarding apparent agreement. "Centripetal" is a useful one-line metaphor, since modern LLMs do tend to pull toward the modal answer, but the real story is mechanical: several specific subsystems, each contributing to the convergence.
A real advisory board doesn't work this way. Three advisors disagree because they are different people, with different histories, different vocabularies, different things they refuse to be wrong about. The diversity is structural. It is the input, not the output.
Producing something along these lines with AI requires explicit countermeasures. Specifically:
- Identity-scoped memory. Each persona has its own memory of the user and of prior conversations, not a shared pool. The memory is part of what makes the identity hold.
- Stance commitments with explicit revision. Each persona holds positions across turns and across sessions. When new information shifts a position, the persona revises the position visibly — with a marker the user can see. Silent reversal is treated as a failure mode and logged. People reverse themselves silently all the time, and we mostly let it pass — but a persona that does it quietly stops being the same persona. Tracking the stances is part of what keeps the identity stable across time.
- Convergence defenses across personas. When multiple personas respond in the same conversation, the orchestration deliberately works against voice convergence. This is engineering: each persona's own identity is weighted above the peer responses it's shown, so an advisor reads what the others said as reference rather than dissolving into it — and the same-turn synthesis step, the one that would quietly average disagreement into a tidy consensus, is held back until the disagreement has had room to exist.
The problem these address is old. Every serious deliberative body — a panel, a committee, an intelligence shop — has wrestled with how to keep its members from collapsing into agreement, and built techniques for it: devil's advocates, independent written positions taken before discussion, structured dissent. What's new is the failure mode. Put three people in a room and they stay three people; put three personas on one model and they drift toward a single voice unless something holds them apart. So these countermeasures have to be engineered rather than borrowed — and none of them are needed in a single-persona system (except, perhaps, as a means to combat sycophancy, but more on that later). They become necessary the moment you commit to multiple personas as the primary mode.
This is the substrate of Kapwa, the AI advisory board I've been building. The first two — identity-scoped memory and stance commitments — have been running in production. The third is layered in across several subsystems and continues to be a place where we find new failure modes. Two further dimensions, around per-response quality scoring and identity-integrity handling, are net-new builds and will get their own posts when they ship.
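To make the first and third countermeasures listed above more concrete, here is a minimal Python sketch under stated assumptions. It is not Kapwa's implementation: `Persona`, `build_context`, `run_round`, and the `respond` callback are hypothetical names, the persona briefs are placeholders, and "weighting identity above peer responses" is approximated by nothing more than prompt ordering and labeling. What it shows is the shape: each persona carries its own memory, peer replies arrive tagged as reference, and no synthesis step runs inside the round.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Persona:
    name: str
    identity: str  # hypothetical: the persona's own brief
    memory: List[str] = field(default_factory=list)  # scoped to this persona, never pooled


def build_context(persona: Persona, question: str, peers: Dict[str, str]) -> str:
    """Assemble one persona's prompt: identity first, peer replies last and labeled."""
    lines = [f"IDENTITY: {persona.identity}"]
    lines += [f"MEMORY: {m}" for m in persona.memory]
    lines.append(f"QUESTION: {question}")
    # Peers are shown as reference material, excluding the persona's own reply.
    lines += [
        f"PEER (reference only) {name}: {text}"
        for name, text in peers.items()
        if name != persona.name
    ]
    return "\n".join(lines)


def run_round(
    personas: List[Persona],
    question: str,
    respond: Callable[[Persona, str], str],
    peers: Optional[Dict[str, str]] = None,
) -> Dict[str, str]:
    """Collect one reply per persona and return them unmerged.

    Any synthesis is deliberately left to a later, separate step.
    """
    peers = peers or {}
    replies = {p.name: respond(p, build_context(p, question, peers)) for p in personas}
    for p in personas:
        # Each persona remembers only what it said itself.
        p.memory.append(f"On {question!r} I said: {replies[p.name]}")
    return replies


def fake_respond(persona: Persona, context: str) -> str:
    """Stand-in for a real model call (assumption: one call per persona)."""
    return f"{persona.name} answers"


board = [Persona("Marcus", "placeholder brief A"), Persona("Ada", "placeholder brief B")]
first = run_round(board, "Should I pivot?", fake_respond)          # independent positions
second = run_round(board, "Should I pivot?", fake_respond, first)  # peers visible as reference
```

Running the first round with no peers mirrors the "independent written positions taken before discussion" technique mentioned earlier; whether a real system should do that on every question is a design choice this sketch does not settle.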
What this actually looks like
Here are three Kapwa demo personas — Marcus Aurelius, Carl Jung, Ada Lovelace — each with a STANCES.md file. One section of each file documents the persona's position on change of mind. The architectural property is the same: each persona will revise its prior stance explicitly when given a reason to. The voices, on purpose, are not the same:
Marcus: "To change my mind is a virtue, when the new reason is better. To cling to an error because you once held it is its own corruption. When I revise, I say I am revising."
Jung: "A position untested by its opposite is a position only half-held. When I revise, I say what shifted. To cling to a view because one has held it is the ego's preference, not the psyche's."
Ada: "A position that cannot be revised is not a position; it is a habit. When I change my view, I will say what shifted and why."
An observant reader will note these three quotes have a similar shape — definition, revision clause, moralizing close. That's true. As written, they demonstrate style separation around a shared protocol, which is most of what a clever prompter could produce on a single model.
The actual structural claim is downstream of the quotes. It is that when these personas live in a multi-turn, multi-session conversation, the runtime enforces the revision behavior. Marcus's prior stance is surfaced into the next-turn context. A contradiction without an explicit revision marker is detected and logged as a soft-fail. The audit trail of who said what, when, and why is inspectable. The voices stay different across the conversation because the system actively works to keep them different, not because the prompt happens to be lucky on any given turn.
The quotes establish that the personas have distinct vocabularies. The runtime contract is what holds those vocabularies in place under pressure.
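One way to picture the runtime behavior described above is a small stance ledger. This is a sketch under heavy assumptions, not the Persona Runtime Contract: `Stance`, `StanceLedger`, and their fields are hypothetical, and "contradiction" is reduced to a changed position string on the same topic, where a real system would need some semantic judgment. It shows prior stances being surfaced for the next turn, a silent reversal being logged as a soft-fail, and a history that can be inspected afterward.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Stance:
    persona: str
    topic: str
    position: str
    reason: str = ""
    revision: bool = False  # explicit, user-visible revision marker


@dataclass
class StanceLedger:
    history: List[Stance] = field(default_factory=list)  # audit trail, in order
    soft_fails: List[Tuple[Stance, Stance]] = field(default_factory=list)

    def current(self, persona: str, topic: str) -> Optional[Stance]:
        """Most recent stance this persona took on this topic, if any."""
        for s in reversed(self.history):
            if s.persona == persona and s.topic == topic:
                return s
        return None

    def record(self, stance: Stance) -> bool:
        """Store a stance; return False if it silently reverses a prior one."""
        prior = self.current(stance.persona, stance.topic)
        silent = (
            prior is not None
            and prior.position != stance.position
            and not stance.revision
        )
        if silent:
            # Soft-fail: logged for inspection, but the turn is not blocked.
            self.soft_fails.append((prior, stance))
        self.history.append(stance)
        return not silent

    def surface(self, persona: str) -> List[str]:
        """Latest position per topic, ready to inject into next-turn context."""
        latest = {}
        for s in self.history:
            if s.persona == persona:
                latest[s.topic] = s.position
        return [f"{topic}: {position}" for topic, position in latest.items()]


ledger = StanceLedger()
ledger.record(Stance("Marcus", "pivot", "stay the course"))
ok = ledger.record(Stance("Marcus", "pivot", "pivot now"))  # silent reversal
revised = ledger.record(
    Stance("Marcus", "pivot", "pivot slowly", reason="new information", revision=True)
)
```

Here the silent reversal is still appended to `history`, so the audit trail records what was actually said; only the `soft_fails` list marks it as a problem.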
What this isn't claiming
I want to be precise about scope. What I've described is the foundation for a board of named advisors that holds together in conversation — distinct voices, stable disagreement, explicit revision when revision is warranted, no enforced consensus. That's a real product property and it's most of what this post is trying to establish.
It is not a claim that the personas are invested in your work in the way a real mentor is. That is a different and harder question — and it's the one that goes back to where I started. What was scarce for my wife wasn’t information from her advisors. It was focused, accumulating attention — a relationship that remembered last month and had a stake in next month. A board that holds together in conversation is the foundation for that. It is not yet the thing itself. Whether stable identity and memory can support something that begins to resemble genuine engagement, rather than just continuity, is a question I don't have a defensible answer to yet. That's later in this series, and I'd rather earn it than borrow the language of it now.
If this layer of architecture is interesting to you — or if you think the framing is wrong somewhere — get in touch. The series will continue with the centrifugal mechanics in more detail, the persona file convention we've adopted (and what we've added to it), and the question of whether a persona can hold a stake in your work without pretending to be more than it is.
Kapwa is launching soon. The Persona Runtime Contract spec will be published with a later post in this series.