---
title: "Building Clarissa: Learning How AI Agents Actually Work"
description: "A deep dive into building an AI-powered terminal assistant from scratch. Learn about the ReAct pattern, tool execution, context management, and what it takes to build a real AI agent."
date: 2025-12-07T00:00:00.000Z
tags: ["ai", "typescript", "bun", "mcp", "agents", "terminal", "cli"]
author: "Cameron Rye"
canonical_url: https://rye.dev/blog/building-clarissa-ai-terminal-assistant/
---
Building Clarissa started as a learning exercise to understand how AI agents actually work under the hood. After using tools like Claude, ChatGPT, and various coding assistants, I wanted to demystify the magic. What I discovered was both simpler and more nuanced than I expected.

This post shares what I learned building a terminal AI assistant from scratch, the architectural patterns that emerged, and the practical challenges of creating an agent that can reason about tasks and take action.

## Why Build a Terminal AI Agent?

Existing AI interfaces felt disconnected from my actual workflow. I spend most of my day in the terminal, and switching to a browser or GUI to ask an AI for help created friction. More importantly, I wanted to understand:

- How do AI agents decide when to use tools versus just respond?
- How do you manage context windows that can hold millions of tokens?
- What makes tool execution safe and reliable?
- How does the Model Context Protocol actually work?

The best way to learn was to build.

## The ReAct Pattern: Reasoning + Acting

The core of Clarissa is the ReAct (Reasoning + Acting) pattern. This isn't some complex neural architecture; it's a surprisingly simple loop:

```typescript
async run(userMessage: string): Promise<string> {
  this.messages.push({ role: "user", content: userMessage });

  for (let i = 0; i < maxIterations; i++) {
    // Get LLM response
    const response = await llmClient.chatStreamComplete(
      this.messages,
      toolRegistry.getDefinitions()
    );

    this.messages.push(response);

    // Check for tool calls
    if (response.tool_calls?.length) {
      for (const toolCall of response.tool_calls) {
        const result = await toolRegistry.execute(
          toolCall.function.name,
          toolCall.function.arguments
        );
        this.messages.push({
          role: "tool",
          tool_call_id: toolCall.id,
          content: result.content
        });
      }
      continue; // Loop back for next response
    }

    // No tool calls = final answer
    return response.content;
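  }

  // The iteration cap was reached without a final answer
  throw new Error(`Agent stopped after ${maxIterations} iterations without a final answer`);
}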
```

The LLM doesn't "decide" to use tools in some mysterious way. You send it available tool definitions, and it responds with either a message or a request to call specific tools. You execute those tools, feed the results back, and repeat until it responds without tool calls.
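
To make that exchange concrete, here is a sketch of the two shapes an assistant reply can take. The field names (`tool_calls`, `function.name`, `function.arguments`) match the loop above and assume an OpenAI-style chat format. The `read_file` tool, its arguments, and the `isFinalAnswer` helper are hypothetical and exist only for illustration.

```typescript
// Minimal message types, assuming the OpenAI-style chat format used by the loop above.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string }; // arguments arrive as a JSON string
}

interface AssistantMessage {
  role: "assistant";
  content: string | null;
  tool_calls?: ToolCall[];
}

// Hypothetical reply that requests a tool call instead of answering.
const toolRequest: AssistantMessage = {
  role: "assistant",
  content: null,
  tool_calls: [
    {
      id: "call_1",
      type: "function",
      function: { name: "read_file", arguments: '{"path":"README.md"}' },
    },
  ],
};

// Hypothetical reply with no tool calls: the loop treats this as the final answer.
const finalAnswer: AssistantMessage = {
  role: "assistant",
  content: "The README describes a terminal assistant.",
};

// The same check the loop performs: no tool calls means the turn is over.
function isFinalAnswer(message: AssistantMessage): boolean {
  return !message.tool_calls?.length;
}
```

The only branching the agent does is on whether `tool_calls` is present and non-empty; everything else is executing the request and appending the result.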

![A diagrammatic visualization of the ReAct (Reasoning + Acting) loop, showing the cyclical nature of the LLM deciding to use a tool, getting results, and looping back.](/images/blog/generated/building-clarissa-ai-terminal-assistant-a-diagrammatic-visualization-o-1765150787749.jpg)


This loop is the entire agent. Everything else is infrastructure around it.

## What I Learned About Tool Design

The most interesting challenge was designing tools that are both useful and safe. Early versions had tools that were too granular (read a single line) or too powerful (execute arbitrary code). The sweet spot required iteration.

### Tool Confirmation

Potentially dangerous operations need confirmation. But what's "dangerous"? I settled on this heuristic:

- **No confirmation**: Reading files, listing directories, viewing git status
- **Confirmation required**: Writing files, executing shell commands, making commits
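
One way to encode that heuristic is an allowlist of read-only tools, with everything else gated behind a prompt. This is a minimal sketch, not Clarissa's actual implementation: the tool names, `requiresConfirmation`, and `gateToolCall` are hypothetical, and `approve` is a synchronous stand-in for what would be an interactive terminal prompt.

```typescript
// Hypothetical tool names; read-only operations run without prompting.
const readOnlyTools = new Set(["read_file", "list_directory", "git_status"]);

// Anything not known to be read-only requires confirmation,
// so a new or unrecognized tool fails safe.
function requiresConfirmation(toolName: string): boolean {
  return !readOnlyTools.has(toolName);
}

type GateDecision = "run" | "rejected";

// `approve` stands in for a terminal prompt; a real CLI would await user input.
function gateToolCall(
  toolName: string,
  approve: (toolName: string) => boolean
): GateDecision {
  if (!requiresConfirmation(toolName)) return "run";
  return approve(toolName) ? "run" : "rejected";
}
```

A per-tool boolean flag works just as well as an allowlist. The design point is the default: a tool nobody classified should prompt rather than run silently.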

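```typescript
interface Tool {
  name: string;
  description: string;
  // Remaining fields reconstructed from how tools are used later in this post
  parameters: z.ZodType;  // Zod schema used to validate arguments
  execute: (input: Record<string, unknown>) => Promise<ToolResult>;
  requiresConfirmation?: boolean;  // true for writes, shell commands, commits
}
```
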
### The Tool Registry Pattern

Rather than hardcoding tools, I built a registry that tools register themselves into:

```typescript
class ToolRegistry {
  private tools: Map<string, Tool> = new Map();

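  register(tool: Tool): void {
    this.tools.set(tool.name, tool);
  }

  registerMany(tools: Tool[]): void {
    for (const tool of tools) this.register(tool);
  }

  getDefinitions(): ToolDefinition[] {
    return Array.from(this.tools.values()).map(toolToDefinition);
  }

  async execute(name: string, args: string): Promise<ToolResult> {
    const tool = this.tools.get(name);
    // The model can request a tool name that was never registered
    if (!tool) throw new Error(`Unknown tool: ${name}`);
    const parsedArgs = JSON.parse(args);  // arguments arrive as a JSON string
    const validatedArgs = tool.parameters.parse(parsedArgs);  // throws on schema mismatch
    return await tool.execute(validatedArgs);
  }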
}
```

This pattern made MCP integration trivial. When connecting to an MCP server, I just convert its tools to my format and register them:

```typescript
const tools = mcpTools.map((mcpTool) => ({
  name: `mcp_${serverName}_${mcpTool.name}`,
  description: mcpTool.description,
  parameters: jsonSchemaToZod(mcpTool.inputSchema),
  execute: async (input) => client.callTool({ name: mcpTool.name, arguments: input }),
  requiresConfirmation: true  // MCP tools are external
}));

toolRegistry.registerMany(tools);
```
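
Whether a tool is built in or comes from an MCP server, its arguments arrive as a JSON string produced by the model, and that string is not guaranteed to parse. A hedged sketch of one way to handle this: convert the parse failure into a value the agent can send back as the tool result, so the model can see the error and retry. The `ParsedArgs` type and `parseToolArguments` helper are hypothetical names, not part of Clarissa.

```typescript
// Result of attempting to parse model-produced arguments.
type ParsedArgs =
  | { ok: true; value: unknown }
  | { ok: false; error: string };

// Parse the raw JSON string without letting a SyntaxError escape the agent loop.
function parseToolArguments(raw: string): ParsedArgs {
  try {
    return { ok: true, value: JSON.parse(raw) };
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    return { ok: false, error: `Invalid JSON in tool arguments: ${reason}` };
  }
}
```

On failure, the `error` string can be pushed as the content of the `tool` message instead of a real result, which keeps the call and its result paired in the history.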

## Context Management: The Underrated Challenge

Context windows are measured in tokens, but managing them well requires more than counting. Here's what I learned:

### Token Estimation

A round trip to the API just to count tokens is too slow and costly to do on every turn, so you need a local estimate:

```typescript
estimateTokens(text: string): number {
  // Rough approximation: ~4 chars per token for English
  return Math.ceil(text.length / 4);
}

estimateMessageTokens(message: Message): number {
  let tokens = 0;
  if (message.content) tokens += this.estimateTokens(message.content);
  if (message.tool_calls) {
    for (const tc of message.tool_calls) {
      tokens += this.estimateTokens(tc.function.name);
      tokens += this.estimateTokens(tc.function.arguments);
    }
  }
  return tokens + 4;  // Role overhead
}
```
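
As a worked example of the arithmetic, here is a standalone version of the estimator with a simple budget check. `fitsInBudget`, `contextLimit`, and `reservedForReply` are hypothetical names introduced for this sketch. The four-characters-per-token ratio is only the rough English approximation from above, so it makes sense to leave some headroom.

```typescript
// Standalone copy of the estimator above, so the arithmetic can be checked.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical budget check: does the text fit once room for the reply is reserved?
function fitsInBudget(
  texts: string[],
  contextLimit: number,
  reservedForReply: number
): boolean {
  const used = texts.reduce((sum, t) => sum + estimateTokens(t), 0);
  return used <= contextLimit - reservedForReply;
}

const prompt = "List the files in the current directory"; // 39 characters
const estimate = estimateTokens(prompt); // Math.ceil(39 / 4) = 10
```

Reserving tokens for the reply matters because the context limit covers both the input and the generated output.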


![A conceptual illustration of token management and smart truncation, visualizing how older messages fade away while keeping atomic groups of data intact.](/images/blog/generated/building-clarissa-ai-terminal-assistant-a-conceptual-illustration-of-t-1765150803838.jpg)

### Smart Truncation

When approaching the limit, you can't just drop the oldest messages. Tool calls and their results must stay together, or the LLM gets confused:

```typescript
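truncateToFit(messages: Message[], availableTokens: number): Message[] {
  // Assumes messages[0] is the system prompt, which is always kept;
  // availableTokens is passed in here to keep the example self-explanatory
  const [system, ...rest] = messages;

  // Group messages into atomic units
  // User message -> Assistant response -> Tool results
  const messageGroups: Message[][] = [];
  for (const msg of rest) {
    if (msg.role === "user" || messageGroups.length === 0) messageGroups.push([]);
    messageGroups[messageGroups.length - 1].push(msg);
  }

  // Add groups from newest to oldest until we hit the limit
  const toAdd: Message[] = [];
  let totalTokens = this.estimateMessageTokens(system);
  for (const group of messageGroups.reverse()) {
    const groupTokens = group.reduce((sum, msg) =>
      sum + this.estimateMessageTokens(msg), 0);
    // Stop at the first group that doesn't fit, so history stays contiguous
    if (totalTokens + groupTokens > availableTokens) break;
    toAdd.unshift(...group);
    totalTokens += groupTokens;
  }
  return [system, ...toAdd];
}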
```

Splitting those groups caused one of those bugs that took hours to track down: the LLM would suddenly start hallucinating tool results because it could see a tool call but not the corresponding result.
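
A cheap guard against that class of bug is to check the history for unpaired tool messages before sending it. The sketch below is an assumption about how such a check could look, not code from Clarissa. The `Message` shape is trimmed to the fields the check needs, and `findOrphans` is a hypothetical helper.

```typescript
// Trimmed message shape: only the fields needed to pair calls with results.
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
  tool_calls?: { id: string }[];
  tool_call_id?: string;
}

// Returns ids of tool calls with no result, and of results whose call is missing.
function findOrphans(messages: Message[]): {
  missingResults: string[];
  missingCalls: string[];
} {
  const callIds = new Set<string>();
  const resultIds = new Set<string>();
  for (const msg of messages) {
    for (const call of msg.tool_calls ?? []) callIds.add(call.id);
    if (msg.role === "tool" && msg.tool_call_id) resultIds.add(msg.tool_call_id);
  }
  return {
    missingResults: Array.from(callIds).filter((id) => !resultIds.has(id)),
    missingCalls: Array.from(resultIds).filter((id) => !callIds.has(id)),
  };
}

// A history where naive truncation dropped the result for call_2.
const truncated: Message[] = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "assistant", content: null, tool_calls: [{ id: "call_1" }] },
  { role: "tool", content: "file contents", tool_call_id: "call_1" },
  { role: "assistant", content: null, tool_calls: [{ id: "call_2" }] },
];

const orphans = findOrphans(truncated);
// → { missingResults: ["call_2"], missingCalls: [] }
```

Running a check like this in development, right after truncation, turns a confusing model behavior into an immediate, explainable failure.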

## Building with Ink: React for the Terminal

Choosing Ink (React for CLIs) was initially just curiosity, but it proved invaluable. Terminal UIs have the same state management challenges as web UIs:

```tsx
import { useState } from 'react';
import { Box } from 'ink';

function App() {
  const [messages, setMessages] = useState<DisplayMessage[]>([]);
  const [isThinking, setIsThinking] = useState(false);
  const [streamContent, setStreamContent] = useState('');

  const handleSubmit = async (input: string) => {
    setIsThinking(true);
    try {
      await agent.run(input, {
        onStreamChunk: (chunk) => setStreamContent(prev => prev + chunk),
        onToolCall: (name) => setMessages(prev => [...prev, { type: 'tool', name }])
      });
    } finally {
      // Clear the indicator even if the run throws
      setIsThinking(false);
    }
  };

  return (
    <Box flexDirection="column">
      {messages.map(msg => <Message key={msg.id} {...msg} />)}
      {isThinking && <ThinkingIndicator />}
      {streamContent && <StreamingResponse content={streamContent} />}
      <Input onSubmit={handleSubmit} />
    </Box>
  );
}
```

The streaming response visualization was particularly satisfying. Tokens appear as they arrive, giving users immediate feedback that something is happening.

## The Memory System: Persistent Context

Sessions persist conversation history, but users also wanted to tell the agent facts it should always remember:

```typescript
class MemoryManager {
  async add(content: string): Promise<Memory> {
    const memory = {
      id: this.generateId(),
      content: content.trim(),
      createdAt: new Date().toISOString(),
    };
    this.memories.push(memory);
    await this.save();
    return memory;
  }

  async getForPrompt(): Promise<string | null> {
    if (this.memories.length === 0) return null;
    const lines = this.memories.map((m) => `- ${m.content}`);
    return `## Remembered Context\n${lines.join("\n")}`;
  }
}
```

Memories get injected into the system prompt. Simple, but it transforms the experience. Tell Clarissa once that you prefer TypeScript over JavaScript, and it remembers across every session.
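
The injection step itself can be very small. Here is a minimal sketch, assuming the memory block is simply appended to a base prompt. `buildSystemPrompt` and `basePrompt` are hypothetical names, and the heading mirrors the `## Remembered Context` format produced by `getForPrompt` above.

```typescript
// Append remembered facts to the base system prompt, if there are any.
function buildSystemPrompt(basePrompt: string, memories: string[]): string {
  if (memories.length === 0) return basePrompt;
  const lines = memories.map((m) => `- ${m}`);
  return `${basePrompt}\n\n## Remembered Context\n${lines.join("\n")}`;
}

const systemPrompt = buildSystemPrompt("You are Clarissa.", [
  "Prefers TypeScript over JavaScript",
]);
```

Because memories live in the system prompt, they count against the context budget on every request, so short, factual entries work best.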

## MCP Integration: Extending Without Modifying

The Model Context Protocol was the final piece. Rather than building every possible tool, Clarissa can connect to external MCP servers:

```bash
/mcp npx -y @modelcontextprotocol/server-filesystem /path/to/directory
```

The integration was straightforward once the tool registry pattern was in place. The challenge was converting JSON Schema (what MCP uses) to Zod (what I use internally):

```typescript
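import { z } from "zod";

function jsonSchemaToZod(schema: unknown): z.ZodType {
  // A missing or non-object schema (e.g. an array without `items`) accepts anything
  if (typeof schema !== "object" || schema === null) return z.unknown();
  const s = schema as Record<string, unknown>;

  if (s.type === "object" && s.properties) {
    const required = Array.isArray(s.required) ? (s.required as string[]) : [];
    const shape: Record<string, z.ZodType> = {};
    for (const [key, propSchema] of Object.entries(s.properties as Record<string, unknown>)) {
      const prop = jsonSchemaToZod(propSchema);
      // JSON Schema properties are optional unless listed in `required`
      shape[key] = required.includes(key) ? prop : prop.optional();
    }
    return z.object(shape);
  }

  if (s.type === "string") return z.string();
  if (s.type === "number") return z.number();
  if (s.type === "integer") return z.number().int();
  if (s.type === "boolean") return z.boolean();
  if (s.type === "array") return z.array(jsonSchemaToZod(s.items));

  // Unsupported keywords (enum, oneOf, $ref, ...) fall through unvalidated
  return z.unknown();
}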
```

## Key Learnings

Building Clarissa taught me several things that weren't obvious from using AI tools:

**Agents are loops, not magic.** The ReAct pattern is elegant in its simplicity. The complexity is in the infrastructure around it: streaming, context management, tool safety.

**Tool design is UX design.** The tools you provide shape what the agent can do. Too few and it's limited. Too many and it gets confused. The sweet spot requires iteration.

**Context windows are precious.** Even with million-token windows, you can exhaust them quickly. Smart truncation and memory systems extend useful context far beyond raw limits.

**Streaming matters.** Users hate staring at a blank screen. Showing tokens as they arrive transforms the experience from "is this broken?" to "I can see it thinking."

**Confirmation builds trust.** Letting users approve dangerous operations doesn't just prevent mistakes; it changes how they interact with the agent. They're more willing to ask for ambitious tasks.

## Try It Yourself

Clarissa is open source and available on npm:

```bash
bun install -g clarissa
# or
npm install -g clarissa
```

Set your OpenRouter API key and you're ready to go:

```bash
export OPENROUTER_API_KEY=your_key_here
clarissa
```

The source code is at [github.com/cameronrye/clarissa](https://github.com/cameronrye/clarissa), and the documentation at [clarissa.run](https://clarissa.run) covers everything from basic usage to MCP integration.

---

*Building Clarissa was one of the most educational projects I've undertaken. If you're curious about how AI agents work, I encourage you to build one yourself. The gap between "using AI tools" and "understanding AI tools" is smaller than you might think.*