Agent Chat Feature — Implementation Notes¶
Note
This page is a maintainer-oriented implementation note. For user-facing chat behavior and API usage, see Agent Chat.
Overview¶
This document describes the design and implementation plan for M7: Agent Chat — a passive, conversational interface that lets users talk to their agents about their configuration, abilities, and past work without triggering any tool execution or Celery tasks.
Users can ask questions like:
- "What have you been working on?"
- "What tools do you have access to?"
- "What do you remember about the last deployment?"
- "What skills are you configured with?"
Architecture¶
flowchart TD
Client["Client\n(HTTP)"] -->|POST /api/agents/{id}/chat| ChatRoute["Chat API\napp/api/routes/chat.py"]
ChatRoute -->|resolve/create| SessionDB[("MongoDB\nchat_sessions")]
ChatRoute -->|resolve user| Auth["GitHub Auth\napp/api/deps.py"]
ChatRoute -->|stream events| SSE["SSE StreamingResponse\ntext/event-stream"]
ChatRoute -->|calls| Handler["handle_chat()\napp/services/chat_handler.py"]
Handler -->|load history| MsgDB[("MongoDB\nchat_messages")]
Handler -->|build context| ContextBuilder["build_chat_context()\napp/services/chat_context.py"]
Handler -->|resolve key| ProviderDB[("MongoDB\nproviders / tokens")]
Handler -->|POST streaming| LLM["LLM API\n(httpx, OpenAI-compatible)"]
Handler -->|persist messages| MsgDB
Handler -->|update session| SessionDB
Handler -->|yield events| ChatRoute
ContextBuilder -->|agent profile| AgentDB[("MongoDB\nagents")]
ContextBuilder -->|skills| SkillDB[("MongoDB\nskills")]
ContextBuilder -->|MCP tools| MCPsDB[("MongoDB\nmcp_servers")]
ContextBuilder -->|task history| TaskDB[("MongoDB\ntask_executions / workflows")]
ContextBuilder -->|STM/LTM memory| MemoryMgr["MemoryManager\n(Redis + MongoDB)"]
SSE -->|event: session| Client
SSE -->|event: delta| Client
SSE -->|event: done| Client
SSE -->|event: error| Client
style Client fill:#dbeafe,stroke:#3b82f6
style LLM fill:#fef9c3,stroke:#ca8a04
style SessionDB fill:#f3e8ff,stroke:#9333ea
style MsgDB fill:#f3e8ff,stroke:#9333ea
style ProviderDB fill:#f3e8ff,stroke:#9333ea
style AgentDB fill:#f3e8ff,stroke:#9333ea
style SkillDB fill:#f3e8ff,stroke:#9333ea
style MCPsDB fill:#f3e8ff,stroke:#9333ea
style TaskDB fill:#f3e8ff,stroke:#9333ea
Data Flow¶
sequenceDiagram
participant C as Client
participant API as Chat API
participant H as handle_chat()
participant CTX as chat_context
participant DB as MongoDB
participant LLM as LLM API
C->>API: POST /api/agents/{id}/chat\n{message, session_id?}
API->>DB: get or create ChatSession
API->>H: handle_chat(agent, session, message, user, token)
H-->>C: SSE: {type: "session", session_id}
H->>CTX: build_chat_context(agent, user)
CTX->>DB: load skills, MCP servers, task history, memories
CTX-->>H: <agent_context>…</agent_context>
H->>DB: load ChatMessage history (last 50)
H->>DB: insert user ChatMessage
H->>LLM: POST /chat/completions (streaming)
loop Token stream
LLM-->>H: SSE delta chunk
H-->>C: SSE: {type: "delta", content: "…"}
end
H->>DB: insert assistant ChatMessage
H->>DB: update ChatSession (count, title, updated_at)
H-->>C: SSE: {type: "done", usage: {…}, message_id: "…"}
Component Breakdown¶
app/models/chat_session.py — ChatSession¶
Beanie document persisting a named conversation thread.
| Field | Type | Description |
|---|---|---|
agent_id |
PydanticObjectId |
The agent this session belongs to |
github_user |
str |
Owner — scopes access |
title |
str \| None |
Auto-generated from first message |
message_count |
int |
Running total of messages |
created_at / updated_at |
datetime |
Timestamps |
Index: (agent_id, github_user, updated_at DESC) for paginated listing.
app/models/chat_message.py — ChatMessage¶
Beanie document for individual messages within a session.
| Field | Type | Description |
|---|---|---|
session_id |
PydanticObjectId |
Parent session |
role |
"user" \| "assistant" |
Message author |
content |
str |
Message text |
usage |
dict \| None |
Token counts (assistant messages only) |
created_at |
datetime |
Timestamp |
Index: (session_id, created_at ASC) for chronological history retrieval.
app/services/chat_context.py — Agent Self-Awareness Context¶
Builds an XML <agent_context> block injected into the chat system prompt.
<agent_context>
<agent_profile>
<name>Deploy Assistant</name>
<model>gpt-4o</model>
<description>Handles deployment workflows</description>
</agent_profile>
<skills>
<skill><name>Kubernetes</name><description>…</description></skill>
</skills>
<available_tools>
<tool>kubectl_apply</tool>
<tool>git_push</tool>
</available_tools>
<task_history>
<task>
<prompt>Deploy v2.3.0 to staging</prompt>
<status>completed</status>
<created_at>2026-04-17T10:00:00Z</created_at>
</task>
</task_history>
<!-- STM/LTM memories injected here -->
</agent_context>
app/services/chat_handler.py — In-process LLM Handler¶
Async generator that yields ChatEvent dicts.
Key design decisions:
- No Celery dispatch — runs in the FastAPI process for low latency
- No tool execution —
stream=True, notoolsarray in the request body - No guardrails — v1 simplicity; may be added in a future milestone
- Conversation windowing — loads last 50 messages; oldest are dropped
- BYOK support — resolves the agent's attached provider (or falls back to the GitHub Models inference endpoint with the user's GitHub token)
app/api/routes/chat.py — HTTP Endpoints¶
| Method | Path | Description |
|---|---|---|
POST |
/api/agents/{id}/chat |
Send message → SSE stream |
GET |
/api/agents/{id}/chat/sessions |
List sessions (newest first) |
GET |
/api/agents/{id}/chat/sessions/{sid} |
Session + message history |
DELETE |
/api/agents/{id}/chat/sessions/{sid} |
Delete session + messages |
Auth: Same X-GitHub-Token / Authorization: Bearer mechanism as all other endpoints. Sessions are scoped to github_user.
SSE Event Protocol¶
id: 1
data: {"type": "session", "session_id": "abc123"}
id: 2
data: {"type": "delta", "content": "Based on"}
id: 3
data: {"type": "delta", "content": " my recent activity…"}
…
id: N
data: {"type": "done", "usage": {"prompt_tokens": 500, "completion_tokens": 120}, "message_id": "xyz789"}
On error:
Observability¶
Two new Prometheus metrics in app/observability.py:
| Metric | Type | Labels |
|---|---|---|
copilot_hub_chat_messages_total |
Counter | role (user/assistant) |
copilot_hub_chat_response_duration_seconds |
Histogram | model |
The existing copilot_hub_sse_connections_active gauge is reused for chat SSE connections.
Acceptance Criteria¶
- [x]
ChatSessionandChatMessagemodels with DB indexes - [x] Pydantic schemas for request/response
- [x] Models registered with Beanie in
app/db.py - [x]
POST /api/agents/{id}/chat— SSE streaming response - [x] Session create-or-reuse logic
- [x]
GET /api/agents/{id}/chat/sessions— paginated list - [x]
GET /api/agents/{id}/chat/sessions/{sid}— full message history - [x]
DELETE /api/agents/{id}/chat/sessions/{sid}— cascade delete - [x] Auth enforcement (user-scoped sessions)
- [x]
handle_chat()in-process async generator - [x] BYOK provider support + GitHub Models default path
- [x] Conversation windowing (50 messages)
- [x]
build_chat_context()— agent profile, skills, tools, task history, memories - [x] Chat observability metrics
- [x] Unit tests (
tests/test_chat.py) - [x] API reference (
docs/api/chat.md) - [x] User guide (
docs/guide/agent-chat.md)