PostgreSQL Backend Guide¶

TBD Agents supports a fully-PostgreSQL deployment stack via the pgvector profile. This guide covers everything you need to know to run, configure, migrate to, and operate the PostgreSQL backend.

Overview¶

By default TBD Agents uses MongoDB as its document store (the qdrant profile). The pgvector profile replaces MongoDB with PostgreSQL 16, using JSONB columns to store the same document structures — and it reuses that same PostgreSQL instance for vector storage via the pgvector extension.

Why choose the PostgreSQL backend?¶

Benefit	Detail
Single database service	One PostgreSQL container handles both document storage and vector search — no MongoDB to run alongside Qdrant
Unified backup / restore	A single `pg_dump` snapshot captures all agent data and all vector embeddings
Familiar ops tooling	Standard PostgreSQL tooling (psql, pgAdmin, pg_dump, Alembic) works out of the box
Kubernetes-friendly	One StatefulSet / PVC instead of two separate ones
JSONB flexibility	Same schemaless document semantics as MongoDB with full SQL query power on top
pgvector + data in one place	Vector search and relational queries share the same ACID transaction boundary

Profile Comparison¶

Feature	`qdrant` profile	`pgvector` profile
Document store	MongoDB 7	PostgreSQL 16 (JSONB)
Vector store	Qdrant	pgvector extension
Services needed	2 (`mongodb` + `qdrant`)	1 (`pgvector` with pgvector)
Data backup	`mongodump` + Qdrant snapshot	`pg_dump` only
Migrations	Beanie (schema-less)	Alembic
Recommended for	Existing MongoDB users	New deployments, Kubernetes, simplicity

Quick Start — all-PostgreSQL stack¶

1. Configure `.env`¶

COMPOSE_PROFILES=pgvector
VECTOR_STORE_BACKEND=pgvector
DB_BACKEND=postgres
POSTGRES_URI=postgresql+asyncpg://postgres:postgres@pgvector:5432/tbd_agents

2. Start services¶

docker compose up -d

This starts a single pgvector container with the pgvector extension pre-installed. No MongoDB, no Qdrant.

3. Apply database migrations¶

docker compose exec app alembic upgrade head

4. Verify¶

docker compose ps          # pgvector should be Up
docker compose exec app alembic current   # should show head revision

Configuration Reference¶

Variable	Default	Description
`DB_BACKEND`	`mongo`	Document store backend: `mongo` or `postgres`
`POSTGRES_URI`	`postgresql+asyncpg://postgres:postgres@pgvector:5432/tbd_agents`	Async-compatible PostgreSQL connection string (uses `asyncpg` driver)
`POSTGRES_DB_NAME`	`tbd_agents`	PostgreSQL database name
`COMPOSE_PROFILES`	`qdrant` in `.env.example`	Docker Compose profile: `qdrant` or `pgvector`; empty starts no vector-store service
`VECTOR_STORE_BACKEND`	`qdrant`	Vector store backend: `qdrant` or `pgvector` — must match `COMPOSE_PROFILES`

Important: DB_BACKEND=postgres and COMPOSE_PROFILES=pgvector must always be set together. Using one without the other will result in a mismatch between the running containers and the application's data layer.

Schema Design¶

The PostgreSQL backend maps each logical MongoDB collection to a dedicated table with proper typed columns — each field is represented as its native SQL type (TEXT, INTEGER, BOOLEAN, TIMESTAMPTZ, TEXT[], JSONB, etc.) rather than a generic data JSONB blob. This gives you full SQL query power, type safety, and the ability to add column-level indexes where needed.

-- Example: the agents table (fully typed)
CREATE TABLE agents (
    id                  TEXT        PRIMARY KEY DEFAULT gen_random_uuid()::text,
    name                TEXT        NOT NULL,
    description         TEXT        NOT NULL DEFAULT '',
    system_prompt       TEXT        NOT NULL DEFAULT 'You are a helpful assistant.',
    model               TEXT,
    mcp_server_ids      TEXT[]      NOT NULL DEFAULT '{}',
    mcp_server_tags     TEXT[]      NOT NULL DEFAULT '{}',
    tool_definitions    JSONB       NOT NULL DEFAULT '[]',
    knowledge_source_ids TEXT[]     NOT NULL DEFAULT '{}',
    knowledge_tags      TEXT[]      NOT NULL DEFAULT '{}',
    builtin_tools       TEXT[]      NOT NULL DEFAULT '{}',
    custom_tool_ids     TEXT[]      NOT NULL DEFAULT '{}',
    provider_id         TEXT,
    created_at          TIMESTAMPTZ NOT NULL DEFAULT now(),
    updated_at          TIMESTAMPTZ NOT NULL DEFAULT now()
);

JSONB is retained only for genuinely schemaless or nested payloads (e.g. tool_definitions, connection_config, messages, usage). Array fields use the native PostgreSQL TEXT[] type. Where appropriate, tables carry targeted B-tree indexes (e.g. idx_chat_sessions_agent_id) instead of a catch-all GIN index.

All 15 tables¶

Table	Key typed columns
`agents`	`name`, `model`, `mcp_server_ids TEXT[]`, `tool_definitions JSONB`
`chat_sessions`	`agent_id`, `github_user`, `title`, `message_count INTEGER`
`chat_messages`	`session_id`, `role`, `content`, `usage JSONB`
`memories`	`agent_id`, `scope`, `key`, `value`, `embedding DOUBLE PRECISION[]`, `metadata JSONB`
`skills`	`name`, `instructions`, `tags TEXT[]`
`tokens`	`name UNIQUE`, `encrypted_value`, `created_by`
`providers`	`name UNIQUE`, `provider_type`, `api_key_token_name`, `auth_type`, `base_url`
`knowledge_items`	`source_id`, `content_type`, `text_content`, `file_name`, `file_size INTEGER`
`knowledge_sources`	`name`, `source_type`, `connection_config JSONB`, `status`
`custom_tools`	`name`, `source_code`, `parameters_schema JSONB`, `is_enabled BOOLEAN`
`guardrails`	`name`, `guardrail_type`, `enabled BOOLEAN`, `prompt_config JSONB`
`workflows`	`agent_id`, `status`, `max_turns INTEGER`, `messages JSONB`, `logs JSONB`
`task_executions`	`workflow_id`, `status`, `celery_task_id`, `tool_calls INTEGER`, `usage JSONB`
`scheduled_agents`	`workflow_id`, `interval_value INTEGER`, `interval_unit`, `enabled BOOLEAN`
`mcp_servers`	`name`, `transport_type`, `connection_config JSONB`, `allowed_tools TEXT[]`, `status`

Running Database Migrations (Alembic)¶

TBD Agents uses Alembic to manage the PostgreSQL schema. Migrations are a no-op when DB_BACKEND=mongo.

# Apply all pending migrations (create/update tables)
docker compose exec app alembic upgrade head

# Or run locally with uv (DB must be reachable)
DB_BACKEND=postgres uv run alembic upgrade head

# Check which revision is currently applied
alembic current

# Show the full migration history
alembic history --verbose

# Roll back one revision
alembic downgrade -1

# Generate a new migration after changing a model
# Note: autogenerate is not configured (target_metadata = None in alembic/env.py).
# Write the migration manually after creating the file.
alembic revision -m "describe_your_change"

Tip: Migrations are idempotent — running upgrade head on an already-migrated database is safe and produces no changes.

Data Migration from MongoDB to PostgreSQL¶

If you have an existing deployment running the qdrant profile and want to switch to pgvector, use the provided migration script.

Prerequisites¶

Both MongoDB and PostgreSQL must be running and reachable.
PostgreSQL tables must already exist (alembic upgrade head first).

Run the migration¶

# Connection info is read from environment variables
# (MONGO_URI, MONGO_DB_NAME, POSTGRES_URI) with sensible defaults.
MONGO_URI=mongodb://localhost:27017 \
MONGO_DB_NAME=copilot_agent_hub \
POSTGRES_URI=postgresql://postgres:postgres@localhost:5432/tbd_agents \
python scripts/migrate_mongo_to_postgres.py \
  [--dry-run]             # preview without writing any data
  [--collection NAME]     # migrate one specific collection (default: all 15)
  [--batch-size N]        # documents per INSERT batch (default: 500)
  [--verbose]             # print per-batch progress

MongoDB database name

The running application defaults to MONGO_DB_NAME=copilot_agent_hub, while the migration helper's historical default is tbd_agents. Set MONGO_DB_NAME=copilot_agent_hub explicitly when migrating data from the default Docker/app deployment, otherwise you may migrate an empty database.

Flag	Description
`--dry-run`	Print what would be migrated without touching PostgreSQL
`--collection NAME`	Migrate only the named collection (default: all 15)
`--batch-size N`	Number of documents per INSERT batch (default: 500)
`--verbose`	Print progress after each batch

Verify the migration¶

# Set env vars first (script reads them automatically):
#   MONGO_URI, MONGO_DB_NAME, POSTGRES_URI
python scripts/verify_migration.py \
  [--collection agents]    # verify a single named collection (default: all 15)

The verification script compares document counts per collection and reports any discrepancies.

Switch over¶

Once verification passes, update .env:

COMPOSE_PROFILES=pgvector
DB_BACKEND=postgres

Then restart the application:

docker compose down && docker compose up -d

Observability¶

Table sizes and row counts¶

SELECT
    relname                                         AS table,
    n_live_tup                                      AS rows,
    pg_size_pretty(pg_total_relation_size(relid))   AS total_size
FROM pg_stat_user_tables
ORDER BY n_live_tup DESC;

Inspect recent agents¶

-- Most recently created agents
SELECT id, name, model, created_at
FROM agents
ORDER BY created_at DESC
LIMIT 10;

Query by column value¶

-- All agents using a specific model
SELECT id, name, model
FROM agents
WHERE model = 'claude-opus-4-5';

Aggregate across typed fields¶

-- Memory count per agent
SELECT
    agent_id,
    COUNT(*) AS memories
FROM memories
GROUP BY agent_id
ORDER BY memories DESC;

Index usage health check¶

-- Confirm B-tree indexes are being used
SELECT indexrelname, idx_scan, idx_tup_read, idx_tup_fetch
FROM pg_stat_user_indexes
WHERE indexrelname LIKE 'idx_%'
ORDER BY idx_scan DESC;

Backup & Restore¶

Because vectors and documents live in the same PostgreSQL instance, a single pg_dump covers your entire dataset.

Backup¶

# Via Docker Compose
docker compose exec pgvector \
  pg_dump -U postgres tbd_agents > backup.sql

# Or with a timestamp
docker compose exec pgvector \
  pg_dump -U postgres -F custom tbd_agents \
  -f /tmp/tbd_$(date +%Y%m%d_%H%M%S).dump

Restore¶

# Plain SQL dump
docker compose exec -i pgvector \
  psql -U postgres tbd_agents < backup.sql

# Custom-format dump (parallel restore, faster for large databases)
docker compose exec pgvector \
  pg_restore -U postgres -d tbd_agents -j 4 /tmp/tbd_backup.dump

Troubleshooting¶

Symptom	Likely Cause	Fix
`connection refused` on startup	PostgreSQL not running or wrong profile active	Verify `COMPOSE_PROFILES=pgvector` in `.env`; run `docker compose ps` to confirm the `pgvector` container is `Up`
`relation "agents" does not exist`	Tables have not been created	Run `alembic upgrade head` inside the app container
`column "X" does not exist`	asyncpg type mismatch or column name wrong	Confirm the column name matches the typed schema; ensure `json_serializer` / `json_deserializer` are set in the SQLAlchemy engine config (see `app/db.py`)
App still reads/writes MongoDB	`DB_BACKEND` not set to `postgres`	Add `DB_BACKEND=postgres` to `.env` and restart the app
Slow queries on foreign-key-like columns	Missing B-tree index	Run `CREATE INDEX CONCURRENTLY idx_<table>_<col> ON <table> (<col>);` — no downtime required
pgvector extension not found	PostgreSQL image without pgvector	Use the `pgvector/pgvector:pg16` image (configured automatically by the `pgvector` compose profile)
Migration script fails with permission errors	PostgreSQL user lacks `CREATE` privilege	Grant superuser or `CREATE ON DATABASE` to the connection user