# AbstractGateway — llms-full

This file is a single-document snapshot of the local Markdown files (`*.md`) linked in `llms.txt`, intended for LLM/agent ingestion.
It is generated by `scripts/generate-llms-full.py` in the same order the links first appear in `llms.txt` (de-duplicated).
Relative links are normalized to repo-root paths.

---

## README.md

# AbstractGateway

AbstractGateway is a **deployable Run Gateway host** for AbstractRuntime runs:
- start durable runs
- accept commands through a durable inbox
- replay/stream a durable ledger (replay-first)
- enforce a security baseline (token + origin allowlist + limits)

This decouples the gateway service from any specific UI (AbstractFlow, AbstractCode, web/PWA thin clients).

Start here: [docs/getting-started.md](docs/getting-started.md)

## AbstractFramework ecosystem

AbstractGateway is part of the **AbstractFramework** ecosystem:

- **AbstractRuntime** (required): durable run model + workflow registry + stores (`pyproject.toml`, `src/abstractgateway/runner.py`)
- **AbstractRuntime + transitive capability packages** (required by the default server install): Runtime owns the LLM/tool/media integration boundary; Gateway uses its discovery/run facades for prompt-cache controls, generated and edited image/video plus voice/audio/music capabilities, and KG-backed bundle execution (`src/abstractgateway/hosts/bundle_host.py`)
- Higher-level UIs (optional): AbstractFlow (authoring/bundling), AbstractCode / AbstractObserver / thin clients (rendering + operations)

Related repos:
- AbstractFramework: https://github.com/lpalbou/AbstractFramework
- AbstractCore: https://github.com/lpalbou/abstractcore
- AbstractRuntime: https://github.com/lpalbou/abstractruntime

## Quickstart (HTTP server, bundle mode)

```bash
pip install abstractgateway

export ABSTRACTGATEWAY_DATA_DIR="$PWD/runtime/gateway"

# Optional: set only for a custom bundle registry. When unset, Gateway uses
# the bundle directory shipped with the package, which contains basic-agent.
# export ABSTRACTGATEWAY_FLOWS_DIR="/path/to/bundles"

# Required by default: the server refuses to start without a token.
export ABSTRACTGATEWAY_AUTH_TOKEN="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
# Browser-origin allowlist (glob patterns). Default allows localhost; customize when exposing remotely.
export ABSTRACTGATEWAY_ALLOWED_ORIGINS="http://localhost:*,http://127.0.0.1:*"

abstractgateway serve --host 127.0.0.1 --port 8080
```

OpenAPI docs (Swagger UI): `http://127.0.0.1:8080/docs`

Smoke checks:

```bash
curl -sS "http://127.0.0.1:8080/api/health"

curl -sS -H "Authorization: Bearer $ABSTRACTGATEWAY_AUTH_TOKEN" \
  "http://127.0.0.1:8080/api/gateway/bundles"
```
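
The `ABSTRACTGATEWAY_ALLOWED_ORIGINS` value in the quickstart is a comma-separated list of glob patterns. The sketch below shows one way such a list could be evaluated, using Python's standard `fnmatch` module. The helper names are hypothetical and Gateway's own matcher may behave differently (for example around case or trailing slashes), so treat this as an illustration of the pattern syntax rather than the server's implementation.

```python
from fnmatch import fnmatchcase


def parse_allowed_origins(raw: str) -> list[str]:
    """Split a comma-separated allowlist into trimmed, non-empty patterns."""
    return [part.strip() for part in raw.split(",") if part.strip()]


def origin_allowed(origin: str, patterns: list[str]) -> bool:
    """Return True when the origin matches at least one glob pattern."""
    return any(fnmatchcase(origin, pattern) for pattern in patterns)


# Same value as the quickstart export above.
patterns = parse_allowed_origins("http://localhost:*,http://127.0.0.1:*")
print(origin_allowed("http://localhost:5173", patterns))  # True
print(origin_allowed("https://example.com", patterns))    # False
```

When exposing Gateway remotely, prefer explicit origins over broad wildcards so that only the intended front ends pass the check.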

## Hosted user auth

Local mode keeps `ABSTRACTGATEWAY_AUTH_TOKEN` as a Gateway admin token. Hosted
mode can enable file-backed user principals and one runtime/data plane per user:

```bash
export ABSTRACTGATEWAY_USER_AUTH=1
export ABSTRACTGATEWAY_AUTH_TOKEN="<admin-bootstrap-token>"
```

Admins manage users through `/api/gateway/admin/users`; generated user bearer
tokens are returned once and stored only as hashes under
`<ABSTRACTGATEWAY_DATA_DIR>/auth/users.json`. User clients call
`GET /api/gateway/me` after connecting to confirm the resolved principal and
routing mode. Gateway rejects duplicate runtime ids within the same tenant, so
the default hosted model remains `1 user = 1 runtime`. Deleting a user reserves
the retained runtime id; admins can explicitly purge retained runtime data or
transfer it to an existing same-tenant user. See
[docs/security.md](docs/security.md) for the current hosted isolation boundary
and remaining operator-route hardening work.
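
The "returned once, stored only as hashes" pattern described above can be sketched as follows. This is a minimal illustration, not Gateway's code: the hash algorithm (SHA-256 here), the function names, and the record layout inside `users.json` are all assumptions, since this document does not specify them.

```python
import hashlib
import hmac
import secrets


def issue_token() -> tuple[str, str]:
    """Return (plaintext token to show once, hex digest to persist)."""
    token = secrets.token_urlsafe(32)
    # Assumption: SHA-256; the real registry may use a different scheme.
    digest = hashlib.sha256(token.encode("utf-8")).hexdigest()
    return token, digest


def verify_token(presented: str, stored_digest: str) -> bool:
    """Hash the presented token and compare digests in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_digest)


token, digest = issue_token()
print(verify_token(token, digest))  # True
```

Because only the digest is persisted, a lost token cannot be recovered from the registry; it has to be rotated.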

Gateway also serves a built-in control-plane console at `/console`. The console
uses the same browser-session contract as hosted apps: sign in with a Gateway
user id and its token. Because the console is served by Gateway, it uses the
current origin and does not ask for a Gateway URL. From there you can manage
the current account, admin-only user records (with optional email metadata),
retained runtime reservations, and capability defaults selected from
Gateway-discovered provider/model catalogs. The bearer token is not kept in
browser storage.

Browser apps should not keep user bearer tokens. They exchange the user token at
`POST /api/gateway/session/login` for an opaque Gateway browser session and use
that session for proxied Gateway calls. The login response body does not return
the session id or CSRF token. Session writes require the Gateway CSRF token, and
`POST /api/gateway/session/logout` revokes the session.

## Docker server

Release images are published to GHCR. The default image is the light,
portable server image:

```bash
docker pull ghcr.io/lpalbou/abstractgateway:0.2.24
```

NVIDIA hosts can try the experimental full GPU image when local
vLLM/HuggingFace/Diffusers engines are wanted. This image is published
best-effort until it has a real CUDA build and smoke gate:

```bash
docker pull ghcr.io/lpalbou/abstractgateway:0.2.24-gpu
```

Legacy `abstractgateway-server` and `abstractgateway-server-nvidia` GHCR aliases
are still published for existing deployments; new deployments should use
`abstractgateway`.

The image installs the base `abstractgateway` package: HTTP server,
`AbstractRuntime`, Runtime-owned provider/tool and
multimodal facades, OpenAI-compatible text/media providers,
provider/session prompt-cache helpers, AbstractMemory/LanceDB KG support,
AbstractAgent, and AbstractFlow compatibility. Local sentence-transformer
embeddings and hardware-local inference engines are explicit extras so the
light server image does not pull PyTorch/CUDA runtime packages. Remote text
embeddings remain part of the light profile through the `embedding.text`
capability route: point it at OpenAI, OpenRouter, Portkey, LM Studio, vLLM,
any OpenAI-compatible embeddings endpoint, or a remote AbstractCore server.

AbstractFlow note:
- You do **not** need the `abstractflow` Python package to run `.flow` bundles (bundle mode). You only need it to author bundles. VisualFlow directory mode was intentionally removed from the gateway to keep the dependency direction clean.

```bash
docker run --rm --name abstractgateway \
  -p 8080:8080 \
  -e ABSTRACTGATEWAY_DATA_DIR=/data \
  -e ABSTRACTGATEWAY_USER_AUTH=1 \
  -e OPENAI_COMPATIBLE_BASE_URL="http://host.docker.internal:1234/v1" \
  -v "$PWD/runtime:/data" \
  ghcr.io/lpalbou/abstractgateway:latest
```

On first start, the container creates `default/admin` and writes the admin user
token to `runtime/auth/bootstrap-admin-token`. Use that token in `/console`,
then rotate it or create named users from the console.

Configure framework model defaults through execution-host capability routes:

```bash
docker exec abstractgateway abstractgateway-config set-default output.text \
  --provider openai-compatible \
  --model your-model \
  --base-url http://host.docker.internal:1234/v1
```

On Apple Silicon, keep Metal/MLX inference native on macOS and run the
lightweight Gateway container as the transport/control plane. Point
`OPENAI_COMPATIBLE_BASE_URL` at a host-native OpenAI-compatible endpoint such
as Docker Model Runner (`http://model-runner.docker.internal/engines/v1`), LM
Studio (`http://host.docker.internal:1234/v1`), Ollama
(`http://host.docker.internal:11434/v1`), or `mlx_lm.server` on a host port.
For native non-Docker installs with local engines, use
`pip install "abstractgateway[apple]"` on Apple Silicon, and
`pip install "abstractgateway[gpu]"` on GPU workstations or NVIDIA Docker builds.
For a minimal Apple-local Gateway + Flow setup, see
[docs/apple-local-gateway-flow.md](docs/apple-local-gateway-flow.md).

Compose and deployment details: [docs/deployment.md](docs/deployment.md).

## Current capability scope

Current direct Gateway APIs:
- `GET /api/gateway/runs/{run_id}/input_data`
- `GET /api/gateway/runs/{run_id}/history_bundle`
- `POST /api/gateway/runs/{run_id}/voice/tts`
- `POST /api/gateway/runs/{run_id}/audio/transcribe`
- `POST /api/gateway/runs/{run_id}/images/generate`
- `POST /api/gateway/runs/{run_id}/images/edit`
- `POST /api/gateway/runs/{run_id}/videos/generate`
- `POST /api/gateway/runs/{run_id}/videos/from_image`
- `POST /api/gateway/runs/{run_id}/music/generate`
- `GET /api/gateway/voice/voices`
- `GET /api/gateway/audio/speech/models`
- `GET /api/gateway/audio/transcriptions/models`
- `GET /api/gateway/audio/music/providers`
- `GET /api/gateway/audio/music/models`
- `GET /api/gateway/vision/provider_models`
- `GET /api/gateway/vision/models`
- `/api/gateway/artifacts/search` artifact search scoped to a run, a session,
  or all stored artifacts, with modality, content-type, text, and tag filters
- `/api/gateway/prompt_cache/*` provider/model operator controls
- `/api/gateway/prompt_cache/saved|save|load` Runtime-backed host-local export/import admin aliases
- `/api/gateway/sessions/{session_id}/prompt_cache/*` session lifecycle controls
- `/api/gateway/kg/query` with configurable `lancedb` or in-memory AbstractMemory stores, plus `sqlite` when the installed AbstractMemory build exposes `SQLiteTripleStore`
- `/api/gateway/discovery/capabilities` package, plugin, and thin-client contract discovery

Discovery note:
- the capability contract is versioned and stable for endpoint discovery and
  feature gating
- provider/model/voice catalog routes now add a stable Gateway-owned envelope:
  `catalog.contract = gateway_catalog_v1` plus canonical `items`
- the shared thin-client contract now also exposes `common.readiness` as a
  compact Gateway-owned surface summary derived from endpoint descriptors
- legacy fields such as `models`, `providers`, `provider_models`, `profiles`,
  and `voices` remain in place for compatibility
- richer deployment/readiness truth is still separate from the catalog
  envelope and depends on lower-layer Runtime/Core surfaces

Workflow/Core-backed capabilities:
- Generated images and videos are available to Runtime workflows through
  Runtime's media backend integrations, and the direct Gateway image/video
  routes use the same Runtime/Core output-selector contracts. Video generation
  exposes child-run progress through `abstract.progress` ledger records.
- Generated music is available through Gateway's direct Runtime-backed child-run
  route, with provider/model discovery exposed through Gateway capability
  contracts and music catalog endpoints for higher-level apps.
- Catalog routes now return a canonical `items` array and a `catalog` metadata
  block so higher-level apps can stop parsing route-local payload variants.
- Audio transcription is available through a direct Runtime-backed child-run
  route, and the capability contract also exposes `voice.listen` as a
  host-capture command surface for higher-level apps that record locally before
  emitting events or uploading audio.
- Prompt-cache support depends on the active provider/model. Session lifecycle
  routes provide Gateway-owned naming and orchestration, not a provider-
  independent local KV cache.
- Prompt-cache export/import admin remains local-only. Local runtimes keep those
  artifacts under the Gateway data dir; remote and hybrid runtimes return a
  structured `prompt_cache_local_only` response.

## Client contract (replay-first)

- Clients **start runs**: `POST /api/gateway/runs/start`
- Clients can **schedule runs** (bundle mode): `POST /api/gateway/runs/schedule`
- Clients **act** by submitting durable commands: `POST /api/gateway/commands`
  - supported types: `pause|resume|cancel|emit_event|update_schedule|compact_memory`
- Clients **render** by replaying/streaming the durable ledger:
  - replay: `GET /api/gateway/runs/{run_id}/ledger?after=...`
  - stream (SSE): `GET /api/gateway/runs/{run_id}/ledger/stream?after=...`

See [docs/api.md](docs/api.md) for curl examples and the live OpenAPI spec (`/openapi.json`).

## Install

### Base remote-light server

Requires Python `>=3.10` (see `pyproject.toml`).

The base install is the remote-light HTTP/SSE server: Gateway, Runtime,
Agent, Flow compatibility, Runtime-owned provider/tool and multimodal facades,
and LanceDB-backed Memory. It intentionally excludes local sentence-transformer
embeddings and hardware-local inference engines so Linux installs do not pull
PyTorch/CUDA packages. Remote embeddings and remote multimodal input/output
still work in this profile through hosted providers, OpenAI-compatible
endpoints, or a remote AbstractCore server.

```bash
pip install abstractgateway
```

### Optional extras

- `abstractgateway[apple]`: full native macOS Python profile with Apple-local engines and all non-NVIDIA framework capabilities
- `abstractgateway[gpu]`: full local GPU profile with vLLM/HuggingFace, local Diffusers image generation, local voice engines, music, and KG memory; this is also the NVIDIA Docker install profile
- `abstractgateway[embeddings]`: local sentence-transformer embeddings for semantic KG queries
- `abstractgateway[docs]`: MkDocs site tooling
- `abstractgateway[dev]`: local test/dev deps

KG memory nodes use Gateway's memory resolver. The default durable/vector
backend is LanceDB; `memory` is process-local dev/test storage, and `sqlite` is
structured-only when the installed AbstractMemory build exposes
`SQLiteTripleStore`. A fresh persistent store is still reported as available
when the backend resolves; empty queries return empty results instead of hiding
KG authoring surfaces.

Gateway has a first-class config helper:

```bash
abstractgateway-config status
abstractgateway-config init --env-file .env
```

For details on capability route defaults, store backends, and workflow sources, see [docs/configuration.md](docs/configuration.md).

## Creating a `.flow` bundle (authoring)

Use AbstractFlow to pack a bundle:

```bash
abstractflow bundle pack /path/to/root.json --out /path/to/bundles/my.flow --flows-dir /path/to/flows
```

See [docs/getting-started.md](docs/getting-started.md) for running, split API/runner, and file→SQLite migration.

## Docs

Published docs site: https://www.lpalbou.info/AbstractGateway/

### Project docs

- Changelog: [CHANGELOG.md](CHANGELOG.md) (compat: `CHANGELOD.md`)
- Contributing: [CONTRIBUTING.md](CONTRIBUTING.md)
- Security policy (vulnerability reporting): [SECURITY.md](SECURITY.md)
- Acknowledgments: [ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md) (compat: `ACKNOWLEDMENTS.md`)

### Package docs

- Docs index: [docs/README.md](docs/README.md)
- Getting started: [docs/getting-started.md](docs/getting-started.md)
- FAQ: [docs/faq.md](docs/faq.md)
- Architecture: [docs/architecture.md](docs/architecture.md)
- Configuration: [docs/configuration.md](docs/configuration.md)
- Deployment: [docs/deployment.md](docs/deployment.md)
- API overview: [docs/api.md](docs/api.md)
- Security: [docs/security.md](docs/security.md)
- Operator tooling (optional): [docs/maintenance.md](docs/maintenance.md)

---

## docs/README.md

# AbstractGateway docs

Start here if you’re new to the project.

## AbstractFramework ecosystem

AbstractGateway is one component in the larger AbstractFramework ecosystem:

- **AbstractRuntime** (required): durable runs + stores
- **AbstractRuntime + transitive capability packages** (required by the default server install): Runtime owns the LLM/tool/media integration boundary; Gateway uses its discovery/run facades for prompt-cache controls, generated and edited image/video plus voice/audio/music capabilities, and KG-backed bundle execution
- Higher-level UIs (optional): AbstractFlow / AbstractObserver / AbstractCode / thin clients

Related repos:
- AbstractFramework: https://github.com/lpalbou/AbstractFramework
- AbstractCore: https://github.com/lpalbou/abstractcore
- AbstractRuntime: https://github.com/lpalbou/abstractruntime

## Docs map

- Quickstart + stores (file/SQLite): [getting-started.md](docs/getting-started.md)
- FAQ / troubleshooting: [faq.md](docs/faq.md)
- Architecture (durable contract + components): [architecture.md](docs/architecture.md)
- Configuration (env vars + install extras): [configuration.md](docs/configuration.md)
- Apple Silicon local Gateway + Flow quickstart: [apple-local-gateway-flow.md](docs/apple-local-gateway-flow.md)
- Deployment (Docker/GHCR/Compose): [deployment.md](docs/deployment.md)
- API overview (client contract + OpenAPI, including direct image/video, STT, and music generation): [api.md](docs/api.md)
- Security guide (auth/origin/limits/audit log): [security.md](docs/security.md)
- Operator tooling (triage/backlog/process manager): [maintenance.md](docs/maintenance.md)

## API docs (generated)

Published static docs site: https://www.lpalbou.info/AbstractGateway/

When the HTTP server is running (`abstractgateway serve`):
- Health: `GET /api/health`
- OpenAPI JSON: `GET /openapi.json`
- Interactive Swagger UI: `GET /docs`

## Project docs

- Package README: [../README.md](README.md)
- Changelog: [../CHANGELOG.md](CHANGELOG.md) (compat: `CHANGELOD.md`)
- Contributing: [../CONTRIBUTING.md](CONTRIBUTING.md)
- Security policy (vulnerability reporting): [../SECURITY.md](SECURITY.md)
- Acknowledgments: [../ACKNOWLEDGMENTS.md](ACKNOWLEDGMENTS.md) (compat: `ACKNOWLEDMENTS.md`)

---

## docs/getting-started.md

# AbstractGateway — Getting started

AbstractGateway is a deployable HTTP/SSE host for **durable AbstractRuntime runs**:
- clients **start runs** and submit **durable commands**
- clients **render** by replaying/streaming the durable ledger (replay-first)

This guide gets a new installation running in **bundle mode** (recommended), then covers **file vs SQLite** durability and a best-effort **file → SQLite** migration.

## AbstractFramework ecosystem (context)

AbstractGateway is one component in the larger **AbstractFramework** ecosystem:
- **AbstractRuntime** (required): durable runs + workflow registry + stores
- **AbstractRuntime + transitive capability packages** (required by the default server install): Runtime owns the LLM/tool/media integration boundary; Gateway uses its discovery/run facades for prompt-cache controls, generated and edited image/video plus voice/audio/music capabilities, and KG-backed bundle execution

Related repos:
- AbstractFramework: https://github.com/lpalbou/AbstractFramework
- AbstractCore: https://github.com/lpalbou/abstractcore
- AbstractRuntime: https://github.com/lpalbou/abstractruntime

## Prerequisites

- Python `>=3.10` (see `pyproject.toml`)
- Workflow source:
  - **Bundle mode** (recommended): one `.flow` file or a directory of `*.flow` bundles
    - You can also upload bundles after startup via `POST /api/gateway/bundles/upload` (see below)
  - **VisualFlow directory mode** was removed in 0.2.18 (see the changelog); VisualFlow JSON is stored/published via Gateway endpoints and executed as `.flow` bundles

## Install

```bash
# Remote-light server package (HTTP/SSE + runner + stores + KG memory)
pip install abstractgateway

# Native Apple local engines
pip install "abstractgateway[apple]"

# Native/container GPU local engines, also used by the NVIDIA Docker image
pip install "abstractgateway[gpu]"
```

With the base install and a configured provider stack, Gateway can surface
run-scoped direct TTS, STT, image generation, image edit, and music generation
for higher-level apps through one shared capability contract.

## 1) Run (bundle mode, file-backed stores)

File-backed stores are the default and easiest for dev.

```bash
export ABSTRACTGATEWAY_WORKFLOW_SOURCE=bundle
export ABSTRACTGATEWAY_DATA_DIR="$PWD/runtime/gateway"

# Optional: set only for a custom bundle registry. When unset, Gateway uses
# the bundle directory shipped with the package, which contains basic-agent.
# export ABSTRACTGATEWAY_FLOWS_DIR="/path/to/bundles"

# Required by default: the server refuses to start without a token.
export ABSTRACTGATEWAY_AUTH_TOKEN="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
export ABSTRACTGATEWAY_ALLOWED_ORIGINS="http://localhost:*,http://127.0.0.1:*"

abstractgateway serve --host 127.0.0.1 --port 8080
```

OpenAPI docs (Swagger UI): `http://127.0.0.1:8080/docs` (use **Authorize** for bearer token)

Smoke checks:

```bash
curl -sS "http://127.0.0.1:8080/api/health"

curl -sS -H "Authorization: Bearer $ABSTRACTGATEWAY_AUTH_TOKEN" \
  "http://127.0.0.1:8080/api/gateway/bundles"
```

If `bundles.items` is empty, either:
- point `ABSTRACTGATEWAY_FLOWS_DIR` at the shipped bundle directory or another
  directory containing `*.flow` files (or a single `.flow` file), or
- upload a bundle via the API:

```bash
curl -sS -H "Authorization: Bearer $ABSTRACTGATEWAY_AUTH_TOKEN" \
  -F "file=@./my-bundle@0.1.0.flow" \
  -F "overwrite=false" \
  -F "reload=true" \
  "http://127.0.0.1:8080/api/gateway/bundles/upload"
```

You can also create a local env file and inspect readiness with:

```bash
abstractgateway-config init --env-file .env
abstractgateway-config status
```

## 2) Start a run (bundle mode)

First, discover entrypoints from `GET /api/gateway/bundles`. Then start a run:

```bash
curl -sS -H "Authorization: Bearer $ABSTRACTGATEWAY_AUTH_TOKEN" -H "Content-Type: application/json" \
  -d '{"bundle_id":"my-bundle","input_data":{"prompt":"Hello"}}' \
  "http://127.0.0.1:8080/api/gateway/runs/start"
```

Notes:
- If a bundle has multiple entrypoints and no default, you must pass `flow_id`.
- See [api.md](docs/api.md) for ledger replay/stream and durable commands.
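
The same start request can be built from Python with only the standard library. This is a sketch: the helper name is hypothetical, the body fields mirror the curl example above, and the response shape is not shown here, so the code stops at constructing the request.

```python
import json
import urllib.request
from typing import Any, Optional


def build_start_run_request(
    base_url: str,
    token: str,
    bundle_id: str,
    input_data: dict[str, Any],
    flow_id: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST to /api/gateway/runs/start."""
    body: dict[str, Any] = {"bundle_id": bundle_id, "input_data": input_data}
    if flow_id is not None:
        # Needed when the bundle has several entrypoints and no default.
        body["flow_id"] = flow_id
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/gateway/runs/start",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_start_run_request(
    "http://127.0.0.1:8080", "example-token", "my-bundle", {"prompt": "Hello"}
)
print(req.get_method(), req.full_url)
```

To send it against a running server, pass the request to `urllib.request.urlopen(req)` and decode the JSON body of the response.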

## 2b) (Optional) Schedule a run (bundle mode)

To launch a workflow periodically, start a **scheduled parent run**:

```bash
curl -sS -H "Authorization: Bearer $ABSTRACTGATEWAY_AUTH_TOKEN" -H "Content-Type: application/json" \
  -d '{"bundle_id":"my-bundle","flow_id":"ac-echo","input_data":{"prompt":"Ping"},"start_at":"now","interval":"1h","repeat_count":3}' \
  "http://127.0.0.1:8080/api/gateway/runs/schedule"
```

Tip: to stop a schedule, cancel the scheduled parent run (`POST /api/gateway/commands`, type `cancel`).

## 3) Split API vs runner (recommended for upgrades)

By default, `abstractgateway serve` starts the HTTP API **and** the runner loop in the same process.

To restart the HTTP API without pausing durable execution, run two processes sharing the same `ABSTRACTGATEWAY_DATA_DIR`:

```bash
# Process 1 (runner worker, no HTTP deps needed):
abstractgateway runner

# Process 2 (HTTP API only):
abstractgateway serve --no-runner --host 127.0.0.1 --port 8080
```

## 3b) Docker / Compose

For a containerized remote-light deployment, run the published image. It
includes `AbstractRuntime`, Runtime-owned provider/tool and multimodal support,
KG memory, and provider/session prompt-cache controls. Remote embeddings are
available in this profile when `embedding.text` points at a remote provider, an
OpenAI-compatible embeddings endpoint, or a remote AbstractCore server; local
HuggingFace/sentence-transformer embeddings require
`abstractgateway[embeddings]`.

```bash
docker run --rm --name abstractgateway \
  -p 8080:8080 \
  -v "$PWD/runtime:/data" \
  -e ABSTRACTGATEWAY_DATA_DIR=/data \
  -e ABSTRACTGATEWAY_USER_AUTH=1 \
  ghcr.io/lpalbou/abstractgateway:latest
```

See [deployment.md](docs/deployment.md) for Compose, provider keys, and image
customization.

On first start, the container creates `default/admin` and writes the token to
`runtime/auth/bootstrap-admin-token`.

NVIDIA hosts can try `ghcr.io/lpalbou/abstractgateway:0.2.24-gpu` with the
compose overlay in `docker/abstractgateway-server/compose.nvidia.yml`. It is
experimental until a real CUDA build/smoke gate is part of release validation.

Apple MLX inference should run natively on macOS rather than in Docker because
Linux containers do not get access to Apple's Metal/MLX runtime. The container
can still use native macOS inference through an OpenAI-compatible endpoint:
point `OPENAI_COMPATIBLE_BASE_URL` at Docker Model Runner on
`http://model-runner.docker.internal/engines/v1`, LM Studio on
`http://host.docker.internal:1234/v1`, `mlx_lm.server`, or Ollama's
OpenAI-compatible API on `http://host.docker.internal:11434/v1` when the
native Ollama model path uses MLX.

For native non-Docker installs, use
`pip install "abstractgateway[apple]"` on Apple Silicon, and
`pip install "abstractgateway[gpu]"` on GPU workstations or NVIDIA Docker builds.

## 4) What’s stored in `ABSTRACTGATEWAY_DATA_DIR` (file backend)

When `ABSTRACTGATEWAY_STORE_BACKEND=file` (default), the gateway persists (via `abstractruntime` stores):
- `run_<run_id>.json` (checkpointed run state)
- `ledger_<run_id>.jsonl` (append-only step records)
- `commands.jsonl` and `commands_cursor.json` (durable inbox + runner cursor)
- `artifacts/` (offloaded blobs/attachments)
- `dynamic_flows/` (gateway-generated wrapper flows, e.g. schedules)
- `workspaces/` (per-run workspaces created at run start when `workspace_root` is not provided)
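
The per-run file names above make it straightforward to inspect a file-backed data dir from a script. The sketch below pairs each run id with the presence of its state and ledger files. It assumes the files sit directly in the directory passed in (hosted user-auth mode places each data plane in its own subdirectory), and the function name is hypothetical.

```python
from pathlib import Path


def inventory_runs(data_dir: Path) -> dict[str, dict[str, bool]]:
    """Map each run id to which of its two per-run files are present."""
    runs: dict[str, dict[str, bool]] = {}
    for path in data_dir.glob("run_*.json"):
        run_id = path.stem[len("run_"):]
        runs.setdefault(run_id, {"state": False, "ledger": False})["state"] = True
    for path in data_dir.glob("ledger_*.jsonl"):
        run_id = path.stem[len("ledger_"):]
        runs.setdefault(run_id, {"state": False, "ledger": False})["ledger"] = True
    return runs
```

A run id that shows only one of the two files is a reasonable thing to look at before attempting a migration, since the migrator reads both file types.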

## 5) Enable SQLite-backed stores

SQLite-backed stores eliminate directory scanning and move run/ledger/inbox data into indexed tables.

Artifacts remain file-backed under `ABSTRACTGATEWAY_DATA_DIR/artifacts/`.

```bash
export ABSTRACTGATEWAY_STORE_BACKEND=sqlite

# Optional; when omitted, defaults to: <ABSTRACTGATEWAY_DATA_DIR>/gateway.sqlite3
export ABSTRACTGATEWAY_DB_PATH="$PWD/runtime/gateway/gateway.sqlite3"
#
# Safety invariant: when using sqlite, the DB file must live under ABSTRACTGATEWAY_DATA_DIR.
# The gateway will refuse to start if ABSTRACTGATEWAY_DB_PATH points outside (prevents UAT/prod cross-wiring).

abstractgateway serve --host 127.0.0.1 --port 8080
```
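
The safety invariant in the comments above (the SQLite file must live under `ABSTRACTGATEWAY_DATA_DIR`) can be checked ahead of time from a deployment script. This is a sketch of one way to express the containment test with `pathlib`; it is not Gateway's own check, and the function name is hypothetical.

```python
from pathlib import Path


def db_path_is_under_data_dir(db_path: str, data_dir: str) -> bool:
    """Return True when db_path resolves to a location inside data_dir."""
    # resolve() normalizes `..` segments and symlinks before comparing,
    # so a path that merely starts with the same prefix does not pass.
    db = Path(db_path).resolve()
    root = Path(data_dir).resolve()
    return db.is_relative_to(root)


print(db_path_is_under_data_dir("/srv/gw/gateway.sqlite3", "/srv/gw"))         # True
print(db_path_is_under_data_dir("/srv/gw/../prod/gateway.sqlite3", "/srv/gw"))  # False
```

Comparing resolved paths rather than raw strings is what catches cases such as `/srv/gw-prod/...` or `..` segments that would otherwise slip past a simple prefix check.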

## 6) Migrate an existing file-backed data dir → SQLite

This is a **best-effort** local migration (`abstractgateway migrate`) that reads:
- `run_*.json`
- `ledger_*.jsonl`
- `commands.jsonl`
- `commands_cursor.json`

and writes a single SQLite DB file. It does **not** delete the original files.

```bash
cp -a runtime/gateway "runtime/gateway.file-backup.$(date +%Y%m%d-%H%M%S)"

abstractgateway migrate --from=file --to=sqlite \
  --data-dir runtime/gateway \
  --db-path runtime/gateway/gateway.sqlite3
```

## Related docs

- Docs index: [README.md](docs/README.md)
- FAQ: [faq.md](docs/faq.md)
- Architecture: [architecture.md](docs/architecture.md)
- Configuration (env vars + optional deps): [configuration.md](docs/configuration.md)
- Deployment: [deployment.md](docs/deployment.md)
- API overview: [api.md](docs/api.md)
- Security: [security.md](docs/security.md)
- Operator tooling (optional): [maintenance.md](docs/maintenance.md)

---

## CHANGELOG.md

# Changelog

All notable changes to this project are documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]

## [0.2.24] - 2026-05-31

### Added
- Added `abstractgateway-config bootstrap-admin` to create or recover a file-backed `default/admin` Gateway user for hosted/container user-auth deployments.
- Added a Gateway Docker entrypoint that bootstraps the admin user token into `/data/auth/bootstrap-admin-token` before starting the server.
- Added first-class GHCR tags for `ghcr.io/lpalbou/abstractgateway:<version>`, `latest`, `<version>-gpu`, and `gpu-latest`, while preserving the legacy `abstractgateway-server` tags during transition.

### Changed
- Gateway Docker and Compose defaults now use `/data`, enable hosted user auth, and build release images from the just-published PyPI wheel instead of local source.
- Gateway startup now accepts hosted user-auth deployments without the legacy shared `ABSTRACTGATEWAY_AUTH_TOKEN`.

### Fixed
- Fixed the PyPI/GHCR release path so container images can start cleanly from the published Gateway wheel and still provide an initial admin login token.

## [0.2.23] - 2026-05-31

### Fixed
- Fixed local-source Gateway container builds so the packaged `basic-agent` workflow bundle is present when Hatch builds the wheel inside the release image.

## [0.2.22] - 2026-05-31

### Added
- Added hosted user-principal auth with `GET /api/gateway/me`, admin-only `/api/gateway/admin/users` CRUD, and a file-backed user registry storing bearer-token hashes.
- Added request-scoped Gateway service routing so hosted user-auth mode maps each principal to a separate GatewayService data plane under `<DATA_DIR>/users/<tenant_id>/<runtime_id>/`.
- Added the built-in Gateway Console at `/console` for browser-session sign-in, account/runtime summary, admin user management, token rotation, and per-principal capability default editing.
- Added per-principal capability-default overlays in hosted user-auth mode so users can set provider/model defaults for their own runtime without mutating the global AbstractCore config.
- Added provider endpoint profiles for Gateway-stored OpenAI-compatible or hosted endpoints. Profiles keep API keys server-side, discover endpoint models on demand, and surface as virtual providers in Gateway defaults and Flow node selectors.

### Changed
- Raised dependency floors to `AbstractRuntime>=0.4.26`, `abstractagent>=0.3.10`, and `abstractcore[embeddings]>=2.13.31` so Gateway installs inherit the latest light-profile, media, and provider-profile contracts.

### Fixed
- Fixed the Gateway Console sign-in page so generated inline JavaScript parses correctly, the sign-in form posts to `/api/gateway/session/login`, and signed-out users see only the same-origin Gateway user/token login card.
- Made `abstractgateway.security` export session and middleware helpers lazily so direct `abstractgateway.users` imports are not order-sensitive.
- Kept the base `pip install abstractgateway` remote-light on Linux while relying on the base `AbstractRuntime` install for MCP and remote multimodal routing. Local sentence-transformer embeddings moved behind `abstractgateway[embeddings]`, and Gateway no longer declares direct base `sentence-transformers` or `numpy` dependencies, avoiding PyTorch/NVIDIA CUDA runtime wheels unless an explicit local-engine profile is selected.
- Kept remote/provider-backed embeddings in the base light profile through `embedding.text` routes and remote AbstractCore delegation, while surfacing embedding setup errors instead of reporting a generic missing integration.
- Gateway admin user routes now fail closed when request principal context is absent while Gateway security is enabled.
- Gateway route-family authorization now keeps operator/admin surfaces and server-workspace file helpers admin-only in hosted user-auth mode while regular users remain able to operate within their own runtime data plane.

## [0.2.21] - 2026-05-29

### Added
- Gateway artifact search/import/export endpoints for thin clients, including scoped artifact lookup by run, session, or all stored artifacts with modality, content type, text, and tag filters.
- Capability discovery now advertises artifact search, workspace import, and workspace export descriptors in the shared thin-client contract.

### Changed

- Removed legacy compatibility install extras (`abstractgateway[http]`, `[server]`, `[multimodal]`, `[memory]`, `[voice]`, `[vision]`, `[telegram]`, `[visualflow]`, `[all]`, `[all-apple]`, `[all-gpu]`, `[server-nvidia]`). The supported install surface is now:
  - `pip install abstractgateway`
  - `pip install "abstractgateway[apple]"`
  - `pip install "abstractgateway[gpu]"`
- Raised dependency floors to `AbstractRuntime[multimodal,mcp-worker]>=0.4.25` and `abstractagent>=0.3.9`.
- KG memory readiness now treats a resolvable fresh persistent AbstractMemory store as available, so empty stores return empty query results instead of hiding Flow authoring surfaces.

### Fixed
- Media model-residency discovery now keeps image editing distinct from image generation when Runtime/Core expose task-specific residency state.

## [0.2.20] - 2026-05-26

### Added
- Direct Runtime-backed video generation routes:
  - `POST /api/gateway/runs/{run_id}/videos/generate` for text-to-video
  - `POST /api/gateway/runs/{run_id}/videos/from_image` for image-to-video
- Thin-client capability contracts and readiness metadata now advertise `generated_video` and `image_to_video`, including `provider_models_task` values and `abstract.progress` child-run progress events.
- Model-residency capability reporting now includes video tasks (`text_to_video`, `image_to_video`, and `video_generation`) when Runtime/Core expose them.

### Changed
- Raised the Runtime floor to `AbstractRuntime[multimodal,mcp-worker]>=0.4.24`.
- Gateway documentation now describes direct video routes, video provider/model catalog tasks, and progress-event handling for long-running media jobs.

## [0.2.19] - 2026-05-26

### Added
- Gateway capability-default routing and configuration helpers so downstream thin clients can discover provider/model defaults without hardcoded fallbacks.
- Run-retention cleanup support for draft and ephemeral Flow runs.

### Changed
- Raised dependency floors to `AbstractRuntime[multimodal,mcp-worker]>=0.4.23` and `abstractagent>=0.3.8`.
- Refined Gateway model-residency and catalog proxy responses around Runtime/Core discovery truth, including the latest MLX-Gen vision and OmniVoice catalog surfaces.
- Refreshed Docker and deployment docs for the new release image tags.

### Fixed
- Removed brittle catalog payload assertions by normalizing Gateway-owned catalog envelopes at the route boundary.

## [0.2.18] - 2026-05-23

### Added
- Catalog and provider discovery routes now include a stable Gateway-owned envelope (`catalog.contract=gateway_catalog_v1`, `catalog.version=1`) plus one canonical `items` array, while preserving legacy lower-layer fields for compatibility.
- Capability discovery now also exposes `common.readiness` (`gateway_surface_readiness_v1`): a compact surface-level summary derived from endpoint descriptors, memory readiness, prompt-cache, media gates, and Runtime/Core truth.

### Changed
- Raised the Runtime floor to `AbstractRuntime[multimodal,mcp-worker]>=0.4.22`.
- Removed VisualFlow directory mode and fully removed the `abstractflow` package dependency from Gateway. VisualFlow JSON is stored/published via Gateway endpoints and executed as `.flow` WorkflowBundles (bundle mode).

## [0.2.17] - 2026-05-22

### Added
- Gateway now exposes Runtime-backed image editing for thin clients through `POST /api/gateway/runs/{run_id}/images/edit`.

### Changed
- Raised the Runtime floor to `AbstractRuntime[multimodal,mcp-worker]>=0.4.21`.
- Gateway capability discovery and thin-client contracts now advertise edited-image and generated-music availability, richer voice `tts|stt|listen` contracts, and Runtime-backed model residency truth instead of hard-coded media support flags.
- Direct STT now forwards `prompt`, `response_format`, `temperature`, and source `format` hints through the Runtime transcription surface.
- Release-facing docs now describe the current higher-app surface more precisely, including the stable route/contract layer and the current best-effort catalog payload limitation.

## [0.2.16] - 2026-05-21

### Changed
- Raised the Runtime floor to `AbstractRuntime[multimodal,mcp-worker]>=0.4.20` across the base, Apple, and GPU install profiles.
- Gateway's legacy prompt-cache snapshot aliases, `GET /api/gateway/prompt_cache/saved` and `POST /api/gateway/prompt_cache/save|load`, now delegate to Runtime's public host facade instead of using provider-private prompt-cache state directly.
- Local bundle runtimes now keep host-local prompt-cache exports under `<DATA_DIR>/prompt_cache_exports` through Runtime's export root policy.

### Fixed
- Removed the last Gateway-side prompt-cache boundary bypass (`runtime._abstractcore_llm_client`, direct provider-instance access, and provider-private `_prompt_cache_store` / GGUF cache hooks) from the public route surface.
- Removed the stale internal Core catalog proxy module after discovery routing fully moved to Runtime's public discovery facade.

## [0.2.15] - 2026-05-21

### Added
- Added Runtime-backed durable bloc prompt-cache control-plane routes under `/api/gateway/blocs/*`, including KV manifest/list/ensure/load/delete/prune helpers for exact-reuse workflows.
- Added Gateway-owned workspace file helper support plus focused route and contract coverage for durable blocs, model residency, notifier behavior, and Runtime-backed capability discovery.

### Changed
- Raised the Runtime floor to `AbstractRuntime[multimodal,mcp-worker]>=0.4.19` and moved Gateway's public provider/media/tool boundary behind Runtime facades rather than direct package imports.
- Updated Apple/GPU install profiles to cascade through Runtime's aggregate extras and excluded internal `tests/`, `flows/`, and backlog notes from source distributions.
- Expanded the docs and capability contract to cover durable blocs, media/model residency, Runtime-backed email/Telegram helpers, and the current Docker/runtime dependency shape.

### Fixed
- Gateway no longer reads AbstractCore config for LLM helper defaults; provider/model resolution now follows request values, Gateway env, and flow defaults with a clear config error when unset.
- Gateway's operator email, Telegram, and notification paths now use Runtime's AbstractCore host facades, while local file/workspace helpers stay owned by Gateway.
- Capability discovery and prompt-cache readiness reporting now better reflect the actual state of generated-media, voice/audio, and provider-backed cache controls.

## [0.2.14] - 2026-05-19

### Fixed
- Gateway now carries explicit modern OpenAI/httpx/anyio dependency bounds in its base install metadata, preventing Python 3.10 resolver backtracking while preserving the Apple/GPU profile cascade into `[all-apple]` and `[all-gpu]` framework dependencies.

### Changed
- Raised the Runtime floor to `AbstractRuntime>=0.4.14` so Gateway profiles consume Runtime's resolver bounds for AbstractCore provider/tool extras.

## [0.2.13] - 2026-05-19

### Fixed
- Gateway's base install now avoids mixing Core's narrow base media/embeddings extras with Core `[all-apple]` and `[all-gpu]` profile dependencies, while still installing the media, compression, and embeddings dependency set needed by the remote-capable base package.
- Gateway's base media dependency set now uses a Python-3.10-compatible `unstructured` line and bounds `python-pptx` to supported modern releases so document-capable installs do not backtrack into broken legacy setup packages.
- Gateway's base web dependency set now prefers current compatible FastAPI/Uvicorn/Requests/urllib3 releases to keep CI and user installs out of unnecessary resolver backtracking.
- Gateway now applies a compatible setuptools lower bound so Apple/GPU installs satisfy Torch's `<82` constraint without resolving into ancient broken setuptools releases.

### Changed
- Raised the Runtime floor to `AbstractRuntime>=0.4.13` so Gateway profiles consume Runtime's updated multimodal dependency metadata, and raised the Music floor to `abstractmusic>=0.1.2`.

## [0.2.12] - 2026-05-19

### Fixed
- Gateway Apple install profiles now preserve the entrypoint contract by cascading `[all-apple]` through Runtime, Agent, Core, Vision, Voice, Music, and Memory dependencies; GPU profiles continue to cascade `[all-gpu]`.

### Changed
- Gateway's base remote-capable install now includes Core embeddings dependencies alongside remote providers, media, tools, tokens, compression, voice/audio, and vision while preserving the published Core dependency floor.

## [0.2.11] - 2026-05-19

### Fixed
- Gateway voice, TTS, STT, and vision catalog routes now use the AbstractCore capability abstractions as the source of truth for provider and provider-model discovery.
- Direct Gateway TTS and STT routes now dispatch through the AbstractCore capability registry, preserving explicitly selected media providers and models through execution.
- Gateway LLM provider/model discovery can proxy configured AbstractCore Server catalog routes while keeping Flow's existing response contract.

### Changed
- Raised dependency floors to Runtime `>=0.4.12`, Core `>=2.13.15`, Flow `>=0.3.11`, Vision `>=0.3.6`, and Voice `>=0.10.3`.

## [0.2.10] - 2026-05-13

### Fixed
- Gateway capability discovery now builds its embedded capability registry with Gateway-scoped media configuration, keeping discovery contracts aligned with the concrete voice, TTS, STT, and image catalog routes.
- Gateway media catalog proxy calls now avoid forwarding unset optional query params, preventing stale `None` values from breaking downstream capability discovery.

### Changed
- Raised dependency floors to Runtime `>=0.4.11`, Core `>=2.13.14`, Flow `>=0.3.11`, Vision `>=0.3.5`, and Voice `>=0.9.4`.


## [0.2.9] - 2026-05-12

### Added
- Gateway discovery now advertises `/api/gateway/audio/transcriptions/models` for STT catalog lookup.
- Added local and proxied STT model catalog responses backed by AbstractCore/AbstractVoice.

### Fixed
- Gateway capability catalogs now map Gateway-scoped voice and vision env vars into the embedded capability registry, so local Gateway deployments expose configured voice/TTS/STT/image models without requiring duplicate lower-level env names.
- Catalog proxy calls now omit unset optional query params instead of forwarding `None` values.

### Changed
- Raised dependency floors to Runtime `>=0.4.10`, Core `>=2.13.13`, Flow `>=0.3.10`, and Voice `>=0.9.3`.

## [0.2.8] - 2026-05-10

### Added

- Capability discovery now advertises
  `capabilities.contracts.common.runs.input_data` and
  `capabilities.contracts.common.runs.history_bundle` so thin clients can
  feature-detect the run input and RunHistoryBundle endpoints from the shared
  Gateway contract.

## [0.2.7] - 2026-05-10

### Changed

- Raised the Agent floor to `abstractagent>=0.3.6`, matching the new AbstractAgent release that requires `abstractruntime>=0.4.9`.

## [0.2.6] - 2026-05-09

### Fixed

- Raised the AbstractVision floor to `abstractvision>=0.3.4` across Gateway
  install profiles so `abstractgateway[gpu]` and the NVIDIA image inherit the
  stable-diffusion.cpp binding constraint that avoids the broken
  `stable-diffusion-cpp-python==0.4.6` Linux sdist.
- Updated release-facing Docker examples and package metadata from `0.2.5` to
  `0.2.6`.
- Release/CI installs now bypass the restored pip dependency cache for editable
  dependency resolution, avoiding stale package indexes immediately after
  lower-package releases.

## [0.2.5] - 2026-05-09

### Changed

- Promoted the base `abstractgateway` install to the remote-light HTTP/SSE
  server profile. It now includes Runtime multimodal support, AbstractAgent,
  AbstractCore remote/media/tools/tokens/compression/vision/voice/audio,
  AbstractVision, AbstractVoice, AbstractFlow compatibility,
  AbstractMemory/LanceDB KG support, FastAPI, multipart uploads, and Uvicorn.
- Raised Runtime and Agent floors to `AbstractRuntime>=0.4.9` and
  `abstractagent>=0.3.6`.
- Simplified install guidance around `abstractgateway`, `abstractgateway[apple]`,
  and `abstractgateway[gpu]`. The older `http`, `server`, `multimodal`,
  `memory`, `voice`, `vision`, `all`, and `server-nvidia` extras remain as
  compatibility aliases.
- The NVIDIA Docker image now installs `abstractgateway[gpu]`; `server-nvidia`
  remains only as a compatibility alias.

## [0.2.4] - 2026-05-08

### Added

- Explicit install profiles for the Gateway package: minimal base,
  `http`, `multimodal`, `server`, `memory`, `apple`, `gpu`, `all-apple`,
  `all-gpu`, and `server-nvidia`.
- `abstractgateway-config` plus `abstractgateway config` for operator status and
  private `.env` bootstrap without taking ownership of AbstractCore provider
  configuration.
- Gateway memory store resolver for AbstractMemory-backed LanceDB, SQLite, and
  in-memory stores, including `/kg/query` store metadata.
- Core catalog proxy endpoints for thin clients:
  `GET /api/gateway/voice/voices`,
  `GET /api/gateway/audio/speech/models`, and
  `GET /api/gateway/vision/provider_models`.
- Added a `server-nvidia` extra plus an experimental CUDA/PyTorch-based
  `abstractgateway-server-nvidia` Docker image recipe for full NVIDIA machines.
- Release and manual GHCR image workflows now publish the light default server
  image and attempt an experimental best-effort NVIDIA full image.

### Changed

- Base installs are now intentionally minimal again:
  `AbstractRuntime>=0.4.8` only.
- Server and multimodal profiles now use the aligned Runtime/Core/Voice/Vision
  floors: `AbstractRuntime>=0.4.8`, `abstractcore>=2.13.12`,
  `abstractvision>=0.3.3`, and `abstractvoice>=0.9.2`.
- Server, native Apple, native GPU, and NVIDIA profiles now require
  `abstractagent>=0.3.5`, so Gateway-hosted agent nodes resolve against the
  same Core/Runtime baseline as Gateway itself.
- Release tests now reset Gateway's process-global service between cases and
  pass explicit provider/model overrides for ledger summary/chat generation
  tests.
- Native Python hardware profiles are full deployment aggregates:
  `abstractgateway[apple]` and `abstractgateway[all-apple]` install the
  Apple-local stack and all relevant non-NVIDIA framework capabilities, while
  `abstractgateway[gpu]` and `abstractgateway[all-gpu]` install the matching
  local GPU stack.
- Gateway-owned runtime handoff now seeds `_runtime.prompt_cache`,
  `_runtime.max_attachment_bytes`, and `_runtime.workflow_bundles_dir` from
  Gateway configuration.
- Gateway LLM helper defaults now resolve through the same deployment cascade as
  runtime execution instead of hardcoded local model fallbacks.
- Docker Compose local builds can override `ABSTRACTGATEWAY_EXTRAS`; the
  default examples use port `8080`, and an NVIDIA compose overlay is available
  for GPU hosts.
- The default Docker server image now composes `abstractgateway[server,memory]`
  so KG workflows and `/kg/query` have the AbstractMemory/LanceDB store package
  available without making memory a base-package dependency.
- The `memory` profile now depends on `AbstractMemory[lancedb]>=0.2.6`.

### Fixed

- `memory_kg_*` effects and `/kg/query` no longer assume LanceDB directly;
  in-memory stores work, SQLite structured queries work when the installed
  AbstractMemory build exposes `SQLiteTripleStore`, and semantic queries fail
  clearly when the selected store has no vector/search capability.
- Dynamic voice/audio/vision catalog discovery now delegates to the AbstractCore
  server catalog boundary when configured, with bounded static fallback when it
  is not.
- Observer/chat/backlog/discovery helpers now return a clear provider/model
  configuration error when no request, Gateway env, or AbstractCore default is
  available.

### Notes

- The default Docker image remains the release-grade light, portable image for
  `linux/amd64` and `linux/arm64`. The NVIDIA image is `linux/amd64` only and
  is experimental/best-effort because vLLM/Torch/Diffusers dependency
  resolution is much heavier than the default server profile and still needs a
  CUDA host smoke gate before production positioning.
- There is no practical MLX Docker image target for Apple Silicon today: MLX
  depends on Apple's Metal stack and Docker Desktop runs Linux containers
  without Metal/MPS device access. Apple local inference should stay native on
  macOS, not containerized; the Gateway container can point at Docker Model
  Runner, native LM Studio, `mlx_lm.server`, or Ollama OpenAI-compatible
  endpoints via `model-runner.docker.internal` or `host.docker.internal`.

## [0.2.3] - 2026-05-08

### Added

- Versioned thin-client capability contracts for Gateway common features, AbstractFlow editor/runtime support, AbstractAssistant media/cache controls, and AbstractCode-facing prompt-cache controls.
- AbstractFlow gateway-first editor contract validation, including VisualFlow CRUD/publish/start/observe coverage and a bundled flow input-schema endpoint.
- Gateway-owned session prompt-cache lifecycle routes:
  - `GET /api/gateway/sessions/{session_id}/prompt_cache/status`
  - `POST /api/gateway/sessions/{session_id}/prompt_cache/prepare`
  - `POST /api/gateway/sessions/{session_id}/prompt_cache/rebuild`
  - `POST /api/gateway/sessions/{session_id}/prompt_cache/clear`
- Generated-media contract fields in capability discovery, including direct-vs-workflow generated-image availability.
- Direct generated-image route, `POST /api/gateway/runs/{run_id}/images/generate`, backed by Runtime/Core image output selectors, artifact storage, and `abstract.media.image.generated` ledger events.
- Backlog completion ledger for the capability contract, Flow editor contract, session prompt-cache lifecycle, and generated-media gateway contract.

### Changed

- Capability discovery now truthfully reports provider-level and session-level prompt-cache controls, plus direct Gateway voice/audio/image endpoints where configured.
- API, configuration, deployment, Docker, README, FAQ, and LLM ingestion docs now describe generated images as both workflow-backed and directly available through the Gateway route when a Runtime/Core image backend is installed and configured.
- Docker/Compose release examples now point at the `0.2.3` server image.

### Fixed

- Fixed stale release-facing docs that said Gateway had no direct image-generation endpoint after the direct route landed.
- Fixed an order-dependent test import leak so the full local pytest suite can run cleanly after the AbstractFlow editor contract tests.

### Notes

- Direct image generation still depends on a configured Runtime/Core/AbstractVision-compatible backend; Gateway does not bundle heavy local image engines.
- Session prompt-cache lifecycle is Gateway-owned naming and orchestration over provider/model controls. It is not a provider-independent local KV cache or full CachedSession persistence system.

## [0.2.2] - 2026-05-06

### Added

- MkDocs Material configuration for the documentation site.
- CI docs build job and release docs gate.
- Release workflow deployment to GitHub Pages via `mkdocs gh-deploy`.
- PyPI-backed GHCR server image publishing for `ghcr.io/lpalbou/abstractgateway-server`.
- CI validation build for the local server Docker image recipe.
- Docker server image, Compose profile, and deployment documentation.
- `docs`, `server`, `vision`, and `multimodal` optional dependency extras.
- Discovery metadata for AbstractCore capability plugins (`voice`, `audio`, `vision`, and future `music`).

### Changed

- Version metadata aligned across `pyproject.toml`, package `__version__`, and FastAPI app metadata.
- The server install profile now mirrors the newer AbstractRuntime/Core multimodal stack: `AbstractRuntime[multimodal]>=0.4.6`, `abstractcore[remote,media,tools,tokens,compression,vision,voice,audio]>=2.13.10`, `abstractvision>=0.3.1`, and `abstractvoice>=0.9.0`.
- The server Docker/Compose profile now documents workflow-backed image generation through AbstractVision, direct Gateway TTS/STT through AbstractVoice, and provider-dependent prompt-cache controls.
- Gateway voice/audio endpoints now accept AbstractVoice's newer local/remote backend environment knobs in addition to the existing Gateway-scoped settings.

### Notes

- Release scope is intentionally explicit: TTS and STT have direct Gateway endpoints; generated images are available through Runtime/Core workflows with AbstractVision installed and configured, but Gateway does not yet expose a direct image-generation HTTP endpoint.
- Prompt-cache support is provider-level control-plane support. This release does not add a Gateway-owned CachedSession lifecycle API.
- `flows/bundles/article@dev.flow` was inspected and left untracked. It is a local `dev` bundle generated by the Gateway publisher, not a release artifact.

## [0.2.1] - 2026-02-09

### Changed

- Dependency bumps (see `pyproject.toml`):
  - `AbstractRuntime>=0.4.2` (and `AbstractRuntime[abstractcore]>=0.4.2` for HTTP/voice/telegram/all extras)
  - `abstractagent>=0.3.1`, `abstractvoice>=0.6.3`, `abstractflow>=0.3.7`
  - `abstractcore[media,tools]>=2.11.8` (via `abstractgateway[all]`)
- Documentation refresh for external users:
  - added explicit AbstractFramework ecosystem context
  - updated minimum versions in install snippets to match `pyproject.toml`
  - kept the architecture diagram as the canonical “shape of the system”
- Version metadata alignment:
  - `pyproject.toml`, `src/abstractgateway/__init__.py`, and `src/abstractgateway/app.py` now agree on `0.2.1`

## [0.1.1] - 2026-02-04

### Changed

- Documentation refresh for external users:
  - new FAQ (`docs/faq.md`)
  - clarified quickstart + smoke checks in `README.md`
  - tightened getting started, configuration, security, and API overview docs
  - improved cross-linking in `CONTRIBUTING.md` and `SECURITY.md`
  - refreshed `llms.txt` / `llms-full.txt` for agent ingestion (index + full snapshot)
- Version bump to reflect the documentation release (`0.1.0` → `0.1.1`).

### Notes

- No intentional runtime behavior changes in this release; it is documentation-focused.

## [0.1.0] - 2026-02-03

### Added

- Initial public package for AbstractGateway (`abstractgateway`).

---

## CONTRIBUTING.md

# Contributing

Thanks for your interest in improving AbstractGateway.

This repo is a Python package (`src/` layout) with a FastAPI server, a durable runner worker, and contract tests under `tests/`.

## Quick start (dev)

```bash
python -m venv .venv
source .venv/bin/activate

python -m pip install -U pip
pip install -e ".[dev]"
```

Run the test suite:

```bash
pytest
```

To run only the fast unit/contract layer:

```bash
pytest -m basic
```

Notes:
- `integration` and `e2e` tests may require optional dependencies and/or external services (e.g. an LLM provider).
- The CLI entrypoint is `abstractgateway` (see `pyproject.toml`).

## How to contribute

1. **Open an issue** (or a draft PR) describing what you want to change and why.
2. Keep changes **small and reviewable**.
3. Add or adjust tests where they improve confidence.
4. Update docs so they remain truthful and user-facing:
   - README is the entrypoint.
   - `docs/getting-started.md` is the step-by-step guide.
   - Prefer adding FAQ entries for recurring “gotchas”.
   - Regenerate the LLM snapshot: `python scripts/generate-llms-full.py` (updates `llms-full.txt`).

## Project conventions

- Source of truth is the code in `src/`.
- Keep public docs concise, actionable, and aligned with the current behavior.
- Prefer explicit env var names as used in code (see `docs/configuration.md`).

## Release checklist (maintainers)

1. Update `CHANGELOG.md`.
2. Bump version in:
   - `pyproject.toml`
   - `src/abstractgateway/__init__.py`
   - `src/abstractgateway/app.py` (FastAPI version string)
3. Run `pytest`.
4. Build artifacts (optional): `python -m build`

## Related docs

- Package overview + quickstart: [README.md](README.md)
- Docs index: [docs/README.md](docs/README.md)
- Getting started: [docs/getting-started.md](docs/getting-started.md)

---

## SECURITY.md

# Security policy

Thanks for helping keep AbstractGateway and its users safe.

## Reporting a vulnerability

Please **do not** open a public GitHub issue for security vulnerabilities.

Instead, use GitHub’s **private vulnerability reporting** / **Security Advisories** for this repository:
- Go to the repository’s **Security** tab
- Open **Advisories**
- Click **Report a vulnerability** (or create a draft advisory)

If you cannot use GitHub advisories, contact the maintainers privately (e.g. via GitHub profile contact links).

## What to include

To help us triage quickly, include:
- a clear description of the issue and impact
- minimal reproduction steps or a PoC
- affected versions and environments (OS/Python version/config)
- any suggested mitigation or patch

## Coordinated disclosure

We appreciate responsible disclosure and will work with you to:
- confirm the issue
- assess severity and affected versions
- produce a fix and release

Please avoid active exploitation, privacy violations, or destructive testing.

## Related docs

- Security configuration (auth/origin/limits): [docs/security.md](docs/security.md)
- Getting started: [docs/getting-started.md](docs/getting-started.md)

---

## ACKNOWLEDGMENTS.md

# Acknowledgments

AbstractGateway stands on the shoulders of many open-source projects and contributors.

This list is **non-exhaustive**. The canonical dependency list for this package is in `pyproject.toml`.

## Core dependencies

- **AbstractRuntime**: durable run model, workflow registry, file/SQLite stores, and runtime tick loop.

## Optional integrations (feature-dependent)

These integrations back specific modes or features. Entries marked "base install" ship with the default `abstractgateway` package:

- **FastAPI** (via **Starlette**) + **Pydantic**: HTTP API surface and request/response models (base install).
- **Uvicorn**: ASGI server used by `abstractgateway serve` (base install).
- **python-multipart**: multipart upload support for bundle/attachment endpoints (base install).
- **AbstractFlow**: workflow authoring and bundling (Gateway does not depend on it; it only runs the resulting `.flow` bundles).
- **AbstractCore** integration (via base `abstractruntime`): LLM/tool execution wiring, embeddings client, Telegram TDLib wrapper.
- **AbstractAgent**: Visual Agent nodes in bundle mode.
- **AbstractMemory** + **LanceDB**: `memory_kg_*` nodes in bundle mode (knowledge graph storage).
- **TDLib**: Telegram Secret Chats support when using the TDLib transport.

## Dev/test tooling

- **pytest** and **httpx**: test suite and HTTP client utilities used under `tests/`.
- **hatchling**: Python packaging/build backend.

## Contributors

Thank you to everyone who reports issues, improves documentation, and contributes code.

---

## docs/api.md

# AbstractGateway — API overview

The HTTP API is implemented with FastAPI under the `/api` prefix:
- Health: `GET /api/health`
- Gateway surface: `/api/gateway/*` (durable runs + operator tooling)

The API is documented at runtime:
- OpenAPI JSON: `GET /openapi.json`
- Swagger UI: `GET /docs` (use **Authorize** to paste the bearer token)

Context:
- In the AbstractFramework ecosystem, UIs and automations call this API to operate **AbstractRuntime** runs.
- Architecture diagram and core concepts: [architecture.md](docs/architecture.md)

## Auth

By default, `/api/gateway/*` is protected by `GatewaySecurityMiddleware` (bearer token + origin allowlist).
See: [security.md](docs/security.md).

All examples below assume:

```bash
export BASE_URL="http://127.0.0.1:8080"
export AUTH="Authorization: Bearer $ABSTRACTGATEWAY_AUTH_TOKEN"
```

## Provider endpoint profiles

Gateway-owned provider endpoint profiles let users create reusable hosted
endpoints without putting raw API keys in workflow JSON or browser storage.

- `GET /api/gateway/config/provider-endpoint-profiles`: list visible profiles.
- `POST /api/gateway/config/provider-endpoint-profiles`: create a user- or
  admin-owned profile.
- `POST /api/gateway/config/provider-endpoint-profiles/discover-models`:
  discover models for a draft or saved profile by calling the configured
  provider family and base URL with the entered or server-side key. The raw key
  is never returned.
- `PUT` or `DELETE /api/gateway/config/provider-endpoint-profiles/{profile_id}`:
  update or delete a profile.

Enabled profiles appear in `GET /api/gateway/discovery/providers` as virtual
providers such as `endpoint:office-vllm`. Model discovery through
`GET /api/gateway/discovery/providers/{provider_name}/models` returns either the
fixed profile allowlist or the live endpoint model catalog.

## Core workflow lifecycle

### 1) List bundles (bundle mode)

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/bundles"
```

Upload a bundle:

```bash
curl -sS -H "$AUTH" \
  -F "file=@./my-bundle@0.1.0.flow" \
  -F "overwrite=false" \
  -F "reload=true" \
  "$BASE_URL/api/gateway/bundles/upload"
```

### 2) Start a run

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"bundle_id":"my-bundle","input_data":{"prompt":"Hello"}}' \
  "$BASE_URL/api/gateway/runs/start"
```

If you need a specific entrypoint:

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"bundle_id":"my-bundle","flow_id":"ac-echo","input_data":{"prompt":"Hello"}}' \
  "$BASE_URL/api/gateway/runs/start"
```

Evidence: request/response models live in `src/abstractgateway/routes/gateway.py` (`StartRunRequest`, `start_run`).
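
The same request can be assembled from Python with only the standard library. This is a minimal sketch, assuming the `BASE_URL` and bearer-token conventions from the curl examples; `build_start_run_request` is a hypothetical helper name, not something shipped by `abstractgateway`, and the response body is not interpreted here.

```python
import json
import urllib.request


def build_start_run_request(base_url, token, bundle_id, input_data, flow_id=None):
    """Build (but do not send) a POST to /api/gateway/runs/start."""
    payload = {"bundle_id": bundle_id, "input_data": input_data}
    if flow_id is not None:
        payload["flow_id"] = flow_id  # optional: select a specific entrypoint
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/gateway/runs/start",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Bearer " + token,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder token for illustration; read the real one from your environment.
request = build_start_run_request(
    "http://127.0.0.1:8080", "example-token", "my-bundle", {"prompt": "Hello"}
)
print(request.get_method(), request.full_url)

# To actually send it against a running gateway:
#   with urllib.request.urlopen(request) as resp:
#       print(json.load(resp))
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without a live server.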

### 2b) Schedule a run (bundle mode)

`POST /api/gateway/runs/schedule` starts a **scheduled parent run** that launches the target workflow as child runs over time.

Example (run 3 times, every hour, starting now):

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"bundle_id":"my-bundle","flow_id":"ac-echo","input_data":{"prompt":"Ping"},"start_at":"now","interval":"1h","repeat_count":3,"share_context":true,"session_id":"sess-1"}' \
  "$BASE_URL/api/gateway/runs/schedule"
```

Notes:
- `start_at`: ISO 8601 timestamp (recommended) or `"now"`.
- `interval`: e.g. `"15m"`, `"1h"`, `"2d"`. If omitted, runs once.
- `repeat_count`: if omitted and `interval` is set, repeats forever. Alternatively use `repeat_until` (ISO 8601).
- To stop a schedule, cancel the scheduled parent run via `POST /api/gateway/commands` with type `cancel`.

Evidence: `ScheduleRunRequest`, `start_scheduled_run` in `src/abstractgateway/routes/gateway.py`.
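
A schedule body can be built so that unset optional fields are simply left out, which lets the defaults from the notes above apply (run once without `interval`, repeat forever without `repeat_count`). The sketch below is an illustration under that assumption; `build_schedule_payload` is a hypothetical helper, and it performs no validation of its own.

```python
import json


def build_schedule_payload(bundle_id, flow_id, input_data, start_at="now",
                           interval=None, repeat_count=None, repeat_until=None,
                           **extra):
    """Return a JSON-ready body for POST /api/gateway/runs/schedule."""
    payload = {
        "bundle_id": bundle_id,
        "flow_id": flow_id,
        "input_data": input_data,
        "start_at": start_at,
    }
    optional = {
        "interval": interval,
        "repeat_count": repeat_count,
        "repeat_until": repeat_until,
    }
    # Omit unset fields instead of sending nulls.
    payload.update({k: v for k, v in optional.items() if v is not None})
    payload.update(extra)  # e.g. share_context, session_id
    return payload


body = build_schedule_payload(
    "my-bundle", "ac-echo", {"prompt": "Ping"},
    interval="1h", repeat_count=3, share_context=True, session_id="sess-1",
)
print(json.dumps(body, indent=2))
```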

### 2c) Shared workflow catalog

Private `/api/gateway/bundles` routes are scoped to the signed-in user's routed
runtime. Shared/default workflows use the Gateway workflow catalog instead:

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/workflow-catalog"
```

Admin-only catalog operations live under
`/api/gateway/admin/workflow-catalog/*`:

- upload or promote immutable `.flow` versions;
- move a bundle's default pointer;
- set ACLs;
- deprecate, block, or tombstone a version without deleting bundle bytes.

Start a catalog workflow in the requesting user's runtime by setting
`registry_scope`:

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"registry_scope":"tenant_catalog","bundle_id":"basic-agent","flow_id":"root","input_data":{"prompt":"Hello"}}' \
  "$BASE_URL/api/gateway/runs/start"
```

If `bundle_version` is omitted, Gateway uses the admin-managed catalog default
pointer. Exact older versions keep working until that specific version is
deprecated, blocked, or tombstoned.

Catalog scope is explicit: omitting `registry_scope` starts only private
runtime bundles. Flow/schema inspection for catalog workflows should use the
ACL-aware catalog endpoints:

- `GET /api/gateway/workflow-catalog/{bundle_id}/versions/{bundle_version}/flows/{flow_id}`
- `GET /api/gateway/workflow-catalog/{bundle_id}/versions/{bundle_version}/flows/{flow_id}/input_schema`

`framework_catalog` is reserved but not loadable yet; use `tenant_catalog`.

### 3) Replay the ledger (cursor-based)

Ledger replay is cursor-based: `after` is the number of items the client has already consumed, so `after=0` starts from the beginning of the ledger.

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/runs/<run_id>/ledger?after=0&limit=200"
```

Response shape:
- `items`: list of durable ledger records
- `next_after`: the next cursor to use

Evidence: `src/abstractgateway/routes/gateway.py` (`get_ledger`).
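
A replay loop over this endpoint could look like the sketch below. It assumes only the documented `items` / `next_after` response fields; `drain_ledger`, `http_fetch_page`, and the in-memory fake ledger are hypothetical names used for illustration, and the record contents are made up.

```python
import json
import urllib.request


def drain_ledger(fetch_page, after=0, limit=200):
    """Replay ledger pages until caught up.

    `fetch_page(after, limit)` returns a dict with `items` and `next_after`.
    Returns (records, cursor); persist the cursor to resume later.
    """
    records = []
    while True:
        page = fetch_page(after, limit)
        items = page.get("items") or []
        next_after = page.get("next_after", after + len(items))
        if not items or next_after <= after:
            return records, after  # caught up, or the cursor did not advance
        records.extend(items)
        after = next_after


def http_fetch_page(base_url, token, run_id):
    """Build a fetch_page callable backed by GET .../runs/<run_id>/ledger."""
    def fetch_page(after, limit):
        url = f"{base_url}/api/gateway/runs/{run_id}/ledger?after={after}&limit={limit}"
        req = urllib.request.Request(url, headers={"Authorization": f"Bearer {token}"})
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    return fetch_page


# Offline demonstration with an in-memory stand-in for the endpoint.
FAKE_LEDGER = [{"seq": i} for i in range(5)]  # made-up records


def fake_fetch_page(after, limit):
    items = FAKE_LEDGER[after:after + limit]
    return {"items": items, "next_after": after + len(items)}


records, cursor = drain_ledger(fake_fetch_page, after=0, limit=2)
print(len(records), cursor)  # prints: 5 5
```

Passing the page fetcher in as a callable keeps the cursor logic testable without a running gateway; swap in `http_fetch_page(...)` for real use.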

### 3b) Replay ledgers for multiple runs (batch)

Use `POST /api/gateway/runs/ledger/batch` to reduce request fanout when observing many runs/subflows.

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"limit":200,"runs":[{"run_id":"<run_id_1>","after":0},{"run_id":"<run_id_2>","after":0}]}' \
  "$BASE_URL/api/gateway/runs/ledger/batch"
```

Evidence: `src/abstractgateway/routes/gateway.py` (`get_ledger_batch`).
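
When a client tracks one cursor per run, the batch body can be derived from that mapping. A small sketch, assuming the request shape shown in the curl example above; `build_ledger_batch_payload` is a hypothetical helper, and the batch response shape is deliberately not modelled here.

```python
import json


def build_ledger_batch_payload(cursors, limit=200):
    """Turn a {run_id: after} mapping into a ledger batch request body."""
    return {
        "limit": limit,
        "runs": [{"run_id": run_id, "after": after} for run_id, after in cursors.items()],
    }


# Placeholder run ids; real ids come from the start-run response.
cursors = {"<run_id_1>": 0, "<run_id_2>": 40}
print(json.dumps(build_ledger_batch_payload(cursors)))
```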

### 4) Stream ledger updates (SSE)

SSE is an optimization; clients should always be able to reconnect by replaying from the last `next_after`.

```bash
curl -N -H "$AUTH" "$BASE_URL/api/gateway/runs/<run_id>/ledger/stream?after=0"
```

Evidence: `src/abstractgateway/routes/gateway.py` (`stream_ledger`).
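
On the client side, the stream can be consumed with a small parser for the standard SSE wire format (`event:` / `data:` fields, blank line as event terminator, `:` comment lines). The sketch below follows that generic format only; the event names and payloads Gateway actually emits are not specified in this section, so the sample lines are made up.

```python
def iter_sse_events(lines):
    """Yield (event_name, data) tuples from an iterable of SSE text lines."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "event":
            event = value
        elif field == "data":
            data.append(value)


# Made-up sample stream; "step" is a hypothetical event name.
sample = [
    ": keep-alive\n",
    "event: step\n",
    'data: {"n": 1}\n',
    "\n",
    "data: a\n",
    "data: b\n",
    "\n",
]
for name, payload in iter_sse_events(sample):
    print(name, repr(payload))
```

If the connection drops, a client would fall back to the replay endpoint with its last known cursor and then reopen the stream from there.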

## Artifacts and filesystem handoff

Gateway artifacts are the cross-package representation for files, media, and
large payloads. Thin clients should pass artifact refs across runs instead of
raw bytes or local paths:

```json
{
  "$artifact": "abc123",
  "artifact_id": "abc123",
  "run_id": "session_memory_sess-1",
  "content_type": "image/png",
  "filename": "input.png"
}
```

List run artifacts:

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/runs/<run_id>/artifacts"
```

List artifacts visible to a session:

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/sessions/sess-1/artifacts"
```

Search artifacts across Gateway storage:

```bash
curl -sS -H "$AUTH" \
  "$BASE_URL/api/gateway/artifacts/search?scope=all&modality=image&query=logo&tags=pin_id=image"
```

`scope` can be `all`, `session`, or `run`. Use `session_id` with
`scope=session` and `run_id` with `scope=run`; omit both for `scope=all`.
`modality` filters normalized artifact type (`image`, `audio`, `video`,
`text`, `document`, `music`, `voice`, or `artifact`), `content_type` accepts
exact values or prefixes such as `image/*`, and `tags` accepts either a JSON
object or comma-separated `key=value` filters.
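
The parameter rules above can be encoded in a small URL builder. This is a sketch, not Gateway code: the function name and the `tags_as_json` switch are hypothetical, and the validation only covers the scope rules stated in this section.

```python
import json
from typing import Mapping, Optional
from urllib.parse import urlencode


def artifact_search_url(
    base_url: str,
    *,
    scope: str = "all",
    session_id: Optional[str] = None,
    run_id: Optional[str] = None,
    modality: Optional[str] = None,
    content_type: Optional[str] = None,
    query: Optional[str] = None,
    tags: Optional[Mapping[str, str]] = None,
    tags_as_json: bool = False,
) -> str:
    """Build an artifact search URL that follows the documented scope rules."""
    if scope not in ("all", "session", "run"):
        raise ValueError(f"unknown scope: {scope!r}")
    if scope == "session" and not session_id:
        raise ValueError("scope=session requires session_id")
    if scope == "run" and not run_id:
        raise ValueError("scope=run requires run_id")
    if scope == "all" and (session_id or run_id):
        raise ValueError("scope=all must omit session_id and run_id")

    params = {"scope": scope}
    optional = (
        ("session_id", session_id),
        ("run_id", run_id),
        ("modality", modality),
        ("content_type", content_type),
        ("query", query),
    )
    params.update({key: value for key, value in optional if value is not None})
    if tags:
        # Either a JSON object or comma-separated key=value pairs is accepted.
        if tags_as_json:
            params["tags"] = json.dumps(dict(tags))
        else:
            params["tags"] = ",".join(f"{k}={v}" for k, v in tags.items())
    return f"{base_url.rstrip('/')}/api/gateway/artifacts/search?{urlencode(params)}"
```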

Import a server workspace path into a session artifact:

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"session_id":"sess-1","source":{"kind":"workspace_path","path":"inputs/photo.png"},"pin_id":"image"}' \
  "$BASE_URL/api/gateway/artifacts/import"
```

Export an artifact back into the server workspace:

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"path":"outputs/photo.png","create_parent_dirs":true,"overwrite":false}' \
  "$BASE_URL/api/gateway/runs/<run_id>/artifacts/<artifact_id>/export"
```

Import and export use the same Gateway workspace policy as file helpers:
workspace roots, mounted roots, ignored paths, and size limits are enforced on
the server. Browser-local files should be uploaded through
`POST /api/gateway/attachments/upload`; browser-local file paths are not
interpreted as Gateway workspace paths. In hosted user-auth mode, server
workspace import/export and `/files/*` helpers require an admin principal.
Ordinary users can still upload browser-local files and list/search artifacts in
their own routed runtime.

## Durable commands (`POST /api/gateway/commands`)

Commands are appended to a durable inbox and applied asynchronously by the runner.

Request fields (see `SubmitCommandRequest` in `src/abstractgateway/routes/gateway.py`):
- `command_id`: client-supplied idempotency key (UUID recommended)
- `run_id`: target run id (or session id for some event use-cases)
- `type`: `pause|resume|cancel|emit_event|update_schedule|compact_memory`
- `payload`: command-specific object
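
A minimal Python sketch of building this envelope is shown below. The helper name is hypothetical; the field names and the allowed `type` values come from the list above.

```python
import uuid
from typing import Any, Dict, Mapping, Optional

COMMAND_TYPES = frozenset(
    {"pause", "resume", "cancel", "emit_event", "update_schedule", "compact_memory"}
)


def build_command(
    run_id: str,
    type_: str,
    payload: Optional[Mapping[str, Any]] = None,
    command_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Return a command envelope for POST /api/gateway/commands."""
    if type_ not in COMMAND_TYPES:
        raise ValueError(f"unsupported command type: {type_!r}")
    return {
        # Generate the idempotency key once and reuse it when retrying.
        "command_id": command_id or str(uuid.uuid4()),
        "run_id": run_id,
        "type": type_,
        "payload": dict(payload or {}),
    }
```

Because `command_id` is an idempotency key, a client that retries after a network error would normally resend the same envelope rather than build a new one.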

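### Pause / cancel

The example below pauses a run. `cancel` goes through the same request envelope with `"type":"cancel"`.
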
```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"command_id":"'"$(python -c 'import uuid; print(uuid.uuid4())')"'", "run_id":"<run_id>", "type":"pause", "payload":{"reason":"operator_pause"}}' \
  "$BASE_URL/api/gateway/commands"
```

### Resume a paused run

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"command_id":"'"$(python -c 'import uuid; print(uuid.uuid4())')"'", "run_id":"<run_id>", "type":"resume", "payload":{}}' \
  "$BASE_URL/api/gateway/commands"
```

### Resume a WAITING run with a payload (WAIT resume)

When `payload.payload` is present, the runner interprets this as “resume a WAITING run with a durable payload”:

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"command_id":"'"$(python -c 'import uuid; print(uuid.uuid4())')"'", "run_id":"<run_id>", "type":"resume", "payload":{"wait_key":"<optional_wait_key>", "payload":{"approved":true}}}' \
  "$BASE_URL/api/gateway/commands"
```

Evidence: `src/abstractgateway/runner.py` (`_apply_command`, `_apply_run_control`).

### Emit an external event

Minimal form:

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"command_id":"'"$(python -c 'import uuid; print(uuid.uuid4())')"'", "run_id":"<session_id>", "type":"emit_event", "payload":{"name":"chat.message","payload":{"text":"hi"}}}' \
  "$BASE_URL/api/gateway/commands"
```

Evidence: `src/abstractgateway/runner.py` (`_apply_emit_event`).

## Beyond the core

`/api/gateway/*` also includes optional operator/tooling endpoints (reports inbox, triage queue, backlog browsing + exec runner, process manager, file/attachment helpers, embeddings, voice, discovery, …).
See: [maintenance.md](docs/maintenance.md).

## Discovery endpoints (optional)

These exist to help thin clients adapt to the deployed gateway.

- Capabilities (best-effort): `GET /api/gateway/discovery/capabilities`
- Providers/models discovery (best-effort): `GET /api/gateway/discovery/providers`, `GET /api/gateway/discovery/providers/{provider}/models`
- Dynamic capability catalogs: `GET /api/gateway/voice/voices`, `GET /api/gateway/audio/speech/models`, `GET /api/gateway/audio/transcriptions/models`, `GET /api/gateway/audio/music/providers`, `GET /api/gateway/audio/music/models`, `GET /api/gateway/vision/provider_models`

The capabilities payload includes package presence (`abstractruntime`,
`abstractcore`, `abstractmemory`, `abstractvoice`, `abstractvision`), existing
gateway helpers (`tools`, `visualflow`, `media`), memory-store readiness, and
AbstractCore capability plugin status for `voice`, `audio`, `vision`, and
`music`.

Today the route paths and contract descriptors are the stable part of this
surface. Catalog routes now also include a stable Gateway-owned envelope:

- `catalog.contract = gateway_catalog_v1`
- `catalog.version = 1`
- `items = [...]`

Legacy lower-layer fields are still preserved for compatibility. New thin
clients should read `catalog` plus `items`; older clients can keep using
route-specific fields like `models`, `provider_models`, `profiles`, or `voices`.
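
A thin client can prefer the envelope and fall back to the legacy fields. The sketch below assumes the legacy fields are plain JSON arrays, which this document does not guarantee for every route; the helper name is hypothetical.

```python
from typing import Any, Dict, List

LEGACY_KEYS = ("models", "provider_models", "profiles", "voices")


def catalog_items(response: Dict[str, Any]) -> List[Any]:
    """Return the catalog entries, preferring the gateway_catalog_v1 envelope."""
    catalog = response.get("catalog")
    items = response.get("items")
    if (
        isinstance(catalog, dict)
        and catalog.get("contract") == "gateway_catalog_v1"
        and isinstance(items, list)
    ):
        return items
    # Older gateways: fall back to the first route-specific list that is present.
    for key in LEGACY_KEYS:
        value = response.get(key)
        if isinstance(value, list):
            return value
    return []
```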

Provider discovery also reports the resolved default provider/model when one is
configured. The resolver follows request values, flow pins, and the execution-host
`output.text` capability route; if no pair exists, the response includes
`default_error` rather than a hardcoded local model.

The capabilities payload also includes a versioned thin-client contract:

- `capabilities.contracts.version`: currently `1`
- `capabilities.contracts.common`: shared run start/list/summary/input/history,
  ledger, artifact, attachment, workspace, discovery, and provider prompt-cache
  controls. `common.artifacts` includes run listing/content, session artifact
  listing, artifact search, workspace import, and workspace export descriptors
  when available. Permission-sensitive descriptors are principal-aware:
  ordinary users see admin-only workspace import/export and provider
  prompt-cache controls marked unavailable with `admin_required` metadata.
- `capabilities.contracts.common.readiness`: compact Gateway-owned
  `gateway_surface_readiness_v1` summary derived from the shared
  endpoint/media/residency descriptors
- `capabilities.contracts.flow_editor`: the AbstractFlow editor/runtime surface
- `capabilities.contracts.assistant`: assistant-facing voice/audio/media/cache
  feature gates
- `capabilities.contracts.abstractcode`: code-client run/history/workspace/cache
  feature gates

Contract booleans are intentionally conservative. Package `installed=true` is
not the same thing as endpoint `available=true`; clients should branch on the
versioned contract fields when enabling controls.
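
A client-side feature gate could look like the sketch below. It assumes each contract descriptor is either a bare boolean or an object with an `available` boolean, and that `contracts` sits at the top level of the payload or under a `capabilities` key; both are assumptions about the payload layout, not documented guarantees.

```python
from typing import Any, Dict


def contract_available(payload: Dict[str, Any], dotted_path: str) -> bool:
    """True only when the descriptor at `dotted_path` is explicitly available."""
    root = payload.get("capabilities", payload)
    node: Any = root.get("contracts", {}) if isinstance(root, dict) else {}
    for part in dotted_path.split("."):
        if not isinstance(node, dict):
            return False
        node = node.get(part)
    if isinstance(node, dict):
        return node.get("available") is True
    # Conservative default: anything other than an explicit True is "off".
    return node is True
```

For example, `contract_available(payload, "flow_editor.media.edited_image")` would gate an image-edit control, and a missing or malformed descriptor keeps the control disabled.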

`common.readiness` is intentionally narrower than provider/backend health. It
summarizes Gateway surface availability from existing descriptors, but it does
not invent selected backend/provider/model truth or stable degraded-state
reason codes.

Evidence: `src/abstractgateway/routes/gateway.py` (`discovery_capabilities`, `discovery_providers`).

## AbstractFlow gateway-first editor contract

The browser editor can use AbstractGateway as its runtime and storage host.

Draft VisualFlow records:

- `GET /api/gateway/visualflows`
- `POST /api/gateway/visualflows`
- `GET /api/gateway/visualflows/{flow_id}`
- `PUT /api/gateway/visualflows/{flow_id}`
- `DELETE /api/gateway/visualflows/{flow_id}`
- `POST /api/gateway/visualflows/{flow_id}/publish`

Bundle inspection and editor run-schema helpers:

- `GET /api/gateway/bundles`
- `GET /api/gateway/bundles/{bundle_id}`
- `GET /api/gateway/bundles/{bundle_id}/flows/{flow_id}`
- `GET /api/gateway/bundles/{bundle_id}/flows/{flow_id}/input_schema`

The input-schema endpoint returns a versioned payload with:

- `version`
- `bundle_id`, `bundle_version`, `bundle_ref`, `flow_id`, `workflow_id`
- `inputs`: entrypoint input pins derived from the `on_flow_start` node
- `defaults`: pin defaults from VisualFlow JSON
- `input_data_schema`: a small JSON Schema object for the Run Flow modal

Example:

```bash
curl -sS -H "$AUTH" \
  "$BASE_URL/api/gateway/bundles/my-bundle/flows/ac-echo/input_schema"
```
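
Once the schema payload is fetched, a client can merge `defaults` with user-supplied values before starting a run. The sketch assumes `input_data_schema` may carry the standard JSON Schema `required` keyword; whether Gateway emits it for a given flow is an assumption, and the helper name is hypothetical.

```python
from typing import Any, Dict, Mapping


def build_input_data(
    schema_payload: Mapping[str, Any],
    overrides: Mapping[str, Any],
) -> Dict[str, Any]:
    """Merge pin defaults with user values and check required pins."""
    data: Dict[str, Any] = dict(schema_payload.get("defaults") or {})
    data.update(overrides)  # user-supplied values win over defaults
    schema = schema_payload.get("input_data_schema") or {}
    missing = [name for name in schema.get("required", []) if name not in data]
    if missing:
        raise ValueError(f"missing required inputs: {', '.join(missing)}")
    return data
```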

The editor observes runs with the core lifecycle endpoints above:
`/runs/start`, `/runs/{run_id}`, `/runs/{run_id}/ledger`,
`/runs/{run_id}/ledger/stream`, `/runs/ledger/batch`,
`/runs/{run_id}/input_data`, `/runs/{run_id}/history_bundle`, and
`/runs/{run_id}/artifacts`.

## Optional multimodal scope

Current direct Gateway endpoints:
- `POST /api/gateway/runs/{run_id}/voice/tts`
- `POST /api/gateway/runs/{run_id}/audio/transcribe`
- `POST /api/gateway/runs/{run_id}/images/generate`
- `POST /api/gateway/runs/{run_id}/images/edit`
- `POST /api/gateway/runs/{run_id}/videos/generate`
- `POST /api/gateway/runs/{run_id}/videos/from_image`
- `POST /api/gateway/runs/{run_id}/music/generate`
- `GET /api/gateway/voice/voices`
- `GET /api/gateway/audio/speech/models`
- `GET /api/gateway/audio/transcriptions/models`
- `GET /api/gateway/audio/music/providers`
- `GET /api/gateway/audio/music/models`
- `GET /api/gateway/vision/provider_models`

The catalog endpoints proxy AbstractCore Server routes when
`ABSTRACTCORE_SERVER_BASE_URL` is configured. Gateway uses explicit Core auth
settings for that hop and never
reuses the Gateway bearer token as a Core/provider secret. Without a configured
Core server, the voice/model routes return bounded static descriptors from
Gateway and capability-package environment variables.

Each catalog route response now includes:

- `catalog`: Gateway-owned route metadata (`contract`, `version`, `kind`,
  `scope`, `route_source`, optional `upstream_source`, and route filters)
- `items`: one canonical primary array for thin clients

Examples:

- `/voice/voices`: `items` contain voice/profile records with `id`, `label`,
  optional `provider`, optional `model`, and `voice_kind`
- `/audio/*/models`: `items` contain model records with `id`, `label`,
  optional `provider`, optional `tasks`, and optional `parameters`
- `/audio/music/providers` and `/discovery/providers`: `items` contain provider
  records with `id`, `label`, and `provider`

Generated images are available through Runtime workflows when a compatible
image backend is installed and configured. Gateway also exposes a direct image
generation endpoint that uses the Runtime/Core output-selector contract rather
than a provider-specific image client. The route stores the generated image as a
run artifact and emits `abstract.media.image.generated` with:

- `run_id`, `request_id`, `prompt`
- optional `provider`, `model`, `size`, `width`, `height`, and `format`
- `image_artifact`: `{"$artifact", "content_type", "filename", "sha256", "size_bytes"}`

If the active workflow runtime already has an AbstractCore LLM client, the route
uses it. For tools-only workflows, the route can create a direct Runtime/Core
client from request `provider`/`model` or the execution-host capability route
default. Unsupported or unconfigured deployments return a structured `ok=false`
response instead of a failed run.

Gateway also exposes a direct image-edit sibling route:

- `POST /api/gateway/runs/{run_id}/images/edit`

The request uses a source `image_artifact`, optional `mask_artifact`, the same
provider/model and image backend selectors as image generation, and returns an
artifact-backed edited image. Thin clients should feature-detect it from
`capabilities.contracts.flow_editor.media.edited_image` or
`capabilities.contracts.assistant.media.edited_image`.

Generated music follows the same direct child-run pattern. Thin clients should
discover it from `capabilities.contracts.flow_editor.media.generated_music` or
`capabilities.contracts.assistant.media.generated_music`, list providers/models
from the music catalog routes, and treat the returned `child_run_id` plus
`music_artifact` as the durable output handle.

Generated video also follows the direct child-run pattern:

- `POST /api/gateway/runs/{run_id}/videos/generate` uses the Runtime/Core
  `output.modality=video` / `task=text_to_video` contract.
- `POST /api/gateway/runs/{run_id}/videos/from_image` accepts a run-visible
  source `image_artifact` and uses `task=image_to_video`.
- Thin clients should discover these routes from
  `capabilities.contracts.flow_editor.media.generated_video` and
  `capabilities.contracts.flow_editor.media.image_to_video` (or the matching
  `assistant.media.*` entries), use
  `GET /api/gateway/vision/provider_models?task=text_to_video|image_to_video`
  for model catalogs, and stream the returned `child_run_id` ledger for
  `abstract.progress` events.

STT and listen contract notes:

- `POST /api/gateway/runs/{run_id}/audio/transcribe` accepts a run-visible
  `audio_artifact` plus optional `language`, `prompt`, `response_format`,
  `temperature`, `format`, `provider`, and `model` hints.
- `capabilities.contracts.flow_editor.voice.stt` and
  `capabilities.contracts.assistant.voice.stt` point to that upload route.
- `capabilities.contracts.flow_editor.voice.listen` and
  `capabilities.contracts.assistant.voice.listen` are host-capture contracts,
  not a live microphone socket. They tell higher apps to capture locally and
  emit an event or upload the resulting audio artifact.

## KG memory

`POST /api/gateway/kg/query` queries the configured AbstractMemory TripleStore.
Gateway resolves the store through:

- `ABSTRACTGATEWAY_MEMORY_STORE_BACKEND=lancedb|memory` (plus `sqlite` when the installed AbstractMemory build exposes `SQLiteTripleStore`)
- `ABSTRACTGATEWAY_MEMORY_STORE_PATH`
- `ABSTRACTGATEWAY_MEMORY_REQUIRE_VECTOR`

Structured queries work with LanceDB and in-memory stores. SQLite also works
when the installed AbstractMemory build exposes `SQLiteTripleStore`. Semantic
`query_text` requires a vector-capable backend plus the execution-host
`embedding.text` route; SQLite returns a clear 400 instead of pretending to
support semantic recall.

Capability discovery reports KG memory as available when AbstractMemory is
installed and the configured backend can be resolved. A fresh persistent store
does not need to exist yet; empty-store structured queries return an empty
result rather than making Flow authoring nodes unavailable.

## Prompt-cache control plane (operator API)

The gateway exposes prompt-cache operator endpoints under `/api/gateway/prompt_cache/*`.
Provider prompt-cache controls affect process-local or remote provider state
and require an admin principal in hosted user-auth mode.

Core endpoints:

- `GET /api/gateway/prompt_cache/capabilities?provider=...&model=...`
- `GET /api/gateway/prompt_cache/stats?provider=...&model=...`
- `POST /api/gateway/prompt_cache/set`
- `POST /api/gateway/prompt_cache/update`
- `POST /api/gateway/prompt_cache/fork`
- `POST /api/gateway/prompt_cache/clear`
- `POST /api/gateway/prompt_cache/prepare_modules`

Behavior:

- These routes use the runtime's AbstractCore prompt-cache client contract rather than directly depending on provider-instance access.
- In local mode they delegate to the in-process provider.
- In remote/hybrid mode they follow whatever `/acore/prompt_cache/*` surface the configured AbstractCore server exposes.
- All core prompt-cache responses include `operation` and `capabilities`, with structured unsupported/error cases (`code="prompt_cache_unsupported"` / `code="prompt_cache_error"` / `code="prompt_cache_unavailable"`).
- These endpoints remain provider/model controls, not a Gateway-owned CachedSession persistence system.

Session lifecycle endpoints:

- `GET /api/gateway/sessions/{session_id}/prompt_cache/status`
- `POST /api/gateway/sessions/{session_id}/prompt_cache/prepare`
- `POST /api/gateway/sessions/{session_id}/prompt_cache/rebuild`
- `POST /api/gateway/sessions/{session_id}/prompt_cache/clear`

These routes derive a deterministic bounded namespace/key from `session_id`,
`bundle_id`, `bundle_version`, `flow_id`, `provider`, `model`, optional
`template_id`, and `version`. The private hash also includes the authenticated
principal scope, so two hosted users using the same session id/provider/model do
not collide in a shared provider control plane; the returned `identity` remains
portable app-level data and does not expose that private scope. These routes
expose three honest modes:

- `unsupported`: provider/model does not expose prompt-cache support; responses include `supported=false`, `ok=false`, and capabilities.
- `keyed`: gateway returns a stable `runtime_hint`/`prompt_cache_key` for Runtime/Core injection, but does not claim module preparation occurred.
- `local_control_plane`: gateway uses supported provider operations such as `prepare_modules`, `fork`, `set`, `clear`, and `stats`.

`status` is read-only. `prepare` accepts optional modules (`system_prompt`,
`workflow_instructions`, `tools`, `pinned_attachments`) and returns either
provider operation results or a key hint. `rebuild` is clear-plus-prepare for
providers that expose clear controls.
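
To illustrate the idea of a deterministic, bounded, principal-scoped key, here is a sketch using SHA-256 over canonical JSON. It is not Gateway's actual derivation: the `pc-` prefix, the 64-character bound, and the way the private scope is mixed in are all assumptions made for the example.

```python
import hashlib
import json
from typing import Any, Mapping


def session_cache_key(
    identity: Mapping[str, Any],
    principal_scope: str,
    max_len: int = 64,
) -> str:
    """Derive a stable key from public identity fields plus a private scope."""
    # Canonical JSON makes the key independent of dict ordering.
    canonical = json.dumps(identity, sort_keys=True, separators=(",", ":"))
    material = f"{principal_scope}\n{canonical}".encode("utf-8")
    digest = hashlib.sha256(material).hexdigest()
    return f"pc-{digest}"[:max_len]
```

The private scope only enters the hash input, so the `identity` mapping can be returned to clients without revealing it.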

Durable bloc exact-reuse endpoints:

- `POST /api/gateway/blocs/upsert_text`
- `GET /api/gateway/blocs/record`
- `GET /api/gateway/blocs`
- `POST /api/gateway/blocs/delete`
- `GET /api/gateway/blocs/kv/manifest`
- `GET /api/gateway/blocs/kv/list`
- `POST /api/gateway/blocs/kv/ensure`
- `POST /api/gateway/blocs/kv/load`
- `POST /api/gateway/blocs/kv/delete`
- `POST /api/gateway/blocs/kv/prune`

These routes are the primary app-facing durable prompt-cache path:

- create or identify a durable text bloc;
- ensure or load a KV artifact for a target local provider/model;
- use the returned `prompt_cache_binding` in later Runtime-backed generation;
- list/delete/prune artifacts without reaching into provider-private cache state.

They delegate through Runtime's public AbstractCore host facade rather than
proxying Core directly. They are operator-style host controls, so the routes
themselves are not ledgered run execution; the ledgered exact-reuse path is the
later `LLM_CALL.params.prompt_cache_binding` used inside real Runtime runs.

Host-local prompt-cache export/import admin aliases:

- `GET /api/gateway/prompt_cache/saved`
- `POST /api/gateway/prompt_cache/save`
- `POST /api/gateway/prompt_cache/load`

These remain explicitly local/operator-oriented:

- the route paths are compatibility aliases, but the implementation delegates to Runtime's public host facade:
  - `saved` -> `list_prompt_cache_exports(...)`
  - `save` -> `prompt_cache_export(...)`
  - `load` -> `prompt_cache_import(...)`
- local bundle/file runtimes store these exports under the Gateway data dir at `prompt_cache_exports/`
- remote and hybrid runtimes return `code=prompt_cache_local_only`
- response payloads follow Runtime's host-local export/import contract, including `operation`, `local_only`, `artifact_*`, `capabilities`, and `provider_response`

## Email inbox (operator UI; optional)

These endpoints power AbstractObserver’s **Inbox → Email** UI. They are **account-scoped**: the browser cannot supply arbitrary IMAP/SMTP host/user credentials. The gateway host must be configured with one or more email accounts (multi-account YAML or env vars).

Endpoints:
- `GET /api/gateway/email/accounts`
- `GET /api/gateway/email/messages?account=…&mailbox=…&since=…&status=…&limit=…`
- `GET /api/gateway/email/messages/{uid}?account=…&mailbox=…&max_body_chars=…`
- `POST /api/gateway/email/send`

Examples:

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/email/accounts"
```

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/email/messages?status=unread&since=7d&limit=20"
```

```bash
curl -sS -H "$AUTH" "$BASE_URL/api/gateway/email/messages/12345?max_body_chars=20000"
```

```bash
curl -sS -H "$AUTH" -H "Content-Type: application/json" \
  -d '{"to":"you@example.com","subject":"Hello","body_text":"Hi!"}' \
  "$BASE_URL/api/gateway/email/send"
```

Configuration notes (gateway host):
- Multi-account: set `ABSTRACT_EMAIL_ACCOUNTS_CONFIG=/path/to/emails.yaml` (recommended).
- Single-account env fallback: set `ABSTRACT_EMAIL_IMAP_*` and/or `ABSTRACT_EMAIL_SMTP_*`.
- The secret itself must be present in the env var referenced by `*_PASSWORD_ENV_VAR` (e.g. `EMAIL_PASSWORD=...`).

Evidence: `src/abstractgateway/routes/gateway.py` (`/email/accounts|messages|send`), which proxies to the Runtime AbstractCore comms facade.

Troubleshooting and common questions: [faq.md](docs/faq.md).

---

## docs/security.md

# AbstractGateway — Security guide

AbstractGateway secures the **gateway API surface** (`/api/gateway/*`) using an ASGI middleware:
`GatewaySecurityMiddleware` in `src/abstractgateway/security/gateway_security.py`.

Notes:
- `/api/health` is intentionally not protected.
- `/api/triage/action/*` uses signed action tokens and is not under `/api/gateway` (see `src/abstractgateway/routes/triage.py`).
- Vulnerability reporting policy: see [../SECURITY.md](SECURITY.md).

## Default behavior (token required)

By default, `abstractgateway serve` refuses to start if write endpoints are protected and no auth token is configured.
Evidence: startup self-check in `src/abstractgateway/cli.py` (`load_gateway_auth_policy_from_env`).

Recommended dev setup:

```bash
export ABSTRACTGATEWAY_AUTH_TOKEN="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
```

Clients must send:

```text
Authorization: Bearer <token>
```

### Tenant and user isolation

In local/single-user mode, `ABSTRACTGATEWAY_AUTH_TOKEN` remains a gateway-level
control-plane token and maps to the `local-admin` principal. Treat that token as
full authority for the Gateway instance.

Hosted user-auth mode is enabled with `ABSTRACTGATEWAY_USER_AUTH=1` or
`ABSTRACTGATEWAY_AUTH_MODE=users`. In that mode, Gateway bearer tokens resolve
to concrete principals with `tenant_id`, `user_id`, roles/scopes, and a token
fingerprint. `GET /api/gateway/me` returns the resolved principal and routing
mode. The presence of an `auth/users.json` registry file is readiness state; it
does not silently enable hosted user auth unless `ABSTRACTGATEWAY_USER_AUTH_AUTO=1`
is set for compatibility. Admin principals can manage users through:

- `GET /api/gateway/admin/users`
- `POST /api/gateway/admin/users`
- `GET /api/gateway/admin/users/{user_id}?tenant_id=...`
- `PATCH /api/gateway/admin/users/{user_id}?tenant_id=...`
- `DELETE /api/gateway/admin/users/{user_id}?tenant_id=...`
- `GET /api/gateway/admin/runtime-reservations`
- `POST /api/gateway/admin/runtime-reservations/{runtime_id}/transfer`
- `POST /api/gateway/admin/runtime-reservations/{runtime_id}/purge`

Gateway stores user token hashes in `<ABSTRACTGATEWAY_DATA_DIR>/auth/users.json`
by default. Generated or rotated bearer tokens are returned once from the admin
create/update response and are never stored in plaintext.

Browser apps should exchange user bearer tokens for Gateway browser sessions
instead of storing bearer tokens. `POST /api/gateway/session/login` accepts a
Gateway user id and user token, validates them against the registry, and sets an
opaque signed session id plus a CSRF token as cookies. The JSON response body
does not expose those values. Gateway stores session records in
`<ABSTRACTGATEWAY_DATA_DIR>/auth/sessions.json` by default.
The session cookie is HTTP-only; the CSRF cookie is readable by the hosting app
so it can send the CSRF header. Both cookies use path `/` and `SameSite=Lax`.
Plain HTTP local-dev responses do not set `Secure`; HTTPS responses, including
requests forwarded with `X-Forwarded-Proto: https`, do set `Secure`.
Non-remembered sessions omit `Max-Age`; remembered sessions include one.
Session-authenticated mutating requests must send:

```text
X-AbstractGateway-Session: <session id>
X-AbstractGateway-CSRF: <csrf token>
```
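
The cookie rules described above can be summarized as a small decision function. This is an illustrative sketch of the documented policy, not Gateway's implementation; the function name and the returned dictionary layout are hypothetical, and the 30-day default mirrors the documented remember-me default.

```python
from typing import Any, Dict, Optional

THIRTY_DAYS_S = 30 * 24 * 3600


def session_cookie_attrs(
    scheme: str,
    forwarded_proto: Optional[str] = None,
    *,
    remember: bool = False,
    remember_ttl_s: int = THIRTY_DAYS_S,
    http_only: bool = True,
) -> Dict[str, Any]:
    """Compute cookie attributes for the session (or CSRF) cookie."""
    # Secure is set for HTTPS, including requests forwarded as HTTPS by a proxy.
    secure = (
        scheme.lower() == "https"
        or (forwarded_proto or "").strip().lower() == "https"
    )
    attrs: Dict[str, Any] = {
        "Path": "/",
        "SameSite": "Lax",
        "HttpOnly": http_only,  # pass False for the readable CSRF cookie
        "Secure": secure,
    }
    if remember:
        attrs["Max-Age"] = remember_ttl_s  # non-remembered sessions omit Max-Age
    return attrs
```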

`POST /api/gateway/session/logout` revokes the session. Disabling, deleting, or
rotating the Gateway user invalidates existing browser sessions for that user.

When user auth is active, the Gateway service composition root routes each
principal to an isolated service/data plane under:

```text
<ABSTRACTGATEWAY_DATA_DIR>/users/<tenant_id>/<runtime_id>/runtime
<ABSTRACTGATEWAY_DATA_DIR>/users/<tenant_id>/<runtime_id>/flows
```
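
As a sketch of that layout, the helper below computes both roots for a principal. The id validation pattern is an assumption added to keep the example safe against path traversal; Gateway's real validation rules are not described here.

```python
import re
from pathlib import Path
from typing import Dict

# Hypothetical rule: ids are single path components starting with an alphanumeric.
_SAFE_ID = re.compile(r"^[A-Za-z0-9][A-Za-z0-9._-]*$")


def principal_roots(data_dir: str, tenant_id: str, runtime_id: str) -> Dict[str, Path]:
    """Return the per-principal runtime and flows directories."""
    for label, value in (("tenant_id", tenant_id), ("runtime_id", runtime_id)):
        if not _SAFE_ID.match(value):
            raise ValueError(f"unsafe {label}: {value!r}")
    base = Path(data_dir) / "users" / tenant_id / runtime_id
    return {"runtime": base / "runtime", "flows": base / "flows"}
```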

Gateway rejects duplicate `runtime_id` values within the same tenant during
user creation and update. This keeps the default multi-user invariant at
`1 user = 1 runtime`. Deleting a user removes the credential but reserves the
retained runtime id for that principal, so another same-tenant user cannot be
assigned to retained data by accident. Reusing the same runtime id in a
different tenant remains valid.

Admins can intentionally resolve retained runtime reservations through
admin-only lifecycle routes. Transfer assigns a retained runtime to an existing
same-tenant user and reserves that user's previous runtime id. Purge requires an
exact `confirm_runtime_id`, deletes the retained runtime root under
`<ABSTRACTGATEWAY_DATA_DIR>/users/<tenant_id>/<runtime_id>/`, then releases the
runtime id for reuse. Regular users cannot list, transfer, or purge retained
runtime reservations.

Clients must not send authoritative `user_id`, `tenant_id`, `runtime_id`, or
workspace-root values. Runtime fields such as `actor_id` and `session_id`, and
references such as `run_id`, `artifact_id`, and memory `owner_id`, remain
correlation and lookup fields; they do not authorize access by themselves.

Hosted multi-user mode is an incremental surface: the core request path now has
principal auth and per-principal services. Gateway also applies a central
route-family authorization table for operator/admin surfaces. Admin-only route
families include user management, audit, process control, backlog/triage/report
operations, email bridge routes, host metrics, model residency list/load/unload,
server workspace file helpers, server-workspace artifact import/export, and
global prompt-cache/bloc mutation routes. Regular users remain able to use
their own runtime data plane for run, ledger, artifact upload, discovery, and
per-principal capability-default routes.

The route table is intentionally conservative around server filesystem access:
browser-local files should use `/api/gateway/attachments/upload`; server
workspace reads/imports/exports require an admin principal until a stronger
per-user workspace grant model exists.

Capability discovery follows the same policy. Regular users can still discover
ordinary run, ledger, artifact, upload, provider/model catalog, KG, and
per-principal defaults surfaces, but admin-only workspace artifact
import/export and provider prompt-cache controls are advertised as unavailable
with machine-readable `admin_required` metadata. Session-level prompt-cache
keys remain available for users; the private hash includes the current
principal scope, so two users using the same session id/provider/model tuple do
not collide in a shared provider control plane.

Hosted provider secrets are supported through Gateway provider endpoint
profiles. Profiles are stored under the relevant Gateway data plane, expose only
non-secret metadata and a virtual provider id such as `endpoint:office-vllm`,
and inject the raw key only into the transient Runtime provider call. Normal
users can manage user-scoped profiles; Gateway-scoped profiles require an admin
principal. The current capability-default cascade still uses execution-host Core
defaults, then the Gateway/root baseline, then a per-user overlay under the
user's Gateway data plane. A stronger encrypted vault, audit model, and
bridge/delegated-tool propagation policy remain future hardening work.

### Shared workflow catalog

Do not share workflows by pointing multiple users at another user's private
bundle directory. Private `/api/gateway/bundles` routes stay scoped to the
current principal's runtime. Shared/default workflows belong in the Gateway
workflow catalog:

- catalog versions are immutable by `scope + tenant + bundle_id +
  bundle_version + sha256`;
- admins move explicit default pointers instead of overwriting existing
  versions;
- catalog ACLs are checked at run start against the authenticated principal's
  tenant, roles, and user id;
- catalog runs execute in the requesting user's runtime by default;
- catalog run policy is Gateway-issued and HMAC-signed before it is handed to
  Runtime state; client-supplied `_runtime.workflow_policy` values are stripped;
- private bundle inspection routes reject catalog-internal bundle ids, so
  catalog flow/schema inspection remains ACL-aware;
- deprecate/block/tombstone changes block new starts without deleting stored
  bundle bytes.
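
The signing and stripping behavior can be illustrated with Python's standard `hmac` module. This is a sketch of the general technique only: the canonical-JSON encoding, the SHA-256 choice, the hex signature, and the assumption that `_runtime` lives in the client-supplied run input are all hypothetical and do not describe Gateway's wire format.

```python
import hashlib
import hmac
import json
from typing import Any, Dict, Mapping


def sign_policy(policy: Mapping[str, Any], secret: bytes) -> str:
    """Return a hex HMAC-SHA256 over the canonical JSON form of `policy`."""
    canonical = json.dumps(policy, sort_keys=True, separators=(",", ":"))
    return hmac.new(secret, canonical.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_policy(policy: Mapping[str, Any], signature: str, secret: bytes) -> bool:
    """Constant-time comparison against a freshly computed signature."""
    return hmac.compare_digest(sign_policy(policy, secret), signature)


def strip_client_policy(input_data: Mapping[str, Any]) -> Dict[str, Any]:
    """Drop any client-supplied `_runtime.workflow_policy` before the run starts."""
    cleaned = dict(input_data)
    runtime = cleaned.get("_runtime")
    if isinstance(runtime, dict):
        cleaned["_runtime"] = {
            key: value for key, value in runtime.items() if key != "workflow_policy"
        }
    return cleaned
```

Stripping first and signing afterwards means the only policy Runtime ever sees is one the server issued.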

Catalog mutation routes are admin-only under
`/api/gateway/admin/workflow-catalog/*`. User-visible catalog discovery is
available at `GET /api/gateway/workflow-catalog`.

## Origin allowlist (browser/origin defense)

If the request includes an `Origin` header, the middleware enforces `ABSTRACTGATEWAY_ALLOWED_ORIGINS` using glob-style patterns (fnmatch).

Examples:

```bash
export ABSTRACTGATEWAY_ALLOWED_ORIGINS="http://localhost:*,http://127.0.0.1:*"
# or (example) an ngrok domain:
export ABSTRACTGATEWAY_ALLOWED_ORIGINS="https://*.ngrok-free.app"
```

Evidence: `GatewayAuthPolicy.allowed_origins` and `_origin_allowed()` in `src/abstractgateway/security/gateway_security.py`.
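
Glob matching against the allowlist can be sketched with the standard `fnmatch` module. The helper names are hypothetical, and the use of the case-sensitive `fnmatchcase` is a choice made for this example rather than a statement about `_origin_allowed()`.

```python
from fnmatch import fnmatchcase
from typing import List


def parse_allowed_origins(raw: str) -> List[str]:
    """Split a comma-separated allowlist into trimmed, non-empty patterns."""
    return [pattern.strip() for pattern in raw.split(",") if pattern.strip()]


def origin_allowed(origin: str, patterns: List[str]) -> bool:
    """True when the Origin header value matches any glob pattern."""
    return any(fnmatchcase(origin, pattern) for pattern in patterns)
```

Note that `*` matches any run of characters, including dots, so narrow patterns are safer than broad ones such as `https://*`.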

Important nuance:
- FastAPI’s CORS middleware in `src/abstractgateway/app.py` is permissive, but **origin enforcement for gateway endpoints is done by this security middleware**.

## Workspace filesystem scope (blacklist/whitelist)

AbstractGateway supports “thin clients” (browser UIs, bridges) that can trigger **filesystem-ish tools** (e.g. `list_files`, `read_file`, `write_file`). To avoid a thin client expanding server filesystem access, the gateway enforces a **workspace policy**.

Key point: the **main configuration** for filesystem allowlisting/denylisting is set when you **launch the gateway** (operator-controlled env vars). Thin clients can only request broader scopes when the gateway is started in a permissive mode.

### Default (safe): everything outside the run workspace is blocked

- When a run is started via `POST /api/gateway/runs/start` and `workspace_root` is missing (or rejected), the gateway creates a **per-run workspace** under:
  - `<ABSTRACTGATEWAY_DATA_DIR>/workspaces/<uuid>`
- AbstractRuntime applies workspace scoping to filesystem-ish tool arguments. The default is:
  - `workspace_access_mode=workspace_only`
  - absolute paths must stay under `workspace_root`

This means that by default, **all absolute paths are effectively “blacklisted”** except the run’s `workspace_root`.

Evidence:
- Run default workspace injection: `src/abstractgateway/routes/gateway.py` (`start_run`)
- Client scope clamping: `src/abstractgateway/routes/gateway.py` (`_sanitize_run_workspace_policy`, `_client_workspace_scope_overrides_enabled`)
- Runtime tool scoping: `abstractruntime/integrations/abstractcore/workspace_scoped_tools.py`
- Tests: `tests/test_gateway_workspace_policy_enforcement.py`
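
The `workspace_only` rule (paths must stay under `workspace_root`) can be illustrated with `pathlib`. This sketch is not the Runtime implementation in `workspace_scoped_tools.py`; the function name is hypothetical and the choice of `PermissionError` is an assumption.

```python
from pathlib import Path
from typing import Union


def resolve_in_workspace(
    workspace_root: Union[str, Path],
    candidate: Union[str, Path],
) -> Path:
    """Resolve `candidate` and reject it if it escapes `workspace_root`."""
    root = Path(workspace_root).resolve()
    path = Path(candidate)
    if not path.is_absolute():
        path = root / path  # relative paths are interpreted under the root
    resolved = path.resolve()  # collapses ".." and follows symlinks
    try:
        resolved.relative_to(root)
    except ValueError:
        raise PermissionError(f"{resolved} is outside workspace {root}") from None
    return resolved
```

Resolving before comparing matters: a plain string-prefix check would accept `<root>/../secrets`.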

### Operator-controlled allowlist roots (recommended)

- `ABSTRACTGATEWAY_WORKSPACE_DIR`: base directory used to resolve relative workspace paths and as the default root for `/files/*` helpers.
- `ABSTRACTGATEWAY_WORKSPACE_MOUNTS`: additional allowed roots (newline-separated `name=/abs/path`).
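
Parsing the newline-separated `name=/abs/path` format can be sketched as follows. The helper name is hypothetical, and the POSIX absolute-path check is an assumption for the example; Gateway's own parser may accept or reject other forms.

```python
from pathlib import PurePosixPath
from typing import Dict


def parse_workspace_mounts(raw: str) -> Dict[str, str]:
    """Parse newline-separated `name=/abs/path` entries into a dict."""
    mounts: Dict[str, str] = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        name, sep, path = line.partition("=")
        name, path = name.strip(), path.strip()
        if not sep or not name or not PurePosixPath(path).is_absolute():
            raise ValueError(f"invalid mount entry: {line!r}")
        mounts[name] = path
    return mounts
```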

Thin clients can discover the server policy via:
- `GET /api/gateway/workspace/policy`
  Note: it returns **mount names only** (no absolute paths).

### Permissive mode: allow thin clients to choose scope (trusted machines only)

To honor client-provided workspace knobs (`workspace_root`, `workspace_access_mode`, `workspace_allowed_paths`, `workspace_ignored_paths`) beyond the operator roots, enable one of:

- `ABSTRACTGATEWAY_ALLOW_CLIENT_WORKSPACE_SCOPE=1`
- `ABSTRACTGATEWAY_TRUST_CLIENT_WORKSPACE_SCOPE=1`

In this mode, a client can request:
- `workspace_access_mode=all_except_ignored` (“full access” unless explicitly blocked)

Do **not** enable this when serving untrusted browser origins: a compromised thin client can request access to arbitrary server paths.

### Important limitation (still true in all modes)

`execute_command` is **not** an OS sandbox: even if the runtime sets the default working directory under `workspace_root`, the command itself can reference absolute paths or `cd ..`.

## Common security env vars

All are loaded by `load_gateway_auth_policy_from_env()` (see `src/abstractgateway/security/gateway_security.py`).

### Enable/disable

- `ABSTRACTGATEWAY_SECURITY=1|0` (default: enabled)

### Tokens

- `ABSTRACTGATEWAY_AUTH_TOKEN` (single shared secret)
- `ABSTRACTGATEWAY_AUTH_TOKENS` (comma-separated list)
- `ABSTRACTGATEWAY_USER_AUTH=1` or `ABSTRACTGATEWAY_AUTH_MODE=users`: enable
  file-backed user principals and per-principal service routing
- `ABSTRACTGATEWAY_USER_AUTH_AUTO=1`: compatibility mode that also enables
  user auth when the registry file exists
- `ABSTRACTGATEWAY_USERS_FILE`: optional user registry path; defaults to
  `<ABSTRACTGATEWAY_DATA_DIR>/auth/users.json`
- `ABSTRACTGATEWAY_SESSIONS_FILE`: optional browser session registry path;
  defaults to `<ABSTRACTGATEWAY_DATA_DIR>/auth/sessions.json`
- `ABSTRACTGATEWAY_SESSION_TTL_S`: default browser session lifetime in seconds
  (default: 8 hours; bounded)
- `ABSTRACTGATEWAY_REMEMBER_SESSION_TTL_S`: browser session lifetime when an
  app requests "remember me" (default: 30 days; bounded)
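
As an illustration of how the two static token variables could be combined and checked, here is a minimal sketch. `load_static_tokens` and `token_matches` are hypothetical helpers; merging both variables into one list and comparing in constant time with `hmac.compare_digest` are assumptions about a reasonable implementation, not a description of `gateway_security.py`.

```python
import hmac


def load_static_tokens(env: dict[str, str]) -> list[str]:
    """Collect tokens from the single-token and comma-separated variables."""
    tokens: list[str] = []
    single = env.get("ABSTRACTGATEWAY_AUTH_TOKEN", "").strip()
    if single:
        tokens.append(single)
    for part in env.get("ABSTRACTGATEWAY_AUTH_TOKENS", "").split(","):
        part = part.strip()
        if part and part not in tokens:
            tokens.append(part)
    return tokens


def token_matches(presented: str, tokens: list[str]) -> bool:
    """Check a presented bearer token against every configured token."""
    matched = False
    for token in tokens:
        # compare_digest avoids leaking how many leading bytes matched.
        if hmac.compare_digest(presented.encode(), token.encode()):
            matched = True
    return matched
```

In practice the `env` argument would be `os.environ`; a plain dict is used here so the sketch stays easy to test.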

### Protect reads vs writes

- `ABSTRACTGATEWAY_PROTECT_WRITE=1|0` (default: `1`)
- `ABSTRACTGATEWAY_PROTECT_READ=1|0` (default: `1`)
- `ABSTRACTGATEWAY_DEV_READ_NO_AUTH=1|0`
  Dev escape hatch: allow unauthenticated reads **from loopback only**.

### Limits (abuse resistance)

- `ABSTRACTGATEWAY_MAX_BODY_BYTES` (default: `256000`)
- `ABSTRACTGATEWAY_MAX_ATTACHMENT_BYTES` (default: `25MB`)
- `ABSTRACTGATEWAY_MAX_BUNDLE_BYTES` (default: `75MB`)
- `ABSTRACTGATEWAY_MAX_CONCURRENCY` (default: `64`)
- `ABSTRACTGATEWAY_MAX_SSE` (default: `32`)

### Auth lockout (brute-force safety net)

- `ABSTRACTGATEWAY_LOCKOUT_AFTER` (default: `5`)
- `ABSTRACTGATEWAY_LOCKOUT_BASE_S` (default: `1.0`)
- `ABSTRACTGATEWAY_LOCKOUT_MAX_S` (default: `60.0`)
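
The exact backoff formula is not documented here. The sketch below shows one plausible reading of the three knobs, assuming an exponential delay that starts once failures reach `LOCKOUT_AFTER` and is capped at `LOCKOUT_MAX_S`. `lockout_delay_s` is a hypothetical function; check `gateway_security.py` for the real behavior.

```python
def lockout_delay_s(
    failures: int,
    after: int = 5,
    base_s: float = 1.0,
    max_s: float = 60.0,
) -> float:
    """Return an assumed lockout delay for a number of consecutive failures."""
    if failures < after:
        return 0.0
    # Assumption: the delay doubles for each failure beyond the threshold.
    exponent = min(failures - after, 32)  # clamp to keep the float small
    return min(max_s, base_s * (2 ** exponent))
```

Under these assumptions and the default values, the fifth failure triggers a 1 second lockout and the delay reaches the 60 second cap by the eleventh failure.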

### Audit log (write requests)

- `ABSTRACTGATEWAY_AUDIT_LOG=1|0` (default: enabled for writes)
- `ABSTRACTGATEWAY_AUDIT_LOG_MAX_BYTES` (default: `50MB`)
- `ABSTRACTGATEWAY_AUDIT_LOG_ROTATIONS` (default: `10`)
- `ABSTRACTGATEWAY_AUDIT_LOG_HEADERS` (comma-separated allowlist; default: `x-client-id,x-client-version,x-forwarded-for`)

### Reverse proxies

- `ABSTRACTGATEWAY_TRUST_PROXY=1|0`
  If enabled, `X-Forwarded-For` is used for IP attribution and lockout tracking.

## Production checklist (minimal)

- Run behind a TLS-terminating reverse proxy and bind `--host 127.0.0.1`; if you must bind `0.0.0.0`, restrict access at the network level.
- Use a strong random token and set `ABSTRACTGATEWAY_ALLOWED_ORIGINS` to exact origins (avoid public wildcards).
- Keep `ABSTRACTGATEWAY_SECURITY=1`.

## Related docs

- Configuration overview: [configuration.md](docs/configuration.md)
- API overview: [api.md](docs/api.md)
- FAQ: [faq.md](docs/faq.md)

---

## docs/configuration.md

# AbstractGateway — Configuration

AbstractGateway is configured primarily via **environment variables** (plus a few CLI flags).

## Install extras (recommended)

The base install (`pip install abstractgateway`) is the remote-light server
profile: HTTP/SSE, durable stores, `AbstractRuntime`,
Runtime-owned provider/tool and multimodal support, AbstractAgent, AbstractFlow
compatibility (runs bundles produced by AbstractFlow; does not require the
`abstractflow` package), and AbstractMemory/LanceDB KG support. Local
sentence-transformer embeddings and hardware-local inference engines are
opt-in, so the base Linux install does not pull PyTorch/CUDA packages.

Remote embeddings are part of this base light profile. Configure
`embedding.text` for OpenAI, OpenRouter, Portkey, LM Studio, vLLM, another
OpenAI-compatible embeddings endpoint, or a remote AbstractCore server. The
`abstractgateway[embeddings]` extra is only for local HuggingFace/
sentence-transformer embeddings on the Gateway host.

Optional extras (see `pyproject.toml`):
- `abstractgateway[embeddings]`: local sentence-transformer embeddings for semantic KG queries
- `abstractgateway[apple]`: full native macOS Python profile with Apple-local engines and all non-NVIDIA framework capabilities; this is for native macOS, not Docker
- `abstractgateway[gpu]`: full native/container GPU profile with local GPU engines and all relevant framework capabilities; the NVIDIA Docker image uses this profile
- `abstractgateway[docs]`: MkDocs site tooling
- `abstractgateway[dev]`: local dev/test deps

Default dependency floors:
- `AbstractRuntime>=0.4.26`
- `abstractagent>=0.3.10`
- `AbstractMemory[lancedb]>=0.2.6`

Gateway's KG resolver targets AbstractMemory's TripleStore API. It does not use
the newer memory-agent API directly.

## Configuration helper

Gateway has a first-class configuration helper:

```bash
abstractgateway-config status
abstractgateway-config init --env-file .env
abstractgateway-config bootstrap-admin --print-token
abstractgateway config status --json
```

It reports Gateway auth/data/store/runtime defaults, Core-server handoff
configuration, memory-store selection, and package readiness. `init` writes a
private env file with a generated Gateway token. General Core defaults remain
available through `abstractcore-config`; hosted reusable endpoint profiles can
also be managed by the Gateway Console and API.

`bootstrap-admin` is the non-interactive setup path used by Docker images:
when user auth is enabled, it ensures `default/admin` exists, stores only the
token hash in `auth/users.json`, and can write the raw bootstrap token to
`auth/bootstrap-admin-token` for first login.

## Core environment variables

### Paths + workflow source

- `ABSTRACTGATEWAY_DATA_DIR`: durable data directory (default: `./runtime`)
  Evidence: `src/abstractgateway/config.py` (`GatewayHostConfig.from_env`)
- `ABSTRACTGATEWAY_FLOWS_DIR`: workflows directory. When unset, Gateway uses
  the packaged shipped bundle directory containing `basic-agent`. If the
  shipped bundle is unavailable, Gateway fails with a clear error instead of
  starting with an empty default registry.
  Evidence: `src/abstractgateway/config.py`
- `ABSTRACTGATEWAY_WORKFLOW_SOURCE`: `bundle` (default) or `visualflow`
  Evidence: `src/abstractgateway/service.py` (`create_default_gateway_service`)

### Authentication and user routing

Default local mode uses a Gateway bearer token:

- `ABSTRACTGATEWAY_AUTH_TOKEN`: single Gateway admin token
- `ABSTRACTGATEWAY_AUTH_TOKENS`: comma-separated Gateway admin tokens

Hosted user-auth mode resolves bearer tokens to Gateway principals and routes
each principal to a separate service/data plane:

- `ABSTRACTGATEWAY_USER_AUTH=1` or `ABSTRACTGATEWAY_AUTH_MODE=users`: enable
  file-backed user principals and per-principal runtime routing
- `ABSTRACTGATEWAY_USER_AUTH_AUTO=1`: compatibility mode that also enables
  user auth when the registry file already exists
- `ABSTRACTGATEWAY_USERS_FILE`: optional user registry path; default:
  `<ABSTRACTGATEWAY_DATA_DIR>/auth/users.json`
- `ABSTRACTGATEWAY_SESSIONS_FILE`: optional browser session registry path;
  default: `<ABSTRACTGATEWAY_DATA_DIR>/auth/sessions.json`
- `ABSTRACTGATEWAY_SESSION_TTL_S`: default browser session lifetime
- `ABSTRACTGATEWAY_REMEMBER_SESSION_TTL_S`: browser session lifetime when a
  browser app requests "remember me"
- `ABSTRACTGATEWAY_ADMIN_USES_DEFAULT_RUNTIME`: keep the bootstrap
  `default/admin` principal on the Gateway's base data plane when its
  `runtime_id` is `default` or `admin` (default: enabled)
- `GET /api/gateway/me`: returns the resolved principal and routing mode
- `/api/gateway/admin/users`: admin-only user list/create/read/update/delete
- `/api/gateway/admin/runtime-reservations`: admin-only retained runtime
  list/transfer/purge lifecycle
- `/console`: built-in same-origin Gateway Console for session sign-in with
  Gateway user + token, account/runtime summary, admin user management, optional
  account email metadata, token rotation, retained runtime transfer/purge, and
  capability defaults selected from Gateway-discovered provider/model catalogs

User records include `tenant_id`, `user_id`, roles/scopes, enabled state, and a
`runtime_id`. The registry stores password-grade bearer-token hashes only.
Generated or rotated user tokens are returned once from the admin response.
Gateway rejects duplicate `runtime_id` values within the same tenant when users
are created or updated, preserving `1 user = 1 runtime` for independent hosted
users. Deleting a user reserves its retained runtime id. Admins must explicitly
purge retained runtime data before the id can be reused by another user, or
transfer the retained runtime to an existing same-tenant user.

When user auth is active, `src/abstractgateway/service.py` keeps normal users
isolated in a per-principal service directory:

```text
<ABSTRACTGATEWAY_DATA_DIR>/users/<tenant_id>/<runtime_id>/runtime
<ABSTRACTGATEWAY_DATA_DIR>/users/<tenant_id>/<runtime_id>/flows
```

The bootstrap `default/admin` admin principal is a local-setup compatibility
exception by default: with `ABSTRACTGATEWAY_ADMIN_USES_DEFAULT_RUNTIME=1`, it
uses the base Gateway data plane and bundle registry. That keeps the admin
connected to the default runtime and shipped `basic-agent` bundle while regular
users remain on `1 user = 1 runtime` routing.
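
The routing rule above can be summarized in a short sketch. `principal_root` is a hypothetical helper, not the code in `service.py`; it only mirrors the directory layout and the admin exception described in this section.

```python
from pathlib import Path


def principal_root(
    data_dir: str,
    tenant_id: str,
    user_id: str,
    runtime_id: str,
    admin_uses_default_runtime: bool = True,
) -> Path:
    """Return the data-plane root a principal would be routed to."""
    base = Path(data_dir)
    is_bootstrap_admin = (tenant_id, user_id) == ("default", "admin")
    if (
        is_bootstrap_admin
        and admin_uses_default_runtime
        and runtime_id in {"default", "admin"}
    ):
        # Compatibility exception: the admin stays on the base data plane.
        return base
    # Normal users get an isolated directory keyed by tenant and runtime id.
    return base / "users" / tenant_id / runtime_id
```

For a normal user, the `runtime` and `flows` directories shown above would then live under the returned root.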

Browser apps should exchange a Gateway user token for an opaque Gateway browser
session through `/api/gateway/session/login`; the raw bearer token should not be
kept in browser storage, and the login response body does not expose the session
id or CSRF token. Session-authenticated writes carry
`X-AbstractGateway-Session` plus `X-AbstractGateway-CSRF`, and
`/api/gateway/session/logout` revokes the session. Apps such as AbstractFlow,
AbstractCode, AbstractAssistant, and AbstractObserver should authenticate as the
current user/session in hosted mode. They should not share one app-server
Gateway token for all users.

### Per-principal capability defaults

In hosted user-auth mode, `GET /api/gateway/config/capability-defaults` returns
the execution-host Core capability routes plus the Gateway/root baseline and any
defaults configured for the current Gateway principal. The bootstrap
`default/admin` principal edits the Gateway baseline when it uses the default
runtime. Normal user writes to
`PUT /api/gateway/config/capability-defaults/{kind}/{modality}` are stored under
that principal's Gateway data plane and override the Gateway baseline only for
that user:

```text
<principal-runtime>/config/capability_defaults.json
```

This lets operators set a Gateway default and lets hosted users choose
remote-provider defaults for their own runtime without mutating the operator's
global AbstractCore config or other users. The route schema and normalization
still come from AbstractCore capability-default contracts. Provider API keys and
raw secrets are not returned by these routes and require a separate per-request
secret-injection boundary before hosted user secrets can be treated as fully
isolated execution credentials.
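
The override behavior can be sketched as a per-route merge. This is a minimal illustration, assuming routes are keyed as `kind.modality` (for example `output.text`) and that a user entry replaces the whole baseline entry for that route; the key format, the dict shape, and `effective_defaults` itself are hypothetical.

```python
def effective_defaults(
    baseline: dict[str, dict],
    user_overrides: dict[str, dict],
) -> dict[str, dict]:
    """Merge Gateway baseline routes with one principal's overrides."""
    # Copy each route so neither input mapping is mutated.
    merged = {key: dict(route) for key, route in baseline.items()}
    for key, route in user_overrides.items():
        # Assumption: an override replaces the baseline route entirely.
        merged[key] = dict(route)
    return merged
```

Because the merge is computed per principal, one user's choice never changes the baseline seen by other users.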

### Provider endpoint profiles

Gateway Console and `POST /api/gateway/config/provider-endpoint-profiles` let
signed-in users define reusable provider endpoint profiles. A profile includes a
stable id, display name, description, provider family such as
`openai-compatible`, optional base URL, optional API key, capabilities, and an
optional model allowlist. The raw API key is write-only: responses include only
`api_key_set` and a short fingerprint.

For OpenAI-compatible and other discoverable endpoints, the console can call the
endpoint through `POST /api/gateway/config/provider-endpoint-profiles/discover-models`
and populate a model picker. Leave all models unselected to keep live discovery
active, or select one or more models to store a fixed allowlist.

Enabled profiles appear in `GET /api/gateway/discovery/providers` as virtual
provider ids such as `endpoint:office-vllm`. Use that virtual id in Flow nodes
or Gateway capability defaults. At runtime the Gateway host resolves the virtual
provider to the real provider family, base URL, and API key for the transient
AbstractRuntime call; workflow JSON and browser storage do not contain the raw
secret. Normal users can manage user-scoped profiles. Gateway-scoped profiles
require an admin principal.
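
A minimal sketch of the two ideas in this section: a write-only API key in responses, and host-side resolution of a virtual provider id. The field names `provider_family` and `api_key_fingerprint`, the SHA-256 based fingerprint, and both functions are assumptions for illustration; only `api_key_set` and the `endpoint:` prefix come from the text above.

```python
import hashlib


def public_profile(profile: dict) -> dict:
    """Return a response-safe view of a profile without the raw API key."""
    key = profile.get("api_key") or ""
    view = {k: v for k, v in profile.items() if k != "api_key"}
    view["api_key_set"] = bool(key)
    # Assumption: the fingerprint is a short prefix of a SHA-256 digest.
    view["api_key_fingerprint"] = (
        hashlib.sha256(key.encode()).hexdigest()[:8] if key else None
    )
    return view


def resolve_virtual_provider(provider_id: str, profiles: dict) -> dict:
    """Map `endpoint:<profile-id>` to concrete connection settings."""
    prefix = "endpoint:"
    if not provider_id.startswith(prefix):
        # Regular provider ids pass through unchanged.
        return {"provider": provider_id}
    profile = profiles[provider_id[len(prefix):]]  # KeyError if unknown
    return {
        "provider": profile["provider_family"],
        "base_url": profile.get("base_url"),
        "api_key": profile.get("api_key"),
    }
```

Resolution happens only on the Gateway host at call time, so the workflow JSON can keep referring to `endpoint:office-vllm` without ever carrying the secret.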

### Workspace policy (filesystem scope)

The gateway enforces a server-side workspace policy so thin clients cannot expand filesystem access by sending arbitrary paths.

Operator-controlled roots:
- `ABSTRACTGATEWAY_WORKSPACE_DIR`: base directory used for `/api/gateway/files/*` helpers and to clamp run-provided `workspace_root` / `workspace_allowed_paths`.
- `ABSTRACTGATEWAY_WORKSPACE_MOUNTS`: additional allowed roots, newline-separated `name=/abs/path`.

Client scope overrides (permissive; trusted machines only):
- `ABSTRACTGATEWAY_ALLOW_CLIENT_WORKSPACE_SCOPE=1` (or `ABSTRACTGATEWAY_TRUST_CLIENT_WORKSPACE_SCOPE=1`) enables honoring client-provided `workspace_*` knobs, including `workspace_access_mode=all_except_ignored`.

Discoverability:
- `GET /api/gateway/workspace/policy` returns `{policy: {...}}` including whether client overrides are enabled (mount names only; no absolute paths).

Evidence: `src/abstractgateway/routes/gateway.py` (`_workspace_root`, `_workspace_mounts`, `_sanitize_run_workspace_policy`, `_client_workspace_scope_overrides_enabled`, `start_run`).

### Durability backend

- `ABSTRACTGATEWAY_STORE_BACKEND`: `file` (default) or `sqlite`
  Evidence: `src/abstractgateway/service.py`
- `ABSTRACTGATEWAY_DB_PATH`: SQLite DB file path (optional; default: `<DATA_DIR>/gateway.sqlite3`)
  Evidence: `src/abstractgateway/stores.py` (`build_sqlite_stores`)
  Note: for safety, when `ABSTRACTGATEWAY_STORE_BACKEND=sqlite`, the DB path must be **under** `ABSTRACTGATEWAY_DATA_DIR`.
  The gateway fails fast if `ABSTRACTGATEWAY_DB_PATH` points elsewhere (prevents cross-wiring UAT/prod durable state).
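
The fail-fast rule can be sketched as follows. `resolve_db_path` is a hypothetical helper (the real logic lives in `stores.py`); it assumes the default file name given above and requires Python 3.9+ for `Path.is_relative_to`.

```python
from pathlib import Path
from typing import Optional


def resolve_db_path(data_dir: str, db_path: Optional[str] = None) -> Path:
    """Return the SQLite path, rejecting locations outside the data dir."""
    root = Path(data_dir).resolve()
    if not db_path:
        # Default location when ABSTRACTGATEWAY_DB_PATH is unset.
        return root / "gateway.sqlite3"
    candidate = Path(db_path).resolve()
    if not candidate.is_relative_to(root):
        raise ValueError(
            f"DB path {candidate} must be under data dir {root}"
        )
    return candidate
```

Rejecting the configuration at startup is cheaper than discovering later that two environments were writing to the same durable state.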

### KG memory store

Gateway selects an AbstractMemory TripleStore through a small resolver; it does
not implement memory stores itself.

- `ABSTRACTGATEWAY_MEMORY_STORE_BACKEND`: `lancedb` (default), `memory`, or `sqlite` when the installed AbstractMemory build exposes `SQLiteTripleStore`
- `ABSTRACTGATEWAY_MEMORY_STORE_PATH`: optional explicit store path
- `ABSTRACTGATEWAY_MEMORY_REQUIRE_VECTOR=1`: fail fast when the selected backend cannot satisfy semantic/vector recall

Backend behavior:

- `lancedb`: persistent and vector-capable; semantic `query_text` requires the execution-host
  `embedding.text` capability route.
- `sqlite`: persistent and structured-query only when `SQLiteTripleStore` is available; semantic `query_text` fails clearly.
- `memory`: process-local test/dev backend; non-durable.

The same resolver is used for bundle `memory_kg_*` nodes and
`POST /api/gateway/kg/query`. Capability discovery reports memory backend,
persistence, vector support, and embedder status. A missing on-disk store is not
an unavailable state by itself: when AbstractMemory is installed and the backend
resolves, fresh stores are authoring-ready and structured queries simply return
no matches until assertions are written.

### Runner tuning (advanced)

These map to `GatewayHostConfig` and `GatewayRunnerConfig`:
- `ABSTRACTGATEWAY_RUNNER`: `1` (default) runs the runner in-process; `0` disables it
  Evidence: `src/abstractgateway/config.py`, `src/abstractgateway/cli.py`
- `ABSTRACTGATEWAY_POLL_S` (default `0.25`)
- `ABSTRACTGATEWAY_COMMAND_BATCH_LIMIT` (default `200`)
- `ABSTRACTGATEWAY_TICK_MAX_STEPS` (default `100`)
- `ABSTRACTGATEWAY_TICK_WORKERS` (default `4`)
- `ABSTRACTGATEWAY_RUN_SCAN_LIMIT` (default `200`)

Evidence: `src/abstractgateway/config.py`, `src/abstractgateway/runner.py`.

## LLM/tool defaults (bundle mode)

Only needed when the loaded bundle(s) contain LLM/tool/agent nodes.

- `output.text` capability route
  Default text route for LLM execution and Gateway LLM helper endpoints. Configure it through
  `abstractgateway-config set-default output.text ...` or `abstractcore --set-global-default ...`.
  If no pair is configured, helpers return a clear configuration error instead of falling back to a
  hardcoded model.
  Evidence: `src/abstractgateway/provider_defaults.py`, `src/abstractgateway/hosts/bundle_host.py`
- `ABSTRACTGATEWAY_TOOL_MODE`:
  - `approval` (default): execute safe tools locally; require explicit approval for dangerous/unknown tools
  - `passthrough`: require explicit approval for *all* tools (then execute in-process on resume)
  - `delegated`: do not execute tools; tool calls yield a durable `JOB` wait for external executors
  - `local` (or `local_all`): execute all tools inside the gateway process (dev only; higher risk)
  Evidence: `src/abstractgateway/hosts/bundle_host.py` (tool executor selection)
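
The four tool modes can be read as a small decision table. The sketch below is a hypothetical summary of the behavior listed above, not the executor selection code in `bundle_host.py`; the returned action labels are invented for illustration.

```python
def tool_action(mode: str, tool_is_safe: bool) -> str:
    """Return what happens to a tool call under a given tool mode."""
    mode = (mode or "approval").strip().lower()
    if mode in ("local", "local_all"):
        # Dev only: everything runs inside the gateway process.
        return "execute"
    if mode == "delegated":
        # The gateway never executes; an external executor picks up the job.
        return "wait_for_external_job"
    if mode == "passthrough":
        # Every tool needs approval, then runs in-process on resume.
        return "require_approval"
    if mode == "approval":
        # Safe tools run directly; dangerous or unknown tools need approval.
        return "execute" if tool_is_safe else "require_approval"
    raise ValueError(f"unknown tool mode: {mode!r}")
```

Here `tool_is_safe` stands in for whatever classification the Runtime applies; unknown tools should be treated as not safe.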

### Embeddings

The gateway exposes an embeddings API when the execution host has an explicit `embedding.text`
capability default. Remote/provider-backed embeddings work with the base
remote-light install; local HuggingFace/sentence-transformer embeddings require
`abstractgateway[embeddings]`.

Configure it through the same capability-default control plane used by Flow:

```bash
abstractgateway-config set-default embedding.text \
  --provider lmstudio \
  --model text-embedding-nomic-embed-text-v1.5 \
  --base-url http://127.0.0.1:1234/v1
```

In embedded deployments Gateway uses the local Core embedding manager. In split deployments it
delegates to the remote AbstractCore `/v1/embeddings` route so provider `base_url` is evaluated
from the Core host.

Evidence: `src/abstractgateway/embeddings_config.py`

### Prompt cache controls (provider-dependent)

Gateway prompt-cache endpoints are available when the AbstractCore integration
for the active provider/model exposes them. Remote providers usually provide
server-managed cache hints; local in-process providers can expose stronger
control-plane operations when installed in a custom runtime image.
Provider-level endpoints remain available for operators, and session-level
endpoints provide a deterministic gateway-owned namespace/key lifecycle for thin
apps without pretending unsupported providers have local KV state.

- `GET /api/gateway/prompt_cache/capabilities`
- `GET /api/gateway/prompt_cache/stats`
- `POST /api/gateway/prompt_cache/set`
- `POST /api/gateway/prompt_cache/update`
- `POST /api/gateway/prompt_cache/fork`
- `POST /api/gateway/prompt_cache/clear`
- `POST /api/gateway/prompt_cache/prepare_modules`
- `POST /api/gateway/blocs/upsert_text`
- `GET /api/gateway/blocs/record`
- `GET /api/gateway/blocs`
- `POST /api/gateway/blocs/delete`
- `GET /api/gateway/blocs/kv/manifest`
- `GET /api/gateway/blocs/kv/list`
- `POST /api/gateway/blocs/kv/ensure`
- `POST /api/gateway/blocs/kv/load`
- `POST /api/gateway/blocs/kv/delete`
- `POST /api/gateway/blocs/kv/prune`
- `GET /api/gateway/prompt_cache/saved`
- `POST /api/gateway/prompt_cache/save`
- `POST /api/gateway/prompt_cache/load`
- `GET /api/gateway/sessions/{session_id}/prompt_cache/status`
- `POST /api/gateway/sessions/{session_id}/prompt_cache/prepare`
- `POST /api/gateway/sessions/{session_id}/prompt_cache/rebuild`
- `POST /api/gateway/sessions/{session_id}/prompt_cache/clear`

Session lifecycle responses distinguish `unsupported`, `keyed`, and
`local_control_plane` modes. Keyed providers receive a stable `runtime_hint`;
local-control-plane providers can prepare, clear, and rebuild when their
AbstractCore provider exposes those operations.

Treat the three prompt-cache surfaces separately:

- `/prompt_cache/*`: provider/model prompt-cache controls
- `/sessions/{session_id}/prompt_cache/*`: gateway-owned volatile session lifecycle
- `/blocs/*`: durable exact-reuse bloc/KV contract that returns `prompt_cache_binding`

The `saved` / `save` / `load` aliases are Runtime-backed host-local admin
operations. Local runtimes write under `<DATA_DIR>/prompt_cache_exports`; remote
and hybrid runtimes report `prompt_cache_local_only`.

### Multimodal provider/plugin controls

The base install already includes the Gateway HTTP/SSE server and the Runtime
multimodal integration layer. Direct Gateway routes for voice/audio, image/video,
and music become available when the corresponding lower-layer capability
packages are installed on the gateway host (or when Gateway is configured to
proxy to a remote AbstractCore server).

Local heavy engines remain explicit opt-ins in the provider packages; Gateway
does not implicitly install them.

- `output.text` capability route: default text model for bundle LLM nodes
- `OPENAI_COMPATIBLE_BASE_URL` / `OPENAI_COMPATIBLE_API_KEY`: OpenAI-compatible text endpoint for AbstractCore providers
  - Apple/MLX Docker deployments should point the lightweight Gateway container
    at host-native inference, for example
    `http://model-runner.docker.internal/engines/v1`,
    `http://host.docker.internal:1234/v1`, or
    `http://host.docker.internal:11434/v1`.
- `ABSTRACTGATEWAY_VISION_BACKEND` / `ABSTRACTGATEWAY_VISION_BASE_URL` / `ABSTRACTGATEWAY_VISION_API_KEY` / `ABSTRACTGATEWAY_VISION_MODEL_ID`: Gateway-scoped image backend settings. Legacy `ABSTRACTVISION_*` names are still accepted by the lower package.
- `ABSTRACTGATEWAY_VOICE_TTS_ENGINE` / `ABSTRACTGATEWAY_VOICE_STT_ENGINE`: Gateway-scoped voice engine settings. Legacy `ABSTRACTVOICE_*` names are still accepted by the lower package.
- `ABSTRACTGATEWAY_VOICE_TTS_MODEL` / `ABSTRACTGATEWAY_VOICE_STT_MODEL`: Gateway-scoped TTS/STT model defaults.
- `ABSTRACTGATEWAY_VOICE_REMOTE_BASE_URL` / `ABSTRACTGATEWAY_VOICE_REMOTE_API_KEY`: remote voice endpoint used by AbstractVoice.
- `GET /api/gateway/discovery/capabilities`: reports installed packages plus AbstractCore capability plugins for `voice`, `audio`, `vision`, and `music`; also returns `capabilities.contracts.version=1` with thin-client feature gates for AbstractFlow, AbstractAssistant, AbstractCode, shared run input/history endpoints, artifact search/import/export, direct voice/audio/image/video/music endpoints, workflow-backed image/video generation, and provider/session prompt-cache controls
- `GET /api/gateway/voice/voices`: proxies AbstractCore `/v1/audio/voices` when `ABSTRACTCORE_SERVER_BASE_URL` is configured; otherwise returns static Gateway/env voice descriptors.
- `GET /api/gateway/audio/speech/models`: proxies AbstractCore `/v1/audio/speech/models` when configured.
- `GET /api/gateway/audio/transcriptions/models`: proxies AbstractCore `/v1/audio/transcriptions/models` when configured.
- `GET /api/gateway/audio/music/providers`: proxies AbstractCore `/v1/audio/music/providers` when configured.
- `GET /api/gateway/audio/music/models`: proxies AbstractCore `/v1/audio/music/models` when configured.
- `GET /api/gateway/vision/provider_models`: proxies AbstractCore `/v1/vision/provider_models` when configured.
- `GET /api/gateway/vision/models`: reports locally known/cached AbstractVision model ids when the in-process capability path is available.
- `POST /api/gateway/runs/{run_id}/images/edit`: creates a durable Runtime child run for image-to-image edits and optional mask-guided edits.
- `POST /api/gateway/runs/{run_id}/videos/generate`: creates a durable Runtime child run for text-to-video and returns an artifact-backed video result.
- `POST /api/gateway/runs/{run_id}/videos/from_image`: creates a durable Runtime child run for image-to-video and returns an artifact-backed video result.
- `POST /api/gateway/runs/{run_id}/music/generate`: creates a durable Runtime child run and returns an artifact-backed music result for thin clients.

Core catalog proxy settings:

- `ABSTRACTCORE_SERVER_BASE_URL`: explicit Core server base URL for catalog proxying.
- `ABSTRACTGATEWAY_ABSTRACTCORE_SERVER_AUTH_TOKEN` / `ABSTRACTGATEWAY_ABSTRACTCORE_SERVER_API_KEY`
  (or Core's `ABSTRACTCORE_AUTH_TOKEN` / `ABSTRACTCORE_SERVER_API_KEY`): Core server auth token.
  This is separate from Gateway auth.
- `ABSTRACTGATEWAY_CORE_CATALOG_TIMEOUT_S`: catalog proxy timeout (default `3.0` seconds).

## CLI flags

`abstractgateway --help` shows all subcommands (serve/runner/migrate/triage/…).

Most-used:
- `abstractgateway serve --host 127.0.0.1 --port 8080 [--no-runner] [--reload]`
  Evidence: `src/abstractgateway/cli.py`
- `abstractgateway runner` (worker only)
- `abstractgateway config status --json`
- `abstractgateway migrate --from=file --to=sqlite --data-dir <DIR> --db-path <FILE>`

## Related docs

- Getting started: [getting-started.md](docs/getting-started.md)
- FAQ: [faq.md](docs/faq.md)
- Security configuration: [security.md](docs/security.md)
- Deployment: [deployment.md](docs/deployment.md)
- API overview: [api.md](docs/api.md)
- Operator tooling env vars: [maintenance.md](docs/maintenance.md)

---

## docs/deployment.md

# AbstractGateway deployment

AbstractGateway can run as a Python process or as a containerized server. The
container path is the recommended baseline for a single self-contained Gateway
deployment because it packages the HTTP API, durable runner, AbstractRuntime,
and the Runtime-owned provider/tool stack together.

## Published image

Release images are published to GHCR. The default image is the light,
portable server image:

```bash
docker pull ghcr.io/lpalbou/abstractgateway:0.2.24
```

NVIDIA hosts can try the experimental full GPU image when local
vLLM/HuggingFace/Diffusers engines are wanted. This image is published
best-effort until it has a real CUDA build and smoke gate:

```bash
docker pull ghcr.io/lpalbou/abstractgateway:0.2.24-gpu
```

Legacy aliases `ghcr.io/lpalbou/abstractgateway-server:*` and
`ghcr.io/lpalbou/abstractgateway-server-nvidia:*` are still published for a
transition period. New deployments should use `abstractgateway`.

The default image installs the base `abstractgateway` package, which includes:

- `AbstractRuntime`
- `AbstractMemory[lancedb]>=0.2.6`
- `abstractagent`
- FastAPI/Uvicorn

This profile supports hosted/commercial providers, OpenAI-compatible text
and multimodal provider routing, Runtime-owned tool execution, KG memory, and
provider/session prompt-cache controls. Remote embeddings are included through
the `embedding.text` capability route for hosted providers, LM Studio, vLLM,
other OpenAI-compatible endpoints, or a remote AbstractCore server. Local
sentence-transformer embeddings and hardware-local model runtimes remain
explicit opt-ins, so the base Linux image does not pull PyTorch/CUDA runtime
packages. MLX, vLLM, HuggingFace Transformers, local Diffusers/sdcpp,
AbstractVoice local engines, and local AbstractMusic engines belong in native
`abstractgateway[apple]` or `abstractgateway[gpu]` installs.

The NVIDIA image installs `abstractgateway[gpu]` and uses a CUDA/PyTorch base.
It is experimental: release automation publishes it on a best-effort basis for
`linux/amd64` only, while the default image remains the release-grade portable
image for `linux/amd64` and `linux/arm64`. Treat the NVIDIA image as
production-ready only after a CUDA host build/smoke gate is added and passes.

### Apple Silicon / MLX

There is no Apple/MLX Gateway Docker image target. MLX uses Apple's Metal
stack, while Docker Desktop runs Linux containers without Metal/MPS device
access. The supported Docker shape is a lightweight Gateway container calling a
host-native OpenAI-compatible inference endpoint:

```bash
docker run --rm --name abstractgateway \
  -p 8080:8080 \
  -e ABSTRACTGATEWAY_DATA_DIR=/data \
  -e ABSTRACTGATEWAY_USER_AUTH=1 \
  -e OPENAI_COMPATIBLE_BASE_URL="http://model-runner.docker.internal/engines/v1" \
  -v "$PWD/runtime:/data" \
  ghcr.io/lpalbou/abstractgateway:latest
```

Set the execution-host text route separately:

```bash
docker exec abstractgateway abstractgateway-config set-default output.text \
  --provider openai-compatible \
  --model your-model \
  --base-url http://model-runner.docker.internal/engines/v1
```

Other host-native endpoints are also valid: LM Studio at
`http://host.docker.internal:1234/v1`, Ollama's OpenAI-compatible API at
`http://host.docker.internal:11434/v1`, or `mlx_lm.server` exposed on a host
port. For fully native non-Docker installs with local engines, use
`pip install "abstractgateway[apple]"` on Apple Silicon, and
`pip install "abstractgateway[gpu]"` on GPU workstations or NVIDIA Docker builds.

## Compose quickstart

Create an env file from the template, adjust provider keys/defaults, then start
the server. The default env keeps user auth enabled and bootstraps
`default/admin` if missing:

```bash
cp docker/abstractgateway-server/.env.example docker/abstractgateway-server/.env
docker compose --env-file docker/abstractgateway-server/.env \
  -f docker/abstractgateway-server/compose.yml up -d
```

For the experimental NVIDIA image on a GPU host with the NVIDIA Container
Toolkit:

```bash
docker compose --env-file docker/abstractgateway-server/.env \
  -f docker/abstractgateway-server/compose.yml \
  -f docker/abstractgateway-server/compose.nvidia.yml up -d
```

The default compose profile binds to `127.0.0.1:8080`, mounts a durable Gateway
data volume at `/data`, mounts bundles from `flows/bundles` at `/data/flows`,
and exposes a container workspace at `/workspace`.

Smoke checks:

```bash
curl http://127.0.0.1:8080/api/health

ADMIN_TOKEN="$(docker compose -f docker/abstractgateway-server/compose.yml exec -T abstractgateway cat /data/auth/bootstrap-admin-token)"
curl -H "Authorization: Bearer $ADMIN_TOKEN" \
  http://127.0.0.1:8080/api/gateway/me
```
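
The authenticated smoke check can also be scripted from Python with the standard library. This is a minimal sketch: `build_me_request` is a hypothetical helper that only constructs the request, and the base URL matches the default compose binding described above.

```python
import urllib.request


def build_me_request(base_url: str, token: str) -> urllib.request.Request:
    """Build a GET request for /api/gateway/me with a bearer token."""
    return urllib.request.Request(
        base_url.rstrip("/") + "/api/gateway/me",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )
```

To send it against a running gateway, pass the result to `urllib.request.urlopen(...)` and read the response body, which should describe the resolved principal.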

## Core configuration

Required for hosted/container user-auth mode:

- `ABSTRACTGATEWAY_USER_AUTH=1`: enables Gateway user tokens and per-user routing
- `ABSTRACTGATEWAY_BOOTSTRAP_ADMIN=1`: creates `default/admin` if missing

Optional:

- `ABSTRACTGATEWAY_AUTH_TOKEN`: legacy shared admin bearer token for
  compatibility/bootstrap APIs; browser apps should use Gateway user tokens

Common:

- `ABSTRACTGATEWAY_ALLOWED_ORIGINS`: browser origin allowlist
- `output.text` capability route: default for LLM/agent nodes
- `ABSTRACTGATEWAY_TOOL_MODE`: `approval`, `passthrough`, `delegated`, or local dev modes
- `ABSTRACTGATEWAY_STORE_BACKEND`: `file` or `sqlite`
- `ABSTRACTGATEWAY_DB_PATH`: SQLite file, when using `sqlite`
- `ABSTRACTGATEWAY_RUNNER`: `1` for combined API+runner, `0` for API-only
- `ABSTRACTGATEWAY_MEMORY_STORE_BACKEND`: `lancedb` or `memory` for KG workflows and `/kg/query`; `sqlite` works when the installed AbstractMemory build exposes `SQLiteTripleStore`

Provider keys and endpoints:

- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `OPENROUTER_API_KEY`
- `PORTKEY_API_KEY` / `PORTKEY_CONFIG`
- `OPENAI_COMPATIBLE_BASE_URL` / `OPENAI_COMPATIBLE_API_KEY`
- `LMSTUDIO_BASE_URL`
- `OLLAMA_BASE_URL`
- `VLLM_BASE_URL`

Image/voice plugin endpoints:

- `ABSTRACTVISION_BACKEND`: `openai`, `openai-compatible`, `diffusers`, or `sdcpp`
- `ABSTRACTGATEWAY_VISION_BACKEND` / `ABSTRACTGATEWAY_VISION_BASE_URL` / `ABSTRACTGATEWAY_VISION_API_KEY` / `ABSTRACTGATEWAY_VISION_MODEL_ID` (legacy `ABSTRACTVISION_*` names still work)
- `ABSTRACTGATEWAY_VOICE_TTS_ENGINE` / `ABSTRACTGATEWAY_VOICE_STT_ENGINE` (`openai` by default in the server image; legacy `ABSTRACTVOICE_*` names still work)
- `ABSTRACTGATEWAY_VOICE_REMOTE_BASE_URL` / `ABSTRACTGATEWAY_VOICE_REMOTE_API_KEY`
- `ABSTRACTGATEWAY_VOICE_TTS_MODEL` / `ABSTRACTGATEWAY_VOICE_STT_MODEL`

Core catalog proxying:

- `ABSTRACTCORE_SERVER_BASE_URL`: explicit standalone Core server URL for voice, TTS/STT, and vision catalog routes
- `ABSTRACTGATEWAY_ABSTRACTCORE_SERVER_AUTH_TOKEN`: Core server auth token, separate from Gateway auth
- `ABSTRACTGATEWAY_CORE_CATALOG_TIMEOUT_S`: timeout for catalog routes

Filesystem/media controls from AbstractCore remain available:

- `ABSTRACTCORE_SERVER_BASE_URL_ALLOWLIST`
- `ABSTRACTCORE_SERVER_URL_FETCH_ALLOWLIST`
- `ABSTRACTCORE_SERVER_MEDIA_ROOT`
- `ABSTRACTCORE_SERVER_ALLOW_LOCAL_FILES`

## Cache and auth notes

Gateway auth is controlled by `ABSTRACTGATEWAY_*` variables and protects
`/api/gateway/*`. AbstractCore provider/server auth variables control upstream
provider access inside AbstractCore integrations. Keep those two layers
separate: clients receive only the Gateway token, while provider keys stay in
the server environment.

Prompt-cache control endpoints are exposed under `/api/gateway/prompt_cache/*`
where supported by the active provider/model. Session lifecycle routes under
`/api/gateway/sessions/{session_id}/prompt_cache/*` provide Gateway-owned
naming/status/prepare/clear/rebuild orchestration on top of those provider
controls. They are not a provider-independent local KV cache or full
CachedSession persistence system.

## Local-source image

Before a version is published to PyPI, build from the checkout:

```bash
ABSTRACTGATEWAY_INSTALL_MODE=local \
ABSTRACTGATEWAY_IMAGE_TAG=0.2.24-local \
docker compose -f docker/abstractgateway-server/compose.yml up -d --build
```

Release automation builds the published image from the PyPI package after the
PyPI release is available, matching the AbstractCore server image pattern.

---

## docs/architecture.md

# AbstractGateway — Architecture

> Status: implemented (main branch)
> Last reviewed: 2026-05-23

AbstractGateway is a **durable run gateway** for AbstractRuntime:
- **Start runs** (and optionally schedule them)
- Accept **durable commands** (`pause`, `resume`, `cancel`, `emit_event`, …)
- Let clients **replay** the durable ledger and optionally **stream** updates (SSE)

This document describes the code in this repository (see **Evidence** links).

## Ecosystem placement (AbstractFramework)

AbstractGateway is designed to sit between **thin clients / UIs** and **AbstractRuntime**:
- AbstractGateway: HTTP/SSE API + durability glue + baseline security (`src/abstractgateway/app.py`, `src/abstractgateway/routes/gateway.py`)
- AbstractRuntime (required): run model + tick loop + stores (`pyproject.toml`, `src/abstractgateway/runner.py`)
- AbstractRuntime + transitive capability packages (required by the default server install): Runtime owns the LLM/tool/media integration boundary; Gateway uses its discovery/run facades for prompt-cache controls, generated image/video/voice/audio/music capabilities, and KG-backed bundle execution (`src/abstractgateway/hosts/bundle_host.py`)

## High-level shape

```mermaid
flowchart LR
  subgraph Clients["Clients (thin/stateless UIs)"]
    UI["Web/PWA / TUI / 3rd-party"]
  end

  subgraph GW["AbstractGateway (this package)"]
    Sec["GatewaySecurityMiddleware\n(auth + origin + limits)"]
    API["FastAPI routes\n/api/gateway/*"]
    Runner["GatewayRunner\npoll commands + tick runs"]
    Host["Workflow host\n(bundle mode)"]
    Stores["Durable stores\nruns + ledger + commands + artifacts"]
  end

  subgraph RT["AbstractRuntime"]
    Runtime["Runtime.tick(...)"]
    Registry["WorkflowRegistry / WorkflowSpec"]
  end

  UI -->|HTTP| Sec --> API
  API -->|append commands / upload bundles| Stores
  API -->|ledger replay / SSE stream| Stores
  Runner -->|poll inbox| Stores
  Runner -->|load runtime+workflow| Host
  Host --> Registry
  Runner --> Runtime
  Runtime -->|append StepRecords| Stores
```

## Core components (code-mapped)

- **HTTP API**: `src/abstractgateway/app.py` mounts routers under `/api` (`/api/gateway/*` is the main surface).
- **Security layer** (ASGI middleware):
  - Protects `/api/gateway/*` with bearer token auth + origin allowlist + request limits.
  - Implemented in `src/abstractgateway/security/gateway_security.py`.
- **Durable stores** (file or SQLite):
  - Built by `src/abstractgateway/stores.py` (`build_file_stores`, `build_sqlite_stores`).
  - Store types come from `abstractruntime` (RunStore, LedgerStore, CommandStore, ArtifactStore).
- **Workflow host** (what “workflows” mean in this gateway):
  - `bundle` (default): load `.flow` WorkflowBundles and compile VisualFlow JSON via `abstractruntime.visualflow_compiler` (`src/abstractgateway/hosts/bundle_host.py`).
  - Wired in `src/abstractgateway/service.py` (`create_default_gateway_service`).
- **Workflow catalog**:
  - Gateway-owned metadata and immutable bundle bytes for shared/default
    workflows.
  - Private runtime bundles remain per-principal; catalog bundles are loaded
    into each user's host under internal bundle ids so private ids cannot
    shadow catalog ids.
  - Run start and schedule routes resolve catalog ACL/default policy before
    execution; catalog runs still execute in the caller's runtime.
  - Gateway signs catalog workflow-policy snapshots before Runtime receives
    them, and direct private-bundle routes reject catalog-internal ids.
- **Runner worker**:
  - Polls the durable command inbox and applies commands; ticks RUNNING runs forward (`src/abstractgateway/runner.py`).
  - A filesystem lock (`gateway_runner.lock`) prevents double-ticking in split-process deployments.

## Durable contract (replay-first)

The gateway is intentionally **replay-first**:
- The **durable ledger** is the source of truth.
- SSE (`/ledger/stream`) is an optimization; clients should reconnect by replaying from a cursor.

This contract is stated and implemented in `src/abstractgateway/routes/gateway.py` (ledger endpoints + SSE) and `src/abstractgateway/runner.py` (StepRecord append semantics).

## Thin-client control plane

Gateway also acts as the control plane for higher-level apps such as
AbstractFlow, AbstractAssistant, and AbstractObserver:

- `GET /api/gateway/discovery/capabilities` exposes a versioned shared contract
  for run input/history access, media endpoints, voice contracts, prompt-cache
  surfaces, and model residency truth.
- Provider/model catalogs are intentionally routed through Gateway. The legacy
  lower-layer payload fields are preserved, but Gateway now adds a stable
  `gateway_catalog_v1` envelope plus canonical `items` so higher apps can stop
  carrying route-local parsing logic.
- Gateway also exposes `common.readiness` as a compact surface-level summary
  for thin clients and operator UIs. That summary is deliberately limited to
  Gateway-owned contract truth; deeper backend/provider diagnostics still
  belong below Gateway.
- Direct run-scoped media routes currently include TTS, STT, image generation,
  image edit, and music generation.
- Voice listen is intentionally a host-capture contract, not a server-side
  microphone transport. Gateway tells clients how to emit or upload captured
  audio; clients keep ownership of live capture UX.
- Model residency, prompt-cache lifecycle, durable blocs, and discovery
  catalogs are server-owned control-plane surfaces so higher apps do not have
  to import Runtime/Core packages directly.
- The shared contract is principal-aware for high-trust actions. Regular users
  see admin-only workspace artifact import/export and provider prompt-cache
  controls as unavailable in discovery, while ordinary run/ledger/artifact
  upload, KG query, provider/model discovery, and per-principal defaults remain
  available in their routed runtime.

## Deployment shape: one process vs split API/runner

Supported patterns:
- **Single process**: `abstractgateway serve` starts both the HTTP API and the background runner (FastAPI lifespan + service composition).
- **Split**: run `abstractgateway runner` (worker) and `abstractgateway serve --no-runner` (API) against the same `ABSTRACTGATEWAY_DATA_DIR`.

Evidence:
- CLI flags and runner env toggles: `src/abstractgateway/cli.py`
- Runner lock file: `src/abstractgateway/runner.py`

## Workflow sources (bundle)

### Bundle mode (recommended)

- Input: `*.flow` files (WorkflowBundles) under `ABSTRACTGATEWAY_FLOWS_DIR` (file or directory).
- Internals:
  - Bundles are opened with `abstractruntime.workflow_bundle.open_workflow_bundle`.
  - VisualFlow JSON is namespaced (`bundle@version:flow`) and compiled via `compile_visualflow`.
  - “Dynamic flows” (e.g. schedules) are persisted under `<data_dir>/dynamic_flows/` and reloaded on startup.

Evidence: `src/abstractgateway/hosts/bundle_host.py` (`WorkflowBundleGatewayHost.load_from_dir`).

### Workflow catalog

The workflow catalog is a control-plane registry for shared workflows. It is
separate from the caller's private bundle directory:

- user-visible discovery: `GET /api/gateway/workflow-catalog`;
- admin mutation: `/api/gateway/admin/workflow-catalog/*`;
- immutable bundle bytes: existing `bundle_id@version` content cannot be
  replaced with different bytes;
- default pointers: omitted catalog versions resolve through the admin-managed
  pointer, not semver order;
- status: deprecated, blocked, and tombstoned versions cannot start new runs.

Gateway stores catalog metadata under the root Gateway data dir and loads
tenant catalog bundle bytes into each per-principal runtime host. Runtime still
executes workflows; Gateway owns the authorization decision.
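The default-pointer and status rules above can be sketched as a small resolver. This is a minimal illustration, not the Gateway's implementation: the function name, the `versions` mapping, and the `active` status label are hypothetical; only the three non-startable statuses and the "pointer, not semver order" rule come from the text above.

```python
from typing import Mapping, Optional

# Statuses that cannot start new runs, as listed above.
NON_STARTABLE = {"deprecated", "blocked", "tombstoned"}


def resolve_catalog_version(
    versions: Mapping[str, str],
    default_pointer: Optional[str],
    requested: Optional[str] = None,
) -> str:
    """Pick the version to run; an omitted version follows the admin pointer."""
    version = requested or default_pointer
    if version is None:
        raise LookupError("no version requested and no default pointer set")
    status = versions.get(version)
    if status is None:
        raise LookupError(f"unknown catalog version: {version}")
    if status in NON_STARTABLE:
        raise PermissionError(f"version {version} is {status}")
    return version


# Hypothetical catalog entry: the pointer stays on 1.0.0 even though 2.0.0 exists.
versions = {"1.0.0": "active", "2.0.0": "deprecated"}
chosen = resolve_catalog_version(versions, default_pointer="1.0.0")
```

Here `chosen` is `1.0.0` because the pointer wins over version ordering, and an explicit request for `2.0.0` is refused because of its status.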

The implemented catalog scope is `tenant_catalog`. `framework_catalog` is
reserved for a later cross-tenant distribution model and is rejected by the API
until host loading and policy semantics exist for it.

### VisualFlow directory mode

VisualFlow directory mode was intentionally removed. Store VisualFlows through
`/api/gateway/visualflows/*`, publish a `.flow` WorkflowBundle via
`POST /api/gateway/visualflows/{flow_id}/publish`, and run in bundle mode.

## Security model (gateway endpoints)

`GatewaySecurityMiddleware` applies only to paths starting with `/api/gateway`:
- **Bearer token auth** (`ABSTRACTGATEWAY_AUTH_TOKEN` / `ABSTRACTGATEWAY_AUTH_TOKENS`)
- **Origin allowlist** (`ABSTRACTGATEWAY_ALLOWED_ORIGINS`, glob patterns supported)
- **Abuse resistance** (body size caps, concurrency caps, auth lockouts, optional audit log)

Evidence: `src/abstractgateway/security/gateway_security.py` (`GatewayAuthPolicy`, `load_gateway_auth_policy_from_env`, middleware `__call__`).

Bearer tokens are gateway-level control-plane credentials, not tenant or
browser-session identities. `session_id`, `run_id`, `artifact_id`, and memory
owner ids are references/correlation fields, not authorization proofs. Deploy a
separate Gateway/runtime/data plane per independent user or tenant unless all
users are a trusted cohort that may share runs, artifacts, workflows, memory,
workspaces, tools, provider credentials, and audit scope.

Hosted user-auth mode routes each principal to a separate GatewayService data
plane; browser apps exchange user tokens for opaque session cookies plus CSRF
rather than storing raw bearer tokens. Session prompt-cache names include a
private principal scope in their hash to avoid cross-user key collisions. See
[security.md](docs/security.md).

## Evidence (jump-to-code)

- Composition root: `src/abstractgateway/service.py` (`create_default_gateway_service`, `start_gateway_runner`)
- API surface: `src/abstractgateway/routes/gateway.py` (everything under `/api/gateway/*`)
- Runner semantics: `src/abstractgateway/runner.py` (`GatewayRunner`)
- Store backends: `src/abstractgateway/stores.py`
- Security policy + middleware: `src/abstractgateway/security/gateway_security.py`
- CLI + split runner: `src/abstractgateway/cli.py`

## Related docs

- Getting started (run + stores): [getting-started.md](docs/getting-started.md)
- FAQ: [faq.md](docs/faq.md)
- Configuration (env vars): [configuration.md](docs/configuration.md)
- API overview (client contract): [api.md](docs/api.md)
- Security guide: [security.md](docs/security.md)
- Operator tooling (triage/backlog/process manager): [maintenance.md](docs/maintenance.md)

---

## docs/faq.md

# AbstractGateway — FAQ

This FAQ is written for first-time users integrating or operating `abstractgateway`.
For the full API surface, rely on the live OpenAPI spec (`/openapi.json`, `/docs`) which is generated from code.

## Getting started

### What is AbstractGateway?

AbstractGateway is a **durable run gateway** for AbstractRuntime:
- starts runs from workflows (bundle mode; VisualFlow directory mode was removed)
- accepts a **durable command inbox** (commands are appended, then applied asynchronously by the runner)
- exposes a **replay-first ledger** API (SSE is optional)

Evidence: `src/abstractgateway/routes/gateway.py`, `src/abstractgateway/runner.py`, `src/abstractgateway/service.py`.

### How does this fit in the AbstractFramework ecosystem?

- **AbstractRuntime** (required): the durable run model + tick loop + stores (declared in `pyproject.toml`).
- **AbstractGateway** (this repo): a deployable HTTP/SSE facade around AbstractRuntime runs (API in `src/abstractgateway/routes/gateway.py`).
- **AbstractRuntime + transitive capability packages** (required by the default server install): Runtime owns the LLM/tool/media integration boundary; Gateway uses its discovery/run facades for prompt-cache controls, generated and edited image plus voice/audio/music capabilities, and KG-backed bundle execution (`src/abstractgateway/hosts/bundle_host.py`).
- Higher-level UIs (optional): AbstractFlow (authoring/bundling), AbstractObserver / AbstractCode / thin clients (operations + rendering).

Related repos:
- AbstractFramework: https://github.com/lpalbou/AbstractFramework
- AbstractCore: https://github.com/lpalbou/abstractcore
- AbstractRuntime: https://github.com/lpalbou/abstractruntime

### Do I need AbstractFlow to run workflows?

Not for **bundle mode** (the default).

- Bundle mode loads `.flow` bundles and compiles VisualFlow JSON via `abstractruntime.visualflow_compiler` (no `abstractflow` import).
- You only need `abstractflow` to **author** bundles.

Evidence: `src/abstractgateway/hosts/bundle_host.py` (bundle compilation).

### What’s the difference between bundle mode and visualflow directory mode?

- **Bundle mode** (`ABSTRACTGATEWAY_WORKFLOW_SOURCE=bundle`, default):
  - input: one `.flow` file or a directory of `*.flow`
  - versioning: bundles are addressed as `bundle_id@bundle_version`
- **VisualFlow directory mode**: removed. Use VisualFlow CRUD + publish to `.flow` bundles, then run in bundle mode.

Evidence: `src/abstractgateway/service.py` (workflow source switch), `src/abstractgateway/hosts/bundle_host.py`.

## Security

### Why does `abstractgateway serve` refuse to start?

By default, the server requires an auth token for `/api/gateway/*` and will fail-fast if none is configured.

Fix:

```bash
export ABSTRACTGATEWAY_AUTH_TOKEN="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
```

Evidence: startup self-check in `src/abstractgateway/cli.py`, policy loading in `src/abstractgateway/security/gateway_security.py`.
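For clients written in Python, the same token generation and the matching request header can be sketched as follows. The helper names are hypothetical; `secrets.token_urlsafe(32)` is the call used in the shell one-liner above, and the header shape follows the `Authorization: Bearer <token>` form the gateway expects.

```python
import secrets


def new_gateway_token(nbytes: int = 32) -> str:
    """Generate a URL-safe random token, as in the shell one-liner above."""
    return secrets.token_urlsafe(nbytes)


def auth_headers(token: str) -> dict:
    """Build the bearer header a client sends to /api/gateway/* routes."""
    return {"Authorization": f"Bearer {token}"}


token = new_gateway_token()
headers = auth_headers(token)
```

The server reads the token from `ABSTRACTGATEWAY_AUTH_TOKEN`; clients only ever hold the value placed in the header.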

### What’s the difference between `--host` and `ABSTRACTGATEWAY_ALLOWED_ORIGINS`?

- `abstractgateway serve --host ...` controls the **bind address** (network interfaces the server listens on).
- `ABSTRACTGATEWAY_ALLOWED_ORIGINS` controls an **Origin allowlist** for requests that include an `Origin` header (browser/origin defense) on `/api/gateway/*`.

Evidence: CLI flags in `src/abstractgateway/cli.py`, origin checks in `src/abstractgateway/security/gateway_security.py`.
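As a rough sketch of what an Origin allowlist with glob patterns does, the check below uses Python's `fnmatch` as a stand-in. This is an assumption for illustration: the middleware's actual matching rules live in `gateway_security.py` and may differ, and the example patterns are hypothetical.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence


def origin_allowed(origin: Optional[str], patterns: Sequence[str]) -> bool:
    """Apply the allowlist only to requests that carry an Origin header."""
    if origin is None:
        # No Origin header (for example curl or server-to-server calls):
        # the origin check does not apply in this sketch.
        return True
    return any(fnmatchcase(origin, pattern) for pattern in patterns)


# Hypothetical allowlist values.
patterns = ["http://localhost:*", "https://*.example.com"]
local_ok = origin_allowed("http://localhost:3000", patterns)
other_blocked = origin_allowed("https://evil.test", patterns)
```

Note that the bind address plays no part in this decision: a server bound to `127.0.0.1` can still reject a browser request whose `Origin` is not on the list.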

### Why do I get `401` / `403` / `429` / `413` from `/api/gateway/*`?

Common causes:
- `401 Unauthorized`: missing/invalid `Authorization: Bearer <token>`
- `403 Forbidden (origin not allowed)`: browser `Origin` not matched by `ABSTRACTGATEWAY_ALLOWED_ORIGINS`
- `429 Too Many Requests (auth lockout)`: repeated auth failures from the same client IP (lockout backoff)
- `413 Payload Too Large`: request exceeds configured body/upload limits

Evidence: `GatewaySecurityMiddleware.__call__` in `src/abstractgateway/security/gateway_security.py`.
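A thin client can turn these status codes into actionable hints. The sketch below only restates the causes listed above; the function and table names are hypothetical.

```python
# Causes as listed in this FAQ entry, keyed by HTTP status code.
GATEWAY_STATUS_HINTS = {
    401: "missing or invalid Authorization: Bearer <token> header",
    403: "browser Origin not matched by ABSTRACTGATEWAY_ALLOWED_ORIGINS",
    429: "auth lockout after repeated auth failures from the same client IP",
    413: "request exceeds the configured body/upload limits",
}


def explain_status(status_code: int) -> str:
    """Return a short troubleshooting hint for a gateway response status."""
    hint = GATEWAY_STATUS_HINTS.get(status_code)
    if hint is None:
        return f"{status_code}: not one of the security statuses covered here"
    return f"{status_code}: {hint}"
```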

### Can I disable security (dev only)?

Prefer keeping security enabled, even in dev.

If you must relax it:
- disable the gateway security layer entirely: `ABSTRACTGATEWAY_SECURITY=0`
- or (safer) allow unauthenticated reads on loopback only: `ABSTRACTGATEWAY_DEV_READ_NO_AUTH=1`
- or fine-tune: `ABSTRACTGATEWAY_PROTECT_READ=0`, `ABSTRACTGATEWAY_PROTECT_WRITE=0`

Evidence: env policy loader in `src/abstractgateway/security/gateway_security.py`.

## Storage

### Where is data stored?

Everything is rooted at `ABSTRACTGATEWAY_DATA_DIR`:

- File backend (default): `run_*.json`, `ledger_*.jsonl`, `commands.jsonl`, `commands_cursor.json`, plus `artifacts/`
- SQLite backend: a single DB file (default `<DATA_DIR>/gateway.sqlite3`) plus `artifacts/`
- Gateway-generated workflows (e.g. schedules): `dynamic_flows/`
- Per-run workspaces (when `workspace_root` is not provided at start): `workspaces/`

Evidence: `src/abstractgateway/stores.py`, `src/abstractgateway/routes/gateway.py` (`start_run` workspace default), `src/abstractgateway/hosts/bundle_host.py` (dynamic flows).
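The layout above can be expressed as a small path resolver, which is handy for backup scripts or health checks. This is a sketch under stated assumptions: the function name and the returned keys are hypothetical, and it treats `file` as the default backend as described above rather than reading the gateway's own code.

```python
from pathlib import Path
from typing import Dict, Mapping


def resolve_store_paths(env: Mapping[str, str]) -> Dict[str, Path]:
    """Map the documented data-dir layout onto concrete paths."""
    data_dir = Path(env["ABSTRACTGATEWAY_DATA_DIR"])
    backend = env.get("ABSTRACTGATEWAY_STORE_BACKEND", "file")
    paths = {
        "artifacts": data_dir / "artifacts",
        "dynamic_flows": data_dir / "dynamic_flows",
        "workspaces": data_dir / "workspaces",
    }
    if backend == "sqlite":
        # ABSTRACTGATEWAY_DB_PATH overrides the default DB location.
        override = env.get("ABSTRACTGATEWAY_DB_PATH")
        paths["db"] = Path(override) if override else data_dir / "gateway.sqlite3"
    elif backend == "file":
        paths["commands"] = data_dir / "commands.jsonl"
        paths["commands_cursor"] = data_dir / "commands_cursor.json"
    else:
        raise ValueError(f"unknown store backend: {backend!r}")
    return paths


file_paths = resolve_store_paths({"ABSTRACTGATEWAY_DATA_DIR": "runtime/gateway"})
sqlite_paths = resolve_store_paths({
    "ABSTRACTGATEWAY_DATA_DIR": "runtime/gateway",
    "ABSTRACTGATEWAY_STORE_BACKEND": "sqlite",
})
```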

### How do I switch to SQLite? Can I migrate?

- Switch by setting `ABSTRACTGATEWAY_STORE_BACKEND=sqlite` (and optionally `ABSTRACTGATEWAY_DB_PATH`).
- Migrate file → SQLite with `abstractgateway migrate --from=file --to=sqlite ...` (best-effort local migration).

Evidence: `src/abstractgateway/stores.py`, `src/abstractgateway/migrate.py`, CLI wiring in `src/abstractgateway/cli.py`.

## Runs, ledger, commands

### What is the ledger, and what does `after` mean?

- The ledger is an **append-only** list of step records.
- `after` is a cursor meaning “number of records already consumed”; responses return `next_after`.
- SSE streams ledger updates, but clients should always reconnect by replaying from the last cursor.

Evidence: `GET /runs/{run_id}/ledger` and `/ledger/stream` in `src/abstractgateway/routes/gateway.py`.
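The cursor semantics can be sketched with an in-memory ledger. `fetch_page` is a hypothetical stand-in for the ledger endpoint; the `items` key and the `limit` parameter are assumptions for illustration, while `after` and `next_after` follow the description above.

```python
from typing import Callable, Dict, List, Tuple

# In-memory stand-in for a run's append-only ledger (illustration only).
LEDGER: List[dict] = [{"step": i} for i in range(5)]


def fetch_page(after: int, limit: int = 2) -> Dict[str, object]:
    """Return the records that follow the cursor, plus the next cursor."""
    items = LEDGER[after:after + limit]
    return {"items": items, "next_after": after + len(items)}


def replay(fetch: Callable[[int], Dict[str, object]], after: int = 0) -> Tuple[List[dict], int]:
    """Read pages from the cursor until a page comes back empty."""
    seen: List[dict] = []
    while True:
        page = fetch(after)
        items = page["items"]
        if not items:
            return seen, after
        seen.extend(items)
        after = int(page["next_after"])


records, cursor = replay(fetch_page)

# After a disconnect, the client resumes from its saved cursor and
# receives only the records appended since then.
LEDGER.append({"step": 5})
new_records, new_cursor = replay(fetch_page, after=cursor)
```

Because the cursor counts records already consumed, a client that persists it can drop an SSE connection at any point and recover without gaps or duplicates.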

### How do durable commands work? When do they take effect?

`POST /api/gateway/commands` appends a command record to a durable inbox.
The background runner polls the inbox and applies commands asynchronously.

Supported command types:
`pause|resume|cancel|emit_event|update_schedule|compact_memory`

Evidence: `submit_command` in `src/abstractgateway/routes/gateway.py`, command application in `src/abstractgateway/runner.py`.
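The "append now, apply later" behaviour can be modelled with an append-only list and a cursor. This is a conceptual sketch, not the gateway's store: the class, the `run_id` field, and the record shape are assumptions; the command types are the ones listed above.

```python
from dataclasses import dataclass, field
from typing import List

SUPPORTED_COMMAND_TYPES = {
    "pause", "resume", "cancel", "emit_event", "update_schedule", "compact_memory",
}


@dataclass
class CommandInbox:
    """Append-only inbox plus a cursor of how many records were applied."""
    records: List[dict] = field(default_factory=list)
    cursor: int = 0

    def append(self, run_id: str, command_type: str) -> int:
        """Accept a command durably; it is not applied yet."""
        if command_type not in SUPPORTED_COMMAND_TYPES:
            raise ValueError(f"unsupported command type: {command_type}")
        self.records.append({"run_id": run_id, "type": command_type})
        return len(self.records)

    def poll(self) -> List[dict]:
        """Runner side: return unapplied commands and advance the cursor."""
        pending = self.records[self.cursor:]
        self.cursor = len(self.records)
        return pending


inbox = CommandInbox()
inbox.append("run-1", "pause")
inbox.append("run-1", "resume")
applied_before_poll = inbox.cursor       # nothing applied until the runner polls
first_batch = inbox.poll()
second_batch = inbox.poll()              # empty: everything was already applied
```

The practical consequence for clients is that a successful submit means "accepted", not "applied"; the effect shows up later in the run state or ledger.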

### Can I schedule a workflow to run periodically?

Yes (bundle mode).

Use `POST /api/gateway/runs/schedule` to start a scheduled parent run that launches the target workflow as child runs over time.

Notes:
- `interval` supports compact durations like `15m`, `1h`, `2d`.
- If `interval` is set and `repeat_count` is omitted, the schedule repeats forever (until you cancel it).
- To stop the schedule, cancel the scheduled parent run via `POST /api/gateway/commands` with type `cancel`.

Evidence: `ScheduleRunRequest` + `start_scheduled_run` in `src/abstractgateway/routes/gateway.py`.
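To validate an `interval` value client-side before submitting a schedule, a compact-duration parser can be sketched as below. The unit set is limited to the three examples above (`m`, `h`, `d`); the gateway's own parser may accept more forms, so treat this as an illustration rather than the accepted grammar.

```python
import re

# Units taken from the examples above: minutes, hours, days.
_UNIT_SECONDS = {"m": 60, "h": 3600, "d": 86400}
_INTERVAL_RE = re.compile(r"^(\d+)([mhd])$")


def parse_interval(text: str) -> int:
    """Convert a compact duration such as '15m' into seconds."""
    match = _INTERVAL_RE.match(text.strip().lower())
    if match is None:
        raise ValueError(f"not a compact duration: {text!r}")
    amount, unit = match.groups()
    return int(amount) * _UNIT_SECONDS[unit]


quarter_hour = parse_interval("15m")
one_hour = parse_interval("1h")
two_days = parse_interval("2d")
```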

## Bundles and workflow execution

### How do I run a specific bundle version?

When starting runs in bundle mode you can select versions in two ways:
- pass `bundle_id` + `bundle_version`
- or pass a namespaced `flow_id` like `bundle@version:flow` (this also works for selecting “latest” via `bundle:flow`)

Evidence: bundle selection in `src/abstractgateway/hosts/bundle_host.py` (`start_run`).
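The namespaced form can be split into its parts with a few lines of Python. This is a sketch for client-side handling only: the `FlowRef` type and function name are hypothetical, and the version and flow names in the example are made up.

```python
from typing import NamedTuple, Optional


class FlowRef(NamedTuple):
    bundle_id: str
    bundle_version: Optional[str]  # None means "latest"
    flow: str


def parse_flow_id(flow_id: str) -> FlowRef:
    """Split 'bundle@version:flow' or 'bundle:flow' into its parts."""
    bundle_part, sep, flow = flow_id.partition(":")
    if not sep or not bundle_part or not flow:
        raise ValueError(f"expected 'bundle[@version]:flow', got {flow_id!r}")
    bundle_id, _, version = bundle_part.partition("@")
    if not bundle_id:
        raise ValueError(f"missing bundle id in {flow_id!r}")
    return FlowRef(bundle_id, version or None, flow)


pinned = parse_flow_id("basic-agent@0.1.0:main")
latest = parse_flow_id("basic-agent:main")
```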

### My bundle fails with “LLM/tool execution requires AbstractCore integration”

AbstractRuntime’s AbstractCore integration is included by the base
`abstractgateway` install. If this error appears, verify the installed package set with
`pip show AbstractRuntime abstractcore`.

Evidence: `src/abstractgateway/hosts/bundle_host.py` (imports under `needs_llm/needs_tools`).

### My bundle fails with “LLM nodes but no default provider/model is configured”

Configure the execution-host `output.text` route:

```bash
abstractgateway-config set-default output.text \
  --provider lmstudio \
  --model qwen/qwen3.6-35b-a3b \
  --base-url http://127.0.0.1:1234/v1
```

Alternatives:
- Pin provider/model on at least one `llm_call` or `agent` node; the gateway scans the flow JSON for defaults.
- Keep provider secrets in `abstractcore-config`; use `--base-url` on the capability route when the
  selected provider endpoint is not the provider default.

Evidence: `_scan_flows_for_llm_defaults` + provider/model selection in `src/abstractgateway/hosts/bundle_host.py`.

### Why do tool calls not execute?

In bundle mode, tool execution is controlled by:

- `ABSTRACTGATEWAY_TOOL_MODE=approval` (default): safe tools execute immediately; dangerous/unknown tools pause for explicit approval.
- `ABSTRACTGATEWAY_TOOL_MODE=passthrough`: approval required for *all* tools (including safe ones); after approval, the runtime executes the tool batch in-process.
- `ABSTRACTGATEWAY_TOOL_MODE=delegated`: tools are not executed locally; workflows enter a durable `JOB` wait for external executors.
- `ABSTRACTGATEWAY_TOOL_MODE=local` (or `local_all`): tools execute inside the gateway process without approval (dev only; unsafe).

Evidence: tool executor selection in `src/abstractgateway/hosts/bundle_host.py`.
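The four modes reduce to a small decision table. The sketch below restates the behaviour described above; the function name and the returned labels are hypothetical and do not correspond to identifiers in the gateway code.

```python
def tool_decision(mode: str, tool_is_safe: bool) -> str:
    """Describe what happens to a tool call under a given tool mode."""
    mode = mode.strip().lower()
    if mode == "approval":
        # Safe tools run immediately; dangerous/unknown tools pause.
        return "execute" if tool_is_safe else "wait_for_approval"
    if mode == "passthrough":
        # Every tool, including safe ones, needs approval first.
        return "wait_for_approval"
    if mode == "delegated":
        # Nothing runs locally; the run waits for an external executor.
        return "wait_for_external_job"
    if mode in ("local", "local_all"):
        # Dev only: runs in the gateway process without approval.
        return "execute"
    raise ValueError(f"unknown tool mode: {mode!r}")
```

When a tool call appears stuck, checking which branch applies usually explains it: the run is either waiting for approval or waiting for an external executor.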

### Why do `/voice/tts` or `/audio/transcribe` fail with “capability unavailable”?

Those endpoints are surfaced through Runtime's voice/audio integration path and
the required capability packages are included by the base `abstractgateway`
install. Verify the installed package set with:

```bash
pip show abstractgateway AbstractRuntime abstractcore
```

By default, the gateway allows the configured voice backend to download models on first use. If you disabled downloads (or want to enable them explicitly), set:

```bash
export ABSTRACTGATEWAY_VOICE_ALLOW_DOWNLOADS=1
```

For remote/OpenAI-compatible voice backends, configure the Gateway-scoped voice
environment variables in the gateway process, for example
`ABSTRACTGATEWAY_VOICE_TTS_ENGINE`, `ABSTRACTGATEWAY_VOICE_STT_ENGINE`,
`ABSTRACTGATEWAY_VOICE_REMOTE_BASE_URL`, and
`ABSTRACTGATEWAY_VOICE_REMOTE_API_KEY` (legacy `ABSTRACTVOICE_*` names still
work).

### How do I enable generated images, edited images, generated music, or other Runtime-managed multimodal outputs?

Use the base install for the Gateway control plane and remote/provider-backed
routes:

```bash
pip install abstractgateway
```

The base install includes Runtime-owned tool and multimodal integration and can
proxy to configured remote/provider routes. Remote embeddings are supported
through the `embedding.text` capability route when it points at OpenAI,
OpenRouter, Portkey, LM Studio, vLLM, another OpenAI-compatible endpoint, or a
remote AbstractCore server. Local sentence-transformer embeddings and
hardware-local image, audio, voice, and music engines are explicit opt-ins so a
light Linux install does not pull PyTorch/CUDA packages. Use
`abstractgateway[apple]` or `abstractgateway[gpu]` only when this Gateway host
should execute those local engines itself.

Generated images and videos are available both inside Runtime workflows and
through Gateway's direct run-scoped endpoints:

```text
POST /api/gateway/runs/{run_id}/images/generate
POST /api/gateway/runs/{run_id}/images/edit
POST /api/gateway/runs/{run_id}/videos/generate
POST /api/gateway/runs/{run_id}/videos/from_image
```

The direct image and video endpoints use Runtime/Core output selectors and
store the result as a run artifact, so they still require a configured
Runtime-compatible vision/video backend. For long video runs, stream the
returned `child_run_id` ledger and watch `abstract.progress` records.

Generated music is exposed through Gateway's direct Runtime child-run route and
its thin-client discovery/catalog contract:

```text
POST /api/gateway/runs/{run_id}/music/generate
GET /api/gateway/audio/music/providers
GET /api/gateway/audio/music/models
```

Higher apps should feature-detect music from
`capabilities.contracts.flow_editor.media.generated_music` or
`capabilities.contracts.assistant.media.generated_music`.
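Feature detection against those contract paths can be sketched as a dotted-path lookup. The sample payload below is hypothetical, and the shape of the value stored at each path is not specified here, so the sketch treats any truthy value as "advertised".

```python
from collections.abc import Mapping
from typing import Any

MUSIC_CONTRACT_PATHS = (
    "capabilities.contracts.flow_editor.media.generated_music",
    "capabilities.contracts.assistant.media.generated_music",
)


def lookup(payload: Any, dotted_path: str) -> Any:
    """Walk nested mappings by dotted path; return None when a key is missing."""
    node = payload
    for key in dotted_path.split("."):
        if not isinstance(node, Mapping) or key not in node:
            return None
        node = node[key]
    return node


def supports_generated_music(discovery: Mapping) -> bool:
    """True when either contract path is present and truthy."""
    return any(bool(lookup(discovery, path)) for path in MUSIC_CONTRACT_PATHS)


# Hypothetical discovery payload, for illustration only.
sample = {"capabilities": {"contracts": {"assistant": {"media": {"generated_music": True}}}}}
music_available = supports_generated_music(sample)
```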

### What is `voice.listen` in the capabilities contract?

`voice.listen` is not a live server-side microphone transport. It is a
higher-app contract that tells clients how to handle local capture:

- capture audio on the client or host device
- either upload it to `POST /api/gateway/runs/{run_id}/audio/transcribe`
- or emit the configured event/command into the run contract

This keeps live capture UX owned by higher apps such as Assistant or Observer
while Gateway stays responsible for durable runs, artifacts, and transcription.

### Are catalog responses fully normalized by Gateway?

Not yet.

Gateway already owns the route family and the thin-client contract pointers for
provider/model discovery, but some catalog response bodies still preserve
lower-layer shape differences. Higher apps like Flow currently normalize a few
legacy variants when reading model/provider catalogs.

What is stable today:

- which discovery routes exist
- which contract fields point at those routes
- which media tasks and direct endpoints are available

What is not yet versioned as a strict Gateway contract:

- one canonical provider/model catalog response envelope across text, vision,
  voice, STT, and music
- a dedicated deployment/readiness block for operator dashboards

### What does Gateway session prompt-cache orchestration include?

The `/api/gateway/prompt_cache/*` routes expose provider/model prompt-cache
controls when the active AbstractCore integration supports them. Gateway also
provides session lifecycle routes under
`/api/gateway/sessions/{session_id}/prompt_cache/*` for status, prepare,
rebuild, and clear using deterministic session keys.

This is Gateway-owned naming and orchestration over provider controls, not a
provider-independent local KV cache or full CachedSession persistence system.

### My bundle fails with “Visual Agent nodes require AbstractAgent”

AbstractAgent is included by the base `abstractgateway` install. Verify the
installed package set with:

```bash
pip show abstractgateway abstractagent
```

Evidence: agent workflow registration in `src/abstractgateway/hosts/bundle_host.py`.

### My bundle fails with “memory_kg_* nodes … install abstractmemory”

`memory_kg_*` nodes use Gateway's AbstractMemory TripleStore integration,
included by the base `abstractgateway` install.

Keep the default `lancedb` backend for durable vector-capable memory, use
`memory` for process-local dev/test memory, or set
`ABSTRACTGATEWAY_MEMORY_STORE_BACKEND=sqlite` only when your installed
AbstractMemory build exposes `SQLiteTripleStore`.

A fresh persistent store does not make KG memory unavailable. Capability
discovery treats the surface as available once AbstractMemory is installed and
the configured backend resolves; empty structured queries return empty results
until a flow asserts triples.

Evidence: memory KG wiring in `src/abstractgateway/memory_store.py` and
`src/abstractgateway/hosts/bundle_host.py`.

## Deployment

### How do I run API and runner as separate processes?

Run:

```bash
abstractgateway runner
abstractgateway serve --no-runner --host 127.0.0.1 --port 8080
```

The runner uses a lock file (`gateway_runner.lock`) to prevent double-ticking on the same data dir.

Evidence: CLI flag `--no-runner` in `src/abstractgateway/cli.py`, lock acquisition in `src/abstractgateway/runner.py`.
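The idea behind the lock file can be illustrated with an exclusive-create sketch. This is an assumption-laden illustration: the real runner may use a different mechanism (for example OS-level advisory locks, which also handle stale files after a crash), and the helper names are hypothetical.

```python
import os
import tempfile
from typing import Optional

LOCK_NAME = "gateway_runner.lock"


def try_acquire_lock(data_dir: str) -> Optional[int]:
    """Create the lock file exclusively; return None if another holder exists."""
    path = os.path.join(data_dir, LOCK_NAME)
    try:
        return os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return None


def release_lock(data_dir: str, fd: int) -> None:
    """Close and remove the lock file so another runner can start."""
    os.close(fd)
    os.remove(os.path.join(data_dir, LOCK_NAME))


with tempfile.TemporaryDirectory() as data_dir:
    first = try_acquire_lock(data_dir)
    second = try_acquire_lock(data_dir)      # a second runner on the same data dir
    first_acquired = first is not None
    second_blocked = second is None
    release_lock(data_dir, first)
    third = try_acquire_lock(data_dir)       # succeeds once the lock is released
    third_acquired = third is not None
    release_lock(data_dir, third)
```

The lock is scoped to the data dir, which is why split deployments must point both processes at the same `ABSTRACTGATEWAY_DATA_DIR`.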

## Related docs

- Docs index: [README.md](docs/README.md)
- Getting started: [getting-started.md](docs/getting-started.md)
- API overview: [api.md](docs/api.md)
- Security: [security.md](docs/security.md)
- Configuration: [configuration.md](docs/configuration.md)
- Architecture: [architecture.md](docs/architecture.md)
- Operator tooling (optional): [maintenance.md](docs/maintenance.md)

---

## docs/maintenance.md

# AbstractGateway — Operator tooling (optional)

`/api/gateway/*` includes “operator tooling” endpoints used by higher-level UIs and workflows (reports inbox, triage queue, backlog helpers, process manager, file/attachment helpers, …). These features are **not required** to use AbstractGateway as a durable run gateway.

This document groups the main non-core features and how to enable them safely.

## Safety model (read this first)

Some endpoints can:
- write files under `ABSTRACTGATEWAY_DATA_DIR`
- read files from configured workspace mounts
- start/stop local processes (process manager)
- execute queued backlog tasks (backlog exec runner)

Only enable these features on **trusted machines** and keep gateway auth enabled.
Security enforcement for `/api/gateway/*` is in `src/abstractgateway/security/gateway_security.py`.

## Reports inbox + triage queue

Implemented in `src/abstractgateway/routes/gateway.py` and `src/abstractgateway/maintenance/*`.

Key endpoints:
- `POST /api/gateway/bugs/report`
- `POST /api/gateway/features/report`
- `GET /api/gateway/reports/bugs` / `GET /api/gateway/reports/features`
- `POST /api/gateway/triage/run`
- `GET /api/gateway/triage/decisions`

CLI helpers:
- `abstractgateway triage-reports` (scan inbox → decision queue; optional draft writing)
- `abstractgateway triage-apply <decision_id> approve|reject|defer`

Notification helpers used by `triage-reports --notify`:
- Telegram: `ABSTRACT_BACKLOG_TELEGRAM_CHAT_ID` or `ABSTRACT_TRIAGE_TELEGRAM_CHAT_ID`
- Email recipients: `ABSTRACT_BACKLOG_EMAIL_TO` or `ABSTRACT_TRIAGE_EMAIL_TO`
- Optional email account override: `ABSTRACT_BACKLOG_EMAIL_ACCOUNT` or `ABSTRACT_TRIAGE_EMAIL_ACCOUNT`

Evidence: CLI wiring in `src/abstractgateway/cli.py`.

## Backlog browsing/editing (repo-dependent)

The gateway also exposes endpoints that read/write backlog Markdown files in a repository layout that includes `docs/backlog/*`.

To enable these endpoints, set the repo root:

```bash
export ABSTRACTGATEWAY_TRIAGE_REPO_ROOT="/path/to/your/repo"
```

Evidence: repo-root checks in `src/abstractgateway/routes/gateway.py` (process manager + backlog endpoints) and in `src/abstractgateway/maintenance/backlog_exec_runner.py`.

## Backlog execution runner (high risk; disabled by default)

The backlog exec runner consumes queued execution requests under `<DATA_DIR>/backlog_exec_queue/` and executes them (optionally using the `codex` CLI).

Enable:

```bash
export ABSTRACTGATEWAY_BACKLOG_EXEC_RUNNER=1
export ABSTRACTGATEWAY_BACKLOG_EXECUTOR="none"   # none|codex_cli|workflow_bundle
```

Additional knobs (see `BacklogExecRunnerConfig.from_env()`):
- `ABSTRACTGATEWAY_BACKLOG_EXEC_POLL_S`
- `ABSTRACTGATEWAY_BACKLOG_EXEC_WORKERS`
- `ABSTRACTGATEWAY_BACKLOG_CODEX_BIN`
- `ABSTRACTGATEWAY_BACKLOG_CODEX_MODEL`
- `ABSTRACTGATEWAY_BACKLOG_CODEX_REASONING_EFFORT` (`low|medium|high|xhigh`)
- `ABSTRACTGATEWAY_BACKLOG_CODEX_SANDBOX`
- `ABSTRACTGATEWAY_BACKLOG_CODEX_APPROVALS`

Evidence: `src/abstractgateway/service.py` (runner startup), `src/abstractgateway/maintenance/backlog_exec_runner.py`.

## Process manager (dev-only; disabled by default)

The process manager can start/stop a small allowlisted set of local processes and tail logs. It is intended for **trusted dev machines**.

Notes:
- **Process control** (`/api/gateway/processes`, start/stop, log tail) is **repo-root scoped** for safety and assumes a monorepo-style checkout (scripts like `./build.sh`, `./agw-uat.sh`, …).
- **Env-var management** (`/api/gateway/processes/env`) is **repo-root independent** and works in packaged installs (it persists under `ABSTRACTGATEWAY_DATA_DIR`).

Enable:

```bash
export ABSTRACTGATEWAY_ENABLE_PROCESS_MANAGER=1

# Optional (process control only): the AbstractFramework checkout root
export ABSTRACTGATEWAY_TRIAGE_REPO_ROOT="$PWD"
```

Optional config path:

```bash
export ABSTRACTGATEWAY_PROCESS_MANAGER_CONFIG="$PWD/runtime/gateway/processes.json"
```

Endpoints:
- `GET /api/gateway/processes` (requires `ABSTRACTGATEWAY_TRIAGE_REPO_ROOT`)
- `POST /api/gateway/processes/{id}/start|stop|restart|redeploy`
- `GET /api/gateway/processes/{id}/logs/tail`
- `GET /api/gateway/processes/env` (metadata only; never returns values; does not require repo root)
- `POST /api/gateway/processes/env` (write-only set/unset for allowlisted keys; does not require repo root)

Evidence: `src/abstractgateway/routes/gateway.py` (endpoint guards) and `src/abstractgateway/maintenance/process_manager.py`.
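
As an illustration of the process action endpoints, the sketch below builds (without sending) a `POST` request using only the standard library. `build_process_request` is a hypothetical helper, and the bearer-token `Authorization` header is an assumption about the auth scheme; check [security.md](docs/security.md) for the actual requirement.

```python
import urllib.parse
import urllib.request

# Actions taken from the endpoint list in this section.
PROCESS_ACTIONS = ("start", "stop", "restart", "redeploy")


def build_process_request(base_url, token, process_id, action):
    """Build (but do not send) a POST request for a process action."""
    if action not in PROCESS_ACTIONS:
        raise ValueError(f"unsupported action: {action!r}")
    # Percent-encode the id so unusual characters cannot alter the path.
    safe_id = urllib.parse.quote(process_id, safe="")
    url = f"{base_url.rstrip('/')}/api/gateway/processes/{safe_id}/{action}"
    # Assumption: bearer-token auth; the real header format may differ.
    headers = {"Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, data=b"", headers=headers, method="POST")
```

To send the request, pass the result to `urllib.request.urlopen(...)`. The base URL and token are whatever your deployment uses; none are implied here.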

### Env var allowlist (write-only)

Env var editing is allowlist-only and values are write-only (they are never returned to the client). Overrides are persisted on the gateway host under:
- `<ABSTRACTGATEWAY_DATA_DIR>/process_manager/env_overrides.json`

When the gateway starts with `ABSTRACTGATEWAY_ENABLE_PROCESS_MANAGER=1` set, it loads the persisted overrides and applies them to its own `os.environ` (best-effort).

To extend the allowlist, update:
- `src/abstractgateway/maintenance/process_manager.py` → `managed_env_var_allowlist()`
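
The "allowlist-only, best-effort" behavior described above can be sketched roughly as follows. This is not the gateway's implementation: `apply_env_overrides` is a hypothetical helper, and the on-disk format (a flat JSON object mapping variable names to string values) is an assumption, since the layout of `env_overrides.json` is not documented here.

```python
import json
import os
from pathlib import Path


def apply_env_overrides(path, allowlist, environ=None):
    """Apply allowlisted overrides from a JSON file; return the names applied."""
    environ = os.environ if environ is None else environ
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Best-effort: a missing or corrupt file changes nothing.
        return []
    if not isinstance(data, dict):
        return []

    applied = []
    for name, value in data.items():
        # Keys outside the allowlist and non-string values are skipped.
        if name in allowlist and isinstance(value, str):
            environ[name] = value
            applied.append(name)
    return sorted(applied)
```

The function returns only the names it applied, never the values, which mirrors the write-only contract of the endpoint.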

## File + attachment helpers (thin-client support)

The gateway exposes helpers used by thin clients and workflows:
- Workspace policy: `GET /api/gateway/workspace/policy`
- File access: `GET /api/gateway/files/search|read|skim`
- Attachments: `POST /api/gateway/attachments/ingest` and `POST /api/gateway/attachments/upload`

Workspace scope is **operator-controlled at gateway launch**:

- Default (safe): thin clients cannot expand filesystem scope. If a run is started without `workspace_root`, the gateway creates a per-run workspace under `<ABSTRACTGATEWAY_DATA_DIR>/workspaces/<uuid>`, and filesystem-ish tool calls are scoped to that workspace (`workspace_access_mode=workspace_only`).
- Allowlist additional roots for file helpers via `ABSTRACTGATEWAY_WORKSPACE_DIR` + `ABSTRACTGATEWAY_WORKSPACE_MOUNTS`.
- Permissive mode (trusted machines only): enable client-provided `workspace_*` overrides (including `all_except_ignored`) via `ABSTRACTGATEWAY_ALLOW_CLIENT_WORKSPACE_SCOPE=1` (or `ABSTRACTGATEWAY_TRUST_CLIENT_WORKSPACE_SCOPE=1`).

Note: `/api/gateway/files/*` and `/api/gateway/attachments/ingest` ignore client-provided scope overrides unless permissive mode is enabled (`ABSTRACTGATEWAY_ALLOW_CLIENT_WORKSPACE_SCOPE=1`).
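
To make `workspace_only` scoping concrete, here is a minimal sketch of a containment check: a path is accepted only if it resolves inside one of the allowed roots. `is_within_roots` is a hypothetical helper written for illustration; the gateway's own policy helpers live in `src/abstractgateway/routes/gateway.py` and may work differently.

```python
from pathlib import Path


def is_within_roots(candidate, roots):
    """Return True if candidate resolves to a location inside one of roots."""
    # resolve() normalizes ".." segments and follows symlinks, so traversal
    # tricks such as "run1/../run2" are judged by their real target.
    resolved = Path(candidate).resolve()
    for root in roots:
        root_resolved = Path(root).resolve()
        # Compare whole path components, not string prefixes, so that
        # "/ws/run10" is not mistaken for a child of "/ws/run1".
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False
```

Comparing path components rather than raw strings is the key design choice: a plain `startswith` check would accept sibling directories that merely share a prefix with an allowed root.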

Server-side workspace mounts (operator-controlled):

```bash
# newline-separated: name=/absolute/path
export ABSTRACTGATEWAY_WORKSPACE_MOUNTS=$'repo=/abs/path/to/repo\ndata=/abs/path/to/data'
```
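
The `name=/absolute/path` format can be parsed in a few lines. The sketch below is an illustration only: `parse_workspace_mounts` is a hypothetical function, and rejecting malformed entries with an error (rather than skipping them) is an assumption; the gateway's `_workspace_mounts()` may behave differently. POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath


def parse_workspace_mounts(raw):
    """Parse newline-separated `name=/absolute/path` entries into a dict."""
    mounts = {}
    for line in raw.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between entries
        name, sep, path = line.partition("=")
        name, path = name.strip(), path.strip()
        # Assumption: entries need a name, an "=" and an absolute path.
        if not sep or not name or not PurePosixPath(path).is_absolute():
            raise ValueError(f"invalid mount entry: {line!r}")
        mounts[name] = path
    return mounts
```

For example, `parse_workspace_mounts(os.environ.get("ABSTRACTGATEWAY_WORKSPACE_MOUNTS", ""))` (with `import os`) would return a mapping of mount names to paths, or an empty dict when the variable is unset.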

Evidence: `_workspace_mounts()` and related policy helpers in `src/abstractgateway/routes/gateway.py`, tests in `tests/test_gateway_workspace_policy_enforcement.py`.

## Bridges (Telegram, email)

Background bridges can ingest external messages and start durable runs (thin-client semantics), and may also emit events for specialized workflows.

Enable (Telegram):
- `ABSTRACT_TELEGRAM_BRIDGE=1`
- Transport + credentials depend on configuration (see `src/abstractgateway/integrations/telegram_bridge.py`):
  - Bot API (default when token is present): `ABSTRACT_TELEGRAM_BOT_TOKEN=...`
  - TDLib (E2EE): `ABSTRACT_TELEGRAM_TRANSPORT=tdlib` + TDLib setup
- Access control (fail-closed defaults):
  - DMs default to allowlist: set `ABSTRACT_TELEGRAM_ALLOWED_USERS=...` (numeric Telegram user_id; discover via `/whoami`)
  - Groups default to disabled (opt-in via `ABSTRACT_TELEGRAM_GROUP_POLICY=allowlist|open`)
- Optional: override which workflow to run per message:
  - `ABSTRACT_TELEGRAM_BUNDLE_ID=...`
  - `ABSTRACT_TELEGRAM_FLOW_ID=...`
  - Default (when unset): shipped `basic-agent` bundle entrypoint.
- Tool approvals:
  - `ABSTRACTGATEWAY_TOOL_MODE=approval` (default): safe tools run in-process; dangerous/unknown tools require `/approve` or `/deny`.
  - `ABSTRACTGATEWAY_TOOL_MODE=passthrough`: approval required for *all* tools (including safe ones); after approval, the runtime executes the tool batch in-process.
  - `ABSTRACTGATEWAY_TOOL_MODE=delegated`: tools are not executed locally; workflows enter a durable `JOB` wait for external executors.
- Optional knobs:
  - Telegram-only routing override: `ABSTRACT_TELEGRAM_MODEL` (and optionally `ABSTRACT_TELEGRAM_PROVIDER`)
  - Durable history limit: `ABSTRACT_TELEGRAM_MAX_HISTORY_MESSAGES`
  - `/reset` controls: `ABSTRACT_TELEGRAM_RESET_DELETE_MESSAGES`, `ABSTRACT_TELEGRAM_RESET_DELETE_MAX`, `ABSTRACT_TELEGRAM_RESET_MESSAGE`
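
The three `ABSTRACTGATEWAY_TOOL_MODE` values under "Tool approvals" amount to a small decision table. The sketch below restates that table in code; `ToolAction` and `decide_tool_action` are hypothetical names used for illustration, and `tool_is_safe` stands in for however the runtime classifies a tool (unknown tools count as not safe).

```python
from enum import Enum


class ToolAction(Enum):
    RUN_IN_PROCESS = "run_in_process"
    WAIT_FOR_APPROVAL = "wait_for_approval"
    WAIT_FOR_EXTERNAL_EXECUTOR = "wait_for_external_executor"


def decide_tool_action(mode, tool_is_safe):
    """Map a tool mode and a tool's safety classification to the next step."""
    if mode == "approval":
        # Safe tools run directly; dangerous/unknown tools need /approve or /deny.
        if tool_is_safe:
            return ToolAction.RUN_IN_PROCESS
        return ToolAction.WAIT_FOR_APPROVAL
    if mode == "passthrough":
        # Every tool needs approval first, then runs in-process.
        return ToolAction.WAIT_FOR_APPROVAL
    if mode == "delegated":
        # Nothing runs locally; the workflow waits for an external executor.
        return ToolAction.WAIT_FOR_EXTERNAL_EXECUTOR
    raise ValueError(f"unknown tool mode: {mode!r}")
```

Raising on an unrecognized mode keeps the sketch fail-closed, in line with the access-control defaults described above.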

Enable (Email):
- `ABSTRACT_EMAIL_BRIDGE=1`
- IMAP credentials + polling config (see `src/abstractgateway/integrations/email_bridge.py`)

Evidence: bridge startup in `src/abstractgateway/service.py` (`start_gateway_runner`).

## Email inbox endpoints (AbstractObserver Inbox → Email)

If email accounts are configured on the gateway host, the gateway exposes account-scoped endpoints used by AbstractObserver to list/read/send emails:
- `GET /api/gateway/email/accounts`
- `GET /api/gateway/email/messages`
- `GET /api/gateway/email/messages/{uid}`
- `POST /api/gateway/email/send`

These endpoints proxy through Gateway's Runtime comms facade and never accept arbitrary IMAP/SMTP host/user secrets from the browser.

Configuration notes:
- Multi-account: set `ABSTRACT_EMAIL_ACCOUNTS_CONFIG=/path/to/emails.yaml` (recommended).
- Single-account env fallback: `ABSTRACT_EMAIL_IMAP_*` / `ABSTRACT_EMAIL_SMTP_*`.

Evidence: `/api/gateway/email/*` routes in `src/abstractgateway/routes/gateway.py` which call the Runtime AbstractCore comms facade.

## Related docs

- API overview (core client contract): [api.md](docs/api.md)
- Security: [security.md](docs/security.md)
- FAQ: [faq.md](docs/faq.md)
