RT6 — Universal Data Adapter (UDA)¶
Durable Epic plan for the nickpclarke/backend-core spoke: a FastAPI-based Universal Data Adapter
that provides a single, provider-neutral query surface over heterogeneous storage backends (ArcadeDB,
Postgres, object storage, and future connectors), with per-connection RBAC, query caching, pagination,
and observability.
The backend-core repo board is the source of truth for status and issue numbers. Issues for this RT live at
https://github.com/nickpclarke/backend-core/issues. Stable local IDs (UDA-*) below are the durable planning references. Hub Epic cross-reference: `UDA-E` (Epic #13 in backend-core).
Theme¶
Universal Data Adapter: give every spoke and agent a single, authenticated, paginated query surface over the platform's heterogeneous storage layer — ArcadeDB (graph/ontology), Postgres (relational), object storage (blobs), and future connectors — without coupling consumers to backend specifics. The UDA is the data-plane contract that RT7 (middle-core runtime) and RT8 (agent analytics tools) depend on; its OpenAPI schema is the integration boundary.
This RT is a spoke-implementation train — all deliverables are in nickpclarke/backend-core.
Summary¶
| Epic | Phase | Features / Enablers | PI | Status |
|---|---|---|---|---|
| UDA-E Universal Data Adapter | Phase 1 (shipped) | Core adapter + ArcadeDB connector | PI-2 | Shipped via PR #31 |
| UDA-E | Phase 2 (in-flight) | Issues #33–#44 | PI-2/PI-3 | In progress |
| UDA-E | Phase 3 (planned) | Issues #45–#50 (Postgres, object storage, pagination, RBAC, caching, OpenLineage) | PI-3 | Not started |
Total Phase 3: 6 items (5 Features + 1 Spike), which is the planned work described in this document. Session estimate (Phase 3): ~3–4 sessions (implementation-heavy: real connectors, RBAC middleware, cache layer, observability spike).
Live issues:
- Epic: https://github.com/nickpclarke/backend-core/issues/13
- Phase 2 in-flight: https://github.com/nickpclarke/backend-core/issues #33–#44
- Phase 3 planned: #45–#50
Phase 1 — Shipped (reference only)¶
Phase 1 delivered the foundational UDA: FastAPI app skeleton, ArcadeDB HTTP connector, query dispatch layer, OpenAPI baseline, and Docker + CI. Merged to main in PR #31.
Phase 2 — In-flight (issues #33–#44)¶
Issues #33–#44 are active work tracked on the backend-core board. This document does not re-specify them in detail; see the live issues for acceptance criteria. Key themes in Phase 2:
- Connector abstraction hardening
- Query result normalisation
- Error handling and retry semantics
- OpenAPI schema refinement
Phase 3 — Backlog items (issues #45–#50)¶
SAFe fields per item: `Type`, `PI`, `Size`, `Estimate` (Fibonacci pts), `Priority`. Definition of Ready = these set + acceptance criteria below. Definition of Done = PR merged with `Closes #N`, CI green, OpenAPI schema updated.
UDA-E — Epic: Universal Data Adapter (issue #13)¶
- Type: Epic · PI: PI-2 → PI-3 (multi-increment) · Priority: P0
- Outcome: A single FastAPI service exposes a provider-neutral query API over ArcadeDB, Postgres, and object storage; queries are paginated, cached, access-controlled per connection, and OpenLineage-traceable. The OpenAPI schema is the stable contract consumed by RT7 (MCR-F4 projections) and RT8 (GPM-F3 analytics tools).
- Children (Phase 3): UDA-F1, UDA-F2, UDA-F3, UDA-F4, UDA-F5, UDA-S1.
UDA-F1 — Feature: Postgres connector (issue #45)¶
- Type: Feature · Size: M · Estimate: 5 · Priority: P1 · Depends on: Phase 2 connector abstraction stable
- Scope: `connectors/postgres/`, an `asyncpg`-based connector; connection pool management; parameterised query execution; result normalisation to UDA canonical row format; schema introspection endpoint `GET /connectors/postgres/{id}/schema`; `DATABASE_URL` env var + Key Vault reference; integration test against a Postgres container in CI.
- Acceptance: `SELECT` and parameterised DML round-trip correctly; connection pool reused across requests (not re-created per call); schema endpoint returns table + column metadata; CI integration test runs against `postgres:16` Docker container.
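The "pool reused across requests" criterion can be illustrated with a lazily created, shared pool. This is a minimal sketch, not the connector's actual code: `PoolHolder` and the injected `create_pool` factory are hypothetical names. In the real connector the factory would presumably wrap `asyncpg.create_pool(dsn)`; a fake factory is used here so the example runs without a database.

```python
import asyncio
from typing import Any, Awaitable, Callable, Optional


class PoolHolder:
    """Hypothetical helper: creates one pool lazily and shares it."""

    def __init__(self, create_pool: Callable[[], Awaitable[Any]]) -> None:
        self._create_pool = create_pool
        self._pool: Optional[Any] = None
        self._lock: Optional[asyncio.Lock] = None

    async def get(self) -> Any:
        # Create the lock lazily so it belongs to the running event loop.
        if self._lock is None:
            self._lock = asyncio.Lock()
        # The lock stops concurrent first requests from each building a pool.
        async with self._lock:
            if self._pool is None:
                self._pool = await self._create_pool()
        return self._pool


async def _demo() -> int:
    created = 0

    async def fake_create_pool() -> object:
        # Stand-in for a real pool factory (assumption for this sketch).
        nonlocal created
        created += 1
        await asyncio.sleep(0)
        return object()

    holder = PoolHolder(fake_create_pool)
    # Five concurrent "requests" all ask for the pool at once.
    pools = await asyncio.gather(*(holder.get() for _ in range(5)))
    assert all(p is pools[0] for p in pools)
    return created


pools_created = asyncio.run(_demo())
```

In a FastAPI service the holder would typically be created at application startup and closed at shutdown, so request handlers only ever call `get()`.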
UDA-F2 — Feature: Object-storage connector (issue #46)¶
- Type: Feature · Size: M · Estimate: 5 · Priority: P2 · Depends on: Phase 2 connector abstraction stable; parallel with UDA-F1
- Scope: `connectors/object-storage/`, an Azure Blob Storage + S3-compatible adapter; `LIST /containers/{container}`, `GET /objects/{container}/{key}`, metadata-only mode; presigned-URL generation for large object download; `AZURE_STORAGE_CONNECTION_STRING` / `AWS_S3_BUCKET` env vars; streaming response for large objects.
- Acceptance: list + get round-trip against Azurite emulator in CI; presigned URL expires correctly; metadata-only mode returns content-length + content-type without body transfer.
UDA-F3 — Feature: Query pagination (issue #47)¶
- Type: Feature · Size: S · Estimate: 3 · Priority: P0 · Depends on: Phase 2 query dispatch stable
- Scope: Cursor-based pagination on all list/query endpoints (`?cursor=&limit=`); opaque cursor token (base64-encoded offset or keyset); `X-Next-Cursor` response header; max page size enforced (configurable, default 100); OpenAPI schema updated with pagination parameters and response envelope.
- Acceptance: first page returns `X-Next-Cursor`; following cursor returns the next page; last page returns no cursor; requesting beyond last page returns empty list (not 404); RT8 GPM-F3 analytics tools can iterate all results without client-side offset arithmetic.
- Cross-RT note: RT8 GPM-F3 (analytics tools over UDA) is blocked until this feature is stable. This is the highest-priority Phase 3 item.
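The cursor behaviour in the acceptance criteria can be sketched with an offset-based opaque token. This is an illustration under assumptions: the function names, the JSON payload inside the token, and the in-memory `rows` sequence are all hypothetical, and a keyset cursor would encode a sort key instead of an offset.

```python
import base64
import json
from typing import Any, List, Optional, Sequence, Tuple

MAX_PAGE_SIZE = 100  # the default maximum named in the scope


def encode_cursor(offset: int) -> str:
    """Wrap an offset in an opaque, URL-safe token."""
    raw = json.dumps({"o": offset}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: Optional[str]) -> int:
    """Return the offset in a token; a missing cursor means the first page."""
    if not cursor:
        return 0
    try:
        offset = json.loads(base64.urlsafe_b64decode(cursor.encode("ascii")))["o"]
    except (ValueError, KeyError, TypeError) as exc:
        raise ValueError("malformed cursor") from exc
    if not isinstance(offset, int) or offset < 0:
        raise ValueError("malformed cursor")
    return offset


def paginate(
    rows: Sequence[Any],
    cursor: Optional[str] = None,
    limit: int = MAX_PAGE_SIZE,
    max_limit: int = MAX_PAGE_SIZE,
) -> Tuple[List[Any], Optional[str]]:
    """Return one page plus the value for X-Next-Cursor (None on the last page)."""
    limit = max(1, min(limit, max_limit))  # enforce the page-size ceiling
    start = decode_cursor(cursor)
    page = list(rows[start:start + limit])
    end = start + len(page)
    next_cursor = encode_cursor(end) if end < len(rows) else None
    return page, next_cursor


def iterate_all(rows: Sequence[Any], limit: int) -> List[Any]:
    """Client-side loop: follow cursors until none is returned."""
    collected: List[Any] = []
    cursor: Optional[str] = None
    while True:
        page, cursor = paginate(rows, cursor, limit)
        collected.extend(page)
        if cursor is None:
            return collected
```

A consumer such as an analytics tool only needs the `iterate_all` pattern: it never inspects or computes the cursor, which is what keeps the token free to change from offset to keyset later.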
UDA-F4 — Feature: Per-connection RBAC (issue #48)¶
- Type: Feature · Size: M · Estimate: 5 · Priority: P1 · Depends on: UDA-F3 (pagination must be stable before RBAC layering)
- Scope: Connection-level access policy stored in config or ArcadeDB; JWT claims or API-key scope checked against the policy on every query; `403 Forbidden` with structured error body on deny; admin endpoint `POST /connections/{id}/policy` to set the policy; policy evaluation logged as OTel span attribute.
- Acceptance: query with insufficient scope returns 403; policy update takes effect without service restart; policy evaluation appears in OTel trace; existing public/unscoped queries unaffected when policy is not configured.
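The deny/allow rules above can be sketched as a pure policy check. Everything named here is an assumption for illustration: `PolicyStore`, `authorize`, the scope strings, and the shape of the error body are hypothetical, and the real policy may live in config or ArcadeDB rather than in memory.

```python
from dataclasses import dataclass
from typing import Any, Dict, FrozenSet, Iterable, Optional, Tuple


@dataclass(frozen=True)
class ConnectionPolicy:
    """Scopes a caller must hold to query one connection (hypothetical model)."""
    required_scopes: FrozenSet[str]


class PolicyStore:
    """In-memory stand-in for the policy backend."""

    def __init__(self) -> None:
        self._policies: Dict[str, ConnectionPolicy] = {}

    def set_policy(self, connection_id: str, scopes: Iterable[str]) -> None:
        # Replacing the entry takes effect on the next query, with no restart.
        self._policies[connection_id] = ConnectionPolicy(frozenset(scopes))

    def get(self, connection_id: str) -> Optional[ConnectionPolicy]:
        return self._policies.get(connection_id)


def authorize(
    store: PolicyStore, connection_id: str, caller_scopes: Iterable[str]
) -> Tuple[int, Optional[Dict[str, Any]]]:
    """Return (HTTP status, error body); the body is None when access is allowed."""
    policy = store.get(connection_id)
    if policy is None:
        # No policy configured: unscoped queries keep working.
        return 200, None
    missing = policy.required_scopes - set(caller_scopes)
    if missing:
        return 403, {
            "error": "forbidden",
            "connection_id": connection_id,
            "missing_scopes": sorted(missing),
        }
    return 200, None
```

Keeping the check free of framework types makes it easy to call from a FastAPI dependency and to record its result on a trace span, though how that wiring looks is left open here.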
UDA-F5 — Feature: Query caching (issue #49)¶
- Type: Feature · Size: M · Estimate: 5 · Priority: P2 · Depends on: UDA-F3, UDA-F4; parallel with UDA-S1
- Scope: In-process LRU cache (Redis-ready via pluggable backend); cache key = hash of `(connector ID, query, parameters)`; TTL configurable per connector; `Cache-Control` / `X-Cache-Hit` response headers; cache bypass via `Cache-Control: no-cache` request header; metrics: `uda_cache_hits_total`, `uda_cache_misses_total`.
- Acceptance: identical query within TTL returns `X-Cache-Hit: true`; mutating query bypasses cache; cache metrics appear in /metrics; TTL=0 disables caching for a connector.
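A minimal sketch of the cache key and the TTL/LRU behaviour follows. The class and function names are hypothetical, SHA-256 over a JSON encoding is one possible hash rather than a decided one, and the plain integer counters only stand in for the `uda_cache_hits_total` / `uda_cache_misses_total` metrics.

```python
import hashlib
import json
import time
from collections import OrderedDict
from typing import Any, Callable, Dict, Tuple


def cache_key(connector_id: str, query: str, parameters: Dict[str, Any]) -> str:
    """Hash of (connector ID, query, parameters), stable across dict ordering."""
    payload = json.dumps([connector_id, query, parameters], sort_keys=True, default=str)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class TTLLRUCache:
    """In-process LRU cache whose entries also expire after a TTL."""

    def __init__(self, max_entries: int = 1024,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._entries: "OrderedDict[str, Tuple[float, Any]]" = OrderedDict()
        self._max_entries = max_entries
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self.hits = 0
        self.misses = 0

    def get(self, key: str) -> Tuple[bool, Any]:
        """Return (hit, value); expired entries count as misses and are dropped."""
        entry = self._entries.get(key)
        if entry is None or entry[0] <= self._clock():
            self._entries.pop(key, None)
            self.misses += 1
            return False, None
        self._entries.move_to_end(key)  # mark as most recently used
        self.hits += 1
        return True, entry[1]

    def put(self, key: str, value: Any, ttl_seconds: float) -> None:
        if ttl_seconds <= 0:
            return  # TTL=0 disables caching for this connector
        self._entries[key] = (self._clock() + ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self._max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

The request path would call `get` only for non-mutating queries that do not carry `Cache-Control: no-cache`, and set `X-Cache-Hit` from the first element of the returned tuple.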
UDA-S1 — Spike: OpenLineage integration (issue #50)¶
- Type: Spike · Size: S · Estimate: 2 · Priority: P2 · Time-box: ½ session
- Question: Determine the minimal OpenLineage event model for UDA queries (dataset-in / dataset-out, job name, run ID) and whether the OpenLineage HTTP transport can be added as a passthrough without blocking query execution. Output: findings note + recommended event schema + estimated implementation size for a follow-up Feature.
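As a possible starting point for the spike (not a finding), the sketch below builds a run event with the core OpenLineage fields (`eventType`, `eventTime`, `run.runId`, `job`, `inputs`, `outputs`, `producer`, `schemaURL`) and queues it without blocking the caller. The function names, the namespace and job strings, and the bounded-queue hand-off are assumptions; the producer and schema URLs are left as caller-supplied values because the spike has not chosen them.

```python
import queue
import uuid
from datetime import datetime, timezone
from typing import Any, Dict, Iterable, Optional, Tuple

Dataset = Tuple[str, str]  # (namespace, name)


def build_run_event(
    event_type: str,
    job_namespace: str,
    job_name: str,
    inputs: Iterable[Dataset],
    outputs: Iterable[Dataset],
    producer: str,
    schema_url: str,
    run_id: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble a minimal run event as a plain dict ready for JSON encoding."""
    return {
        "eventType": event_type,
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [{"namespace": ns, "name": name} for ns, name in inputs],
        "outputs": [{"namespace": ns, "name": name} for ns, name in outputs],
        "producer": producer,
        "schemaURL": schema_url,
    }


def enqueue_event(pending: "queue.Queue[Dict[str, Any]]", event: Dict[str, Any]) -> bool:
    """Hand the event to a background sender; drop it rather than block a query."""
    try:
        pending.put_nowait(event)
        return True
    except queue.Full:
        return False
```

A background worker (not shown) would drain the queue and POST each event to the lineage endpoint. Whether dropping events under load is acceptable is exactly the kind of question the findings note should answer.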
Dependency graph¶
Phase 1 (shipped) — PR #31
└─ Phase 2 (in-flight) — #33–#44
└─ UDA-F3 (pagination — #47) ← P0; unblocks RT8 GPM-F3
├─ UDA-F4 (RBAC — #48)
│ └─ UDA-F5 (caching — #49)
├─ UDA-F1 (Postgres connector — #45) ← parallel with F2 after Phase 2 stable
└─ UDA-F2 (object-storage connector — #46) ← parallel with F1
UDA-S1 (OpenLineage spike — #50) ← independent; run any time
Critical path: UDA-F3 (pagination) is the highest-priority item — it unblocks RT8's analytics tooling (GPM-F3) which is part of the generative platform maturity train.
Cross-RT dependencies¶
| Downstream RT | Depends on UDA | Reason |
|---|---|---|
| RT7 MCR-F4 (C# data-platform objects) | UDA OpenAPI schema | Middle-core projection interfaces bind to the UDA contract; schema must be stable before MCR-F4 finalises |
| RT8 GPM-F3 (analytics tools over UDA) | UDA-F3 (pagination) | Agent analytics tool nodes must paginate results; blocked until UDA-F3 ships |
| RT8 GPM-EN1 (ArcadeDB pin backend) | UDA ArcadeDB connector (Phase 1/2) | Pin backend queries ArcadeDB via UDA surface |
RT6 provides the data-plane contract; RT7 and RT8 consume it. Versioning rule: a UDA OpenAPI schema change that breaks a response shape must be accompanied by an MCR-F4 / GPM-F3 consumer update in the same sprint; never break consumers silently.
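One way to make the versioning rule checkable is to diff the old and new response schemas and flag removed fields or changed types. The sketch below is a simplified illustration, assuming inline JSON-Schema-style dicts with no `$ref` resolution; `breaking_changes` is a hypothetical helper, not an existing tool in the repo.

```python
from typing import Any, Dict, List


def breaking_changes(old: Dict[str, Any], new: Dict[str, Any], path: str = "") -> List[str]:
    """List response-shape changes that would break an existing consumer."""
    problems: List[str] = []
    if old.get("type") != new.get("type"):
        where = path or "<root>"
        problems.append(f"{where}: type {old.get('type')!r} -> {new.get('type')!r}")
        return problems  # no point comparing children of different types
    new_props = new.get("properties", {})
    for name, old_prop in old.get("properties", {}).items():
        child = f"{path}.{name}" if path else name
        if name not in new_props:
            problems.append(f"{child}: removed")
        else:
            problems.extend(breaking_changes(old_prop, new_props[name], child))
    if "items" in old and "items" in new:
        # Descend into array element schemas.
        problems.extend(breaking_changes(old["items"], new["items"], path + "[]"))
    return problems
```

Fields that are only added do not appear in the result, since additive changes to a response leave existing consumers working. A non-empty list would be the signal that a consumer update belongs in the same sprint.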
Exit criteria¶
RT6 Phase 3 is done when:
- Postgres connector runs integration-tested queries in CI (UDA-F1)
- Object-storage connector lists and retrieves blobs against Azurite in CI (UDA-F2)
- All list/query endpoints are cursor-paginated; RT8 analytics tools verified against paginated responses (UDA-F3)
- Per-connection RBAC enforces scope on every query (UDA-F4)
- Query caching reduces repeat query latency with measurable hit-rate metrics (UDA-F5)
- OpenLineage spike deliverable reviewed and follow-up Feature sized (UDA-S1)
PI assignment¶
PI-3 (candidate) for Phase 3. Phase 1 shipped in PI-2. Phase 2 is active in PI-2/PI-3. UDA-F3 (pagination) is the Phase 3 item most likely to be pulled into the current sprint to unblock RT8 GPM-F3 — treat it as PI-2 tail / PI-3 head depending on Phase 2 velocity.
Notes¶
- Contract-first. The UDA OpenAPI schema is the integration boundary. All consumer code (middle-core projections, agent tool nodes) must be generated from or validated against the published schema. Never couple consumers to UDA internals.
- Spoke-implementation train. All deliverables are in `nickpclarke/backend-core`. No hub template files are modified.
- ArcadeDB is persistence, not the reasoner. The UDA routes queries to ArcadeDB as one of several backends; it does not embed ArcadeDB reasoning logic.
- Phase 2 issues (#33–#44) are tracked on the backend-core board. This document covers Phase 3 planning only; consult live issues for Phase 2 status.