Why Use This This skill provides specialized capabilities for aiskillstore's codebase.
Use Cases Developing new features in the aiskillstore repository Refactoring existing code to follow aiskillstore standards Understanding and working with aiskillstore's codebase structure
Install Guide 2 steps 1 2 Install inside Ananke
Click Install Skill, paste the link below, then press Install.
https://github.com/aiskillstore/marketplace/tree/main/skills/benderfendor/storage-debug-instrumentation Skill Snapshot Auto scan of skill assets. Informational only.
Valid SKILL.md Checks against SKILL.md specification
Source & Community
Updated At Jan 19, 2026, 04:39 AM
Skill Stats
SKILL.md 111 Lines
Total Files 1
Total Size 0 B
License NOASSERTION
---
name: storage-debug-instrumentation
description: Add comprehensive debugging and observability tooling for backend storage layers (PostgreSQL, ChromaDB) and startup metrics. Includes storage drift detection, raw data inspection endpoints, and a Next.js admin dashboard.
---
# Storage Debug Instrumentation
## Purpose
Enable rapid diagnosis of storage state, synchronization health, and backend performance bottlenecks by exposing:
- Raw article inspection from both PostgreSQL and ChromaDB
- Storage drift detection (missing/dangling entries)
- Detailed startup timeline breakdown (DB init, cache preload, vector store, RSS refresh)
- One-page debug dashboard consolidating all diagnostics
## Scope
- Backend: `app/services/startup_metrics.py`, `app/main.py`, `app/vector_store.py`, `app/database.py`, `app/api/routes/debug.py`
- Frontend: `frontend/lib/api.ts`, `frontend/app/debug/page.tsx`
- No schema changes; purely additive instrumentation and debug routes
## Workflow
### 1. Create startup metrics service
**File:** `backend/app/services/startup_metrics.py`
- Implement thread-safe `StartupMetrics` class to record phase timings
- Expose `record_event(name, started_at, detail, metadata)` for phase capture
- Support `add_note(key, value)` for arbitrary annotations
- Export singleton `startup_metrics` for app-wide use
### 2. Instrument vector store initialization
**File:** `backend/app/vector_store.py`
- Import `startup_metrics`
- In `VectorStore.__init__()`, wrap initialization with `time.time()` timer
- Record event with metadata: `host`, `port`, `collection`, `documents`
- Catch connection errors and annotate them
### 3. Instrument FastAPI startup sequence
**File:** `backend/app/main.py`
- Call `startup_metrics.mark_app_started()` at beginning of `on_startup()`
- Wrap each phase (DB init, schedulers, cache preload, RSS refresh, migration) with `record_event()`
- Include metadata: `cache_size`, `article_count`, `oldest_article_hours`
- Call `startup_metrics.mark_app_completed()` at end
- Add app version notes via `add_note()`
### 4. Add database pagination helpers
**File:** `backend/app/database.py`
- Implement `fetch_articles_page()` to support:
- Limit/offset pagination
- Optional source filter
- Missing-embeddings-only flag
- Published date range filters
- Sort direction (asc/desc)
- Return oldest/newest timestamp bounds
- Implement `fetch_article_chroma_mappings()` to return all article→chroma ID mappings for drift analysis
### 5. Add vector store pagination helpers
**File:** `backend/app/vector_store.py`
- Implement `list_articles(limit, offset)` to return paginated Chroma documents with metadata and previews
- Implement `list_all_ids()` to return all stored Chroma IDs for drift detection (used by `/debug/storage/drift`)
### 6. Expose debug API endpoints
**File:** `backend/app/api/routes/debug.py`
- Add `GET /debug/startup` → returns startup metrics timeline (events + notes)
- Add `GET /debug/chromadb/articles` → returns paginated raw Chroma entries with limit/offset
- Add `GET /debug/database/articles` → returns paginated Postgres rows with filters (source, embeddings, date range, sort)
- Add `GET /debug/storage/drift` → compares Chroma IDs vs Postgres mappings, returns missing/dangling counts + samples
### 7. Add frontend API bindings
**File:** `frontend/lib/api.ts`
- Export types: `StartupEventMetric`, `StartupMetricsResponse`, `ChromaDebugResponse`, `DatabaseDebugResponse`, `StorageDriftReport`
- Export fetchers: `fetchStartupMetrics()`, `fetchChromaDebugArticles()`, `fetchDatabaseDebugArticles()`, `fetchStorageDrift()`
- Ensure snake_case→camelCase mapping for response fields
### 8. Build debug dashboard page
**File:** `frontend/app/debug/page.tsx`
- Create `/debug` route with multi-tab inspection UI
- Render startup timeline: phase name, duration, metadata badges (cache size, vectors, migrated records)
- Display Chroma browser: paginated table with ID, title, source, preview
- Display Postgres browser: paginated table with filters (source, date range, missing-embeddings-only flag)
- Display drift report: sample tables for missing-in-chroma and dangling-in-chroma entries
- Include summary cards for quick metrics (boot time, total articles, vector count, drift count)
## Implementation checklist
- [ ] Create `backend/app/services/startup_metrics.py`
- [ ] Instrument `backend/app/vector_store.py::VectorStore.__init__()`
- [ ] Instrument `backend/app/main.py::on_startup()` (all phases)
- [ ] Add `fetch_articles_page()` and `fetch_article_chroma_mappings()` to `backend/app/database.py`
- [ ] Add `list_articles()` and `list_all_ids()` to `backend/app/vector_store.py`
- [ ] Add `/debug/startup`, `/debug/chromadb/articles`, `/debug/database/articles`, `/debug/storage/drift` to `backend/app/api/routes/debug.py`
- [ ] Add types and fetchers to `frontend/lib/api.ts`
- [ ] Create `frontend/app/debug/page.tsx` with dashboard layout
- [ ] Run `uvx ruff check backend` → all checks pass
- [ ] Test endpoints in curl or Postman to verify response structure
## Verification checklist
- [ ] `GET http://localhost:8000/debug/startup` returns valid timeline with events and notes
- [ ] `GET http://localhost:8000/debug/chromadb/articles?limit=50&offset=0` returns paginated Chroma docs
- [ ] `GET http://localhost:8000/debug/database/articles?source=bbc&missing_embeddings_only=false` filters correctly
- [ ] `GET http://localhost:8000/debug/storage/drift` compares counts and returns drift samples
- [ ] `http://localhost:3000/debug` loads without errors and displays all four sections
- [ ] Refresh button triggers all four API calls in parallel
- [ ] Pagination controls update limit/offset correctly
- [ ] Database filters (source, date range) update and refresh data
- [ ] Startup timeline shows non-zero phase durations if backend just started
## Future enhancements
- Streaming startup metrics via SSE (live tail during boot)
- Export startup report as JSON/CSV for performance tracking over time
- Automated drift alerts (post to Slack/email if dangling > threshold)
- Performance graphs (startup time trends, article throughput)
- Sync-on-demand action (button to force vector store refresh for missing articles)