Silo 02
Data Silo
A MySQL-backed analytics service that delivers stable reporting for operations and leadership dashboards.
Requirements
- Refresh reporting tables every 15 minutes with bounded query cost.
- Guarantee p95 report query latency below 500ms under normal load.
- Preserve metric consistency across weekly operations reports.
Stack
- Node.js
- TypeScript
- MySQL
- Prisma
- Nginx
Architecture
- Ingestion workers normalize transactional snapshots into reporting tables.
- Query routes expose constrained, indexed endpoints for key business metrics.
- Caching and pre-aggregation protect read performance during spikes.
Workflow
- Workers pull source snapshots and stage them in landing tables.
- Normalization jobs map records into dimensional report schema.
- Pre-aggregation tasks materialize common KPI views.
- API serves read-only metric endpoints with cache hints and query guards.
Diagram
flowchart TD
A[Source Systems] --> B[Landing Tables]
B --> C[Normalization Workers]
C --> D[MySQL Reporting Schema]
D --> E[Pre-Aggregated Views]
E --> F[Metrics API]
F --> G[Operations Dashboard]
Tradeoffs
- Additional storage used for pre-aggregations to improve read latency.
- Slight freshness delay accepted to keep query performance predictable.
- Schema evolution requires migration discipline across ingest and report layers.
Risks / Failure Modes
- Unbounded ad-hoc query patterns can bypass performance controls.
- Late-arriving source events can skew period aggregates.
- Index drift can degrade dashboard response times over time.
Outcomes
- Stabilized report generation under high concurrency without horizontal over-scaling.
- Reduced slow-query incidents by introducing targeted indexes and query review.
- Shortened decision time for weekly operations reviews with reliable metrics.