Monitoring

BlindCast adds an encryption layer to your video stack. Monitor these metrics to catch issues before they affect viewers.

Player metrics

The player exposes real-time metrics via `getMetrics()`:

const metrics = player.getMetrics()

Metric	Type	What it means
`timeToFirstFrame`	`number` (ms)	Time from `load()` to first video frame rendered. Includes key fetch + first segment decrypt. Target: under 2s on broadband.
`avgDecryptTime`	`number` (ms)	Average time per segment decryption. Should be under 5ms on modern hardware.
`avgFragLoadTime`	`number` (ms)	Average time per fragment download. High values indicate CDN or bandwidth issues.
`avgKeyFetchTime`	`number` (ms)	Average time per key server request. Target: under 100ms p95.
`keyFetchCount`	`number`	Total key requests made.
`qualitySwitches`	`number`	Number of ABR quality level changes. Frequent switches indicate unstable bandwidth.
`fragmentsLoaded`	`number`	Total segments loaded so far.
`stallCount`	`number`	Number of playback stalls (buffer underruns). Should be 0 under normal conditions.

Reporting metrics

Send metrics to your analytics backend on playback end or periodically:

player.on("ended", () => {
  const metrics = player.getMetrics()
  analytics.track("video_playback_complete", {
    contentId,
    timeToFirstFrame: metrics.timeToFirstFrame,
    avgDecryptTime: metrics.avgDecryptTime,
    avgKeyFetchTime: metrics.avgKeyFetchTime,
    stallCount: metrics.stallCount,
  })
})
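Reporting only on `ended` misses sessions where the viewer closes the tab before playback finishes, so a periodic report can help. The following is a minimal sketch, assuming only the documented `player.getMetrics()` call. The helper names `buildReport` and `startPeriodicReporting`, the `track` callback, the event name `video_playback_heartbeat`, and the 30-second default interval are illustrative choices, not part of the BlindCast API.

```javascript
// Hypothetical helper, not part of BlindCast.
// Picks the fields worth reporting from a getMetrics() result.
function buildReport(contentId, metrics) {
  return {
    contentId,
    timeToFirstFrame: metrics.timeToFirstFrame,
    avgDecryptTime: metrics.avgDecryptTime,
    avgKeyFetchTime: metrics.avgKeyFetchTime,
    stallCount: metrics.stallCount,
  }
}

// Hypothetical helper, not part of BlindCast.
// Sends a snapshot every `intervalMs` and returns a function that stops the timer.
// `track` is whatever your analytics client exposes, e.g. (name, props) => analytics.track(name, props).
function startPeriodicReporting(player, contentId, track, intervalMs = 30000) {
  const report = () =>
    track("video_playback_heartbeat", buildReport(contentId, player.getMetrics()))
  const timer = setInterval(report, intervalMs)
  return () => clearInterval(timer)
}
```

Keep the returned stop function and call it when playback ends or the player is torn down, so the timer does not outlive the session.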

What to alert on

Condition	Possible cause
`timeToFirstFrame` > 5s (p95)	Key server latency, slow CDN, or large first segment
`stallCount` > 0 (frequent)	CDN throughput issues or client CPU overloaded from decryption
`avgKeyFetchTime` > 500ms (p95)	Key server overloaded or network issue
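These conditions are evaluated across many sessions rather than inside one player. As a rough sketch, assuming you have collected one `getMetrics()` snapshot per session in your analytics backend, the checks could look like the code below. `percentile` and `evaluatePlayerAlerts` are hypothetical helpers, and the 5% cutoff used for "frequent" stalls is an assumption, since the table does not define it.

```javascript
// Nearest-rank percentile: the smallest value with at least p% of samples at or below it.
function percentile(values, p) {
  if (values.length === 0) return NaN
  const sorted = [...values].sort((a, b) => a - b)
  const rank = Math.ceil((p * sorted.length) / 100)
  return sorted[Math.max(rank, 1) - 1]
}

// `sessions` is an array of getMetrics() snapshots, one per playback session.
// The 5s and 500ms thresholds come from the table above.
// `maxStallRate` is an assumed definition of "frequent": more than 5% of sessions stalled.
function evaluatePlayerAlerts(sessions, maxStallRate = 0.05) {
  const alerts = []
  const ttffP95 = percentile(sessions.map((s) => s.timeToFirstFrame), 95)
  const keyFetchP95 = percentile(sessions.map((s) => s.avgKeyFetchTime), 95)
  const stallRate = sessions.filter((s) => s.stallCount > 0).length / sessions.length
  if (ttffP95 > 5000) alerts.push("timeToFirstFrame p95 > 5s")
  if (keyFetchP95 > 500) alerts.push("avgKeyFetchTime p95 > 500ms")
  if (stallRate > maxStallRate) alerts.push("stalls in too many sessions")
  return alerts
}
```

An empty `sessions` array produces no alerts, because every comparison against `NaN` is false.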

Key server metrics

Health check

`GET /health` returns `200 OK` when the server is running. Use this for:

Docker health checks: `HEALTHCHECK CMD curl -f http://localhost:4100/health`
Load balancer probes
Uptime monitoring (Pingdom, Better Uptime, etc.)

Request logging

The key server logs each request to stdout in JSON format. Pipe to your log aggregator (Datadog, CloudWatch, etc.) and monitor:

Metric	How to measure	Alert threshold
Key fetch latency	p50/p95 of `GET /keys/:contentId` response time	p95 > 100ms
Error rate (4xx)	Count of 401 + 403 responses / total requests	> 5%
Error rate (5xx)	Count of 5xx responses / total requests	> 0.1%
Lease creation rate	Count of `POST /keys/leases` per minute	Unusual spike (>2x baseline)
Lease revocation rate	Count of `POST /keys/leases/revoke` per minute	Unusual spike
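If your log aggregator cannot compute these directly, the same numbers can be derived from the raw log lines. The sketch below assumes each JSON line carries `method`, `path`, `status`, and `durationMs` fields. Those field names are assumptions, so check them against your key server's actual output. `summarizeKeyServerLogs` is a hypothetical helper.

```javascript
// Assumed log line shape (verify against your key server's real output):
// {"method":"GET","path":"/keys/abc123","status":200,"durationMs":12}
function summarizeKeyServerLogs(lines) {
  const entries = []
  for (const line of lines) {
    try {
      const entry = JSON.parse(line)
      if (entry && typeof entry === "object") entries.push(entry)
    } catch {
      // Skip non-JSON lines such as startup banners.
    }
  }
  const total = entries.length
  const count = (pred) => entries.filter(pred).length
  // Key fetches: GET /keys/:contentId, excluding the /keys/leases routes.
  const keyLatencies = entries
    .filter((e) => e.method === "GET" && /^\/keys\/[^/]+$/.test(e.path) && e.path !== "/keys/leases")
    .map((e) => e.durationMs)
    .sort((a, b) => a - b)
  // Nearest-rank p95 index into the sorted latencies.
  const p95Index = Math.max(Math.ceil((95 * keyLatencies.length) / 100), 1) - 1
  return {
    total,
    keyFetchP95: keyLatencies.length ? keyLatencies[p95Index] : null,
    // 401 + 403 over all requests, as in the table above.
    rate4xx: total ? count((e) => e.status === 401 || e.status === 403) / total : 0,
    // Counts 500 and any other 5xx status.
    rate5xx: total ? count((e) => e.status >= 500) / total : 0,
  }
}
```

Compare the result against the thresholds in the table: `keyFetchP95 > 100`, `rate4xx > 0.05`, and `rate5xx > 0.001`.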

Prometheus metrics (optional)

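If you run a reverse proxy (nginx, Caddy) in front of the key server, you can scrape its metrics with Prometheus to track request rates, latency histograms, and error codes. Caddy exposes Prometheus metrics natively; nginx needs a separate exporter, and which of these metrics are available depends on the exporter and nginx edition you use.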

Database monitoring

Postgres

If using Postgres for lease storage:

Metric	What to watch
Connection count	Should stay well below `max_connections`
Query latency	Lease queries should be under 10ms
Table size	`leases` table grows over time — monitor row count
Dead tuples	Run `VACUUM` if dead tuple ratio is high

SQLite

If using SQLite (single-instance deployments):

Metric	What to watch
Database file size	Monitor `/data/blindcast.db` size
WAL file size	Large WAL files indicate slow checkpointing
Disk space	SQLite needs free disk space for journaling

Infrastructure

Component	Health check	What to monitor
Key server container	`GET /health`	CPU, memory, restart count
Postgres	`pg_isready`	Connections, replication lag, disk
S3 / R2	AWS Health Dashboard (S3) or Cloudflare status page (R2)	4xx/5xx error rates on GET requests
CDN	Provider dashboard	Cache hit ratio, bandwidth, error rates

CDN cache hit ratio

Target: >95% cache hit ratio for segment requests. Low hit ratios mean the CDN is fetching from origin on most requests, adding latency and cost.

# CloudFront: check via CloudWatch metric "CacheHitRate" (an additional metric that must be enabled on the distribution)
# Cloudflare: check via Analytics dashboard → Cache tab
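If your CDN only gives you raw hit and miss counts, the ratio is simply hits divided by total requests. A small sketch, with `cacheHitRatio` and `isCacheHitRatioLow` as hypothetical helper names and the 95% target taken from the text above:

```javascript
// Hit ratio = hits / (hits + misses). Returns null when there were no requests.
function cacheHitRatio(hits, misses) {
  const total = hits + misses
  return total === 0 ? null : hits / total
}

// True when the ratio does not exceed the target (the target is >95% by default).
function isCacheHitRatioLow(hits, misses, target = 0.95) {
  const ratio = cacheHitRatio(hits, misses)
  return ratio !== null && ratio <= target
}
```

For example, 9,700 hits and 300 misses give a ratio of 0.97, which meets the target, while 900 hits and 100 misses give 0.9, which does not.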

Dashboard template

Build a monitoring dashboard with these panels:

Player experience — Time to first frame (p50, p95, p99), stall rate
Key server — Request rate, latency (p50, p95), error rate (4xx, 5xx)
Leases — Active lease count, creation rate, revocation rate
Infrastructure — Container CPU/memory, DB connections, CDN cache hit ratio

Debugging playback issues

When a viewer reports playback problems:

Check key server logs: Was the key request successful? Look for 401 (auth), 403 (lease revoked/expired), 500 (server error).
Check player metrics: If `avgKeyFetchTime` is high, the likely cause is key server latency. If `stallCount` is high, the likely cause is the CDN or the viewer's bandwidth; if `avgDecryptTime` is also high, suspect an overloaded client CPU.
Check CDN logs: Are segments being served? Look for 403 (CORS) or 404 (missing segments).
Check lease state: If using leases, query the database: `SELECT * FROM leases WHERE viewer_id = 'user-123' ORDER BY created_at DESC LIMIT 5;`
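The player-metrics step can be turned into a quick first-pass triage for a single session. The sketch below is illustrative only: `triagePlayback` is a hypothetical helper, and its thresholds are borrowed from elsewhere on this page (500ms key fetch from the alert table, 5ms decrypt time from the metrics table) rather than defined by BlindCast.

```javascript
// Hypothetical triage helper for one session's getMetrics() snapshot.
// Thresholds are assumptions taken from the tables on this page.
function triagePlayback(metrics) {
  const hints = []
  if (metrics.avgKeyFetchTime > 500) hints.push("key server latency")
  if (metrics.stallCount > 0) {
    // Slow decryption points at the client CPU rather than the network.
    hints.push(metrics.avgDecryptTime > 5 ? "client CPU (slow decryption)" : "CDN or bandwidth")
  }
  return hints
}
```

An empty result means the player metrics look normal, so move on to the CDN logs and lease state.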
