Security Runbook - MyTelevision API

Purpose: Guide for responding to security incidents, investigating threats, and maintaining security posture.


Table of Contents

  1. Incident Response Checklist
  2. Common Security Scenarios
  3. Token & Session Management
  4. Rate Limiting & DDoS
  5. Database Security
  6. Secret Management
  7. Audit & Logging
  8. Contact & Escalation

Incident Response Checklist

Immediate Actions (First 15 minutes)

  • ASSESS: Identify the nature and scope of the incident
  • CONTAIN: Isolate affected systems if necessary
  • PRESERVE: Collect logs and evidence before any changes
  • NOTIFY: Alert the security team and stakeholders
  • DOCUMENT: Start incident timeline documentation

Investigation Phase

  • Review application logs (/var/log/mytelevision/ or container logs)
  • Check Redis for suspicious session patterns
  • Review database audit logs
  • Analyze rate limiting metrics
  • Check for unusual API patterns

Recovery Phase

  • Implement necessary fixes
  • Rotate compromised credentials
  • Clear affected caches/sessions
  • Verify system integrity
  • Update monitoring rules

Common Security Scenarios

1. Suspected Account Compromise

Symptoms:

  • Unusual login locations
  • Multiple failed login attempts
  • Unexpected password changes
  • Reports from users

Response:

-- 1. Check recent login activity for the user (SQL)
SELECT * FROM "UserSession"
WHERE "userId" = '<USER_ID>'
ORDER BY "createdAt" DESC
LIMIT 50;

-- For the multi-tenant system
SELECT * FROM "AccountSession"
WHERE "accountId" = '<ACCOUNT_ID>'
ORDER BY "createdAt" DESC
LIMIT 50;

-- 2. Revoke all sessions (save the output of step 1 first if it is needed as evidence)
DELETE FROM "UserSession" WHERE "userId" = '<USER_ID>';
DELETE FROM "AccountSession" WHERE "accountId" = '<ACCOUNT_ID>';

# 3. Clear Redis session cache (shell)
# --scan iterates incrementally; KEYS blocks Redis while it walks the whole keyspace
redis-cli --scan --pattern "session:user:<USER_ID>:*" | xargs -r redis-cli DEL
redis-cli --scan --pattern "account:<ACCOUNT_ID>:*" | xargs -r redis-cli DEL

-- 4. Force password reset (SQL; marks the user in the database)
UPDATE "User" SET "forcePasswordChange" = true WHERE id = '<USER_ID>';

2. Brute Force Attack Detection

Symptoms:

  • High volume of 401 responses
  • Same IP hitting login endpoint
  • Rate limiter triggered frequently

Response:

# 1. Identify attacking IPs from logs
grep "POST /api/v2/auth/login" /var/log/nginx/access.log | \
awk '{print $1}' | sort | uniq -c | sort -rn | head -20

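# 2. Check rate limit counters in Redis (--scan avoids the blocking KEYS command)
redis-cli --scan --pattern "throttle:*" | head -50

# 3. Block IP at firewall level (temporary)
# For iptables (-I inserts at the top of the chain so an earlier ACCEPT rule cannot override it):
iptables -I INPUT 1 -s <ATTACKER_IP> -j DROP

# For nginx:
# Add to /etc/nginx/conf.d/blocked.conf:
# deny <ATTACKER_IP>;
# Then validate and reload: nginx -t && nginx -s reload

# 4. Monitor rate limit effectiveness
# MONITOR streams every command and slows Redis down; stop it with Ctrl-C as soon as possible
redis-cli MONITOR | grep --line-buffered throttle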
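
The grep pipeline in step 1 counts every login request per IP, successful or not. As a complement, here is a minimal Python sketch that counts only 401 responses on the login endpoint. It assumes the access log uses nginx's default "combined" format; the function name and the threshold of 5 are illustrative and not part of the MyTelevision codebase.

```python
import re
from collections import Counter

# Start of an nginx "combined" log line: client IP, request line, status code.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3})'
)


def failed_logins_by_ip(lines, path="/api/v2/auth/login", threshold=5):
    """Return [(ip, count)] for IPs with at least `threshold` 401s on the login path."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that are not in the assumed format
        if (
            m["method"] == "POST"
            and m["path"].split("?")[0] == path
            and m["status"] == "401"
        ):
            counts[m["ip"]] += 1
    # most_common() sorts by count, highest first
    return [(ip, n) for ip, n in counts.most_common() if n >= threshold]
```

To use it, pass an open file handle for /var/log/nginx/access.log as `lines` and review the returned IPs before blocking any of them.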

3. JWT Token Theft Suspected

Symptoms:

  • Same token used from multiple IPs
  • Token used after logout
  • Impossible travel patterns
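
One way to make "impossible travel" concrete is to compare the great-circle distance between two consecutive session locations with the time between them. The sketch below assumes each session has already been resolved to a latitude and longitude by a geo-IP lookup, which this runbook does not cover. The 900 km/h speed ceiling and the 100 km noise floor are illustrative values, not settings from the API.

```python
import math
from datetime import datetime, timedelta, timezone

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_impossible_travel(prev, curr, max_kmh=900.0, min_km=100.0):
    """prev and curr are (timestamp, latitude, longitude) tuples for two sessions."""
    distance = haversine_km(prev[1], prev[2], curr[1], curr[2])
    if distance < min_km:
        return False  # within geo-IP noise; treat as the same place
    hours = abs((curr[0] - prev[0]).total_seconds()) / 3600
    if hours == 0:
        return True  # far apart at the same instant
    return distance / hours > max_kmh


# Example: a session in Paris followed one hour later by a session in New York.
t0 = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
paris = (t0, 48.8566, 2.3522)
new_york = (t0 + timedelta(hours=1), 40.7128, -74.0060)
print(is_impossible_travel(paris, new_york))
```

VPNs and mobile carriers can produce false positives, so a hit is a reason to investigate rather than proof of theft.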

Response:

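# 1. Blacklist the compromised token and its token family
# Read the token ID (jti) and the family ID from the JWT claims
# Note: "Emergency Token Operations" uses the key "token:blacklist"; confirm which key the API checks
redis-cli SADD "blacklist:tokens" "<TOKEN_JTI>"
redis-cli SADD "blacklist:families" "<TOKEN_FAMILY_ID>"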

# 2. Revoke all sessions for affected user
# See Account Compromise section above

# 3. If token signing key suspected compromised:
# CRITICAL: Rotate JWT_SECRET
# This will invalidate ALL active tokens
# Update .env and restart all instances

-- 4. Check for token reuse (SQL; run this before step 2, which deletes these rows)
SELECT * FROM "AccountSession"
WHERE "tokenFamily" = '<FAMILY_ID>'
ORDER BY "createdAt" DESC;
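
Step 1 needs the token's `jti` and family ID, which sit in the JWT payload. The sketch below decodes the payload of a captured token without verifying its signature, which is enough for reading identifiers during an investigation but must never be used to authenticate anyone. `jti` is the standard JWT ID claim; the name of the claim that carries the token family is not stated in this runbook, so inspect the decoded dictionary to find it.

```python
import base64
import json


def decode_jwt_claims(token):
    """Decode the payload segment of a JWT WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the base64url padding JWTs strip
    return json.loads(base64.urlsafe_b64decode(payload))
```

Call `decode_jwt_claims("<TOKEN>")` with the suspect token and read `jti` from the result. Avoid pasting production tokens into online decoders.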

4. Data Exfiltration Attempt

Symptoms:

  • Abnormally high data transfer
  • Bulk API requests
  • Scraping patterns

Response:

# 1. Identify high-volume requesters
# Check application logs for bulk requests (-h drops the file-name prefix grep adds for multiple files)
grep -h "GET /api/v2/movies" /var/log/app/*.log | \
cut -d' ' -f1 | sort | uniq -c | sort -rn | head -20

# 2. Implement emergency rate limits
# Update rate limit configuration in environment:
THROTTLE_LIMIT=10 # Reduce from 100

# 3. Add user-agent blocking if scraper
# In nginx (inside a server or location block):
if ($http_user_agent ~* (scrapy|bot|crawler)) {
    return 403;
}

# 4. Enable request logging for investigation
# Set LOG_LEVEL=debug temporarily
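
Before enabling the user-agent rule in step 3, it may be worth checking which clients it would actually reject, since the substring "bot" also matches search-engine crawlers and some monitoring tools. This sketch applies the same pattern to existing access-log lines. It assumes nginx's "combined" format, where the user agent is the last quoted field; the function name is illustrative.

```python
import re
from collections import Counter

BLOCK_RE = re.compile(r"(scrapy|bot|crawler)", re.IGNORECASE)  # same pattern as the nginx rule
UA_RE = re.compile(r'"([^"]*)"\s*$')  # last quoted field on the line


def blocked_user_agents(lines):
    """Count the user agents in `lines` that the blocking pattern would reject."""
    hits = Counter()
    for line in lines:
        m = UA_RE.search(line)
        if m and BLOCK_RE.search(m.group(1)):
            hits[m.group(1)] += 1
    return hits
```

Passing an open handle for /var/log/nginx/access.log and printing `blocked_user_agents(fh).most_common(20)` gives a quick view of what a 403 would hit.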

Token & Session Management

Token Architecture

| Token Type | Storage | TTL | Hash |
| --- | --- | --- | --- |
| Access Token (Legacy) | DB + Redis | 1h | SHA256 |
| Refresh Token (Legacy) | DB | 7d | SHA256 |
| Access Token (Multi-tenant) | Redis | 1h | SHA256 |
| Refresh Token (Multi-tenant) | DB | 7d | SHA256 |
| Stream Token | None (signed) | 4h | HMAC-SHA256 |
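
The table distinguishes two mechanisms: stored tokens are kept as SHA-256 hashes, while stream tokens are not stored at all and rely on an HMAC-SHA256 signature plus an expiry. The sketch below illustrates both ideas under stated assumptions. The `subject.expiry.signature` layout and all function names are hypothetical; the actual MyTelevision token format is not described in this runbook.

```python
import base64
import hashlib
import hmac
import time

STREAM_TTL_SECONDS = 4 * 60 * 60  # 4h, as listed for stream tokens


def _b64url(data):
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def hash_for_storage(token):
    """SHA-256 hex digest, so the raw token never needs to be stored."""
    return hashlib.sha256(token.encode()).hexdigest()


def sign_stream_token(secret, subject, now=None):
    """Build a stateless token: subject, expiry timestamp, HMAC-SHA256 signature."""
    expires = int((time.time() if now is None else now) + STREAM_TTL_SECONDS)
    payload = f"{subject}.{expires}"
    sig = hmac.new(secret, payload.encode(), hashlib.sha256).digest()
    return f"{payload}.{_b64url(sig)}"


def verify_stream_token(secret, token, now=None):
    """Return the subject if the signature matches and the token is unexpired, else None."""
    try:
        subject, expires, sig = token.rsplit(".", 2)
        expires_at = int(expires)
    except ValueError:
        return None  # malformed token
    payload = f"{subject}.{expires}"
    expected = _b64url(hmac.new(secret, payload.encode(), hashlib.sha256).digest())
    if not hmac.compare_digest(sig.encode(), expected.encode()):
        return None  # signature mismatch (constant-time comparison)
    if (time.time() if now is None else now) >= expires_at:
        return None  # expired
    return subject
```

Because nothing is stored, a signed token of this kind cannot be revoked individually before it expires. The only way to invalidate it early is to rotate the signing secret, which invalidates every outstanding stream token.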

Emergency Token Operations

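# Blacklist a specific token
redis-cli SADD "token:blacklist" "<TOKEN_JTI>"
# EXPIRE applies to the whole set, not just this member: every entry is dropped after 24h
redis-cli EXPIRE "token:blacklist" 86400 # 24h

# Clear all sessions for an account (--scan avoids the blocking KEYS command)
redis-cli --scan --pattern "session:account:<ACCOUNT_ID>:*" | xargs -r redis-cli DEL

# Force re-authentication for all users (EXTREME)
redis-cli FLUSHDB # WARNING: Clears ALL Redis data

# Rotate JWT secret (invalidates all tokens)
# 1. Generate new secret (run once per secret; tr strips the line breaks openssl inserts)
openssl rand -base64 64 | tr -d '\n'

# 2. Update environment
# JWT_SECRET=<new_secret>
# JWT_REFRESH_SECRET=<new_refresh_secret>

# 3. Recreate all instances ("restart" alone does not pick up changed environment variables)
docker-compose up -d --force-recreate api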

Rate Limiting & DDoS

Rate Limit Tiers

| Tier | Limit | Window | Key Pattern |
| --- | --- | --- | --- |
| Short | 3 req | 1 sec | throttle:short:<ip> |
| Medium | 20 req | 10 sec | throttle:medium:<ip> |
| Long | 100 req | 60 sec | throttle:long:<ip> |
| Profile | varies | varies | throttle:<tenant>:<profile>:<tier> |
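
To show how the three IP-based tiers interact, here is an in-memory sketch of a fixed-window counter that stands in for Redis INCR plus EXPIRE on the `throttle:<tier>:<ip>` keys. It assumes a fixed-window algorithm and that a request must pass every tier; the runbook does not state which algorithm the API uses, and the class name is illustrative.

```python
import time

# (limit, window in seconds) per tier, taken from the table above
TIERS = {"short": (3, 1), "medium": (20, 10), "long": (100, 60)}


class FixedWindowLimiter:
    """In-memory stand-in for per-key counters with a TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._counters = {}  # key -> (window_start, count)

    def allow(self, ip):
        """Count the request in every tier; return True only if all tiers permit it."""
        now = self._clock()
        allowed = True
        for tier, (limit, window) in TIERS.items():
            key = f"throttle:{tier}:{ip}"
            start, count = self._counters.get(key, (now, 0))
            if now - start >= window:
                start, count = now, 0  # window elapsed, like an expired Redis key
            count += 1
            self._counters[key] = (start, count)
            if count > limit:
                allowed = False
        return allowed
```

With these limits, a client sending a burst of four requests within one second is rejected on the fourth by the Short tier, while a slower client sending two requests per second hits the Medium tier after 20 requests inside a 10-second window.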

DDoS Response

# 1. Check current rate limit status (--scan avoids the blocking KEYS command)
redis-cli --scan --pattern "throttle:*" | wc -l

# 2. Identify top offenders (print each counter next to its key so the IP is visible)
redis-cli --scan --pattern "throttle:*" | \
while read -r key; do echo "$(redis-cli GET "$key") $key"; done | \
sort -rn | head -20

# 3. Emergency rate limit reduction
# Update environment variables:
THROTTLE_LIMIT=20 # Reduce from 100
THROTTLE_TTL=120000 # Increase window to 2 minutes (value in milliseconds)

# 4. Enable Cloudflare Under Attack mode (if using)
# Via API or dashboard

# 5. Scale up API instances
docker-compose up -d --scale api=5

Database Security

Suspicious Query Detection

-- Find users with excessive failed logins
SELECT u.email, COUNT(*) as failed_attempts
FROM "User" u
JOIN "AuditLog" a ON a."userId" = u.id
WHERE a.action = 'LOGIN_FAILED'
AND a."createdAt" > NOW() - INTERVAL '1 hour'
GROUP BY u.email
HAVING COUNT(*) > 5
ORDER BY failed_attempts DESC;

-- Find accounts with unusual session counts
SELECT a.email, COUNT(s.id) as session_count
FROM "Account" a
JOIN "AccountSession" s ON s."accountId" = a.id
WHERE s.status = 'ACTIVE'
GROUP BY a.email
HAVING COUNT(s.id) > 10
ORDER BY session_count DESC;

-- Find profiles with PIN lockout
SELECT p.*, a.email
FROM "Profile" p
JOIN "Account" a ON a.id = p."accountId"
WHERE p."pinLockedUntil" > NOW();

Emergency Database Operations

# Backup before any changes (shell)
pg_dump -h localhost -U mytelevision mytelevision > backup_$(date +%Y%m%d_%H%M%S).sql

-- Lock user account (SQL)
UPDATE "User" SET status = 'LOCKED' WHERE id = '<USER_ID>';
UPDATE "Account" SET status = 'LOCKED' WHERE id = '<ACCOUNT_ID>';

-- Revoke admin privileges
UPDATE "User" SET role = 'USER' WHERE id = '<USER_ID>';

-- Clear sensitive data (GDPR request)
UPDATE "User" SET
  email = 'deleted_' || id || '@deleted.local',
  "passwordHash" = 'DELETED',
  "firstName" = 'Deleted',
  "lastName" = 'User',
  "deletedAt" = NOW()
WHERE id = '<USER_ID>';

Secret Management

Secret Rotation Checklist

| Secret | Rotation Frequency | Impact |
| --- | --- | --- |
| JWT_SECRET | On compromise | Invalidates all access tokens |
| JWT_REFRESH_SECRET | On compromise | Invalidates all refresh tokens |
| DATABASE_URL | Quarterly | Requires restart |
| REDIS_PASSWORD | Quarterly | Requires restart |
| TMDB_API_KEY | On compromise | Affects content metadata |
| FIREBASE_PRIVATE_KEY | On compromise | Affects social auth |
| STREAMING_SIGNING_SECRET | On compromise | Invalidates stream tokens |
| STREAMING_AES128_KEY | On compromise | Affects DRM |

Secret Rotation Procedure

# 1. Generate new secret
NEW_SECRET=$(openssl rand -base64 64 | tr -d '\n')

# 2. Update in secrets manager (or .env for dev)
# For Docker Swarm (printf avoids the trailing newline that echo would store in the secret):
printf '%s' "$NEW_SECRET" | docker secret create jwt_secret_v2 -

# 3. Update service to use new secret
# In docker-compose.production.yml:
# secrets:
#   - jwt_secret_v2

# 4. Rolling restart
# target= keeps the mounted file name unchanged (assumes the app reads /run/secrets/jwt_secret)
docker service update --secret-rm jwt_secret \
  --secret-add source=jwt_secret_v2,target=jwt_secret mytelevision_api

# 5. Remove old secret after grace period
docker secret rm jwt_secret

Audit & Logging

Log Locations

| Log Type | Location | Retention |
| --- | --- | --- |
| Application | stdout/stderr (container) | 30 days |
| Nginx Access | /var/log/nginx/access.log | 90 days |
| Nginx Error | /var/log/nginx/error.log | 90 days |
| Database | /var/log/postgresql/ | 30 days |
| Redis | /var/log/redis/ | 7 days |

Key Log Patterns to Monitor

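# Paths below assume application logs are written to /var/log/app/;
# for container deployments, pipe "docker-compose logs api" into the same grep patterns.

# Failed authentications
grep -E "(401|Invalid credentials|Unauthorized)" /var/log/app/*.log

# Rate limit triggers
grep "Too Many Requests" /var/log/app/*.log

# Token errors
grep -E "(token|jwt|JWT)" /var/log/app/*.log | grep -i error

# Database errors
grep -E "(prisma|database|sql)" /var/log/app/*.log | grep -i error

# Verify security headers are present (-s hides the progress meter; -i because HTTP/2 header names are lowercase)
curl -sI https://api.mytelevision.app/api/v2/health/live | grep -iE "(X-|Content-Security|Strict-Transport)"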

Setting Up Alerts

# Example Prometheus alert rules
groups:
  - name: security
    rules:
      - alert: HighFailedLogins
        expr: rate(auth_login_failures_total[5m]) > 10
        for: 2m
        labels:
          severity: warning
        annotations:
          summary: 'High rate of failed logins'

      - alert: RateLimitExceeded
        expr: rate(http_requests_total{status="429"}[5m]) > 50
        for: 1m
        labels:
          severity: warning
        annotations:
          summary: 'High rate of rate-limited requests'
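
Prometheus `rate()` returns a per-second average over the range window, so the thresholds above are in events per second rather than events per five minutes. The short calculation below converts between the two, which can help when tuning the example rules to real traffic. The helper names are illustrative.

```python
def events_in_window(per_second_rate, window_seconds):
    """Total events implied by a sustained per-second rate over the window."""
    return per_second_rate * window_seconds


def rate_for_events(events, window_seconds):
    """Per-second threshold that corresponds to `events` occurrences in the window."""
    return events / window_seconds


FIVE_MINUTES = 5 * 60
print(events_in_window(10, FIVE_MINUTES))           # HighFailedLogins: 3000 failures in 5 minutes
print(events_in_window(50, FIVE_MINUTES))           # RateLimitExceeded: 15000 responses in 5 minutes
print(round(rate_for_events(50, FIVE_MINUTES), 3))  # 50 failures in 5 minutes is about 0.167 per second
```

If the intent is to alert on something like 50 failed logins in five minutes, the expression would need a threshold near 0.167 rather than 10.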

Contact & Escalation

Escalation Matrix

| Severity | Response Time | Escalation |
| --- | --- | --- |
| Critical (data breach) | 15 min | CTO + Legal immediately |
| High (active attack) | 30 min | Tech Lead + Security |
| Medium (suspicious activity) | 2 hours | On-call engineer |
| Low (policy violation) | 24 hours | Team lead |
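
For teams that track incidents in tooling, the matrix can be encoded as a small lookup that turns a detection time into a response deadline. This is only a sketch: the severity keys and the function name are illustrative, while the time windows and escalation targets are copied from the matrix above.

```python
from datetime import datetime, timedelta, timezone

ESCALATION = {
    "critical": (timedelta(minutes=15), "CTO + Legal immediately"),
    "high": (timedelta(minutes=30), "Tech Lead + Security"),
    "medium": (timedelta(hours=2), "On-call engineer"),
    "low": (timedelta(hours=24), "Team lead"),
}


def response_deadline(severity, detected_at):
    """Return (deadline, escalation target) for an incident detected at `detected_at`."""
    window, escalate_to = ESCALATION[severity.lower()]
    return detected_at + window, escalate_to


detected = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
deadline, who = response_deadline("Critical", detected)
print(deadline.isoformat(), who)
```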

Security Contacts

| Role | Contact | Availability |
| --- | --- | --- |
| Security Lead | [email protected] | 24/7 |
| CTO | [email protected] | Business hours |
| On-Call | [email protected] | 24/7 |
| Legal | [email protected] | Business hours |

External Resources

  • GitHub Security Advisories: Monitor for dependency vulnerabilities
  • CVE Database: Check for new vulnerabilities
  • OWASP Cheat Sheets: Reference for secure coding
  • NestJS Security: https://docs.nestjs.com/security/overview

Post-Incident Actions

  1. Document: Complete incident report within 24 hours
  2. Review: Conduct post-mortem within 1 week
  3. Improve: Update runbooks with lessons learned
  4. Test: Verify fixes with security testing
  5. Train: Update team on new procedures if needed