Cheatsheet Phase 4 - Architecture & System Design

System Design Framework

5 Steps (45 min interview)

Requirements (5 min) - Functional + Non-functional
Estimation (5 min) - Traffic, Storage, Bandwidth
High-Level Design (15 min) - Components + APIs
Deep Dive (15 min) - 2-3 critical components
Trade-offs (5 min) - Bottlenecks, alternatives

Always clarify before designing!

Quick Estimates

Conversions

1 day = 86,400 sec ~ 100K sec
1 month (30 days) ~ 2.6M sec
1 year = 31.5M sec ~ 30M sec

1 KB = 1,000 bytes
1 MB = 1,000 KB = 10^6 bytes
1 GB = 1,000 MB = 10^9 bytes
1 TB = 1,000 GB = 10^12 bytes

Formulas

QPS (average) = DAU * actions/user / 86,400
Storage/year = writes/day * size * 365
Bandwidth (bytes/sec) = QPS * size
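
As a quick worked example, the three formulas above can be put into a small Python helper. The input figures (10M DAU, 10 actions per user, 10M writes per day, 1 KB per item) are illustrative assumptions, not numbers from this cheatsheet.

```python
SECONDS_PER_DAY = 86_400

def estimate(dau, actions_per_user, writes_per_day, size_bytes):
    """Return (average QPS, storage per year in bytes, bandwidth in bytes/sec)."""
    qps = dau * actions_per_user / SECONDS_PER_DAY
    storage_per_year = writes_per_day * size_bytes * 365
    bandwidth = qps * size_bytes
    return qps, storage_per_year, bandwidth

# Hypothetical service: 10M DAU, 10 actions/user/day, 10M writes/day, 1 KB per item.
qps, storage, bandwidth = estimate(
    dau=10_000_000,
    actions_per_user=10,
    writes_per_day=10_000_000,
    size_bytes=1_000,
)
print(f"QPS ~ {qps:,.0f}")                         # QPS ~ 1,157
print(f"Storage/year ~ {storage / 1e12:.2f} TB")   # Storage/year ~ 3.65 TB
print(f"Bandwidth ~ {bandwidth / 1e6:.2f} MB/s")   # Bandwidth ~ 1.16 MB/s
```

Peak traffic is often taken as a small multiple of the average (2x to 3x is a common interview assumption), so it is worth stating which one you are quoting.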

Availability (Nines)

SLA	Downtime/year	Downtime/month
99%	3.65 days	7.3 h
99.9%	8.76 h	43.8 min
99.99%	52.6 min	4.38 min
99.999%	5.26 min	26 sec

System availability (components in series) = A1 * A2 * A3 ...
With redundancy (n independent replicas) = 1 - (1 - A)^n
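
The two availability formulas above can be checked with a few lines of Python. This is a minimal sketch that assumes component failures are independent; the function names are made up for illustration.

```python
def series(*availabilities):
    """All components are required, so their availabilities multiply."""
    total = 1.0
    for a in availabilities:
        total *= a
    return total

def redundant(a, n):
    """n independent replicas, any one of which is enough: 1 - (1 - a)^n."""
    return 1 - (1 - a) ** n

def downtime_minutes_per_year(a):
    """Expected downtime per 365-day year for availability a."""
    return (1 - a) * 365 * 24 * 60

# Three 99.9% services chained in series drop to about 99.7%.
print(f"{series(0.999, 0.999, 0.999):.4%}")
# Two 99% replicas in parallel reach about 99.99%.
print(f"{redundant(0.99, 2):.4%}")
print(f"{downtime_minutes_per_year(0.9999):.1f} min/year")
```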

CAP Theorem

Consistency - Availability - Partition Tolerance

In a distributed system: when a partition occurs, choose C or A

Type	Examples	Use Case
CP	MongoDB, HBase	Banking, Inventory
AP	Cassandra, DynamoDB, CouchDB	Social, Analytics

ACID vs BASE

ACID: Atomicity, Consistency, Isolation, Durability
BASE: Basically Available, Soft state, Eventually consistent

Load Balancing

Algorithms

Algo	Description
Round Robin	Each server in turn
Weighted RR	Weighted by server capacity
Least Conn	Server with the fewest active connections
IP Hash	Same client IP goes to the same server (sticky sessions)
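
Two of the algorithms in the table can be sketched in a few lines of Python. This is only an illustration: the class names and the in-memory connection counters are assumptions, and a real load balancer would also handle health checks and concurrency.

```python
import itertools

class RoundRobin:
    """Hand out servers in a fixed rotation."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Pick the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties go to the server listed first.
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1

rr = RoundRobin(["a", "b", "c"])
print([rr.pick() for _ in range(4)])  # ['a', 'b', 'c', 'a']
```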

Layer 4 vs Layer 7

L4: TCP/UDP, fast, simple
L7: HTTP, smart routing, SSL termination

Caching Strategies

Pattern	Read	Write
Cache-Aside	App checks cache, falls back to DB	Write DB, then invalidate cache
Write-Through	Via cache	Cache + DB, synchronously
Write-Behind	Via cache	Cache first, DB asynchronously
Read-Through	Cache fetches from DB on a miss	Directly to DB
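
The cache-aside row can be sketched as follows. Plain dicts stand in for the cache and the database here, which is an assumption for illustration; the class and method names are hypothetical.

```python
class CacheAside:
    """Cache-aside: the application manages the cache itself."""
    def __init__(self, db):
        self.db = db        # dict standing in for the database
        self.cache = {}     # dict standing in for Redis/Memcached

    def read(self, key):
        if key in self.cache:           # cache hit
            return self.cache[key]
        value = self.db.get(key)        # cache miss: go to the database
        if value is not None:
            self.cache[key] = value     # populate the cache for next time
        return value

    def write(self, key, value):
        self.db[key] = value            # write to the database first
        self.cache.pop(key, None)       # then invalidate the cached copy

store = CacheAside(db={"user:1": "Ada"})
print(store.read("user:1"))   # Ada (miss, then cached)
store.write("user:1", "Grace")
print(store.read("user:1"))   # Grace (entry was invalidated)
```

Invalidating rather than updating the cache on write keeps the write path simple; the next read pays one extra database round trip.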

Eviction Policies

LRU - Least Recently Used (common default)
LFU - Least Frequently Used
FIFO - First In First Out
TTL - Time To Live
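
LRU is the policy most often asked about, so here is a minimal sketch built on Python's `collections.OrderedDict`. The class name and the choice to return `None` on a miss are assumptions for illustration.

```python
from collections import OrderedDict

class LRUCache:
    """Evict the least recently used entry once capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used

cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")         # "a" is now the most recently used
cache.put("c", 3)      # evicts "b"
print(cache.get("b"))  # None
```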

Database Selection

Type	Examples	Use Case
Relational	PostgreSQL, MySQL	ACID, relational data
Document	MongoDB	Flexible schema
Key-Value	Redis, DynamoDB	Cache, sessions
Wide-Column	Cassandra, HBase	Time-series, logs
Graph	Neo4j	Complex relationships

Scaling

Vertical: More RAM/CPU
Horizontal: Sharding, Replication
Read Replicas: Scale reads
Sharding: Scale writes

Sharding Strategies

Strategy	Pro	Con
Hash-based	Uniform distribution	Resharding is hard
Range-based	Range queries	Hotspots
Geographic	Low local latency	Cross-region queries
Directory	Flexible	Lookup overhead

Consistent Hashing

Ring with virtual nodes
Minimizes key redistribution when
nodes are added or removed
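
A minimal sketch of such a ring, assuming MD5 as the hash function and 100 virtual nodes per physical node (both are arbitrary choices for illustration, as are the class and node names):

```python
import bisect
import hashlib

def _hash(key):
    """Map a string to a position on the ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []              # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node appears at many positions (virtual nodes).
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(pos, n) for pos, n in self._ring if n != node]

    def get(self, key):
        if not self._ring:
            raise KeyError("ring is empty")
        # First virtual node clockwise from the key, wrapping around.
        idx = bisect.bisect(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["node-a", "node-b", "node-c"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.get(k) for k in keys}
ring.add("node-d")
moved = [k for k in keys if ring.get(k) != before[k]]
print(f"{len(moved)} of {len(keys)} keys moved after adding a node")
```

With plain `hash(key) % n` sharding, going from 3 to 4 nodes would remap most keys; on the ring, only the keys that now fall to the new node move.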

Message Queues

Tech	Use Case
RabbitMQ	Complex routing, RPC
Kafka	Event streaming, logs
SQS	Simple, serverless
Redis Pub/Sub	Real-time, ephemeral

Patterns

Point-to-Point: 1 consumer
Pub/Sub: Multiple consumers
Fan-out: Broadcast
Fan-in: Aggregation

Design Patterns

CQRS

Commands -> Write Model -> Write DB
Queries  -> Read Model  -> Read DB
Use: Asymmetric read/write load

Event Sourcing

Store events, not state
Replay to reconstruct
Use: Audit, time-travel
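
To make "store events, not state" concrete, here is a small sketch using a bank-account example. The event kinds and the `replay` helper are assumptions chosen for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str     # "deposited" or "withdrawn"
    amount: int

def apply(balance, event):
    """Pure function: current state + one event -> next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")

def replay(events, upto=None):
    """Rebuild the balance from the first `upto` events (all if None)."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance

log = [Event("deposited", 100), Event("deposited", 50), Event("withdrawn", 80)]
print(replay(log))          # 70  (current state)
print(replay(log, upto=2))  # 150 (state as of the second event)
```

Replaying a prefix of the log is what gives the audit and time-travel properties; long logs are usually paired with periodic snapshots so that replay does not start from zero.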

Saga Pattern

Distributed transactions
Choreography: Events
Orchestration: Coordinator
Compensating actions for rollback
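
The orchestration variant can be sketched as a coordinator that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. The step names below (reserve, charge, ship) are hypothetical, and a real saga would persist its progress so it can survive a coordinator crash.

```python
def run_saga(steps):
    """steps: list of (action, compensation) callables. Returns True on success."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Undo the steps that already succeeded, most recent first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True

log = []

def fail_shipping():
    raise RuntimeError("carrier unavailable")

ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("charge card"),   lambda: log.append("refund card")),
    (fail_shipping,                        lambda: log.append("cancel shipment")),
])
print(ok)   # False
print(log)  # ['reserve stock', 'charge card', 'refund card', 'release stock']
```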

Resilience Patterns

Circuit Breaker

CLOSED -> failures > threshold -> OPEN
OPEN -> timeout -> HALF-OPEN
HALF-OPEN -> success -> CLOSED
HALF-OPEN -> failure -> OPEN
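
The state machine above can be sketched in Python as follows. The parameter names, the injectable `clock`, and the choice to count consecutive failures are assumptions for illustration; production libraries typically add sliding windows and limits on half-open trial calls.

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0        # consecutive failures
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.state = "HALF-OPEN"         # OPEN -> timeout -> HALF-OPEN
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self.state = "CLOSED"                # HALF-OPEN -> success -> CLOSED
        self.failures = 0
        return result

    def _on_failure(self):
        self.failures += 1
        # HALF-OPEN -> failure -> OPEN; CLOSED -> failures > threshold -> OPEN
        if self.state == "HALF-OPEN" or self.failures > self.failure_threshold:
            self.state = "OPEN"
            self.opened_at = self.clock()
```

Wrap each remote call as `breaker.call(fetch, url)`, where `fetch` is whatever client function you already use; while the circuit is open, callers get an immediate error instead of waiting on a dead dependency.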

Other Patterns

Pattern	Description
Retry	Exponential backoff
Timeout	Fail fast
Bulkhead	Isolate resources
Rate Limiting	Throttle requests
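
As an illustration of the Retry row, here is a minimal sketch of exponential backoff with full jitter. The default values and the injectable `sleep` parameter are assumptions; in practice you would also retry only on errors known to be transient.

```python
import random
import time

def retry(fn, attempts=5, base_delay=0.1, max_delay=5.0, sleep=time.sleep):
    """Call fn until it succeeds or `attempts` calls have failed."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise                          # out of attempts: propagate
            # Delay cap doubles each time: 0.1, 0.2, 0.4, ... up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))      # full jitter spreads out retries
```

The jitter matters when many clients fail at once: without it they all retry on the same schedule and hit the recovering service in synchronized waves.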

Microservices

Principles

Single Responsibility
Database per Service
API-First Design
Decentralized Governance

Communication

Sync	Async
REST	Message Queue
gRPC	Event Bus
GraphQL	Webhooks

Domain-Driven Design

Concepts

Concept	Definition
Bounded Context	Boundary of the model
Entity	Unique identity (ID)
Value Object	Compared by value, immutable
Aggregate	Transactional consistency cluster
Domain Event	Business fact

1 Bounded Context ~ 1 Microservice

API Design

REST Best Practices

GET    /users       # List
GET    /users/{id}  # Read
POST   /users       # Create
PUT    /users/{id}  # Replace
PATCH  /users/{id}  # Update
DELETE /users/{id}  # Delete

Status codes:
200 OK, 201 Created, 204 No Content
400 Bad Request, 401 Unauthorized
404 Not Found, 500 Server Error

REST vs gRPC vs GraphQL

	REST	gRPC	GraphQL
Format	JSON	Protobuf	JSON
Perf	Good	Excellent	Good
Flex	Fixed	Fixed	Flexible

C4 Model

4 Zoom Levels

Context - System + users + external systems
Container - Apps, DBs, services
Component - Internal modules
Code - Classes (optional)

Tools

Structurizr (official)
PlantUML + C4 extension
Draw.io / Diagrams.net
Mermaid (Markdown)

ADR Template

# ADR-NNN: Title

## Status
Proposed | Accepted | Deprecated

## Context
Why is this decision needed?

## Decision
What was decided

## Alternatives
Options considered

## Consequences
+ Positive
- Negative
! Risks

## References
Links, docs

Common Case Studies

System	Focus
Twitter	Fan-out, Timeline
Instagram	Photo storage, Feed
Netflix	Video CDN, Adaptive bitrate
Uber	Geo, Real-time match
WhatsApp	Messaging, E2E crypto
Dropbox	File sync, Dedup
URL Shortener	Hashing, Redirect
Rate Limiter	Token bucket, Sliding window
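
For the rate limiter case study, the token bucket is usually the first algorithm to present. The sketch below assumes a single process and an injectable `clock`; the class and parameter names are made up for illustration, and a distributed limiter would keep the counters in a shared store.

```python
import time

class TokenBucket:
    """Allow bursts up to `capacity`, refilled at `rate` tokens per second."""
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1):
        now = self.clock()
        # Refill for the elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=5, capacity=10)
print(bucket.allow())  # True: the bucket starts full
```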

Latencies to Know

L1 cache ref:           0.5 ns
L2 cache ref:             7 ns
RAM ref:                100 ns
SSD random read:     150,000 ns (150 us)
HDD seek:         10,000,000 ns (10 ms)
Network same DC:        500 us
Network cross-continent: 150 ms

Read 1MB from RAM:      250 us
Read 1MB from SSD:        1 ms
Read 1MB from HDD:       20 ms
Network round trip:       1 ms

Checklist Interview

[ ] Clarify requirements
[ ] Estimate traffic/storage
[ ] API design (endpoints)
[ ] Database schema
[ ] High-level architecture
[ ] Deep dive on critical components
[ ] Scaling strategy
[ ] Caching layers
[ ] Single points of failure
[ ] Trade-offs discussed