System Design Framework
5 Steps (45-min interview)
- Requirements (5 min) - Functional + non-functional
- Estimation (5 min) - Traffic, storage, bandwidth
- High-Level Design (15 min) - Components + APIs
- Deep Dive (15 min) - 2-3 critical components
- Trade-offs (5 min) - Bottlenecks, alternatives
Always clarify requirements before designing!
Quick Estimates
Conversions
1 day = 86,400 sec ~ 100K sec
1 month (30 days) = 2.59M sec ~ 2.5M sec
1 year = 31.5M sec ~ 30M sec
1 KB = 1,000 bytes
1 MB = 1,000 KB = 10^6 bytes
1 GB = 1,000 MB = 10^9 bytes
1 TB = 1,000 GB = 10^12 bytes
Formulas
QPS = DAU * actions/user / 86400
Storage/year = writes/day * size * 365
Bandwidth = QPS * size
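A minimal sketch of the three formulas above applied to one workload. The service figures (10M DAU, 10 actions per user, 5M writes per day, 1 KB objects) are hypothetical inputs chosen only to show the arithmetic.

```python
SECONDS_PER_DAY = 86_400

def estimate(dau, actions_per_user, writes_per_day, object_size_bytes):
    """Apply the QPS, storage and bandwidth formulas; all inputs are assumptions."""
    qps = dau * actions_per_user / SECONDS_PER_DAY
    storage_per_year = writes_per_day * object_size_bytes * 365  # bytes
    bandwidth = qps * object_size_bytes                          # bytes per second
    return qps, storage_per_year, bandwidth

# Hypothetical service: 10M DAU, 10 actions each, 5M writes/day, 1 KB objects.
qps, storage, bandwidth = estimate(10_000_000, 10, 5_000_000, 1_000)
print(f"QPS ~ {qps:,.0f}")
print(f"Storage/year ~ {storage / 1e12:.2f} TB")
print(f"Bandwidth ~ {bandwidth / 1e6:.2f} MB/s")
```

In an interview, peak QPS is often taken as a small multiple of this average; the multiplier is a judgment call, not part of the formula.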
Availability (Nines)
| SLA | Downtime/year | Downtime/month |
| 99% | 3.65 days | 7.3 h |
| 99.9% | 8.76 h | 43.8 min |
| 99.99% | 52.6 min | 4.38 min |
| 99.999% | 5.26 min | 26 sec |
System availability (components in series) = D1 * D2 * D3...
With redundancy (n independent replicas) = 1 - (1-D)^n
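The two availability formulas can be checked numerically. The sketch below assumes independent failures, which real systems only approximate; the function names are illustrative.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 h

def downtime_hours_per_year(availability):
    """Expected yearly downtime for a given availability (0..1)."""
    return (1 - availability) * HOURS_PER_YEAR

def serial(*availabilities):
    """Components in series: every one must be up, so multiply."""
    result = 1.0
    for a in availabilities:
        result *= a
    return result

def redundant(availability, n):
    """n independent replicas: the system is down only if all n are down."""
    return 1 - (1 - availability) ** n

chain = serial(0.999, 0.999, 0.999)   # three 99.9% services chained
pair = redundant(0.99, 2)             # two 99% replicas behind a load balancer
print(f"chain: {chain:.4%}, pair: {pair:.4%}")
```

Chaining three "three nines" services drops below three nines overall, while duplicating a "two nines" component reaches four nines under the independence assumption.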
CAP Theorem
Consistency - Availability - Partition Tolerance
In a distributed system: choose C or A when a partition occurs
| Type | Examples | Use Case |
| CP | MongoDB, HBase | Banking, Inventory |
| AP | Cassandra, DynamoDB, CouchDB | Social, Analytics |
ACID vs BASE
ACID: Atomicity, Consistency, Isolation, Durability
BASE: Basically Available, Soft state, Eventually consistent
Load Balancing
Algorithms
| Algo | Description |
| Round Robin | Each server in turn |
| Weighted RR | Weighted by server capacity |
| Least Conn | Server with the fewest active connections |
| IP Hash | Hash of client IP; gives sticky sessions |
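A rough sketch of three of the algorithms in the table (Weighted RR is omitted for brevity). The class and function names are illustrative and the server list is a plain list of strings; a real load balancer also tracks health checks and connection lifetimes.

```python
import itertools
import zlib

class RoundRobin:
    """Hand out servers in a fixed rotation."""
    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)

class LeastConnections:
    """Pick the server with the fewest active connections."""
    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        # Call when a connection to `server` closes.
        self.active[server] -= 1

def ip_hash(servers, client_ip):
    """Same client IP always maps to the same server (sticky sessions)."""
    # crc32 is stable across processes, unlike the built-in hash() on str.
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]
```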
Layer 4 vs Layer 7
L4: TCP/UDP, fast, simple
L7: HTTP, content-aware routing, SSL termination
Caching Strategies
| Pattern | Read | Write |
| Cache-Aside | App checks cache, loads from DB on miss | Write DB, then invalidate cache |
| Write-Through | Via cache | Cache + DB, synchronously |
| Write-Behind | Via cache | Cache first, DB asynchronously |
| Read-Through | Cache fetches from DB on miss | Directly to DB |
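As an illustration of the Cache-Aside row, here is a minimal sketch in which two dicts stand in for the database and the cache (both are assumptions; in practice these would be something like PostgreSQL and Redis).

```python
class CacheAside:
    """Cache-aside: the application owns both the cache lookup and the DB call."""
    def __init__(self, db):
        self.db = db          # dict standing in for the database
        self.cache = {}       # dict standing in for the cache
        self.db_reads = 0     # counter, only to make cache hits observable

    def read(self, key):
        if key in self.cache:              # cache hit
            return self.cache[key]
        self.db_reads += 1                 # cache miss: go to the DB
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value        # populate for the next reader
        return value

    def write(self, key, value):
        self.db[key] = value               # write the DB first...
        self.cache.pop(key, None)          # ...then invalidate the cached copy

store = CacheAside({"user:1": "Ada"})
store.read("user:1")   # miss, loads from DB
store.read("user:1")   # hit, served from cache
```

Invalidating rather than updating the cache on write keeps the write path simple; the next read repopulates the entry.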
Eviction Policies
LRU - Least Recently Used (common default)
LFU - Least Frequently Used
FIFO - First In First Out
TTL - Time To Live
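LRU is simple enough to sketch in a few lines. The version below leans on `collections.OrderedDict` from the standard library; the class name and capacity are illustrative.

```python
from collections import OrderedDict

class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()   # oldest entry first, newest last

    def get(self, key):
        if key not in self.items:
            return None
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # drop the least recently used

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" is now the most recently used
cache.put("c", 3)    # evicts "b"
```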
Database Selection
| Type | Examples | Use Case |
| Relational | PostgreSQL, MySQL | ACID, relational data |
| Document | MongoDB | Flexible schema |
| Key-Value | Redis, DynamoDB | Cache, sessions |
| Wide-Column | Cassandra, HBase | Time-series, logs |
| Graph | Neo4j | Complex relationships |
Scaling
Vertical: More RAM/CPU
Horizontal: Sharding, Replication
Read Replicas: Scale reads
Sharding: Scale writes
Sharding Strategies
| Strategy | Pro | Con |
| Hash-based | Uniform distribution | Resharding is hard |
| Range-based | Range queries | Hotspots |
| Geographic | Low local latency | Cross-region queries |
| Directory | Flexible | Lookup overhead |
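To see why resharding is the weak point of hash-based sharding, this sketch counts how many keys change shard when a naive `hash(key) % N` scheme grows from 4 to 5 shards. The key names are made up and MD5 is used only as a well-distributed, stable hash.

```python
import hashlib

def stable_hash(key):
    """64-bit hash that is stable across runs (unlike built-in hash() on str)."""
    return int.from_bytes(hashlib.md5(key.encode()).digest()[:8], "big")

def shard_for(key, num_shards):
    return stable_hash(key) % num_shards

def moved_fraction(keys, old_shards, new_shards):
    """Fraction of keys whose shard changes when the shard count changes."""
    moved = sum(
        1 for key in keys
        if shard_for(key, old_shards) != shard_for(key, new_shards)
    )
    return moved / len(keys)

keys = [f"user:{i}" for i in range(10_000)]   # hypothetical key space
fraction = moved_fraction(keys, 4, 5)
print(f"{fraction:.0%} of keys move when going from 4 to 5 shards")
```

With modulo placement, a key stays put only when its hash gives the same remainder for both shard counts, so most of the data has to be copied.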
Consistent Hashing
Ring with virtual nodes
Minimizes key redistribution when nodes are added or removed
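A minimal sketch of a hash ring with virtual nodes, assuming MD5 as the hash and 100 virtual nodes per physical node (both arbitrary choices). Production implementations add replication and weighted nodes.

```python
import bisect
import hashlib

def _hash(value):
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:8], "big")

class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []                 # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node owns `vnodes` positions on the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key's position, wrapping around.
        index = bisect.bisect_right(self._ring, (_hash(key), "")) % len(self._ring)
        return self._ring[index][1]

ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"user:{i}" for i in range(2_000)]
before = {key: ring.get(key) for key in keys}
ring.add("node-e")
moved = [key for key in keys if ring.get(key) != before[key]]
print(f"{len(moved) / len(keys):.0%} of keys moved after adding a fifth node")
```

Only keys that now fall on the new node's virtual positions move; everything else keeps its previous owner.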
Message Queues
| Tech | Use Case |
| RabbitMQ | Complex routing, RPC |
| Kafka | Event streaming, logs |
| SQS | Simple, serverless |
| Redis Pub/Sub | Real-time, ephemeral |
Patterns
Point-to-Point: 1 consumer
Pub/Sub: Multiple consumers
Fan-out: Broadcast
Fan-in: Aggregation
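The difference between point-to-point and pub/sub delivery can be shown with two tiny in-memory classes. These are illustrative stand-ins, not the API of any broker listed above.

```python
from collections import deque

class WorkQueue:
    """Point-to-point: each message is consumed once, by a single consumer."""
    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        # Returns None when the queue is empty.
        return self._messages.popleft() if self._messages else None

class Topic:
    """Pub/sub: every subscriber receives its own copy of each message."""
    def __init__(self):
        self._subscribers = []

    def subscribe(self, callback):
        self._subscribers.append(callback)

    def publish(self, message):
        for callback in self._subscribers:   # fan-out to all subscribers
            callback(message)
```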
Design Patterns
CQRS
Commands -> Write Model -> Write DB
Queries -> Read Model -> Read DB
Use: Asymmetric read/write workloads
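A compact sketch of the CQRS split, with two dicts standing in for the write and read databases. The `OrderService` name and the "total spent per customer" view are hypothetical; the projection runs inline here, whereas real systems usually update the read model asynchronously.

```python
class OrderService:
    def __init__(self):
        self.write_db = {}   # source of truth: order_id -> order
        self.read_db = {}    # denormalized view: customer -> total spent

    # Command side: validates and changes state.
    def place_order(self, order_id, customer, amount):
        if amount <= 0:
            raise ValueError("amount must be positive")
        self.write_db[order_id] = {"customer": customer, "amount": amount}
        self._project(customer, amount)

    def _project(self, customer, amount):
        # Keeps the read model in step with the write model.
        self.read_db[customer] = self.read_db.get(customer, 0) + amount

    # Query side: reads the precomputed view, never the write model.
    def total_spent(self, customer):
        return self.read_db.get(customer, 0)

service = OrderService()
service.place_order("o1", "ada", 40)
service.place_order("o2", "ada", 60)
```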
Event Sourcing
Store events, not state
Replay to reconstruct
Use: Audit, time-travel
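A minimal sketch of "store events, replay to reconstruct", using a bank balance as a made-up domain. The event kinds and the `upto` parameter are illustrative.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str      # "deposited" or "withdrawn"
    amount: int

def apply(balance, event):
    """Pure function: current state + one event -> next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")

def replay(events, upto=None):
    """Rebuild state by folding over events; `upto` gives a time-travel view."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance

log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]
current = replay(log)            # state after all events
earlier = replay(log, upto=2)    # state as of the second event
```

The log itself doubles as the audit trail, since no event is ever updated or deleted.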
Saga Pattern
Distributed transactions
Choreography: Events
Orchestration: Coordinator
Compensating actions for rollback
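The orchestration variant can be sketched as a coordinator that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. The step names below are hypothetical.

```python
def run_saga(steps):
    """steps: list of (action, compensation) callables. Returns True on success."""
    completed = []
    for action, compensation in steps:
        try:
            action()
        except Exception:
            # Undo what already succeeded, most recent first.
            for undo in reversed(completed):
                undo()
            return False
        completed.append(compensation)
    return True

log = []

def charge_card():
    raise RuntimeError("card declined")   # simulated failure in the last step

ok = run_saga([
    (lambda: log.append("reserve stock"), lambda: log.append("release stock")),
    (lambda: log.append("create shipment"), lambda: log.append("cancel shipment")),
    (charge_card, lambda: log.append("refund")),
])
print(ok, log)
```

The failed step's own compensation is not run, because that step never completed.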
Resilience Patterns
Circuit Breaker
CLOSED -> failures > threshold -> OPEN
OPEN -> timeout -> HALF-OPEN
HALF-OPEN -> success -> CLOSED
HALF-OPEN -> failure -> OPEN
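The four transitions above fit in one small class. This is a sketch under simple assumptions: failures are counted consecutively, a single trial call is allowed in HALF-OPEN, and the `clock` parameter exists only so the timeout can be tested without sleeping.

```python
import time

class CircuitBreaker:
    def __init__(self, threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.threshold = threshold
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise RuntimeError("circuit open: failing fast")
            self.state = "HALF-OPEN"          # timeout elapsed: allow a trial call
        try:
            result = func()
        except Exception:
            self.failures += 1
            # A failed trial call, or too many failures, opens the circuit.
            if self.state == "HALF-OPEN" or self.failures > self.threshold:
                self.state = "OPEN"
                self.opened_at = self.clock()
            raise
        self.state = "CLOSED"                 # any success closes the circuit
        self.failures = 0
        return result
```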
Other Patterns
| Pattern | Description |
| Retry | Exponential backoff |
| Timeout | Fail fast |
| Bulkhead | Isolate resources |
| Rate Limiting | Throttle requests |
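Two of the rows above, Retry and Rate Limiting, sketched in Python. The jitter on the backoff and the choice of a token bucket are assumptions (the table names neither); `sleep` and `clock` are injectable only to make the code testable.

```python
import random
import time

def retry(func, attempts=4, base_delay=0.1, max_delay=2.0, sleep=time.sleep):
    """Call func, retrying on exception with exponential backoff and jitter."""
    for attempt in range(attempts):
        try:
            return func()
        except Exception:
            if attempt == attempts - 1:
                raise                                   # out of attempts
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, delay))             # jitter avoids retry storms

class TokenBucket:
    """Rate limiter: `rate` tokens refill per second, up to `capacity`."""
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill in proportion to the time elapsed since the last call.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```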
Microservices
Principles
- Single Responsibility
- Database per Service
- API-First Design
- Decentralized Governance
Communication
| Sync | Async |
| REST | Message Queue |
| gRPC | Event Bus |
| GraphQL | Webhooks |
Domain-Driven Design
Concepts
| Concept | Definition |
| Bounded Context | Boundary within which a model applies |
| Entity | Unique identity (ID) |
| Value Object | Compared by value, immutable |
| Aggregate | Cluster of objects changed in one transaction |
| Domain Event | Business fact that has occurred |
1 Bounded Context ~ 1 Microservice
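The Entity versus Value Object distinction maps directly onto equality semantics. A sketch with Python dataclasses; `Money` and `Customer` are made-up examples.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Money:
    """Value object: immutable, equal when all fields are equal."""
    amount: int
    currency: str

@dataclass(eq=False)
class Customer:
    """Entity: identified by customer_id, not by its attribute values."""
    customer_id: str
    name: str

    def __eq__(self, other):
        return isinstance(other, Customer) and self.customer_id == other.customer_id

    def __hash__(self):
        return hash(self.customer_id)

same_value = Money(5, "EUR") == Money(5, "EUR")                        # equal by value
same_entity = Customer("c1", "Ada") == Customer("c1", "Ada Lovelace")  # equal by ID
```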
API Design
REST Best Practices
GET /users # List
GET /users/{id} # Read
POST /users # Create
PUT /users/{id} # Replace
PATCH /users/{id} # Partial update
DELETE /users/{id} # Delete
Status codes:
200 OK, 201 Created, 204 No Content
400 Bad Request, 401 Unauthorized
404 Not Found, 500 Server Error
REST vs gRPC vs GraphQL
| Criterion | REST | gRPC | GraphQL |
| Format | JSON | Protobuf | JSON |
| Perf | Good | Excellent | Good |
| Flex | Fixed | Fixed | Flexible |
C4 Model
4 Zoom Levels
- Context - System + users + external systems
- Container - Apps, DBs, services
- Component - Internal modules
- Code - Classes (optional)
Tools
Structurizr (from the creator of the C4 model)
PlantUML + C4 extension
Draw.io / Diagrams.net
Mermaid (Markdown)
ADR Template
# ADR-NNN: Title
## Status
Proposed | Accepted | Deprecated
## Context
Why is this decision needed?
## Decision
What we decided
## Alternatives
Options considered
## Consequences
+ Positive
- Negative
! Risks
## References
Links, docs
Common Case Studies
| System | Focus |
| Twitter | Fan-out, Timeline |
| Instagram | Photo storage, Feed |
| Netflix | Video CDN, Adaptive bitrate streaming |
| Uber | Geo, Real-time matching |
| WhatsApp | Messaging, E2E encryption |
| Dropbox | File sync, Dedup |
| URL Shortener | Hashing, Redirect |
| Rate Limiter | Token bucket, Sliding window |
Latencies to Know
L1 cache ref: 0.5 ns
L2 cache ref: 7 ns
RAM ref: 100 ns
SSD random read: 150,000 ns (150 us)
HDD seek: 10,000,000 ns (10 ms)
Network round trip, same DC: 500 us
Network round trip, intercontinental: 150 ms
Read 1MB from RAM: 250 us
Read 1MB from SSD: 1 ms
Read 1MB from HDD: 20 ms
Network round trip, same region (rough rule of thumb): ~1 ms
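These figures are most useful when scaled up. The sketch below extrapolates the three "Read 1MB" numbers linearly to 1 GB; treating throughput as constant is a simplifying assumption.

```python
# Microseconds to read 1 MB sequentially, taken from the list above.
READ_1MB_US = {"RAM": 250, "SSD": 1_000, "HDD": 20_000}

def sequential_read_seconds(size_mb, medium):
    """Linear extrapolation of the 1 MB read time to `size_mb` megabytes."""
    return size_mb * READ_1MB_US[medium] / 1_000_000

for medium in READ_1MB_US:
    print(f"1 GB from {medium}: {sequential_read_seconds(1_000, medium):.2f} s")
```

The roughly 80x gap between RAM and HDD is the usual argument for putting hot data in a cache.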
Checklist Interview
- [ ] Clarifier requirements
- [ ] Estimer traffic/storage
- [ ] API design (endpoints)
- [ ] Database schema
- [ ] High-level architecture
- [ ] Deep dive on critical components
- [ ] Scaling strategy
- [ ] Caching layers
- [ ] Single points of failure
- [ ] Trade-offs discussed
System Architect Training - Phase 4 Architecture