Cheatsheet Phase 4 - Architecture & System Design

System Design Framework

5 Steps (45-min interview)

  1. Requirements (5 min) - Functional + non-functional
  2. Estimation (5 min) - Traffic, Storage, Bandwidth
  3. High-Level Design (15 min) - Components + APIs
  4. Deep Dive (15 min) - 2-3 critical components
  5. Trade-offs (5 min) - Bottlenecks, alternatives
Always clarify before designing!

Quick Estimates

Conversions

1 day   = 86,400 sec ~ 100K sec
1 month = 2.6M sec (30 days) ~ 2.5M sec
1 year  = 31.5M sec ~ 30M sec

1 KB = 1,000 bytes
1 MB = 1,000 KB = 10^6 bytes
1 GB = 1,000 MB = 10^9 bytes
1 TB = 1,000 GB = 10^12 bytes

Formulas

QPS (average) = DAU * actions/user / 86400
Storage/year  = writes/day * size * 365
Bandwidth     = QPS * size
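
The three formulas above can be sketched as a small helper. The function name estimate, the write_ratio parameter and the sample numbers (10M DAU, 10 actions per user, 1 KB payloads, 10% writes) are illustrative assumptions, not figures from this sheet.

```python
SECONDS_PER_DAY = 86_400


def estimate(dau, actions_per_user, payload_bytes, write_ratio=0.1):
    """Return (average QPS, bandwidth in bytes/s, storage per year in bytes).

    write_ratio is an assumed share of actions that persist data.
    """
    qps = dau * actions_per_user / SECONDS_PER_DAY
    bandwidth = qps * payload_bytes
    writes_per_day = dau * actions_per_user * write_ratio
    storage_per_year = writes_per_day * payload_bytes * 365
    return qps, bandwidth, storage_per_year


qps, bandwidth, storage = estimate(
    dau=10_000_000, actions_per_user=10, payload_bytes=1_000
)
print(f"QPS ~ {qps:,.0f}")
print(f"Bandwidth ~ {bandwidth / 1e6:.2f} MB/s")
print(f"Storage/year ~ {storage / 1e12:.2f} TB")
```

In an interview, round aggressively (86,400 ~ 100K) and state the assumed ratios out loud.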

Availability (Nines)

SLA       Downtime/year  Downtime/month
99%       3.65 days      7.3 h
99.9%     8.76 h         43.8 min
99.99%    52.6 min       4.38 min
99.999%   5.26 min       26 sec

System availability (components in series) = D1 * D2 * D3...
With redundancy (n replicas in parallel)   = 1 - (1-D)^n
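
A minimal sketch of the two availability formulas above, plus the downtime conversion behind the table. The function names are hypothetical, and the redundancy formula assumes replicas fail independently.

```python
import math


def serial(*availabilities):
    """Availability of components that must all be up (D1 * D2 * ...)."""
    return math.prod(availabilities)


def redundant(availability, n):
    """Availability of n independent replicas where one is enough."""
    return 1 - (1 - availability) ** n


def downtime_hours_per_year(availability):
    """Expected downtime over a 365-day year, in hours."""
    return (1 - availability) * 365 * 24


print(f"3 services at 99.9% in series: {serial(0.999, 0.999, 0.999):.4%}")
print(f"2 replicas at 99%: {redundant(0.99, 2):.4%}")
print(f"99.9% SLA: {downtime_hours_per_year(0.999):.2f} h/year")
```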

CAP Theorem

Consistency - Availability - Partition Tolerance

In a distributed system: when a partition occurs, choose C or A

Type  Examples                       Use Case
CP    MongoDB, HBase, Redis Cluster  Banking, Inventory
AP    Cassandra, DynamoDB, CouchDB   Social, Analytics
Typical classification only: actual behavior depends on configuration

ACID vs BASE

ACID: Atomicity, Consistency, Isolation, Durability
BASE: Basically Available, Soft state, Eventually consistent

Load Balancing

Algorithms

Algo         Description
Round Robin  Each server in turn
Weighted RR  Weighted by capacity
Least Conn   Fewest active connections
IP Hash      Same client IP -> same server (sticky sessions)
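
Three of the algorithms above, sketched in a few lines each. The class and function names are hypothetical, servers are plain strings, and CRC32 stands in for whatever hash a real load balancer would use.

```python
import itertools
import zlib


class RoundRobin:
    """Hand out servers in a fixed rotation."""

    def __init__(self, servers):
        self._cycle = itertools.cycle(servers)

    def pick(self):
        return next(self._cycle)


class LeastConnections:
    """Pick the server with the fewest active connections."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def pick(self):
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


def ip_hash(servers, client_ip):
    """Map a client IP to a stable server (sticky sessions)."""
    return servers[zlib.crc32(client_ip.encode()) % len(servers)]
```

Note that ip_hash remaps most clients when the server list changes size, which is the problem consistent hashing addresses.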

Layer 4 vs Layer 7

L4: TCP/UDP, rapide, simple
L7: HTTP, routing intelligent, SSL termination

Caching Strategies

Pattern        Read                            Write
Cache-Aside    App checks cache, DB on miss    Write DB, then invalidate cache
Write-Through  Via cache                       Cache + DB synchronously
Write-Behind   Via cache                       Cache now, DB asynchronously
Read-Through   Cache fetches from DB on miss   Directly to DB
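
A minimal sketch of the Cache-Aside row, assuming plain dicts stand in for both the cache and the database. The CacheAside class and its db_reads counter are hypothetical and only there to make the behavior visible.

```python
class CacheAside:
    def __init__(self, db):
        self.db = db          # dict standing in for the database
        self.cache = {}       # dict standing in for Redis/Memcached
        self.db_reads = 0     # counter for illustration only

    def read(self, key):
        # 1. The application checks the cache first.
        if key in self.cache:
            return self.cache[key]
        # 2. On a miss it loads from the DB and populates the cache.
        self.db_reads += 1
        value = self.db.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key, value):
        # Write to the DB, then invalidate (not update) the cached entry.
        self.db[key] = value
        self.cache.pop(key, None)


store = CacheAside({"user:1": "Ada"})
store.read("user:1")           # miss: hits the DB
store.read("user:1")           # hit: served from cache
store.write("user:1", "Grace")
print(store.read("user:1"), store.db_reads)
```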

Eviction Policies

LRU - Least Recently Used (common default)
LFU - Least Frequently Used
FIFO - First In First Out
TTL - Time To Live
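
LRU is the policy most often asked for in interviews. One possible sketch uses collections.OrderedDict; the LRUCache class name and its method names are assumptions made for this example.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.items = OrderedDict()  # oldest entry first, newest last

    def get(self, key, default=None):
        if key not in self.items:
            return default
        self.items.move_to_end(key)  # mark as most recently used
        return self.items[key]

    def put(self, key, value):
        self.items[key] = value
        self.items.move_to_end(key)
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)  # evict least recently used


cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")       # "a" becomes the most recent entry
cache.put("c", 3)    # evicts "b"
print(list(cache.items))
```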

Database Selection

Type         Examples           Use Case
Relational   PostgreSQL, MySQL  ACID, relations
Document     MongoDB            Flexible schema
Key-Value    Redis, DynamoDB    Cache, sessions
Wide-Column  Cassandra, HBase   Time-series, logs
Graph        Neo4j              Complex relationships

Scaling

Vertical: More RAM/CPU
Horizontal: Sharding, Replication
Read Replicas: Scale reads
Sharding: Scale writes

Sharding Strategies

Strategy     Pro                   Con
Hash-based   Uniform distribution  Resharding is hard
Range-based  Range queries         Hotspots
Geographic   Low local latency     Cross-region queries
Directory    Flexible              Lookup overhead
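
To see why resharding is hard with hash-based sharding, the sketch below counts how many keys change shard when a plain hash-modulo scheme grows from 4 to 5 shards. The shard_for function and the key format are hypothetical; SHA-256 is just a convenient uniform hash.

```python
import hashlib


def shard_for(key, shard_count):
    """Hash-based sharding: hash(key) mod number of shards."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


keys = [f"user:{i}" for i in range(10_000)]
moved = sum(shard_for(k, 4) != shard_for(k, 5) for k in keys)
print(f"{moved / len(keys):.0%} of keys move when going from 4 to 5 shards")
```

With a uniform hash, about 4 keys out of 5 are expected to move here, since a key keeps its shard only when its hash has the same remainder modulo 4 and modulo 5.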

Consistent Hashing

Ring with virtual nodes
Minimizes key redistribution when
nodes are added or removed
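
A minimal sketch of such a ring, assuming SHA-256 for positions and 100 virtual nodes per physical node. The HashRing class, the "node#i" naming of virtual nodes and the default vnodes value are choices made for this example.

```python
import bisect
import hashlib


def _position(value):
    """Map a string to a position on the ring."""
    return int.from_bytes(hashlib.sha256(value.encode()).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        # Each physical node owns many small arcs of the ring.
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_position(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def get(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First virtual node clockwise from the key, wrapping around.
        index = bisect.bisect_right(self._ring, (_position(key), ""))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["node-a", "node-b", "node-c", "node-d"])
keys = [f"user:{i}" for i in range(2_000)]
before = {k: ring.get(k) for k in keys}
ring.add("node-e")
moved = sum(before[k] != ring.get(k) for k in keys)
print(f"{moved / len(keys):.0%} of keys moved after adding a fifth node")
```

Only keys that land on the new node's arcs move, so the expected share is about 1/5 here, against about 4/5 for hash-modulo sharding.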

Message Queues

Tech           Use Case
RabbitMQ       Complex routing, RPC
Kafka          Event streaming, logs
SQS            Simple, serverless
Redis Pub/Sub  Real-time, ephemeral

Patterns

Point-to-Point: 1 consumer
Pub/Sub: Multiple consumers
Fan-out: Broadcast
Fan-in: Aggregation

Design Patterns

CQRS

Commands -> Write Model -> Write DB
Queries  -> Read Model  -> Read DB
Use: Asymmetric read/write load

Event Sourcing

Store events, not state
Replay to reconstruct
Use: Audit, time-travel
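
A minimal sketch of "store events, replay to reconstruct", assuming a bank-account style log. The Event type, the event kinds and the replay helper are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str     # "deposited" or "withdrawn"
    amount: int


def apply(balance, event):
    """Pure function: current state + one event -> next state."""
    if event.kind == "deposited":
        return balance + event.amount
    if event.kind == "withdrawn":
        return balance - event.amount
    raise ValueError(f"unknown event kind: {event.kind}")


def replay(events, upto=None):
    """Rebuild state from the log; upto=N gives the state after N events."""
    balance = 0
    for event in events[:upto]:
        balance = apply(balance, event)
    return balance


log = [Event("deposited", 100), Event("withdrawn", 30), Event("deposited", 5)]
print(replay(log))          # current balance
print(replay(log, upto=2))  # "time travel": balance after the first two events
```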

Saga Pattern

Distributed transactions
Choreography: Events
Orchestration: Coordinator
Compensating actions for rollback
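
The orchestration variant can be sketched as a coordinator that runs steps in order and, on failure, runs the compensations of the completed steps in reverse. The run_saga function and the order-flow step names are hypothetical, and real sagas also need durable state and idempotent steps, which this sketch leaves out.

```python
def run_saga(steps):
    """steps: list of (name, action, compensation). Returns (ok, log)."""
    log = []
    completed = []
    for name, action, compensate in steps:
        try:
            action()
            log.append(f"{name}: done")
            completed.append((name, compensate))
        except Exception as exc:
            log.append(f"{name}: failed ({exc})")
            # Roll back by compensating completed steps, newest first.
            for done_name, done_compensate in reversed(completed):
                done_compensate()
                log.append(f"{done_name}: compensated")
            return False, log
    return True, log


def charge_payment():
    raise RuntimeError("card declined")


state = {"stock": 10}
ok, log = run_saga([
    ("reserve_stock",
     lambda: state.update(stock=state["stock"] - 1),
     lambda: state.update(stock=state["stock"] + 1)),
    ("charge_payment", charge_payment, lambda: None),
    ("ship_order", lambda: None, lambda: None),
])
print(ok, log, state)
```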

Resilience Patterns

Circuit Breaker

CLOSED -> failures > threshold -> OPEN
OPEN -> timeout -> HALF-OPEN
HALF-OPEN -> success -> CLOSED
HALF-OPEN -> failure -> OPEN
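
The four transitions above can be sketched as a small state machine. The CircuitBreaker class, its parameter names and the injectable clock are assumptions for this example; it counts consecutive failures, which is one possible reading of "failures > threshold".

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests need no sleeping
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state == "OPEN":
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open")
            self.state = "HALF-OPEN"            # OPEN -> timeout -> HALF-OPEN
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if (self.state == "HALF-OPEN"
                    or self.failures > self.failure_threshold):
                self.state = "OPEN"             # trip, or re-open after a probe
                self.opened_at = self.clock()
            raise
        self.state = "CLOSED"                   # success closes the circuit
        self.failures = 0
        return result
```

While the circuit is OPEN, callers fail immediately with CircuitOpenError instead of waiting on a dependency that is known to be down.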

Other Patterns

Pattern        Description
Retry          Exponential backoff
Timeout        Fail fast
Bulkhead       Isolate resources
Rate Limiting  Throttle requests
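
A minimal sketch of retry with exponential backoff. The function names, default values and the injectable sleep are assumptions; production code would usually add random jitter and retry only on errors known to be transient.

```python
import time


def backoff_delays(retries, base=0.1, factor=2.0, cap=5.0):
    """Delays before each retry: base, base*factor, ... limited to cap."""
    return [min(cap, base * factor ** attempt) for attempt in range(retries)]


def retry(func, retries=3, base=0.1, factor=2.0, cap=5.0, sleep=time.sleep):
    """Call func, retrying up to `retries` times with growing delays."""
    for delay in backoff_delays(retries, base, factor, cap):
        try:
            return func()
        except Exception:
            sleep(delay)
    return func()  # final attempt: let any exception propagate


print(backoff_delays(5, base=1, factor=2, cap=8))
```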

Microservices

Principles

  • Single Responsibility
  • Database per Service
  • API-First Design
  • Decentralized Governance

Communication

Sync     Async
REST     Message Queue
gRPC     Event Bus
GraphQL  Webhooks

Domain-Driven Design

Concepts

Concept          Definition
Bounded Context  Boundary of the model
Entity           Unique identity (ID)
Value Object     Compared by value, immutable
Aggregate        Transactional cluster
Domain Event     Business fact

1 Bounded Context ~ 1 Microservice

API Design

REST Best Practices

GET    /users       # List
GET    /users/{id}  # Read
POST   /users       # Create
PUT    /users/{id}  # Replace
PATCH  /users/{id}  # Partial update
DELETE /users/{id}  # Delete

Status codes:
200 OK, 201 Created, 204 No Content
400 Bad Request, 401 Unauthorized
404 Not Found, 500 Internal Server Error

REST vs gRPC vs GraphQL

        REST   gRPC       GraphQL
Format  JSON   Protobuf   JSON
Perf    Good   Excellent  Good
Flex    Fixed  Fixed      Flexible

C4 Model

4 Zoom Levels

  1. Context - System + users + external systems
  2. Container - Apps, DBs, services
  3. Component - Internal modules
  4. Code - Classes (optional)

Tools

Structurizr (official)
PlantUML + C4 extension
Draw.io / Diagrams.net
Mermaid (Markdown)

ADR Template

# ADR-NNN: Title

## Status
Proposed | Accepted | Deprecated

## Context
Why is this decision needed?

## Decision
What was decided

## Alternatives
Options considered

## Consequences
+ Positive
- Negative
! Risks

## References
Links, docs

Common Case Studies

System         Focus
Twitter        Fan-out, Timeline
Instagram      Photo storage, Feed
Netflix        Video CDN, Adaptive streaming
Uber           Geo, Real-time matching
WhatsApp       Messaging, E2E encryption
Dropbox        File sync, Dedup
URL Shortener  Hashing, Redirect
Rate Limiter   Token bucket, Sliding window
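
For the rate limiter case, a token bucket can be sketched as follows. The TokenBucket class, its parameters and the injectable clock are assumptions for this example; a distributed limiter would keep this state in a shared store rather than in process memory.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1):
        """Return True and consume tokens if the request may proceed."""
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


bucket = TokenBucket(rate=5, capacity=10)
print([bucket.allow() for _ in range(12)].count(True))
```

The capacity bounds the burst a client can send at once, while the rate bounds its sustained throughput.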

Latencies to Know

L1 cache ref:           0.5 ns
L2 cache ref:             7 ns
RAM ref:                100 ns
SSD random read:     150,000 ns (150 us)
HDD seek:         10,000,000 ns (10 ms)
Network same DC RTT:    500 us
Network CA->NL->CA RTT: 150 ms

Read 1MB from RAM:      250 us
Read 1MB from SSD:        1 ms
Read 1MB from HDD:       20 ms
Network round trip:       1 ms

Checklist Interview

  • [ ] Clarifier requirements
  • [ ] Estimer traffic/storage
  • [ ] API design (endpoints)
  • [ ] Database schema
  • [ ] High-level architecture
  • [ ] Deep dive on critical components
  • [ ] Scaling strategy
  • [ ] Caching layers
  • [ ] Single points of failure
  • [ ] Trade-offs discussed
System Architect Training - Phase 4 Architecture