| Model | ID | In/Out (per 1M tok) | Context |
|---|---|---|---|
| Opus 4.6 | claude-opus-4-6 | $5 / $25 | 1M |
| Sonnet 4.5 | claude-sonnet-4-5-20250929 | $3 / $15 | 1M |
| Haiku 4.5 | claude-haiku-4-5-20251001 | $1 / $5 | 200K |
```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

message = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    system="You are an expert in software architecture.",
    messages=[
        {"role": "user", "content": "Explain microservices"}
    ],
)
print(message.content[0].text)
```
| Param | Type | Description |
|---|---|---|
| model | string* | Model ID |
| max_tokens | int* | Max output tokens |
| messages | array* | Conversation history |
| system | string | System prompt |
| temperature | float | 0-1 (default 1.0) |
| top_p | float | Nucleus sampling |
| top_k | int | Top-k sampling |
| stop_sequences | array | Stop sequences |
| stream | bool | Enable streaming |

* = required
```json
{
  "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
  "type": "message",
  "role": "assistant",
  "content": [{
    "type": "text",
    "text": "Microservices are..."
  }],
  "model": "claude-sonnet-4-5-20250929",
  "stop_reason": "end_turn",
  "usage": {
    "input_tokens": 25,
    "output_tokens": 150
  }
}
```
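The useful fields can be pulled out of that response body with plain dict access; a minimal sketch, using the sample response above as a hand-written dict:

```python
# Extract the generated text and token usage from a Messages API
# response body (here a plain dict matching the sample above).
response = {
    "id": "msg_01XFDUDYJgAACzvnptvVoYEL",
    "type": "message",
    "role": "assistant",
    "content": [{"type": "text", "text": "Microservices are..."}],
    "model": "claude-sonnet-4-5-20250929",
    "stop_reason": "end_turn",
    "usage": {"input_tokens": 25, "output_tokens": 150},
}

# content is a list of typed blocks; concatenate the text blocks.
text = "".join(b["text"] for b in response["content"] if b["type"] == "text")
total_tokens = (response["usage"]["input_tokens"]
                + response["usage"]["output_tokens"])

print(text)          # Microservices are...
print(total_tokens)  # 175
```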
```python
with client.messages.stream(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Write a poem"}],
) as stream:
    for text in stream.text_stream:
        print(text, end="", flush=True)
```
SSE events: message_start, content_block_start, content_block_delta, message_delta, message_stop
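The streamed text arrives in the content_block_delta events; a minimal sketch of accumulating it, over hand-written stand-in event dicts (simplified shapes, not the full real payloads):

```python
# Accumulate streamed text from a sequence of SSE-style events.
# The event dicts below are simplified stand-ins for illustration.
def collect_text(events):
    parts = []
    for event in events:
        if event["type"] == "content_block_delta":
            parts.append(event["delta"]["text"])
    return "".join(parts)

simulated = [
    {"type": "message_start"},
    {"type": "content_block_start"},
    {"type": "content_block_delta", "delta": {"text": "Hello, "}},
    {"type": "content_block_delta", "delta": {"text": "world"}},
    {"type": "message_delta"},
    {"type": "message_stop"},
]

print(collect_text(simulated))  # Hello, world
```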
```python
import base64

with open("image.png", "rb") as f:
    data = base64.standard_b64encode(f.read()).decode()

msg = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    messages=[{"role": "user", "content": [
        {"type": "image", "source": {
            "type": "base64",
            "media_type": "image/png",
            "data": data}},
        {"type": "text", "text": "Describe this image"},
    ]}],
)
```
Formats: JPEG, PNG, GIF, WebP | Max 20MB/image
| Code | Meaning | Action |
|---|---|---|
| 200 | Success | Process the response |
| 400 | Bad Request | Check the parameters |
| 401 | Unauthorized | Check the API key |
| 403 | Forbidden | Check permissions |
| 429 | Rate Limit | Retry with backoff |
| 500 | Server Error | Retry after a delay |
| 529 | Overloaded | Retry with backoff |
Headers: x-ratelimit-limit-requests, x-ratelimit-remaining-requests, retry-after
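Those headers can drive the wait time before a retry; a minimal sketch, assuming the header names above and an illustrative exponential fallback schedule:

```python
# Compute the delay before retrying: prefer the server's retry-after
# header, otherwise fall back to exponential backoff (illustrative).
def retry_delay(headers, attempt, base=1.0):
    retry_after = headers.get("retry-after")
    if retry_after is not None:
        return float(retry_after)
    return base * (2 ** attempt)  # 1s, 2s, 4s, ...

print(retry_delay({"retry-after": "5"}, attempt=0))  # 5.0
print(retry_delay({}, attempt=2))                    # 4.0
```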
```python
import time

def call_with_retry(messages, retries=3):
    for i in range(retries):
        try:
            return client.messages.create(
                model="claude-sonnet-4-5-20250929",
                max_tokens=1024,
                messages=messages,
            )
        except anthropic.RateLimitError:
            time.sleep(2 ** i)  # exponential backoff: 1s, 2s, 4s
    raise Exception("Max retries exceeded")
```
```python
msg = client.messages.create(
    model="claude-sonnet-4-5-20250929",
    max_tokens=1024,
    system=[{
        "type": "text",
        "text": "LONG SYSTEM PROMPT...",
        "cache_control": {"type": "ephemeral"},
    }],
    messages=[{"role": "user", "content": "Question"}],
)
```
```python
batch = client.messages.batches.create(
    requests=[
        {"custom_id": "r1", "params": {
            "model": "claude-sonnet-4-5-20250929",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": "Q1"}],
        }},
        # ... more requests
    ]
)
# Poll: client.messages.batches.retrieve(batch.id)
```
Use case: non-urgent bulk processing
```python
# Zero-shot CoT
"Solve this problem step by step:"

# Structured CoT
"""Analyze this architecture:
1. Identify the components
2. Evaluate each component
3. Identify the weaknesses
4. Propose improvements
5. Justify each choice"""
```
When: complex reasoning, math, logic, debugging
"""Classifie le sentiment:
Texte: "Super produit!" -> Positif
Texte: "Decevant" -> Negatif
Texte: "Correct, rien de special" -> Neutre
Texte: "J'adore cette formation!" -> """
```
<context>
E-commerce system, 1M users
</context>
<task>
Identify the 3 architectural risks
</task>
<format>
JSON: {risk, severity(1-5), mitigation}
</format>
<constraints>
- Maximum 200 words per risk
- Include concrete examples
</constraints>
```
"""Tu es un [ROLE] avec [EXPERIENCE].
CONTEXTE:
[Description du contexte]
REGLES:
- Regle 1
- Regle 2
- Regle 3
FORMAT DE SORTIE:
[Format attendu]
CONTRAINTES:
- Ne pas [contrainte 1]
- Toujours [contrainte 2]
EXEMPLES:
[Exemple entree] -> [Exemple sortie]
"""
```
Simple task?
├── YES -> Haiku 4.5 ($1/$5)
│          Classification, extraction, routing
└── NO -> Need max quality?
    ├── YES -> Opus 4.6 ($5/$25)
    │          Agents, reasoning, complex code
    └── NO -> Sonnet 4.5 ($3/$15)
              Coding, analysis, generation
```
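That decision tree can be written as a tiny router; a minimal sketch (the boolean task flags are illustrative, the model IDs come from the pricing table above):

```python
# Route a request to a model ID following the decision tree above.
def pick_model(simple_task: bool, need_max_quality: bool) -> str:
    if simple_task:
        return "claude-haiku-4-5-20251001"   # classification, extraction, routing
    if need_max_quality:
        return "claude-opus-4-6"             # agents, reasoning, complex code
    return "claude-sonnet-4-5-20250929"      # coding, analysis, generation

print(pick_model(simple_task=True, need_max_quality=False))
# claude-haiku-4-5-20251001
```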
Formula:
cost = (input_tokens / 1M * input_price)
     + (output_tokens / 1M * output_price)
Sonnet example:
1000 tokens in + 500 tokens out
= (1000/1M * $3) + (500/1M * $15)
= $0.003 + $0.0075
= $0.0105 per request
10K requests/day = $105/day
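The arithmetic above translates directly into a helper; a minimal sketch (function name is illustrative, prices are per 1M tokens as in the pricing table):

```python
# Per-request cost from the pricing formula above (prices in $ per 1M tokens).
def request_cost(input_tokens, output_tokens, price_in, price_out):
    return (input_tokens / 1_000_000 * price_in
            + output_tokens / 1_000_000 * price_out)

# Sonnet 4.5: $3 in / $15 out
cost = request_cost(1000, 500, 3.0, 15.0)
print(round(cost, 4))           # 0.0105 per request
print(round(cost * 10_000, 2))  # 105.0 per day at 10K requests/day
```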
| Content | Tokens (approx.) |
|---|---|
| 1 English word | ~1.3 tokens |
| 1 French word | ~1.5 tokens |
| 1 page of text (~500 words) | ~750 tokens |
| 1 line of code | ~10-15 tokens |
| 200K tokens | ~500 pages |
| 1M tokens | ~2,500 pages |
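Those ratios give a quick back-of-envelope estimator; a rough sketch (the per-word multipliers come from the table, the helper itself is illustrative):

```python
# Rough token estimate from a word count, using the ratios above.
TOKENS_PER_WORD = {"en": 1.3, "fr": 1.5}

def estimate_tokens(word_count, lang="en"):
    return round(word_count * TOKENS_PER_WORD[lang])

print(estimate_tokens(500))        # 650  (500 English words)
print(estimate_tokens(500, "fr"))  # 750  (500 French words)
```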
```shell
# Python
pip install anthropic

# TypeScript/JavaScript
npm install @anthropic-ai/sdk

# Claude Code CLI
npm install -g @anthropic-ai/claude-code

# Environment variable
export ANTHROPIC_API_KEY=sk-ant-api03-...

# Verify the Python install
python -c "import anthropic; print(anthropic.__version__)"
```