Prompt Engineering

Claude 3.5 Sonnet Optimization: How XML Tags Impact Tokenization and Cost

May 12, 20267 min read

The Anthropic Way: XML Tag Structuring

Unlike OpenAI's models, which are optimized heavily for system messages and Markdown keys, Anthropic's Claude family (including Claude 3 Opus and Claude 3.5 Sonnet) was trained specifically on XML-delimited text.

Anthropic recommends enclosing documents, rules, examples, and user context in tags like <context>, <instructions>, and <example>. While XML delimiters greatly enhance Claude's instruction adherence and accuracy, using them incorrectly can inflate token usage, especially when working with massive multi-shot prompt templates.

In this article, we will analyze the interaction between XML structure and Claude's tokenizer, and establish a pattern for low-cost, high-precision XML design.


1. Why XML Tags Are Highly Effective for Claude

Claude's underlying training data utilized XML tags to separate structured data, few-shot examples, and documents. Because of this, Claude has developed a strong cognitive bias towards recognizing content enclosed inside tags:

xml
<instructions>
Summarize the text enclosed in the <document> tags.
</instructions>

<document>
[Dynamic User Article Payload...]
</document>

By separating the instructions from the document using unique XML tags, Claude is highly unlikely to confuse user content with system instructions, protecting your application from prompt-injection vulnerabilities.


2. Minimizing XML Token Overheads

While XML is powerful, developers often make the mistake of creating highly verbose tag naming schemes. Remember that every character inside your tag names must be tokenized!

Verbose XML Tag Pattern:

xml
<UserAcmeSystemFormattingInstructionsAndRules>
Provide your response in JSON format.
</UserAcmeSystemFormattingInstructionsAndRules>

This custom tag uses 24 tokens just to open and close!

High-Efficiency XML Tag Pattern:

xml
<rules>
Provide your response in JSON format.
</rules>

This optimized tag uses only 4 tokens—saving 20 tokens on every single request.


3. Best Practices for XML Compression

To get the most out of Claude while minimizing API expenditures, apply these techniques:

Rules for XML Optimization:

  1. Keep Tag Names Extremely Short: Use single words or abbreviations for tag names (e.g., <docs> instead of <ReferenceDocumentation>, <cfg> instead of <SystemConfigurationInstructions>).
  2. Avoid Nested Tag Overkill: Do not nest XML tags three or four levels deep unless absolutely necessary for complex data structures. Each nesting layer adds closing tag overhead.
  3. Do Not Repeat Tags in User Messages: If a tag is already opened in the system prompt, you do not need to re-open and re-explain it in the user message. Maintain a strict, linear flow of information.
  4. Use Self-Closing Tags for Configuration: If you need to toggle parameters, use self-closing tags like <format type="json" /> rather than verbose blocks.

4. XML Token Benchmark

Here is a quick token count comparison for common XML patterns in Anthropic's tokenizer:

PatternVerbose TagOptimized TagSaved Tokens
System Context<SystemContextBackground><context>6 tokens
Few-Shot Example<FewShotPromptExampleCard><example>8 tokens
Reference File<UserUploadedReferenceFile><doc>7 tokens
Output Guidelines<ResponseFormattingRequirements><format>8 tokens

Implementing short tag structures throughout your code templates ensures that Claude continues to respond with elite precision while keeping your token budget completely under control.

Written By

SM
Sarah Miller
Senior Prompt Architect

Sarah Miller is a cognitive engineer and prompt architect who designs high-intent, low-token orchestration layers for enterprise generative AI deployments.

Related Articles