Claude Code CLI JSON structured output automation Python 2026

claude -p JSON Output Breaks in Production — Fix It with Structured Output

The Prompt Shelf ·

You built a pipeline. You run claude -p "extract the config values from this file and return JSON". It works in testing. In production, you get:

Here are the configuration values I found:

```json
{
  "database_url": "postgres://...",
  "api_key": "...",

Wait, let me also mention that the max_connections field appears in two places…

{ “database_url”: “postgres://…”,


Your JSON parser throws an exception. The pipeline fails. You wonder if you've been doing this wrong from the start.

You have. Here's what's actually happening and how to fix it.

---

## Why claude -p Output Breaks JSON Parsers

The `claude -p` flag runs Claude Code in "print" mode — a non-interactive session that outputs to stdout and exits. It's designed for scripting. But the default output format is *human-readable text*, not machine-readable JSON.

When you ask Claude to "return JSON" in natural language, you're asking the language model to format its response. It will usually comply. But several things can corrupt the output before it reaches your parser:

### Problem 1: Conversational framing

Claude is a conversational model by default. Without explicit output formatting, it treats your prompt as a conversation, not a function call. That means the JSON you asked for might be wrapped in:

- Introductory sentences ("Here are the values you requested:")
- Explanatory commentary after the JSON
- Inline markdown formatting with triple backticks around the JSON block

All of these break `json.loads()`.

### Problem 2: Extended thinking leaks

Claude's extended thinking feature generates reasoning tokens before the final response. In most configurations, thinking tokens are stripped before output. But in some streaming configurations or older CLI versions, thinking output can leak into stdout alongside the response.

A leaking thinking block looks like:
The user wants a JSON object with the database configuration values. Let me extract...

{ “database_url”: “postgres://…” }


Your parser sees `<thinking>` as invalid JSON and fails immediately.

### Problem 3: Streaming artifacts

The Claude API streams responses token by token. The `claude -p` CLI reassembles this stream before writing to stdout — but it's reassembling *text*, not structured data. Newline handling, buffering, and terminal encoding can all introduce artifacts that make valid JSON unparseable in specific environments.

This is the worst category because it's non-deterministic. The output looks fine in development (your terminal, your locale, your buffer size) and fails in production (a different OS, different Python version, different stdout encoding).

---

## The Fix: --output-format json

Claude Code's CLI has a `--output-format` flag with three options:

- `text` — human-readable text (default)
- `json` — structured JSON envelope
- `stream-json` — newline-delimited JSON for streaming consumers

For scripting use cases, `--output-format json` is what you want. It wraps the entire response in a JSON envelope:

```bash
claude -p "extract database config and return as JSON object" \
  --output-format json

Output:

{
  "type": "result",
  "subtype": "success",
  "cost_usd": 0.0023,
  "duration_ms": 1847,
  "result": "{\n  \"database_url\": \"postgres://localhost:5432/prod\",\n  \"max_connections\": 100,\n  \"ssl_mode\": \"require\"\n}",
  "session_id": "sess_01abc..."
}

The result field contains the model’s response as a string. If you asked Claude to return JSON, the string contains JSON. You parse the outer envelope first, then parse result.

This doesn’t solve the conversational framing problem entirely — Claude might still wrap the JSON in explanation text inside result. You need to address that in the prompt.


Writing Prompts That Return Clean JSON

The --output-format json flag handles the outer envelope. Your prompt handles the inner content.

The difference between a prompt that reliably returns JSON and one that doesn’t is how specifically you instruct the output format:

Unreliable:

Extract the config values from this file and return JSON.

Reliable:

Extract the config values from the file below.

Return ONLY a JSON object with no other text, explanation, or markdown formatting.
The JSON object must have these exact keys: database_url, max_connections, ssl_mode.
If a value is not found, use null.

Do not include backticks, code blocks, or any text outside the JSON object.

The key phrases:

  • “Return ONLY a JSON object with no other text”
  • “Do not include backticks, code blocks, or any text outside the JSON object”
  • Explicit field names with types and null behavior

This sounds verbose. It is. It’s also the difference between a pipeline that works and one that fails 3% of the time in ways that are hard to reproduce.


Python Implementation with Validation and Retry

Here’s a production-grade wrapper that handles parsing, validation, and retry:

import json
import subprocess
import time
from typing import Any, Optional
from pydantic import BaseModel, ValidationError


class ClaudeJSONResult(BaseModel):
    """Pydantic model for the outer claude --output-format json envelope."""
    type: str
    subtype: str
    cost_usd: float
    duration_ms: int
    result: str
    session_id: Optional[str] = None


def run_claude_json(
    prompt: str,
    max_retries: int = 3,
    retry_delay: float = 2.0,
    max_turns: int = 1,
) -> dict[str, Any]:
    """
    Run claude -p with --output-format json and return parsed inner result.
    
    Args:
        prompt: The prompt to send to Claude
        max_retries: Number of retry attempts on parse failure
        retry_delay: Seconds to wait between retries
        max_turns: Maximum conversation turns (default 1 for single-shot)
    
    Returns:
        Parsed JSON dict from Claude's response
    
    Raises:
        ValueError: If output cannot be parsed after all retries
        RuntimeError: If claude CLI returns non-zero exit code
    """
    cmd = [
        "claude",
        "-p", prompt,
        "--output-format", "json",
        "--max-turns", str(max_turns),
    ]
    
    last_error: Optional[Exception] = None
    
    for attempt in range(max_retries):
        if attempt > 0:
            time.sleep(retry_delay)
        
        try:
            proc = subprocess.run(
                cmd,
                capture_output=True,
                text=True,
                encoding="utf-8",
            )
            
            if proc.returncode != 0:
                raise RuntimeError(
                    f"claude exited with code {proc.returncode}: {proc.stderr}"
                )
            
            # Parse the outer JSON envelope
            outer = ClaudeJSONResult.model_validate_json(proc.stdout)
            
            if outer.subtype != "success":
                raise ValueError(f"Claude returned non-success subtype: {outer.subtype}")
            
            # Strip any accidental markdown fences from the inner result
            inner_text = strip_markdown_fences(outer.result.strip())
            
            # Parse the inner JSON (what Claude actually returned)
            return json.loads(inner_text)
        
        except (json.JSONDecodeError, ValidationError) as e:
            last_error = e
            continue
    
    raise ValueError(
        f"Failed to parse Claude JSON output after {max_retries} attempts. "
        f"Last error: {last_error}"
    )


def strip_markdown_fences(text: str) -> str:
    """
    Remove markdown code fences if Claude wrapped the JSON in them.
    Handles ```json ... ``` and ``` ... ``` patterns.
    """
    lines = text.splitlines()
    
    if not lines:
        return text
    
    # Strip opening fence
    first_line = lines[0].strip()
    if first_line.startswith("```"):
        lines = lines[1:]
    
    # Strip closing fence
    if lines and lines[-1].strip() == "```":
        lines = lines[:-1]
    
    return "\n".join(lines)

Usage:

prompt = """
Extract the database configuration from the following config file content.

Return ONLY a JSON object with no other text, explanation, or markdown.
Use these exact keys: database_url, max_connections, ssl_mode, port.
If a value is not found, use null.

Config file:
---
DATABASE_URL=postgres://user:pass@localhost:5432/prod
MAX_CONNECTIONS=100
SSL_MODE=require
---
"""

try:
    config = run_claude_json(prompt)
    print(f"Database URL: {config['database_url']}")
    print(f"Max connections: {config['max_connections']}")
except ValueError as e:
    print(f"Parse failed: {e}")
except RuntimeError as e:
    print(f"Claude CLI error: {e}")

Adding Schema Validation

The retry logic above catches JSON parse errors but not semantic errors — Claude might return valid JSON that doesn’t match your expected schema. Add Pydantic validation for the inner result too:

from pydantic import BaseModel, field_validator
from typing import Optional


class DatabaseConfig(BaseModel):
    database_url: str
    max_connections: int
    ssl_mode: str
    port: Optional[int] = None
    
    @field_validator("max_connections")
    @classmethod
    def max_connections_positive(cls, v: int) -> int:
        if v <= 0:
            raise ValueError("max_connections must be positive")
        return v
    
    @field_validator("ssl_mode")
    @classmethod
    def ssl_mode_valid(cls, v: str) -> str:
        valid_modes = {"disable", "allow", "prefer", "require", "verify-ca", "verify-full"}
        if v not in valid_modes:
            raise ValueError(f"ssl_mode must be one of {valid_modes}")
        return v


def extract_database_config(config_text: str) -> DatabaseConfig:
    prompt = f"""
Extract the database configuration from the text below.

Return ONLY a JSON object with no other text, explanation, or markdown.
Required keys: database_url (string), max_connections (integer), ssl_mode (string).
Optional keys: port (integer or null).

Text:
---
{config_text}
---
"""
    
    raw = run_claude_json(prompt)
    return DatabaseConfig.model_validate(raw)

If Claude returns an ssl_mode that isn’t in the valid set, Pydantic raises ValidationError immediately. The retry loop in run_claude_json catches it and retries. On retry, Claude often corrects itself because it has slightly different sampling.

This pattern — retry on validation failure, not just parse failure — is what separates a demo-grade integration from a production one.


Handling the stream-json Format

For long-running Claude tasks where you need to process output incrementally, --output-format stream-json outputs newline-delimited JSON. Each line is a JSON object representing a streaming event.

import json
import subprocess
from typing import Generator


def stream_claude_events(prompt: str) -> Generator[dict, None, None]:
    """Yield parsed events from claude --output-format stream-json."""
    cmd = [
        "claude",
        "-p", prompt,
        "--output-format", "stream-json",
        "--max-turns", "1",
    ]
    
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=True,
        encoding="utf-8",
    )
    
    for line in proc.stdout:
        line = line.strip()
        if not line:
            continue
        try:
            yield json.loads(line)
        except json.JSONDecodeError:
            # Partial lines or non-JSON output — skip
            continue
    
    proc.wait()
    if proc.returncode != 0:
        stderr = proc.stderr.read()
        raise RuntimeError(f"claude exited with code {proc.returncode}: {stderr}")


def extract_result_from_stream(prompt: str) -> str:
    """Run claude in streaming mode and return the final result text."""
    result_text = ""
    
    for event in stream_claude_events(prompt):
        event_type = event.get("type")
        
        if event_type == "assistant":
            # Extract text from assistant message content
            for block in event.get("message", {}).get("content", []):
                if block.get("type") == "text":
                    result_text += block.get("text", "")
        
        elif event_type == "result":
            # Final result event — use result field if available
            if event.get("subtype") == "success":
                result_text = event.get("result", result_text)
            break
    
    return result_text

The stream-json format is useful when:

  • Claude is doing multi-turn work and you want progress updates
  • The task is long-running and you want to detect stalls
  • You’re building a UI that shows Claude’s progress in real time

For simple extraction tasks, --output-format json is cleaner. Use stream-json when you actually need the streaming behavior.


Environment and Version Considerations

A few things that cause “works locally, breaks in CI” failures specifically with JSON output:

Python encoding. On some Linux systems, the default encoding for subprocess text output is ASCII, not UTF-8. If Claude’s response contains any non-ASCII character (in content it’s analyzing, not just in its JSON keys), it raises a UnicodeDecodeError. Fix it by passing encoding="utf-8" explicitly to subprocess.run().

Claude Code version pinning. The JSON envelope format has changed across Claude Code versions. The type, subtype, result field structure in the examples above is current as of mid-2026. If you’re using an older version, the field names may differ. Pin your Claude Code version in CI with npm install -g @anthropic-ai/[email protected].

ANTHROPIC_API_KEY scope. The claude -p command uses the API key from ANTHROPIC_API_KEY in the environment. In CI, this needs to be set explicitly. In local development, Claude Code usually has it stored from the interactive setup. The failure mode is a silent auth error that produces a non-JSON error message on stdout instead of the expected envelope, which then breaks your parser in a confusing way.


What About the API Directly?

For production workloads where you need structured output reliably, consider calling the Anthropic API directly with the response_format parameter rather than shelling out to the CLI.

The API’s structured output mode guarantees JSON-schema-conformant responses:

import anthropic
import json

client = anthropic.Anthropic()

response = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=1024,
    messages=[{
        "role": "user",
        "content": "Extract database config. Return JSON with keys: database_url, max_connections, ssl_mode."
    }]
)

# Response text should be clean JSON per the prompt
config = json.loads(response.content[0].text)

The CLI approach (claude -p) is appropriate for:

  • Scripts that need Claude Code’s file-system context (reading files in the project)
  • Workflows that use Claude Code’s tool-use capabilities (Bash, Edit, Read)
  • Tasks where you want the same Claude Code session that your engineers use

The API approach is appropriate for:

  • Pure text processing with no filesystem access needed
  • High-volume pipelines where shell process overhead matters
  • Production services where you need guaranteed response structure

For most automation tasks, the CLI with --output-format json is the right starting point. If you find yourself fighting the output format repeatedly, that’s the signal to switch to the API.

For the full picture on Claude Code permissions and what tools are available in CLI mode, see Claude Code Best Practices: The Official and Community-Tested Guide for 2026.


Quick Reference

# Basic JSON output
claude -p "your prompt" --output-format json

# With turn limit (prevents multi-turn for simple extraction)
claude -p "your prompt" --output-format json --max-turns 1

# Streaming JSON for long tasks
claude -p "your prompt" --output-format stream-json

# Parse the result in Python (one-liner for simple cases)
python3 -c "
import json, subprocess
out = subprocess.run(['claude', '-p', 'return {\"ok\": true}', '--output-format', 'json'], capture_output=True, text=True)
envelope = json.loads(out.stdout)
result = json.loads(envelope['result'])
print(result)
"

The pattern that works in production:

  1. --output-format json for the outer envelope
  2. Explicit prompt instructions for the inner JSON format
  3. strip_markdown_fences() as a safety net
  4. Pydantic validation on the parsed result
  5. Retry on both json.JSONDecodeError and ValidationError

Claude’s JSON output is reliable when you treat it as a two-layer problem: the CLI’s output format (fixed by the flag) and the model’s response format (fixed by the prompt and validated with schema). Handle both layers and the failures disappear.

Related Articles

Explore the collection

Browse all AI coding rules — CLAUDE.md, .cursorrules, AGENTS.md, and more.

Browse Rules