Prospeqt Spintax API

HTTP API that converts plain email copy into platform-specific spintax, then lints and QA-checks the output. Wraps OpenAI reasoning models (o3, gpt-5.x) and Anthropic Claude behind a stateless job interface. Also exposes batch processing for whole markdown sequence files and standalone lint/QA endpoints.

AI agents: the machine-readable version of this page is at /llms.txt. The OpenAPI 3.1 spec is at /openapi.json. Both are public (no auth).

Overview

Base URL: https://prospeqt-spintax.onrender.com
Authentication: Bearer token. Authorization: Bearer <BATCH_API_KEY> on every /api/* request.
Doc surfaces: /docs (this page), /llms.txt, /openapi.json - all public.
Content-Type: application/json
Determinism: Spintax generation is NOT deterministic (LLM sampling). Lint and QA are deterministic (same input always yields the same errors and warnings).
Typical latency: Single-body jobs 30-90s. Batches scale linearly with total_bodies / concurrency. Sync lint/QA: under 100ms.

Quickstart - one body

curl
curl -X POST https://prospeqt-spintax.onrender.com/api/spintax \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"text": "Hi {{firstName}}, noticed your team is hiring SDRs.", "platform": "instantly", "model": "o3"}'

Quickstart - poll the job

python
import json, os, time, urllib.request

API_KEY = os.environ["API_KEY"]  # same bearer token as the curl examples
BASE = "https://prospeqt-spintax.onrender.com"
HEADERS = {"Authorization": f"Bearer {API_KEY}", "Content-Type": "application/json"}

def submit(text, platform="instantly", model="o3"):
    body = json.dumps({"text": text, "platform": platform, "model": model}).encode()
    req = urllib.request.Request(f"{BASE}/api/spintax", data=body, headers=HEADERS, method="POST")
    with urllib.request.urlopen(req, timeout=10) as resp:
        return json.loads(resp.read())["job_id"]

def poll(job_id, interval=10, timeout=600):
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        req = urllib.request.Request(f"{BASE}/api/status/{job_id}", headers=HEADERS)
        with urllib.request.urlopen(req, timeout=10) as resp:
            data = json.loads(resp.read())
        if data["status"] in ("done", "failed"):
            return data
        time.sleep(interval)
    raise TimeoutError(f"Job {job_id} did not finish within {timeout}s")

Quickstart - dry-run a batch

curl
curl -X POST https://prospeqt-spintax.onrender.com/api/spintax/batch \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -nR --arg md \"$(cat sequence.md)\" '{md: $md, platform: \"instantly\", dry_run: true}')"

Authentication & limits

Send Authorization: Bearer <BATCH_API_KEY> on every /api/* request. The key is stored in ClickUp - ask Mihajlo in chat for the current value.

Limit | Value | Notes
Per-request rate limit | None | The server does not throttle individual requests.
Daily spend cap | $50 USD (default) | Enforced across all OpenAI and Anthropic calls. Configurable via DAILY_SPEND_CAP_USD.
Cap reached behavior | HTTP 429 | Body: {"error": "spend_cap_reached", "cap_usd": 50.0, "spent_usd": 50.12, "resets_at": "..."}. Resets at midnight UTC.
Job memory TTL | 1 hour | Jobs evicted from memory 1h after creation. Pull the result well before then.
Batch concurrency | 1 to 16 (default 4) | Concurrent bodies inside one batch. Set via the concurrency field.

Spend cap is global. If multiple integrations share the same server, coordinate before kicking off large batches - the cap applies across all callers.
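
If you hit the cap mid-run, the 429 body above tells you when it clears. A minimal way to surface it from Python (a sketch against the documented response shape; how you wait out resets_at is up to you):

python
import json, urllib.error, urllib.request

def post_json(req):
    """POST a prepared urllib request; surface the spend-cap body if the server returns 429."""
    try:
        with urllib.request.urlopen(req, timeout=10) as resp:
            return json.loads(resp.read())
    except urllib.error.HTTPError as e:
        if e.code == 429:
            cap = json.loads(e.read())  # {"error": "spend_cap_reached", ..., "resets_at": "..."}
            raise RuntimeError(f"Spend cap reached ({cap['spent_usd']}/{cap['cap_usd']} USD), resets at {cap['resets_at']}") from e
        raise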

Placeholders

The API treats {{firstName}}, {{companyName}}, {{accountSignature}}, and any other token wrapped in double braces as opaque placeholders. They are preserved verbatim in the spintax output - the spinner never alters or removes them. Instantly / EmailBison fill them in at send time from your lead data.

Convention for cold-email bodies

  • Greeting line uses {{firstName}}.
  • Sign-off line is exactly {{accountSignature}} on its own line. This expands to the sender's full email signature (name, title, company, links).
  • Custom variables (e.g. {{companyName}}, {{painPoint}}, {{redirectDomain}}) live anywhere in the body. The spinner does not validate that the placeholder name resolves to a real column on the sending platform - that's your responsibility.
Always include {{accountSignature}} at the end of full email bodies you submit. Examples below show the full pattern.
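
For example, a plain body that follows this convention (the same body used in the full curl example later on this page):

text
Hi {{firstName}},

Noticed {{companyName}} just hired 5 SDRs. Quick question - how is the onboarding holding up at that pace?

We help RevOps teams ramp new SDRs to quota in 30 days instead of 90. Worth a 15-min call this week?

{{accountSignature}}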

When to use this API

Use this API when an agent needs to:

  • Convert a plain email body into Instantly or EmailBison spintax syntax.
  • Spin a whole markdown sequence file (multiple segments and emails) in one batch and download the result as a zip.
  • Lint already-written spintax copy for syntax errors and length-balance issues.
  • QA-check spintax output against the original plain input for fidelity, drift, duplicate variations, and platform-specific markup violations.

Do NOT use this API for:

  • Writing email copy from scratch (this only spins existing copy - bring your own copy first).
  • Cross-language translation (the spinner preserves the input language; it does not translate).
  • General LLM completions (this is a constrained tool-call loop with hard format rules - use the OpenAI/Anthropic API directly for free-form generation).
  • Real-time UX where latency matters (single-body jobs take 30-90s; batches scale linearly).

Endpoints

All /api/* endpoints require Authorization: Bearer <key>. The doc surfaces (/docs, /llms.txt, /openapi.json) are public.

Method | Path | Description | Auth
POST | /api/spintax | Submit one body, get job_id. Async. | Bearer
GET | /api/status/{id} | Poll one job's state. | Bearer
POST | /api/spintax/batch | Submit a markdown file, fire N jobs. Async. | Bearer
GET | /api/spintax/batch/{id} | Poll a batch's state. | Bearer
POST | /api/spintax/batch/{id}/cancel | Cancel a running batch. | Bearer
GET | /api/spintax/batch/{id}/download | Download the result zip. | Bearer
POST | /api/lint | Sync lint of already-spun copy. | Bearer
POST | /api/qa | Sync QA of spun copy vs original. | Bearer
GET | /docs | This page. | Public
GET | /llms.txt | LLM-optimized markdown docs. | Public
GET | /openapi.json | OpenAPI 3.1 spec. | Public

POST /api/spintax

Submit one plain email body. Returns a job_id immediately; the actual generation runs in the background. Poll /api/status/{job_id} until status is done or failed.

Request body (JSON)

Field | Type | Required | Description
text | string | Yes | Plain email body to spin. Must not be empty.
platform | string | Yes | "instantly" or "emailbison". Determines spintax syntax ({a|b|c} vs [spin|a|b|c]).
model | string | No | Model name. Defaults to the server's OPENAI_MODEL env var (currently o3). See Models.
reasoning_effort | string | No | "low", "medium", or "high". Honored for OpenAI o-series and gpt-5.x models; ignored otherwise. Default "medium".

Example body

json
{
  "text": "Hi {{firstName}}, noticed your team is hiring SDRs...",
  "platform": "instantly",
  "model": "o3",
  "reasoning_effort": "medium"
}

Response (200 OK)

json
{"job_id": "8e2a7c0f-1c19-4a42-9f73-3a3d9d1b54ab"}

Errors

Status | When
401 | Missing or invalid bearer token.
422 | Invalid input (empty text, bad platform value).
429 | Daily spend cap reached. Body: {"error": "spend_cap_reached", "cap_usd": 50.0, "spent_usd": 50.12, "resets_at": "..."}.

GET /api/status/{job_id}

Poll a job's state. Jobs are retained in memory for 1 hour after creation, then evicted.

Response (200 OK)

json
{
  "job_id": "8e2a7c0f-1c19-4a42-9f73-3a3d9d1b54ab",
  "status": "done",
  "progress": null,
  "result": {
    "spintax_body": "Hi {{firstName}}, {noticed|saw} your team is {hiring|growing} SDRs...",
    "lint": {"passed": true, "errors": [], "warnings": []},
    "qa": {"passed": true, "errors": [], "warnings": []},
    "tool_calls": 2,
    "api_calls": 3,
    "cost_usd": 0.0512,
    "drift_revisions": 0,
    "drift_unresolved": []
  },
  "error": null,
  "error_detail": null,
  "cost_usd": 0.0512,
  "elapsed_sec": 47.3
}

Status values

Status | Meaning
queued | Job created, generation hasn't started.
drafting | Model is generating the first spintax draft.
linting | Running the deterministic linter on the draft.
iterating | Model is calling the lint tool to fix its own errors.
qa | Running QA checks (drift, duplicates, fidelity).
done | TERMINAL. result is populated.
failed | TERMINAL. error (machine key) and error_detail (human message) are populated.

Result fields (only when status == "done")

Field | Type | Description
spintax_body | string | The final spintax-formatted email body.
lint.passed / errors / warnings | object | Final lint result on the spintax body.
qa.passed / errors / warnings | object | Final QA result against the original input.
tool_calls | int | Number of times the model invoked the lint tool inside its loop.
api_calls | int | Total round-trips to OpenAI/Anthropic for this job (drafts + revisions).
cost_usd | float | Accumulated USD cost for this job.
drift_revisions | int | Number of drift-revision passes triggered (0 = clean on first try). See Drift revision loop.
drift_unresolved | string[] | Drift warnings that REMAINED after all revision attempts. Empty when drift was resolved or never detected.

Top-level fields (always present)

Field | Type | Description
cost_usd | float | Same as result.cost_usd for completed jobs; cumulative-so-far for in-flight jobs.
elapsed_sec | float | Wall-clock seconds since job creation.
error | string or null | Machine-readable error key. Null unless status == "failed". See Error codes.
error_detail | string or null | Human-readable provider message. Null unless status == "failed".

Errors

Status | When
401 | Missing or invalid bearer token.
404 | Job not found or expired (TTL 1h).

POST /api/spintax/batch

Parse a markdown sequence file, then spin every email body inside it concurrently. Returns a batch_id immediately. The batch runs in the background.

Request body (JSON)

Field | Type | Required | Description
md | string | Yes | Full markdown document. Parsed by an o4-mini structured-output parser into segments and email bodies.
platform | string | Yes | "instantly" or "emailbison".
model | string | No | Model used for spintax generation. Defaults to the server's OPENAI_MODEL.
concurrency | int | No | Concurrent jobs (1..16, default 4).
dry_run | bool | No | If true, parse only and return the structure WITHOUT firing any spintax jobs. Default false.

Example body

json
{
  "md": "# Segment A\n\n## Email 1\n\nSubject: ...\n\nBody...\n\n## Email 2\n\nSubject: ...\n\nBody...",
  "platform": "instantly",
  "model": "o3",
  "concurrency": 4,
  "dry_run": false
}

Response (200 OK)

json
{
  "batch_id": "b1f3...",
  "parsed": {
    "segments": [
      {
        "name": "Recruiter persona",
        "section": "Cold sequence",
        "email_count": 5,
        "emails_to_spin": 1,
        "warnings": []
      }
    ],
    "total_bodies": 5,
    "total_bodies_to_spin": 1,
    "warnings": []
  },
  "status": "running",
  "fired": true,
  "total_jobs": 5
}

Email-1-only rule: Only Email 1 of each segment hits the spinner. Emails 2-5 are passed through unchanged - the runner enforces this because re-spinning follow-ups produces drift. emails_to_spin reflects the actual OpenAI call count.
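
Before firing a real batch, dry-run it and sanity-check the parsed structure. A sketch that reuses the BASE and HEADERS setup from the quickstart and assumes the dry-run response carries the same parsed block as a live submit:

python
import json, urllib.request

def dry_run_batch(md_text, platform="instantly"):
    """Parse only - no spintax jobs are fired and no generation cost is incurred."""
    body = json.dumps({"md": md_text, "platform": platform, "dry_run": True}).encode()
    req = urllib.request.Request(f"{BASE}/api/spintax/batch", data=body, headers=HEADERS, method="POST")
    with urllib.request.urlopen(req, timeout=60) as resp:
        parsed = json.loads(resp.read())["parsed"]
    for seg in parsed["segments"]:
        print(seg["name"], "- emails:", seg["email_count"], "to spin:", seg["emails_to_spin"], "warnings:", seg["warnings"])
    print("total bodies to spin:", parsed["total_bodies_to_spin"])
    return parsed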

Errors

Status | When
401 | Missing or invalid bearer token.
422 | Empty md, bad platform, or parser found zero segments. Body when zero segments: {"error": "no_segments_found", "message": "...", "warnings": [...]}.
500 | Parser crashed unexpectedly.

GET /api/spintax/batch/{batch_id}

Poll a batch's state.

Response (200 OK)

json
{
  "batch_id": "b1f3...",
  "status": "running",
  "platform": "instantly",
  "model": "o3",
  "completed": 3,
  "failed": 0,
  "in_progress": 1,
  "retrying": 0,
  "queued": 1,
  "total": 5,
  "retries_used": 1,
  "elapsed_sec": 124.8,
  "cost_usd_so_far": 0.18,
  "cost_usd_estimated_total": 0.30,
  "failure_reason": null,
  "download_url": null,
  "parsed": { "...same shape as submit response..." }
}

Status values: parsed, running, done, failed, cancelled. download_url is non-null when status is done or cancelled (partial output downloads are allowed).
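
A Python counterpart to the bash example further down: poll until a terminal state, then stream the zip (a sketch that reuses the BASE and HEADERS setup from the quickstart):

python
import json, time, urllib.request

def wait_and_download(batch_id, out_path="spintax_output.zip", interval=10):
    """Poll the batch every `interval` seconds and save the result zip once it is done or cancelled."""
    while True:
        req = urllib.request.Request(f"{BASE}/api/spintax/batch/{batch_id}", headers=HEADERS)
        with urllib.request.urlopen(req, timeout=10) as resp:
            data = json.loads(resp.read())
        if data["status"] in ("done", "cancelled"):
            break
        if data["status"] == "failed":
            raise RuntimeError(data.get("failure_reason") or "batch failed")
        time.sleep(interval)
    dl = urllib.request.Request(f"{BASE}/api/spintax/batch/{batch_id}/download", headers=HEADERS)
    with urllib.request.urlopen(dl, timeout=60) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
    return out_path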

POST /api/spintax/batch/{batch_id}/cancel

Mark a batch as cancelled. In-flight bodies finish naturally; queued bodies are skipped. Idempotent - calling on a terminal batch returns cancelled: false with a message.

Response

json
{"batch_id": "b1f3...", "status": "cancelled", "cancelled": true}
GET /api/spintax/batch/{batch_id}/download

Stream the final .zip containing the spun markdown plus a report.md summary.

Status | Body | When
200 | application/zip, Content-Disposition: attachment | Batch is done or cancelled.
404 | not found | Batch ID does not exist or has been evicted.
409 | {"error": "batch_not_complete", "message": "...", "status": "running"} | Batch is still running. Wait or call cancel first.

POST /api/lint

Deterministic lint on already-spun copy. No LLM, no cost, no async - synchronous response.

Request body (JSON)

Field | Type | Required | Description
text | string | Yes | Spintax copy to lint. Must contain at least one block.
platform | string | Yes | "instantly" or "emailbison".
tolerance | float | No | Length-balance tolerance fraction (0.0..1.0). Default 0.05 (5%). Variations longer or shorter than the base by more than this trigger a warning.
tolerance_floor | int | No | Minimum absolute char tolerance; protects short blocks. Effective tolerance = max(base * tolerance, floor). Default 3.
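
The length-balance rule behind tolerance and tolerance_floor, illustrated (a sketch of the math as described above, not the linter's exact implementation; it assumes "base" means variation 1 of the block):

python
def length_warnings(variations, tolerance=0.05, tolerance_floor=3):
    """Flag variations whose length differs from the base (variation 1) by more than the effective tolerance."""
    base_len = len(variations[0])
    effective = max(base_len * tolerance, tolerance_floor)  # effective tolerance = max(base * tolerance, floor)
    return [
        f"variation {i + 1} is {len(v) - base_len:+d} chars vs base"
        for i, v in enumerate(variations)
        if abs(len(v) - base_len) > effective
    ]

# length_warnings(["noticed", "spotted", "saw earlier today"]) -> flags variation 3 only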

Example body

json
{
  "text": "Hi {{firstName}}, {noticed|saw} your team...",
  "platform": "instantly",
  "tolerance": 0.05,
  "tolerance_floor": 3
}

Response (200 OK)

json
{
  "errors": [],
  "warnings": ["Block 2: variation 3 is 18 chars longer than base."],
  "passed": true,
  "error_count": 0,
  "warning_count": 1
}

passed is true iff errors is empty. Warnings are advisory and do NOT affect passed.

POST /api/qa

Deterministic QA against the original plain input. Synchronous.

Request body (JSON)

json
{
  "output_text": "Hi {{firstName}}, {noticed|saw} your team...",
  "input_text": "Hi {{firstName}}, noticed your team...",
  "platform": "instantly"
}

Response (200 OK)

json
{
  "passed": true,
  "error_count": 0,
  "warning_count": 0,
  "errors": [],
  "warnings": [],
  "block_count": 4,
  "input_paragraph_count": 2
}

QA checks performed: V1 fidelity (variation 1 of every block matches the original), block count vs input paragraph count, greeting whitelist, duplicate variations within a block, smart quotes, doubled punctuation, concept drift (new content words introduced in variations 2+).
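
Roughly what the V1 fidelity check means: collapsing every block to its first variation should reproduce the original paragraph. A sketch for the Instantly {a|b|c} syntax only (it assumes no nested blocks and is not the QA checker's actual code):

python
import re

def collapse_to_v1(spintax: str) -> str:
    """Replace every {a|b|c} block with its first variation; {{placeholders}} contain no | and are left untouched."""
    return re.sub(r"\{([^{}|]*)\|[^{}]*\}", r"\1", spintax)

assert collapse_to_v1("Hi {{firstName}}, {noticed|saw} your team is {hiring|growing} SDRs.") \
    == "Hi {{firstName}}, noticed your team is hiring SDRs."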

Models

All currently available models. Cost-per-job estimates assume a typical email body (~3000 input tokens + ~2000 output tokens). Actual cost varies with copy length and reasoning effort.

Model | Family | Endpoint family | Approx cost / body | Notes
o3 (default) | OpenAI o-series | chat completions | ~$0.05 | Reliable. Good drift resistance. The default.
o3-mini | OpenAI o-series | chat completions | ~$0.013 | Cheaper, less reliable on long bodies.
o3-pro | OpenAI o-series | chat completions | ~$0.22 | Slow. Use only when o3 keeps drifting.
o4-mini | OpenAI o-series | chat completions | ~$0.013 | Same price as o3-mini. Used for the markdown parser, not normally for spintax.
o1, o1-mini | OpenAI o-series (legacy) | chat completions | ~$0.18 / ~$0.013 | Legacy. Prefer the o3 family.
gpt-4.1, gpt-4.1-mini | OpenAI GPT-4.1 | chat completions | ~$0.022 / ~$0.004 | Non-reasoning. Faster, less reliable on tool-call loops.
gpt-5 | OpenAI GPT-5.x | responses API | ~$0.027 | Routed through /v1/responses. Reasoning + tools combo.
gpt-5-mini | OpenAI GPT-5.x | responses API | ~$0.005 | Cheapest gpt-5 variant.
gpt-5.5 | OpenAI GPT-5.x | responses API | ~$0.055 | Strong drift resistance. ~$0.26/seg observed at full reasoning.
gpt-5.5-pro | OpenAI GPT-5.x | responses API | ~$0.11 | Top-of-line. Single-data-point benchmark: ~$0.30/seg.
claude-opus-4-7 | Anthropic | messages API | ~$0.065 | Confirmed pricing. Single-data-point benchmark: ~$0.48/seg.
claude-sonnet-4-6 | Anthropic | messages API | ~$0.039 | Mid-tier Anthropic. Confirmed pricing.

reasoning_effort is honored for o-series and gpt-5.x. Anthropic models use their own thinking config; reasoning_effort is ignored. If model is set to a value not in the registry, the job fails with model_not_found and error_detail echoes the provider's message.

Drift revision loop

The runner has a self-correction loop layered on top of the model's tool-call loop. After the model produces a spintax draft and runs lint, the runner runs QA. If QA reports concept-drift warnings (variations 2-N introduce nouns or content words that aren't in V1 - meaning the model invented new context), the runner sends a revision prompt back to the model:

revision prompt
REVISION PASS #N - concept drift detected.

The following drift warnings were raised:
  - Block 3 variation 2 introduces "quarterly" not present in V1.
  - Block 5 variation 4 introduces "stakeholders" not present in V1.

Keep V1 fidelity intact. Swap drifted nouns for synonyms only.
Do NOT introduce new concepts. Re-emit the full spintax body.

Up to MAX_DRIFT_REVISIONS = 3 revision passes. The loop exits the moment QA reports zero drift, or after the third revision attempt regardless.
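
In code terms, the outer loop looks roughly like this (a sketch only - draft, lint_fix, qa, and revise are hypothetical stand-ins for runner internals the API does not expose, and qa is assumed to return the list of drift warnings):

python
MAX_DRIFT_REVISIONS = 3

def spin_with_drift_control(plain_text, platform, draft, lint_fix, qa, revise):
    """Sketch of the runner's drift-revision loop; the callables are hypothetical stand-ins."""
    spintax = lint_fix(draft(plain_text, platform))  # model drafts, then fixes its own lint errors
    drift = qa(spintax, plain_text)                  # deterministic QA: concept-drift warnings
    revisions = 0
    while drift and revisions < MAX_DRIFT_REVISIONS:
        revisions += 1
        spintax = revise(spintax, drift, revisions)  # sends the "REVISION PASS #N" prompt above
        drift = qa(spintax, plain_text)
    return spintax, revisions, drift                 # -> spintax_body, drift_revisions, drift_unresolved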

How agents should interpret the result

Result state | Meaning
drift_revisions == 0 | Model was clean on first try. Highest-quality output.
drift_revisions in {1, 2, 3} | Model needed corrections but converged. Output is acceptable. Higher numbers correlate with weaker model fit for this copy.
drift_unresolved empty AND drift_revisions > 0 | Drift was caught and fixed. Use the output.
drift_unresolved non-empty | Model could NOT resolve drift in 3 attempts. Output is still returned so you can salvage it, but treat the listed phrases as suspect. Consider re-running on a stronger model (o3-pro or gpt-5.5-pro), shortening the input, or accepting the drift if the new wording is acceptable.
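
Translated into agent logic, a conservative way to act on those fields (a sketch; the escalation model is the table's suggestion, not a server behavior):

python
def accept_or_escalate(result):
    """Decide what to do with a finished job's result dict (status == "done")."""
    if result["drift_unresolved"]:
        # Drift survived all revision passes - output is usable but the listed phrases are suspect.
        return "rerun_stronger_model", {"model": "o3-pro", "suspect": result["drift_unresolved"]}
    if not result["lint"]["passed"] or not result["qa"]["passed"]:
        return "review", result["lint"]["errors"] + result["qa"]["errors"]
    if result["qa"]["warnings"]:
        return "accept_with_warnings", result["qa"]["warnings"]
    return "accept", None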

Error codes

When a job fails (status == "failed"), error is one of the keys below and error_detail carries the human-readable provider message (truncated to 500 chars).

error key | Meaning | Recovery
openai_timeout | Provider request exceeded the per-call timeout. Example: "Request timed out after 120s". | Retry once. If it persists, switch to a smaller/faster model (o3-mini or gpt-4.1-mini).
openai_quota | Provider quota / rate limit hit. Example: "Rate limit reached for o3 in organization org_...". | Wait a few seconds and retry. Reduce batch concurrency if it's frequent.
max_tool_calls | Model exhausted the 10-tool-call ceiling without producing valid output. Example: "Reached max tool calls (10) without finishing". | Re-run with a stronger model (o3-pro or gpt-5.5). The current model can't fit your copy in 10 lint cycles.
malformed_response | Provider returned something the runner couldn't parse (no spintax body, broken tool call, truncated JSON). Example: "Could not extract spintax_body from response". | Retry once. If it persists, simplify the input copy (shorter paragraphs, fewer placeholders).
auth_failed | Provider rejected the API key. Example: "Incorrect API key provided: sk-..." or "invalid x-api-key". | Server config issue, not a caller issue. Tell Mihajlo.
low_balance | Provider account is out of credits / billing failed. Example: "Your credit balance is too low to access the Anthropic API". | Server config issue. Tell Mihajlo. Switch to an OpenAI model in the meantime.
bad_request | Provider rejected the request shape (typically a model-specific parameter mismatch). Example: "Unsupported value: 'reasoning_effort' is not supported with this model". | Adjust the request - e.g., omit reasoning_effort for non-reasoning models.
model_not_found | Provider doesn't recognize the model name. Example: "The model 'gpt-99' does not exist or you do not have access to it". | Check the Models table above. The model name is case-sensitive.
internal_error | Anything else. Example: "KeyError: 'choices'". | Retry once. If it persists, capture the job_id and elapsed_sec and report it.

Polling pattern

Recommended polling cadence: every 10 seconds. Single-body jobs typically finish in 30-90s; batches scale linearly with total_bodies / concurrency.

Terminal states are done and failed. Do not poll beyond a terminal state - the response will not change. Jobs are evicted from memory 1 hour after creation, so finish polling and pull the result well before then.

The submit and poll helpers from the quickstart above implement this pattern as-is; reuse them rather than re-implementing the loop.

Examples

Single-body spintax with model selection

bash
JOB_ID=$(curl -sX POST https://prospeqt-spintax.onrender.com/api/spintax \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "text": "Hi {{firstName}},\n\nNoticed {{companyName}} just hired 5 SDRs. Quick question - how is the onboarding holding up at that pace?\n\nWe help RevOps teams ramp new SDRs to quota in 30 days instead of 90. Worth a 15-min call this week?\n\n{{accountSignature}}",
    "platform": "instantly",
    "model": "gpt-5.5",
    "reasoning_effort": "high"
  }' | jq -r .job_id)

while true; do
  STATUS=$(curl -sH "Authorization: Bearer $API_KEY" \
    https://prospeqt-spintax.onrender.com/api/status/$JOB_ID)
  STATE=$(echo "$STATUS" | jq -r .status)
  [ "$STATE" = "done" ] && echo "$STATUS" | jq .result.spintax_body && break
  [ "$STATE" = "failed" ] && echo "$STATUS" | jq '{error, error_detail}' && exit 1
  sleep 10
done

Batch from a multi-segment markdown file

bash
BATCH_ID=$(curl -sX POST https://prospeqt-spintax.onrender.com/api/spintax/batch \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d "$(jq -nR --arg md \"$(cat sequence.md)\" '{md: $md, platform: \"instantly\", model: \"o3\", concurrency: 4}')" \
  | jq -r .batch_id)

while true; do
  STATE=$(curl -sH "Authorization: Bearer $API_KEY" \
    https://prospeqt-spintax.onrender.com/api/spintax/batch/$BATCH_ID | jq -r .status)
  [ "$STATE" = "done" ] && break
  [ "$STATE" = "failed" ] && exit 1
  sleep 10
done

curl -OJ -H "Authorization: Bearer $API_KEY" \
  https://prospeqt-spintax.onrender.com/api/spintax/batch/$BATCH_ID/download

Tip: use dry_run: true first to confirm the parser found the segments and bodies you expect, then re-submit with dry_run: false.

QA-only check on existing spintax copy

Use this when you already have spun copy (from a previous run, or hand-written) and want to verify it before pushing to a sending platform. Returns synchronously - no job, no cost, no polling.

bash
curl -sX POST https://prospeqt-spintax.onrender.com/api/qa \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "output_text": "Hi {{firstName}}, {noticed|saw} your team is {hiring|growing}...",
    "input_text": "Hi {{firstName}}, noticed your team is hiring...",
    "platform": "instantly"
  }'

Guide for AI agents

What's idiomatic

  • Submit with POST /api/spintax, then poll /api/status/{id} every 10s. Don't poll faster - it costs you nothing but it costs the server.
  • For >2 bodies, use /api/spintax/batch instead of N parallel single-body calls. The batch endpoint manages concurrency and produces a single zip.
  • Always inspect result.qa.warnings and result.drift_unresolved before shipping output, even when result.qa.passed is true.
  • For a new agent integration: dry-run a batch first (dry_run: true) to confirm the parser output matches what you expect, then re-submit.
  • Cache job_id and batch_id on your side. The server retains jobs for only 1 hour.

What to avoid

  • Don't loop spintax over the same body to "improve" it. Spintax is one-shot per body. Re-spinning produces drift.
  • Don't pass spun copy to /api/spintax. The endpoint expects PLAIN copy. Use /api/lint or /api/qa to check existing spintax.
  • Don't pass spintax with both {a|b} and [spin|a|b] syntaxes mixed. Pick a platform and stick to it for the whole document.
  • Don't submit jobs from multiple integrations against a shared spend cap without coordination. The cap is global per-server.
  • Don't ignore drift_unresolved - when it's non-empty, the model failed to fix concept drift. Treat the listed phrases as bugs.

When to retry

  • openai_timeout: retry once after 30s. Then escalate to a faster model.
  • openai_quota: retry once after 60s. Then reduce batch concurrency.
  • malformed_response: retry once. If it fails again, simplify the input.
  • max_tool_calls: do NOT retry on the same model. Switch to o3-pro or gpt-5.5-pro.
  • auth_failed, low_balance: do NOT retry. Tell Mihajlo.
  • bad_request: do NOT retry. Fix the request payload first.
  • model_not_found: do NOT retry. Check the Models table for the exact name.
  • internal_error: retry once. If it persists, capture the job_id and report.
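
The same rules expressed as code, for agents wrapping the submit and poll helpers from the quickstart (a sketch; the delays and the escalation model are the suggestions above, not values the server mandates):

python
import time

RETRY_AFTER = {"openai_timeout": 30, "openai_quota": 60, "malformed_response": 0, "internal_error": 0}
NO_RETRY = {"auth_failed", "low_balance", "bad_request", "model_not_found"}

def run_with_retry(text, platform="instantly", model="o3"):
    job = poll(submit(text, platform, model))
    if job["status"] == "done":
        return job
    err = job["error"]
    if err in NO_RETRY:
        raise RuntimeError(f"{err}: {job['error_detail']} (do not retry)")
    if err == "max_tool_calls":
        return poll(submit(text, platform, "o3-pro"))  # switch models instead of retrying
    time.sleep(RETRY_AFTER.get(err, 0))
    return poll(submit(text, platform, model))         # one retry, then hand the result back either way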

Default config

  • model: o3
  • platform: instantly
  • reasoning_effort: medium
  • concurrency (batch): 4

When in doubt, default to these. That's the proven baseline.