Universal Extraction
AgentInbox's universal extraction system automatically extracts structured data from any email using a cascading approach: built-in extractors first, then heuristic analysis, then an optional LLM fallback.
Extraction Pipeline
How It Works
When you request an extraction, the system tries up to three strategies in order and moves to the next only when the previous one finds no match:
Built-in Extractors
Fast, deterministic regex and pattern-based extractors for common types like OTP, magic links, and tracking numbers.
Heuristic Analysis
If the built-in extractor doesn't find a match, heuristic analysis examines the email structure and context.
LLM Fallback
If the first two stages fail, an LLM (GPT-4o-mini) reads the email and extracts the requested data. This is optional and can be disabled.
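The three stages above can be pictured as a simple first-match cascade. The sketch below is only an illustration of that control flow, not AgentInbox's implementation: the `Extraction` class, the stage functions, the patterns, and the confidence values are all hypothetical, and the LLM stage is represented by a caller-supplied function.

```python
import re
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Extraction:
    value: str
    confidence: float
    source: str  # which stage produced the value (hypothetical field)


# A stage takes the email text and returns an Extraction, or None on no match.
Stage = Callable[[str], Optional[Extraction]]


def builtin_otp(text: str) -> Optional[Extraction]:
    # Stage 1: deterministic pattern match (here, a bare 6-digit code).
    match = re.search(r"(?<!\d)\d{6}(?!\d)", text)
    return Extraction(match.group(), 0.99, "builtin") if match else None


def heuristic_otp(text: str) -> Optional[Extraction]:
    # Stage 2: use surrounding context, e.g. the token that follows "code is".
    match = re.search(r"code is[:\s]+([A-Za-z0-9]{4,10})", text, re.IGNORECASE)
    return Extraction(match.group(1), 0.80, "heuristic") if match else None


def build_stages(use_llm: bool, llm_stage: Optional[Stage] = None) -> list[Stage]:
    stages: list[Stage] = [builtin_otp, heuristic_otp]
    # Stage 3 is optional: it is appended only when enabled and supplied.
    if use_llm and llm_stage is not None:
        stages.append(llm_stage)
    return stages


def run_cascade(text: str, stages: list[Stage]) -> Optional[Extraction]:
    # Try each stage in order and stop at the first one that returns a result.
    for stage in stages:
        result = stage(text)
        if result is not None:
            return result
    return None
```

Calling `run_cascade(text, build_stages(use_llm=False))` in this sketch never reaches the third stage, which mirrors the effect of turning the fallback off.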
Built-in Extractors
Built-in extractors are optimized for specific email types and run instantly with no additional cost.
OTP
Matches 4- to 8-digit codes in verification contexts. Handles spaces, dashes, and prefixes.
Magic Link
Extracts the primary verification URL from HTML and text bodies.
Verification Code
Matches alphanumeric codes, often longer than OTPs, with mixed-case support.
Invoice Number
Matches common invoice formats (INV-, #12345, etc.) from billing emails.
Tracking Number
Recognizes carrier formats (UPS, FedEx, USPS, DHL) and generic tracking codes.
API Token
Extracts bearer tokens, API keys, and secrets from developer emails.
Coupon Code
Identifies promo codes and discount identifiers from marketing emails.
Password Reset Link
Extracts password reset URLs from account recovery emails.
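To make "regex and pattern-based" concrete, here is a minimal sketch of what three such extractors might look like. The patterns are illustrative guesses, not the ones AgentInbox ships. The type names `invoice_number` and `tracking_number` are assumptions (only `otp` appears in this page's examples), and the tracking pattern covers only the common UPS `1Z` format.

```python
import re
from typing import Callable, Optional

# OTP: 4 to 8 digits after a verification keyword, allowing spaces or dashes
# between digits and a short non-digit prefix (such as "is " or "G-").
OTP_RE = re.compile(
    r"\b(?:code|otp|passcode|pin)\b\D{0,20}?(\d(?:[ -]?\d){3,7})(?!\d)",
    re.IGNORECASE,
)
# Invoice number: "INV-" followed by digits and dashes, or "#" plus 4+ digits.
INVOICE_RE = re.compile(r"\bINV-\d[\d-]*\b|#\d{4,}\b")
# UPS tracking number: "1Z" followed by 16 uppercase alphanumeric characters.
UPS_RE = re.compile(r"\b1Z[0-9A-Z]{16}\b")


def extract_otp(text: str) -> Optional[str]:
    match = OTP_RE.search(text)
    if match is None:
        return None
    # Normalize "123 456" or "123-456" to "123456".
    return re.sub(r"[ -]", "", match.group(1))


def extract_invoice_number(text: str) -> Optional[str]:
    match = INVOICE_RE.search(text)
    return match.group() if match else None


def extract_tracking_number(text: str) -> Optional[str]:
    match = UPS_RE.search(text)
    return match.group() if match else None


# Hypothetical registry mapping an extraction type to its extractor.
EXTRACTORS: dict[str, Callable[[str], Optional[str]]] = {
    "otp": extract_otp,
    "invoice_number": extract_invoice_number,
    "tracking_number": extract_tracking_number,
}


def extract(kind: str, text: str) -> Optional[str]:
    return EXTRACTORS[kind](text)
```

Anchoring the OTP pattern to a keyword is one way to approximate "verification contexts": an unrelated number such as a year earlier in the message is not picked up. This sketch does not handle a code that appears before the keyword.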
Confidence Scoring
Every extraction returns a confidence score. Built-in extractors typically return scores above 0.95.
- Built-in: 0.95 - 1.00 (very high confidence)
- Heuristic: 0.70 - 0.95 (good confidence)
- LLM: 0.80 - 0.95 (high confidence, but slower)
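One way to act on these scores is to triage each result by threshold. The sketch below assumes the score is a float between 0 and 1. The cutoffs are illustrative defaults that borrow the 0.95 and 0.70 boundaries from the ranges above, and `triage` is a hypothetical helper rather than part of the SDK.

```python
def triage(
    confidence: float,
    accept_threshold: float = 0.95,
    review_threshold: float = 0.70,
) -> str:
    """Map a confidence score to an action: "accept", "review", or "reject"."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= accept_threshold:
        # Scores in the range documented for built-in extractors.
        return "accept"
    if confidence >= review_threshold:
        # Heuristic or LLM territory: usable, but worth a second check.
        return "review"
    return "reject"
```

Because the heuristic and LLM ranges overlap, the score alone does not identify which stage produced a value. A threshold policy like this one treats the score purely as a quality signal.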
Disabling LLM Fallback
If you want only fast, deterministic extraction, disable the LLM fallback.
TypeScript

const extraction = await client.extractions.create({
  messageId: "msg_456",
  type: "otp",
  useLlm: false, // Disable LLM fallback
});

Python

extraction = client.extractions.create(
    message_id="msg_456",
    type="otp",
    use_llm=False,  # Disable LLM fallback
)

Performance optimization