Provider Deep Dives¶

Per-provider reference cards, magic quadrant analysis, and visual comparisons.

This page is the detailed reference for each provider's capabilities, pricing, models, and strategic positioning. For the decision-oriented summary, see the AI Provider Comparison Guide. For measured performance numbers, see the Evaluation Reports.

Provider Reference Cards¶

1. Local ML (Default)¶

┌─────────────────────────────────────────────────────────────────┐
│  LOCAL ML PROVIDERS                                             │
│  ═══════════════════                                            │
│                                                                 │
│  💰 Cost:     $0 (just electricity)                            │
│  ⚡ Speed:    Moderate (GPU dependent)                          │
│  🏆 Quality:  Good                                              │
│  🔒 Privacy:  ████████████████████ 100% (complete)              │
│                                                                 │
│  Components:                                                    │
│  ├── 🎙️ Transcription: OpenAI Whisper (local)                  │
│  ├── 👤 Speaker Det:   spaCy NER models                         │
│  └── 📝 Summarization: Hugging Face BART/LED                    │
│                                                                 │
│  Best For: Privacy, offline use, zero ongoing cost              │
└─────────────────────────────────────────────────────────────────┘

For detailed ML model comparisons, see ML Model Comparison Guide.

Hardware Requirements:

Component	Minimum	Recommended
RAM	8 GB	16 GB+
GPU VRAM	None (CPU)	8 GB+
Storage	5 GB	20 GB

2. OpenAI¶

┌─────────────────────────────────────────────────────────────────┐
│  OPENAI                                        Industry Leader  │
│  ══════                                                         │
│                                                                 │
│  💰 Cost:     $$$ (Premium pricing)                             │
│  ⚡ Speed:    Fast (100 tok/s)                                  │
│  🏆 Quality:  ████████████████████ Best                         │
│  🔒 Privacy:  ████████████░░░░░░░░ Standard (US)                │
│  📊 Tracking: Built-in token/audio usage metrics for cost eval  │
│  ⚠️ Limit:    25 MB audio file size limit for Whisper API       │
│                                                                 │
│  Models:                                                        │
│  ├── GPT-4o       $5.00/$15.00  │ Best quality                 │
│  ├── GPT-4o-mini  $0.15/$0.60   │ ⭐ Production recommended     │
│  └── Whisper      $0.006/min    │ Transcription                │
│                                                                 │
│  Best For: Quality-critical production, reliable workflows      │
└─────────────────────────────────────────────────────────────────┘

Measured performance: See Smoke v1 report — Cloud LLMs (58.8% ROUGE-L, 92.7% embed — silver reference bias applies).

3. Anthropic (Claude)¶

┌─────────────────────────────────────────────────────────────────┐
│  ANTHROPIC CLAUDE                              Safety Focused   │
│  ═══════════════                                                │
│                                                                 │
│  💰 Cost:     $$ (Competitive)                                  │
│  ⚡ Speed:    Fast (100 tok/s)                                  │
│  🏆 Quality:  ███████████████████░ Excellent                    │
│  🔒 Privacy:  ████████████░░░░░░░░ Standard (US)                │
│  ⚠️  Summarization only (no transcription or speaker detection) │
│                                                                 │
│  Models:                                                        │
│  ├── Claude Haiku 4.5   $1/$5   │ ⭐ Eval/acceptance alias      │
│  ├── claude-haiku-4-5            │ Anthropic alias (current)    │
│  └── claude-3-5-sonnet-20241022 │ Deprecated (404)             │
│                                                                 │
│  Best For: Quality text, nuanced content, safety alignment      │
└─────────────────────────────────────────────────────────────────┘

Measured performance: See Smoke v1 report — Cloud LLMs (29.4% ROUGE-L, 81.8% embed, 4.8s/ep).

4. Mistral¶

┌─────────────────────────────────────────────────────────────────┐
│  MISTRAL                                       European Leader  │
│  ═══════                                                        │
│                                                                 │
│  💰 Cost:     $-$$ (Competitive)                                │
│  ⚡ Speed:    Fast                                              │
│  🏆 Quality:  ██████████████████░░ Very Good                    │
│  🔒 Privacy:  ████████████████░░░░ High (EU servers)            │
│  ⚠️  Summarization only (no transcription or speaker detection) │
│                                                                 │
│  Models:                                                        │
│  ├── Large 3      $2/$6      │ ⭐ Production                   │
│  └── Small 3.1    $0.10/$0.30│ ⭐ Dev/test (cheapest!)         │
│                                                                 │
│  Best For: EU compliance, summarization with data residency     │
└─────────────────────────────────────────────────────────────────┘

Measured performance: See Smoke v1 report — Cloud LLMs (32.5% ROUGE-L, 84.8% embed, 2.8s/ep — fastest cloud provider).

5. DeepSeek¶

┌─────────────────────────────────────────────────────────────────┐
│  DEEPSEEK                                      Ultra Low Cost   │
│  ════════                                                       │
│                                                                 │
│  💰 Cost:     $ (95% cheaper than OpenAI!)                      │
│  ⚡ Speed:    Fast (150 tok/s)                                  │
│  🏆 Quality:  ██████████████░░░░░░ Good                         │
│  🔒 Privacy:  ████████░░░░░░░░░░░░ China servers                │
│  ⚠️  Summarization only (no transcription or speaker detection) │
│                                                                 │
│  Models:                                                        │
│  ├── DeepSeek Chat      $0.28/$0.42 (cache miss)               │
│  ├── DeepSeek Chat      $0.028/$0.42 (cache hit!) 💰           │
│  └── DeepSeek Reasoner  Complex reasoning tasks                │
│                                                                 │
│  🔥 $0.016/100 episodes vs $0.55 OpenAI = 97% SAVINGS          │
│                                                                 │
│  Best For: Budget optimization, bulk processing, startups       │
└─────────────────────────────────────────────────────────────────┘

Measured performance: See Smoke v1 report — Cloud LLMs (26.3% ROUGE-L, 85.0% embed, 14.2s/ep).

6. Google Gemini¶

┌─────────────────────────────────────────────────────────────────┐
│  GOOGLE GEMINI                                 Massive Context  │
│  ═════════════                                                  │
│                                                                 │
│  💰 Cost:     $ (Generous free tier!)                           │
│  ⚡ Speed:    Fast                                              │
│  🏆 Quality:  ██████████████████░░ Very Good                    │
│  🔒 Privacy:  ████████████░░░░░░░░ Standard (Google)            │
│  ✅ Transcription + summarization (no speaker detection)        │
│                                                                 │
│  Models:                                                        │
│  ├── Gemini 2.0 Flash  $0.10/$0.40  │ ⭐ Dev/test              │
│  ├── Gemini 1.5 Pro    $1.25/$5.00  │ ⭐ Production            │
│  └── Gemini 1.5 Flash  $0.075/$0.30 │ Budget                   │
│                                                                 │
│  🔥 2 MILLION TOKEN CONTEXT - Process entire seasons!          │
│                                                                 │
│  FREE TIER: 15 RPM, 1M TPM, 1500 RPD                           │
│                                                                 │
│  Best For: Long content, free development, multimodal           │
└─────────────────────────────────────────────────────────────────┘

Measured performance: See Smoke v1 report — Cloud LLMs (33.3% ROUGE-L, 87.3% embed, 2.7s/ep — best non-OpenAI cloud).

7. Grok¶

┌─────────────────────────────────────────────────────────────────┐
│  GROK                                          Real-Time Info   │
│  ════                                                           │
│                                                                 │
│  💰 Cost:     $ (Affordable)                                    │
│  🌐 Feature:  ████████████████████ Real-time X/Twitter access  │
│  🏆 Quality:  ██████████████░░░░░░ Good (xAI models)           │
│  🔒 Privacy:  ████████████░░░░░░░░ Standard (US)                │
│  ⚠️  Summarization only (no transcription or speaker detection) │
│                                                                 │
│  Models (xAI's Grok):                                           │
│  ├── grok-3-mini                │ ⭐ Eval/acceptance (use this) │
│  └── grok-2                     │ 400 model not found           │
│                                                                 │
│  🌐 Access real-time information via X/Twitter integration!   │
│                                                                 │
│  FREE TIER: grok-beta available                                │
│                                                                 │
│  Best For: Real-time information, current events, X/Twitter     │
└─────────────────────────────────────────────────────────────────┘

Measured performance: See Smoke v1 report — Cloud LLMs (29.5% ROUGE-L, 85.4% embed, 13.2s/ep).

8. Ollama (Local LLMs)¶

┌─────────────────────────────────────────────────────────────────┐
│  OLLAMA                                        Self-Hosted      │
│  ══════                                                         │
│                                                                 │
│  💰 Cost:     $0 per request (hardware only)                    │
│  ⚡ Speed:    Slow-Medium (hardware dependent, ~30 tok/s)        │
│  🏆 Quality:  ██████████████░░░░░░ Good (model dependent)       │
│  🔒 Privacy:  ████████████████████ 100% Complete                │
│  ✅ Full-stack: transcription, speaker detection, summarization │
│                                                                 │
│  Recommended Models (by Use Case):                             │
│  ├── qwen2.5:7b        8 GB+ RAM  │ ⭐ Best JSON, GIL          │
│  ├── llama3.1:8b       8 GB+ RAM  │ ⭐ General purpose          │
│  ├── mistral:7b        8 GB+ RAM  │ ⭐ Fastest inference        │
│  ├── gemma2:9b         12 GB+ RAM │ ⭐ Balanced quality/speed   │
│  └── phi3:mini         4 GB+ RAM  │ ⭐ Dev/test, lightweight   │
│                                                                 │
│  Setup Requirements:                                           │
│  ├── Install Ollama:   brew install ollama (macOS)            │
│  ├── Start server:     ollama serve (keep running)            │
│  └── Pull models:      ollama pull qwen2.5:7b (recommended)    │
│                                                                 │
│  Hardware Recommendations:                                      │
│  ├── 4 GB+ RAM:        phi3:mini (dev/test only)              │
│  ├── 8 GB+ RAM:        qwen2.5:7b, llama3.1:8b, mistral:7b    │
│  └── 12 GB+ RAM:       gemma2:9b (balanced quality/speed)     │
│                                                                 │
│  💡 Zero API costs, unlimited usage, complete data privacy      │
│                                                                 │
│  Best For: Privacy-critical, offline/air-gapped, unlimited    │
│            processing, enterprises, cost optimization           │
│                                                                 │
│  📖 See [Ollama Provider Guide](OLLAMA_PROVIDER_GUIDE.md)      │
└─────────────────────────────────────────────────────────────────┘

Measured performance: See Smoke v1 report — Local Ollama (top: Mistral Small 3.2 / Qwen 2.5:32b at 38.4% ROUGE-L).

Magic Quadrant¶

A Gartner-style analysis plotting all providers across two strategic dimensions:

X-Axis: Completeness of Vision — full-stack capabilities, context window, free tiers, innovation
Y-Axis: Ability to Execute — quality, speed, reliability, cost-effectiveness

                           ABILITY TO EXECUTE
                                  ▲
                                  │
High │    CHALLENGERS     │      LEADERS
     │                    │
     │    ┌───────────┐   │   ┌─────────────┐
     │    │ Anthropic │   │   │   OpenAI    │ ← Quality benchmark
     │    │  Claude   │   │   │   GPT-4o    │
     │    └───────────┘   │   └─────────────┘
             │         ▲          │          ▲
             │         │          │   ┌──────┴──────┐
             │    High quality    │   │   Gemini    │ ← 2M context + free tier
             │    but text-only   │   │             │
             │                    │   └─────────────┘
             │   ┌─────────┐      │          ▲
             │   │  Grok   │──────┼──────────┤
             │   │  🌐Real │      │   ┌──────┴──────┐
             │   └─────────┘      │   │   Mistral   │ ← EU summarization
             │       ▲            │   │     🇪🇺      │
             │   Speed champion   │   └─────────────┘
             │                    │
      ───────┼────────────────────┼──────────────────────────────►
             │                    │              COMPLETENESS OF VISION
             │                    │
             │    NICHE PLAYERS   │     VISIONARIES
             │                    │
             │   ┌───────────┐    │   ┌─────────────┐
             │   │  Ollama   │    │   │  Local ML   │ ← Zero cost + full stack
             │   │    🏠     │    │   │   (Default) │
             │   └───────────┘    │   └─────────────┘
             │       ▲            │          ▲
             │   Offline/private  │   Hardware required
             │   but needs HW     │   but complete control
             │                    │
             │   ┌───────────┐    │
             │   │ DeepSeek  │────┼───► Extreme value
             │   │   💰97%   │    │      but China-based
             │   └───────────┘    │
        Low  │                    │
             │                    │
             └────────────────────┴────────────────────────────────
                    Limited                              Comprehensive

Quadrant Analysis¶

Quadrant	Providers	Characteristics	Best For
Leaders	OpenAI, Gemini, Mistral	High quality, proven reliability, broad capabilities	Production workloads, quality-critical apps
Challengers	Anthropic, Grok	Excellent execution but limited scope	Text-only processing, real-time information
Visionaries	Local ML, DeepSeek	Innovative value proposition, some trade-offs	Cost optimization, privacy, experimentation
Niche Players	Ollama	Specialized use case, strong in specific domain	Offline, enterprise security, self-hosted

Provider Scores (0-10)¶

Provider	Vision Score	Execution Score	Quadrant	Key Strength
OpenAI	9	10	Leader	Quality benchmark
Gemini	10	8	Leader	2M context + free tier
Mistral	8	7	Leader	EU compliance + summarization
Anthropic	5	9	Challenger	Safety + quality
Grok	5	8	Challenger	Real-time info
Local ML	8	5	Visionary	Zero cost + privacy
DeepSeek	4	7	Visionary	97% cost savings
Ollama	4	6	Niche	Offline + self-hosted

Movement Predictions (2026)¶

                                    ▲ Ability to Execute
                                    │
                                    │     ┌──────────────┐
                                    │     │   OpenAI     │ ← Maintains lead
                                    │     │   ●──────●   │
                                    │     └──────────────┘
                                    │
                                    │     ┌──────────────┐
                                    │     │   Gemini     │ ← Rising challenger
                                    │     │      ●═══▶   │
                                    │     └──────────────┘
                                    │
                                    │     ┌──────────────┐
                                    │     │   Grok       │ ← Adding capabilities?
                                    │     │   ●═══▶      │
                                    │     └──────────────┘
                                    │
                                    │     ┌──────────────┐
                                    │     │  DeepSeek    │ ← Quality improving
                                    │     │      ●       │
                                    │     │      ║       │
                                    │     │      ▼       │
                                    │     └──────────────┘
                    ────────────────┼──────────────────────────────►
                                    │                    Completeness of Vision

    Legend:  ● Current position    ═══▶ Predicted movement

Strategic Recommendations by Quadrant¶

LEADERS (OpenAI, Gemini, Mistral)

"Safe bets for production. Choose based on specific needs."

Provider	Choose When...
OpenAI	Quality is paramount, budget available
Gemini	Need huge context (2M), want free tier
Mistral	EU data residency required (summarization only)

CHALLENGERS (Anthropic, Grok)

"Excellent at what they do, but not full-stack."

Provider	Choose When...
Anthropic	Text quality matters, safety-first
Grok	Real-time information access needed

VISIONARIES (Local ML, DeepSeek)

"Trade-offs for significant advantages."

Provider	Choose When...
Local ML	Zero cost + privacy + offline
DeepSeek	Extreme budget constraints (97% savings)

NICHE PLAYERS (Ollama)

"Perfect for specific use cases."

Provider	Choose When...
Ollama	Enterprise security, air-gapped, unlimited processing

Visual Comparisons¶

Cost Comparison (Text Processing per 100 Episodes)¶

Cost Scale (logarithmic feel - lower is better)
═══════════════════════════════════════════════════════════════════════════

Local ML     $0.00 │
Ollama       $0.00 │
DeepSeek     $0.02 │▏
Grok         $0.03 │▎
Mistral      $0.11 │█
Anthropic    $0.40 │███
OpenAI       $0.55 │████
                   └────────────────────────────────────────────────────────
                   $0                                                   $0.60

💡 DeepSeek is 97% cheaper than OpenAI for text processing!

Speed Comparison (Relative Performance)¶

Inference Speed (tokens/second)
═══════════════════════════════════════════════════════════════════════════

Grok         100  │██████████                                           1x
DeepSeek     150  │███████████████                                      1.5x
OpenAI       100  │██████████                                           1x
Anthropic    100  │██████████                                           1x
Gemini       100  │██████████                                           1x
Mistral      100  │██████████                                           1x
Local GPU     50  │█████                                               0.5x
Ollama        30  │███                                                 0.3x
              0   └────────────────────────────────────────────────────────
                  0                     100                            150

🌐 Grok provides real-time information access via X/Twitter integration!

Quality Ranking (Subjective)¶

Quality Score (1-10)
═══════════════════════════════════════════════════════════════════════════

OpenAI GPT-4o      │██████████████████████████████████████████████████│ 10
Claude Sonnet     │█████████████████████████████████████████████     │  9
Gemini Pro        │████████████████████████████████████████          │  8
Mistral Large     │███████████████████████████████████               │  7
Ollama 70B        │███████████████████████████████████               │  7
DeepSeek          │██████████████████████████████                    │  6
Grok              │██████████████████████████████                    │  6
Local BART        │█████████████████████████                         │  5
                  └────────────────────────────────────────────────────────
                  0                    5                              10

🏆 OpenAI remains the quality leader, but alternatives close the gap!

Privacy Level¶

Privacy Scale (Higher = More Private)
═══════════════════════════════════════════════════════════════════════════

Local ML    🔒🔒🔒🔒🔒 │████████████████████████████████████████████████│ Complete
Ollama      🔒🔒🔒🔒🔒 │████████████████████████████████████████████████│ Complete
Mistral     🔒🔒🔒🔒   │███████████████████████████████████████         │ EU Servers
OpenAI      🔒🔒🔒     │██████████████████████████████                  │ US Servers
Anthropic   🔒🔒🔒     │██████████████████████████████                  │ US Servers
Google      🔒🔒🔒     │██████████████████████████████                  │ Google Cloud
Grok        🔒🔒🔒     │██████████████████████████████                  │ US Servers
DeepSeek    🔒🔒       │████████████████████                            │ China Servers
                      └────────────────────────────────────────────────────

🔒 For maximum privacy, use Local ML or Ollama - data never leaves your device!

Capability Matrix¶

                    ┌─────────────────────────────────────────────────┐
                    │           CAPABILITY SUPPORT MATRIX              │
                    ├─────────────────────────────────────────────────┤
                    │  Provider      │ Status   │ 🎙️ Trans │ 👤 Speaker │ 📝 Summary │
                    ├────────────────┼──────────┼──────────┼────────────┼────────────┤
                    │  Local ML      │ ✅ Impl  │    ✅    │     ✅     │     ✅     │
                    │  Hybrid ML     │ ✅ Impl  │    ❌    │     ❌     │     ✅     │
                    │  OpenAI        │ ✅ Impl  │    ✅    │     ❌     │     ✅     │
                    │  Gemini        │ ✅ Impl  │    ✅    │     ❌     │     ✅     │
                    │  Ollama        │ ✅ Impl  │    ✅    │     ✅     │     ✅     │
                    ├────────────────┼──────────┼──────────┼────────────┼────────────┤
                    │  Anthropic     │ ✅ Impl  │    ❌    │     ❌     │     ✅     │
                    │  Mistral       │ ✅ Impl  │    ❌    │     ❌     │     ✅     │
                    │  DeepSeek      │ ✅ Impl  │    ❌    │     ❌     │     ✅     │
                    │  Grok          │ ✅ Impl  │    ❌    │     ❌     │     ✅     │
                    └─────────────────────────────────────────────────────────────────┘

    ✅ Implemented (9): Local ML, Hybrid ML, OpenAI, Gemini, Ollama, Anthropic, Mistral, DeepSeek, Grok

AI Provider Comparison Guide — decision matrix and recommended configurations
Evaluation Reports — methodology, metrics, and measured performance
Ollama Provider Guide — complete Ollama setup and troubleshooting
Provider Configuration Quick Reference
Provider Implementation Guide
ML Provider Reference