The fastest video engine
for LLMs
Cherry delivers 3x faster video understanding at 1/3 the cost. One unified API for extraction, search, and analysis across any video content.
-H "Authorization: Bearer $CHERRY_KEY" \
-d '{"video_url": "https://..."}'
How it works
Video understanding in three steps
Cherry abstracts away the complexity of video processing so you can focus on building your application.
Send a Video
Pass any video URL or file to Cherry's API. We handle ingestion, chunking, and frame extraction automatically.
Cherry Processes
Our optimized pipeline extracts visual and audio features, builds temporal indices, and prepares context for LLM inference.
Get Structured Output
Receive rich, structured results — summaries, timestamps, extracted entities, or answers to your natural language queries.
from cherry import Cherry
client = Cherry(api_key="sk-...")
# Analyze a video with a single call
result = client.analyze(
video_url="https://example.com/video.mp4",
prompt="Summarize the key points discussed",
output_format="structured"
)
print(result.summary)
print(result.timestamps)
Benchmarks
MLVU benchmark comparison
Cherry leads on accuracy, latency, and cost-efficiency across standardized video understanding benchmarks.
| Provider | Accuracy | Latency | Cost/Correct |
|---|---|---|---|
| Cherry 1.0 Standard | 78.6% | 31.5s | $0.0098 |
| Gemini Native | 70.0% | 84.2s | $0.0292 |
| TwelveLabs | 58.3% | 219.9s | $0.5665 |
| Qwen-VL | 48.1% | 96.5s | $0.0739 |
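The headline multipliers follow directly from the table; a quick sketch computing each provider's latency and cost-per-correct ratios against the Cherry row:

```python
# Rows from the MLVU comparison table above:
# (provider, accuracy, latency in seconds, cost per correct answer in USD)
rows = [
    ("Cherry 1.0 Standard", 0.786, 31.5, 0.0098),
    ("Gemini Native", 0.700, 84.2, 0.0292),
    ("TwelveLabs", 0.583, 219.9, 0.5665),
    ("Qwen-VL", 48.1 / 100, 96.5, 0.0739),
]

base_latency, base_cost = rows[0][2], rows[0][3]
for name, acc, lat, cpc in rows:
    # Gemini Native works out to roughly 2.7x the latency and
    # 3.0x the cost/correct of Cherry -- the "3x" headline figures.
    print(f"{name}: {lat / base_latency:.1f}x latency, {cpc / base_cost:.1f}x cost/correct")
```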
Why Cherry
Built for performance
31.5s
3x Faster
Cherry processes videos in a fraction of the time compared to alternatives. Our optimized pipeline delivers results when you need them.
$0.0098
3x Cheaper
At less than a penny per correct answer, Cherry is the most cost-effective video understanding API on the market.
1 call
One Unified API
Extraction, search, summarization, and Q&A — all through a single API primitive. No more stitching together multiple services.
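The single-primitive pattern looks like this in practice. This is a hypothetical sketch: the class below is a stand-in stub mirroring the call shape of the client in the snippet above, since only the pattern matters here, and the prompts are illustrative:

```python
from dataclasses import dataclass


@dataclass
class Result:
    text: str


class CherryStub:
    """Stand-in for the Cherry client; the real SDK calls the API."""

    def analyze(self, video_url: str, prompt: str, output_format: str = "text") -> Result:
        # Real client: submits video_url + prompt, returns structured output.
        return Result(text=f"[{output_format}] answer to: {prompt}")


client = CherryStub()
video = "https://example.com/video.mp4"

# Summarization, search, and extraction all share one primitive --
# only the prompt (and output_format) changes.
summary = client.analyze(video, "Summarize the key points discussed")
answer = client.analyze(video, "When does the speaker mention pricing?")
entities = client.analyze(video, "List every company named", output_format="structured")
```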
Our Story
Built for the Frontier
Cherry was born from a simple need: reliable video understanding at scale. Existing solutions were too slow, too expensive, and too fragmented for modern LLM workflows. So we built our own — a high-performance engine designed from the ground up for the next generation of AI.
The Team
Our founding team comes from the front lines of AI research and engineering at Google, DeepMind, and Meta. With backgrounds spanning Nature-published research, Imperial PhDs, and MIT engineering, we are obsessed with solving real-world problems through technical excellence.
Pricing
Simple, transparent pricing
Start free. Scale when you're ready.