Try the DeepSeek OCR Online Demo
Experience the power of DeepSeek OCR in real time. Upload your image and get highly accurate text extraction results instantly.

What Is DeepSeek OCR
DeepSeek OCR is an advanced optical character recognition system that uses state-of-the-art AI to extract text accurately from images and documents. Built on sophisticated neural networks with multilingual support, it delivers robust text detection and recognition in complex scenes, and pairs an intuitive web interface with solid API integration for efficient, flexible text-processing workflows.
- Multilingual text recognition: language-aware neural models accurately extract text from images in more than 80 languages.
- Complex-scene handling: sophisticated detection algorithms cope with challenging document layouts featuring curved text, multiple orientations, and cluttered backgrounds.
- High-accuracy recognition: optimized optical character recognition and advanced post-processing deliver industry-leading text extraction accuracy.
Core Features of DeepSeek OCR
Advanced AI-powered text recognition built for professionals and developers worldwide.
Multilingual Support
Recognize text in more than 80 languages, including Chinese, English, and Arabic, with language-aware character recognition.
Robust Text Detection
Detect text regions in complex layouts, including curved text, multiple orientations, and challenging background conditions.
High-Speed Processing
Process images quickly for real-time text extraction, backed by an optimized inference pipeline and GPU acceleration.
Unified Framework
Get end-to-end image-to-text extraction from an integrated detection and recognition system.
Structured Layout Recovery
Preserve document structure during extraction, including paragraphs, columns, and tables, with proper formatting.
API Integration
Integrate powerful OCR capabilities into your applications via a RESTful API and SDKs for multiple programming languages.
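As a rough illustration of what a RESTful integration might look like: the page does not document the actual endpoint, authentication scheme, or payload shape, so the URL, `Bearer` header, and JSON fields below are all placeholders, not the real API. The sketch only builds the request object, using Python's standard library, without sending it.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint and key: the real API URL, auth scheme, and payload
# shape are not documented here, so treat these values as placeholders.
API_URL = "https://api.example.com/v1/ocr"
API_KEY = "YOUR_API_KEY"

def build_ocr_request(image_bytes: bytes, language: str = "auto") -> urllib.request.Request:
    """Package an image as a JSON POST request for a hypothetical OCR endpoint."""
    payload = json.dumps({
        # Binary image data must be base64-encoded to travel inside JSON.
        "image": base64.b64encode(image_bytes).decode("ascii"),
        "language": language,  # e.g. "en", "zh", or "auto"
    }).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )

req = build_ocr_request(b"\x89PNG...", language="zh")
print(req.full_url, req.get_method())
```

A real client would send the request with `urllib.request.urlopen(req)` (or an HTTP library of your choice) and parse the JSON response for the extracted text.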
What People Are Saying About DeepSeek-OCR on X
If you enjoy using DeepSeek OCR, share your experience on Twitter with the hashtag
Massively unexpected update from DeepSeek: a powerful, high-compression MoE OCR model.
> In production, DeepSeek-OCR can generate 33 million pages of data per day for LLMs/VLMs using 20 nodes (x8 A100-40G).
They want ALL the tokens. You're welcome to have some too. https://t.co/ks97gjFuhd pic.twitter.com/mXV08ifRle
— Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxesTex) October 20, 2025
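A quick arithmetic check (editorial note, not part of the tweet): the fleet-level figure of 33 million pages/day across 20 nodes of 8× A100-40G each implies a per-GPU rate that matches the ~200K+ pages/day per A100-40G quoted in other posts below.

```python
# Throughput quoted above: 33M pages/day on 20 nodes, 8 GPUs per node.
pages_per_day = 33_000_000
nodes, gpus_per_node = 20, 8

per_gpu = pages_per_day / (nodes * gpus_per_node)
print(f"{per_gpu:,.0f} pages/day per GPU")  # 206,250 pages/day per GPU
```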
DeepSeek-OCR has some weird architectural choices for the LLM decoder: DeepSeek3B-MoE-A570M
-> uses MHA, no MLA (not even GQA?)
-> 2 shared experts (like DeepSeek V2, but V3 only has 1)
-> quite low sparsity, activation ratio is 12.5%. For V3 it’s 3.52%, for V2 it’s 5%
-> not… pic.twitter.com/nOYptOn3OE
— elie (@eliebakouch) October 20, 2025
Letsss gooo! DeepSeek just released a 3B OCR model on Hugging Face 🔥
Optimised to be token efficient AND scale ~200K+ pages/day on A100-40G
Same arch as DeepSeek VL2
Use it with Transformers, vLLM and more 🤗https://t.co/n4kHihS3At
— Vaibhav (VB) Srivastav (@reach_vb) October 20, 2025
NEW DeepSeek OCR model that outperforms dots ocr while prefilling 3x less tokens pic.twitter.com/g9T93PndFb
— Casper Hansen (@casper_hansen_) October 20, 2025
🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support.
🧠 Compresses visual contexts up to 20× while keeping… pic.twitter.com/bx3d7LnfaR
— vLLM (@vllm_project) October 20, 2025
🚨 DeepSeek just did something wild.
They built an OCR system that compresses long text into vision tokens literally turning paragraphs into pixels.
Their model, DeepSeek-OCR, achieves 97% decoding precision at 10× compression and still manages 60% accuracy even at 20×. That… pic.twitter.com/5ChoESanC8
— God of Prompt (@godofprompt) October 20, 2025
is it just me or is this deepseek paper really…weird? like the flagship results are all about compression ratios and they’re gesturing at implications for LLM memory but… it’s an OCR model? are they suggesting that LLMs should ingest OCR embeddings of screenshots of old notes?? pic.twitter.com/ptxkgANIeW
— will brown (@willccbb) October 20, 2025
DeepSeek-OCR: https://t.co/Hww4tubUiS
— Ray Fernando (@RayFernando1337) October 20, 2025
I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.
The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language… https://t.co/AxRXBdoO0F
— Andrej Karpathy (@karpathy) October 20, 2025
Compress everything visually!
DeepSeek has just released DeepSeek-OCR, a state-of-the-art OCR model with 3B parameters.
Core idea: explore long-context compression via 2D optical mapping.
Architecture:
- DeepEncoder → compresses high-res inputs into few vision tokens;
-… pic.twitter.com/qbRTi8ViLY
— 机器之心 JIQIZHIXIN (@jiqizhixin) October 20, 2025
