New 🚀 Built on the DeepSeek OCR 3B model, now open source!

DeepSeek-OCR: Contexts Optical Compression

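DeepSeek OCR is a next-generation optical character recognition (OCR) solution from DeepSeek, now available through its open-source model hub and API. It handles complex visual text inputs, including scanned documents, photos, forms, and mixed-layout pages, and combines text extraction, layout understanding, and visual context understanding in a single model. DeepSeek OCR can convert high-resolution images at industrial scale (for example, hundreds of thousands of pages per day on a single A100-class GPU). Try DeepSeek OCR for free today!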
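
The per-GPU throughput claim above can be sanity-checked against the cluster figure quoted in one of the X posts further down this page (33 million pages per day on 20 nodes of 8 A100-40G GPUs each). The short Python sketch below does that arithmetic. It assumes the load is spread evenly across GPUs and that the cluster runs around the clock; the page states neither, and the figures are the post's, not independent measurements.

```python
# Figures as quoted in the X posts on this page (not independently verified).
PAGES_PER_DAY_CLUSTER = 33_000_000  # "33 million pages of data per day"
NODES = 20                          # "20 nodes"
GPUS_PER_NODE = 8                   # "x8 A100-40G"
SECONDS_PER_DAY = 24 * 60 * 60


def pages_per_gpu_per_day(total_pages: int, nodes: int, gpus_per_node: int) -> float:
    """Divide cluster throughput evenly across every GPU (an assumption)."""
    return total_pages / (nodes * gpus_per_node)


per_gpu_day = pages_per_gpu_per_day(PAGES_PER_DAY_CLUSTER, NODES, GPUS_PER_NODE)
per_gpu_second = per_gpu_day / SECONDS_PER_DAY

print(f"{per_gpu_day:,.0f} pages per GPU per day")        # 206,250 pages per GPU per day
print(f"{per_gpu_second:.2f} pages per GPU per second")   # 2.39 pages per GPU per second
```

Under those assumptions the cluster figure works out to roughly 206,000 pages per GPU per day, which agrees with the "~200K+ pages/day on A100-40G" figure in another of the quoted posts.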

Try the DeepSeek OCR Online Demo

Experience DeepSeek OCR in real time. Upload an image to get high-accuracy text extraction results right away.

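What Is DeepSeek OCR

DeepSeek OCR is an advanced optical character recognition system that uses AI to extract text accurately from images and documents. It relies on neural networks with multilingual support to detect and recognize text in complex scenes, and it offers an intuitive web interface and API integration for efficient, flexible text processing workflows.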

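Multilingual Text Recognition
Accurately extracts text in more than 80 languages from images, using neural networks and language-aware processing.
Complex Scene Handling
Handles challenging document layouts with curved text, multiple orientations, and complex backgrounds, using dedicated detection algorithms.
High-Accuracy Recognition
Reaches industry-leading text extraction accuracy through optimized optical character recognition and post-processing.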

Core Features of DeepSeek OCR

AI-driven text recognition designed for professionals and developers worldwide.

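Multilingual Support

Recognizes text in more than 80 languages, including Chinese, English, and Arabic, through language-aware character recognition.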

Robust Text Detection

Detects text regions in complex layouts, including curved text, multiple orientations, and challenging backgrounds.

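High-Speed Processing

Processes images quickly through an optimized inference pipeline and GPU acceleration, returning text extraction results in real time.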

Unified Framework

Provides end-to-end extraction from image to text with an integrated text detection and recognition system.

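Structured Layout Restoration

Preserves document structure during text extraction, including paragraphs, columns, and tables, and keeps appropriate formatting.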

API Integration

Integrate OCR into your applications through a RESTful API and SDKs for multiple programming languages.
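
This page does not document the API itself, so the following is only a minimal sketch of what preparing a RESTful OCR request might look like in Python. The endpoint URL, the JSON field names (`image_base64`, `language`), and the bearer-token header are all hypothetical placeholders, not the service's real interface; check the official API reference before relying on any of them. The sketch only builds the request and does not send it.

```python
import base64
import json
import urllib.request

# Hypothetical endpoint: the page does not give the real API URL.
OCR_ENDPOINT = "https://example.com/v1/ocr"


def build_ocr_request(image_bytes: bytes, api_key: str, language: str = "en") -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying a base64-encoded image.

    The field names and the auth scheme are assumptions made for illustration.
    """
    payload = {
        "image_base64": base64.b64encode(image_bytes).decode("ascii"),
        "language": language,
    }
    return urllib.request.Request(
        OCR_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


request = build_ocr_request(b"fake image bytes", api_key="YOUR_API_KEY")
print(request.get_method(), request.full_url)  # POST https://example.com/v1/ocr
```

Once the real endpoint and schema are known, the request could be sent with `urllib.request.urlopen(request)` and the response body parsed with `json.loads`.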

What People Are Saying About DeepSeek-OCR on X

If you enjoy using DeepSeek OCR, share your experience on Twitter using the hashtag

Massively unexpected update from DeepSeek: a powerful, high-compression MoE OCR model.
> In production, DeepSeek-OCR can generate 33 million pages of data per day for LLMs/VLMs using 20 nodes (x8 A100-40G).
They want ALL the tokens. You're welcome to have some too. https://t.co/ks97gjFuhd pic.twitter.com/mXV08ifRle
— Teortaxes▶️ (DeepSeek 推特🐋铁粉 2023 – ∞) (@teortaxesTex) October 20, 2025

DeepSeek-OCR has some weird architectural choices for the LLM decoder: DeepSeek3B-MoE-A570M
-> uses MHA, no MLA (not even GQA?)
-> 2 shared experts (like DeepSeek V2, but V3 only has 1)
-> quite low sparsity, activation ratio is 12.5%. For V3 it’s 3.52%, for V2 it’s 5%
-> not… pic.twitter.com/nOYptOn3OE
— elie (@eliebakouch) October 20, 2025

Letsss gooo! DeepSeek just released a 3B OCR model on Hugging Face 🔥

Optimised to be token efficient AND scale ~200K+ pages/day on A100-40G

Same arch as DeepSeek VL2

Use it with Transformers, vLLM and more 🤗https://t.co/n4kHihS3At
— Vaibhav (VB) Srivastav (@reach_vb) October 20, 2025

NEW DeepSeek OCR model that outperforms dots ocr while prefilling 3x less tokens pic.twitter.com/g9T93PndFb
— Casper Hansen (@casper_hansen_) October 20, 2025

🚀 DeepSeek-OCR — the new frontier of OCR from @deepseek_ai , exploring optical context compression for LLMs, is running blazingly fast on vLLM ⚡ (~2500 tokens/s on A100-40G) — powered by vllm==0.8.5 for day-0 model support.

🧠 Compresses visual contexts up to 20× while keeping… pic.twitter.com/bx3d7LnfaR
— vLLM (@vllm_project) October 20, 2025

🚨 DeepSeek just did something wild.

They built an OCR system that compresses long text into vision tokens literally turning paragraphs into pixels.

Their model, DeepSeek-OCR, achieves 97% decoding precision at 10× compression and still manages 60% accuracy even at 20×. That… pic.twitter.com/5ChoESanC8
— God of Prompt (@godofprompt) October 20, 2025

is it just me or is this deepseek paper really…weird? like the flagship results are all about compression ratios and they’re gesturing at implications for LLM memory but… it’s an OCR model? are they suggesting that LLMs should ingest OCR embeddings of screenshots of old notes?? pic.twitter.com/ptxkgANIeW
— will brown (@willccbb) October 20, 2025

DeepSeek-OCR: https://t.co/Hww4tubUiS
— Ray Fernando (@RayFernando1337) October 20, 2025

I quite like the new DeepSeek-OCR paper. It's a good OCR model (maybe a bit worse than dots), and yes data collection etc., but anyway it doesn't matter.

The more interesting part for me (esp as a computer vision at heart who is temporarily masquerading as a natural language… https://t.co/AxRXBdoO0F
— Andrej Karpathy (@karpathy) October 20, 2025

Compress everything visually!

DeepSeek has just released DeepSeek-OCR, a state-of-the-art OCR model with 3B parameters.

Core idea: explore long-context compression via 2D optical mapping.

Architecture:

- DeepEncoder → compresses high-res inputs into few vision tokens;
-… pic.twitter.com/qbRTi8ViLY
— 机器之心 JIQIZHIXIN (@jiqizhixin) October 20, 2025

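Frequently Asked Questions

What is DeepSeek OCR, and how does it work?

What types of documents can DeepSeek OCR process?

Do I need to install any software to use DeepSeek OCR?

What are the main features of the DeepSeek OCR recognition system?

Can I integrate DeepSeek OCR with other software and applications?

How accurate is DeepSeek OCR compared with other OCR systems?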