DeepX OCR is a high-performance, multi-threaded asynchronous OCR inference engine based on PP-OCRv5, optimized for DeepX NPU acceleration.
- System Architecture - Detailed architecture diagrams, data flow, and model configuration.
- High Performance: Asynchronous pipeline optimized for DeepX NPU.
- Multi-threading: Efficient thread pool management for concurrent processing.
- Modular Design: Decoupled Detection, Classification, and Recognition modules.
- Multi-language Support: Built-in `freetype` support for rendering multi-language text.
- Comprehensive Benchmarking: Integrated tools for performance analysis.
```bash
# Clone the repository and initialize submodules
git clone --recursive git@github.com:Chris-godz/DEEPX-OCR.git
cd DEEPX-OCR

# Install freetype dependencies (for multi-language text rendering)
sudo apt-get install libfreetype6-dev libharfbuzz-dev libfmt-dev

# Build the project
./build.sh

# Download/setup models
./setup.sh

# Set DXRT environment variables (example)
source ./set_env.sh 1 2 1 3 2 4

# Run the interactive test menu
./run.sh
```

This project uses Git Submodules to manage dependencies (nlohmann/json, Clipper2, spdlog, OpenCV, opencv_contrib).
Includes opencv_contrib for better text rendering support.
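If a build fails because third-party sources are missing, the submodule state is easy to inspect with plain git (nothing project-specific here); entries prefixed with `-` have not been initialized yet:

```bash
# List submodule state; a leading '-' means the submodule is not initialized
git submodule status 2>/dev/null || echo "not inside a git checkout"
```

Running `git submodule update --init --recursive` from the repository root brings any `-` entries up to date.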
```bash
# Update submodules
git submodule update --init 3rd-party/opencv
git submodule update --init 3rd-party/opencv_contrib

# Build
./build.sh
```

Faster build if you already have OpenCV installed:
```bash
# Set environment variable
export BUILD_OPENCV_FROM_SOURCE=OFF

# Build
./build.sh
```
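Before switching `BUILD_OPENCV_FROM_SOURCE` off, it is worth confirming that a system OpenCV is actually visible to the build. A quick check via pkg-config, assuming the system package registers itself as `opencv4` (as Ubuntu's `libopencv-dev` does):

```bash
# Print the system OpenCV version if pkg-config can see it
pkg-config --modversion opencv4 2>/dev/null || echo "opencv4 not found via pkg-config"
```

Note that the freetype text-rendering module ships with opencv_contrib, so a system OpenCV built without the contrib modules will lack it.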
```
OCR/
├── src/                  # Source Code
│   ├── common/           # Common Utilities (geometry, visualizer, logger)
│   ├── preprocessing/    # Preprocessing (uvdoc, image_ops)
│   ├── detection/        # Text Detection Module
│   ├── classification/   # Orientation Classification
│   ├── recognition/      # Text Recognition Module
│   └── pipeline/         # Main OCR Pipeline
├── 3rd-party/            # Dependencies (Git Submodules)
│   ├── json              # nlohmann/json
│   ├── clipper2          # Polygon Clipping
│   ├── spdlog            # Logging
│   ├── opencv            # Computer Vision
│   ├── opencv_contrib    # Extra Modules (freetype)
│   ├── crow              # HTTP Framework
│   ├── poppler           # PDF Rendering
│   ├── cpp-base64        # Base64 Encoding
│   └── googletest        # Unit Testing Framework
├── engine/model_files    # Model Weights
│   ├── server/           # High-Accuracy Models
│   └── mobile/           # Lightweight Models
├── server/               # HTTP Server
│   ├── benchmark/        # API Benchmark
│   ├── tests/            # Server Tests
│   └── webui/            # Web Interface
├── benchmark/            # Performance Benchmarking
├── test/                 # Unit & Integration Tests
├── docs/                 # Documentation
├── build.sh              # Build Script
├── run.sh                # Interactive Runner
├── setup.sh              # Model Setup Script
└── set_env.sh            # Environment Setup
```
```bash
# Interactive test menu
./run.sh

# Pipeline Test
./build_Release/bin/test_pipeline_async

# Module Tests
./build_Release/test_detector          # Detection
./build_Release/test_recognizer        # Recognition (Server)
./build_Release/test_recognizer_mobile # Recognition (Mobile)

# Run Python benchmark wrapper
python3 benchmark/run_benchmark.py --model server
python3 benchmark/run_benchmark.py --model mobile
```

Test Configuration (from docs/results/local/x86/ reports):
- Model: PP-OCR v5 (DEEPX NPU acceleration)
- Dataset Size: 20 images
- Success Rate: 100% (20/20)
Performance Summary (Server):
| Setup | Avg Inference Time (ms) | Avg FPS | Avg CPS (chars/s) | Avg Character Accuracy |
|---|---|---|---|---|
| Single Card | 135.06 | 7.40 | 243.22 | 96.93% |
| Dual Cards | 67.89 | 14.73 | 483.88 | 96.93% |
| Three Cards | 45.55 | 21.96 | 721.23 | 96.93% |
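As a quick sanity check on these numbers, the FPS column is simply the reciprocal of the average per-image inference time; the tiny deviation for three cards (21.95 vs. 21.96) is rounding in the published latency:

```bash
# FPS ≈ 1000 / avg inference time (ms)
awk 'BEGIN { printf "Single: %.2f  Dual: %.2f  Three: %.2f FPS\n",
             1000/135.06, 1000/67.89, 1000/45.55 }'
# → Single: 7.40  Dual: 14.73  Three: 21.95 FPS
```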
Performance Summary (Mobile):
| Setup | Avg Inference Time (ms) | Avg FPS | Avg CPS (chars/s) | Avg Character Accuracy |
|---|---|---|---|---|
| Single Card | 82.93 | 12.06 | 378.63 | 89.60% |
| Dual Cards | 44.24 | 22.61 | 709.83 | 89.60% |
| Three Cards | 33.00 | 30.30 | 951.57 | 89.60% |
Detailed Reports:
| Setup | Server | Mobile |
|---|---|---|
| Single Card | Report | Report |
| Dual Cards | Report | Report |
| Three Cards | Report | Report |
Test Configuration (from docs/results/local/arm/ reports):
- Model: PP-OCR v5 (DEEPX NPU acceleration)
- Dataset Size: 20 images
- Success Rate: 100% (20/20)
Performance Summary:
| Model | Avg Inference Time (ms) | Avg FPS | Avg CPS (chars/s) | Avg Character Accuracy |
|---|---|---|---|---|
| Server | 133.88 | 7.47 | 245.74 | 96.82% |
| Mobile | 60.00 | 16.67 | 524.96 | 89.37% |
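Dividing CPS by FPS gives the average character count per image; both models land in the same ~31–33 chars/image range, consistent with the same 20-image dataset being used throughout:

```bash
# Average characters per image = CPS / FPS
awk 'BEGIN { printf "server: %.1f  mobile: %.1f chars/image\n",
             245.74/7.47, 524.96/16.67 }'
# → server: 32.9  mobile: 31.5 chars/image
```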
Detailed Reports:
| Model | Report |
|---|---|
| Server | Report |
| Mobile | Report |
Reproduce Benchmark Results
To reproduce the benchmark results above, run the following commands:
```bash
# 1. Build the project
./build.sh

# 2. Download/setup models
./setup.sh

# 3. Set DXRT environment variables (example)
source ./set_env.sh 1 2 1 3 2 4

# 4. Run benchmark (server model, 60 runs per image)
python3 benchmark/run_benchmark.py --model server --runs 60 \
    --images_dir test/twocode_images

# 5. Run benchmark (mobile model, 60 runs per image)
python3 benchmark/run_benchmark.py --model mobile --runs 60 \
    --images_dir test/twocode_images
```

Parameters:
| Parameter | Description | Default |
|---|---|---|
| `--model` | Model type (`server` / `mobile`) | `server` |
| `--runs` | Number of runs per image | 3 |
| `--images_dir` | Test images directory | `images` |
| `--no-acc` | Skip accuracy calculation | - |
| `--no-cpp` | Skip C++ benchmark (use existing results) | - |
Test configuration (same across all reports):
- Mode: throughput
- Concurrency: 20
- Runs per sample: 20
Server Model:
| Setup | QPS | Success Rate | CPS (chars/s) | Accuracy | Avg Latency (ms) | P50 (ms) | P99 (ms) |
|---|---|---|---|---|---|---|---|
| Single Card | 7.64 | 100% | 236.88 | 96.93% | 2594.17 | 2618.61 | 3498.46 |
| Dual Cards | 13.62 | 100% | 401.24 | 89.60% | 1423.65 | 1438.99 | 1786.95 |
| Three Cards | 21.50 | 100% | 605.96 | 96.93% | 900.14 | 907.47 | 1517.51 |
Mobile Model:
| Setup | QPS | Success Rate | CPS (chars/s) | Accuracy | Avg Latency (ms) | P50 (ms) | P99 (ms) |
|---|---|---|---|---|---|---|---|
| Single Card | 13.62 | 100% | 401.24 | 89.60% | 1423.65 | 1438.99 | 1786.95 |
| Dual Cards | 23.97 | 100% | 692.24 | 89.60% | 788.05 | 763.87 | 1586.34 |
| Three Cards | 28.00 | 100% | 801.66 | 89.60% | 635.59 | 564.74 | 1299.82 |
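The latency and QPS columns are mutually consistent under Little's law: at a fixed concurrency of 20, average latency should be roughly 20/QPS seconds. For the single-card server setup:

```bash
# Expected latency at concurrency 20: 20 / QPS seconds
awk 'BEGIN { printf "%.0f ms\n", 20 / 7.64 * 1000 }'
# → 2618 ms (table: 2594 ms avg, 2619 ms P50)
```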
Detailed reports:
| Setup | Server | Mobile |
|---|---|---|
| Single Card | Report | Report |
| Dual Cards | Report | Report |
| Three Cards | Report | Report |
| Model | QPS | Success Rate | CPS (chars/s) | Accuracy | Avg Latency (ms) | P50 (ms) | P99 (ms) |
|---|---|---|---|---|---|---|---|
| Server | 7.45 | 100% | 225.62 | 96.82% | 2635.66 | 2646.28 | 4270.81 |
| Mobile | 16.11 | 100% | 469.57 | 89.37% | 1192.55 | 1200.13 | 1673.76 |
Detailed reports:
| Model | Report |
|---|---|
| Server | Report |
| Mobile | Report |
Reproduce API Server Benchmark Results
- Start the OCR server:
```bash
cd server
./run_server.sh
```
- Install benchmark dependencies:
```bash
cd server/benchmark
pip install -r requirements.txt
```
- Run throughput test:
```bash
./quick_start.sh
# Select option 2 to run the throughput test
```
- Start the OCR server (required for the WebUI backend):
```bash
cd server
./run_server.sh
```
- Start the WebUI:
```bash
cd server/webui
python3 -m venv venv && source venv/bin/activate
pip install -r requirements.txt
python app.py
```
Access: http://localhost:7860
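Once `app.py` is running, a quick reachability check from another shell (this only probes the port given above; it does not exercise the OCR API itself):

```bash
# Probe the WebUI port; prints a fallback message if nothing is listening
curl -sI http://localhost:7860 || echo "WebUI not reachable on port 7860"
```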
