Description
Bug
Docling's picture description options (the picture_description_api and picture_description_local fields) reject valid JSON configuration when it is sent in a multipart/form-data request, returning a Pydantic validation error:
{
  "detail": [
    {
      "type": "model_attributes_type",
      "loc": ["body", "picture_description_api"],
      "msg": "Input should be a valid dictionary or object to extract fields from",
      "input": "params"
    }
  ]
}
Root Cause: HTTP multipart/form-data can only transmit strings and binary data, not JavaScript objects. Web clients must serialize configuration objects to JSON strings using JSON.stringify(), but Docling's Pydantic models expect dict objects, not JSON strings. This creates an impossible requirement for web applications that need to upload files AND send complex configuration.
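To make the root cause concrete, here is a minimal sketch that uses only the Python standard library to build a multipart/form-data body and parse it back. It is not Docling code: the boundary value and the build_multipart helper are made up for illustration, and only the field name and configuration come from this report. It shows that the configuration reaches the server as a string, which has to be decoded before it can be validated as a dict.

```python
import json
from email import message_from_bytes

# Configuration taken from the reproduction command in this issue.
config = {"url": "http://example.com/api", "params": {"model": "test"}}
boundary = "example-boundary"  # hypothetical boundary, any unique token works


def build_multipart(fields: dict[str, str], boundary: str) -> bytes:
    """Encode text fields as a multipart/form-data body (illustrative helper)."""
    lines = []
    for name, value in fields.items():
        lines.append(f"--{boundary}")
        lines.append(f'Content-Disposition: form-data; name="{name}"')
        lines.append("")
        lines.append(value)
    lines.append(f"--{boundary}--")
    lines.append("")
    return "\r\n".join(lines).encode("utf-8")


# A client can only place the config in a form field after serializing it.
body = build_multipart({"picture_description_api": json.dumps(config)}, boundary)

# Parse the body the way a server would, reusing the stdlib MIME parser.
header = f"Content-Type: multipart/form-data; boundary={boundary}\r\n\r\n"
message = message_from_bytes(header.encode("ascii") + body)
part = message.get_payload()[0]
received = part.get_payload()

print(type(received).__name__)        # the field value is a str, not a dict
print(json.loads(received) == config)  # decoding recovers the original config
```

Any server-side model that insists on a dict for this field therefore needs a decoding step between the form parser and validation.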
Impact: Complete feature dysfunction for web applications. Any integration requiring file uploads with picture description configuration fails. This affects Open WebUI and other web-based Docling integrations.
Related Issue: Open WebUI #15002 confirms this bug prevents Docling picture description from working through web interfaces.
Steps to reproduce
- Start Docling service (using Docker or local installation)
- Attempt to send a picture description request via multipart/form-data:
curl -X POST http://localhost:5001/v1/documents/convert \
  -F 'file=@test-image.jpg' \
  -F 'picture_description_api={"url":"http://example.com/api","params":{"model":"test"}}'
- Observe the Pydantic validation error rejecting the JSON string
- Note that the same configuration works perfectly when sent via application/json (without file upload)
Alternative reproduction via Open WebUI:
- Configure Open WebUI with Docling integration
- Enable picture description in Docling settings
- Enter any JSON configuration in "Picture Description API Config" field
- Save configuration and upload a PDF with images
- Observe validation errors in browser console and server logs
Docling version
Latest Docker image: ds4sd/docling:latest
(This affects all recent versions as it's a fundamental API design issue)
Python version
Python 3.11+ (version-independent - this is a Pydantic/HTTP protocol issue)
Proposed Fix: Add Pydantic validator to parse JSON strings for multipart compatibility:
import json
from pydantic import validator

# Defined on the request model that declares these two fields.
@validator('picture_description_api', 'picture_description_local', pre=True)
def parse_json_config(cls, v):
    # Multipart form fields arrive as strings, so decode JSON before validation.
    if isinstance(v, str):
        try:
            return json.loads(v)
        except json.JSONDecodeError:
            raise ValueError("Invalid JSON string")
    return v
This is the standard solution used by GitHub API, AWS S3, and other APIs that accept both files and complex metadata in multipart requests.
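The error type in the report (model_attributes_type) looks like a Pydantic v2 error, so a v2-style variant of the same idea may be closer to what the service needs. The sketch below assumes Pydantic v2 is installed. PictureDescriptionApiOptions and ConvertOptions are hypothetical stand-ins, not Docling's actual models; only the two field names come from this issue.

```python
import json
from typing import Any, Optional

from pydantic import BaseModel, field_validator


class PictureDescriptionApiOptions(BaseModel):
    # Hypothetical stand-in for the real options model.
    url: str
    params: dict[str, Any] = {}


class ConvertOptions(BaseModel):
    # Hypothetical request model; only the field names come from the issue.
    picture_description_api: Optional[PictureDescriptionApiOptions] = None
    picture_description_local: Optional[dict[str, Any]] = None

    @field_validator(
        "picture_description_api", "picture_description_local", mode="before"
    )
    @classmethod
    def parse_json_config(cls, v: Any) -> Any:
        # Multipart form fields arrive as strings; decode them before
        # the normal field validation runs. Dicts pass through untouched.
        if isinstance(v, str):
            try:
                return json.loads(v)
            except json.JSONDecodeError as exc:
                raise ValueError("Invalid JSON string") from exc
        return v


raw = '{"url": "http://example.com/api", "params": {"model": "test"}}'

# The same configuration validates whether it arrives as a string or a dict.
from_string = ConvertOptions(picture_description_api=raw)
from_dict = ConvertOptions(picture_description_api=json.loads(raw))
print(from_string == from_dict)
```

With mode="before", the validator runs on the raw input, so application/json requests that already send a dict keep working and only string inputs take the extra decoding step.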