Hi @eleaner! I'm Dosu and I’m helping the docling team. Yes, you can configure docling to work with llama.cpp (or other OpenAI-compatible servers) running on a different machine. Docling supports remote LLMs via OpenAI-compatible API endpoints—users have successfully used llama-server and vllm as remote backends, and there’s no vendor lock-in to Ollama or any specific provider. The key is to enable remote services and set the API endpoint URL and model name in your configuration. Here’s a simple step-by-step guide:
```python
from docling.datamodel.pipeline_options import ApiVlmOptions, VlmPipelineOptions, ResponseFormat

API_URL = "http://<remote-ip>:8000/v1/chat/completions"  # Replace <remote-ip> with your server's address
API_MODEL = "your-model-name"  # The model name as served by llama.cpp

def llama_vlm_options(model: str, prompt: str) -> ApiVlmOptions:
    # Options for calling a remote OpenAI-compatible endpoint (e.g. llama-server)
    return ApiVlmOptions(
        url=API_URL,
        params={"model": model},
        prompt=prompt,
        timeout=600,
        scale=1.0,
        response_format=ResponseFormat.MARKDOWN,
    )

pipeline_options = VlmPipelineOptions(
    enable_remote_services=True  # This must be True for remote connections
)
pipeline_options.vlm_options = llama_vlm_options(
    model=API_MODEL,
    prompt="Your prompt here",
)
```
Caveats: `enable_remote_services=True` is required for any remote endpoint, and the URL must be the full OpenAI-compatible chat-completions endpoint (e.g. ending in `/v1/chat/completions`) and reachable from the machine running docling.
For more details and troubleshooting, see the discussion and example code. Let me know if you want a more detailed example or help with a specific part of the setup! To reply, just mention @dosu.
The idea is that I would love to use the power of an LLM, but I don't want to run my private LLM on the same machine as docling.
Is that possible, and if so, how do I do it?
Unfortunately, I'm afraid I'm not experienced enough to work out what I should do (if it is possible at all).
Thanks