[BUG]: Prompt template uses str.format(), so any literal {/} added to prompt.txt 500s every /forms/fill #581

@vharkins1

⚡️ Describe the Bug

LLM.build_prompt renders the prompt with str.format():

# app/services/llm.py:30
return template.format(field=current_field, type=current_type, text=self._transcript_text)

str.format() parses the template (app/services/prompt.txt) for {...}
replacement fields. Today the template only contains {field}, {type} and
{text}, so it works. But the moment anyone adds a literal brace to the prompt
that is not one of those three placeholders (e.g. a JSON output example or a
few-shot block like {"value": "..."}), str.format() raises KeyError
(or ValueError for an unbalanced brace) on every extraction call.

This is a latent footgun rather than a bug in the current prompt: it is
guaranteed to bite the next person who edits prompt.txt to improve extraction
(adding JSON examples to the prompt is one of the most common ways to do exactly
that). Because the prompt is read fresh from disk on each request, a bad edit
takes down /forms/fill with no code change and no obvious link to the prompt.

Note: braces in the user's transcript or in a field name are safe —
str.format() does not re-scan substituted values. The hazard is solely the
template file's own contents.
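
The note above can be checked in isolation. This is a minimal sketch: the template string is a made-up stand-in for prompt.txt, not its real contents.

```python
# Hypothetical stand-in for prompt.txt; only the three real placeholders appear.
template = "Field: {field} ({type})\nTranscript: {text}"

# Braces inside the substituted values are copied through verbatim:
# str.format() parses only the template, never the values it inserts.
rendered = template.format(
    field="{not_a_placeholder}",
    type="text",
    text='He said {"value": 1} and {text}',
)
print(rendered)
```

Both the field name and the transcript come through with their braces intact, so the only input that can break the call is the template itself.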

👣 Steps to Reproduce

  1. Edit app/services/prompt.txt and append an output example containing a
    literal brace, e.g.:
    EXAMPLE: return {"value": "John Smith"}
    
  2. Create a template and call POST /forms/fill (or use the Fill Form screen).
  3. The request fails: the server raises KeyError: '"value"' from
    template.format(...) in build_prompt, which surfaces as a generic 500 via
    the handler in app/api/routes/forms.py.

Minimal confirmation without the app:

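from pathlib import Path

# Append a JSON-style example to the real prompt, then render it as build_prompt does.
tmpl = Path("app/services/prompt.txt").read_text() + '\nEXAMPLE: {"value": "..."}'
tmpl.format(field="Signature", type="signature", text="hello")
# -> KeyError: '"value"'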

📉 Expected Behavior

Editing the prompt text should never crash the extraction endpoint. The prompt
should be allowed to contain arbitrary characters (including { and } in
examples) and only the intended field / type / text slots should be
substituted.
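
One way to pin this behavior down is a renderer that recognises only the three known slots and ignores every other brace. The sketch below is an illustration, not the project's code: `render_prompt`, `_SLOT` and the sample template are hypothetical names and data.

```python
import re

# Matches only the three known slots; every other brace is left untouched.
_SLOT = re.compile(r"\{(field|type|text)\}")


def render_prompt(template: str, field: str, type_: str, text: str) -> str:
    """Hypothetical helper: substitute {field}/{type}/{text} in a single pass."""
    values = {"field": field, "type": type_, "text": text}
    # A function replacement avoids backslash-escape processing of the values,
    # and a single pass means inserted values are never re-scanned.
    return _SLOT.sub(lambda m: values[m.group(1)], template)


template = 'Extract {field} ({type}) from: {text}\nEXAMPLE: {"value": "John Smith"}'
result = render_prompt(template, "Name", "string", "my name is {type} John")
print(result)
```

The JSON example survives unchanged, and the `{type}` that appears inside the transcript is not treated as a placeholder because substitution happens in one pass over the template.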

🖥️ Environment Information

  • OS: macOS (Apple Silicon) — not OS-specific; reproduces anywhere the backend runs
  • Docker/Compose Version: N/A (logic bug, independent of compose)
  • Ollama Model used: N/A (fails before the model is ever called)

📸 Screenshots/Logs

KeyError: '"value"'
  File "app/services/llm.py", line 30, in build_prompt
    return template.format(field=current_field, type=current_type, text=self._transcript_text)

🕵️ Possible Fix

Stop using str.format() for free-form prompt templating. Options, roughly in
order of preference:

  • Use string.Template with $field / $type / $text placeholders and
    safe_substitute() — braces in the prompt body become harmless.
  • Or replace only the named tokens explicitly:
    template.replace("{field}", current_field).replace("{type}", current_type).replace("{text}", self._transcript_text).
  • Or, if str.format() is kept, document that every literal brace in
    prompt.txt must be doubled ({{ / }}) — brittle and easy to forget, so
    least preferred.
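
As a rough sketch of the first option, assuming prompt.txt is migrated to $field / $type / $text placeholders: `render_with_template` and `prompt_body` below are hypothetical and stand in for LLM.build_prompt and the real prompt file.

```python
from string import Template

# Hypothetical prompt.txt contents after migrating placeholders to $-style.
prompt_body = (
    "Extract the value of $field (type: $type) from the transcript.\n"
    'EXAMPLE: return {"value": "John Smith"}\n'
    "Transcript: $text"
)


def render_with_template(template: str, field: str, type_: str, text: str) -> str:
    # safe_substitute() leaves unknown or malformed $-tokens in place
    # instead of raising, so a stray "$5" in the prompt cannot crash a request.
    return Template(template).safe_substitute(field=field, type=type_, text=text)


prompt = render_with_template(prompt_body, "Name", "string", "It costs $5 {ok}")
print(prompt)
```

A trade-off to keep in mind: safe_substitute() also stays silent when a placeholder is misspelled (e.g. $feild), so a small test that checks the rendered prompt for leftover $field / $type / $text tokens may be worth adding. The chained replace() option has a different caveat: each later replace() scans the text inserted by the earlier ones, so a field name containing the literal "{text}" would be rewritten.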
