missing option for trapping exceptions during indexing

i have run into several reasons why indexing a doc can fail (api error, pdfreader error, pdf error, etc.). however, the current CLI and indexing functions simply error out when this happens. a more graceful strategy would be to collect the skipped pdfs/docs in a list and report the failed instances at the end instead of erroring out.

for the moment, i'm reusing the indexing functions and adding failed docs to a list (https://github.com/sensein/paperqa-test/blob/53d5dcf6af3d44645668a01103247d6d8b0de86a/index_abcd.py#L26). but this iterative process is slow and relies on monitoring.
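
to make the request concrete, here is a minimal sketch of the "collect and report" behavior described above. it is not paperqa's API: `index_all`, `report_failures`, and the `index_one` callable are hypothetical names used only for illustration, and `index_one` stands in for whatever function indexes a single doc.

```python
from collections.abc import Callable, Iterable
from pathlib import Path


def index_all(
    paths: Iterable[Path],
    index_one: Callable[[Path], None],  # hypothetical: indexes a single doc
) -> tuple[list[Path], list[tuple[Path, Exception]]]:
    """Try to index every path, collecting failures instead of raising."""
    indexed: list[Path] = []
    failed: list[tuple[Path, Exception]] = []
    for path in paths:
        try:
            index_one(path)
        except Exception as exc:
            # Deliberately broad: API, reader, and PDF errors are all
            # recorded so that one bad doc does not stop the whole run.
            failed.append((path, exc))
        else:
            indexed.append(path)
    return indexed, failed


def report_failures(failed: list[tuple[Path, Exception]]) -> None:
    """Print one line per skipped doc after the run has finished."""
    for path, exc in failed:
        print(f"skipped {path}: {type(exc).__name__}: {exc}")
```

with something like this behind an opt-in flag, a run would finish the docs that can be indexed and print the skipped ones at the end, so no monitoring would be needed. the existing fail-fast behavior could stay the default.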

missing option for trapping exceptions during indexing #1275

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

missing option for trapping exceptions during indexing #1275

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions