
refactor ocr and table recognition logic #80

Merged
myhloli merged 4 commits into opendatalab:xiaomeng_dev from myhloli:main
Aug 8, 2024

Conversation

Collaborator

myhloli commented Aug 8, 2024

  • refactor(pdf_extract): optimize image processing and table recognition
  • fix(pdf_extract): optimize batch size and worker count for DataLoader
  • feat(self_modify): refine text and formula detection box updating logic
  • refactor(extract_pdf): When converting a PDF to a list of images, do not perform a BGR channel conversion upfront.
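
For readers unfamiliar with the last item: deferring the BGR conversion means pages stay in RGB order after rendering, and only the consumer that expects BGR flips the channels. A minimal sketch of the idea, using nested lists of pixel tuples rather than real image arrays; `rgb_to_bgr` and `run_bgr_consumer` are hypothetical names and are not taken from this PR:

```python
from typing import Callable, List, Tuple

Pixel = Tuple[int, int, int]
Image = List[List[Pixel]]  # rows of (R, G, B) tuples; a stand-in for an image array


def rgb_to_bgr(image: Image) -> Image:
    """Return a copy of the image with each pixel's channel order reversed."""
    return [[pixel[::-1] for pixel in row] for row in image]


def run_bgr_consumer(image_rgb: Image, consumer: Callable[[Image], object]) -> object:
    """Convert to BGR only at the point where a BGR-only consumer is called.

    The caller keeps its RGB copy untouched, so RGB-based steps need no
    second conversion back.
    """
    return consumer(rgb_to_bgr(image_rgb))


# One red pixel in RGB becomes (0, 0, 255) in BGR order.
page = [[(255, 0, 0)]]
seen_by_consumer = run_bgr_consumer(page, lambda img: img)
```

With real arrays the flip would typically be a channel-axis reversal instead of a list comprehension; the point here is only where the conversion happens.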

myhloli added 4 commits August 7, 2024 18:54
…not perform a BGR channel conversion upfront.
Update the logic for merging and refining detection boxes in self_modify module.
Replace hardcoded checks with dynamic calculations for determining overlapping regions,
resulting in more accurate detection box merging when formulae are identified within texts.
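
A rough illustration of what a computed overlap check can look like when a formula box sits inside a text box. This is a sketch under assumptions, not the code in the self_modify module: the `(x0, y0, x1, y1)` box layout, the function names, and the `min_ratio` threshold are all hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1); assumed layout


def overlap_ratio(inner: Box, outer: Box) -> float:
    """Fraction of `inner`'s area that lies inside `outer` (0.0 to 1.0)."""
    ix0, iy0 = max(inner[0], outer[0]), max(inner[1], outer[1])
    ix1, iy1 = min(inner[2], outer[2]), min(inner[3], outer[3])
    intersection = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area = (inner[2] - inner[0]) * (inner[3] - inner[1])
    return intersection / area if area > 0 else 0.0


def carve_formula_from_text(text: Box, formula: Box, min_ratio: float = 0.5) -> List[Box]:
    """Split a text box horizontally around a formula that overlaps it.

    If the formula does not overlap enough, the text box is returned unchanged.
    """
    if overlap_ratio(formula, text) < min_ratio:
        return [text]
    pieces: List[Box] = []
    if formula[0] > text[0]:  # text remaining to the left of the formula
        pieces.append((text[0], text[1], formula[0], text[3]))
    if formula[2] < text[2]:  # text remaining to the right of the formula
        pieces.append((formula[2], text[1], text[2], text[3]))
    return pieces
```

For example, a text box `(0, 0, 100, 10)` with a formula at `(40, 0, 60, 10)` is split into the two text boxes on either side of the formula.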
Reduce the batch size from 128 to 64 and set the number of workers to 0 in the DataLoader to improve stability and performance on systems with limited resources.

refactor(pdf_extract): refactor ocr and table recognition logic

Refactor the ocr and table recognition logic to enhance readability and maintainability.
This includes the adjustment of formula recognition coordinates relative to the cropped
image and streamlining the process for handling OCR results and table recognition.

- Rename loop variable 'idx' to 'pdf_idx' for clarity.
- Adjust image pasting and coordinate handling during OCR processing.
- Add comments for improved code understanding.
- Ensure proper rendering of images during PDF visualization.
- Refactor logging and utility imports in self_modify module.

The changes include improvements to image processing routines, better variable naming,
and streamlined table recognition logic. Also, the visualization process has been tweaked
to handle images more accurately. Additionally, redundant logging and utility imports
have been cleaned up in the self_modify module to declutter the codebase.
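
The coordinate adjustment mentioned in this commit message comes down to translating boxes between page space and the space of a cropped region. A minimal sketch, assuming `(x0, y0, x1, y1)` boxes and a crop identified by its top-left corner on the page; the function names are hypothetical and not taken from the PR:

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1); assumed layout
Point = Tuple[float, float]              # (x, y) of the crop's top-left corner on the page


def to_crop_coords(box: Box, crop_origin: Point) -> Box:
    """Express a page-space box relative to the cropped image's top-left corner."""
    ox, oy = crop_origin
    x0, y0, x1, y1 = box
    return (x0 - ox, y0 - oy, x1 - ox, y1 - oy)


def to_page_coords(box: Box, crop_origin: Point) -> Box:
    """Inverse of to_crop_coords: map a crop-space box back onto the page."""
    ox, oy = crop_origin
    x0, y0, x1, y1 = box
    return (x0 + ox, y0 + oy, x1 + ox, y1 + oy)


# A formula at (120, 80, 160, 100) on the page, inside a crop whose
# top-left corner is at (100, 50), sits at (20, 30, 60, 50) in the crop.
formula_in_crop = to_crop_coords((120, 80, 160, 100), (100, 50))
```

OCR results produced on the crop would be mapped back with `to_page_coords` using the same origin before being merged with page-level results.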
myhloli merged commit a7e95f5 into opendatalab:xiaomeng_dev Aug 8, 2024