Improve pdf creation: implement `--pages-per-dict`, support for background images

Hi,

I propose a patch that changes jbig2 behavior in two respects. First, the files generated in "-p" mode now retain their original names, and only the extension is changed (I use ".jbig2", but anything else would be OK). A numerical suffix is added in case of name clashes (or for images that come from multipage TIFF files). For this reason the 'basename' parameter is gone. The reason for this change is that source images may have accompanying files (such as background images previously separated by a scan processing application). In such cases the file names contain useful information that should not be lost during processing/conversion.
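To illustrate the naming rule, here is a minimal Python sketch. It is not taken from the patch: the helper name `unique_output_name` and the `_N` suffix format are assumptions chosen for illustration, and the patch may format its suffixes differently.

```python
import os


def unique_output_name(source_path, taken, ext=".jbig2"):
    """Hypothetical helper: keep the source file's base name, swap the extension.

    `taken` is a set of output names already used in this run. If the plain
    name is already taken (a clash between inputs, or a further page of the
    same multipage TIFF), a numerical suffix is appended. The "_N" suffix
    format is an assumption, not the patch's actual format.
    """
    stem = os.path.splitext(os.path.basename(source_path))[0]
    candidate = stem + ext
    n = 1
    while candidate in taken:
        candidate = f"{stem}_{n}{ext}"
        n += 1
    taken.add(candidate)
    return candidate
```

Calling the helper repeatedly with the same `taken` set is what produces the suffixes: the first `page.tif` maps to `page.jbig2`, and a later input with the same base name gets a numbered variant.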

The second change makes it possible to generate more than one symbol dictionary, so that the loading speed for large PDF files can be increased. There is now a new option (-P, --pages-per-dict), which specifies how many pages should be processed in a single pass. The default value for this parameter is 15.
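The grouping this option implies can be sketched in Python as follows. This is only an illustration of splitting a page list into passes, assuming each pass gets its own symbol dictionary; the function name `batch_pages` is hypothetical and the actual patch is not written in Python.

```python
def batch_pages(pages, pages_per_dict=15):
    """Hypothetical sketch: split the page list into consecutive groups.

    Each group would be encoded in one pass and share one symbol dictionary.
    The default of 15 matches the default stated for -P/--pages-per-dict.
    """
    if pages_per_dict < 1:
        raise ValueError("pages_per_dict must be at least 1")
    return [
        pages[start:start + pages_per_dict]
        for start in range(0, len(pages), pages_per_dict)
    ]
```

With 40 input pages and the default value, this yields three groups of 15, 15 and 10 pages, so the resulting PDF would carry three dictionaries instead of one large one.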

I also propose a modified version of pdf.py that implements support for background images, which can be combined with the foreground mask in the same PDF file. Several graphics formats (PNG, TIFF, JPEG) are supported. It is possible either to use the graphics stripped by jbig2 at the previous stage, or to prepare the images separately in a different application, provided that the file names follow the same convention.
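As a rough sketch of how a foreground mask could be matched to its background image by file name, consider the Python snippet below. It assumes the convention is "same base name, different extension", which the description above does not spell out; the function name `find_background` and the exact extension list are likewise assumptions, not the modified pdf.py's actual code.

```python
import os

# Assumed extensions for the supported formats (PNG, TIFF, JPEG).
BACKGROUND_EXTENSIONS = (".png", ".tif", ".tiff", ".jpg", ".jpeg")


def find_background(mask_path, extensions=BACKGROUND_EXTENSIONS):
    """Hypothetical helper: look for a background image next to a mask file.

    Assumes the background shares the mask's base name and differs only in
    its extension. Returns the first matching path, or None if the page has
    no background image.
    """
    stem = os.path.splitext(mask_path)[0]
    for ext in extensions:
        candidate = stem + ext
        if os.path.isfile(candidate):
            return candidate
    return None
```

A page without a matching file simply returns `None`, so it can be written to the PDF as a foreground mask only.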

BTW it might be reasonable to rename pdf.py to something more meaningful, so that the script could be safely installed somewhere on the PATH.

The files can be downloaded here:
http://www.thessalonica.org.ru/downloads/jbig2.patch.gz
http://www.thessalonica.org.ru/downloads/pdf.py.gz
