
Improve pdf creation: implement --pages-per-dict, support for background images #10

@akryukov

Hi,

I propose a patch which changes jbig2 behavior in two respects. First, the files generated in the "-p" mode now retain their original names, and only the extension is changed (I use ".jbig2", but anything else would be OK). A numerical suffix is added in case of name clashes (or for images which come from multipage TIFF files). For this reason the 'basename' parameter is gone. The reason for this change is that source images may have accompanying files (such as background images previously separated with a scan processing application). In such cases the file names contain useful information which should not be lost during processing/conversion.
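As a rough illustration of the naming rule described above (this is not code from the patch, which modifies the jbig2 tool itself), the following Python sketch keeps the source stem, swaps the extension, and appends a numerical suffix on a clash. The function name `output_name`, the `taken` set, and the `-N` suffix format are all assumptions made for the example; the actual suffix format used by the patch may differ.

```python
import os


def output_name(source_path, taken, ext=".jbig2"):
    """Return an output file name that keeps the stem of source_path.

    `taken` is a set of names already handed out; it is updated in place.
    The "-N" suffix format is an assumption made for this sketch.
    """
    stem = os.path.splitext(os.path.basename(source_path))[0]
    candidate = stem + ext
    counter = 1
    # On a clash, try stem-1, stem-2, ... until a free name is found.
    while candidate in taken:
        candidate = "%s-%d%s" % (stem, counter, ext)
        counter += 1
    taken.add(candidate)
    return candidate


taken = set()
sources = ["scans/page01.tif", "other/page01.png", "scans/page02.tif"]
names = [output_name(path, taken) for path in sources]
print(names)
```

Pages extracted from one multipage TIFF would clash in the same way, since they all share a single source stem, so under this scheme they would also receive numbered names.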

The second change allows generating more than one symbol dictionary, so that the loading speed of large PDF files can be increased. There is now a new option (-P, --pages-per-dict), which specifies how many pages should be processed in the same pass. The default value for this parameter is 15.
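The grouping implied by this option can be sketched in a few lines of Python. This only illustrates how pages would be split into passes, assuming each pass gets its own symbol dictionary; `group_pages` is a hypothetical helper, not a function from the patch.

```python
def group_pages(pages, pages_per_dict=15):
    """Split a list of pages into consecutive groups, one group per pass."""
    if pages_per_dict < 1:
        raise ValueError("pages_per_dict must be at least 1")
    return [pages[i:i + pages_per_dict]
            for i in range(0, len(pages), pages_per_dict)]


# A 40-page document with the default of 15 pages per dictionary.
groups = group_pages(list(range(1, 41)))
print([len(group) for group in groups])
```

With the default of 15, a 40-page document would be processed in three passes, the last one covering the remaining 10 pages.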

I also propose a modified version of pdf.py, implementing support for background images, which can be combined with the foreground mask in the same PDF file. Several graphics formats (PNG, TIFF, JPEG) are supported. It is possible either to use graphics stripped by jbig2 at the previous stage, or to prepare the images separately in a different application, provided that the file names follow the same convention.
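A minimal sketch of how a foreground mask might be paired with its background image by file name. It assumes the convention is simply "same stem, different extension", which is a guess for illustration; the convention actually used by the modified pdf.py may be different. `find_background` and `BACKGROUND_EXTS` are hypothetical names.

```python
import os

# Extensions for the formats mentioned above (PNG, TIFF, JPEG).
BACKGROUND_EXTS = (".png", ".tif", ".tiff", ".jpg", ".jpeg")


def find_background(mask_name, available):
    """Return the background image sharing the stem of mask_name, or None.

    `available` is a list of file names to search, so the sketch runs
    without touching the file system.
    """
    stem = os.path.splitext(mask_name)[0]
    for name in available:
        base, ext = os.path.splitext(name)
        if base == stem and ext.lower() in BACKGROUND_EXTS:
            return name
    return None


files = ["page01.jbig2", "page01.jpg", "page02.jbig2"]
print(find_background("page01.jbig2", files))
print(find_background("page02.jbig2", files))
```

A page without a matching background yields `None`, in which case only the foreground mask would be placed on that page.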

BTW it might be reasonable to rename pdf.py to something more meaningful, so that the script could be safely installed somewhere into the PATH.

The files can be downloaded here:
http://www.thessalonica.org.ru/downloads/jbig2.patch.gz
http://www.thessalonica.org.ru/downloads/pdf.py.gz
