pdfsizeopt: A Free and Open Source PDF Manipulation Tool to Reduce PDF File Size

pdfsizeopt is an open source project hosted on Google Code. Its main feature is optimizing the size of PDF files.

About

pdfsizeopt is a collection of best practices and scripts for Unix to optimize the size of PDF files, with focus on PDFs created from TeX and LaTeX documents. pdfsizeopt is developed on a Linux system, and it depends on existing tools such as Python 2.4, Ghostscript 8.50, jbig2enc (optional), sam2p, pngtopnm, pngout (optional), and the Multivalent PDF compressor (optional) written in Java.

The author describes it as a Linux solution. I tested it on my DreamHost account and it works: a PDF that was originally 5.6M came out at 4.4M after optimization. Great!
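As a quick check of those numbers, the saving works out to roughly a fifth of the original size. A minimal sketch of the arithmetic (the function name is only for illustration and is not part of pdfsizeopt):

```python
def size_reduction_percent(original_mb, optimized_mb):
    """Return the percentage by which the file shrank."""
    if original_mb <= 0:
        raise ValueError("original size must be positive")
    return (original_mb - optimized_mb) / original_mb * 100.0


# The sizes reported above: 5.6M before, 4.4M after.
print("%.1f%%" % size_reduction_percent(5.6, 4.4))  # prints 21.4%
```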

Another great thing: I am working on porting it to Windows. All the required tools are ready (some downloaded from their websites, some compiled by myself, for example jbig2), and I have successfully modified pdfsizeopt.py so that it now runs under Windows. It still has many bugs (I have submitted them to the author), and I will release it later.

Installation instructions

Please note that not all the software mentioned in the instructions below is free software (free as in freedom). Details:

  • pdfsizeopt: free
  • Python: free
  • Ghostscript: free version available
  • Java: free version available (OpenJDK)
  • sam2p: free
  • jbig2: free (http://github.com/agl/jbig2enc/tree/master)
  • png22pnm: free
  • pngtopnm: free
  • Multivalent.jar: not free software, but it costs nothing to use and can be downloaded from the official web site
  • PNGOUT: not free software, but it costs nothing to use and can be downloaded from the official web site

Necessary:

  1. A Unix system is needed; Linux is recommended. The following instructions have been tested on Debian Etch and Ubuntu Hardy.
  2. Install Python 2.4, Python 2.5 or Python 2.6 from package. Earlier or later versions won’t work.
  3. Install Ghostscript 8.61 or later. (You may try pdfsizeopt with Ghostscript 8.54 as well, but 8.54 has some known font conversion problems, so it will produce an error for some PDF files.) Earlier versions won’t work. Make sure the command gs is on your $PATH.
  4. Create a directory named pdfsizeopt.
  5. Check out the source code at http://code.google.com/p/pdfsizeopt/source/checkout , or just download http://pdfsizeopt.googlecode.com/svn/trunk/pdfsizeopt.py as pdfsizeopt/pdfsizeopt.py.
  6. Install a recent sam2p and copy the binary to pdfsizeopt/sam2p. For Linux, the recommended binary is http://pdfsizeopt.googlecode.com/files/sam2p . Please note that the sam2p in Ubuntu Intrepid and Debian Etch is too old. Either compile it yourself, or use the recommended download above.
  7. Install pngtopnm from package, or download the Linux binary from http://pdfsizeopt.googlecode.com/files/png22pnm to pdfsizeopt/png22pnm.
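Before going further, it can help to confirm that the necessary pieces are where the steps above put them. The following is only a sketch under a few assumptions: the helper name `missing_prerequisites` is made up for this post, it is written for a modern Python 3 (it uses `shutil.which`) and so runs separately from the Python 2.4 to 2.6 interpreter that pdfsizeopt.py itself needs, and it checks only for the presence of files, not their versions.

```python
import os
import shutil


def missing_prerequisites(tool_dir="pdfsizeopt", which=shutil.which):
    """Return a list of problems; an empty list means everything was found.

    `which` is injectable so the check can be exercised without touching
    the real PATH.
    """
    problems = []
    # Step 3: the command gs must be on $PATH.
    if which("gs") is None:
        problems.append("gs (Ghostscript) is not on PATH")
    # Steps 5 and 6: files copied into the pdfsizeopt directory.
    for name in ("pdfsizeopt.py", "sam2p"):
        if not os.path.isfile(os.path.join(tool_dir, name)):
            problems.append("%s is missing from %s" % (name, tool_dir))
    # Step 7: either pngtopnm from a package or the downloaded png22pnm.
    has_png22pnm = os.path.isfile(os.path.join(tool_dir, "png22pnm"))
    if which("pngtopnm") is None and not has_png22pnm:
        problems.append("neither pngtopnm (PATH) nor png22pnm was found")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("MISSING: " + problem)
```

Run it from the parent of the pdfsizeopt directory; no output means the necessary files were found.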

Optional, but strongly recommended:

  1. Install Java 1.5 or newer from package. javac is not necessary. Sun’s Java and OpenJDK are OK; gcj and gij won’t work. Make sure that java -version works and reports version 1.5 or newer.
  2. Download Multivalent*.jar from http://sourceforge.net/project/showfiles.php?group_id=44509&package_id=37068 (example: Multivalent20060102.jar), and copy it to pdfsizeopt/Multivalent.jar.
  3. Compile jbig2 yourself, or download the Linux binary from http://pdfsizeopt.googlecode.com/files/jbig2 to pdfsizeopt/jbig2.
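The Java version requirement in step 1 can also be checked by a script. This is a rough sketch, not something shipped with pdfsizeopt: the function names are invented here, the sample strings in the comments only illustrate the usual shape of the output, and it does not try to tell gcj or gij apart from a working Java. Note that `java -version` writes its report to standard error, not standard output.

```python
import re
import subprocess


def parse_java_version(output):
    """Extract (major, minor) from `java -version` output, or None."""
    # Typical first lines look like: java version "1.6.0_20"
    # or: openjdk version "11.0.2" 2019-01-15
    match = re.search(r'version "(\d+)(?:\.(\d+))?', output)
    if match is None:
        return None
    return (int(match.group(1)), int(match.group(2) or 0))


def java_is_new_enough(output, minimum=(1, 5)):
    """True if the reported version is at least `minimum`."""
    version = parse_java_version(output)
    return version is not None and version >= minimum


def read_java_version_output():
    """Run `java -version` and return what it wrote to stderr."""
    try:
        result = subprocess.run(
            ["java", "-version"],
            stdout=subprocess.PIPE,
            stderr=subprocess.PIPE,
            universal_newlines=True,
        )
    except OSError:
        return ""  # java is not installed or not on PATH
    return result.stderr


if __name__ == "__main__":
    print("Java OK" if java_is_new_enough(read_java_version_output())
          else "Java 1.5 or newer was not found")
```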

Optional, but recommended:

  1. Download the PNGOUT binary for your system. Recommended for Linux: the http://static.jonof.id.au/dl/kenutils/pngout-20070430-linux-static.tar.gz archive on http://www.jonof.id.au/kenutils . For other PNGOUT downloads, visit http://advsys.net/ken/utils.htm . Copy the file pngout-*-linux-static to pdfsizeopt/pngout.

Try it:

  1. Create a file test.pdf, and run pdfsizeopt.py --use-pngout=true --use-jbig2=true --use-multivalent=true test.pdf. The output file will be test.pso.pdf.
  2. If you haven’t installed some of the tools above, try changing =true to =false in the command line.
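The switching between =true and =false can be automated. The sketch below builds the command line from whichever optional tools are present; it assumes they were copied into the pdfsizeopt directory under the names used in the steps above (pngout, jbig2, Multivalent.jar), and `build_command` is a hypothetical helper written for this post, not part of pdfsizeopt.

```python
import os


def build_command(pdf_path, tool_dir="pdfsizeopt"):
    """Return the pdfsizeopt.py command line as a list of arguments."""

    def flag(name, filename):
        # Enable a tool only if its file exists in the tool directory.
        present = os.path.isfile(os.path.join(tool_dir, filename))
        return "--use-%s=%s" % (name, "true" if present else "false")

    return [
        os.path.join(tool_dir, "pdfsizeopt.py"),
        flag("pngout", "pngout"),
        flag("jbig2", "jbig2"),
        flag("multivalent", "Multivalent.jar"),
        pdf_path,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(" ".join(build_command("test.pdf")))
```

The printed line can be pasted into a shell, or the list can be passed to `subprocess.call` once it looks right.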

References
pdfsizeopt home page
Convert JBIG2 to PDF with free and open source software agl’s jbig2enc
Windows version JBIG2 Encoder-Jbig2.exe
