pdftohtml: convert PDF to HTML and XML, even Excel

pdftohtml is a utility that converts PDF files into HTML and XML formats. It is based on Xpdf, is open source, and is written in C++.

Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
-f <int> : first page to convert
-l <int> : last page to convert
-q : don’t print any messages or errors
-h : print usage information
-help : print usage information
-p : exchange .pdf links by .html
-c : generate complex document
-i : ignore images
-noframes : generate no frames
-stdout : use standard output
-zoom <fp> : zoom the pdf document (default 1.5)
-xml : output for XML post-processing
-hidden : output hidden text
-nomerge : do not merge paragraphs
-enc <string> : output text encoding name
-dev <string> : output device name for Ghostscript (png16m, jpeg etc)
-v : print copyright and version info
-opw <string> : owner password (for encrypted files)
-upw <string> : user password (for encrypted files)
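
As a rough illustration of how the options above combine, here is a small Python sketch that assembles a pdftohtml command line and, optionally, runs it. The function names (build_pdftohtml_command, run_pdftohtml) and the file names are made up for this example; only flags from the list above are used, and the sketch assumes pdftohtml is installed and on the PATH.

```python
import shutil
import subprocess


def build_pdftohtml_command(pdf_path, out_base=None, first_page=None,
                            last_page=None, xml=False, no_frames=False,
                            ignore_images=False, quiet=False):
    """Assemble an argument list from the options listed in the usage text.

    This helper is hypothetical; it only arranges documented flags.
    """
    cmd = ["pdftohtml"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to convert
    if xml:
        cmd.append("-xml")               # output for XML post-processing
    if no_frames:
        cmd.append("-noframes")          # generate no frames
    if ignore_images:
        cmd.append("-i")                 # ignore images
    if quiet:
        cmd.append("-q")                 # no messages or errors
    cmd.append(pdf_path)
    if out_base is not None:
        cmd.append(out_base)             # optional output file name
    return cmd


def run_pdftohtml(cmd):
    """Run the command, failing early if the tool is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError("pdftohtml not found on PATH")
    return subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Pages 1 to 5 of a hypothetical report.pdf, as XML, without images.
    print(build_pdftohtml_command("report.pdf", "report",
                                  first_page=1, last_page=5,
                                  xml=True, ignore_images=True))
```

Building the argument list separately from running it makes it easy to inspect the command before anything is executed.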

I have even used it to generate Excel from PDF, converting a 927 PDF file to an Excel document.
By the way, it runs on Windows, Linux, Mac OS X and so on.
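
The exact steps behind the Excel conversion are not described here, so the following is only one possible route, not necessarily the one used: run pdftohtml with -xml, then group the resulting text elements into rows and save them as CSV, which Excel opens directly. The sketch assumes the XML output consists of page elements containing text elements with top and left attributes; the function names, the tolerance value and the sample data are invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET


def xml_to_rows(xml_text, tolerance=2):
    """Group <text> elements into rows by their 'top' coordinate.

    Elements whose 'top' is within `tolerance` units of the first
    element of a row are treated as being on the same line, then
    ordered left to right by 'left'.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for page in root.iter("page"):
        cells = []
        for el in page.iter("text"):
            # itertext() also picks up text nested in <b> or <i> tags.
            content = "".join(el.itertext()).strip()
            if content:
                cells.append((float(el.get("top")), float(el.get("left")), content))
        cells.sort()
        current, current_top = [], None
        for top, left, content in cells:
            if current_top is not None and abs(top - current_top) > tolerance:
                rows.append([c for _, c in sorted(current)])
                current, current_top = [], None
            if current_top is None:
                current_top = top
            current.append((left, content))
        if current:
            rows.append([c for _, c in sorted(current)])
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text that a spreadsheet can open."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Hand-written sample in the assumed shape of the -xml output.
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="50" width="40" height="12" font="0"><b>Item</b></text>
<text top="100" left="200" width="40" height="12" font="0"><b>Price</b></text>
<text top="121" left="200" width="30" height="12" font="0">3.50</text>
<text top="120" left="50" width="40" height="12" font="0">Widget</text>
</page>
</pdf2xml>"""

if __name__ == "__main__":
    print(rows_to_csv(xml_to_rows(SAMPLE)), end="")
```

Grouping by vertical position is a simple heuristic. It works for plain tables, but multi-line cells or tightly spaced rows would need a smarter layout pass.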
