Get the final MEAP version of the iText in Action PDF ebook today
Today, Manning Publications Co. sent me the finished version of the iText in Action PDF ebook. It is a wonderful morning.
Have a look at the last page of the iText in Action PDF ebook.
pdftohtml is a utility that converts PDF files into HTML and XML formats. It is based on Xpdf, is open source, and is written in C++.
Usage: pdftohtml [options] <PDF-file> [<html-file> <xml-file>]
  -f <int>       : first page to convert
  -l <int>       : last page to convert
  -q             : don’t print any messages or errors
  -h             : print usage information
  -help          : print usage information
  -p             : exchange .pdf links by .html
  -c             : generate complex document
  -i             : ignore images
  -noframes      : generate no frames
  -stdout        : use standard output
  -zoom <fp>     : zoom the pdf document (default 1.5)
  -xml           : output for XML post-processing
  -hidden        : output hidden text
  -nomerge       : do not merge paragraphs
  -enc <string>  : output text encoding name
  -dev <string>  : output device name for Ghostscript (png16m, jpeg etc)
  -v             : print copyright and version info
  -opw <string>  : owner password (for encrypted files)
  -upw <string>  : user password (for encrypted files)
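To make the options above concrete, here is a minimal sketch of calling pdftohtml from a Python script. The helper names build_pdftohtml_command and run_pdftohtml, and the file names report.pdf and report, are made up for illustration. The sketch assumes the pdftohtml binary is installed and on the PATH, and that the optional output name is given after the PDF file.

```python
import subprocess


def build_pdftohtml_command(pdf_path, output_base=None, first_page=None,
                            last_page=None, xml=False, no_frames=False,
                            ignore_images=False):
    """Return the argument list for a pdftohtml call (nothing is executed).

    Hypothetical helper: only flags from the usage text are used.
    """
    cmd = ["pdftohtml"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to convert
    if xml:
        cmd.append("-xml")               # output for XML post-processing
    if no_frames:
        cmd.append("-noframes")          # generate no frames
    if ignore_images:
        cmd.append("-i")                 # ignore images
    cmd.append(pdf_path)
    if output_base is not None:
        # Assumption: the output name follows the PDF file on the command line.
        cmd.append(output_base)
    return cmd


def run_pdftohtml(pdf_path, **options):
    """Run pdftohtml; raises CalledProcessError if the tool reports failure."""
    return subprocess.run(build_pdftohtml_command(pdf_path, **options), check=True)


if __name__ == "__main__":
    # Show the command for converting pages 1 to 10 of a hypothetical file to XML.
    print(" ".join(build_pdftohtml_command(
        "report.pdf", "report", first_page=1, last_page=10, xml=True)))
```

Building the argument list separately from running it makes it easy to check the command before anything is executed.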
I have even used it to generate Excel from PDF, converting a 927 PDF file to an Excel document.
By the way, it supports Windows, Linux, Mac OS X, and so on.
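One possible route from PDF to a spreadsheet is to post-process the output of the -xml option. The sketch below is not necessarily how the conversion mentioned above was done. It assumes the XML contains page elements with text children carrying top and left attributes, groups text with nearly the same top value into one row, and writes CSV, which Excel can open. The function names, the tolerance value, and the sample data are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def pdf2xml_to_rows(xml_text, tolerance=2):
    """Group <text> elements into rows by 'top', ordered left to right.

    Assumption: elements whose 'top' is within `tolerance` units of the
    first element of a row belong to that row.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for page in root.iter("page"):
        items = []
        for node in page.iter("text"):
            content = "".join(node.itertext()).strip()  # also reads <b>/<i> children
            if content:
                items.append((float(node.get("top")), float(node.get("left")), content))
        items.sort()  # top to bottom, then left to right
        row, row_top = [], None
        for top, left, content in items:
            if row and top - row_top > tolerance:
                rows.append([text for _, text in sorted(row)])
                row = []
            if not row:
                row_top = top
            row.append((left, content))
        if row:
            rows.append([text for _, text in sorted(row)])
    return rows


def rows_to_csv(rows):
    """Serialize rows as CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Hypothetical sample in the assumed shape of the -xml output.
SAMPLE = """<pdf2xml>
<page number="1" position="absolute" top="0" left="0" height="1188" width="918">
<text top="100" left="300" width="40" height="12" font="0">Price</text>
<text top="100" left="50" width="40" height="12" font="0">Item</text>
<text top="121" left="50" width="40" height="12" font="0"><b>Book</b></text>
<text top="120" left="300" width="40" height="12" font="0">44.99</text>
</page>
</pdf2xml>"""

if __name__ == "__main__":
    print(rows_to_csv(pdf2xml_to_rows(SAMPLE)))
```

This only recovers rows, not real column boundaries, so a table with empty cells would need extra handling based on the left values.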
I have collected some useful programs here; I hope they are useful to you.
Only Acrobat is commercial; the others are all open source.
If you know of any other program that can extract text from PDF under .NET, please let me know. Thanks in advance.