pdf2xml, another tool not covered in "Convert PDF to XML"

pdf2xml is a converter built on the Xpdf library (http://www.foolabs.com/xpdf/home.html). It converts the information contained in a PDF file into XML. Before using it, you need to install xpdf and libxml2 (see the documentation).
It can convert both the text and the images of a PDF to XML.

pdftoxml version 1.0
(Based on Xpdf version 3.01, Copyright 1996-2005 Glyph & Cog, LLC)
Copyright 2004-2006 XEROX XRCE
Usage: pdftoxml [options] <PDF-file> [<xml-file>]
  -f <int>               : first page to convert
  -l <int>               : last page to convert
  -verbose               : display pdf attributes
  -noText                : do not extract textual objects
  -noImage               : do not extract Images (Bitmap and Vectorial)
  -noImageInline         : do not include images inline in the stream
  -outline               : create an outline file xml
  -annots                : create an annotations file xml
  -cutPages              : cut all pages in separately files
  -blocks                : add blocks informations whithin the structure
  -fullFontName          : fonts names are not normalized
  -nsURI <string>        : add the specified namespace URI
  -opw <string>          : owner password (for encrypted files)
  -upw <string>          : user password (for encrypted files)
  -q                     : don't print any messages or errors
  -v                     : print copyright and version info
  -h                     : print usage information
  -help                  : print usage information
  --help                 : print usage information
  -?                     : print usage information
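
To call the tool from a script, the options above can be assembled into an argument list. The Python sketch below is only an illustration, not part of pdf2xml. The function names build_pdftoxml_command and convert are hypothetical. It also assumes that the options come before the input PDF path, that an optional output XML path follows it, and that -f and -l each take a page number, as in other Xpdf-based tools.

```python
import shutil
import subprocess


def build_pdftoxml_command(pdf_path, xml_path=None, first_page=None,
                           last_page=None, no_image=False, blocks=False,
                           quiet=False):
    """Assemble a pdftoxml argument list (hypothetical helper).

    Only flags listed in the usage text are emitted. The argument order
    (options, then PDF path, then optional XML path) is an assumption.
    """
    cmd = ["pdftoxml"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to convert
    if no_image:
        cmd.append("-noImage")           # do not extract images
    if blocks:
        cmd.append("-blocks")            # add block information
    if quiet:
        cmd.append("-q")                 # suppress messages and errors
    cmd.append(pdf_path)
    if xml_path is not None:
        cmd.append(xml_path)
    return cmd


def convert(pdf_path, xml_path, **options):
    """Run pdftoxml if it is on PATH; raise a clear error otherwise."""
    if shutil.which("pdftoxml") is None:
        raise FileNotFoundError("pdftoxml not found on PATH")
    cmd = build_pdftoxml_command(pdf_path, xml_path, **options)
    # check=True raises CalledProcessError on a non-zero exit status
    return subprocess.run(cmd, check=True)
```

For example, build_pdftoxml_command("input.pdf", "output.xml", first_page=1, last_page=3, no_image=True) yields a command that converts pages 1 to 3 without images. Keeping the command construction separate from the subprocess call makes it possible to inspect the arguments without having pdftoxml installed.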
