Convert PDF to XML with Microsoft XPS Document Writer

After you install either .NET Framework 3.0 or the XPS Essentials Pack on Windows XP SP2 or Windows Server 2003, or if you use Vista, you will find that a “Microsoft XPS Document Writer” has appeared in your “Printers and Faxes”. But what is it?
The Microsoft XPS Document Writer (MXDW) is a print-to-file driver that enables any Windows XP SP2/2003/Vista application to create XML Paper Specification (XPS) document files.

An XPS file is in fact a ZIP archive that follows the Open Packaging Conventions and contains the files that make up the document. These include an XML markup file for each page, text, embedded fonts, raster images, 2D vector graphics, and digital rights management information. You can examine the contents of an XPS file simply by opening it in any application that supports ZIP files.
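Because the package is an ordinary ZIP archive, you can also look inside it from a script. The following is a minimal sketch using Python's standard zipfile module. The function names are my own, and the assumption that page markup parts end in ".fpage" reflects how XPS packages are typically laid out, so check it against your own files.

```python
import zipfile


def list_xps_parts(xps_path):
    """Return the names of all parts stored in the XPS package."""
    with zipfile.ZipFile(xps_path) as package:
        return package.namelist()


def page_parts(xps_path):
    """Return only the per-page markup parts.

    Assumption: page markup parts use the .fpage extension.
    """
    return [
        name
        for name in list_xps_parts(xps_path)
        if name.lower().endswith(".fpage")
    ]
```

For example, calling page_parts("output.xps") on a file printed through MXDW (the file name here is hypothetical) should give you the list of page markup files, which you can then read with zipfile's read() method.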

As you know, it is not an easy job to convert PDF to XML, even with Adobe Acrobat, but I found that the Microsoft XPS Document Writer can easily print a PDF to XML, images included.
However, it has an issue when you print a PDF document that contains double-byte characters (for example, a Chinese PDF document): all double-byte characters are converted to vector graphics.
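One rough way to check whether a printed page was affected is to count the text and vector elements in its markup. The sketch below assumes that real text is stored as Glyphs elements and that outlines are stored as Path elements, which is how XPS page markup normally represents them. The function names are my own, and a high Path count alone proves nothing, since ordinary drawings also produce paths.

```python
import xml.etree.ElementTree as ET
import zipfile


def count_text_and_paths(page_xml):
    """Count Glyphs (text) and Path (vector) elements in one page's markup."""
    root = ET.fromstring(page_xml)
    counts = {"Glyphs": 0, "Path": 0}
    for element in root.iter():
        # Drop the "{namespace}" prefix so the check works with any namespace.
        local_name = element.tag.rsplit("}", 1)[-1]
        if local_name in counts:
            counts[local_name] += 1
    return counts


def summarize_xps(xps_path):
    """Map each page part (assumed to end in .fpage) to its element counts."""
    summary = {}
    with zipfile.ZipFile(xps_path) as package:
        for name in package.namelist():
            if name.lower().endswith(".fpage"):
                summary[name] = count_text_and_paths(package.read(name))
    return summary
```

A page that you know is full of Chinese text but reports few or no Glyphs elements and many Path elements is a likely sign that the characters were flattened into graphics.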
