Archive for the 'iTextSharp(iText#)' Category

Font CharSet and Encoding

Wednesday, July 16th, 2008 by rubypdf

Yesterday a Chinese developer sent me a WMF file containing Chinese characters, whose font face name is also Chinese. He said iTextSharp could not convert the WMF to PDF correctly, so I debugged into the source code and found that MetaFont decodes the face name only as Windows-1252, falling back to ASCII:

try {
    faceName = System.Text.Encoding.GetEncoding(1252).GetString(name, 0, k);
}
catch {
    faceName = System.Text.ASCIIEncoding.ASCII.GetString(name, 0, k);
}

However, the charset of a Chinese (GB2312) font is 134, so I think the source code needs to derive the encoding from the charset and then use it to decode faceName.
In the end I came up with the following code:

public enum FontCharSet : byte
{
    ANSI_CHARSET = 0,          // ANSI charset (Windows-1252)
    DEFAULT_CHARSET = 1,
    SYMBOL_CHARSET = 2,
    MAC_CHARSET = 77,
    SHIFTJIS_CHARSET = 128,    // Shift JIS charset (Windows-932)
    HANGEUL_CHARSET = 129,     // Hangeul charset (Windows-949)
    HANGUL_CHARSET = 129,      // same value as HANGEUL_CHARSET
    JOHAB_CHARSET = 130,       // Johab charset (Windows-1361)
    GB2312_CHARSET = 134,      // GB2312 charset (Windows-936)
    CHINESEBIG5_CHARSET = 136, // Chinese Big5 charset (Windows-950)
    GREEK_CHARSET = 161,       // Greek charset (Windows-1253)
    TURKISH_CHARSET = 162,     // Turkish charset (Windows-1254)
    VIETNAMESE_CHARSET = 163,  // Vietnamese charset (Windows-1258)
    HEBREW_CHARSET = 177,      // Hebrew charset (Windows-1255)
    ARABIC_CHARSET = 178,      // Arabic charset (Windows-1256)
    BALTIC_CHARSET = 186,      // Baltic charset (Windows-1257)
    RUSSIAN_CHARSET = 204,     // Cyrillic charset (Windows-1251)
    THAI_CHARSET = 222,        // Thai charset (Windows-874)
    EASTEUROPE_CHARSET = 238,  // Eastern European charset (Windows-1250)
    OEM_CHARSET = 255,
}

I also found two very useful source files in Apache Batik, WMFUtilities.java and WMFConstants.java, which already handle the charset issue.
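The idea described above (pick the encoding from the font's charset byte, then decode the face name with it) can be sketched roughly as follows. This is only an illustration written in Java, the language of iText and Batik. It is not the actual iTextSharp or Batik code. The class name FaceNameDecoder, its methods, and the lookup table are hypothetical, and the Java charset names are my assumption of the closest match for each Windows code page.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

public class FaceNameDecoder {
    // Hypothetical table: font charset byte -> Java charset name for the matching code page.
    private static final Map<Integer, String> CHARSET_TO_ENCODING = new HashMap<>();
    static {
        CHARSET_TO_ENCODING.put(0, "windows-1252");    // ANSI_CHARSET
        CHARSET_TO_ENCODING.put(128, "windows-31j");   // SHIFTJIS_CHARSET (932)
        CHARSET_TO_ENCODING.put(129, "x-windows-949"); // HANGUL_CHARSET
        CHARSET_TO_ENCODING.put(130, "x-Johab");       // JOHAB_CHARSET (1361)
        CHARSET_TO_ENCODING.put(134, "GBK");           // GB2312_CHARSET (936)
        CHARSET_TO_ENCODING.put(136, "x-windows-950"); // CHINESEBIG5_CHARSET
        CHARSET_TO_ENCODING.put(161, "windows-1253");  // GREEK_CHARSET
        CHARSET_TO_ENCODING.put(162, "windows-1254");  // TURKISH_CHARSET
        CHARSET_TO_ENCODING.put(163, "windows-1258");  // VIETNAMESE_CHARSET
        CHARSET_TO_ENCODING.put(177, "windows-1255");  // HEBREW_CHARSET
        CHARSET_TO_ENCODING.put(178, "windows-1256");  // ARABIC_CHARSET
        CHARSET_TO_ENCODING.put(186, "windows-1257");  // BALTIC_CHARSET
        CHARSET_TO_ENCODING.put(204, "windows-1251");  // RUSSIAN_CHARSET
        CHARSET_TO_ENCODING.put(222, "x-windows-874"); // THAI_CHARSET
        CHARSET_TO_ENCODING.put(238, "windows-1250");  // EASTEUROPE_CHARSET
    }

    /** Chooses an encoding for a charset byte; unmapped values use Windows-1252. */
    public static Charset encodingFor(int charSet) {
        // Mask to 0..255 so a signed Java byte such as (byte) 134 still maps correctly.
        String name = CHARSET_TO_ENCODING.getOrDefault(charSet & 0xFF, "windows-1252");
        // If this runtime lacks the charset, fall back to ASCII as the original catch block did.
        return Charset.isSupported(name) ? Charset.forName(name) : StandardCharsets.US_ASCII;
    }

    /** Decodes the first 'length' bytes of the face name using the charset's encoding. */
    public static String decodeFaceName(byte[] name, int length, int charSet) {
        return new String(name, 0, length, encodingFor(charSet));
    }
}
```

With this sketch, the four bytes CB CE CC E5 decoded with charset 134 give the face name 宋体 (SimSun), whereas the Windows-1252 path would turn them into unrelated Western characters. DEFAULT_CHARSET, SYMBOL_CHARSET, MAC_CHARSET and OEM_CHARSET are deliberately left unmapped here, since the right encoding for them depends on the system.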

By the way, iText has the same issue, because it hard-codes Cp1252 when creating the font:

try {
    font = BaseFont.createFont(fontName, "Cp1252", false);
}
catch (Exception e) {
    throw new ExceptionConverter(e);
}

Batch Extract XMP from PDF to XML

Saturday, June 28th, 2008 by rubypdf

Here is a request I received to batch-dump XMP from PDF files to XML files or a database:

I’d like to know if you have developed or if you can develop an application for extracting customize XMP from PDF documents.

I’ll try to be more relevant: I customized a specific card for additional metadata in Acrobat Professional. If I save the xmp properties in xml format, I obtain the value that I inserted, after that I import xml file in Database. I’d like to know if is possible to develop an application that can extract xmp customized value from a group of PDF files.

And what is XMP?

Adobe’s Extensible Metadata Platform (XMP) is a labeling technology that allows you to embed data about a file, known as metadata, into the file itself. With XMP, desktop applications and back-end publishing systems gain a common method for capturing, sharing, and leveraging this valuable metadata — opening the door for more efficient job processing, workflow automation, and rights management, among many other possibilities. With XMP, Adobe has taken the “heavy lifting” out of metadata integration, offering content creators an easy way to embed meaningful information about their projects and providing industry partners with standards-based building blocks to develop optimized workflow solutions.

In the end I used iTextSharp (iText would work just as well, of course) to batch-extract the XMP from the PDF files and save it as XML.
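To give a feel for the batch step, here is a minimal Java sketch. Note that it does not use iTextSharp or iText, which is what I actually used. Instead it swaps in a simpler, library-free technique: scanning the raw file bytes for the XMP packet markers (`<?xpacket begin=` ... `<?xpacket end=`). That only works when the metadata stream is stored uncompressed and unencrypted, and it takes the first packet it finds, so treat it as an illustration under those assumptions. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

public class XmpBatchExtractor {
    /** Returns the first XMP packet found in the raw PDF bytes, or null if there is none. */
    public static String extractXmpPacket(byte[] pdfBytes) {
        // ISO-8859-1 maps every byte to exactly one char, so string indexes equal byte offsets.
        String raw = new String(pdfBytes, StandardCharsets.ISO_8859_1);
        int start = raw.indexOf("<?xpacket begin=");
        if (start < 0) return null;
        int endMarker = raw.indexOf("<?xpacket end=", start);
        if (endMarker < 0) return null;
        int end = raw.indexOf("?>", endMarker);
        if (end < 0) return null;
        // Recover the original bytes of the packet and decode them as UTF-8 (assumed encoding).
        byte[] packet = raw.substring(start, end + 2).getBytes(StandardCharsets.ISO_8859_1);
        return new String(packet, StandardCharsets.UTF_8);
    }

    /** Writes one .xml file per PDF in inputDir that contains a packet; returns the count written. */
    public static int extractAll(Path inputDir, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        int written = 0;
        try (DirectoryStream<Path> pdfs = Files.newDirectoryStream(inputDir, "*.pdf")) {
            for (Path pdf : pdfs) {
                String xmp = extractXmpPacket(Files.readAllBytes(pdf));
                if (xmp == null) continue; // no readable XMP packet in this file
                String xmlName = pdf.getFileName().toString().replaceAll("(?i)\\.pdf$", ".xml");
                Files.write(outputDir.resolve(xmlName), xmp.getBytes(StandardCharsets.UTF_8));
                written++;
            }
        }
        return written;
    }
}
```

Calling `XmpBatchExtractor.extractAll(inputDir, outputDir)` on a folder of PDF files would leave one XML file per document, ready to be imported into a database. For compressed or encrypted metadata a real PDF library is still needed.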

Web service (C#) to digitally sign PDF documents

Friday, June 27th, 2008 by rubypdf

I just finished a bid from RentAcoder titled “Web service (C#) to digitally sign PDF documents”. It is a private bid, and the requirement is:

Buyer needs a C# web service that will digitally sign a PDF created using ITextSharp.

The PDF will be already created and would be updated with a digital signature by the new web service. The web service would take as parameters the name of the document and some user authentication information (user id and password). The web service would validate the user id/password and sign the document digitally with a self-signed certificate.

The signature that appears on the document will be either an image of a signature or a unique text field like Sergie.Norwinski.123456.

The web service will also return a numeric return code and return message indicating success or failure.

The web service should also add the success or failure status to a SQL Server 2005 database. The database already contains the stored procedures to authenticate user identity and log the status of the request; the new web service simply needs to execute them.

Buyer will provide the self-signed certificates that are to be used by the web service.

And the buyer said,

Steven was an absolute delight to work with. He was very timely with his deliverables and the code provided worked as intended and was easily integrated into an existing application.

He was extremely flexible, accommodating, resourceful and has a great sense of ingenuity and creativity. Great C# skills and exceptional PDF and iTextSharp knowledge.

The pitfall of this bid was that I needed to parse the PDF and find the x and y coordinates of specific given text. For example, find the coordinates of “Signature/Credentials:” and put the signature after it, then find the coordinates of “Date Signed:” and put the date string after it. By default, iTextSharp does not support this.