Font CharSet and Encoding

Yesterday a Chinese developer sent me a WMF file that contains Chinese characters and whose font face name is also in Chinese. He said iTextSharp could not convert the WMF to PDF correctly. So I debugged into the source code and found that MetaFont always decodes the face name as Windows-1252, falling back to ASCII:

try {
    faceName = System.Text.Encoding.GetEncoding(1252).GetString(name, 0, k);
}
catch {
    faceName = System.Text.ASCIIEncoding.ASCII.GetString(name, 0, k);
}

But the charset of a Chinese (GB2312) font is 134, so I think the source code needs to pick the encoding from the font's charset and then decode the face name with that encoding.
In the end I came up with the following code:

public enum FontCharSet : byte
{
    ANSI_CHARSET = 0,          // ANSI charset (Windows-1252)
    DEFAULT_CHARSET = 1,
    SYMBOL_CHARSET = 2,
    MAC_CHARSET = 77,
    SHIFTJIS_CHARSET = 128,    // Shift JIS charset (Windows-932)
    HANGEUL_CHARSET = 129,     // Hangeul charset (Windows-949)
    HANGUL_CHARSET = 129,      // alias of HANGEUL_CHARSET
    JOHAB_CHARSET = 130,       // Johab charset (Windows-1361)
    GB2312_CHARSET = 134,      // GB2312 charset (Windows-936)
    CHINESEBIG5_CHARSET = 136, // Chinese Big5 charset (Windows-950)
    GREEK_CHARSET = 161,       // Greek charset (Windows-1253)
    TURKISH_CHARSET = 162,     // Turkish charset (Windows-1254)
    VIETNAMESE_CHARSET = 163,  // Vietnamese charset (Windows-1258)
    HEBREW_CHARSET = 177,      // Hebrew charset (Windows-1255)
    ARABIC_CHARSET = 178,      // Arabic charset (Windows-1256)
    BALTIC_CHARSET = 186,      // Baltic charset (Windows-1257)
    RUSSIAN_CHARSET = 204,     // Cyrillic charset (Windows-1251)
    THAI_CHARSET = 222,        // Thai charset (Windows-874)
    EASTEUROPE_CHARSET = 238,  // Eastern European charset (Windows-1250)
    OEM_CHARSET = 255,
}

Two source files from Apache Batik, WMFUtilities.java and WMFConstants.java, were also very useful: Batik already handles the charset issue.

BTW, iText has the same issue:

try {
    font = BaseFont.createFont(fontName, "Cp1252", false);
}
catch (Exception e) {
    throw new ExceptionConverter(e);
}
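
To make the idea concrete on the Java side, here is a minimal sketch of the charset-to-encoding lookup described above. WmfCharsets, charsetFor and decodeFaceName are hypothetical names of mine, not iText or Batik APIs, and the Java charset names are my assumption of the closest match for each Windows code page. Charsets without an entry (DEFAULT, SYMBOL, MAC, OEM) simply fall back to Windows-1252, as the current code does.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper (not an iText or Batik class).
final class WmfCharsets {
    // WMF font charset byte -> Java charset name for the matching Windows code page.
    private static final Map<Integer, String> CODE_PAGES = new HashMap<>();
    static {
        CODE_PAGES.put(0, "windows-1252");    // ANSI_CHARSET
        CODE_PAGES.put(128, "windows-31j");   // SHIFTJIS_CHARSET (code page 932)
        CODE_PAGES.put(129, "x-windows-949"); // HANGUL_CHARSET
        CODE_PAGES.put(130, "x-Johab");       // JOHAB_CHARSET (code page 1361)
        CODE_PAGES.put(134, "GBK");           // GB2312_CHARSET (code page 936)
        CODE_PAGES.put(136, "x-windows-950"); // CHINESEBIG5_CHARSET
        CODE_PAGES.put(161, "windows-1253");  // GREEK_CHARSET
        CODE_PAGES.put(162, "windows-1254");  // TURKISH_CHARSET
        CODE_PAGES.put(163, "windows-1258");  // VIETNAMESE_CHARSET
        CODE_PAGES.put(177, "windows-1255");  // HEBREW_CHARSET
        CODE_PAGES.put(178, "windows-1256");  // ARABIC_CHARSET
        CODE_PAGES.put(186, "windows-1257");  // BALTIC_CHARSET
        CODE_PAGES.put(204, "windows-1251");  // RUSSIAN_CHARSET
        CODE_PAGES.put(222, "x-windows-874"); // THAI_CHARSET
        CODE_PAGES.put(238, "windows-1250");  // EASTEUROPE_CHARSET
    }

    // Returns the charset to decode with; unknown values use Windows-1252.
    static Charset charsetFor(int charSet) {
        String name = CODE_PAGES.get(charSet & 0xFF);
        if (name == null) {
            name = "windows-1252";
        }
        if (Charset.isSupported(name)) {
            return Charset.forName(name);
        }
        // Last resort if the runtime lacks that charset, like the
        // catch branch in the original code.
        return StandardCharsets.US_ASCII;
    }

    // Decodes a NUL-terminated face name using the font's charset byte.
    static String decodeFaceName(byte[] raw, int charSet) {
        int k = 0;
        while (k < raw.length && raw[k] != 0) {
            k++;
        }
        return new String(raw, 0, k, charsetFor(charSet));
    }
}
```

For example, a face name stored as the bytes CB CE CC E5 00 with charset 134 should decode to 宋体 (SimSun), provided the runtime ships the GBK charset, instead of the ËÎÌå that Cp1252 produces.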
