Archive for January, 2008

Google Docs Supports Saving Presentations as PDF

From the PC Pro article "Google introduces Presentation PDF support":

Google has added PDF support to the presentations component of Google Docs.
Presentation can now export slides in the PDF format, enabling presentations to be saved offline. There’s also a set of PDF-based printing options that enable up to 12 slides to be printed on single sheet.

So I gave it a try. It also supports Chinese, but I found that it tries to use a Japanese font by default, so some Chinese characters are lost, perhaps because they do not exist in the Kanji (Japanese character) set.

Yahoo! CAPTCHA Has Been Broken

According to these Russians, they have cracked the Yahoo CAPTCHA and are giving away how they did it here.

The implementation of the Yahoo CAPTCHA recognition engine is here. It consists of two projects (client and server).
First project (server) needs MATLAB 2007a Compiler Runtime (MCR) installed. It waits for a connection and receives CAPTCHA, after that it sends recognized CAPTCHA text string back to client.
Client reads jpg-files in test1 directory and sends them one by one to the server located on the same machine.
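
To make the client/server split above concrete, here is a minimal Python sketch of the transport only. It is not the original code, which is MATLAB-based and not shown in this post. The length-prefixed framing, port 5000, the function names, and the recognizer stub (which only reports the image size and does no recognition at all) are all assumptions made for illustration.

```python
import socket
import struct
from pathlib import Path


def send_msg(sock, payload):
    """Send one message: 4-byte big-endian length, then the bytes (assumed framing)."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, n):
    """Read exactly n bytes or raise if the peer closes early."""
    buf = bytearray()
    while len(buf) < n:
        chunk = sock.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed the connection early")
        buf.extend(chunk)
    return bytes(buf)


def recv_msg(sock):
    """Receive one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def recognize_stub(image_bytes):
    """Hypothetical placeholder for the recognition engine; returns no real text."""
    return "size=%d" % len(image_bytes)


def handle_one(conn, recognizer=recognize_stub):
    """Server side: receive one image, send the recognized string back."""
    image = recv_msg(conn)
    send_msg(conn, recognizer(image).encode("utf-8"))


def serve(host="127.0.0.1", port=5000, recognizer=recognize_stub):
    """Wait for connections and answer one image per connection."""
    with socket.create_server((host, port)) as listener:
        while True:
            conn, _ = listener.accept()
            with conn:
                handle_one(conn, recognizer)


def send_directory(directory="test1", host="127.0.0.1", port=5000):
    """Client side: send each .jpg in the directory, one by one, to the server."""
    results = {}
    for path in sorted(Path(directory).glob("*.jpg")):
        with socket.create_connection((host, port)) as sock:
            send_msg(sock, path.read_bytes())
            results[path.name] = recv_msg(sock).decode("utf-8")
    return results
```

With serve() running in one process, calling send_directory("test1") in another returns a dictionary that maps each file name to the string the server sent back.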

btw, I have tested it.

PDFassassin: a module for SpamAssassin

PDFassassin is a module for SpamAssassin that allows for the scanning of PDF files in email message attachments. Email bodies are scanned upon connection and checked for PDF attachments. Text is extracted from the PDF via pdftotext and scanned by SpamAssassin. Should the PDF contain images, the gocr program is called to extract the text content. The total spam score of the PDF is compared against the global required_score setting; if it’s higher, a score equal to the one specified in pdf.cf is appended to the overall score of the email message.

With the recent torrent of PDF spam, we created a module for SpamAssassin that allows for the scanning of PDF files. The module, linked below this post, works in the following way:

  1. Email bodies are scanned upon connection, and checked for PDF attachments.
  2. Text is extracted from the PDF via pdftotext, and scanned by SpamAssassin.
  3. Should the PDF contain images, the gocr binary is called to extract the text content.
  4. The total spam score of the PDF is compared against the global required_score setting; if it’s higher, a score equal to the one specified in pdf.cf (default of 10) is appended to the overall score of the email message.

This approach is a departure from the usual method as it scans the content against the SpamAssassin engine, instead of using a word list filter.
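
The scoring rule in step 4 can be sketched in a few lines of Python. This is only an illustration of the logic as described above, not the module's actual code. The function names are hypothetical, the default required_score of 5.0 is SpamAssassin's stock setting, and the penalty of 10 is the pdf.cf default mentioned in the list. The extraction helper assumes the pdftotext binary is on the PATH.

```python
import subprocess


def extract_pdf_text(pdf_path):
    """Step 2: run pdftotext; the '-' argument sends the text to stdout."""
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def adjusted_message_score(message_score, pdf_score,
                           required_score=5.0, pdf_penalty=10.0):
    """Step 4: add the penalty only when the PDF's own score exceeds required_score."""
    if pdf_score > required_score:
        return message_score + pdf_penalty
    return message_score
```

For example, a message scoring 1.5 whose PDF attachment scores 7.5 ends up at 11.5, while the same message with a PDF scoring 3.0 stays at 1.5.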

Should you need to install the module, download it from: http://atmail.com/members/Pdf.tgz.

Installation directions can be found in the README file inside the archive.

PDFassassin forum: http://forum.atmail.com/viewforum.php?id=10
