Converting ACTA201001 into text
Help us transform the full ACTA consolidated text of 2010/01/18 into full-text format, to allow further analysis, etc.
The text is available in different sizes and resolutions:
Lots of strings appear in grey (they are coloured in the original document), and together with the background watermark this might complicate the task of OCR software, if not render it impossible. :/
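To illustrate why the grey strings and the watermark are awkward for OCR, here is a minimal sketch of threshold binarization, a common clean-up step before OCR. The intensity values are made up for illustration (they are not measured from the scans), and `binarize` is a hypothetical helper, not part of any OCR tool.

```python
def binarize(pixels, threshold):
    """Map each 0-255 grayscale value to 0 (ink) or 255 (paper)."""
    return [[0 if value < threshold else 255 for value in row] for row in pixels]


# Assumed intensities, for illustration only:
# 0 = black text, 128 = grey (coloured) text, 200 = watermark, 255 = blank paper.
sample = [[0, 128, 200, 255]]

too_low = binarize(sample, 100)   # watermark gone, but the grey text is lost too
balanced = binarize(sample, 160)  # grey text kept, watermark removed
too_high = binarize(sample, 220)  # grey text kept, but the watermark becomes ink
```

If the grey text and the watermark have similar intensities in the real scans, no single threshold separates them, which is one reason manual transcription may be the safer route.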
We can start transcribing in Etherpad (http://quadpad.lqdn.fr:9000/UuiV6nSfDm) and dump it in the wiki once it's done. (It seems progress has gone further and is going faster on http://chaos.faxpad.org:9000/lqactatranscript)
Transcription is done, text online at http://werebuild.eu/wiki/index.php?title=ACTA-leak_2010-03-23
The text at http://en.swpat.org/wiki/201001_acta.pdf_as_text uses colours for additions/comments.