
How to Pull Data from a PDF

Tools for pulling data from a PDF:
http://documentcloud.github.com/docsplit/
https://github.com/Erol/yomu
Yomu is easier to use but less flexible. Docsplit is likely to handle a wider range of PDFs.
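
A minimal sketch of text extraction with both gems, assuming they are installed along with their external dependencies (Yomu wraps Apache Tika and needs Java; Docsplit shells out to command-line PDF tools). The helper names normalize_text, pdf_text_yomu, and pdf_text_docsplit are made up for this example, and the Docsplit output filename is an assumption based on its default behavior of writing one .txt file per document.

```ruby
# Collapse runs of whitespace so the extracted text is easier to search.
# (Hypothetical helper, not part of either gem.)
def normalize_text(raw)
  raw.to_s.gsub(/\s+/, " ").strip
end

# Yomu: a single call returns the document text as a string.
def pdf_text_yomu(path)
  require "yomu" # loaded lazily; assumes `gem install yomu` and Java
  normalize_text(Yomu.new(path).text)
end

# Docsplit: writes the text to a file in out_dir, which we then read back.
# Assumes the output is named after the PDF, e.g. report.pdf -> report.txt.
def pdf_text_docsplit(path, out_dir)
  require "docsplit" # loaded lazily; assumes `gem install docsplit`
  Docsplit.extract_text(path, :output => out_dir)
  txt_path = File.join(out_dir, File.basename(path, ".*") + ".txt")
  normalize_text(File.read(txt_path))
end
```

Trying Yomu first and falling back to Docsplit when the text comes back empty is one reasonable way to combine them.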
http://anemone.rubyforge.org/
Anemone is a lightweight web spider.
All three are Ruby libraries.
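
As a rough sketch of how the spider could feed the extractors, the code below crawls a site with Anemone and collects links that point at PDFs. The helper names pdf_url? and find_pdf_links are made up for this example, and it assumes the anemone gem is installed. Anemone only reports links within the starting domain, so PDFs hosted elsewhere would be missed.

```ruby
require "uri"

# True when the URL's path ends in .pdf (case-insensitive, query string ignored).
# (Hypothetical helper, not part of Anemone.)
def pdf_url?(url)
  File.extname(URI.parse(url.to_s).path.to_s).downcase == ".pdf"
end

# Crawl from start_url and return the distinct PDF links seen on crawled pages.
def find_pdf_links(start_url)
  require "anemone" # loaded lazily; assumes `gem install anemone`
  found = []
  Anemone.crawl(start_url) do |anemone|
    anemone.on_every_page do |page|
      # page.links holds the in-domain links found on this page
      page.links.each { |link| found << link.to_s if pdf_url?(link) }
    end
  end
  found.uniq
end
```

Each returned URL can then be downloaded and passed to Yomu or Docsplit.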

A SQL database can also full-text index the PDF contents:
http://www.adobe.com/support/downloads/detail.jsp?ftpID=4025