Just a warning to others who may be hoping to extract data: PDF is a container, not a format. If the original document does not contain actual text, as opposed to bitmapped images of text or possibly even uglier things than I can imagine, nothing other than OCR can help you.
On top of that, in my sad experience there's no guarantee that apps which create PDF docs all behave the same, so the data in your table may or may not be read out in the desired order (as a result of the way the doc was built). Be cautious.
Probably better to make a couple grad students transcribe the data for you. They're cheap :-)
So... this gets me close even on a fairly complex table.
Download a sample PDF (linked as "bmi pdf") and save it as bmi_tbl.pdf:
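    library(tm)

    # Build a reader that calls pdftotext with -layout to preserve column alignment
    pdf <- readPDF(PdftotextOptions = "-layout")

    # Read the sample PDF into a text document
    dat <- pdf(elem = list(uri='bmi_tbl.pdf'), language='en', id='id1')

    # Collapse each run of spaces into a comma, then parse the result as CSV
    dat <- gsub(' +', ',', dat)
    out <- read.csv(textConnection(dat), header=FALSE)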
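To see what the space-to-comma step does without needing a PDF at all, here is a minimal sketch in base R. The three sample lines are made up for illustration; they only stand in for the kind of column-aligned text that `-layout` produces and are not taken from the BMI table.

```r
# Hypothetical sample lines standing in for layout-preserved PDF text.
txt <- c("Height   100   105   110",
         "58       21    22    23",
         "59       20    21    22")

# Collapse each run of spaces into a single comma, as in the snippet above.
csv <- gsub(" +", ",", txt)

# Parse the comma-separated lines into a data frame with default column names.
out <- read.csv(textConnection(csv), header = FALSE)
print(out)
```

Two things to watch for with this approach: a line that starts with spaces gets a leading comma and therefore an empty first field, and any cell whose text contains a space is split into separate columns. That is probably why it only gets "close" on a complex table.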