Using Acrobat SDK to read hyphenated text


I'm using the GetText method to retrieve text, word by word, from a PDF. I'm running into two problems, both of which relate to GetText treating hyphens as punctuation rather than as text.

 

If the source text in the document contains a date in the form 30-jun-2013, GetText returns 30jun2013.

 

If the source text contains a negative number, for example -90.20, GetText returns 90.20. Similarly, source text of -$90.20 is returned as two text items: first $, then 90.20.

 

I'm using VBA within an Access database to read the PDFs and populate data in the database tables.

 

Does anyone know how to either set an option that makes the SDK treat hyphens as part of a word, or an alternative to the GetText routine that would accomplish the same thing?

The VB APIs don't provide access to changing the word definition. That can be done using the C/C++-based plugin APIs. Alternatively, have Acrobat save the PDF as text/XML/RTF/another document format and parse that.
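As a rough illustration of the "save as text and parse that" route, here is a minimal sketch in Python (not VBA, and not part of the Acrobat SDK). It assumes the PDF has already been exported to plain text and that the export keeps the hyphens and minus signs; the function names and the two patterns are hypothetical and only cover the date and amount formats mentioned in the question.

```python
import re
from datetime import date
from decimal import Decimal

# Hypothetical lookup for three-letter English month abbreviations.
MONTHS = {m: i for i, m in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# Assumed formats, taken from the examples in the question:
# dates like 30-jun-2013 and amounts like -90.20 or -$90.20.
DATE_RE = re.compile(r"^(\d{1,2})-([A-Za-z]{3})-(\d{4})$")
AMOUNT_RE = re.compile(r"^(-?)\$?(\d+(?:,\d{3})*(?:\.\d+)?)$")


def parse_token(token):
    """Return a date, a Decimal, or the token unchanged."""
    m = DATE_RE.match(token)
    if m:
        day, mon, year = m.groups()
        month = MONTHS.get(mon.lower())
        if month is not None:
            try:
                return date(int(year), month, int(day))
            except ValueError:
                return token  # e.g. 31-jun-2013 is not a real date
        return token
    m = AMOUNT_RE.match(token)
    if m:
        sign, digits = m.groups()
        # Keep the leading minus sign that word-by-word extraction dropped.
        return Decimal(sign + digits.replace(",", ""))
    return token


def parse_exported_text(text):
    # Split on whitespace only, so hyphens stay attached to their tokens.
    return [parse_token(t) for t in text.split()]
```

Splitting on whitespace instead of relying on the word finder is the point of the sketch: the hyphenated date and the signed amount each arrive as a single token. Note that any bare number, such as a year on its own, would also be returned as a Decimal under these patterns.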


