I am struggling to create the following application:
Extract specific data
from thousands of policies
- Searchable PDFs - can get the full text directly
- Image PDFs - using Tesseract OCR to get the full text
Feed the full policy text to ES and store the following indices:
- Policy # - String
- Premium - $
and store them so that, at the end of the day, I have a table in, say, Oracle:
Policy #    Premium
12345       $ 2314
23451       $ 4231
And so on...
There is a lot of analytics that I can do with this table (there are more
fields I am expecting to extract, of course, ~7-10 fields in total).
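For illustration, here is a minimal Python sketch of the extraction step. It assumes the policy text contains labels such as "Policy #" and "Premium" followed by the value; the regular expressions and the extract_fields helper are hypothetical and would almost certainly need tuning against real policies, especially OCR output.

```python
import re
from decimal import Decimal

# Hypothetical label patterns; real policy wording (and OCR noise) will vary.
POLICY_RE = re.compile(r"Policy\s*(?:#|No\.?|Number)\s*:?\s*(\d+)", re.IGNORECASE)
PREMIUM_RE = re.compile(r"Premium\s*:?\s*\$\s*([\d,]+(?:\.\d{1,2})?)", re.IGNORECASE)


def extract_fields(full_text):
    """Return the policy number (str) and premium (Decimal), or None if not found."""
    policy = POLICY_RE.search(full_text)
    premium = PREMIUM_RE.search(full_text)
    return {
        "policy_number": policy.group(1) if policy else None,
        # Strip thousands separators before converting to an exact decimal.
        "premium": Decimal(premium.group(1).replace(",", "")) if premium else None,
    }
```

Returning None for a missing field, rather than raising, makes it easy to flag policies that need manual review.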
We can get the full text and we can feed it to ES in one field.
We are kind of on our way to creating the indices we want.
I just don't know if there is a way to get the stored index data (label,
value) out of ES into a structured DB table.
If you have experience attempting something like this, I'd love to hear
about the feasibility/challenges of such an attempt.
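
One possible shape for the ES-to-table step is sketched below in Python, under several assumptions. It uses sqlite3 from the standard library as a stand-in for Oracle. The export_hits helper, the policies table, and the policy_number and premium field names are all hypothetical. The hits are hard-coded sample data shaped like ES search hits; in practice they could come from something like the scan helper in the official Python client (elasticsearch.helpers.scan).

```python
import sqlite3


def export_hits(hits, conn):
    """Write the extracted fields of each ES-style hit into a relational table.

    `hits` is any iterable of dicts shaped like ES search hits, i.e.
    {"_source": {"policy_number": ..., "premium": ...}} (assumed field names).
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS policies ("
        "policy_number TEXT PRIMARY KEY, premium NUMERIC)"
    )
    rows = (
        (hit["_source"]["policy_number"], hit["_source"]["premium"])
        for hit in hits
    )
    # INSERT OR REPLACE is SQLite syntax; Oracle would need MERGE instead.
    conn.executemany(
        "INSERT OR REPLACE INTO policies (policy_number, premium) VALUES (?, ?)",
        rows,
    )
    conn.commit()


# Stand-in data using the sample rows from the table above.
sample_hits = [
    {"_source": {"policy_number": "12345", "premium": 2314}},
    {"_source": {"policy_number": "23451", "premium": 4231}},
]
conn = sqlite3.connect(":memory:")
export_hits(sample_hits, conn)
print(conn.execute(
    "SELECT policy_number, premium FROM policies ORDER BY policy_number"
).fetchall())
```

Keying the table on the policy number means the export can be re-run without creating duplicate rows.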