On 16/02/2015 14:52, Marria wrote:
Hi David,
Reading the code made me realise that I didn't explain well what I need.
What I mean by automatic extraction is not retrieving the keywords I
have already entered in my metadata, but an intelligent extraction from
the text itself, like Alchemy, which is based on machine learning:
http://www.alchemyapi.com/products/demo/alchemylanguage/
I think this is not possible with Elasticsearch, because it is not the
purpose of this tool.
Thanks a lot, David, for your generous help.
Hi Marria,
Firstly, why do you need to extract the keywords? Are you trying to
extract entities (e.g. company names, people, places), tag for
sentiment, or do term expansion (automatically add synonyms or related
terms)?
We've used Stanford NLP successfully for entity extraction and basic
sentiment tagging: http://nlp.stanford.edu/
Python NLTK is another option: http://www.nltk.org/
You're right, this isn't a core function of Elasticsearch, but rather
something you would do at index time to enhance the data before you
index it, or at query time to enhance a query before you use it on the
index. You should also bear in mind that most of these tools only have a
certain success rate, may need training and may have a significant
overhead. Certainly take with a very large pinch of salt any claims of
'intelligence' especially from closed-source vendors.
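
To make the index-time idea concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: plain TF-IDF scoring stands in for a real NLP toolkit such as Stanford NLP or NLTK, the stopword list is a toy one, and the function names and the "keywords" field are hypothetical. The enriched dictionary is what you would then hand to whatever indexing call you already use.

```python
import math
import re
from collections import Counter

# Toy stopword list (assumption); a real one would be much longer.
STOPWORDS = {"the", "and", "for", "with", "that", "this", "are", "not"}

def tokenize(text):
    """Lowercase, keep alphabetic runs, drop stopwords and tokens under 3 letters."""
    return [t for t in re.findall(r"[a-z]+", text.lower())
            if len(t) > 2 and t not in STOPWORDS]

def extract_keywords(texts, top_n=3):
    """Return the top_n TF-IDF terms for each text (hypothetical helper)."""
    tokenized = [tokenize(t) for t in texts]
    n_docs = len(tokenized)
    # Document frequency: number of documents in which each term occurs.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    results = []
    for tokens in tokenized:
        tf = Counter(tokens)
        scores = {term: count * math.log(n_docs / df[term])
                  for term, count in tf.items()}
        # Highest score first; ties broken alphabetically for stable output.
        ranked = sorted(scores, key=lambda term: (-scores[term], term))
        results.append(ranked[:top_n])
    return results

def enrich(doc, keywords):
    """Copy the document and add a 'keywords' field (hypothetical field name)."""
    enriched = dict(doc)
    enriched["keywords"] = keywords
    return enriched

if __name__ == "__main__":
    texts = [
        "Elasticsearch indexes PDF attachments for search. Search is fast.",
        "Stanford NLP extracts entities such as people and places.",
        "Sentiment tagging needs training data and training time.",
    ]
    for text, keywords in zip(texts, extract_keywords(texts)):
        print(enrich({"content": text}, keywords))
```

Swapping extract_keywords for an entity extractor or sentiment tagger leaves the rest unchanged: the enrichment happens before the document reaches the index, so Elasticsearch only ever sees an ordinary extra field.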
HTH
Cheers
Charlie
On Monday, 16 February 2015 14:00:38 UTC+1, Marria wrote:
Hi all,
I have started using Elasticsearch to index my corpus of PDF files. I
succeeded in indexing the PDFs as attachments (base64), and my
search queries on the content work well, but I couldn't find how to
automatically extract keywords from these files in Elasticsearch. Is
it possible to do that with Elasticsearch or not?
Could anybody help with relevant links or advice?
Thanks a lot.
--
You received this message because you are subscribed to the Google
Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send
an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/0eea7804-e37b-494f-8c7f-4a70a723ff4a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
Charlie Hull
Flax - Open Source Enterprise Search
tel/fax: +44 (0)8700 118334
mobile: +44 (0)7767 825828
web: www.flax.co.uk