You can look at analysis plugins: http://www.elasticsearch.org/guide/en/elasticsearch/reference/master/modules-plugins.html#analysis-plugins
They provide analyzers, tokenizers, …
You can probably copy one of these projects and add your own custom tokenizer.
For example: https://github.com/elasticsearch/elasticsearch-analysis-stempel
BTW, built-in tokenizers are here: http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-tokenizers.html
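On the index-time versus search-time question: an analyzer (a tokenizer plus optional token filters) is defined in the index settings and then assigned to a field in the mapping. It runs when documents are indexed and, unless a separate search analyzer is configured, on the query text as well. Below is a minimal sketch of such a settings body, built as a Python dict. The analyzer name `my_analyzer` and the helper `build_analysis_settings` are made up for illustration, and the exact mapping syntax depends on your Elasticsearch version, so check the docs for the release you run.

```python
import json


def build_analysis_settings(tokenizer_name="whitespace"):
    """Build an index-settings body declaring one custom analyzer.

    "my_analyzer" is a hypothetical name chosen for this example.
    "whitespace" and "lowercase" are built-in components.
    """
    return {
        "settings": {
            "analysis": {
                "analyzer": {
                    "my_analyzer": {
                        "type": "custom",
                        # Tokenizer referenced by its registered name.
                        "tokenizer": tokenizer_name,
                        # Token filters applied after tokenization.
                        "filter": ["lowercase"],
                    }
                }
            }
        }
    }


if __name__ == "__main__":
    # Print the JSON body that would be sent when creating the index.
    print(json.dumps(build_analysis_settings(), indent=2))
```

A tokenizer provided by a plugin would go in the same `tokenizer` slot, under whatever name the plugin registers it with.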
HTH
--
David Pilato | Technical Advocate | Elasticsearch.com
@dadoonet | @elasticsearchfr
On 5 February 2014 at 23:54:13, Itamar Syn-Hershko (itamar@code972.com) wrote:
Lital, why do you need Elasticsearch for this? It is going to be way easier for you to use Lucene directly.
--
Itamar Syn-Hershko
http://code972.com | @synhershko
Freelance Developer & Consultant
Author of RavenDB in Action
On Wed, Feb 5, 2014 at 4:02 PM, Lital litalh@liveperson.com wrote:
Hi,
We would like to use Elasticsearch to generate an idf score for each token (for the tf-idf algorithm).
What built-in tokenizers does Elasticsearch provide? Should we specify which tokenizer to use at index time (when inserting the data) or when searching?
Is it also possible to make Elasticsearch use a different tokenizer that I implemented myself?
Thanks,
Lital
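For reference, the per-token idf value asked about above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: it uses the textbook form idf(t) = ln(N / df(t)) and a whitespace-plus-lowercase stand-in for a real tokenizer. The function names `tokenize` and `idf_scores` are hypothetical, and Lucene's similarity uses a smoothed variant, so the numbers Elasticsearch reports will not match these exactly.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real tokenizer)."""
    return text.lower().split()


def idf_scores(docs):
    """Return {token: idf} with idf(t) = ln(N / df(t)).

    N is the number of documents and df(t) is the number of
    documents that contain token t at least once.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        # Use a set so each document counts a token at most once.
        doc_freq.update(set(tokenize(doc)))
    return {tok: math.log(n_docs / df) for tok, df in doc_freq.items()}


if __name__ == "__main__":
    sample = ["the quick fox", "the lazy dog", "the fox sleeps"]
    for token, score in sorted(idf_scores(sample).items()):
        print(f"{token}\t{score:.3f}")
```

Swapping in a different `tokenize` function changes which tokens get a score, which is why the tokenizer choice matters for this use case.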
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/58047432-3f73-4a55-84cd-20051ff8738f%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.