Custom tokenizer in .NET language, push already tokenized text into ES

Hi All,

I come from the .NET world and already have a custom tokenizer that is very
specific to my knowledge domain. I understand how to use my own
analyzer/tokenizer with standalone Lucene.NET. However, I have recently
discovered ElasticSearch and its great distributed features. They are
compelling enough that I will either have to port my tokenizer to Java or
find some trick to reuse it.

Could you please tell me whether it is possible to feed ElasticSearch with
already tokenized fields from a different application? I could build a JSON
document in which all fields have already been tokenized by my own .NET
application.
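
To make the idea concrete, here is a minimal sketch of one way such a document could be built: the tokens are joined with single spaces, on the assumption that the target field is mapped to ElasticSearch's built-in `whitespace` analyzer, which would split them back into the same tokens. The class and method names (`PreTokenizedDocument`, `joinTokens`, `toJson`) and the field name `body` are hypothetical, and the sketch is in Java only for illustration; the real producer would be the .NET application. It only works if no token contains whitespace.

```java
import java.util.List;

// Hypothetical helper, not part of any ElasticSearch client library.
public class PreTokenizedDocument {

    // Joins tokens with single spaces so a whitespace-based analyzer
    // can recover exactly the same tokens on the server side.
    static String joinTokens(List<String> tokens) {
        for (String t : tokens) {
            if (t.isEmpty() || t.chars().anyMatch(Character::isWhitespace)) {
                throw new IllegalArgumentException(
                    "token must be non-empty and contain no whitespace: '" + t + "'");
            }
        }
        return String.join(" ", tokens);
    }

    // Minimal JSON string escaping (quotes, backslashes, control characters).
    static String escapeJson(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                default:
                    if (c < 0x20) sb.append(String.format("\\u%04x", (int) c));
                    else sb.append(c);
            }
        }
        return sb.toString();
    }

    // Builds a one-field JSON document whose value is the joined token string.
    static String toJson(String field, List<String> tokens) {
        return "{\"" + escapeJson(field) + "\":\"" + escapeJson(joinTokens(tokens)) + "\"}";
    }

    public static void main(String[] args) {
        // Tokens as they might come out of a domain-specific tokenizer.
        System.out.println(toJson("body", List.of("C#", "ASP.NET", "x86-64")));
    }
}
```

With this approach only the token text survives; any custom positions, offsets or payloads produced by the original tokenizer would be lost.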

If that is not possible, I could write a very simple Analyzer in Java that
just consumes the JSON structure from the previous step and transforms the
already tokenized fields into the internal format.
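
The core of such a pass-through analyzer could look roughly like the sketch below. It assumes the .NET side sends each field as one string with tokens separated by a delimiter that never occurs inside a token (`|` here, an arbitrary choice). The names `PreTokenizedSplitter` and `Token` are hypothetical. A real implementation would put this logic in a subclass of Lucene's `Tokenizer` and register it with ElasticSearch as a plugin; that wiring is omitted here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the splitting step only; plain Java, no Lucene dependency.
public class PreTokenizedSplitter {

    // A term plus its character offsets in the original field value.
    static final class Token {
        final String term;
        final int startOffset;
        final int endOffset;

        Token(String term, int startOffset, int endOffset) {
            this.term = term;
            this.startOffset = startOffset;
            this.endOffset = endOffset;
        }

        @Override
        public String toString() {
            return term + "[" + startOffset + "," + endOffset + ")";
        }
    }

    // Splits on the delimiter without any further analysis, so tokens
    // reach the index exactly as the external tokenizer produced them.
    // Empty segments (two delimiters in a row) are skipped.
    static List<Token> split(String value, char delimiter) {
        List<Token> tokens = new ArrayList<>();
        int start = 0;
        for (int i = 0; i <= value.length(); i++) {
            if (i == value.length() || value.charAt(i) == delimiter) {
                if (i > start) {
                    tokens.add(new Token(value.substring(start, i), start, i));
                }
                start = i + 1;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // Unlike whitespace splitting, a token may contain spaces here.
        System.out.println(split("new york|city||hall", '|'));
        // prints [new york[0,8), city[9,13), hall[15,19)]
    }
}
```

Whatever delimiter is chosen, the same pass-through behaviour would also be needed at query time, otherwise query terms would be analyzed differently from the indexed ones.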

Do you think this is the simplest good-enough way to marry a .NET
tokenizer with ElasticSearch, or are there better options?

Thanks!
Victor

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

The best route is really to port your .NET code to Java and run the
analyzer normally within ES. Don't mess with the internals too much.

On Thu, May 2, 2013 at 1:48 AM, Victor B. vbaybekov@gmail.com wrote:

