Add pre-analyzed documents in Elasticsearch

Hi All,

I have some text that has been pre-analyzed (tokenized, stemmed, stopword
filtered, and the term frequency has been counted). So, for the text
"Example text has twice the word twice", I have "(example, 1), (text, 1),
(twice, 2), (word, 1)".
How can I add this to Elasticsearch?
I could just repeat the tokens and have them re-analyzed, or I could create
my own analyzer for this format, or I could try to directly access the term
vectors.
I assume that The Right Way would be the last option, but I don't know how
to do it. Any ideas?

[]'s
Cristiano.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

If you're adding text the regular way, then you get extra metadata such as
position. So if you care only about the term vector as you outlined, then you
might want basic term queries rather than regular text search. If you intend
to do text searches, then you'll need a matching query analyzer.
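
To make the term query suggestion concrete, here is a minimal sketch of the
request body for one. A term query looks up the token exactly as given, with
no analysis at query time, so it fits terms that are already stemmed. The
field name "body" and the helper name are made up for illustration.

```python
import json


def term_query(field, token):
    """Build a term query body; the token is matched as-is, without analysis."""
    return {"query": {"term": {field: token}}}


# "body" is a hypothetical field name; "twice" is a term from the example text.
query = term_query("body", "twice")
print(json.dumps(query))
```

The resulting JSON would be sent as the body of a search request against the
index that holds the pre-analyzed documents.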

Without fully understanding your use case, I sense you're better off
writing your own tokenizer and just not applying any fancy filters to it.
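
As a rough sketch of the simplest variant of this, the (term, count) pairs
could be expanded into a whitespace-separated string and indexed into a field
that uses the built-in whitespace analyzer, which splits on whitespace and
applies no stemming or stopword filtering. This is the "repeat the tokens"
option rather than a custom tokenizer. The field name "body" and the helper
name are assumptions, and the mapping layout shown assumes a recent
Elasticsearch version; older versions used a "string" field type nested under
a type name.

```python
def expand_terms(term_freqs):
    """Repeat each term by its count and join with single spaces."""
    tokens = []
    for term, freq in term_freqs:
        tokens.extend([term] * freq)
    return " ".join(tokens)


# Hypothetical index body: the "body" field is split on whitespace only.
INDEX_BODY = {
    "mappings": {
        "properties": {
            "body": {"type": "text", "analyzer": "whitespace"}
        }
    }
}

pairs = [("example", 1), ("text", 1), ("twice", 2), ("word", 1)]
document = {"body": expand_terms(pairs)}
print(document)  # {'body': 'example text twice twice word'}
```

Token positions produced this way are artificial, so phrase and proximity
queries on the field would not be meaningful, and terms that themselves
contain whitespace would be split.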

On Wednesday, May 1, 2013 6:08:53 AM UTC-7, Cristiano Lima wrote:

