Add pre-analyzed documents in Elasticsearch

Cristiano_Lima · May 1, 2013, 1:08pm

Hi All,

I have some text that has been pre-analyzed (tokenized, stemmed, stopword
filtered, and the term frequency has been counted). So, for the text
"Example text has twice the word twice", I have "(example, 1), (text, 1),
(twice, 2), (word, 1)".
How can I add this to Elasticsearch?
I could just repeat the tokens and have them re-analyzed, or I could create
my own analyzer for this format, or I could try to directly access the term
vectors.
I assume that The Right Way would be the last option, but I don't know how
to do it. Any ideas?

[]`s
Cristiano.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

taras · May 1, 2013, 6:58pm

If you add text the regular way, you get extra metadata such as token
positions. So if you only care about the term vector you outlined, you
might want basic term queries rather than regular text search. If you intend
to do text searches, you'll need a matching query-time analyzer.
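
To make the term-query suggestion concrete, here is a minimal sketch of a request body that looks up already-analyzed terms directly, so nothing is re-analyzed at query time. The field name `body` and the helper `term_lookup_query` are hypothetical; only the `bool`/`should`/`term` structure comes from the Elasticsearch query DSL.

```python
import json


def term_lookup_query(field, terms):
    """Build a bool query with one exact `term` clause per pre-analyzed term.

    `term` clauses are not analyzed, so the values must already match the
    tokens stored in the index (lowercased, stemmed, and so on).
    """
    return {
        "query": {
            "bool": {
                "should": [{"term": {field: t}} for t in terms],
            }
        }
    }


# "body" is an assumed field name used only for illustration.
query = term_lookup_query("body", ["twice", "word"])
print(json.dumps(query, indent=2))
```

The printed JSON could be sent as the body of a search request; how well it matches depends on the indexed tokens being exactly the pre-analyzed terms.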

Without fully understanding your use case, I sense you're better off
writing your own tokenizer and not applying any fancy filters to it.
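
One possible way to approximate this without writing a tokenizer plugin is sketched below. It swaps in the built-in `whitespace` tokenizer for a hand-written one and combines it with the "repeat the tokens" idea from the question: each term is repeated according to its count and the result is indexed with an analyzer that has no token filters. The analyzer name `preanalyzed`, the field name `body`, and the helper `expand_terms` are assumptions for illustration, and the mapping layout follows recent typeless Elasticsearch versions, so older releases would need a different mapping syntax.

```python
def expand_terms(term_freqs):
    """Turn (term, frequency) pairs into a whitespace-separated string.

    Each term is repeated `frequency` times so that the indexed term
    frequency matches the pre-computed count. Token positions in the
    result are synthetic and do not reflect the original text.
    """
    return " ".join(" ".join([term] * freq) for term, freq in term_freqs)


# Index settings: a custom analyzer that only splits on whitespace and
# applies no token filters (no lowercasing, stemming, or stopword removal).
INDEX_BODY = {
    "settings": {
        "analysis": {
            "analyzer": {
                "preanalyzed": {  # hypothetical analyzer name
                    "type": "custom",
                    "tokenizer": "whitespace",
                }
            }
        }
    },
    "mappings": {
        "properties": {
            "body": {"type": "text", "analyzer": "preanalyzed"}  # assumed field
        }
    },
}

pairs = [("example", 1), ("text", 1), ("twice", 2), ("word", 1)]
document = {"body": expand_terms(pairs)}
print(document)  # {'body': 'example text twice twice word'}
```

Because the same analyzer would apply at search time by default, a text query on `body` would also be split only on whitespace, which keeps index-time and query-time tokens consistent as long as the query terms are pre-analyzed the same way.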
