How to know details about how a document is indexed?

dadoonet · June 15, 2011, 4:34pm

Hi,

For debugging purposes, I would like to know how a given document has been
analyzed by ES/Lucene.

Here is my story:

I define a custom analyzer in the settings when creating my index. It is a
keyword tokenizer with a lowercase filter (named key_lowercase).
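
For readers following along: the post does not show the analyzer definition, so the following is only a guess at what such settings could look like. The name key_lowercase comes from the post; the rest of the body (a custom analyzer combining the keyword tokenizer and the lowercase filter) is an assumption.

```python
import json

# Assumed index settings: the post names the analyzer "key_lowercase" but does
# not show its definition, so this body is an illustrative guess.
index_settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "key_lowercase": {
                    "type": "custom",
                    "tokenizer": "keyword",   # emit the whole value as one token
                    "filter": ["lowercase"],  # then lowercase that token
                }
            }
        }
    }
}

# JSON body that would be sent when creating the index.
print(json.dumps(index_settings, indent=2))
```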

I index a document into this index. It has a field (prop1.prop2.prop3)
with the value "term1 term2 term3".

When I analyze this value with the _analyze API, I can see that there is
only one token, "term1 term2 term3". That's what I expect.

I try to search on this field with a wildcard like "term1*term3". No
results.

If I search for "term1", I get a result. So it seems that my document
field has been tokenized into 3 tokens.
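
The symptoms described here can be reproduced with a small model in plain Python. This is a simplified sketch, not what ES/Lucene actually runs: both analyzer functions are stand-ins written for illustration, and fnmatchcase approximates a wildcard query by matching the pattern against each whole indexed term.

```python
import re
from fnmatch import fnmatchcase

def keyword_lowercase(text):
    """Model of a keyword tokenizer plus lowercase filter: a single token."""
    return [text.lower()]

def word_split_lowercase(text):
    """Model of an analyzer that splits the value into lowercase words."""
    return [t.lower() for t in re.findall(r"\w+", text)]

def wildcard_hits(pattern, terms):
    """Return the indexed terms that the wildcard pattern matches in full."""
    return [t for t in terms if fnmatchcase(t, pattern)]

value = "term1 term2 term3"
one_token = keyword_lowercase(value)        # what the _analyze call reported
three_tokens = word_split_lowercase(value)  # what the index seems to contain

print(wildcard_hits("term1*term3", one_token))     # ['term1 term2 term3']
print(wildcard_hits("term1*term3", three_tokens))  # []
print(wildcard_hits("term1", three_tokens))        # ['term1']
print(wildcard_hits("term1", one_token))           # []
```

In this model, "term1*term3" only matches when the whole value is one term, and a plain "term1" only matches when the value was split. That is the combination reported above, which points at the field not being indexed with key_lowercase.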

That's why I would like to know if there is a way to see what ES/Lucene
has done with my document.

Any ideas? Changing the log level?

Thanks for any help

David.

kimchy · June 15, 2011, 4:48pm

There isn't a way to do it, aside from getting the mapping back and making sure it's using the analyzer you expect. It's not logged.
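
Checking the returned mapping for a nested field can be scripted. The sketch below assumes the mapping response is a dict keyed by type name with nested "properties" objects; the helper field_analyzer and the sample mapping are hypothetical and only illustrate the lookup for a path like prop1.prop2.prop3.

```python
def field_analyzer(mapping, doc_type, path):
    """Return the analyzer configured for a dotted field path, or None."""
    node = mapping[doc_type]
    for name in path.split("."):
        node = node.get("properties", {}).get(name)
        if node is None:
            return None  # field not present in the mapping
    return node.get("analyzer")

# Hypothetical mapping, shaped like an assumed GET .../_mapping response.
mapping = {
    "doctype": {
        "properties": {
            "prop1": {
                "properties": {
                    "prop2": {
                        "properties": {
                            "prop3": {"type": "string", "analyzer": "key_lowercase"}
                        }
                    }
                }
            }
        }
    }
}

print(field_analyzer(mapping, "doctype", "prop1.prop2.prop3"))  # key_lowercase
print(field_analyzer(mapping, "doctype", "prop1.missing"))      # None
```

A result of None for a field that exists would suggest no analyzer is set on it explicitly, so the index default would apply.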


dadoonet · June 15, 2011, 7:01pm

Thanks Shay.

As far as I can see, my mapping seems to be ok.

It seems that my problem doesn’t occur on first-level properties of my document (such as prop1) but does occur on sublevel properties (such as prop1.prop2.prop3).

I will run new tests tomorrow and gist them.

Cheers


dadoonet · June 16, 2011, 6:58pm

OK, the problem is solved now.

Sharing this for others:

An old JSON mapping file was in my config/mappings dir.

So when I put a mapping with the ES API, my new mapping was not "active".

But it seems that when I ask for the current mapping with the ES API (curl -X GET http://localhost:9200/index/doctype/_mapping), ES returns the mapping I tried to put and not the active mapping.

I write "seems" because I did not test any further once I found my issue.
I will try to reproduce it in the next few days, and if the problem occurs again, I will gist it.

Thanks for reading
David

kimchy · June 16, 2011, 7:29pm

When you ask for the mapping of a type, Elasticsearch returns the mapping that is actually used for it (the one it built). It won't return something that it's not using.

