Query Time Analysis: Are field values also analyzed?

Karan_Verma · January 29, 2014, 4:21am

Hi

Let's say I have indexed a field person_name as a string, with a custom
analyzer. person_name is stored in the index in one of the documents as:
"Harry Greenberg"

I make a match query on the field: "harry g"

I have a custom edgeNGram tokenizer which breaks the query down as follows:

    {
      "tokens": [
        {
          "token": "h",
          "start_offset": 0,
          "end_offset": 1,
          "type": "word",
          "position": 1
        },
        {
          "token": "ha",
          "start_offset": 0,
          "end_offset": 2,
          "type": "word",
          "position": 2
        },
        {
          "token": "har",
          "start_offset": 0,
          "end_offset": 3,
          "type": "word",
          "position": 3
        },
        {
          "token": "harr",
          "start_offset": 0,
          "end_offset": 4,
          "type": "word",
          "position": 4
        },
        {
          "token": "harry",
          "start_offset": 0,
          "end_offset": 5,
          "type": "word",
          "position": 5
        },
        {
          "token": "g",
          "start_offset": 6,
          "end_offset": 7,
          "type": "word",
          "position": 6
        }
      ]
    }

Will all of these tokens be matched against the full string "Harry Greenberg",
or will person_name also be broken down as defined by my custom analyzer?

If not, how can I make it so that it will also be broken down? Will that make
the search significantly slower?

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/af5354e7-5f7b-4b6e-96e6-f5e81df825db%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

jsbonline2006 · January 29, 2014, 6:51am

Hi Karan,

Can you please tell us what mapping you have applied?

If you are applying EdgeNGram in the query-time analyzer, then your search
query "harry g" will be tokenized as per your custom analyzer.

Regards,
Jayesh Bhoyar


Binh_Ly · January 29, 2014, 12:55pm

Karan,

If you set person_name's analyzer to your custom one, analysis will
generally be done at both query time and index time. You can also set
different analyzers for index time and search time, in which case the text
is analyzed differently when you index and when you search. See this for
more details:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-root-object-type.html#_index_search_analyzers
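
To make the index-time versus search-time distinction concrete, here is a rough Python sketch. It is not Elasticsearch code, only a simulation. The function names `edge_ngram_analyze` and `simple_analyze` and the gram sizes (1 to 20) are assumptions for illustration; the real behavior depends on how your custom analyzer is configured.

```python
def edge_ngram_analyze(text, min_gram=1, max_gram=20):
    """Hypothetical stand-in for a custom analyzer: lowercase the text,
    split on whitespace, and emit the leading-edge n-grams of each word."""
    tokens = []
    for word in text.lower().split():
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n])
    return tokens


def simple_analyze(text):
    """Hypothetical separate search-time analyzer: lowercase and split only."""
    return text.lower().split()


# Index time: the field value is analyzed and the resulting terms are indexed.
indexed_terms = set(edge_ngram_analyze("Harry Greenberg"))

# Query time, case 1: the same analyzer is applied to the query string.
same_analyzer_query = edge_ngram_analyze("harry g")

# Query time, case 2: a different search analyzer is applied.
different_analyzer_query = simple_analyze("harry g")

print(same_analyzer_query)       # ['h', 'ha', 'har', 'harr', 'harry', 'g']
print(different_analyzer_query)  # ['harry', 'g']

# In both cases the query terms are compared with the indexed terms,
# not with the original string "Harry Greenberg".
print(all(t in indexed_terms for t in same_analyzer_query))       # True
print(all(t in indexed_terms for t in different_analyzer_query))  # True
```

In this simulation both setups find the document, but the second one looks up only two query terms instead of six, which is one reason people sometimes pick a simpler search analyzer for n-gram fields.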


Karan_Verma · January 29, 2014, 10:22pm

Thanks for your answer, Binh.

My mapping is:

    "person_name" : {
      "type" : "string",
      "analyzer" : "person_name_analyzer"
    }

From your explanation, it looks like ES will analyze both the query string
and the stored value in the document. That is exactly what I want. Is there
a way to test this? I was having problems with a much more complex query,
where I thought the tokens were being matched against the full string value
of person_name stored in the document.


--
Best,
Karan


Binh_Ly · January 29, 2014, 10:45pm

Karan,

It should work without a problem. A query like this should match
"Harry Greenberg":

    {
      "query": {
        "match": {
          "person_name": {
            "query": "harry g",
            "operator": "AND"
          }
        }
      }
    }
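
As a rough way to see what the "AND" operator changes, here is a small Python simulation. It is a sketch, not Elasticsearch itself: `edge_ngram_analyze` and `match` are hypothetical helpers, the gram sizes (1 to 20) are assumed, and scoring is ignored. A match query defaults to OR, where one shared term is enough for a hit; with AND every query term has to be present in the field's indexed terms.

```python
def edge_ngram_analyze(text, min_gram=1, max_gram=20):
    """Hypothetical analyzer: lowercase, split on whitespace, and emit
    the leading-edge n-grams of each word."""
    tokens = []
    for word in text.lower().split():
        for n in range(min_gram, min(max_gram, len(word)) + 1):
            tokens.append(word[:n])
    return tokens


def match(field_value, query, operator="OR"):
    """Simulate a match query where the same analyzer is used for the
    indexed field value and for the query string."""
    indexed_terms = set(edge_ngram_analyze(field_value))
    hits = [term in indexed_terms for term in edge_ngram_analyze(query)]
    return all(hits) if operator == "AND" else any(hits)


print(match("Harry Greenberg", "harry g", "AND"))  # True
print(match("Henry Ford", "harry g", "OR"))        # True  (shares the term "h")
print(match("Henry Ford", "harry g", "AND"))       # False ("ha" is not indexed)
```

With edge n-grams applied to the query as well, OR matches almost any name that starts with the same letter, which is why AND is the safer choice for this kind of prefix search.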
