Mechanism of internal search with multiple indices


(golchhamohit) #1

Hi,

  When a query is supplied with multiple indices (having the same
structure) and multiple types, along with some data to be searched, how does
it work internally? Does it do m*n comparisons, where m is the data list
to be searched and n is the number of indices, or is there some other
mechanism?
Can you please explain the mechanism involved? I reckon there is
some other way it works internally. Thanks.

--

Thanks & Regards,
Mohit Golchha

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/CAGXH0MgqvpPwC2LC-%2BzMr5K%3DpBDX39NSt4P3bQmz0U1Q2wXUmg%40mail.gmail.com.
For more options, visit https://groups.google.com/d/optout.


(Clinton Gormley) #2

Hi Mohit

All documents stored in a single index are stored at the same "level",
regardless of their type. The "_type" is just a hidden field in each
document. So if you do a search like:

GET /index_one,index_two/_search
{ "query": { "match": { "field_foo": "some search terms" }}}

then it queries each shard in index_one and index_two and reduces the
results. For each index, it looks for "field_foo" in all of the types
defined in that index and uses the mapping for the first "field_foo" that
it finds. (This is why fields with the same name but in different types
should be mapped in the same way).
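
The "query each shard and reduce the results" step can be pictured with a toy model. This is only a sketch under stated assumptions, not how Elasticsearch is implemented: the shard contents, the term-frequency score and the names search_shard and search_indices are all made up for illustration, and a real shard consults an inverted index rather than scanning its documents.

```python
import heapq

def search_shard(shard_docs, term, size):
    """Return one shard's local top `size` hits as (score, doc_id) pairs."""
    hits = []
    for doc_id, text in shard_docs.items():
        score = text.split().count(term)  # toy score: raw term frequency
        if score > 0:
            hits.append((score, doc_id))
    return heapq.nlargest(size, hits)

def search_indices(indices, term, size=3):
    """Fan the query out to every shard of every index, then merge the results."""
    per_shard = [
        search_shard(shard, term, size)
        for shards in indices.values()
        for shard in shards
    ]
    # The "reduce" step: keep the global top `size` across all shard results.
    return heapq.nlargest(size, (hit for hits in per_shard for hit in hits))

# Hypothetical data: two indices with two shards each.
indices = {
    "index_one": [{"a1": "foo bar foo"}, {"a2": "bar baz"}],
    "index_two": [{"b1": "foo"}, {"b2": "foo foo foo"}],
}
top_hits = search_indices(indices, "foo", size=2)
```

The point of the sketch is that each shard only has to return its own best hits, so the merge cost depends on the number of shards and the requested size, not on the total number of documents.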

If you do a search like this:

GET /index_one,index_two/type_one,type_two/_search
{ "query": { "match": { "field_foo": "some search terms" }}}

then everything works in the same way except:

  1. it looks only in type_one and type_two for "field_foo" (and uses the
    first one that it finds)
  2. it adds a filter like { "terms": { "_type": [ "type_one", "type_two" ]}}
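
As a rough illustration of point 2, the request body with the type filter spelled out explicitly might look like the one built below. This is a sketch only: the helper name with_type_filter is hypothetical, the "filtered" wrapper is borrowed from the rewritten query shown later in this thread, and the exact form Elasticsearch uses internally may differ.

```python
def with_type_filter(query, types):
    """Wrap `query` so that only documents whose _type is in `types` can match."""
    return {
        "query": {
            "filtered": {
                "query": query,
                "filter": {"terms": {"_type": list(types)}},
            }
        }
    }

body = with_type_filter(
    {"match": {"field_foo": "some search terms"}},
    ["type_one", "type_two"],
)
```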

If you do a search like this:

GET /index_one,index_two/type_one,type_two/_search
{ "query": {
      "multi_match": {
          "fields": [ "type_one.field_foo", "type_two.field_foo" ],
          "query": "some search terms"
      }
  }
}

then the mappings for type_one.field_foo and type_two.field_foo are
considered independently, and the query is rewritten to look something like
this:

{
  "query": {
    "dis_max": {
      "queries": [
        {
          "filtered": {
            "query": { "match": { "field_foo": "some search terms" }},
            "filter": { "term": { "_type": "type_one" }}
          }
        },
        {
          "filtered": {
            "query": { "match": { "field_foo": "some search terms" }},
            "filter": { "term": { "_type": "type_one" }}
          }
        }
      ]
    }
  }
}

clint



(Clinton Gormley) #3

On 17 March 2014 13:26, Clinton Gormley clint@traveljury.com wrote:

{
  "query": {
    "dis_max": {
      "queries": [
        {
          "filtered": {
            "query": { "match": { "field_foo": "some search terms" }},
            "filter": { "term": { "_type": "type_one" }}
          }
        },
        {
          "filtered": {
            "query": { "match": { "field_foo": "some search terms" }},
            "filter": { "term": { "_type": "type_one" }}
          }
        }
      ]
    }
  }
}

I meant:

{
  "query": {
    "dis_max": {
      "queries": [
        {
          "filtered": {
            "query": { "match": { "field_foo": "some search terms" }},
            "filter": { "term": { "_type": "type_one" }}
          }
        },
        {
          "filtered": {
            "query": { "match": { "field_foo": "some search terms" }},
            "filter": { "term": { "_type": "type_two" }}
          }
        }
      ]
    }
  }
}
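
For readers unfamiliar with dis_max: it scores a document by its best-matching clause, optionally adding a fraction (tie_breaker, 0.0 by default) of the other matching clauses' scores. The sketch below shows that combination rule only; the function name dis_max_score and the use of None for "clause did not match" are conventions invented for this example.

```python
def dis_max_score(clause_scores, tie_breaker=0.0):
    """Combine per-clause scores the way a dis_max query does.

    `clause_scores` holds one score per clause, or None where the clause
    did not match the document. Returns None if no clause matched.
    """
    matching = [score for score in clause_scores if score is not None]
    if not matching:
        return None
    best = max(matching)
    # Best clause wins; the remaining matches contribute only via tie_breaker.
    return best + tie_breaker * (sum(matching) - best)

# A document has exactly one _type, so at most one of the two
# type-filtered clauses above can match it.
score_for_type_one_doc = dis_max_score([1.5, None])
```

Since the two clauses are filtered on different types, a given document can match only one of them, so its score is simply that clause's score.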



(golchhamohit) #5

@Clinton: Many thanks for explaining clearly how the query is rewritten
when it is given multiple indices and types.

I have another question. I have multiple indices
(assume 5: i1,i2,i3,i4,i5), multiple types (5 in each index, so 25
types in total: t1,t2,...,t25), a field called "field_foo" which is present in
all documents of all indices, and 100 documents in each
index (d1,d2,...,d100). So there are 5*100 = 500 documents in total.

Now, my search query names 2 indices (i2,i4) and 6 types
(t2,t5,t7,t13,t17,t23), and searches for a list of values
["value_field_foo1","value_field_foo2","value_field_foo3"] in the field
"field_foo". Assume that value_field_foo1 is actually present in index2 and
value_field_foo2 is present in index2 and index4, although in practice we
would not know in advance where each value is present. My
question is: will every value (value_field_foo1, value_field_foo2, etc.) be
checked against the documents of all the indices, or is there some
mechanism that determines that a value is not present in an index
and so avoids searching that index, thereby saving time?

Sorry for being verbose.

Thanks in advance.



(Clinton Gormley) #6

If the field doesn't exist in the mapping, then the index is not searched.
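
As for the values themselves: Elasticsearch is built on Lucene, which keeps an inverted index from terms to the documents containing them, so checking a value is a dictionary-style lookup rather than a comparison against every document. The toy below is a sketch of that idea only. It assumes "field_foo" holds a single un-analyzed value per document, and the data and the names build_inverted_index and terms_lookup are hypothetical.

```python
from collections import defaultdict

def build_inverted_index(docs):
    """Map each field value to the set of doc ids that contain it."""
    index = defaultdict(set)
    for doc_id, value in docs.items():
        index[value].add(doc_id)
    return dict(index)

def terms_lookup(inverted, values):
    """Collect matching doc ids; an absent value costs one failed lookup."""
    matches = set()
    for value in values:
        matches |= inverted.get(value, set())
    return matches

# Hypothetical contents of the two searched indices.
i2 = build_inverted_index({"d1": "value_field_foo1", "d2": "value_field_foo2"})
i4 = build_inverted_index({"d1": "value_field_foo2", "d2": "other"})

wanted = ["value_field_foo1", "value_field_foo2", "value_field_foo3"]
hits = {
    name: sorted(terms_lookup(inverted, wanted))
    for name, inverted in (("i2", i2), ("i4", i4))
}
```

Each searched index is still consulted, but the cost per value is a lookup in its term dictionary, not a pass over its 100 documents.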

clint



