Same document repeated in search results

Hi, I am getting the same document (with the same "_id") repeated more than
once in search results.

My query looks like this:

POST http://XXXX:9200/_search/
{
  "query": {
    "multi_match": {
      "fields": [
        "en_text_keywords_1^12",
        "en_text_keywords_2^8",
        "en_text_keywords_3^6",
        "en_text_keywords_4^4",
        "en_text_keywords_5^2",
        "en_text_title^12"
      ],
      "query": "animals"
    }
  },
  "size": 1000
}

Is this a bug, or am I missing something?

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/8c1c3f24-eb33-4035-bc9a-dc7fa26b6a87%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.

I deleted the document and re-inserted it, which solved the issue for that
particular document. But now another document has the same problem. Our
results contain:
...
{
  "_index": "twitter",
  "_type": "tweet",
  "_id": "1739753",
  "_score": 8.071245,
  "fields": {
    "en_text_title": [
      "Young tiger portrait"
    ]
  }
},
{
  "_index": "twitter",
  "_type": "tweet",
  "_id": "1739753",
  "_score": 8.071245,
  "fields": {
    "en_text_title": [
      "Young tiger portrait"
    ]
  }
},
...
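
One way to confirm which IDs are duplicated across a whole result set is to group the hits by "_id". The following is a minimal Python sketch, assuming the search response has already been parsed into a dict with the usual hits.hits list; the function name find_duplicate_ids is hypothetical and not part of any Elasticsearch client.

```python
from collections import Counter


def find_duplicate_ids(response):
    """Return {(_index, _type, _id): count} for hits that occur more than once."""
    hits = response.get("hits", {}).get("hits", [])
    counts = Counter(
        (hit.get("_index"), hit.get("_type"), hit["_id"]) for hit in hits
    )
    return {key: n for key, n in counts.items() if n > 1}


# Sample data mirroring the two identical hits in the excerpt.
sample_response = {
    "hits": {
        "hits": [
            {"_index": "twitter", "_type": "tweet", "_id": "1739753", "_score": 8.071245},
            {"_index": "twitter", "_type": "tweet", "_id": "1739753", "_score": 8.071245},
        ]
    }
}

duplicates = find_duplicate_ids(sample_response)
print(duplicates)
```

Running this against the full 1000-hit response would show whether the problem is limited to a few documents or is more widespread.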

Another important thing to mention is that we have a multithreaded
application sending the documents to Elasticsearch. Therefore, there is a
chance that we send the same document more than once.

Thanks
Pir.


We have 3 nodes in the Elasticsearch cluster (with number_of_replicas = 2 and
number_of_shards = 8). I stopped both replicas and executed the same
query, but the problem was still there.


May I ask which version of ES you are running? Also, are you using the REST
API to index the documents with an explicit ID?


Hi,

do you specify a custom routing parameter when indexing the documents?
If so, you might have documents with the same ID in different shards:

"When indexing documents specifying a custom _routing, the uniqueness of
the _id is not guaranteed throughout all the shards that the index is
composed of. In fact, documents with the same _id might end up in
different shards if indexed with different _routing values."
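
To illustrate the quoted passage: Elasticsearch picks the target shard from the routing value (the _id by default) as hash(routing) % number_of_primary_shards. The Python sketch below simulates this with zlib.crc32 as a stand-in hash (an assumption; it is not the hash Elasticsearch actually uses), and the routing values "user0" and "user1" are made up. It shows how the same _id can end up on two shards when the routing differs, and why that cannot happen with the default routing.

```python
import zlib

NUM_PRIMARY_SHARDS = 8  # matches the shard count mentioned in this thread


def shard_for(routing, num_shards=NUM_PRIMARY_SHARDS):
    """Stand-in for shard selection: hash(routing) % num_shards."""
    return zlib.crc32(routing.encode("utf-8")) % num_shards


def index_doc(shards, doc_id, source, routing=None):
    """Store a document; without explicit routing, the _id is the routing value."""
    shard = shard_for(routing if routing is not None else doc_id)
    shards[shard][doc_id] = source  # same _id on the same shard is overwritten
    return shard


def search_all(shards, doc_id):
    """A search fans out to every shard, so it sees every copy of an _id."""
    return [(n, docs[doc_id]) for n, docs in enumerate(shards) if doc_id in docs]


doc = {"en_text_title": "Young tiger portrait"}

# Default routing: indexing the same _id twice always hits the same shard.
default_shards = [dict() for _ in range(NUM_PRIMARY_SHARDS)]
index_doc(default_shards, "1739753", doc)
index_doc(default_shards, "1739753", doc)
default_hits = search_all(default_shards, "1739753")

# Custom routing: different routing values can place copies on different shards.
custom_shards = [dict() for _ in range(NUM_PRIMARY_SHARDS)]
index_doc(custom_shards, "1739753", doc, routing="user0")
index_doc(custom_shards, "1739753", doc, routing="user1")
custom_hits = search_all(custom_shards, "1739753")

print(len(default_hits), len(custom_hits))
```

In this simulation a lookup that hashes a single routing value only ever consults one shard, while a search visits all of them, which is one way a GET and a search could disagree about how many copies exist.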

Best regards,
Hannes


We are using Elasticsearch version 1.0, the official Elasticsearch PHP
client version 1.0, and the Bulk API.
We are not specifying any custom routing. Our request looks like this:

{"index":{"_index":"twitter","_type":"tweet","_id":"1739753"}}
{"field1":"value1" ... "field n":"value"}
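
For reference, a bulk body is newline-delimited JSON: one action line followed by one source line per document, with a newline after the last line as well. A minimal Python sketch of building such a body follows; the helper name build_bulk_body and the field values are hypothetical, and the setup described in this thread uses the PHP client rather than Python.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build a newline-delimited bulk body from {doc_id: source} pairs."""
    lines = []
    for doc_id, source in docs.items():
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(source))
    # The bulk endpoint expects every line, including the last, to end with "\n".
    return "\n".join(lines) + "\n"


body = build_bulk_body(
    "twitter",
    "tweet",
    {"1739753": {"en_text_title": "Young tiger portrait"}},
)
print(body, end="")
```

Keying the batch by _id also collapses repeats of the same document within one request, which may help when several threads can produce the same document.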

I need help understanding the following:
Why does the GET API return only one document, whereas the same document
appears twice when searching?
Why does overwriting the document fix the issue for that document?
Is there any way to ensure _id uniqueness throughout the index?


They are unique.

--
David :wink:
Twitter : @dadoonet / @elasticsearchfr / @scrutmydocs
