Exact duplicate results (same _id) for a search query. Is this a bug?


(Daniel Winterstein) #1

Hello,

I have an elasticsearch index which is returning two identical results. I
don't mean 2 copies of a similar document. These results have the same
elasticsearch _id.

Details below.

Does anyone know why this happens?
Is it a bug?

Best regards,

- Daniel

Version: 1.2.1

Query: http://localhost:9200/workspace/group/_search?q=winterwell

Result:

{
  "took": 8,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.89743817,
    "hits": [
      {
        "_index": "workspace",
        "_type": "group",
        "_id": "winterwell@DBGroup",
        "_score": 0.89743817,
        "_source": {
          "name": "winterwell",
          "tags": { },
          ...some details skipped...
        }
      },
      {
        "_index": "workspace",
        "_type": "group",
        "_id": "winterwell@DBGroup",
        "_score": 0.89743817,
        "_source": {
          "name": "winterwell",
          "tags": { },
          ...some details skipped...
        }
      }
    ]
  }
}

--
Dr Daniel Winterstein
Director

A: CodeBase Argyle House, Edinburgh, EH3 9DR
M: +44 (0)772 5172 612
http://winterwell.com http://sodash.com

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/c19b5667-12af-4d4c-8457-05361a926d66%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Itamar Syn-Hershko) #2

Is this 1 Elasticsearch instance running locally or do multiple servers /
nodes participate?

--

Itamar Syn-Hershko
http://code972.com | @synhershko https://twitter.com/synhershko
Freelance Developer & Consultant
Author of RavenDB in Action http://manning.com/synhershko/


(Jörg Prante) #3

Maybe a segment-level effect. Will this disappear after optimizing the
index?

Do you have replica level > 0 ?

Jörg


(Alexander Reelsen) #4

Hey,

There are a couple of possible reasons here. Might you have used routing
accidentally? You can set the explain flag in your query to find out which
shard each document lives on. See this example, which returns the shard id
for both indexed documents:

DELETE /foo

PUT /foo/bar/1?routing=2
{
  "foo": "bar"
}

PUT /foo/bar/1
{
  "foo": "bar"
}

GET /foo/bar/_search
{
  "explain": true,
  "query": {
    "term": {
      "foo": {
        "value": "bar"
      }
    }
  }
}
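
To see why a routing value can produce two hits with the same _id, here is a minimal Python simulation of the idea. It is only a sketch: the helper names (shard_for, index_doc, search_all) are hypothetical, and zlib.crc32 is a stand-in hash, not the function Elasticsearch actually uses. The one assumption carried over from Elasticsearch is that the shard is chosen from the routing value, which defaults to the _id when no routing is given.

```python
import zlib

NUM_SHARDS = 5  # same shard count as in the search response in this thread


def shard_for(routing_value, num_shards=NUM_SHARDS):
    """Pick a shard from the routing value (stand-in hash, not Elasticsearch's)."""
    return zlib.crc32(routing_value.encode("utf-8")) % num_shards


def index_doc(shards, doc_id, source, routing=None):
    """Store a document on the shard chosen by routing, defaulting to the _id."""
    shard = shard_for(routing if routing is not None else doc_id)
    shards[shard][doc_id] = source  # an _id is only unique within one shard
    return shard


def search_all(shards):
    """Return every stored document as a hit, tagged with the shard it came from."""
    return [
        {"_id": doc_id, "_shard": shard_no, "_source": source}
        for shard_no, docs in enumerate(shards)
        for doc_id, source in docs.items()
    ]


shards = [dict() for _ in range(NUM_SHARDS)]
first = index_doc(shards, "1", {"foo": "bar"}, routing="2")  # PUT /foo/bar/1?routing=2
second = index_doc(shards, "1", {"foo": "bar"})              # PUT /foo/bar/1
hits = search_all(shards)
print(first, second, [hit["_id"] for hit in hits])
```

Whenever the two routing values hash to different shards, the second PUT does not overwrite the first, and a search across all shards returns both copies.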

Also, was this a stock Elasticsearch 1.2.1? And was that also the version
all of your data was indexed with (just to make sure you did not run into
the 1.2.0 bug)?

--Alex


(Daniel Winterstein) #5

Thank you Alex, Jörg & Itamar.

I am using routing, so a mistake in routing looks like the most
likely culprit. The two documents are on different shards.

To answer the other questions: single server, optimize does not alter
it, and I believe it's been version 1.2.1 from setup.
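
For anyone who wants to check a search response for this symptom, the following Python sketch groups hits by (_index, _type, _id) and reports any key that appears more than once, together with the shard of each copy. The function name find_duplicate_ids is hypothetical, the sample response is trimmed to the fields used, and its shard numbers are made up for illustration. It assumes the query was run with explain set to true so that each hit carries a _shard field; the exact format of that field may differ between Elasticsearch versions.

```python
from collections import defaultdict


def find_duplicate_ids(response):
    """Return {(_index, _type, _id): [shards]} for every key seen more than once."""
    shards_by_key = defaultdict(list)
    for hit in response["hits"]["hits"]:
        key = (hit["_index"], hit.get("_type"), hit["_id"])
        # _shard is only present when the search was run with "explain": true
        shards_by_key[key].append(hit.get("_shard"))
    return {key: shards for key, shards in shards_by_key.items() if len(shards) > 1}


# Hypothetical response; the _shard values are invented for this example.
sample = {
    "hits": {
        "total": 2,
        "hits": [
            {"_shard": 1, "_index": "workspace", "_type": "group",
             "_id": "winterwell@DBGroup"},
            {"_shard": 3, "_index": "workspace", "_type": "group",
             "_id": "winterwell@DBGroup"},
        ],
    }
}

duplicates = find_duplicate_ids(sample)
for key, shard_list in duplicates.items():
    print(key, "found on shards", shard_list)
```

If the copies turn up on different shards, as they did here, inconsistent routing at index time is the likely cause.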

Best regards,

- Daniel


