Search results not uniform

Hello,

When I run a search query using http://***...:9200/tournament/master/_search?q=995
I get multiple results (not sure why). The result(s) I get are:
* _index: "tournament"
* _type: "master"
* _id: "995"
* _source: {
      o tournamentid: 995
      o startdate: "2010-07-19T18:30:00.000Z"
      o enddate: "2010-07-22T18:30:00.000Z"
      o initiator: "system-gen"
      o largeimage: 1735
      o thumbimage: 1130
      o gamename: "Avoid Responsibility"
      o gamedescription: "It's your chance to keep the responsibilities at bay.This is your destiny"
  }

And when I do a refresh I get:

* _index: "tournament"
* _type: "master"
* _id: "995"
* _source: {
      o tournamentid: 995
      o startdate: "2010-07-20T07:35:00.000Z"
      o enddate: "2010-07-23T07:35:00.000Z"
      o initiator: "system-gen"
      o largeimage: 1735
      o thumbimage: 1130
      o gamename: "Avoid Responsibility"
      o gamedescription: "<body> <p><font face="Tahoma" size="1">It's your chance to keep the responsibilities at bay.This is your destiny"
  }

Notice the different start date and end dates.

Elasticsearch search results are near real time, not fully real time. This
means that if you update a document with the same id, that update only
becomes "visible" to search after a certain period (or once you call refresh,
which is a potentially expensive operation and should not be called for
every request).

-shay.banon
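
The near-real-time behaviour described above can be illustrated with a small toy model. This is only a sketch of the visibility rule (writes are buffered and become searchable after a refresh); it is not how Elasticsearch is implemented, and the class name `NearRealTimeIndex` and its methods are hypothetical names made up for this example.

```python
class NearRealTimeIndex:
    """Toy model: writes are buffered and only become searchable after refresh()."""

    def __init__(self):
        self._searchable = {}  # documents that searches can currently see
        self._pending = {}     # documents indexed since the last refresh

    def index(self, doc_id, source):
        # Indexing with an existing id replaces the document,
        # but only in the pending buffer for now.
        self._pending[doc_id] = dict(source)

    def refresh(self):
        # Make every pending write visible to search.
        self._searchable.update(self._pending)
        self._pending.clear()

    def search(self, doc_id):
        # Searches only ever look at the refreshed view.
        return self._searchable.get(doc_id)


idx = NearRealTimeIndex()
idx.index("995", {"tournamentid": 995, "startdate": "2010-07-19T18:30:00.000Z"})
idx.refresh()

# Update the same id: search still returns the previous version...
idx.index("995", {"tournamentid": 995, "startdate": "2010-07-20T07:35:00.000Z"})
before = idx.search("995")["startdate"]

# ...until a refresh makes the update visible.
idx.refresh()
after = idx.search("995")["startdate"]

print(before)
print(after)
```

In this model a search issued between the update and the refresh returns the old dates, which is the kind of short-lived staleness the reply describes. It would not explain a difference that persists for hours after the update.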

Curious: if
org.elasticsearch.action.index.IndexRequest#operationThreaded(..) is set to
'true', does that guarantee that the index has already been updated by the
time org.elasticsearch.action.ActionFuture#actionGet() finishes execution?

Thanks,

Franz Allan Valencia See | Java Software Engineer
franz.see@gmail.com
LinkedIn: http://www.linkedin.com/in/franzsee
Twitter: http://www.twitter.com/franz_see

Hi Shay,

Thanks for your reply. The document was updated some 3 hours before I checked
for this behavior. The strange thing is that when I refresh the page
continuously (say 5 times) the data remains constant, but when I come back
later and refresh the page, the date values change again.

Maybe it's caching done on the client side? Are you using the REST API from
JavaScript? Maybe I should add a no-cache header or something to the
response.

-shay.banon
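
To rule out client-side or intermediary caching from the caller's side, one option is to send the search with request headers that ask caches to revalidate. The sketch below only builds the request. The host `http://localhost:9200` and the helper name `build_search_request` are assumptions for illustration, and whether a given browser or proxy honours these headers depends on that client.

```python
import urllib.parse
import urllib.request


def build_search_request(base_url, query):
    """Build a GET request for the tournament/master search with no-cache headers."""
    url = (
        base_url.rstrip("/")
        + "/tournament/master/_search?"
        + urllib.parse.urlencode({"q": query})
    )
    # Ask any HTTP cache between the client and the node not to serve a stored copy.
    headers = {"Cache-Control": "no-cache", "Pragma": "no-cache"}
    return urllib.request.Request(url, headers=headers)


# Hypothetical host; replace with the address of a real node.
req = build_search_request("http://localhost:9200", "995")
print(req.full_url)

# To actually send it (requires a reachable node):
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

If the results still alternate when requests are sent this way from a plain script, client-side caching is unlikely to be the cause.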

We are not doing anything on the client side. What we are doing is a bulk
update of data from MySQL to Elasticsearch. Do I need to take care of anything
while doing bulk updates? Could it be that the data is replicated across
multiple shards and the shard copies have different data in them? My shard
settings look like this:

"meta-data" : {
  "max_number_of_shards_per_node" : 100,
  "indices" : {
    "tournament" : {
      "settings" : {
        "index.number_of_shards" : "5",
        "index.number_of_replicas" : "4"
      }

BTW, I now see constant data in the date field (startdate:
"2010-07-19T18:30:00.000Z"), and when I search with explain=true I now see:

_explanation: {
  • value: 1
  • description: "ConstantScoreQuery(QueryWrapperFilter(tournamentid:
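
One way to check the "shard copies hold different data" theory is to run the same search several times and compare the `_source` returned for each `_id`. With the settings above there are 5 x (1 + 4) = 25 shard copies in total, so repeated searches may be answered by different copies. The following is a minimal sketch that works on already-parsed search responses; the helper name `conflicting_ids` is hypothetical, and the two sample responses are trimmed-down versions of the results quoted in this thread.

```python
import json

# Values taken from the index settings quoted above.
NUMBER_OF_SHARDS = 5
NUMBER_OF_REPLICAS = 4
# Each primary shard plus its replicas is a separate copy that can serve a search.
TOTAL_SHARD_COPIES = NUMBER_OF_SHARDS * (1 + NUMBER_OF_REPLICAS)


def conflicting_ids(responses):
    """Return the ids that came back with more than one distinct _source."""
    versions_by_id = {}
    for response in responses:
        for hit in response["hits"]["hits"]:
            # Serialize with sorted keys so identical documents compare equal.
            fingerprint = json.dumps(hit["_source"], sort_keys=True)
            versions_by_id.setdefault(hit["_id"], set()).add(fingerprint)
    return sorted(
        doc_id for doc_id, versions in versions_by_id.items() if len(versions) > 1
    )


first = {"hits": {"hits": [{"_id": "995", "_source": {
    "tournamentid": 995, "startdate": "2010-07-19T18:30:00.000Z"}}]}}
second = {"hits": {"hits": [{"_id": "995", "_source": {
    "tournamentid": 995, "startdate": "2010-07-20T07:35:00.000Z"}}]}}

print(TOTAL_SHARD_COPIES)
print(conflicting_ids([first, second]))
```

An id reported by this check long after the last update suggests that copies of the same shard disagree, rather than a refresh delay.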

Yes, a problem in replication might explain this. In 0.9 I fixed a (very
rare) corner case that might be the cause; can you give it a go?

P.S. Man, this 0.9 version must come out already; it's not really nice of me
to keep asking users to try it all the time :wink:

-shay.banon
