Query search yielding unexpected results


(ajgamer) #1

Hi,

I am working with elasticsearch version 0.7 and have encountered my
first roadblock. I have two mappings named detail and master. The
schema for detail is as follows:

{
    "properties": {
        "userid": {
            "type": "string",
            "index_name": "userid",
            "index": "not_analyzed",
            "store": "no",
            "term_vector": "no",
            "boost": 1.0,
            "omit_norms": true,
            "omit_term_freq_and_positions": true
        }
    }
}

I am trying to search on the userid field with a query-string request:
http://**...:9200/tournament/detail/_search?q=userid:potter
It returns the following data:

{
    "_index": "tournament",
    "_type": "detail",
    "_id": "22|potter",
    "_source": {
        "tournamentid": 22,
        "userid": "potter",
        "date": "2008-06-16T18:30:00.000Z",
        "score": 276.62,
        "tries": 1,
        "gameid": 32,
        "gamename": "Caravan Toss",
        "status": 3,
        "scoretype": 3,
        "result": 1
    }
}

{
    "_index": "tournament",
    "_type": "detail",
    "_id": "593|harry_potter",
    "_source": {
        "tournamentid": 593,
        "userid": "harry_potter",
        "date": "2009-04-01T18:30:00.000Z",
        "score": 212,
        "tries": 1,
        "gameid": 732,
        "gamename": "Trick Blast Billiards",
        "status": 3,
        "scoretype": 1,
        "result": 0
    }
}

{
    "_index": "tournament",
    "_type": "detail",
    "_id": "602|harry_potter",
    "_source": {
        "tournamentid": 602,
        "userid": "harry_potter",
        "date": "2009-04-01T18:30:00.000Z",
        "score": 190,
        "tries": 2,
        "gameid": 16,
        "gamename": "Ski Run",
        "status": 3,
        "scoretype": 1,
        "result": 0
    }
}

BTW, this was when the schema was set to "index": "analyzed" for the
userid field.

But when I changed the index setting to not_analyzed, I got only the
results for potter. When I query for harry_potter
(http://***...:9200/tournament/detail/_search?q=userid:harry_potter),
I get no data, even though it is there and I can see it when I use a
wildcard like *harry_potter.

I have since also changed omit_norms and omit_term_freq_and_positions
to true, but to no effect.

What changes do I need to make in my schema to see the desired results?


(Shay Banon) #2

Before version 0.9 (which is master now), not_analyzed fields were still
tokenized when used in a query string explicitly. You should define the
analyzer to be "keyword" (sounds strange, I know) even when it's not
analyzed. This is done by default in the upcoming 0.9.

-shay.banon
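
A rough sketch of what the reply describes, for readers who want to see the effect. It is a toy model in Python, not Elasticsearch or Lucene code: standard_like and keyword_like are made-up stand-ins for a tokenizing analyzer and the keyword analyzer, and the rule that a multi-token query is treated as a phrase is an assumption about how the query string is handled. The "analyzer" key in the mapping is also an assumption about how to "define the analyzer"; check the mapping documentation for your version.

```python
import json
import re

# Hypothetical revision of the "detail" mapping from the question: the same
# field settings, plus an explicit analyzer as suggested in the reply.
detail_mapping = {
    "properties": {
        "userid": {
            "type": "string",
            "index_name": "userid",
            "index": "not_analyzed",
            "analyzer": "keyword",  # assumed key name, verify for your version
            "store": "no",
            "term_vector": "no",
            "boost": 1.0,
            "omit_norms": True,
            "omit_term_freq_and_positions": True,
        }
    }
}


def standard_like(text):
    """Toy tokenizing analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def keyword_like(text):
    """Toy keyword analyzer: the whole input becomes a single token."""
    return [text]


def search(indexed_values, query_text, query_analyzer):
    """Match values whose single indexed term equals the analyzed query.

    Each not_analyzed value is indexed as exactly one term, so a query that
    analyzes to several tokens (treated here as a phrase) cannot match.
    """
    tokens = query_analyzer(query_text)
    if len(tokens) != 1:
        return []
    return [v for v in indexed_values if v == tokens[0]]


# userid values from the three documents in the question.
userids = ["potter", "harry_potter", "harry_potter"]

print(search(userids, "potter", standard_like))        # single token, matches
print(search(userids, "harry_potter", standard_like))  # split in two, no match
print(search(userids, "harry_potter", keyword_like))   # kept whole, matches
print(json.dumps(detail_mapping, indent=2))
```

In this model the query for harry_potter only finds the documents once the query side leaves the value whole, which is the behaviour the keyword analyzer is meant to give.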

On Mon, Jul 19, 2010 at 4:09 PM, ajgamer abie.joseph14@gmail.com wrote:



(ajgamer) #3

Thank you very much for your reply; that did the trick for me.

P.S. Keep up the good work, I'm already a fan.

On Mon, Jul 19, 2010 at 11:28 PM, Shay Banon
shay.banon@elasticsearch.com wrote:


