Query search yielding unexpected results

Hi,

I am working with elasticsearch version 0.7 and have encountered my
first roadblock. I have two mappings named detail and master. The
schema for detail is as follows:

{
    "properties": {
        "userid": {
            "type": "string",
            "index_name": "userid",
            "index": "not_analyzed",
            "store": "no",
            "term_vector": "no",
            "boost": 1.0,
            "omit_norms": true,
            "omit_term_freq_and_positions": true
        }
    }
}

What I'm trying to do here is search on the userid field using the
query string
http://**...:9200/tournament/detail/_search?q=userid:potter
and it returns the following data:

{
    "_index": "tournament",
    "_type": "detail",
    "_id": "22|potter",
    "_source": {
        "tournamentid": 22,
        "userid": "potter",
        "date": "2008-06-16T18:30:00.000Z",
        "score": 276.62,
        "tries": 1,
        "gameid": 32,
        "gamename": "Caravan Toss",
        "status": 3,
        "scoretype": 3,
        "result": 1
    }
}

{
    "_index": "tournament",
    "_type": "detail",
    "_id": "593|harry_potter",
    "_source": {
        "tournamentid": 593,
        "userid": "harry_potter",
        "date": "2009-04-01T18:30:00.000Z",
        "score": 212,
        "tries": 1,
        "gameid": 732,
        "gamename": "Trick Blast Billiards",
        "status": 3,
        "scoretype": 1,
        "result": 0
    }
}

{
    "_index": "tournament",
    "_type": "detail",
    "_id": "602|harry_potter",
    "_source": {
        "tournamentid": 602,
        "userid": "harry_potter",
        "date": "2009-04-01T18:30:00.000Z",
        "score": 190,
        "tries": 2,
        "gameid": 16,
        "gamename": "Ski Run",
        "status": 3,
        "scoretype": 1,
        "result": 0
    }
}

BTW, this was when the schema was set to "index": "analyzed" for the
userid field.

But when I changed the index setting to not_analyzed, the query for
potter returned only the potter document. When I query for harry_potter
(http://***...:9200/tournament/detail/_search?q=userid:harry_potter),
I am unable to get the data for it, even though it is there and I can
see it when I use a wildcard like *harry_potter.

I have also set omit_norms and omit_term_freq_and_positions to true,
but to no effect.

What changes do I need to make in my schema to see the desired results?

Before version 0.9 (which is master now), not_analyzed fields were still
tokenized when used explicitly in a query string. You should define the
analyzer to be "keyword" (sounds strange, I know) even when it's not
analyzed. This is done by default in the upcoming 0.9.
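
To illustrate the mismatch, here is a toy Python model (not elasticsearch code). It assumes a simplified tokenizer that lowercases and splits on anything that is not a letter or digit, which is consistent with userid:potter matching harry_potter in the analyzed case above. The function names are hypothetical and exist only for this sketch.

```python
import re


def standard_like_tokens(text):
    """Toy stand-in for a tokenizing analyzer: lowercase, split on non-alphanumerics."""
    return [t for t in re.split(r"[^0-9a-z]+", text.lower()) if t]


def keyword_tokens(text):
    """Toy stand-in for a keyword analyzer: the whole input is a single token."""
    return [text]


def matches(indexed_terms, query_terms):
    """A document matches if any query term equals one of its indexed terms."""
    return any(term in indexed_terms for term in query_terms)


# A not_analyzed field indexes the stored value as one term.
indexed = set(keyword_tokens("harry_potter"))

# Query string tokenized into "harry" and "potter": neither equals "harry_potter".
tokenized_query_hits = matches(indexed, standard_like_tokens("harry_potter"))

# Query string kept whole as "harry_potter": exact term match.
keyword_query_hits = matches(indexed, keyword_tokens("harry_potter"))

print(tokenized_query_hits, keyword_query_hits)
```

In this model the tokenized query misses the document while the keyword-style query finds it, which is why setting the analyzer to "keyword" helps even though the field is already not_analyzed.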

-shay.banon

In reply to ajgamer abie.joseph14@gmail.com, Mon, Jul 19, 2010 at 4:09 PM.

Thank you very much for your reply, that did the trick for me.

P.S. Keep up the good work, I'm already a fan.

In reply to Shay Banon shay.banon@elasticsearch.com, Mon, Jul 19, 2010 at 11:28 PM.