Facet counting on a non-unique tag collection

Hi,
I have a pretty simple question (I hope):

I have the following "structure" -
curl -XPOST 'http://localhost:9200/test/records/' -d '{
  "session_id" : "abc",
  "user_id" : 1,
  "timestamp" : 1359245610,
  "coords" : ["1x1", "1x1", "1x1", "4x4", "5x5"]
}'
curl -XPOST 'http://localhost:9200/test/records/' -d '{
  "session_id" : "abc",
  "user_id" : 1,
  "timestamp" : 1359245610,
  "coords" : ["2x2", "3x3", "1x1", "4x4", "5x5"]
}'

When I do the following:
curl -XGET 'http://localhost:9200/test/_search?pretty=true' -d '{
  "query" : {
    "matchAll" : {}
  },
  "facets" : {
    "somename" : { "terms" : {"field" : "coords"} }
  },
  "size" : 0
}'

I get:
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
    "total" : 5,
    "successful" : 5,
    "failed" : 0
  },
  "hits" : {
    "total" : 2,
    "max_score" : 1.0,
    "hits" : [ ]
  },
  "facets" : {
    "somename" : {
      "_type" : "terms",
      "missing" : 0,
      "total" : 8,
      "other" : 0,
      "terms" : [ {
        "term" : "5x5",
        "count" : 2
      }, {
        "term" : "4x4",
        "count" : 2
      }, {
        "term" : "1x1",
        "count" : 2
      }, {
        "term" : "3x3",
        "count" : 1
      }, {
        "term" : "2x2",
        "count" : 1
      } ]
    }
  }
}

The question is: how do I make it report that there are actually FOUR "1x1" entries?

Thank you!

--

I thought this would be an easy thing to solve :)

--

Hi,

The terms facet returns the number of documents that match a
particular term, so 2 is correct in this case.
I guess in your case (and correct me if I'm wrong) you want to treat
each coord as a separate hit. If that is the case, you
can use the nested field type and use the nested option in
your somename facet.
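
To make the difference concrete, here is a small Python sketch (plain Python, not Elasticsearch code) that replays the "coords" arrays of the two sample documents from the first post and counts them both ways. The function names are made up for illustration only.

```python
from collections import Counter

# The "coords" arrays of the two sample documents from the original post.
docs = [
    ["1x1", "1x1", "1x1", "4x4", "5x5"],
    ["2x2", "3x3", "1x1", "4x4", "5x5"],
]


def count_documents_per_term(documents):
    """Count how many documents contain each term at least once.

    Duplicates inside a single document are collapsed before counting,
    which matches the per-document counts the terms facet reported.
    """
    counts = Counter()
    for coords in documents:
        counts.update(set(coords))
    return counts


def count_occurrences_per_term(documents):
    """Count every array entry, treating each coord as its own hit."""
    counts = Counter()
    for coords in documents:
        counts.update(coords)
    return counts


per_document = count_documents_per_term(docs)
per_occurrence = count_occurrences_per_term(docs)

print("documents containing 1x1:", per_document["1x1"])
print("occurrences of 1x1:", per_occurrence["1x1"])
```

The per-document counts add up to 8, the same "total" the facet returned, while the per-occurrence count for "1x1" is the four the question asks for.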

Martijn

On 28 January 2013 15:53, rookie7799 pavelbaranov@gmail.com wrote:

> I thought this would be an easy thing to solve :)

--
Kind regards,

Martijn van Groningen

Thank you for the reply, Martijn!

Yes, you're correct: I want every "string" inside coords to be counted as 1.
I've tried the nested mapping and search, but it didn't work for me. The example
on the Elasticsearch page refers to the following structure:

"obj1" : [
  {
    "name" : "blue",
    "count" : 4
  },
  {
    "name" : "green",
    "count" : 6
  }
]

However, in my case I don't even have counts. I tried changing my layout a
bit as well, but it didn't help:
curl -XPOST 'http://localhost:9200/test/mice/' -d '{
  "session_id" : "12345abcde",
  "user_id" : 1,
  "timestamp" : 1359245610,
  "coords" : [ {"v":"1x1"}, {"v":"1x1"}, {"v":"1x1"}, {"v":"4x4"}, {"v":"5x5"} ]
}'

The reason I want to be able to do this in the first place is really to save
space. I could save every coordinate in its own record, but that means
saving session_id, user_id, and timestamp as well...

--

> However, in my case I don't even have counts. I tried changing my layout a
> bit as well, but it didn't help:
> curl -XPOST 'http://localhost:9200/test/mice/' -d '{
>   "session_id" : "12345abcde",
>   "user_id" : 1,
>   "timestamp" : 1359245610,
>   "coords" : [ {"v":"1x1"}, {"v":"1x1"}, {"v":"1x1"}, {"v":"4x4"}, {"v":"5x5"} ]
> }'

It should work. Did you specify the nested type in your mapping (it is
not applied automatically)?
See my example here: https://gist.github.com/4657029

> The reason I want to be able to do this in the first place is really to save
> space. I could save every coordinate in its own record, but that means
> saving session_id, user_id, and timestamp as well...

I understand. However, using nested keeps the overhead to a
minimum (compared to storing each coord as a separate document).
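
For readers who cannot open the gist, here is a rough Python sketch of the two request bodies this advice implies. It is not a copy of the gist. It assumes the facet API of that era, where a facet accepts a "nested" option naming the nested path, and it assumes the field can be addressed as "coords.v"; the index, type, and facet names are taken from the posts above. The script only builds and prints the JSON, it does not talk to a cluster.

```python
import json

# Index and type names as used in the posts above.
INDEX, DOC_TYPE = "test", "mice"

# Mapping that explicitly declares "coords" as nested, since the nested
# type is not applied automatically.
mapping = {
    DOC_TYPE: {
        "properties": {
            "coords": {
                "type": "nested",
                "properties": {"v": {"type": "string"}},
            }
        }
    }
}

# Terms facet scoped to the nested "coords" objects. The "nested" option
# and the "coords.v" field path are assumptions about the old facet API.
search_body = {
    "query": {"match_all": {}},
    "facets": {
        "somename": {
            "terms": {"field": "coords.v"},
            "nested": "coords",
        }
    },
    "size": 0,
}

# Print the bodies so they can be pasted into curl -d '...'.
print(f"PUT /{INDEX}/{DOC_TYPE}/_mapping")
print(json.dumps(mapping, indent=2))
print(f"GET /{INDEX}/_search")
print(json.dumps(search_body, indent=2))
```

If the assumptions hold, each {"v": ...} object is counted as its own hit, so repeated coords within one record should no longer collapse into a single count.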

--
Kind regards,

Martijn van Groningen

--

Yep:
{
  "test" : {
    "mice" : {
      "properties" : {
        "coords" : {
          "type" : "nested",
          "properties" : {
            "v" : {
              "type" : "string"
            }
          }
        },
        "session_id" : {
          "type" : "string"
        },
        "timestamp" : {
          "type" : "long"
        },
        "user_id" : {
          "type" : "long"
        }
      }
    }
  }
}

--

Oh man, I was using terms_stats instead :) DUH

Thank you so much!
