Sum-aggregation script doesn't allow negative values?


(Valentin Pletzer) #1

Hi,

I am trying to use this aggregation which does not work:
"aggs": {
  "winners": {
    "terms": {
      "field": "urls",
      "order": {
        "diff": "desc"
      }
    },
    "aggs": {
      "diff": {
        "sum": {
          "script": "(doc['datetime'].date.getMillis() < 1406332800000) ? -1 : 1",
          "lang": "groovy"
        }
      }
    }
  }
}
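
For readers following along, here is a minimal Python sketch of what the script in this aggregation appears to compute for each bucket: every document contributes -1 if its timestamp is before the cutoff and +1 otherwise, so the sum is "documents at or after the cutoff" minus "documents before it". The function name `diff_for_bucket` and the sample timestamps are hypothetical; only the cutoff value comes from the query.

```python
from datetime import datetime, timezone

CUTOFF_MS = 1406332800000  # threshold taken from the script in the query


def diff_for_bucket(timestamps_ms, cutoff_ms=CUTOFF_MS):
    """Mirror the script: -1 for each document before the cutoff, +1 otherwise."""
    return sum(-1 if ts < cutoff_ms else 1 for ts in timestamps_ms)


# The cutoff as a readable UTC instant.
cutoff = datetime.fromtimestamp(CUTOFF_MS / 1000, tz=timezone.utc)
print(cutoff.isoformat())  # 2014-07-26T00:00:00+00:00

# Hypothetical bucket: two documents before the cutoff, three at or after it.
sample = [CUTOFF_MS - 2, CUTOFF_MS - 1, CUTOFF_MS, CUTOFF_MS + 1, CUTOFF_MS + 2]
print(diff_for_bucket(sample))  # 1
```

A bucket whose documents are mostly older than the cutoff therefore gets a negative sum, which is what the question in the subject line is about.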

Can anyone help?

Cheers,
Valentin

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/df03d929-afea-4cdb-9a15-926f746223a7%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


(Colin Goodheart-Smithe) #2

Hi,

I ran the commands in the following gist, on master, without error. Would
you be able to post the error you get and a similar reproducible example to
help diagnose the issue you are running into? Also, which version of
Elasticsearch are you running?

https://gist.github.com/colings86/46fbb0b22c2f3c4348ae

Thanks

Colin



(Valentin Pletzer) #3

Hi Colin,

Thanks for checking. I could successfully reproduce your example, and I even
split it into two indices and it worked (Elasticsearch 1.3.0). But as soon
as I try it with my data it doesn't work. I ran some additional tests, and it
works if I only use the current index (day) and split it in half. But as
soon as I try to compare yesterday and the day before, it only seems to get
the data from one day but not the other.

Cheers,
Valentin



(Colin Goodheart-Smithe) #4

How are you searching over the multiple indices? Are you using aliases? It
would be helpful if you could post your alias configuration (see [1]) and
a cURL example of a search request that fails.

[1] http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html#alias-retrieving

Thanks

Colin



(Valentin Pletzer) #5

Hi Colin,

Now it gets really strange. First, my alias:

curl 'http://localhost:9200/_alias?pretty'
{
  "live-2014-07-27" : {
    "aliases" : {
      "aggtest" : { }
    }
  },
  "live-2014-07-26" : {
    "aliases" : {
      "aggtest" : { }
    }
  }
}

I tried two different queries:
curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d '{
  "size": 0,
  "aggs": {
    "winners": {
      "terms": {
        "field": "tit",
        "order": {
          "diff": "desc"
        },
        "size": 1
      },
      "aggs": {
        "articles_over_time": {
          "date_histogram": {
            "field": "datetime",
            "interval": "1d"
          }
        },
        "diff": {
          "sum": {
            "script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
            "lang": "groovy"
          }
        }
      }
    }
  }
}'

and

curl -XPOST 'http://localhost:9200/live-2014-07-26,live-2014-07-27/video/_search?pretty=true'
.....

Both give me a result (but a wrong one) when I run the query through
elasticsearch-head, but result in an error when I use the command line:

{
  "error" : "SearchPhaseExecutionException[Failed to execute phase [query],
all shards failed; shardFailures
{[_MxuihP3TfmZV4FYUQaRQQ][live-2014-07-26][1]:
QueryPhaseExecutionException[[live-2014-07-26][1]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script126]];
}{[FYhB58m7T1W3HjhzUmtzww][live-2014-07-27][0]:
RemoteTransportException[[live02][inet[/10.XXX.XX.XX:9300]][search/phase/query]];
nested: QueryPhaseExecutionException[[live-2014-07-27][0]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script119]];
}{[_MxuihP3TfmZV4FYUQaRQQ][live-2014-07-27][1]:
QueryPhaseExecutionException[[live-2014-07-27][1]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script126]];
}{[FYhB58m7T1W3HjhzUmtzww][live-2014-07-26][0]:
RemoteTransportException[[live02][inet[/10.XXX.XX.XX:9300]][search/phase/query]];
nested: QueryPhaseExecutionException[[live-2014-07-26][0]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script119]]; }]",
  "status" : 500
}
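
One possible explanation for the command-line-only failure, which the thread itself does not confirm: the JSON body is wrapped in single quotes for the shell, and the script also contains single quotes around `datetime`. Under POSIX quoting rules the inner quotes end and restart the outer string, so the body that reaches Elasticsearch would contain `doc[datetime]`, and Groovy would then look for a variable named `datetime`. That would match the "No such property: datetime" message. The sketch below uses Python's `shlex` module to show what the shell would pass to curl; the request body is shortened to the script line only, and the variable names are illustrative.

```python
import shlex

# Shortened form of the command from the post: the body is single-quoted,
# and the script inside it also uses single quotes.
command = (
    "curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d "
    "'{\"script\": \"(doc['datetime'].value < 1406412000000) ? -1 : 1\"}'"
)

# shlex.split applies POSIX shell quoting rules, so the last argument is the
# request body that curl would actually receive.
body_sent = shlex.split(command)[-1]
print(body_sent)  # the quotes around datetime are gone: doc[datetime]

# The body that was intended, and one way to quote it safely for the shell.
intended = "{\"script\": \"(doc['datetime'].value < 1406412000000) ? -1 : 1\"}"
safe_command = (
    "curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d "
    + shlex.quote(intended)
)
print(shlex.split(safe_command)[-1] == intended)  # True
```

If this is indeed the cause, writing the request body to a file and passing it with `curl -d @query.json` (the file name is only an example) avoids the nested quoting altogether. It would also explain why elasticsearch-head, which does not go through a shell, accepts the same query.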

But I noticed something strange. This works:
curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d '{
  "size": 0,
  "aggs": {
    "winners": {
      "terms": {
        "field": "tit"
      },
      "aggs": {
        "articles_over_time": {
          "date_histogram": {
            "field": "datetime",
            "interval": "1d"
          }
        }
      }
    }
  }
}'
result:

{
  "took" : 26,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "failed" : 0
  },
  "hits" : {
    "total" : 89419,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "winners" : {
      "buckets" : [ {
        "key" : "videotitle",
        "doc_count" : 3539,
        "articles_over_time" : {
          "buckets" : [ {
            "key_as_string" : "2014-07-26T00:00:00.000Z",
            "key" : 1406332800000,
            "doc_count" : 2820
          }, {
            "key_as_string" : "2014-07-27T00:00:00.000Z",
            "key" : 1406419200000,
            "doc_count" : 719
          } ]
        }
      }, {

But this does not (note the size limit of 1):
curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d '{
  "size": 0,
  "aggs": {
    "winners": {
      "terms": {
        "field": "tit",
        "size": 1
      },
      "aggs": {
        "articles_over_time": {
          "date_histogram": {
            "field": "datetime",
            "interval": "1d"
          }
        }
      }
    }
  }
}'
result:

{
  "took" : 17,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "failed" : 0
  },
  "hits" : {
    "total" : 89419,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "winners" : {
      "buckets" : [ {
        "key" : "videotitle",
        "doc_count" : 2820,
        "articles_over_time" : {
          "buckets" : [ {
            "key_as_string" : "2014-07-26T00:00:00.000Z",
            "key" : 1406332800000,
            "doc_count" : 2820
          } ]
        }
      } ]
    }
  }
}

This seems related to the problem with my original query, because it always
seems to query one index but not the other.

My original query, as I ran it in elasticsearch-head:
/aggtest/video/

{
  "size": 0,
  "aggs": {
    "winners": {
      "terms": {
        "field": "tit",
        "order": {
          "diff": "desc"
        }
      },
      "aggs": {
        "articles_over_time": {
          "date_histogram": {
            "field": "datetime",
            "interval": "1d"
          }
        },
        "diff": {
          "sum": {
            "script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
            "lang": "groovy"
          }
        }
      }
    }
  }
}
and the result:

{
  key: videotitle
  doc_count: 719
  articles_over_time: {
    buckets: [
      {
        key_as_string: 2014-07-27T00:00:00.000Z
        key: 1406419200000
        doc_count: 719
      }
    ]
  }
  diff: {
    value: 719
  }
}

Thanks,
Valentin



(Colin Goodheart-Smithe) #6

Firstly, I think the reason you are only getting results from one index
when you ask for a size of 1 in your terms aggregation is that you are
asking for the top 1 bucket from each shard on each index. These are then
merged together and only the top bucket is kept. If the top bucket is not
the same on all indices, then you will not get results from all indices.
Setting the shard_size parameter to something like 10 can help with this
(see http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_document_counts_are_approximate
for more information on this).

Second, I wonder if the reason you are getting the error from your script
is that you don't have a 'datetime' value for all of your documents in some
of your indices?
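
The first point above (each shard returning only its own top buckets before the merge) can be modelled roughly as follows. This is a simplified Python sketch, not Elasticsearch's actual implementation: `terms_agg` is a hypothetical helper, each index is treated as a single shard, and `shard_size` is assumed to default to `size`. The counts 2820 and 719 for "videotitle" are the per-day figures reported in this thread; "other title" and its counts are made up for illustration.

```python
from collections import Counter


def terms_agg(shards, size, shard_size=None):
    """Simplified model of a terms aggregation ordered by doc_count.

    Each shard returns only its own top `shard_size` terms; the coordinating
    node sums what it received and keeps the top `size`.
    """
    shard_size = size if shard_size is None else shard_size
    merged = Counter()
    for counts in shards:
        for term, count in Counter(counts).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


shards = [
    {"videotitle": 2820, "other title": 500},  # stands in for live-2014-07-26
    {"videotitle": 719, "other title": 900},   # stands in for live-2014-07-27
]

print(terms_agg(shards, size=1))                 # [('videotitle', 2820)]
print(terms_agg(shards, size=1, shard_size=10))  # [('videotitle', 3539)]
```

With `size` 1, the second shard only reports its own top term, so the 719 documents for "videotitle" never reach the merge step. With a larger `shard_size` both shards report the term and the counts add up to 3539.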

Regards,

Colin



(Valentin Pletzer) #7

Hi Colin,

I could figure out the shard_size problem thanks to your help.

For the 'datetime' error: I checked, and the field exists in all the indices.
It has the correct mappings and therefore probably could not have wrong
values, I guess. And using the elasticsearch-head plugin I don't get the
error but a wrong result, which really seems strange.

Thanks
Valentin



(Valentin Pletzer) #8

OK, I think I found the problem. As soon as I try to sort on the script
value, it ceases to work.

This works, but is unsorted:
{
  "size": 0,
  "aggs": {
    "winners": {
      "terms": {
        "field": "tit",
        "size": 10,
        "shard_size": 4
      },
      "aggs": {
        "articles_over_time": {
          "date_histogram": {
            "field": "datetime",
            "interval": "1d"
          }
        },
        "diff": {
          "sum": {
            "script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
            "lang": "groovy"
          }
        }
      }
    }
  }
}

This does not work:
{
  "size": 0,
  "aggs": {
    "winners": {
      "terms": {
        "field": "tit",
        "size": 10,
        "order": {
          "diff": "desc"
        },
        "shard_size": 4
      },
      "aggs": {
        "articles_over_time": {
          "date_histogram": {
            "field": "datetime",
            "interval": "1d"
          }
        },
        "diff": {
          "sum": {
            "script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
            "lang": "groovy"
          }
        }
      }
    }
  }
}
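
A possible reading of why sorting by `diff` changes the result, offered as an assumption rather than a confirmed diagnosis: when buckets are ordered by a sub-aggregation, each shard ranks its buckets by its own local `diff`. On the older index nearly every document scores -1, so a popular term has a large negative local sum and ranks last there, and that shard may not report it at all. The Python sketch below models this with one shard per index; `top_by_diff`, the "rare title" term and its counts are hypothetical, and it assumes every document in the older index falls before the cutoff.

```python
def top_by_diff(shards, size, shard_size):
    """Simplified model of a terms aggregation ordered by a `diff` sub-aggregation (desc)."""
    merged = {}
    for buckets in shards:
        # Each shard ranks its own buckets by local diff and reports only the best ones.
        local_top = sorted(
            buckets.items(), key=lambda kv: kv[1]["diff"], reverse=True
        )[:shard_size]
        for term, bucket in local_top:
            total = merged.setdefault(term, {"doc_count": 0, "diff": 0})
            total["doc_count"] += bucket["doc_count"]
            total["diff"] += bucket["diff"]
    ranked = sorted(merged.items(), key=lambda kv: kv[1]["diff"], reverse=True)
    return ranked[:size]


shards = [
    {   # stands in for live-2014-07-26: every document scores -1
        "videotitle": {"doc_count": 2820, "diff": -2820},
        "rare title": {"doc_count": 3, "diff": -3},
    },
    {   # stands in for live-2014-07-27: every document scores +1
        "videotitle": {"doc_count": 719, "diff": 719},
        "rare title": {"doc_count": 2, "diff": 2},
    },
]

print(top_by_diff(shards, size=1, shard_size=1))
# [('videotitle', {'doc_count': 719, 'diff': 719})]
```

In this model the older shard reports only "rare title", so the merged "videotitle" bucket carries just the newer day's 719 documents and a `diff` of 719, the same figures as in the elasticsearch-head result posted earlier in the thread. If every shard reported every term, the merged `diff` for "videotitle" would be 719 - 2820 = -2101.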


(Colin Goodheart-Smithe) #9

Would you be able to re-run your query and post the stack trace from the
Elasticsearch server logs? This might help to work out what's going on.

Thanks

Colin


(Colin Goodheart-Smithe) #10

Also, your shard_size parameter should always be greater than the size
parameter. So if you are asking for a size of 10, I would try setting
shard_size to 20 or 30.
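
To illustrate why shard_size matters, here is a rough Python sketch of the merge step: each shard reports only its own top shard_size terms, and the coordinating node then keeps the top size terms of the merged counts. This is a simplified model, not Elasticsearch's actual implementation. Only the "videotitle" counts (2820 and 719) come from this thread; the function name, the two-shard layout and "othertitle" are made up for the example.

```python
from collections import Counter


def terms_agg(shards, size, shard_size):
    """Merge per-shard top `shard_size` term counts, then keep the top `size`."""
    merged = Counter()
    for counts in shards:
        # Each shard only reports its own top `shard_size` terms.
        for term, count in Counter(counts).most_common(shard_size):
            merged[term] += count
    return merged.most_common(size)


# Hypothetical per-shard term counts, one shard per daily index.
shards = [
    {"videotitle": 2820, "othertitle": 100},
    {"othertitle": 900, "videotitle": 719},
]

print(terms_agg(shards, size=1, shard_size=1))   # [('videotitle', 2820)]
print(terms_agg(shards, size=1, shard_size=10))  # [('videotitle', 3539)]
```

With shard_size equal to 1, the second shard never reports "videotitle", so its 719 documents are missing from the merged bucket. With a larger shard_size both shards contribute and the count adds up to 3539.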


(Valentin Pletzer) #11

Hi Colin,

I tried increasing it up to 40, but nothing changes. I would post the stack
trace, but I don't know how to find it.

Thanks
Valentin

On Wednesday, July 30, 2014 10:24:09 AM UTC+2, Colin Goodheart-Smithe wrote:

Also, your shard_size parameter should always be greater than the size
parameter. So if you are asking for size of 10 then I would try setting
shard_size to 20 or 30.

On Wednesday, 30 July 2014 09:22:16 UTC+1, Colin Goodheart-Smithe wrote:

Would you be able to re-run your query and post the stack trace from the
Elasticsearch server logs. This might help to work out whats going on.

Thanks

Colin

On Tuesday, 29 July 2014 12:29:00 UTC+1, Valentin wrote:

Ok. I think I found the problem. As soon as I try to sort on the script
value it ceases to work

works, but unsorted
{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"size": 10,
"shard_size": 4
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
},
"diff": {
"sum": {
"script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
"lang": "groovy"
}
}
}
}
}
}

does not work:
{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"size": 10,
"order": {
"diff": "desc"
},
"shard_size": 4
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
},
"diff": {
"sum": {
"script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
"lang": "groovy"
}
}
}
}
}
}

On Tuesday, July 29, 2014 12:40:15 PM UTC+2, Valentin wrote:

Hi Colin,

I could figure out the shard_size problem thanks to your help.

For the 'datetime' error: I checked and it exists in all the indices.
It has the correct mappings and the therefor probably could not have wrong
values I guess. And using the elasticsearch-head plugin I dont get the
error but a wrong result which really seems strange.

Thanks
Valentin

On Tuesday, July 29, 2014 11:54:08 AM UTC+2, Colin Goodheart-Smithe
wrote:

Firstly, I think the reason you are only getting results from one
index when you are asking for a size of 1 in your terms aggregation is
because you are asking for the top 1 bucket from each shard on each index.
This will then be merged together and only the top bucket will be kept.
If the top bucket is not the same on all indexes then you will not get
results from all indices. Setting the shard_size parameter to something
like 10 can help with this (see
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_document_counts_are_approximate
for more information on this)

Second, I wonder if the reason you are getting the error from your
script is that you don't have a 'datetime' value for all of your documents
in some of your indices?

Regards,

Colin

On Monday, 28 July 2014 16:04:55 UTC+1, Valentin wrote:

Hi Colin,

now it gets really strange. First my alias
curl 'http://localhost:9200/_alias?pretty'
{
"live-2014-07-27" : {

"aliases" : { 

  "aggtest" : { } 

} 

},

"live-2014-07-26" : {

"aliases" : { 

  "aggtest" : { } 

} 

}

}

I tried two different queries:
curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true'
-d '{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"order": {
"diff": "desc"
},
"size": 1
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
},
"diff": {
"sum": {
"script": "(doc['datetime'].value < 1406412000000) ? -1 :
1",
"lang": "groovy"
}
}
}
}
}
}'

and

curl -XPOST '
http://localhost:9200/live-2014-07-26,live-2014-07-27/video/_search?pretty=true'
.....

both do give me a result (but a wrong one) when I do query using
elasticsearch-head but result in an error if I use the commandline

{

"error" : "SearchPhaseExecutionException[Failed to execute phase
[query], all shards failed; shardFailures
{[_MxuihP3TfmZV4FYUQaRQQ][live-2014-07-26][1]:
QueryPhaseExecutionException[[live-2014-07-26][1]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script126]];
}{[FYhB58m7T1W3HjhzUmtzww][live-2014-07-27][0]:
RemoteTransportException[[live02][inet[/10.XXX.XX.XX:9300]][search/phase/query]];
nested: QueryPhaseExecutionException[[live-2014-07-27][0]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script119]];
}{[_MxuihP3TfmZV4FYUQaRQQ][live-2014-07-27][1]:
QueryPhaseExecutionException[[live-2014-07-27][1]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script126]];
}{[FYhB58m7T1W3HjhzUmtzww][live-2014-07-26][0]:
RemoteTransportException[[live02][inet[/10.XXX.XX.XX:9300]][search/phase/query]];
nested: QueryPhaseExecutionException[[live-2014-07-26][0]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script119]]; }]",

"status" : 500

}

But I noticed something strange. This works:
curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true'
-d '{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit"
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
}
}
}
}
}'
result:

{
  "took" : 26,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "failed" : 0
  },
  "hits" : {
    "total" : 89419,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "winners" : {
      "buckets" : [ {
        "key" : "videotitle",
        "doc_count" : 3539,
        "articles_over_time" : {
          "buckets" : [ {
            "key_as_string" : "2014-07-26T00:00:00.000Z",
            "key" : 1406332800000,
            "doc_count" : 2820
          }, {
            "key_as_string" : "2014-07-27T00:00:00.000Z",
            "key" : 1406419200000,
            "doc_count" : 719
          } ]
        }
      }, {

But this does not: (notice the size-limit to 1)
curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d '{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"size": 1
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
}
}
}
}
}'
result:

{
  "took" : 17,
  "timed_out" : false,
  "_shards" : {
    "total" : 4,
    "successful" : 4,
    "failed" : 0
  },
  "hits" : {
    "total" : 89419,
    "max_score" : 0.0,
    "hits" : [ ]
  },
  "aggregations" : {
    "winners" : {
      "buckets" : [ {
        "key" : "videotitle",
        "doc_count" : 2820,
        "articles_over_time" : {
          "buckets" : [ {
            "key_as_string" : "2014-07-26T00:00:00.000Z",
            "key" : 1406332800000,
            "doc_count" : 2820
          } ]
        }
      } ]
    }
  }
}

This seems to be related to the problem with my original query, because it
always seems to query one index but not the other.

my original query I used in elasticsearch-head:
/aggtest/video/

{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"order": {
"diff": "desc"
}
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
},
"diff": {
"sum": {
"script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
"lang": "groovy"
}
}
}
}
}
}
and the result:

{
  "key": "videotitle",
  "doc_count": 719,
  "articles_over_time": {
    "buckets": [
      {
        "key_as_string": "2014-07-27T00:00:00.000Z",
        "key": 1406419200000,
        "doc_count": 719
      }
    ]
  },
  "diff": {
    "value": 719
  }
}
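
As a sanity check on the numbers above, here is a minimal Python sketch of what the Groovy script computes per document. It is not part of the original query; the function name `score` and the two sample timestamps are assumptions chosen for illustration (they are the daily bucket keys from the results above).

```python
from datetime import datetime, timezone

THRESHOLD_MS = 1406412000000  # cutoff used in the script above


def score(datetime_ms):
    """Mirror of the Groovy expression: -1 before the cutoff, +1 otherwise."""
    return -1 if datetime_ms < THRESHOLD_MS else 1


# The cutoff expressed as a UTC timestamp.
threshold_utc = datetime.fromtimestamp(THRESHOLD_MS / 1000, tz=timezone.utc)
print(threshold_utc.isoformat())  # 2014-07-26T22:00:00+00:00

# Hypothetical documents: one at the start of each daily bucket shown above.
doc_times = [1406332800000, 1406419200000]
print([score(t) for t in doc_times])  # [-1, 1]
```

Since the bucket reports a diff of 719 with a doc_count of 719, every document that was counted scored +1, which is consistent with only the 2014-07-27 data reaching the final bucket.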

Thanks,
Valentin

On Monday, July 28, 2014 3:42:46 PM UTC+2, Colin Goodheart-Smithe
wrote:

How are you searching over the multiple indexes? Are you using
aliases? It would be helpful if you could post your alias configuration
(see [1]) and a cURL example of a search request that fails.

[1]
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/indices-aliases.html#alias-retrieving

Thanks

Colin

On Monday, 28 July 2014 14:00:55 UTC+1, Valentin wrote:

Hi Colin,

thanks for checking. I could successfully reproduce your example,
and I even split it into 2 indices and it worked (Elasticsearch 1.3.0).
But as soon as I try it with my data it doesn't work. I ran some additional
tests, and it works if I only use the current index (day) and split it in
half. But as soon as I try to compare yesterday and the day before, it only
seems to get the data from one day but not the other.

Cheers,
Valentin

On Monday, July 28, 2014 10:07:43 AM UTC+2, Colin Goodheart-Smithe
wrote:

Hi,

I ran the commands in the following gist, on master, without
error. Would you be able to post the error you get and a similar
reproducible example to help diagnose the issue you are running into? Also,
which version of Elasticsearch are you running?

https://gist.github.com/colings86/46fbb0b22c2f3c4348ae

Thanks

Colin

On Sunday, 27 July 2014 17:53:29 UTC+1, Valentin wrote:

Hi,

I am trying to use this aggregation which does not work:
"aggs": {
"winners": {
"terms": {
"field": "urls",
"order": {
"diff": "desc"
}
},
"aggs": {
"diff": {
"sum": {
"script": "(doc['datetime'].date.getMillis() < 1406332800000) ? -1 : 1",
"lang": "groovy"
}
}
}
}
}

Can anyone help?

Cheers,
Valentin



(Colin Goodheart-Smithe) #12

The Elasticsearch log files can be found in the logs directory of your
node's Elasticsearch directory. If you re-create the error and have a look
at the end of the log file you should see the stack trace.

Colin

On Wednesday, 30 July 2014 10:53:05 UTC+1, Valentin wrote:

Hi Colin,

I tried increasing it up to 40 but nothing changes. I would post the stack
trace but I don't know how to find it.

Thanks
Valentin

On Wednesday, July 30, 2014 10:24:09 AM UTC+2, Colin Goodheart-Smithe
wrote:

Also, your shard_size parameter should always be greater than the size
parameter. So if you are asking for size of 10 then I would try setting
shard_size to 20 or 30.
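
The rule of thumb above can be sketched as a small Python helper that builds the terms aggregation body. This is only an illustration: `terms_agg` and `shard_size_factor` are hypothetical names for this sketch, not Elasticsearch parameters, and the factor of 3 simply reproduces the "size 10, shard_size 30" suggestion.

```python
def terms_agg(field, size, shard_size_factor=3):
    """Build a terms aggregation whose shard_size is a multiple of size.

    shard_size_factor is a hypothetical knob for this sketch, not an
    Elasticsearch parameter.
    """
    if shard_size_factor <= 1:
        raise ValueError("shard_size must be greater than size")
    return {
        "terms": {
            "field": field,
            "size": size,
            "shard_size": size * shard_size_factor,
        }
    }


agg = terms_agg("tit", size=10)
print(agg["terms"]["shard_size"])  # 30
```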

On Wednesday, 30 July 2014 09:22:16 UTC+1, Colin Goodheart-Smithe wrote:

Would you be able to re-run your query and post the stack trace from the
Elasticsearch server logs? This might help to work out what's going on.

Thanks

Colin

On Tuesday, 29 July 2014 12:29:00 UTC+1, Valentin wrote:

OK, I think I found the problem. As soon as I try to sort on the script
value it ceases to work.

works, but unsorted
{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"size": 10,
"shard_size": 4
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
},
"diff": {
"sum": {
"script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
"lang": "groovy"
}
}
}
}
}
}

does not work:
{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"size": 10,
"order": {
"diff": "desc"
},
"shard_size": 4
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
},
"diff": {
"sum": {
"script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
"lang": "groovy"
}
}
}
}
}
}

On Tuesday, July 29, 2014 12:40:15 PM UTC+2, Valentin wrote:

Hi Colin,

I could figure out the shard_size problem thanks to your help.

For the 'datetime' error: I checked and the field exists in all the indices.
It has the correct mappings and therefore probably could not have wrong
values, I guess. And using the elasticsearch-head plugin I don't get the
error but a wrong result, which really seems strange.

Thanks
Valentin

On Tuesday, July 29, 2014 11:54:08 AM UTC+2, Colin Goodheart-Smithe
wrote:

Firstly, I think the reason you are only getting results from one
index when you ask for a size of 1 in your terms aggregation is
that you are asking for the top 1 bucket from each shard of each index.
These are then merged together and only the top bucket is kept.
If the top bucket is not the same on all indices then you will not get
results from all indices. Setting the shard_size parameter to something
like 10 can help with this (see
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_document_counts_are_approximate
for more information on this).
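
The merge described above can be imitated with a toy Python model. This is a simplified sketch, not Elasticsearch's actual implementation: each "shard" is just a dictionary of term counts, and the counts for the term "other" are made up for illustration. Only the 2820 and 719 figures come from the results in this thread.

```python
from collections import Counter


def shard_top(counts, shard_size):
    """Top `shard_size` (term, count) pairs of one shard, by count descending."""
    return Counter(counts).most_common(shard_size)


def merge(shards, size, shard_size):
    """Sum the per-shard top lists, then keep the overall top `size` terms."""
    merged = Counter()
    for counts in shards:
        for term, n in shard_top(counts, shard_size):
            merged[term] += n
    return merged.most_common(size)


# Hypothetical data: "videotitle" is the top term on one shard only.
shards = [
    {"videotitle": 2820, "other": 100},  # e.g. a shard of the older index
    {"other": 900, "videotitle": 719},   # e.g. a shard of the newer index
]

print(merge(shards, size=1, shard_size=1))   # [('videotitle', 2820)]
print(merge(shards, size=1, shard_size=10))  # [('videotitle', 3539)]
```

With a shard_size of 1 the second shard never reports "videotitle" at all, so its 719 documents are missing from the merged bucket. A larger shard_size lets both shards contribute.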

Second, I wonder if the reason you are getting the error from your
script is that you don't have a 'datetime' value for all of your documents
in some of your indices?

Regards,

Colin

On Monday, 28 July 2014 16:04:55 UTC+1, Valentin wrote:

Hi Colin,

now it gets really strange. First my alias
curl 'http://localhost:9200/_alias?pretty'
{
  "live-2014-07-27" : {
    "aliases" : {
      "aggtest" : { }
    }
  },
  "live-2014-07-26" : {
    "aliases" : {
      "aggtest" : { }
    }
  }
}

I tried two different queries:
curl -XPOST 'http://localhost:9200/aggtest/video/_search?pretty=true' -d '{
"size": 0,
"aggs": {
"winners": {
"terms": {
"field": "tit",
"order": {
"diff": "desc"
},
"size": 1
},
"aggs": {
"articles_over_time": {
"date_histogram": {
"field": "datetime",
"interval": "1d"
}
},
"diff": {
"sum": {
"script": "(doc['datetime'].value < 1406412000000) ? -1 : 1",
"lang": "groovy"
}
}
}
}
}
}'

and

curl -XPOST 'http://localhost:9200/live-2014-07-26,live-2014-07-27/video/_search?pretty=true'
.....

Both give me a result (but a wrong one) when I query using
elasticsearch-head, but result in an error if I use the command line:

{

"error" : "SearchPhaseExecutionException[Failed to execute phase
[query], all shards failed; shardFailures
{[_MxuihP3TfmZV4FYUQaRQQ][live-2014-07-26][1]:
QueryPhaseExecutionException[[live-2014-07-26][1]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script126]];
}{[FYhB58m7T1W3HjhzUmtzww][live-2014-07-27][0]:
RemoteTransportException[[live02][inet[/10.XXX.XX.XX:9300]][search/phase/query]];
nested: QueryPhaseExecutionException[[live-2014-07-27][0]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script119]];
}{[_MxuihP3TfmZV4FYUQaRQQ][live-2014-07-27][1]:
QueryPhaseExecutionException[[live-2014-07-27][1]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script126]];
}{[FYhB58m7T1W3HjhzUmtzww][live-2014-07-26][0]:
RemoteTransportException[[live02][inet[/10.XXX.XX.XX:9300]][search/phase/query]];
nested: QueryPhaseExecutionException[[live-2014-07-26][0]:
query[ConstantScore(cache(_type:video))],from[0],size[0]: Query Failed
[Failed to execute main query]]; nested:
GroovyScriptExecutionException[MissingPropertyException[No such property:
datetime for class: Script119]]; }]",

"status" : 500

}
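
The thread never confirms what causes this MissingPropertyException. One explanation consistent with the symptoms (the query runs in elasticsearch-head but fails from the command line) is shell quoting: the request body is wrapped in single quotes, and the script itself contains single quotes around 'datetime', so the shell would strip them before curl sends the body. The following Python sketch uses `shlex` to imitate POSIX shell word splitting on a shortened version of the command; the shortened body is an assumption for illustration.

```python
import shlex

# The request as typed on the command line (shortened to the script field).
typed = """curl -XPOST 'http://localhost:9200/aggtest/video/_search' -d '{"script": "doc['datetime'].value"}'"""

argv = shlex.split(typed)  # POSIX-style word splitting and quote removal
body_seen_by_curl = argv[-1]
print(body_seen_by_curl)   # {"script": "doc[datetime].value"}

# One way to keep the quotes: let shlex.quote build the shell word.
body = '{"script": "doc[\'datetime\'].value"}'
safe_word = shlex.quote(body)
print(shlex.split(safe_word) == [body])  # True
```

If this is what happened, Groovy received `doc[datetime]` and looked for a variable named `datetime`, which would match the "No such property: datetime" message. Putting the body in a file and passing it with `curl -d @query.json` would sidestep the quoting issue.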



(Valentin Pletzer) #13

Hi Colin,

I have now solved the problem thanks to your advice. I set "shard_size"
to 0 (max) and now it works. I still don't understand it 100%.

Thanks for your patience.
Valentin
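
For anyone puzzled by the same behaviour, here is a toy Python model of why a small shard_size can distort a terms aggregation that is ordered by a sub-aggregation such as "diff". It is a simplified sketch, not Elasticsearch's implementation: each shard is a dictionary of per-term diff sums, and all values except 2820 and 719 are made up. It assumes that each shard ranks its terms by its own local diff and returns only the top shard_size of them.

```python
def merge_by_metric(shards, size, shard_size=None):
    """Merge per-shard diff sums, keeping only each shard's top entries.

    shard_size=None stands in for "no limit" (what the thread achieves
    by setting shard_size to 0).
    """
    merged = {}
    for diffs in shards:
        ranked = sorted(diffs.items(), key=lambda kv: kv[1], reverse=True)
        kept = ranked if shard_size is None else ranked[:shard_size]
        for term, d in kept:
            merged[term] = merged.get(term, 0) + d
    return sorted(merged.items(), key=lambda kv: kv[1], reverse=True)[:size]


# Hypothetical per-shard sums of the -1/+1 script values.
shards = [
    {"videotitle": -2820, "a": -5, "b": -3},  # older index: documents score -1
    {"videotitle": 719, "a": 4, "b": 1},      # newer index: documents score +1
]

truncated = merge_by_metric(shards, size=10, shard_size=2)
exact = merge_by_metric(shards, size=10, shard_size=None)
print(dict(truncated)["videotitle"])  # 719
print(dict(exact)["videotitle"])      # -2101
```

On the older shard "videotitle" has the lowest local diff, so with a small shard_size it is cut from that shard's list and only the +719 from the newer shard survives the merge. With no limit, every shard reports every term and the merged sum is exact.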



(system) #14