Date field sorting problem


(BBanzai) #1

Hi all, thanks in advance.

I have what appears to be a date sorting issue.

I run the following _search query:
{"fields":["publish_date"],"sort":[{"publish_date":{"order":"desc"}}],"query":{"match_all":{}}}

I get back publish_date values sorted by what looks like the day of the month rather than the full date:

{

took: 3
timed_out: false
_shards: {
    total: 5
    successful: 5
    failed: 0
}
hits: {
    total: 269
    max_score: null
    hits: [
        {
            _index: products
            _type: xxxx
            _id: 10-XXXX
            _score: null
            fields: {
                publish_date: 2011-06-28 16:05:50 -0000
            }
            sort: [
                1296230750000
            ]
        }
        {
            _index: products
            _type: xxxx
            _id: 10-YYYY
            _score: null
            fields: {
                publish_date: 2011-02-27 01:01:24 -0000
            }
            sort: [
                1296090084000
            ]
        }
        {
            _index: products
            _type: xxxx
            _id: 10-ZZZZ
            _score: null
            fields: {
                publish_date: 2011-02-27 01:00:16 -0000
            }
            sort: [
                1296090016000
            ]
        }
        {
            _index: products
            _type: xxxx
            _id: 10-AAAA
            _score: null
            fields: {
                publish_date: 2011-02-27 00:58:48 -0000
            }
            sort: [
                1296089928000
            ]
        }
        {
            _index: products
            _type: xxxx
            _id: 10-CCCC
            _score: null
            fields: {
                publish_date: 2011-07-21 22:20:35 -0000
            }
            sort: [
                1295648435000
            ]
        }
... and so on.

My mapping looks like:

publish_date: {
    format: "yyyy-mm-dd HH:mm:ss Z"
    type: "date"
}

I load the data via _bulk.

Inside the bulk import (JSON) file, the publish_date looks like:

,"publish_date":"2011-07-21 22:20:35 -0000",

What could be going wrong?


(BBanzai) #2

Here is the process to reproduce the problem:

curl -XDELETE 'http://localhost:9200/products'

curl -XPUT 'http://localhost:9200/products' -d @testmapping.json
testmapping.json:
{
"mappings": {
"mytype": {
"date_formats": [
"yyyy-mm-dd HH:mm:ss"
],
"dynamic": "true",
"properties": {
"publish_date": {
"type": "date",
"format": "yyyy-mm-dd"
},
"docid": {
"type": "string",
"index": "analyzed",
"analyzer": "exact_analyzer"
},
"title": {
"type": "string",
"analyzer": "std_stem_analyzer"
}
}
}
}
}

curl -XPOST 'http://localhost:9200/_bulk' --data-binary @bulk_1.json

bulk_1.json:

{"create" : { "_index" : "products", "_type" : "mytype", "_id" : "10-0001" }}
{"publish_date":"2011-02-27","docid":"10-0001",title:"expect at 3"}
{"create" : { "_index" : "products", "_type" : "mytype", "_id" : "10-0002" }}
{"publish_date":"2011-09-15","docid":"10-0002",title:"expect at 1"}
{"create" : { "_index" : "products", "_type" : "mytype", "_id" : "10-0003" }}
{"publish_date":"2011-06-10","docid":"10-0003",title:"expect at 2"}

curl -XGET 'http://localhost:9200/products/mytype/_search?pretty=true' -d '
{
"sort": [
{
"publish_date": {
"order": "desc"
}
}
],
"query": {
"match_all": {}
}
}
'

You should get back this:

{
"took" : 4,
"timed_out" : false,
"_shards" : {
"total" : 5,
"successful" : 5,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : null,
"hits" : [ {
"_index" : "products",
"_type" : "threatintel",
"_id" : "10-0001",
"_score" : null, "_source" :
{"publish_date":"2011-02-27","docid":"10-0001",title:"expect at 3"},
"sort" : [ 1296086520000 ]
}, {
"_index" : "products",
"_type" : "threatintel",
"_id" : "10-0002",
"_score" : null, "_source" :
{"publish_date":"2011-09-15","docid":"10-0002",title:"expect at 1"},
"sort" : [ 1295050140000 ]
}, {
"_index" : "products",
"_type" : "threatintel",
"_id" : "10-0003",
"_score" : null, "_source" :
{"publish_date":"2011-06-10","docid":"10-0003",title:"expect at 2"},
"sort" : [ 1294617960000 ]
} ]
}
}

Note: I have tried various date formats with the same result.


(ppearcy) #3

Hey,
Your date format is not defined correctly.

mm == minute
MM == month
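
The sort values in the response are consistent with this. As a rough check (a sketch in Python, not Elasticsearch code), the snippet below decodes the three sort values from the reproduction as epoch milliseconds and then re-creates them by deliberately parsing the month digits as minutes. Python's `%M` (minute) stands in for Joda's `mm` here, and the helper names are made up for illustration.

```python
from datetime import datetime, timezone

# Sort values copied from the search response in the reproduction above.
reported = {
    "2011-02-27": 1296086520000,
    "2011-09-15": 1295050140000,
    "2011-06-10": 1294617960000,
}


def millis_to_utc(ms):
    """Convert epoch milliseconds to a UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def parse_month_as_minute(text):
    """Mimic the 'yyyy-mm-dd' mistake: %M is minute in Python, like mm in Joda.

    With no month directive in the pattern, the month falls back to January.
    """
    parsed = datetime.strptime(text, "%Y-%M-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


for source, ms in reported.items():
    decoded = millis_to_utc(ms).strftime("%Y-%m-%d %H:%M:%S")
    matches = parse_month_as_minute(source) == ms
    print(source, "->", decoded, "| reproduced:", matches)
```

Each value decodes to a day in January 2011 with the original month digits sitting in the minutes field (for example, 2011-02-27 becomes 2011-01-27 00:02:00), so the sort is effectively by day of month.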

This page is linked from the date format section of the guide:
http://joda-time.sourceforge.net/api-release/org/joda/time/format/DateTimeFormat.html

Best Regards,
Paul
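
For the reproduction above, that would mean using "yyyy-MM-dd" for publish_date (and "yyyy-MM-dd HH:mm:ss" in date_formats). Since the steps already delete and recreate the index, rerunning them with the corrected pattern should be enough. As a quick sanity check of the expected order, here is a small Python sketch (not Elasticsearch itself; `to_millis` is a hypothetical helper) that parses the three dates with the month in the month slot and sorts them descending:

```python
from datetime import datetime, timezone

# Documents from bulk_1.json: id -> (publish_date, title).
docs = {
    "10-0001": ("2011-02-27", "expect at 3"),
    "10-0002": ("2011-09-15", "expect at 1"),
    "10-0003": ("2011-06-10", "expect at 2"),
}


def to_millis(text):
    """Parse year-month-day (%m here, MM in Joda) into UTC epoch milliseconds."""
    parsed = datetime.strptime(text, "%Y-%m-%d").replace(tzinfo=timezone.utc)
    return int(parsed.timestamp() * 1000)


# Sort descending by the parsed date, as the search request does.
ranked = sorted(docs, key=lambda doc_id: to_millis(docs[doc_id][0]), reverse=True)

for doc_id in ranked:
    date_text, title = docs[doc_id]
    print(doc_id, date_text, title)
```

This prints 10-0002, 10-0003, 10-0001, which matches the "expect at 1/2/3" titles in the test data.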



(BBanzai) #4

Paul, great catch! I had been staring at this for a couple of
hours. I am no stranger to data formatting, but for some reason I
just didn't see it. It looks so obvious now! Thanks for
finding it.

-Paul... too.


