Date field type problem in reindexing

I am trying to merge multiple indices that have the same fields in Elasticsearch, using the _reindex API as follows:
POST _reindex
{
"source": {
"index": ["a","b"]

},
"dest": {
"index": "ab"
}
}

My problem is the date field. In the "ab" index the date field becomes a string, and I cannot understand why. I tried adding the following script to the _reindex request, and I also tried creating the "ab" index with an explicit mapping before running _reindex, but neither resolved the problem.

"script": {
"source": "ctx._source.date = new SimpleDateFormat('yyyy-MM-dd HH:mm:ss').parse(ctx._source.date);"
}

PUT /data
{
"settings": {
"number_of_shards": 1,
"number_of_replicas": 1
},
"mappings": {
"properties" : {
"cashtags" : {
"type" : "keyword"
},
"conversation_id" : {
"type" : "long"
},
"created_at" : {
"type" : "long"
},
"date" : {
"type": "date",
"format" : "yyyy-MM-dd HH:mm:ss"
}
}
}
}


Would you be able to share some example values of the date field?

They are tweets that were downloaded with the twint library in Python.
Here is an example:

id: 1,203,786,367,391,674,368
conversation_id: 1,203,786,367,391,674,368
created_at: 1,575,840,009,000
date: Dec 8, 2019 @ 23:20:09.000
timezone: CET
place:
tweet: واقعا چجوری روتون میشه از انتخابات و رای دادن حرف بزنید، اونم زمانی که صندوق رای جمهوری اسلامی بوی خون گرفته . . . #انتخابات #انتخابات_مجلس #تحریم_خبری_انتخابات #آبان98 #آبان_خونین #پویا_بختیاری
hashtags: #انتخابات, #انتخابات_مجلس, #تحریم_خبری_انتخابات, #آبان98, #آبان_خونین, #پویا_بختیاری
cashtags:
user_id_str: 844586778942230529
username: PHNTOMCAT
name: مرد تنهای شب
day: 7
hour: 22
link: https://twitter.com/PHNTOMCAT/status/1203786367391674370
retweet: false
essid:
nlikes: 3
nreplies: 1
nretweets: 0
quote_url:
video: 0
search: رای دادن مجلس
near:
reply_to: { "user_id": "844586778942230529", "username": "PHNTOMCAT" }
id: 1203786367391674370_raw

_type: _doc
_index: votingparliament
_score: 1

From what I can see, you have a custom date format on that field. Can you show the mapping for that field on indices "a" and "b"?
You should either use the same mapping on your "ab" index or use a script that parses the value with the pattern "MMM d, uuuu @ HH:mm:ss.SSS" (HH rather than hh, because the sample value uses a 24-hour clock).
I assumed you are using ES version 7+. If not, use yyyy instead of uuuu.
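To make the pattern suggestion concrete, here is a minimal sketch in Python rather than Painless, so it only illustrates the idea and is not the script you would put in _reindex. It parses the sample display value and rewrites it in the yyyy-MM-dd HH:mm:ss form. The function and constant names are made up for this example, and the strptime directives are only rough analogues of the Java-style pattern letters. It also assumes an English locale for the month abbreviation.

```python
from datetime import datetime

# Rough strptime analogue of "MMM d, uuuu @ HH:mm:ss.SSS" (assumption: English month names).
DISPLAY_FORMAT = "%b %d, %Y @ %H:%M:%S.%f"
# Rough strftime analogue of the mapping format "yyyy-MM-dd HH:mm:ss".
MAPPING_FORMAT = "%Y-%m-%d %H:%M:%S"


def display_to_mapping_format(value):
    """Parse a display-style date string and re-emit it in the mapping format."""
    parsed = datetime.strptime(value, DISPLAY_FORMAT)
    return parsed.strftime(MAPPING_FORMAT)


# Sample value copied from the document posted above.
converted = display_to_mapping_format("Dec 8, 2019 @ 23:20:09.000")
print(converted)
```

The 24-hour directive matters here: a 12-hour directive cannot parse an hour of 23.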

Yes, I am using ES 7.4.2.
Since "a" and "b" have exactly the same mapping, I am only posting the mapping of "a" here. I got it with
GET /a/_mapping:
{
"a" : {
"mappings" : {
"properties" : {
"cashtags" : {
"type" : "keyword",
"normalizer" : "hashtag_normalizer"
},
"conversation_id" : {
"type" : "long"
},
"created_at" : {
"type" : "long"
},
"date" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
"day" : {
"type" : "integer"
},
"essid" : {
"type" : "keyword"
},
"geo_near" : {
"type" : "geo_point"
},
"geo_tweet" : {
"type" : "geo_point"
},
"hashtags" : {
"type" : "keyword",
"normalizer" : "hashtag_normalizer"
},
"hour" : {
"type" : "integer"
},
"id" : {
"type" : "long"
},
"link" : {
"type" : "text"
},
"location" : {
"type" : "keyword"
},
"mentions" : {
"type" : "keyword",
"normalizer" : "hashtag_normalizer"
},
"name" : {
"type" : "text"
},
"near" : {
"type" : "text"
},
"nlikes" : {
"type" : "integer"
},
"nreplies" : {
"type" : "integer"
},
"nretweets" : {
"type" : "integer"
},
"photos" : {
"type" : "text"
},
"place" : {
"type" : "keyword"
},
"profile_image_url" : {
"type" : "text"
},
"quote_url" : {
"type" : "text"
},
"reply_to" : {
"type" : "nested",
"properties" : {
"user_id" : {
"type" : "keyword"
},
"username" : {
"type" : "keyword"
}
}
},
"retweet" : {
"type" : "text"
},
"retweet_date" : {
"type" : "date",
"format" : "yyyy-MM-dd HH:mm:ss"
},
"retweet_id" : {
"type" : "keyword"
},
"search" : {
"type" : "text"
},
"source" : {
"type" : "keyword"
},
"timezone" : {
"type" : "keyword"
},
"trans_dest" : {
"type" : "keyword"
},
"trans_src" : {
"type" : "keyword"
},
"translate" : {
"type" : "text"
},
"tweet" : {
"type" : "text"
},
"urls" : {
"type" : "keyword"
},
"user_id_str" : {
"type" : "keyword"
},
"user_rt" : {
"type" : "keyword"
},
"user_rt_id" : {
"type" : "keyword"
},
"username" : {
"type" : "keyword",
"normalizer" : "hashtag_normalizer"
},
"video" : {
"type" : "integer"
}
}
}
}
}

Finally, I managed to solve this silly problem just by removing the saved object "ab" in Kibana. There is no need for a "script" in _reindex; creating the index first with PUT /ab {..."mappings":{...}} is enough.
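For anyone hitting the same thing, one way to tell whether the wrong type lives in Elasticsearch or only in Kibana is to look at the field type in the GET /ab/_mapping response. The sketch below is a hypothetical helper in Python. It assumes the ES 7 response shape shown earlier in this thread, and the sample response is hand-written for illustration, not real output.

```python
def field_mapping(mapping_response, index, field):
    """Return the mapping dict of a top-level field, or None if the field is unmapped."""
    properties = mapping_response[index]["mappings"].get("properties", {})
    return properties.get(field)


# Hand-written stand-in for a GET /ab/_mapping response, trimmed to two fields.
sample_response = {
    "ab": {
        "mappings": {
            "properties": {
                "created_at": {"type": "long"},
                "date": {"type": "date", "format": "yyyy-MM-dd HH:mm:ss"},
            }
        }
    }
}

date_mapping = field_mapping(sample_response, "ab", "date")
is_date = date_mapping is not None and date_mapping.get("type") == "date"
print(is_date)
```

If this reports a date type while Kibana still shows a string, the stale type is probably on the Kibana side, which would be consistent with the fix described above.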

@Sina_Gholami It looks like the mapping on "a" (and "b") is the same as on the "ab" index you created. It is surprising to see a date in the format Dec 8, 2019 @ 23:20:09.000.
Maybe that is not the format in which the date was actually sent to Elasticsearch. I would expect the field to accept only yyyy-MM-dd HH:mm:ss.
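As a rough cross-check of that guess, the created_at value from the sample document can be rendered in the mapping format. This is only a sketch in Python. It assumes created_at is epoch milliseconds and treats CET as a fixed UTC+1 offset, and the helper name is made up for this example.

```python
from datetime import datetime, timedelta, timezone

CREATED_AT_MS = 1575840009000  # created_at from the sample document
CET = timezone(timedelta(hours=1), "CET")  # assumption: fixed UTC+1, no DST handling


def format_like_mapping(epoch_ms, tz):
    """Render epoch milliseconds as yyyy-MM-dd HH:mm:ss in the given timezone."""
    moment = datetime.fromtimestamp(epoch_ms // 1000, tz=tz)
    return moment.strftime("%Y-%m-%d %H:%M:%S")


cet_value = format_like_mapping(CREATED_AT_MS, CET)
utc_value = format_like_mapping(CREATED_AT_MS, timezone.utc)
print(cet_value, utc_value)
```

Under these assumptions the CET rendering has hour 22, which matches the hour field of the sample document. That would fit a stored string in yyyy-MM-dd HH:mm:ss form, with the Dec 8, 2019 @ ... value being only how it was displayed.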
