Hello Team,
We are using Elasticsearch version 7.8.0.
We have an index with a default ingest pipeline defined.
Our problem is that data is getting swapped between two fields in Elasticsearch.
The value of "OBJ_NAM_FILDT" is stored in "USER_TAGS_AK" and vice versa. This happens only for a few records, not for all the records being pushed to Elasticsearch.
Could you please let us know whether such behaviour is known in Elasticsearch when a pipeline with a "split" processor is defined on a field?
Field definition:
"OBJ_NAM_FILDT" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  }
},
"USER_TAGS_AK" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "keyword",
      "ignore_above" : 256
    }
  },
  "analyzer" : "autocomplete"
},
Index pipeline:
"tf_usr_tags_smry_array_tags" : {
  "processors" : [
    {
      "split" : {
        "if" : "ctx.USER_TAGS_AK != null ",
        "field" : "USER_TAGS_AK",
        "separator" : ",",
        "ignore_failure" : true
      }
    }
  ]
}
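To help narrow this down, here is a minimal Python sketch of how a suspect document could be checked against the pipeline. It only approximates what we understand the "split" processor to do (turn the comma-separated "USER_TAGS_AK" string into an array and leave every other field alone), and it builds a request body for the ingest simulate API so the same document can be run through the real pipeline. The sample values and the helper names `emulate_split` and `build_simulate_request` are made up for illustration; they are not part of our data or of any Elasticsearch client.

```python
import json

PIPELINE_ID = "tf_usr_tags_smry_array_tags"


def emulate_split(doc, field="USER_TAGS_AK", separator=","):
    """Rough local approximation of the split processor with its `if` guard.

    Assumption: the separator is treated as a literal string here, while the
    real processor treats it as a regular expression (equivalent for ",").
    """
    out = dict(doc)  # copy so the input document is left unchanged
    value = out.get(field)
    if isinstance(value, str):  # skipped when the field is missing or null
        out[field] = value.split(separator)
    return out


def build_simulate_request(docs):
    """Return (path, body) for POST /_ingest/pipeline/<id>/_simulate."""
    path = f"/_ingest/pipeline/{PIPELINE_ID}/_simulate"
    body = {"docs": [{"_source": d} for d in docs]}
    return path, body


# Hypothetical sample document; real values would come from an affected record.
sample = {"OBJ_NAM_FILDT": "report_2021.pdf", "USER_TAGS_AK": "finance,quarterly"}

result = emulate_split(sample)
path, body = build_simulate_request([sample])

print(json.dumps(result))
print("POST", path)
print(json.dumps(body, indent=2))
```

If the simulate response for an affected record shows both fields in their expected places, the swap is probably happening before the document reaches Elasticsearch, for example in the client code that builds the documents, rather than in the pipeline.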
We also have a tokenizer defined as below in our index settings:
"settings" : {
  "index" : {
    "default_pipeline" : "tf_usr_tags_smry_array_tags",
    "creation_date" : "1626846772008",
    "analysis" : {
      "analyzer" : {
        "autocomplete" : {
          "filter" : [
            "lowercase"
          ],
          "tokenizer" : "alz_tkn_tej_usr_tags_sum_9_a3"
        }
      },
      "tokenizer" : {
        "alz_tkn_tej_usr_tags_sum_9_a3" : {
          "punctuation" : {
            "pattern" : "[-]",
            "type" : "pattern"
          },
          "token_chars" : [ ],
          "min_gram" : "1",
          "side" : "front",
          "type" : "edge_ngram",
          "max_gram" : "10"
        }
      }
    }