Hello!
I have an Elasticsearch index where a large number of request calls from our company's API are logged. Every document has "send" and "response" objects, and the content of these objects changes depending on the API endpoint being accessed.
The issue I am facing comes from the way the engineers designed the backend systems: the same field is named inconsistently. Some API calls name the field "abc.xyz.HomeAnalysis", some name it "abc.xyz.homeanalysis", and others name it "abc.xyz.home_analysis".
Essentially, all three could be bundled as "abc.xyz.HomeAnalysis". Currently, this is leading to an explosion in the number of indexed fields (1000+). That number could be cut in half just by merging the differently named fields into one unified field.
PS: A single document in the index will contain only ONE of the naming variants listed above.
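To estimate how many fields would actually collapse, something like the following could be run over the field names taken from the index mapping. This is only a minimal sketch in Python, assuming the variants differ in nothing but letter case and underscores (which is what the examples above suggest). The names `canonical_key`, `group_variants`, and the `fields` list are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def canonical_key(name: str) -> str:
    """Collapse case and underscore differences in one path segment (assumed rule)."""
    return name.replace("_", "").lower()


def group_variants(field_paths):
    """Group dotted field paths that differ only by case or underscores."""
    groups = defaultdict(set)
    for path in field_paths:
        key = ".".join(canonical_key(part) for part in path.split("."))
        groups[key].add(path)
    return dict(groups)


# Hypothetical field list, e.g. copied by hand from the index mapping.
fields = [
    "abc.xyz.HomeAnalysis",
    "abc.xyz.homeanalysis",
    "abc.xyz.home_analysis",
    "send.actionModels.Body.Application.lastName",
]

groups = group_variants(fields)
print(len(fields), "fields ->", len(groups), "after merging")
```

Any group with more than one member is a candidate for merging into a single field.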
What is the best way to handle something like this?
One way I thought of was to define Painless scripts in the ingest pipeline that would do this for me, but that seems like a very manual process. I was hoping there is a better way to handle this in Elasticsearch.
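For reference, the renaming logic I have in mind is roughly the following. It is sketched in Python rather than Painless, purely to show the idea, and assumes that variants differ only by case and underscores. The `PREFERRED` table and the function names are hypothetical; keys that are not listed in the table are left untouched.

```python
def canonical_key(name: str) -> str:
    """Collapse case and underscore differences (assumed normalization rule)."""
    return name.replace("_", "").lower()


# Hypothetical table: normalized key -> the spelling we want to keep.
PREFERRED = {
    "hasec": "hasEc",
    "homeanalysis": "HomeAnalysis",
}


def normalize_keys(value):
    """Recursively rename known key variants to their preferred spelling."""
    if isinstance(value, dict):
        return {
            PREFERRED.get(canonical_key(key), key): normalize_keys(child)
            for key, child in value.items()
        }
    if isinstance(value, list):
        return [normalize_keys(item) for item in value]
    return value


# The three spellings from the sample documents, reduced to the relevant part.
variants = [
    {"Application": {"hasEc": "false"}},
    {"Application": {"has_Ec": "false"}},
    {"Application": {"hasec": "false"}},
]
normalized = [normalize_keys(doc) for doc in variants]
```

The manual part is maintaining the table of preferred spellings. In an ingest pipeline the same logic would presumably have to live in a script processor, which I have not written yet.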
Sample documents are below. The field I am talking about is send.actionModels.Body.Application.hasEc; it has a different name in each of the three documents (hasEc, has_Ec, hasec).
Document1
{
  "createdAt": "2023-05-08T19:24:08.3473356Z",
  "ipGeo": {
    "continent_name": "North America",
    "region_iso_code": "US-IA",
    "city_name": "Des Moines",
    "country_iso_code": "US",
    "country_name": "United States",
    "region_name": "Iowa",
    "location": {
      "lon": -93.6124,
      "lat": 41.6021
    }
  },
  "controllerName": "RaterV3",
  "send": {
    "actionModels": {
      "Body": {
        "Application": {
          "lastName": "Gruss",
          "constructionType": "wood",
          "city": "Port St. Joe",
          "deliveryMethod": "electronic",
          "hasEc": "false",
          "contentCoverage": "80000"
        }
      }
    }
  },
  "actionName": "PostQuote",
  "isIpBlock": false,
  "elapsedMS": 381,
  "responseCode": 200
}
Document2
{
  "createdAt": "2023-05-08T19:24:08.3473356Z",
  "ipGeo": {
    "continent_name": "North America",
    "region_iso_code": "US-IA",
    "city_name": "Des Moines",
    "country_iso_code": "US",
    "country_name": "United States",
    "region_name": "Iowa",
    "location": {
      "lon": -93.6124,
      "lat": 41.6021
    }
  },
  "controllerName": "RaterV3",
  "send": {
    "actionModels": {
      "Body": {
        "Application": {
          "lastName": "Gruss",
          "constructionType": "wood",
          "city": "Port St. Joe",
          "deliveryMethod": "electronic",
          "has_Ec": "false",
          "contentCoverage": "80000"
        }
      }
    }
  },
  "actionName": "PostQuote",
  "isIpBlock": false,
  "elapsedMS": 381,
  "responseCode": 200
}
Document3
{
  "createdAt": "2023-05-08T19:24:08.3473356Z",
  "ipGeo": {
    "continent_name": "North America",
    "region_iso_code": "US-IA",
    "city_name": "Des Moines",
    "country_iso_code": "US",
    "country_name": "United States",
    "region_name": "Iowa",
    "location": {
      "lon": -93.6124,
      "lat": 41.6021
    }
  },
  "controllerName": "RaterV3",
  "send": {
    "actionModels": {
      "Body": {
        "Application": {
          "lastName": "Gruss",
          "constructionType": "wood",
          "city": "Port St. Joe",
          "deliveryMethod": "electronic",
          "hasec": "false",
          "contentCoverage": "80000"
        }
      }
    }
  },
  "actionName": "PostQuote",
  "isIpBlock": false,
  "elapsedMS": 381,
  "responseCode": 200
}