Hi everyone,
I am using Elasticsearch 7.15 and Python to ingest data into Elasticsearch.
Before starting the ingestion, I define a pipeline and a mapping. Without the geoip processor, everything works smoothly.
When I add the geoip processor and it tries to process an empty field, an error is returned and the documents are rejected. I would expect the option "ignore_missing": true to solve this, but it does not.
The pipeline:
My_pipeline = """
{
  "processors": [
    {
      "date": {
        "field": "date-time",
        "formats": ["strict_date_optional_time", "yyyy-MM-dd'T'HH:mm:ss.SSSZ"],
        "ignore_failure": true
      }
    },
    {
      "geoip": {
        "field": "original-client-ip",
        "target_field": "original-client-ip.geo",
        "ignore_missing": true
      }
    }
  ]
}
"""
The Mapping (simplified):
My_map = """
{
  "settings": {
    "index": {
      "default_pipeline": "My_pipeline",
      "number_of_shards": 1,
      "number_of_replicas": 0
    }
  },
  "mappings": {
    "properties": {
      <... removed fields for simplicity ... >
      "original-server-ip": {"type": "text", "fields": {"keyword": {"type": "keyword"}}},
      "location": {"type": "geo_point", "ignore_malformed": true}
    }
  }
}
"""
The error:
Detail: ('276 document(s) failed to index.', [{
  'index': {
    '_index': '<redacted>_messagetrackinglog',
    '_type': '_doc',
    '_id': None,
    'status': 400,
    'error': {
      'type': 'illegal_argument_exception',
      'reason': "'' is not an IP string literal."
    },
    'data': {
      'date-time': '2021-10-05T15:00:26.249Z',
      'client-ip': '',
      'client-hostname': '',
      'server-ip': '',
      'server-hostname': '<redacted>',
      'source-context': 'No suitable shadow servers',
      'connector-id': '',
      'source': 'SMTP',
      'event-id': 'HAREDIRECTFAIL',
      'internal-message-id': '101305393610752',
      'message-id': '<c6726aaa<redacted>8c12335b5a@p<redacted>>',
      'network-message-id': 'ed795896-2537-44bd<redacted>3e',
      'recipient-address': '<redacted>@<redacted>',
      'recipient-status': '',
      'total-bytes': '13019',
      'recipient-count': '1',
      'related-recipient-address': '',
      'reference': '',
      'message-subject': 'Accepted: <redacted>',
      'sender-address': '<redacted>@<redacted>',
      'return-path': '<redacted>@<redacted>',
      'message-info': '',
      'directionality': 'Originating',
      'tenant-id': '',
      'original-client-ip': '',
      'original-server-ip': '',
      'custom-data': 'S:DeliveryPriority=Normal;S:AccountForest=<redacted>',
      'transport-traffic-type': 'Email',
      'log-id': 'ec996<redacted>dd40',
      'schema-version': '15.01.2176.009'
    }
  }
}
I would like to stress that without the geoip processor all the logs are ingested without errors, and that I expected "ignore_missing": true to handle exactly this case of empty fields.
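A possible client-side workaround, shown only as a minimal sketch: in the failing document the field is present but holds an empty string (`'original-client-ip': ''`), so it is not actually missing. Removing such empty values in Python before sending the documents would make the field genuinely absent, which is the case "ignore_missing" covers. The helper name `drop_empty_fields` and the sample document below are hypothetical and only illustrate the idea.

```python
def drop_empty_fields(doc, fields=("original-client-ip",)):
    """Return a copy of doc without the listed fields when they hold an empty string.

    Hypothetical helper: 'fields' lists the keys the geoip processor reads.
    """
    return {
        key: value
        for key, value in doc.items()
        if not (key in fields and value == "")
    }


# Hypothetical sample documents, shaped like the rejected log entry.
empty_ip_doc = {
    "date-time": "2021-10-05T15:00:26.249Z",
    "original-client-ip": "",
    "source": "SMTP",
}
with_ip_doc = {
    "date-time": "2021-10-05T15:00:26.249Z",
    "original-client-ip": "8.8.8.8",
    "source": "SMTP",
}

cleaned_empty = drop_empty_fields(empty_ip_doc)    # key removed
cleaned_with_ip = drop_empty_fields(with_ip_doc)   # unchanged copy
```

The function would be applied to each document just before it is handed to the indexing call; it does not change how the pipeline itself treats empty strings.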
Is there anything I am doing wrong, or do you have any suggestions?
I appreciate any help. Thanks.