Convert two string fields into a single geo_point field

Nope, the problem is still not solved.

Example to reproduce:

  • Elasticsearch 6.6.2
  • Kibana 6.6.2

The logs are parsed beforehand with a regex, so I know the "IP_Source" field reaching the processor is correct.

Add a mapping through a template in Elasticsearch:

PUT _template/my_template
{
  "index_patterns": "my_indices_*", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 2
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

Add a pipeline to process the "IP_Source" field:

PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
        "ignore_failure": true
      }
    }
  ]
}

I get the following error every time the "set" processor does the merge:

{"took":96,"ingest_took":3,"errors":true,"items":[{"index":{"_index":"my_indices_2019-03-22","_type":"flb_type","_id":"IPXtpWkBJGtfi-AKtWOP","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation] of type [geo_point]","caused_by":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation.keyword] of type [geo_point]","caused_by":{"type":"illegal_argument_exception","reason":"illegal external value class [java.lang.String]. Should be org.elasticsearch.common.geo.GeoPoint"}}}}},

So I ran that:

DELETE _template/my_template
PUT _template/my_template
{
  "index_patterns": "my_indices_*", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 2
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

DELETE _ingest/pipeline/geoip_ips
PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
        "ignore_failure": true
      }
    }
  ]
}

DELETE my_indices_
POST my_indices_/flb_type
{
  "IP_Source": "8.8.8.8"
}

It does not produce any error.

Please provide a full script that reproduces your problem.

I get the error above every time an entry is processed through the pipeline. It's not an error produced when I add the template or the pipeline.

If I do a manual check with the simulate API, I don't get any error, but when incoming entries are processed, I get the Java error. It occurs only when the entries are actually indexed.
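For reference, the manual check can be run against the existing pipeline by id; something like this (a sketch — sorry, a rough sketch: "verbose=true" shows each processor's output, and the document is a minimal stand-in for a real entry):

POST _ingest/pipeline/geoip_ips/_simulate?verbose=true
{
  "docs" : [
    { "_source" : { "IP_Source" : "8.8.8.8" } }
  ]
}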

The configuration I did is so small; do you still want a full script? I'm just going to copy/paste the code above.

Exactly what I did with:

POST my_indices_/flb_type
{
  "IP_Source": "8.8.8.8"
}

So this request does not throw any error.

Alright, I'm updating the post again.
Here is an example of a full message coming from my parser to Elasticsearch:

<134> id=FIREWALL-SYSLOG time="2019-03-21 12:30:24" fw=1.1.1.1 pri=6 c=1024 gcat=6 m=537 msg="Connection Closed" srcMac=aa:bb:cc:dd:ee:ff src=2.2.2.2:12345:X1 srcZone=LAN natSrc=3.3.3.3:54321 dstMac=gg:hh:ii:jj:kk:ll dst=4.4.4.4:443:X2 dstZone=WAN natDst=5.5.5.5:443 usr="Unknown" proto=tcp/https sent=1348 rcvd=8230 spkt=10 rpkt=9 cdur=2266 rule="(WAN->WAN)" app=49177 appName="HTTPS" n=55917537

I don't really know how .keyword works, but I think the error occurs when Elasticsearch tries to fill in the .keyword sub-field and finds two different field types between the "geolocalisation" field and the "geolocalisation.keyword" field.
But this makes no sense to me, because I once did the exact same thing on another field: instead of putting "geo_point" as the keyword type, I put "integer" to sum bytes and calculate the bandwidth.
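As I understand it, a multi-field indexes the same incoming value under each sub-mapping, so the bandwidth case looked something like this (a sketch from memory; the field name is approximate):

"Octets" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "integer"
    }
  }
}

There, the string value happened to be parseable as an integer, which is presumably why it worked.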

Tell me what information you need, or what you meant by a "full script that reproduces your problem", to help me figure this out.

A full recreation script is described in About the Elasticsearch category. It helps to better understand what you are doing. Please, try to keep the example as simple as possible.

Ok, so here we go.
I'm using:

  • Elasticsearch 6.6.2
  • Kibana 6.6.2
  • Fluent-Bit 1.0.4

The logs are parsed by Fluent-Bit with a regex I wrote. If you want this regex, tell me (it's pretty long, tbh ^^).

In the output section of the Fluent-Bit config, I set up the pipeline as below:

[OUTPUT]
    Name es
    Match MY_FIREWALL
    Host 1.1.1.1
    Port 9200
    Pipeline geoip
    Logstash_Format True
    Logstash_Prefix my_firewall
    Logstash_DateFormat %Y-%m-%d

Next, here is my pipeline setup in Elasticsearch (configured through the Console tab in Kibana):

PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
        "ignore_failure": true
      }
    }
  ]
}

Here is the template configuration for my indices:

PUT _template/my_firewall_template
{
  "index_patterns": "my_firewall-*", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 2
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

Here is an example of a message in the simulate pipeline:

POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "Translate les adresses IP Source en coordonnées",
    "processors" : [
      {
        "geoip" : {
          "field" : "IP_Source",
          "ignore_failure" : true
        }
      },
      {
        "set" : {
          "field" : "geolocalisation",
          "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
        }
      }
    ]
  },
  "docs" : [
    {
      "_index" : "sonicwall_siel37_nsa2650_dcc-2019-03-25",
      "_source" : {
        "IP_Source" : "8.8.8.8",
        "IP_Destination" : "1.1.1.1",
        "ID_Firewall" : "MY_FIREWALL"
      }
    }
  ]
}

In the end, I got the expected result:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "geoip" : {
            "continent_name" : "North America",
            "location" : {
              "lon" : -97.822,
              "lat" : 37.751
            },
            "country_iso_code" : "US"
          },
          "IP_Source" : "8.8.8.8",
          "IP_Destination" : "1.1.1.1",
          "ID_Firewall" : "MY_FIREWALL",
          "geolocalisation" : "37.751,-97.822"
        },
        "_ingest" : {
          "timestamp" : "2019-03-25T09:15:48.577Z"
        }
      }
    }
  ]
}

But when the entries are processed live, I get this error for each entry that goes through the pipeline:

{"took":96,"ingest_took":3,"errors":true,"items":[{"index":{"_index":"my_indices_2019-03-22","_type":"flb_type","_id":"IPXtpWkBJGtfi-AKtWOP","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation] of type [geo_point]","caused_by":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation.keyword] of type [geo_point]","caused_by":{"type":"illegal_argument_exception","reason":"illegal external value class [java.lang.String]. Should be org.elasticsearch.common.geo.GeoPoint"}}}}},

I can't reproduce this error in the simulate pipeline.
I also haven't tested this on a new index yet; maybe that will solve the problem.
I hope this will help you understand :confused:

I understand the concept but I can't fix without a way to reproduce the problem.

My suggestion is that you create a script which:

  • DELETE a test index
  • CREATE a test index with the mapping you wish
  • DELETE the pipeline
  • CREATE the pipeline
  • INDEX a document manually which calls this pipeline

Then share all that if you can reproduce your issue.
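In skeleton form, something like this (index and pipeline names here are placeholders):

DELETE my_test_index

PUT my_test_index
{
  "mappings": { ... }
}

DELETE _ingest/pipeline/my_test_pipeline

PUT _ingest/pipeline/my_test_pipeline
{
  "processors": [ ... ]
}

PUT my_test_index/_doc/1?pipeline=my_test_pipeline
{
  "IP_Source": "8.8.8.8"
}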

Here is a full script which creates a new mapping and a new pipeline, then processes a message through the pipeline in the new index. Everything is new:

PUT _template/test_geoip
{
  "index_patterns": "test_geoip", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

PUT _ingest/pipeline/test_geoip_pipeline
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "target_field" : "geolocalisation",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}

POST test_geoip/flb_type?pipeline=test_geoip_pipeline
{
  "IP_Source" : "8.8.8.8",
  "IP_Destination" : "12.34.56.78",
  "Message":"Message de test (:"
}

And the error I got is below:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse field [geolocalisation.keyword] of type [geo_point]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse field [geolocalisation.keyword] of type [geo_point]",
    "caused_by": {
      "type": "array_index_out_of_bounds_exception",
      "reason": "0"
    }
  },
  "status": 400
}

Great. Thanks.

This comes back to the original question: why do you want to do this?

"geolocalisation" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type": "geo_point"
    }
  }
}

Here is something that you might want instead:

DELETE test_geoip
PUT test_geoip
{
  "mappings": {
    "_doc": {
      "properties": {
        "geolocalisation": {
          "type": "geo_point"
        }
      }
    }
  }
}

DELETE _ingest/pipeline/test_geoip_pipeline
PUT _ingest/pipeline/test_geoip_pipeline
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}

PUT test_geoip/_doc/1?pipeline=test_geoip_pipeline
{
  "IP_Source" : "8.8.8.8",
  "IP_Destination" : "12.34.56.78",
  "Message":"Message de test (:"
}
GET test_geoip/_doc/1

It gives:

{
  "_index" : "test_geoip",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "geoip" : {
      "continent_name" : "North America",
      "country_iso_code" : "US",
      "location" : {
        "lon" : -97.822,
        "lat" : 37.751
      }
    },
    "Message" : "Message de test (:",
    "IP_Source" : "8.8.8.8",
    "IP_Destination" : "12.34.56.78",
    "geolocalisation" : "37.751,-97.822"
  }
}
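To double-check what the index ended up with, the field mapping can also be inspected (a sketch using the field mapping API):

GET test_geoip/_mapping/field/geolocalisation

It should report "type" : "geo_point" for geolocalisation.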

Ok, so I tried the code you sent, and yes, it's working.
But if I try to set up the mapping directly through the template, it's not working anymore:

DELETE _template/test_geoip
PUT _template/test_geoip
{
  "index_patterns": "test_geoip", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "geo_point"
        }
      }
    }
  }
}

DELETE _ingest/pipeline/test_geoip_pipeline
PUT _ingest/pipeline/test_geoip_pipeline
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "target_field" : "geolocalisation",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}

POST test_geoip/flb_type?pipeline=test_geoip_pipeline
{
  "IP_Source" : "8.8.8.8",
  "IP_Destination" : "12.34.56.78",
  "Message":"Message de test (:"
}

GET test_geoip/flb_type/1

I still got the same error as above.

But putting the mapping in the template is the same thing as doing this, isn't it?

PUT test_geoip
{
  "mappings": {
    "flb_type": {
      "properties": {
        "geolocalisation": {
          "type": "geo_point"
        }
      }
    }
  }
}
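To compare the two paths, both the stored template and the mapping of the created index can be inspected (a sketch):

GET _template/test_geoip
GET test_geoip/_mapping

One thing I keep in mind: a template is only applied when an index is created, so the index has to be deleted and recreated after changing the template.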

You did not copy my ingest pipeline exactly.
Note that I removed:

        "target_field" : "geolocalisation",

Oops, I copy/pasted the wrong pipeline :sweat_smile:

I tested this on my side and it works.
When I test it on my live server, I still get this error, even when I:

  • Delete the template and create a new one with the example above
  • Delete the pipeline and create a new one
  • Delete the index
  • Let Fluent-Bit send logs and create the new index at the same time

It's eating up my time, but when I'm back I will send you the scripts and logs I got :confused:

Let Fluent-Bit send logs

You need to find a document which is failing the pipeline.

It occurs with every document that enters the Elasticsearch input buffer and goes through the pipeline. Without the pipeline, all the parsed fields are named correctly inside the index, and I can also visualize them correctly in Kibana.
Here is an example of the regex I'm using if you want to see it: https://regex101.com/r/uDHpsP/1/

Maybe there is some sort of issue when the messages are sent by Fluent-Bit to Elasticsearch; this looks weird.
I also set up a "debug" regex that sends the entire message to Elasticsearch without parsing it, so I know the entire message reaches Elasticsearch without any loss of data between the two services.
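One thing worth checking is how "geolocalisation" actually got mapped on the live index, since a template only applies at index creation time (a sketch, assuming the live indices match the my_firewall-* pattern):

GET my_firewall-*/_mapping/field/geolocalisation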

So can you update the same reproduction script with a typical document that is stored in Elasticsearch without going through the pipeline?

Processing the message through the console into the index with the pipeline works well:

POST "my live indices"/flb_type?pipeline=geoip
{
  "@timestamp":"2019-03-25T12:10:30Z",
  "IP_Source" : "8.8.8.8",
  "IP_Destination" : "12.34.56.78",
  "Description":"Message de test (:"
}

I got the expected result (screenshot omitted).

It seems like the problem comes from when the entries coming from my parser are processed.
I can also specify the field type of specific parsed fields before they are sent to Elasticsearch, but I don't think it changes anything, because the "geolocalisation" field is created in the pipeline.

If, for example, I set a geo_point string ("0,0" for example) in the "geolocalisation" field, it works properly:

PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "target_field" : "location",
        "ignore_failure" : true,
        "properties" : ["location","city_name"]
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "0,0"
      }
    }
  ]
}

The geo_point type understands the value correctly and does not throw any errors.
It happens ONLY when the incoming messages are processed by the "set" processor. If I set a static value as above, it works correctly, even for incoming messages.
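Maybe one way to avoid indexing a broken value when the geoip lookup fails would be to guard the "set" processor with a condition (a sketch, assuming Elasticsearch 6.5+ where processors support an "if" clause; I haven't tested this yet):

PUT _ingest/pipeline/geoip
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "ignore_missing" : true
      }
    },
    {
      "set" : {
        "if" : "ctx.geoip != null && ctx.geoip.location != null",
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}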

I'm running out of ideas. I don't think I have a clear vision of what is actually being sent to Elasticsearch, i.e. a JSON document sent from your Fluent-Bit to Elasticsearch.

If you can provide one untouched one, I can have a look.

Of course. Here is an untouched JSON document generated by Fluent-Bit:

{"ID_Firewall":"MY_FIREWALL", "timestamp":"2019-03-25 15:14:04", "IP_Firewall":"1.1.1.1", "Niveau":"7", "MAC_Source":"aa:bb:cc:dd:ee:ff", "IP_Source":"2.2.2.2", "Port_Source":"51348", "INT_Source":"X1654", "Zone_Source":"WAN", "NAT_Source":"3.3.3.3", "NAT_Port_Source":"51348", "MAC_Destination":"gg:hh:ii:jj:kk:ll", "IP_Destination":"4.4.4.4", "Port_Destination":"443", "INT_Destination":"X0", "Zone_Destination":"LAN", "NAT_Destination":"5.5.5.5", "NAT_Port_Destination":"443", "Protocole":"tcp/https", "Regle":"(WAN->LAN)", "Note":"TCP Flag(s): ACK RST"}

So I can't reproduce the problem with:

DELETE _template/test_geoip
PUT _template/test_geoip
{
  "index_patterns": "test_geoip", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "geo_point"
        }
      }
    }
  }
}

DELETE _ingest/pipeline/test_geoip_pipeline
PUT _ingest/pipeline/test_geoip_pipeline
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}

DELETE test_geoip
PUT test_geoip/flb_type/1?pipeline=test_geoip_pipeline
{
  "ID_Firewall": "MY_FIREWALL",
  "timestamp": "2019-03-25 15:14:04",
  "IP_Firewall": "1.1.1.1",
  "Niveau": "7",
  "MAC_Source": "aa:bb:cc:dd:ee:ff",
  "IP_Source": "2.2.2.2",
  "Port_Source": "51348",
  "INT_Source": "X1654",
  "Zone_Source": "WAN",
  "NAT_Source": "3.3.3.3",
  "NAT_Port_Source": "51348",
  "MAC_Destination": "gg:hh:ii:jj:kk:ll",
  "IP_Destination": "4.4.4.4",
  "Port_Destination": "443",
  "INT_Destination": "X0",
  "Zone_Destination": "LAN",
  "NAT_Destination": "5.5.5.5",
  "NAT_Port_Destination": "443",
  "Protocole": "tcp/https",
  "Regle": "(WAN->LAN)",
  "Note": "TCP Flag(s): ACK RST"
}

GET test_geoip/flb_type/1