Convert two string fields into a single geo_point field

Hello everyone,

I'm currently struggling with geo coordinates in my Elasticsearch and can't find a solution to my problem.

I'm using the ingest-geoip plugin, which allows me to process incoming logs and add geolocation data based on the incoming IP.

Right now, I have the following pipeline configured to process my entries:

PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    }
  ]
}

This generates coordinates in two fields: geoip.location.lat and geoip.location.lon.

I wanted to merge these two fields into a single geo_point field that can later be used on a map.
So I added this to my pipeline:

PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}
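For reference, the "lat,lon" string that the set processor builds is one of the formats a geo_point field accepts, per the geo_point docs (the others being an object with lat/lon keys and a [lon, lat] array). A sketch, where `my_index` is a hypothetical index whose `location` field is mapped as geo_point:

```
PUT my_index/_doc/1
{ "location": "37.751,-97.822" }

PUT my_index/_doc/2
{ "location": { "lat": 37.751, "lon": -97.822 } }

PUT my_index/_doc/3
{ "location": [ -97.822, 37.751 ] }
```

Note that the string form is "lat,lon" while the array form is [lon, lat].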

I also added the geolocalisation field as a geo_point in my mapping, through a template. Here is the piece of my template where I set this up:

"geolocalisation" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type" : "geo_point"
    }
  }
}

But when the entries are processed, I get the following error for every entry that goes through the pipeline. It is produced by the "set" processor:

{"took":28,"ingest_took":2,"errors":true,"items":[{"index":{"_index":"myindex","_type":"flb_type","_id":"vFgfpWkBJGtfi-AK9Sur","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation.keyword] of type [geo_point]","caused_by":{"type":"array_index_out_of_bounds_exception","reason":"0"}}}}

I can't find what is causing this; I'm thinking the set processor is trying to write a string value into the geo_point field.

Using the simulate API (in Kibana) does not throw an error and works correctly, but it writes to the field itself, not to geolocalisation.keyword:

POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
  },
  "docs" : [
    { "_source": {"IP_Source":"8.8.8.8"} },
    { "_source": {"IP_Source":"8.8.8.8"} }
  ]
}

Result :

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "geoip" : {
            "location" : {
              "lon" : -97.822,
              "lat" : 37.751
            },
            "country_iso_code" : "US"
          },
          "IP_Source" : "8.8.8.8",
          "geolocalisation" : "37.751,-97.822"
        },
        "_ingest" : {
          "timestamp" : "2019-03-22T14:30:20.800Z"
        }
      }
    },
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "geoip" : {
            "location" : {
              "lon" : -97.822,
              "lat" : 37.751
            },
            "country_iso_code" : "US"
          },
          "IP_Source" : "8.8.8.8",
          "geolocalisation" : "37.751,-97.822"
        },
        "_ingest" : {
          "timestamp" : "2019-03-22T14:30:20.800Z"
        }
      }
    }
  ]
}

This message is processed correctly, but it does not tell me why it's not working on my real entries...

The error clearly tells me it cannot be done for the .keyword, but I'm not really sure how to set this up with my current mapping and the set processor I configured :confused:

Please format your code, logs, or configuration files using the </> icon as explained in this guide, and not the citation button. It will make your post more readable.

Or use markdown style like:

```
CODE
```

This is the icon to use if you are not using markdown format:

There's a live preview panel for exactly this reason.

Lots of people read these forums, and many of them will simply skip over a post that is difficult to read, because it's just too large an investment of their time to try and follow a wall of badly formatted text.
If your goal is to get an answer to your questions, it's in your interest to make it as easy to read and understand as possible.
Please update your post.

Could you provide a full recreation script as described in About the Elasticsearch category. It will help to better understand what you are doing. Please, try to keep the example as simple as possible. Here you can simply use the _simulate ingest endpoint.


Updated the post. Sorry for the inconvenience.

No problem :slight_smile:

Why do you want to have both text and geo_point for the same content? I don't understand the use case.

I was trying to understand why it didn't want to merge the two values, so I set the type to text instead of geo_point.

I tried to set this up in another field and it works (maybe it's a mapping issue):

"set" : {
        "field" : "location",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
        "ignore_failure": true
      }

I also tried a few times to add a condition: if the latitude and longitude fields exist, execute the "set" processor; if they are not present, skip it. Otherwise, even when I don't have any lat or lon, it still puts the "," in the field, which is expected.
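For what it's worth, since Elasticsearch 6.5 ingest processors accept an `if` clause written in Painless, so a condition like the one described above could be sketched as follows (field names as in the pipeline above; the exact null check is an assumption about the document shape):

```
{
  "set" : {
    "field" : "geolocalisation",
    "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
    "if" : "ctx.geoip?.location != null"
  }
}
```

With this condition, documents whose IP could not be geolocated simply skip the set processor instead of producing a bare ",".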

So? Did you solve your problem at the end?
If not, can you share a full example that reproduces the problem?

Nope, the problem is still not solved.

Example to reproduce:

  • ElasticSearch 6.6.2
  • Kibana 6.6.2

The logs are parsed beforehand with a regex, so I know the "IP_Source" field used in the processor is correct.

Add a mapping through a template in Elasticsearch:

PUT _template/my_template
{
  "index_patterns": "my_indices_*", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 2
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

Add a pipeline to process the "IP_Source" field:

PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
        "ignore_failure": true
      }
    }
  ]
}

I get the following error every time the "set" processor does the merging:

{"took":96,"ingest_took":3,"errors":true,"items":[{"index":{"_index":"my_indices_2019-03-22","_type":"flb_type","_id":"IPXtpWkBJGtfi-AKtWOP","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation] of type [geo_point]","caused_by":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation.keyword] of type [geo_point]","caused_by":{"type":"illegal_argument_exception","reason":"illegal external value class [java.lang.String]. Should be org.elasticsearch.common.geo.GeoPoint"}}}}},

So I ran that:

DELETE _template/my_template
PUT _template/my_template
{
  "index_patterns": "my_indices_*", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 2
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

DELETE _ingest/pipeline/geoip_ips
PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
        "ignore_failure": true
      }
    }
  ]
}

DELETE my_indices_
POST my_indices_/flb_type
{
  "IP_Source": "8.8.8.8"
}

It does not produce any error.

Please provide a full script that reproduces your problem.

I get the error above every time an entry is processed through the pipeline. It's not an error produced when I add the template or the pipeline.

If I do a manual check with the simulate API, I don't get any error, but when incoming entries are processed I get the Java error. It occurs only when the entries are actually indexed.

The configuration I have is so small; do you still want a full script? I'm just going to copy/paste the code above.

Exactly what I did with:

POST my_indices_/flb_type
{
  "IP_Source": "8.8.8.8"
}

So this request does not throw any error.

Alright, I'm updating the post again.
Here is an example of a full message coming from my parser to Elasticsearch:

<134> id=FIREWALL-SYSLOG time="2019-03-21 12:30:24" fw=1.1.1.1 pri=6 c=1024 gcat=6 m=537 msg="Connection Closed" srcMac=aa:bb:cc:dd:ee:ff src=2.2.2.2:12345:X1 srcZone=LAN natSrc=3.3.3.3:54321 dstMac=gg:hh:ii:jj:kk:ll dst=4.4.4.4:443:X2 dstZone=WAN natDst=5.5.5.5:443 usr="Unknown" proto=tcp/https sent=1348 rcvd=8230 spkt=10 rpkt=9 cdur=2266 rule="(WAN->WAN)" app=49177 appName="HTTPS" n=55917537

I don't really know how the .keyword sub-field works, but I think the error occurs when Elasticsearch tries to fill the .keyword and finds two different field types between the "geolocalisation" field and the "geolocalisation.keyword" field.
But this makes no sense to me, because I once did the exact same thing on another field, except that instead of "geo_point" as the keyword type I used "integer", to sum bytes and calculate bandwidth.

Tell me what information you need, or what you mean by a "full script that reproduces your problem", to help me figure this out.

A full recreation script is described in About the Elasticsearch category. It helps us better understand what you are doing. Please try to keep the example as simple as possible.

OK, so here we go. I'm using:

  • ElasticSearch 6.6.2
  • Kibana 6.6.2
  • Fluent-Bit 1.0.4

The logs are parsed by Fluent Bit with a regex I wrote. If you want the regex, tell me (it's pretty long, tbh ^^).

In the output section of the Fluent Bit config, I set up the pipeline as below:

[OUTPUT]
    Name es
    Match MY_FIREWALL
    Host 1.1.1.1
    Port 9200
    Pipeline geoip
    Logstash_Format True
    Logstash_Prefix my_firewall
    Logstash_DateFormat %Y-%m-%d

Next, here is my pipeline in Elasticsearch (configured through the Console tab in Kibana):

PUT _ingest/pipeline/geoip_ips
{
  "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "properties" : ["location", "city_name", "country_iso_code"],  
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}",
        "ignore_failure": true
      }
    }
  ]
}

Here is the template configuration for my indices:

PUT _template/my_firewall_template
{
  "index_patterns": "my_firewall-*", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 2
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

Here is an example message in the simulate pipeline:

POST _ingest/pipeline/_simulate
{
  "pipeline" : {
    "description" : "Translate les adresses IP Source en coordonnées",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
  },
  "docs" : [
    { 
      "index_":"sonicwall_siel37_nsa2650_dcc-2019-03-25",
      "_source": 
      {"IP_Source":"8.8.8.8", "IP_Destination":"1.1.1.1", "ID_Firewall":"MY_FIREWALL"} 
    }

  ]
}

In the end, I get the expected result:

{
  "docs" : [
    {
      "doc" : {
        "_index" : "_index",
        "_type" : "_type",
        "_id" : "_id",
        "_source" : {
          "geoip" : {
            "continent_name" : "North America",
            "location" : {
              "lon" : -97.822,
              "lat" : 37.751
            },
            "country_iso_code" : "US"
          },
          "IP_Source" : "8.8.8.8",
          "IP_Destination" : "1.1.1.1",
          "ID_Firewall" : "MY_FIREWALL",
          "geolocalisation" : "37.751,-97.822"
        },
        "_ingest" : {
          "timestamp" : "2019-03-25T09:15:48.577Z"
        }
      }
    }
  ]
}

But when the entries are processed live, I get this error for each entry processed by the pipeline:

{"took":96,"ingest_took":3,"errors":true,"items":[{"index":{"_index":"my_indices_2019-03-22","_type":"flb_type","_id":"IPXtpWkBJGtfi-AKtWOP","status":400,"error":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation] of type [geo_point]","caused_by":{"type":"mapper_parsing_exception","reason":"failed to parse field [geolocalisation.keyword] of type [geo_point]","caused_by":{"type":"illegal_argument_exception","reason":"illegal external value class [java.lang.String]. Should be org.elasticsearch.common.geo.GeoPoint"}}}}},

I can't reproduce this error with the simulate pipeline.
I also haven't tried this on a new index yet; maybe that would solve the problem.
I hope this helps you understand :confused:

I understand the concept, but I can't fix it without a way to reproduce the problem.

My suggestion is that you create a script which:

  • DELETE a test index
  • CREATE a test index with the mapping you wish
  • DELETE the pipeline
  • CREATE the pipeline
  • INDEX a document manually which calls this pipeline

Then share all that if you can reproduce your issue.

Here is a full script which creates a new mapping and a new pipeline, then processes a message through the pipeline in the new index. Everything is new:

PUT _template/test_geoip
{
  "index_patterns": "test_geoip", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type": "geo_point"
            }
          }
        }
      }
    }
  }
}

PUT _ingest/pipeline/test_geoip_pipeline
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "target_field" : "geolocalisation",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}

POST test_geoip/flb_type?pipeline=test_geoip_pipeline
{
  "IP_Source" : "8.8.8.8",
  "IP_Destination" : "12.34.56.78",
  "Message":"Message de test (:"
}

And the error I get is below:

{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "failed to parse field [geolocalisation.keyword] of type [geo_point]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "failed to parse field [geolocalisation.keyword] of type [geo_point]",
    "caused_by": {
      "type": "array_index_out_of_bounds_exception",
      "reason": "0"
    }
  },
  "status": 400
}

Great. Thanks.

This comes back to the original question. Why do you want to do this?

"geolocalisation" : {
  "type" : "text",
  "fields" : {
    "keyword" : {
      "type": "geo_point"
    }
  }
}
Here is something that you might want instead:

DELETE test_geoip
PUT test_geoip
{
  "mappings": {
    "_doc": {
      "properties": {
        "geolocalisation": {
          "type": "geo_point"
        }
      }
    }
  }
}

DELETE _ingest/pipeline/test_geoip_pipeline
PUT _ingest/pipeline/test_geoip_pipeline
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}

PUT test_geoip/_doc/1?pipeline=test_geoip_pipeline
{
  "IP_Source" : "8.8.8.8",
  "IP_Destination" : "12.34.56.78",
  "Message":"Message de test (:"
}
GET test_geoip/_doc/1

It gives:

{
  "_index" : "test_geoip",
  "_type" : "_doc",
  "_id" : "1",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "geoip" : {
      "continent_name" : "North America",
      "country_iso_code" : "US",
      "location" : {
        "lon" : -97.822,
        "lat" : 37.751
      }
    },
    "Message" : "Message de test (:",
    "IP_Source" : "8.8.8.8",
    "IP_Destination" : "12.34.56.78",
    "geolocalisation" : "37.751,-97.822"
  }
}

OK, so I tried the code you sent, and yes, it works.
But if I try to set up the mapping through a template instead, it no longer works:

DELETE _template/test_geoip
PUT _template/test_geoip
{
  "index_patterns": "test_geoip", 
  "settings": {
    "number_of_replicas": 0,
    "number_of_shards": 1
  },
  "mappings": {
    "flb_type": {
      "properties" : {
        "geolocalisation" : {
          "type" : "geo_point"
        }
      }
    }
  }
}

DELETE _ingest/pipeline/test_geoip_pipeline
PUT _ingest/pipeline/test_geoip_pipeline
{
  "description" : "Add geoip info",
  "processors" : [
    {
      "geoip" : {
        "field" : "IP_Source",
        "target_field" : "geolocalisation",
        "ignore_failure" : true
      }
    },
    {
      "set" : {
        "field" : "geolocalisation",
        "value" : "{{geoip.location.lat}},{{geoip.location.lon}}"
      }
    }
  ]
}

POST test_geoip/flb_type?pipeline=test_geoip_pipeline
{
  "IP_Source" : "8.8.8.8",
  "IP_Destination" : "12.34.56.78",
  "Message":"Message de test (:"
}

GET test_geoip/flb_type/1

I still get the same error as above.

But putting the mapping in the template is the same thing as doing this, right?

PUT test_geoip
{
  "mappings": {
    "flb_type": {
      "properties": {
        "geolocalisation": {
          "type": "geo_point"
        }
      }
    }
  }
}
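Almost, but one difference worth keeping in mind: a template is only applied when an index is created, so an index that already existed before the template was put keeps its old (possibly dynamically generated) mapping. The mapping the live index actually ended up with can be checked directly:

```
GET test_geoip/_mapping
```

If geolocalisation does not show `"type": "geo_point"` there, the index was created before the template took effect and must be deleted and recreated.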

You did not copy my ingest pipeline exactly.
Note that I removed:

        "target_field" : "geolocalisation",

Oops, I copy/pasted the wrong pipeline :sweat_smile:

I tested this on my side and it works.
When I test it on my live server, I still get the error, even when I:

  • Delete the template and create a new one with the example above
  • Delete the pipeline and create a new one
  • Delete the index
  • Let Fluent Bit send logs and create the new index at the same time

This is eating up my time, but when I'm back I will send you the scripts and logs I get :confused:

Let Fluent Bit send logs

You need to find a document which is failing the pipeline.
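One way to do that (a sketch; the index name and field values are hypothetical, taken from the examples earlier in the thread) is to copy the fields from one live log entry and index it manually through the same pipeline, which should reproduce the mapping error outside Fluent Bit:

```
POST my_firewall-2019-03-25/flb_type?pipeline=geoip_ips
{
  "IP_Source" : "2.2.2.2",
  "IP_Destination" : "4.4.4.4",
  "ID_Firewall" : "MY_FIREWALL"
}
```

If this request returns the same mapper_parsing_exception, the problem can then be narrowed down by removing fields or processors one at a time.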