Issues ingesting log files with a fresh installation

Hello. I'm hoping someone can help me with an indexing issue I'm having. I'm working with a fresh setup of Elasticsearch and Kibana. Everything is set up: Kibana loads, I can log in with security enabled, and the status page shows everything in the green. I am now attempting to load logs into Elasticsearch for the first time so that an index can be created. I have an ingest pipeline set up and an index template loaded. When I try to load a log file into Elasticsearch, I see this error quite frequently in the Elasticsearch log:

[2020-05-20T14:49:01,948][INFO ][o.e.a.b.TransportShardBulkAction] [logs-node-1] [2020-05-18][0] mapping update rejected by primary
java.lang.IllegalArgumentException: mapper [source.geo.location] of different type, current_type [geo_point], merged_type [ObjectMapper]

I'm not sure whether this is a show-stopper, but no index has been created so far, and this is the only error I see.
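For what it's worth, if the 2020-05-18 index referenced in the error does exist, the type Elasticsearch actually recorded for the field can be checked with the field mapping API (just a diagnostic idea; the index name here is taken from the log line above):

GET /2020-05-18/_mapping/field/source.geo.location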

Here is the part of my ingest pipeline where this is defined:

{
  "geoip": {
    "field": "ClientIP",
    "target_field": "source.geo",
    "properties": [
      "ip",
      "country_name",
      "continent_name",
      "region_iso_code",
      "region_name",
      "city_name",
      "timezone",
      "location"
    ]
  }
}

And here is the relevant part of my index template:

"source.geo": {
        "properties": {
           "ip": {
              "type": "ip"
           },
           "postal_code": {
              "type": "keyword"
           },
           "location": {
              "type": "geo_point"
           },

My inbound log file has a "ClientIP" field, which should trigger this processor. Any ideas why the geo_point type is running into trouble here? Please let me know if you need any additional information. Thanks in advance!

To provide a little more information, I tried creating a quick ingest pipeline and a small index to see what "location" returns:

curl --user <user>:<password> -X PUT "##.##.##.##:9243/_ingest/pipeline/testgeoip" \
  -H "Content-Type: application/json" \
  -d '{"description" : "Add geoip info","processors" : [{"geoip" : {"field" : "ip"}}]}'
curl --user <user>:<password> -X PUT "##.##.##.##:9243/my_index/_doc/my_id?pipeline=testgeoip" \
  -H "Content-Type: application/json" \
  -d '{"ip":"8.8.8.8"}'

I then fetched the document back:

curl --user <user>:<password> -X GET "##.##.##.##:9243/my_index/_doc/my_id"

    "_id": "my_id",
    "_index": "my_index",
    "_primary_term": 1,
    "_seq_no": 0,
    "_source": {
        "geoip": {
            "continent_name": "North America",
            "country_iso_code": "US",
            "location": {
                "lat": 37.751,
                "lon": -97.822
            }
        },
        "ip": "8.8.8.8"
    },
    "_type": "_doc",
    "_version": 1,
    "found": true
}

This gives me back a valid lat/lon object, which should work just fine with the geo_point data type. So I'm not clear why I'm getting this error while ingesting a log file. Any insights would be great! Thanks.
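A related sanity check would be to ask how my_index mapped that field. Since I never defined a mapping for my_index, dynamic mapping applies, and as far as I understand a {lat, lon} JSON object gets dynamically mapped as a plain object with two float sub-fields rather than as geo_point:

GET /my_index/_mapping/field/geoip.location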

Any chance this is due to not having Logstash installed? I've already got the latest versions of Elasticsearch and Kibana (v7.7) installed, and I didn't think I needed Logstash. The geoip and geo_point mapping combination "seems" to be common with Logstash, but it is also available via the ingest geoip processor. I've come across some information saying that the geo_point mapping has to be explicitly declared - for example:

https://www.elastic.co/guide/en/elasticsearch/reference/current/geoip-processor.html

"Although this processor enriches your document with a location field containing the estimated latitude and longitude of the IP address, this field will not be indexed as a geo_point type in Elasticsearch without explicitly defining it as such in the mapping."

So my mapping already declares this, unless there is an extra step I have to take? Has anyone else come across this error before? Thanks.

@Andrew_Cholakian1 or @shahzad31, since you were both very helpful on a couple of my other posts, would you mind taking a look at this issue, or flagging it for someone else on the Elastic team? This is an important one for me to solve soon, and it is driving me crazy :) Thanks!

Just wanted to provide some additional details. I'm currently trying to import log files from Cloudflare into the Elastic Stack, and I'm attempting to follow the instructions here:

In addition to the above tests, I have also performed a more accurate test:

Create pipeline (pulled from the Cloudflare file):

PUT /_ingest/pipeline/jmggeoip
{
  "description": "My Log Pipeline",
  "processors": [
    {
      "geoip": {
        "field": "ClientIP",
        "target_field": "source.geo",
        "properties": [
          "ip",
          "country_name",
          "continent_name",
          "region_iso_code",
          "region_name",
          "city_name",
          "timezone",
          "location"
        ]
      }
    }
  ]
}
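To rule out the pipeline itself, it can also be exercised in isolation with the simulate API:

POST /_ingest/pipeline/jmggeoip/_simulate
{
  "docs": [
    { "_source": { "ClientIP": "8.8.8.8" } }
  ]
}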

Create index template mapping (pulled from the Cloudflare file):

PUT /_template/jmgtemplate
{
   "index_patterns": [
     "jmgindex-*"
   ],
   "mappings": {
      "properties": {
         "source.geo": {
            "properties": {
               "ip": {
                  "type": "ip"
               },
               "postal_code": {
                  "type": "keyword"
               },
               "location": {
                  "type": "geo_point"
               },
               "dma_code": {
                  "type": "long"
               },
               "country_code3": {
                  "type": "keyword"
               },
               "latitude": {
                  "type": "float"
               },
               "longitude": {
                  "type": "float"
               },
               "region_name": {
                  "type": "keyword"
               },
               "city_name": {
                  "type": "keyword"
               },
               "timezone": {
                  "type": "keyword"
               },
               "country_code2": {
                  "type": "keyword"
               },
               "continent_code": {
                  "type": "keyword"
               },
               "country_name": {
                  "type": "keyword"
               },
               "region_code": {
                  "type": "keyword"
               },
               "continent_name": {
                  "type": "keyword"
               },
               "region_iso_code": {
                  "type": "keyword"
              }
            }
         }
      }
   },
   "settings": {
      "index": {
         "number_of_shards": "1",
         "number_of_replicas": "1",
         "mapping.ignore_malformed": true
      }
   }
}
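The stored template, including its index_patterns, can be double-checked with:

GET /_template/jmgtemplate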

Index a document (the index name matches the pattern above, and the request uses the pipeline created above):

PUT /jmgindex-test/_doc/my_id?pipeline=jmggeoip
{"ClientIP":"8.8.8.8"}

Fetch the document:
GET /jmgindex-test/_doc/my_id

This call returns the following information:

{
  "_index" : "jmgindex-test",
  "_type" : "_doc",
  "_id" : "my_id",
  "_version" : 1,
  "_seq_no" : 0,
  "_primary_term" : 1,
  "found" : true,
  "_source" : {
    "source" : {
      "geo" : {
        "continent_name" : "North America",
        "timezone" : "America/Chicago",
        "ip" : "8.8.8.8",
        "country_name" : "United States",
        "location" : {
          "lon" : -97.822,
          "lat" : 37.751
        }
      }
    },
    "ClientIP" : "8.8.8.8"
  }
}

So, as you can see, we are still getting latitude and longitude back. Now, let’s look at the field mapping:
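The field mapping API covers that (the response below is a sketch of what it should return, given the template):

GET /jmgindex-test/_mapping/field/source.geo.location

{
  "jmgindex-test" : {
    "mappings" : {
      "source.geo.location" : {
        "full_name" : "source.geo.location",
        "mapping" : {
          "location" : {
            "type" : "geo_point"
          }
        }
      }
    }
  }
}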

Now we are properly mapping to "geo_point". However, while this example works, the ingest process I set up for Cloudflare does not. So there must be something missing from the setup process. I'm leaning towards the Lambda function that is provided by Cloudflare.

Just ran an even more accurate test to try to narrow this down. I created a brand-new index using the existing Cloudflare pipeline and Cloudflare index template that I had already submitted to Elasticsearch. I also pulled a single JSON record from one of our edge logs that is being dumped to S3:

PUT /cloudflare-123/_doc/my_ip?pipeline=cloudflare-pipeline-weekly
{"BotScore":76,"BotScoreSrc":"Machine Learning","CacheCacheStatus":"hit","CacheResponseBytes":257510,"CacheResponseStatus":200,"CacheTieredFill":false,"ClientASN":####,"ClientCountry":"us","ClientDeviceType":"desktop","ClientIP":"###.###.###.###,"ClientIPClass":"noRecord","ClientRequestBytes":4147,"ClientRequestHost":"www.sample.com","ClientRequestMethod":"GET","ClientRequestPath":"/shop/test","ClientRequestProtocol":"HTTP/2","ClientRequestReferer":"https://www.sample.com/shop/test2","ClientRequestURI":"/shop/test","ClientRequestUserAgent":"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/81.0.4044.138 Safari/537.36","ClientSSLCipher":"AEAD-AES128-GCM-SHA256","ClientSSLProtocol":"TLSv1.3","ClientSrcPort":52355,"ClientXRequestedWith":"","EdgeColoCode":"BNA","EdgeColoID":115,"EdgeEndTimestamp":"2020-05-20T00:00:06Z","EdgePathingOp":"wl","EdgePathingSrc":"macro","EdgePathingStatus":"nr","EdgeRateLimitAction":"","EdgeRateLimitID":0,"EdgeRequestHost":"www.sample.com","EdgeResponseBytes":61001,"EdgeResponseCompressionRatio":4.29,"EdgeResponseContentType":"text/html","EdgeResponseStatus":200,"EdgeServerIP":"","EdgeStartTimestamp":"2020-05-20T00:00:06Z","FirewallMatchesActions":[],"FirewallMatchesRuleIDs":[],"FirewallMatchesSources":[],"OriginIP":"","OriginResponseBytes":0,"OriginResponseHTTPExpires":"","OriginResponseHTTPLastModified":"","OriginResponseStatus":0,"OriginResponseTime":0,"OriginSSLProtocol":"unknown","ParentRayID":"00","RayID":"####","SecurityLevel":"med","WAFAction":"unknown","WAFFlags":"0","WAFMatchedVar":"","WAFProfile":"unknown","WAFRuleID":"","WAFRuleMessage":"","ZoneID":####}

This created a new index called “cloudflare-2020-05-18”. When I queried the index, it returned a valid result with geo_point information:

GET /cloudflare-2020-05-18/_doc/my_ip

…
"found" : true,
  "_source" : {
    "BotScoreSrc" : "Machine Learning",
    "source" : {
      "geo" : {
        "continent_name" : "North America",
        "region_iso_code" : "US-TN",
        "city_name" : "Murfreesboro",
        "country_iso_code" : "us",
        "timezone" : "America/Chicago",
        "ip" : "###.###.###.###",
        "country_name" : "United States",
        "region_name" : "Tennessee",
        "location" : {
          "lon" : -86.3881,
          "lat" : 35.8437
        }
      },
      "as" : {
        "number" : ####
      },
…

This is why I'm hitting a wall. Everything "seems" to be set up properly on the Elastic side, and I think the tests above prove that the geo_point mapping and geoip functionality work fine. I believe they also verify that the Cloudflare index template and pipeline are working. The only thing I noticed about the Lambda function is that it uses a deprecated bulk-load method, so perhaps that is a factor? Here is that warning:

WARNING ... [types removal] Specifying types in bulk requests is deprecated."
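If I understand the warning, it means the Lambda's bulk requests still include a type name on each action line, something like this hypothetical minimal example:

POST /_bulk
{ "index" : { "_index" : "cloudflare-2020-05-18", "_type" : "_doc", "_id" : "1" } }
{ "ClientIP" : "8.8.8.8" }

As far as I know, that is only a deprecation notice in 7.x and would not by itself reject documents, so I'm not sure it explains the mapping error.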

Any help tracking down this issue would be greatly appreciated!
