Summary: I followed the steps in the "Use a custom endpoint" section of the documentation, but the GeoIP databases are not updating on my Elasticsearch ingest nodes. I'm not certain whether I made a mistake or whether there is an issue with the documentation or with GeoIP itself.
According to the documentation, I expected the GEOIP databases to update successfully on ingest nodes.
However, when I check the status with "GET _ingest/geoip/stats", I see that the updates have not occurred.
I have 14 Elasticsearch nodes in total (ELK Stack version 8.11.1):
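For anyone who wants to read that response programmatically, here is a minimal Python sketch. It assumes the response has a top-level "stats" object with counters such as successful_downloads and failed_downloads, and a "nodes" object listing each node's databases. The function name and the sample payload are made up for illustration; they are not output from my cluster.

```python
def summarize_geoip_stats(response):
    """Condense a GET _ingest/geoip/stats response (assumed shape) into a few fields."""
    stats = response.get("stats", {})
    nodes = response.get("nodes", {})
    # Nodes that report no downloaded database at all.
    nodes_without_db = sorted(
        node_id for node_id, info in nodes.items() if not info.get("databases")
    )
    return {
        "successful_downloads": stats.get("successful_downloads", 0),
        "failed_downloads": stats.get("failed_downloads", 0),
        "databases_count": stats.get("databases_count", 0),
        "nodes_without_databases": nodes_without_db,
    }


# Illustrative payload only (hypothetical node IDs and values).
sample = {
    "stats": {"successful_downloads": 0, "failed_downloads": 0, "databases_count": 0},
    "nodes": {"node-a": {"databases": []}, "node-b": {}},
}
print(summarize_geoip_stats(sample))
```

If databases_count stays at 0 and every ingest node shows up in nodes_without_databases, the downloader has not fetched anything yet.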
8 data nodes
3 ingest nodes
3 master nodes
I have taken several steps to troubleshoot the issue:
Executed "elasticsearch-geoip" for updating databases, which appeared to complete successfully.
elasticsearch@elasticsearch-5c64694f74-26mxt:~/bin$ ./elasticsearch-geoip -s /geoip/ -t /geoip/
Found GeoIP2-City.mmdb, will compress it to GeoIP2-City.tgz
Adding GeoIP2-City.tgz to overview.json
Checked file permissions on the target folder and confirmed they are appropriate.
elasticsearch@elasticsearch-5c64694f74-26mxt:/geoip$ ls -ls
135949 -rw-r--r--. 1 root root 139210785 Jan 17 16:44 GeoIP2-City.mmdb
71942 -rw-rw-rw-. 1 elasticsearch elasticsearch 73668587 Jan 17 16:48 GeoIP2-City.tgz
1 -rw-rw-rw-. 1 elasticsearch elasticsearch 122 Jan 17 16:48 overview.json
Made the configuration and settings changes described in the documentation, including updating the endpoint URL, restarting Elasticsearch, and confirming that the settings were applied.
Set the log level to trace following the instructions in this article. Observed many log lines, including:
[2024-01-17T12:47:50,093][TRACE][o.e.i.g.DatabaseNodeService] [ingest_1] Not checking databases because geoip databases index does not exist
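For reference, a sketch of the kind of request body that raises the log level for this package. The logger name is my assumption, derived from the abbreviated class name in the log line above (o.e.i.g.DatabaseNodeService read as org.elasticsearch.ingest.geoip.DatabaseNodeService); the helper name is hypothetical. The body is meant for "PUT _cluster/settings".

```python
import json

# Assumed logger name, expanded from "o.e.i.g" in the log line.
GEOIP_LOGGER = "logger.org.elasticsearch.ingest.geoip"


def geoip_logging_body(level="TRACE"):
    """Body for PUT _cluster/settings; pass level=None to reset the logger."""
    return {"persistent": {GEOIP_LOGGER: level}}


print(json.dumps(geoip_logging_body(), indent=2))
# None serializes to JSON null, which resets the logger to its default level.
print(json.dumps(geoip_logging_body(None)))
```

Remember to reset the level afterwards, since TRACE output for this package is very noisy.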
Followed instructions to delete the .geoip_databases index from this article, but I cannot see this index in my cluster. I attempted to output all indices with the API call:
However, there is no .geoip_databases index in the output.
I'm currently stuck and uncertain about the next steps to resolve this issue. I've followed the documentation and conducted troubleshooting steps as outlined, but the problem persists. Any assistance or guidance from the community would be greatly appreciated.
Downloading is lazy by default now, so the databases are not fetched until a pipeline with a geoip processor exists. There is a setting that changes this.
Set that to true:
(Dynamic, Boolean) If true, Elasticsearch downloads GeoIP2 databases immediately, regardless of whether a pipeline exists with a geoip processor. If false, Elasticsearch only begins downloading the databases if a pipeline with a geoip processor exists or is added.
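A minimal sketch of applying that, assuming the setting described above is ingest.geoip.downloader.eager.download (that is my reading of the quoted description, so please check it against the docs for your version). The base URL and the helper name are hypothetical, and the request is only built here, not sent.

```python
import json
import urllib.request


def eager_download_request(base_url, enabled=True):
    """Build (but do not send) a PUT _cluster/settings request."""
    # Assumed setting name; it is dynamic, so no restart should be needed.
    body = {"persistent": {"ingest.geoip.downloader.eager.download": enabled}}
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/_cluster/settings",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


req = eager_download_request("https://localhost:9200")  # hypothetical address
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# Sending is left out on purpose: urllib.request.urlopen(req) would also
# need credentials and TLS settings that match your cluster.
```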
You need to look at the Elasticsearch logs to see the actual GeoIP download error. There should be details; you can turn up the logging for GeoIP if needed, but if it is an error, you should see why. This is how you will figure this out.
So it looks like you are self-hosting... you did not mention that.
How did you configure... I suspect there is an issue with that. Did you follow the steps? https://mydomain.com/
Use a custom endpoint
You can create a service that mimics the Elastic GeoIP endpoint. You can then get automatic updates from this service.
Like I said in an earlier post, all data nodes are able to download from my endpoint, and all 14 nodes have the setting "ingest.geoip.downloader.enabled: true". But the thing is, the log message about the missing .geoip_databases index is accurate: that index is not present.
I wanted to share an update on our situation. After consulting with Elastic Support and investigating further, we identified SSL handshake errors on the Elasticsearch node responsible for downloading the GeoIP database. The resolution involved adding our CA certificate to the truststore of the JVM used by Elasticsearch. Initially, I assumed that all SSL-related queries or API calls within Elasticsearch would utilize the SSL configurations specified in elasticsearch.yml. However, this experience taught me otherwise—an enlightening moment indeed!
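In case it helps someone else, here is a rough sketch of the kind of keytool command involved, assembled in Python so it can be scripted per node. The certificate path, truststore path, and alias are hypothetical; "changeit" is the usual default password of a JDK cacerts file and may differ in your environment. The truststore location depends on which JVM Elasticsearch actually runs with.

```python
import shlex


def keytool_import_command(ca_cert, truststore,
                           alias="geoip-endpoint-ca", storepass="changeit"):
    """Return the argv for importing a CA certificate into a JVM truststore."""
    return [
        "keytool", "-importcert", "-noprompt",
        "-alias", alias,
        "-file", ca_cert,
        "-keystore", truststore,
        "-storepass", storepass,
    ]


# Hypothetical paths; adjust to your CA file and to the JVM used by Elasticsearch.
cmd = keytool_import_command(
    "/path/to/ca.crt",
    "/usr/share/elasticsearch/jdk/lib/security/cacerts",
)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True), then restart the node.
```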
With the truststore now updated across all Elasticsearch nodes, the GeoIP database began to download as expected. Success at last! But, there's a twist...
Despite the progress, the "_geoip_database_unavailable_GeoLite2-City.mmdb" tag persisted in indices that use the geoip processor. To address this, I followed the instructions in the "Use a custom endpoint" section and, before proceeding with step 3, renamed our GeoIP2-City.mmdb database file to GeoLite2-City.mmdb. This proved effective, but it feels like a workaround that shouldn't be necessary: it only works because the file now carries the default filename.
Certainly, setting the database name in the ingest processor does work as expected. However, this approach doesn't align with our overarching goal. We're aiming to replace the default Elastic GeoIP database processing mechanism in a manner that doesn't require us to add specific configuration lines to every GeoIP processor. More importantly, we want to avoid the need to create or modify custom pipelines for all the Fleet integrations we utilize. Our objective is to streamline this process for efficiency and scalability.
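For context, this is roughly what the per-processor approach looks like, as a small Python sketch that builds a pipeline body for "PUT _ingest/pipeline/<name>". The field name "source.ip" and the helper name are made up for illustration; database_file is the geoip processor option that selects the database.

```python
import json


def geoip_pipeline_body(field, database_file):
    """Pipeline definition with a geoip processor pinned to one database file."""
    return {
        "description": "geoip lookup using an explicitly named database",
        "processors": [
            {"geoip": {"field": field, "database_file": database_file}}
        ],
    }


# Hypothetical field; the database name matches our custom file.
print(json.dumps(geoip_pipeline_body("source.ip", "GeoIP2-City.mmdb"), indent=2))
```

The database_file line is exactly what we would have to add to every geoip processor, including those inside Fleet-managed integration pipelines, which is what we are trying to avoid.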