I'm not sure exactly where to pose this question, but two of my three possible approaches involve Logstash, so it seemed the right place to start. I have incoming Apache access logs and need to extract the client IP from each entry, look up each IP against a reference dataset, and insert the enriched events into Elasticsearch. The lookup data currently lives in an RDBMS, but could easily be loaded into a separate Elasticsearch index. As near as I can tell, the right way to make this happen is one of the following:
1. Load the lookup data into Elasticsearch and use the Logstash elasticsearch filter, chained after a grok filter, with an elasticsearch output, likely with Filebeat shipping the incoming logs.
2. Similar to 1, but leave the data in the RDBMS and use the jdbc_static filter rather than the elasticsearch filter. The only reason to do this is simplicity; I'm assuming 1 would perform better, but I have no experience to inform that.
3. Write and use an ingest plugin for Elasticsearch. I've had difficulty with the examples and documentation here, so it's my least favorite option, but if it's the only way to get good performance, I'm willing to pursue it.
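For context, here's roughly what I imagine option 1 looking like as a pipeline — this is an untested sketch, and the index name (`ip-lookup`), the lookup document fields (`ip`, `owner`), and the hosts are placeholders standing in for my real data:

```
input {
  beats {
    port => 5044
  }
}

filter {
  grok {
    # Standard pattern for Apache combined-format access logs;
    # populates "clientip" among other fields.
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  elasticsearch {
    # Query a separate enrichment index by the extracted client IP
    # and copy the matching document's "owner" field onto the event.
    hosts  => ["localhost:9200"]
    index  => "ip-lookup"
    query  => "ip:%{[clientip]}"
    fields => { "owner" => "ip_owner" }
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "enriched-apache-%{+YYYY.MM.dd}"
  }
}
```

My main worry with this is the per-event round trip the elasticsearch filter makes to the cluster.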
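And for option 2, a jdbc_static sketch along these lines — again untested, with a hypothetical `ip_owners` table, a Postgres driver assumed, and column names made up for illustration:

```
filter {
  jdbc_static {
    # Periodically copy the lookup table from the RDBMS into a local
    # in-memory database, then enrich events from the local copy.
    loaders => [
      {
        id => "ip_owners"
        query => "SELECT ip, owner FROM ip_owners"
        local_table => "ip_owners"
      }
    ]
    local_db_objects => [
      {
        name => "ip_owners"
        index_columns => ["ip"]
        columns => [
          ["ip", "varchar(45)"],
          ["owner", "varchar(255)"]
        ]
      }
    ]
    local_lookups => [
      {
        # Look up the event's client IP in the local copy and write
        # the result onto the event under "ip_owner".
        query => "SELECT owner FROM ip_owners WHERE ip = :ip"
        parameters => { ip => "[clientip]" }
        target => "ip_owner"
      }
    ]
    loader_schedule => "*/30 * * * *"   # refresh the local copy every 30 minutes
    jdbc_driver_class => "org.postgresql.Driver"
    jdbc_driver_library => "/path/to/postgresql.jar"
    jdbc_connection_string => "jdbc:postgresql://db-host/lookups"
    jdbc_user => "logstash"
  }
}
```

The appeal here is that lookups hit a local cache rather than the network, at the cost of the data being only as fresh as the loader schedule.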
Has anyone done something similar? What approach did you take? Am I missing something obvious? Many thanks for any good advice you can share.