Enrich Elasticsearch via MySQL based on field value

TeePee · July 12, 2016, 5:19pm

Hi,

I'm importing a ton of events into Elasticsearch via Logstash from an S3 bucket of logs.

Based on a specific UUID in every event, I'd like to fetch aggregated info from a MySQL database and update/enrich that same event with the extra data, so it can be used later in Kibana.

Is there a way I can accomplish this?

TIA

javanna · July 14, 2016, 8:05am

Hi,
I'm afraid I'd need some more info to be able to help you. Can you post what the data looks like and what exactly you want to do with it, please?

TeePee · July 14, 2016, 9:31am

Hi @javanna

Well, our log entries look something like this:

web	2016-06-23 15:17:55.612	2016-06-23 14:59:53.000		unstruct	9f8aaa2f-b24f-4c8b-8ca6-f6289f072a85			custom	clj-1.1.0-tom-0.2.0	hadoop-1.6.0-common-0.21.0		93.XXX.243.XXX				07eae7d2-04ae-464f-8cd0-98a917bb9975												http://xxx.com/test			http	xx.com	80	/test																							{"schema":"iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0","data":{"schema":"iglu:com.xxx/open/jsonschema/1-0-1","data":{"cid":"2253450","eid":"2231323","uid":"21"}}}																			Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.103 Safari/537.36																																															2016-06-23 14:59:53.000	com.xxx	open	jsonschema	1-0-1

Within Logstash we're able to parse all the parameters, including the ones inside the jsonschema payload.
So our idea is to use the cid, eid, and uid values passed in each event to look up a MySQL database and retrieve some extra info that allows us to enrich the event data. Let's say gender, age, anything else... that would allow us to do more consistent reports in Kibana.

I've found a plugin for Logstash that would do just that, but it has yet to be developed.
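
For reference, here is a minimal Python sketch of pulling the lookup keys out of the nested JSON shown in the sample log line. It is only an illustration of the shape of the data, not how Logstash parses it; the function name `extract_lookup_keys` is made up for this example.

```python
import json

# Nested payload copied from the sample log line in this post.
PAYLOAD = (
    '{"schema":"iglu:com.snowplowanalytics.snowplow/unstruct_event/jsonschema/1-0-0",'
    '"data":{"schema":"iglu:com.xxx/open/jsonschema/1-0-1",'
    '"data":{"cid":"2253450","eid":"2231323","uid":"21"}}}'
)


def extract_lookup_keys(payload, wanted=("cid", "eid", "uid")):
    """Return the wanted keys from the inner 'data' object (hypothetical helper).

    Keys that are absent from the payload are simply skipped.
    """
    inner = json.loads(payload).get("data", {}).get("data", {})
    return {key: inner[key] for key in wanted if key in inner}


print(extract_lookup_keys(PAYLOAD))
# → {'cid': '2253450', 'eid': '2231323', 'uid': '21'}
```

These three values are what the database lookup would be keyed on.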

jsvd · July 14, 2016, 10:01am

Right now there's no built-in functionality to handle JDBC at the filter stage. There is a community-made jdbc filter by one of the most involved contributors to Logstash, but I haven't tested it myself.

An alternative, which I used at my last job, is to use a zeromq filter together with a small application that speaks ZeroMQ and executes queries using a Ruby library like Sequel.
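
To give a rough idea of what the query side of such a small application does, here is a sketch in Python rather than Ruby. It makes several assumptions that are not from this thread: the `users` table and its `gender`/`age` columns are invented, the standard-library `sqlite3` module stands in for MySQL so the example runs on its own, and the ZeroMQ transport is left out entirely.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database standing in for MySQL (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.row_factory = sqlite3.Row
    conn.execute(
        "CREATE TABLE users (uid TEXT PRIMARY KEY, gender TEXT, age INTEGER)"
    )
    conn.execute("INSERT INTO users VALUES (?, ?, ?)", ("21", "f", 34))
    conn.commit()
    return conn


def enrich(event, conn):
    """Return a copy of the event with gender/age added when the uid is known.

    Events whose uid has no matching row are returned unchanged.
    """
    row = conn.execute(
        "SELECT gender, age FROM users WHERE uid = ?", (event.get("uid"),)
    ).fetchone()
    enriched = dict(event)
    if row is not None:
        enriched.update({"gender": row["gender"], "age": row["age"]})
    return enriched


conn = build_demo_db()
print(enrich({"cid": "2253450", "eid": "2231323", "uid": "21"}, conn))
```

In a real setup the service would receive the event from the filter, run a parameterized query like this against MySQL, and send the enriched event back. Caching lookups per uid would likely matter at high event volumes.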
