Real-time data ingestion into ES from multiple sources

Hi Team,

I am working on a data ingestion project.
I have multiple data sources, listed below:

  1. MySQL data
  2. CSV data
  3. XML data
  4. JSON data
  5. Log data
  6. MongoDB data

Data is being generated continuously, so I want to make sure it is ingested in real time from all of these data sources.
How can I make sure I am doing the data ingestion in the correct way?

Thanks in advance.

I would consider using Logstash, which has a number of input plugins to help get you started, e.g. the beats plugin for file-based inputs, or the jdbc plugin for DB connections.
Check the Logstash docs here: https://www.elastic.co/guide/en/logstash/current/input-plugins.html
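
As a rough sketch of the two input plugins mentioned above, an input section might look something like this. The port, driver path, database name, credentials, and table name are placeholders made up for illustration, not values from this thread:

```conf
input {
  # Receives events shipped by a Beats agent such as Filebeat
  # (log files, CSV files, and so on).
  beats {
    port => 5044
  }

  # Polls a MySQL table over JDBC. All connection details are hypothetical.
  jdbc {
    jdbc_driver_library => "/path/to/mysql-connector-j.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "logstash"
    jdbc_password => "changeme"
    statement => "SELECT * FROM table1"
    # Cron-like syntax: run the statement once a minute.
    # Without a schedule the statement is run only once.
    schedule => "* * * * *"
  }
}
```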

Logstash also provides filter plugins to handle the csv and xml data formats. See here: https://www.elastic.co/guide/en/logstash/current/filter-plugins.html
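
A minimal sketch of how the csv and xml filters could be wired up. It assumes each event carries a `type` field set on the input side to tell the streams apart, and the column names are hypothetical:

```conf
filter {
  # The [type] field is an assumption: it would have to be set by the
  # input (for example with the "type" option) for these branches to match.
  if [type] == "csv" {
    csv {
      separator => ","
      # Hypothetical column names; replace with the real CSV header.
      columns => ["id", "name", "created_at"]
    }
  } else if [type] == "xml" {
    xml {
      # Parse the raw XML held in "message" into the "parsed" field.
      source => "message"
      target => "parsed"
    }
  }
}
```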

And of course the elasticsearch output plugin takes care of getting your documents into ES.
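
For illustration only, an output section could be as small as the following. The host URL and index name are assumptions, so adjust them to your cluster:

```conf
output {
  elasticsearch {
    # Hypothetical local single-node cluster.
    hosts => ["http://localhost:9200"]
    # Hypothetical index name with a daily date suffix.
    index => "ingest-%{+YYYY.MM.dd}"
  }
}
```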

Once you've defined your inputs and filters, consider defining separate Logstash pipelines to segregate your different data streams.

Thanks for the reply.
As far as I understand, the steps will be as follows:

  1. The Elasticsearch service should be running [always].
  2. The Logstash service should be running [always].
  3. The Kibana service should be running [always].

Then I run the following command [first time only]:
bin/logstash -f mypipeline.conf

At that point, whatever data is in the input location will be indexed, and after that any new data will be picked up automatically.

For example:

  1. On the first run, Logstash will ingest 100 records from table 1 [assuming 100 records exist at that time].
  2. Later, 5 records are inserted into table 1, and Logstash automatically ingests those new records into ES.

Is that correct?
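
One caveat for the table example, offered as a sketch rather than a definitive answer: the jdbc input runs its statement only once unless a schedule is set, and it only skips rows it has already read if the query filters on `:sql_last_value`. The config below assumes table 1 is named `table1` and has an auto-incrementing `id` column; the connection settings are placeholders:

```conf
input {
  jdbc {
    # Hypothetical connection details.
    jdbc_driver_library => "/path/to/mysql-connector-j.jar"
    jdbc_driver_class => "com.mysql.cj.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/mydb"
    jdbc_user => "logstash"
    jdbc_password => "changeme"
    # Re-run the query once a minute.
    schedule => "* * * * *"
    # Remember the highest "id" seen so far between runs.
    use_column_value => true
    tracking_column => "id"
    tracking_column_type => "numeric"
    # Only fetch rows newer than the last tracked id.
    statement => "SELECT * FROM table1 WHERE id > :sql_last_value ORDER BY id ASC"
  }
}
```

With a setup along these lines, the first run should pick up the existing 100 rows and later runs only the newly inserted ones. Note that tracking on `id` does not catch updates to rows that were already ingested.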
