Transforming into JSON-Documents

Hello!

I am an absolute beginner with Elasticsearch, and after some hours of reading and watching tutorial videos I got the impression that it takes a lot of knowledge and preparation before I can run a fast search.

In a tutorial book on Elasticsearch I read:
"Elasticsearch is a search-server, which can store JSON-Documents and (then) search in them."

Does this mean that I have to transform my huge amount of data into JSON documents first,
before I can search it?
I guess the ELK Stack or Elasticsearch does the transformation into JSON documents for me.
Is that correct?
I also guess that the ELK Stack or Elasticsearch builds the indices that I use for fast searching.
Is that correct?

Thank you for helping me!

Elasticsearch will turn your data into JSON when it receives it. The result might not be in the format you want, so it can make sense to do the conversion yourself before sending the data to Elasticsearch.

Elasticsearch handles all the creation and searching of the indices.
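
As a rough sketch of what doing the conversion yourself could look like, the Python snippet below turns CSV text into the newline-delimited JSON body that the Elasticsearch `_bulk` API accepts. The function name `csv_to_bulk_body`, the index name `books`, and the sample data are made up for illustration. Only the standard library is used, and nothing is actually sent to a server.

```python
import csv
import io
import json


def csv_to_bulk_body(csv_text, index_name):
    """Turn CSV text into a newline-delimited JSON body for the _bulk API.

    Hypothetical helper for illustration; index_name is whatever index
    you want the documents to go into.
    """
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Action line: tells Elasticsearch to index the following line.
        lines.append(json.dumps({"index": {"_index": index_name}}))
        # Source line: the document itself, one JSON object per CSV row.
        lines.append(json.dumps(row))
    # Every line of a bulk body, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


# Made-up sample data.
sample = "title,author\nFaust,Goethe\nWallenstein,Schiller\n"
print(csv_to_bulk_body(sample, "books"))
```

Note that every value read from a CSV file is a string. If a column holds numbers or dates, this is the place where you would convert it before sending, which is one reason to prepare the JSON yourself.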

Hello,
and thank you very much for your answer! :slight_smile:

You say that the JSON data that Elasticsearch creates is often not in an optimal format.
I guess certain tools are usually used to transform the original data into well-formatted JSON.
What kind of tools can I use to transform the original data into well-formatted JSON?

Thank you for helping me.

That depends on the type of the source data.

Hello,
and thank you very much for your answer!
I assume you mean that there is a specific tool for each type?
For example, is there a specific tool for CSV-formatted data?
Or a specific tool for plain ASCII files?

Is there a transformation tool that recognizes the type of each part of the source data?

Thank you.

There are things like Filebeat for event-driven data.
Or Logstash, which can handle both event-style and document-style data.

Hello, and thank you.
I don't understand that. Do you mean that both Filebeat and Logstash transform data into well-formed JSON, and that both can recognize the type of the source data by using certain techniques?
Can you explain "event-driven" and "document-style" in this context? Thanks. :slight_smile:

What sort of data are you looking to ingest here?

Hello, and thank you.
What I understand is that Filebeat and Logstash transform all, or almost all, arbitrary data into well-formed JSON data.
I would like to know which tools to use for plain ASCII text files, for Office files like .xls or .doc, for binary files like .png, for .pdf files, for CSV files, for HTML files, for XML files, and for log files.
In principle, all human-readable files, which also includes programming source code (.cpp files, .java files). (I know binary files like .png are not human-readable.)
I would be thankful for a short overview.

Greetings from the Rhine in Germany. :slight_smile:

Filebeat is mostly for event-based data, i.e. stuff with timestamps. Logstash has traditionally been used for that too, but it can also be used for other formats. Both will transform the incoming data into JSON.
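
To give an idea of what "event-based data turned into JSON" means, here is a minimal Python sketch that converts one timestamped log line into a JSON document. It is not how Filebeat or Logstash are implemented. The log layout, the `level` field, the function name `log_line_to_event`, and the assumption that timestamps are in UTC are all made up for illustration. `@timestamp` and `message` are the field names those tools conventionally use.

```python
import json
import re
from datetime import datetime, timezone

# Assumed log layout: "YYYY-MM-DD HH:MM:SS LEVEL free text"
LOG_PATTERN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (?P<level>[A-Z]+) (?P<message>.*)$"
)


def log_line_to_event(line):
    """Turn one log line into a dict that can be serialized as a JSON event."""
    line = line.strip()
    match = LOG_PATTERN.match(line)
    if match is None:
        # Lines that do not fit the assumed layout are kept whole.
        return {"message": line}
    # Assumption: the timestamps in the log are in UTC.
    ts = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return {
        "@timestamp": ts.isoformat(),
        "level": match["level"],
        "message": match["message"],
    }


print(json.dumps(log_line_to_event("2024-01-15 10:30:00 ERROR disk full")))
```

Each log line becomes one small document with its own timestamp, which is the "event" style. A "document" style source would be something like one JSON object per book or per product, without a time axis.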

Regarding the other formats, there are tools that can extract data from binary files. There's nothing native to the Elastic Stack that can do this.