Elasticsearch and its files?


(Daniel) #1

Hi everyone.

I'm kind of brand new to Elasticsearch. Is there a way to see which files are going through Elasticsearch?

Let's say my Elasticsearch is processing some files that are passing through. Is there a way to see which files are passing through and how long each file will take? Maybe a plugin or something?


(David Pilato) #2

Not sure I understand the question.

What do you mean by "files going through elasticsearch"?

Are you talking about the files elasticsearch is storing locally on the hard drive?


(Daniel) #3

So right now I'm using a program that I set up to run with the Elasticsearch engine. It would be really nice if there was a way to see what Elasticsearch is doing and what is going through the engine.


(David Pilato) #4

From a JSON document to the bits on disk? Is that what you are looking for?

Like:

  • parse JSON
  • analyze text
  • build the inverted index
  • write on disk
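To make the first three steps in this list concrete, here is a toy sketch in Python. It is not how Elasticsearch or Lucene implement them: the `analyze` and `build_inverted_index` helpers are hypothetical names, the analyzer just lowercases and splits on non-alphanumeric characters, and the write-to-disk step is left out entirely.

```python
import json
import re
from collections import defaultdict


def analyze(text):
    """Toy analyzer: lowercase, then split on anything that is not a-z or 0-9."""
    return [tok for tok in re.split(r"[^a-z0-9]+", text.lower()) if tok]


def build_inverted_index(json_docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, raw in enumerate(json_docs):
        doc = json.loads(raw)                    # step 1: parse JSON
        for value in doc.values():
            if isinstance(value, str):
                for term in analyze(value):      # step 2: analyze text
                    index[term].add(doc_id)      # step 3: build the inverted index
    return {term: sorted(ids) for term, ids in index.items()}


docs = ['{"title": "Quick brown fox"}', '{"title": "Lazy brown dog"}']
index = build_inverted_index(docs)
print(index["brown"])  # [0, 1]
```

Looking up a term is then a dictionary access, which is the basic reason an inverted index makes full-text search fast.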

Well, I don't think such a picture exists, as it's really complex and depends on many factors.

But again, maybe I don't understand your question.
Why are you asking that, actually?


(Jaishil Dhal) #5

Go through these docs, they'll help you:

https://www.elastic.co/guide/en/kibana/current/getting-started.html


(Daniel) #6

Hi,

I just want to see what the engine is doing, which files are passing through, and so on. Is it possible to see what it is working on?


(Jörg Prante) #7

Elasticsearch is a distributed system. You can use monitoring tools like kopf, head, Marvel, etc. to gain insight.

Monitoring a single query through the system can be achieved by using the query profiler: https://www.elastic.co/elasticon/conf/2016/sf/profiling-elasticsearch-queries-for-fun-and-profit
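As a rough illustration of reading profiler output, the Python sketch below flattens the `profile` section of a search response into rows of (depth, query type, time). The field names (`shards`, `searches`, `query`, `children`, `time_in_nanos`) are assumed from the Profile API documentation of later Elasticsearch versions, so check them against your version. The sample response is hand-written for illustration and `flatten_profile` is a hypothetical helper name.

```python
def flatten_profile(response):
    """Return (depth, type, time_in_nanos) for every query node in a profiled response."""
    rows = []

    def walk(node, depth):
        rows.append((depth, node.get("type"), node.get("time_in_nanos")))
        for child in node.get("children", []):
            walk(child, depth + 1)

    for shard in response.get("profile", {}).get("shards", []):
        for search in shard.get("searches", []):
            for node in search.get("query", []):
                walk(node, 0)
    return rows


# Hand-written sample, shaped like a response to a search sent with "profile": true.
sample_profile = {
    "profile": {
        "shards": [{
            "searches": [{
                "query": [{
                    "type": "BooleanQuery",
                    "time_in_nanos": 120000,
                    "children": [
                        {"type": "TermQuery", "time_in_nanos": 70000},
                        {"type": "TermQuery", "time_in_nanos": 40000},
                    ],
                }]
            }]
        }]
    }
}

for depth, query_type, nanos in flatten_profile(sample_profile):
    print("  " * depth + f"{query_type}: {nanos} ns")
```

The indentation in the printed output mirrors the query tree, so the slowest sub-query is easy to spot.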

Monitoring indexing is possible by evaluating the Java API BulkResponse object which Elasticsearch returns.
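The same information is available outside Java in the JSON body that the REST `_bulk` endpoint returns, which carries `took`, `errors`, and one entry per action in `items`. Below is a minimal Python sketch that summarizes such a response. The sample response is hand-written for illustration and `summarize_bulk_response` is a hypothetical helper name, not part of any client library.

```python
def summarize_bulk_response(response):
    """Summarize a parsed _bulk response: duration, item count, and failed items."""
    items = response.get("items", [])
    failures = []
    for item in items:
        # Each item is a single-key dict such as {"index": {...}} or {"delete": {...}}.
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result.get("_id"), result.get("status")))
    return {
        "took_ms": response.get("took"),
        "total": len(items),
        "failed": len(failures),
        "failures": failures,
    }


# Hand-written sample: one document indexed, one rejected.
sample_bulk = {
    "took": 30,
    "errors": True,
    "items": [
        {"index": {"_index": "logs", "_id": "1", "status": 201}},
        {"index": {"_index": "logs", "_id": "2", "status": 400,
                   "error": {"type": "mapper_parsing_exception",
                             "reason": "failed to parse"}}},
    ],
}

summary = summarize_bulk_response(sample_bulk)
print(summary["took_ms"], summary["total"], summary["failed"])  # 30 2 1
```

Logging this summary after every bulk request gives a simple view of what was indexed and how long each batch took.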

Index segment monitoring can be done by plugins like Whatson: https://github.com/xyu/elasticsearch-whatson

Low-level monitoring (files etc.) is hard for several reasons: the volume of data is massive, the monitored nodes are distributed and work asynchronously, and, most importantly, the insight gained from watching raw data and events is rather limited. That is why monitoring tools like Marvel do the heavy lifting and show graphical data (trends) and statistics.

