Slow indexing speed

I'm confused, as there is no way your graphing system, whatever that is, is taking millions of data points and putting them on the screen; and certainly not any Excel I've ever used.

So I don't get the "In Excel I need only to pick two columns, without any aggregations" when talking about millions of data points (and 40 fields). Even 100K in Excel is a lot.

It must aggregate or process somehow, so I'm puzzled by your "so there is no need for aggregation at all" when you are visualizing millions of data points. Find a way to do that work in Elasticsearch and graph the 100-1000 points you get in the end.
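As a sketch of what "do that work in Elasticsearch" could look like: a `date_histogram` aggregation that collapses millions of raw documents into a few hundred time buckets before anything reaches the screen. The index and field names here (`@timestamp`, `value`) and the interval are assumptions for illustration, not your actual schema:

```python
# Sketch of an Elasticsearch search body that returns aggregation
# buckets instead of millions of raw hits. Field names "@timestamp"
# and "value" are placeholders for your real mapping.
query = {
    "size": 0,  # no raw hits, only the aggregation results
    "aggs": {
        "over_time": {
            "date_histogram": {
                "field": "@timestamp",
                # pick the interval so your window yields ~100-1000 buckets
                "fixed_interval": "1m",
            },
            "aggs": {
                "avg_value": {"avg": {"field": "value"}}
            },
        }
    },
}
```

You would send this as the body of a `_search` request; the response contains one bucket per interval with the averaged value, which is a plottable number of points.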

Ok, thanks Steve.
You're right about Excel, but it can handle a million records.
It takes some time; a lot of time.

About reducing data volume:
I have a time axis in milliseconds, and I want to see a different part of the graph each time, within a time frame that I know will include at most 500 thousand points.
Can I make Kibana limit the time frame, and let the user change it to another time frame himself in the visualization, so he sees a different part of the whole graph each time?
(If the time frame is limited by me, I can ensure that I see all the relevant dots for that specific time frame.)
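One way to enforce a bounded window is to put a `range` filter on the timestamp into the query itself; Kibana's time picker does essentially this, and the user can slide the window without you changing anything. A minimal sketch, assuming a field called `@timestamp` and a window width you have sized to stay under your point budget:

```python
from datetime import datetime, timedelta, timezone

# Sketch: constrain a query to a fixed-width window so it can never
# match more than that window's worth of points. The field name
# "@timestamp" and the 10-minute width are illustrative assumptions.
window_end = datetime(2024, 1, 1, tzinfo=timezone.utc)
window_start = window_end - timedelta(minutes=10)

query = {
    "query": {
        "range": {
            "@timestamp": {
                "gte": window_start.isoformat(),
                "lt": window_end.isoformat(),
            }
        }
    }
}
```

To show "another part of the whole graph," you only shift `window_start`/`window_end`; the width, and therefore the maximum number of matching points, stays fixed.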

You must have a better Excel than me, as mine sucks for most things. Elasticsearch is VERY good at time series data and can do lots of things for aggregating, bucketing, and so on, starting in Kibana. Though I'm not sure what it does natively if you specify, say, a week of data and that's a million points; I imagine it aggregates automatically, but you would need Kibana experts to tell you. Suggest you try it, starting with small time windows. But that's a normal time series problem.

Suggest you start a thread here on what you need it to do, like: I have 50M time series metrics records and I need to visualize them in Kibana, starting with a basic line graph, to show you can get a time window, aggregation/bucket, etc.

Then explain that you really need a scatter plot, which will involve more math by Elasticsearch on the raw data before it aggregates. I don't know how to do that, but I bet smarter people do; even if it means moving data from one index to another, smaller one first, in stages, or whatever; that's beyond me.
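One pattern for the "more math before it aggregates" step (a sketch, not the only way): bucket one numeric field with a `histogram` aggregation and average the second field within each bucket, which turns millions of raw (x, y) pairs into a few hundred scatter-ready points server-side. The field names `x_field`/`y_field` and the interval are placeholders:

```python
# Sketch: downsample raw (x, y) pairs into scatter-plot points inside
# Elasticsearch. Field names "x_field"/"y_field" are made up here.
query = {
    "size": 0,
    "aggs": {
        "x_buckets": {
            "histogram": {"field": "x_field", "interval": 5.0},
            "aggs": {"y_avg": {"avg": {"field": "y_field"}}},
        }
    },
}

def to_points(response):
    """Extract (x, y) pairs from the aggregation response buckets."""
    buckets = response["aggregations"]["x_buckets"]["buckets"]
    return [(b["key"], b["y_avg"]["value"]) for b in buckets]
```

The resulting list of a few hundred pairs is small enough for any plotting front end, whether a Kibana plug-in or Grafana.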

For the actual scatter plot, maybe there are Kibana plug-ins for this, or you can do it in Grafana instead, but you'd need the underlying query to get the data first.

Then you'd be using Elasticsearch in a good, fast way to load, store, and do basic stats/analyses on 'big' data. Of course, 50M rows is not big data, but it would get you started, and if it works you can scale to billions of data points and many TB or even PB of data, and it would all work the same way as long as it's RAM-efficient.

Also, the plot you are doing seems fairly standard in particle physics, so I bet others have already done this; and given the low budgets of most labs, probably some have done it in the ELK stack.

Thanks a lot
You helped me so much :slight_smile: !

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.