Change data from database

pcb · August 24, 2020, 9:29pm

Hi ya,

I'm storing timestamped change data coming off a SQL database into Elasticsearch and would like some advice. It is over 100 tables and most will be fairly small (small number of bytes + not many changes over course of the day) with a few tables being much larger and more active.

I'm curious if you all would recommend one index per table with an alias or one mono-index.

Some points:

Every table will have similar context information (timestamp, user id, org id, etc)
Queries will usually be over all tables by timestamp. I'll likely be sorting by timestamp and aggregating a lot across tables.
I'll like be rolling over the index/indices every week or month.
There will be about 1000 columns (so 1000 fields about total).
Very rarely add columns (and therefore fields).
Quite a few of the tables are small enough that they wouldn't even necessarily need their own shard space-wise(quite a few would be less than 10GB per month).

I'll be writing into it with bulk action, from one process atm. I'm mostly focused on query speed.

I realize there isn't always a straightforward answer here, but I'd love any thoughts you could provide.

Thanks much,
Patrick

warkolm · August 24, 2020, 9:31pm

I would just use one index for them all, and then look at attaching a field+value that identifies the originating table so that you can filter on that if you want.

pcb · August 24, 2020, 10:07pm

Awesome thanks @warkolm , that was the direction I was leaning.

Is 1000 fields within one index an issue? I know having "lots" of fields can be tough on a cluster's master nodes, but never heard what "lots" really means.

Thanks,
Patrick

warkolm · August 24, 2020, 10:09pm

Check out https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html#mapping-limit-settings for more on that.

system · September 21, 2020, 10:09pm

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.

Topic		Replies	Views
Best Approach to taking a large(ish) sql table into ES Elasticsearch	2	435	July 6, 2017
Relational->ES index schema strategy (1 index per table or all tables in 1 index) Elasticsearch	5	1024	July 6, 2017
General question about indexing Elasticsearch	4	372	July 6, 2017
Many small indices vs one large index? Elasticsearch	10	5459	July 6, 2017
Need help architecting an index/schema (migrating data from MySQL) Elasticsearch	7	453	July 6, 2017

Change data from database

Related topics