Hello fellow Elasticsearch-ers!
This is my first post on the forums, and I'm reaching out for some guidance and knowledge.
I'm currently trying to apply a Rollup-based solution to my indices.
The purpose of our Rollup Jobs is to relieve Elasticsearch of old data that has to stay available but no longer needs to be accurate by the minute. The plan was to let the Jobs roll up the data, then close the old, already-rolled-up indices and keep them closed until we decide they are no longer needed (at which point we delete them for good).
The data is organized by customer and by month.
To keep the wheel spinning, I created a Rollup Job that should run at the beginning of each month ("monthly"). To be able to close old indices right away, we also create another Rollup Job called "oneshot", which runs within the first few minutes after it is created (I don't like waiting until next month for something that can be done right now).
So,
A.index_2019.03
A.index_2019.04
A.index_2019.05
get rolled up by "oneshot" into
A.index_rollup
and each month "monthly" would add documents to
A.index_rollup
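For readers who want to picture the two jobs, here is a minimal sketch of what their request bodies could look like, assuming the 6.x job format (PUT _xpack/rollup/job/<job_id>). The timestamp field name, the index pattern, the page size and both cron expressions are assumptions made for illustration, not the real templates; the one-off cron relies on the optional year field of Elasticsearch cron expressions.

```python
from datetime import datetime

# Assumed schedule: 00:00:00 on the first day of every month.
MONTHLY_CRON = "0 0 0 1 * ?"


def oneshot_cron(run_at):
    """Cron expression that matches one specific minute of one specific year."""
    return "0 {} {} {} {} ? {}".format(
        run_at.minute, run_at.hour, run_at.day, run_at.month, run_at.year
    )


def rollup_job_body(index_pattern, rollup_index, cron,
                    timestamp_field="timestamp",  # hypothetical field name
                    interval="1h", page_size=1000):
    """Build the JSON body for a rollup job; metrics and other groups omitted."""
    return {
        "index_pattern": index_pattern,
        "rollup_index": rollup_index,
        "cron": cron,
        "page_size": page_size,
        "groups": {
            "date_histogram": {"field": timestamp_field, "interval": interval},
        },
    }


# "A" stands for the customer name, as in the listing above.
oneshot = rollup_job_body("A.index_2019.*", "A.index_rollup",
                          oneshot_cron(datetime(2019, 6, 12, 10, 5)))
monthly = rollup_job_body("A.index_2019.*", "A.index_rollup", MONTHLY_CRON)
```

Both bodies differ only in their cron value and write to the same rollup index, which matches the setup described above.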
Now here is where the strange stuff kicks in:
I tested this plan long enough to feel comfortable and start rolling it out to production, but (obviously) something went wrong.
Some of our customers' indices were no longer searchable after we closed the old indices (the queries ARE updated to use _rollup_search), while others had no issues at all.
Looking into it, the data in the two customers' indices (regular and rollup) is identical, the jobs are created by a script and can't be any different, and the rollup indices are created the same way and have the same mappings... everything I could think of gave exactly the same results.
The next step was:
- add another "oneshot" to the "broken customer" (named "oneshot_2", because the Job naming rules would not allow it to be created otherwise)
- let it run
- check
No luck: the index was not fixed and was still not searchable.
Then I tried:
- remove all currently existing Jobs for the "broken customer"
- delete the "_rollup" index
- create another "oneshot" and let it run
At this point the data WAS searchable.
- create the "monthly" job
And here the issue came back: the index was not searchable anymore.
The "fun" thing is that the rollup index actually HAS data in it (anywhere from thousands to a few million documents), but _rollup_search just returns 0 hits.
Does anybody have any idea what could cause this, or where I could look for the cause?
(I have already compared the output of
GET A.index_rollup
GET B.index_rollup
GET A.index_rollup/_rollup/data
GET B.index_rollup/_rollup/data
GET _xpack/rollup/job/oneshot_A
GET _xpack/rollup/job/oneshot_B
GET _xpack/rollup/job/monthly_A
GET _xpack/rollup/job/monthly_B
)
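Comparing these responses by eye is error-prone, so here is a minimal sketch of how the comparison could be automated once the JSON responses are saved. It replaces the customer name with a placeholder and then reports every path where the two responses differ. The customer names and the trimmed sample responses are made up for illustration; real responses carry many more fields.

```python
MISSING = "<missing>"


def normalize(value, customer):
    """Replace the customer name with a placeholder in every key and string."""
    if isinstance(value, dict):
        return {normalize(k, customer): normalize(v, customer)
                for k, v in value.items()}
    if isinstance(value, list):
        return [normalize(v, customer) for v in value]
    if isinstance(value, str):
        return value.replace(customer, "<customer>")
    return value


def diff(a, b, path=""):
    """Return (path, value_in_a, value_in_b) for every difference found."""
    if isinstance(a, dict) and isinstance(b, dict):
        found = []
        for key in sorted(set(a) | set(b)):
            sub = path + "/" + key
            if key not in a:
                found.append((sub, MISSING, b[key]))
            elif key not in b:
                found.append((sub, a[key], MISSING))
            else:
                found.extend(diff(a[key], b[key], sub))
        return found
    # Lists and scalar values are compared as a whole.
    return [] if a == b else [(path, a, b)]


# Made-up, heavily trimmed stand-ins for two saved responses.
resp_a = {"customer_a.index_rollup": {
    "rollup_jobs": [{"job_id": "oneshot_customer_a"}]}}
resp_b = {"customer_b.index_rollup": {
    "rollup_jobs": [{"job_id": "oneshot_customer_b"},
                    {"job_id": "monthly_customer_b"}]}}

differences = diff(normalize(resp_a, "customer_a"),
                   normalize(resp_b, "customer_b"))
```

With these samples, `differences` holds a single entry for the `rollup_jobs` list, because only the second customer has a "monthly" job registered.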
EDIT: I forgot to specify a few things:
We are currently running version 6.7.1 on Elastic Cloud.
The Job templates roll up the data with "interval": "1h".
When I write that I "create" a Job, I also put the Job into the "started" state; this applies to both the "oneshot" and the "monthly".
In the last example I described, once the "oneshot" had been created and started (and had already run), the data was searchable. When I then created the "monthly", it was "started" but had not run yet (it's not the beginning of the month), and the data was no longer searchable.
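Since the data became unsearchable right after a second, not-yet-run job was added, one place to look might be which job each rollup document belongs to. Below is a minimal sketch of a regular _search body (not _rollup_search) that counts documents per job, plus a helper that reads the result. It assumes the rollup documents carry the job name in a keyword field called "_rollup.id"; that field name is an assumption to verify against a real document from the rollup index. The sample response and its count are made up.

```python
def docs_per_job_query(id_field="_rollup.id", max_jobs=10):
    """Body for a plain _search on the rollup index: no hits, one bucket per job."""
    return {
        "size": 0,
        "aggs": {
            "per_job": {"terms": {"field": id_field, "size": max_jobs}},
        },
    }


def docs_per_job(search_response):
    """Turn the terms aggregation of a _search response into {job_id: doc_count}."""
    buckets = search_response["aggregations"]["per_job"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}


# Made-up response fragment, only to show the expected shape.
sample_response = {
    "aggregations": {
        "per_job": {"buckets": [{"key": "oneshot_A", "doc_count": 1200}]}
    }
}
counts = docs_per_job(sample_response)
```

If only the "oneshot" job shows up in the counts while the "monthly" job is also registered on the same rollup index, that difference would be worth mentioning when asking for help.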