Let's assume I have a document, which can have a state together with the timestamp when the document was set to this state. I want to know the document's current state as well as the document's state history.
Basically, I see three options for modeling this:
Option 1:
I have a nested state object with its name and timestamp.
PUT doc_idx
{
  "mappings": {
    "state_doc": {
      "properties": {
        "state": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "keyword"
            },
            "date": {
              "type": "date"
            }
          }
        }
      }
    }
  }
}
This is probably the cleanest way to do it, as there is no redundant information. But I haven't yet found out how to filter documents by their current state, as it would need a nested query in which only the nested object with the maximum timestamp is allowed to match the state.
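To illustrate the problem with option 1, here is a minimal sketch of the straightforward nested query, written as a Python function that only builds the request body. The function name `ever_in_state_query` and the state name `approved` are made up for this example. As far as I can tell, this matches every document that has ever been in the given state, not only the documents whose latest entry matches.

```python
import json


def ever_in_state_query(state_name):
    """Build a search body against the option 1 mapping.

    The nested query matches a document if ANY entry under `state` has the
    given name, so it cannot tell the current state from a past one.
    """
    return {
        "query": {
            "nested": {
                "path": "state",
                "query": {"term": {"state.name": state_name}},
            }
        }
    }


# "approved" is a hypothetical state name used only for illustration.
print(json.dumps(ever_in_state_query("approved"), indent=2))
```

The printed JSON is what would be sent to the `_search` endpoint of the index.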
Option 2:
I have a nested state object with its name and timestamp and a current flag.
PUT doc_idx
{
  "mappings": {
    "state_doc": {
      "properties": {
        "state": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "keyword"
            },
            "date": {
              "type": "date"
            },
            "current": {
              "type": "boolean"
            }
          }
        }
      }
    }
  }
}
This would make it easier to filter documents by their current state. But it is prone to errors, as there could be fewer or more than one state carrying the current flag.
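A minimal sketch of what option 2 would look like in practice, again as Python that only builds the request body. The nested query requires the name and the current flag to match on the same nested entry. The helper `has_exactly_one_current` is a hypothetical client-side check for the invariant that I would have to maintain myself; Elasticsearch does not enforce it.

```python
def current_state_query(state_name):
    """Search body for option 2: both conditions must hold on the SAME entry."""
    return {
        "query": {
            "nested": {
                "path": "state",
                "query": {
                    "bool": {
                        "filter": [
                            {"term": {"state.name": state_name}},
                            {"term": {"state.current": True}},
                        ]
                    }
                },
            }
        }
    }


def has_exactly_one_current(states):
    """Return True only if exactly one state entry carries the current flag."""
    return sum(1 for s in states if s.get("current") is True) == 1
```

The check would have to run before every write, which is the extra bookkeeping that makes this option fragile.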
Option 3:
I have a nested state_history object with its name and timestamp. Additionally, there is a current_state and a current_state_since field.
PUT doc_idx
{
  "mappings": {
    "state_doc": {
      "properties": {
        "current_state": {
          "type": "keyword"
        },
        "current_state_since": {
          "type": "date"
        },
        "state_history": {
          "type": "nested",
          "properties": {
            "name": {
              "type": "keyword"
            },
            "date": {
              "type": "date"
            }
          }
        }
      }
    }
  }
}
This version makes it very easy to filter documents by their current state. But setting a document's new state would require updating both the current state fields and the state history every time.
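For option 3, a sketch of both sides, assuming the Update API with a Painless script is acceptable here. The filter is a plain term query, and the update body changes the history and the two current fields in one request so they cannot drift apart between calls. The function names and the example values are hypothetical, and the exact update endpoint depends on the Elasticsearch version.

```python
def current_state_filter(state_name):
    """Search body for option 3: a plain term filter on the top-level field."""
    return {
        "query": {"bool": {"filter": [{"term": {"current_state": state_name}}]}}
    }


def set_state_update(state_name, since):
    """Update API body: one script appends to the history and sets both
    current_state fields from the same entry."""
    return {
        "script": {
            "lang": "painless",
            "source": (
                # Create the history list if the document has none yet.
                "if (ctx._source.state_history == null) {"
                " ctx._source.state_history = []; } "
                "ctx._source.state_history.add(params.entry); "
                "ctx._source.current_state = params.entry.name; "
                "ctx._source.current_state_since = params.entry.date;"
            ),
            "params": {"entry": {"name": state_name, "date": since}},
        }
    }
```

Because the new entry is passed once as `params.entry`, the current fields and the newest history entry always carry the same name and date.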
Is there another, better option? If option 1 is the way to go: how would I filter documents by their current state?