Marvel 2.1 ES 2.1 - node UIDs duplication on online/offline states

monitoring

(Marcin Kubica) #1

Hi,

I'm running simple tests with two data nodes and one non-data node, deploying Kibana 4.3 and ES 2.1 from scratch.

Initially Marvel shows 3 nodes and everything is green. If I stop a node, Marvel shows that it's down. However, when I start the node again, its entry appears to be duplicated.

This can be reproduced for both data and non-data ES nodes.

The links under the node names have different UIDs, even though the node names shown are identical.

If I click on the node reported as offline, I can see its historical data.

If I stop the master node, the newly elected master appears as a new node again.

And bringing back the node that was previously master creates yet another "new" node.

If I delete the .marvel-YYYY.MM.DD index, as suggested in "Node names missing after upgrading ES and Marvel to 2.1, KB to 4.3.0", the display page returns to normal.

Cheers
Marcin


(Tim Sullivan) #2

In Marvel 2.1, we made the design decision to group nodes in the listing by UID, rather than by name or IP / port number. There are different ways it could be done. Grouping by UID is accurate in the sense that when you start a node, it is a new and distinct node, even if its other meta information, such as its name, matches a node that has been in the cluster before. In fact, nothing prevents multiple nodes in the cluster from sharing the same name. The UID is unique though, which is why we thought it made sense to group node data by node UID.
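To make the difference concrete, here is a minimal Python sketch of the two grouping strategies. It is only an illustration: the record fields (`uid`, `name`, `online`), the sample values, and the `group_by` helper are hypothetical and are not Marvel's actual schema or code.

```python
from collections import defaultdict

# Hypothetical monitoring records, one per node process seen in the time window.
# The field names and values are made up for illustration.
records = [
    {"uid": "a1", "name": "data-1", "online": False},  # data-1 before it was stopped
    {"uid": "b2", "name": "data-1", "online": True},   # data-1 after restart: same name, new UID
    {"uid": "c3", "name": "data-2", "online": True},
]


def group_by(rows, key):
    """Group the rows into a dict keyed by the given field."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key]].append(row)
    return dict(groups)


by_uid = group_by(records, "uid")    # one listing entry per node process
by_name = group_by(records, "name")  # one listing entry per node name

print(len(by_uid))   # 3 entries: the restarted node appears twice
print(len(by_name))  # 2 entries: the restart is folded into a single row
```

Grouping by name gives the listing people tend to expect, but it would also merge two genuinely different nodes that happen to share a name.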

We're looking into grouping the listing by node name instead of node UID, because it's easy to see now how the current results become confusing.

Note that the "offline" nodes will be removed from the list when your selected time window moves past the period that those nodes had uptime. In other words, "Offline" status means that at some time during the time range you have selected, there was a node with that name online, but as of the time at the right-most edge of that range, that node was not online.
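The rule above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not Marvel's implementation: `node_status` is a hypothetical function, timestamps are plain numbers, and each node UID is assumed to have a single uptime span from `first_seen` to `last_seen`.

```python
def node_status(first_seen, last_seen, window_start, window_end):
    """Classify a node UID relative to the selected time window.

    Returns "online", "offline", or None when the node had no uptime
    inside the window and would therefore not be listed at all.
    """
    # No overlap between the node's uptime and the window: not listed.
    if last_seen < window_start or first_seen > window_end:
        return None
    # Still reporting at the right-most edge of the window.
    if last_seen >= window_end:
        return "online"
    # Had uptime inside the window, but was gone by its right-most edge.
    return "offline"


print(node_status(0, 50, 0, 100))    # offline
print(node_status(60, 100, 0, 100))  # online
print(node_status(0, 50, 60, 100))   # None
```

In the third call the window has moved past the node's uptime, so the old entry drops out of the list.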


(Marcin Kubica) #3

Many thanks for the explanation, Tim.

