After adding a second host I can no longer create or view visualizations

For about a year we operated on a single host/instance of Elastic. I've secured the Elastic Stack with the native (basic) authentication scheme, and everything was working fine.

I've now added a second host and clustered the instance in order to utilize ILM. I have everything flowing between hosts and my cluster health is green.

When I attempt to view any of my visualizations, an alert comes up that reads "1 of X shards failed". After clicking the 'Show Details' button, nothing shows up.
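
When the details pane stays empty, one way to surface the per-shard error is to run a cut-down version of the visualization's request in Kibana Dev Tools and read the `_shards.failures` array in the response. This is only a sketch: it assumes the index pattern wazuh-alerts-3.x-* and a terms aggregation on data.account_name, and the aggregation name "accounts" is arbitrary.

```
# Run the aggregation on its own, without returning any hits
GET wazuh-alerts-3.x-*/_search
{
  "size": 0,
  "aggs": {
    "accounts": {
      "terms": { "field": "data.account_name", "size": 75 }
    }
  }
}
```

Each entry in `_shards.failures` names the index, shard and node, along with a `reason` object describing why that shard could not run the query.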

Here is the error entry from the cluster logs:

[2020-02-24T10:55:38,237][DEBUG][o.e.a.s.TransportSearchAction] [HOST1] [wazuh-alerts-3.x-2020.02.24][0], node[EfI9o1nuTIurxca_lOY10A], [P], s[STARTED], a[id=To3MmSfzT1SqyMzNR-0z_w]: Failed to execute [SearchRequest{searchType=QUERY_THEN_FETCH, indices=[wazuh-alerts-3.x-2020.02.22, wazuh-alerts-3.x-2020.02.23, wazuh-alerts-3.x-2020.02.20, wazuh-alerts-3.x-2020.02.21, wazuh-alerts-3.x-2020.02.24, wazuh-alerts-3.x-2020.02.11, wazuh-alerts-3.x-2020.02.12, wazuh-alerts-3.x-2020.02.10, wazuh-alerts-3.x-2020.02.19, wazuh-alerts-3.x-2020.02.17, wazuh-alerts-3.x-2020.02.18, wazuh-alerts-3.x-2020.02.15, wazuh-alerts-3.x-2020.02.16, wazuh-alerts-3.x-2020.02.14, wazuh-alerts-3.x-2019.12.29, wazuh-alerts-3.x-2019.12.24, wazuh-alerts-3.x-2019.12.23, wazuh-alerts-3.x-2019.12.22, wazuh-alerts-3.x-2019.12.21, wazuh-alerts-3.x-2019.12.28, wazuh-alerts-3.x-2019.12.27, wazuh-alerts-3.x-2019.12.26, wazuh-alerts-3.x-2019.12.25, wazuh-alerts-3.x-2019.12.31, wazuh-alerts-3.x-2019.12.30, wazuh-alerts-3.x-2019.12.19, wazuh-alerts-3.x-2019.12.18, wazuh-alerts-3.x-2019.12.13, wazuh-alerts-3.x-2019.12.12, wazuh-alerts-3.x-2019.12.11, wazuh-alerts-3.x-2019.12.10, wazuh-alerts-3.x-2019.12.17, wazuh-alerts-3.x-2019.12.16, wazuh-alerts-3.x-2019.12.15, wazuh-alerts-3.x-2019.12.14, wazuh-alerts-3.x-2019.12.20, wazuh-alerts-3.x-2019.12.09, wazuh-alerts-3.x-2019.12.08, wazuh-alerts-3.x-2019.12.07, wazuh-alerts-3.x-2020.01.09, wazuh-alerts-3.x-2019.12.02, wazuh-alerts-3.x-2019.12.01, wazuh-alerts-3.x-2019.12.06, wazuh-alerts-3.x-2019.12.05, wazuh-alerts-3.x-2019.12.04, wazuh-alerts-3.x-2019.12.03, wazuh-alerts-3.x-2020.01.10, wazuh-alerts-3.x-2020.01.11, wazuh-alerts-3.x-2020.01.18, wazuh-alerts-3.x-2020.01.19, wazuh-alerts-3.x-2020.01.16, wazuh-alerts-3.x-2020.01.17, wazuh-alerts-3.x-2020.01.14, wazuh-alerts-3.x-2020.01.15, wazuh-alerts-3.x-2020.01.12, wazuh-alerts-3.x-2020.01.13, wazuh-alerts-3.x-2019.11.29, wazuh-alerts-3.x-2019.11.28, wazuh-alerts-3.x-2019.11.23, wazuh-alerts-3.x-2019.11.22, 
wazuh-alerts-3.x-2019.11.21, wazuh-alerts-3.x-2019.11.20, wazuh-alerts-3.x-2019.11.27, wazuh-alerts-3.x-2019.11.26, wazuh-alerts-3.x-2019.11.25, wazuh-alerts-3.x-2019.11.24, wazuh-alerts-3.x-2019.11.30, wazuh-alerts-3.x-2020.01.07, wazuh-alerts-3.x-2020.01.08, wazuh-alerts-3.x-2020.01.05, wazuh-alerts-3.x-2020.01.06, wazuh-alerts-3.x-2020.01.03, wazuh-alerts-3.x-2020.01.04, wazuh-alerts-3.x-2020.01.01, wazuh-alerts-3.x-2020.01.02, wazuh-alerts-3.x-2019.11.19, wazuh-alerts-3.x-2019.11.18, wazuh-alerts-3.x-2019.11.17, wazuh-alerts-3.x-2019.11.12, wazuh-alerts-3.x-2019.11.11, wazuh-alerts-3.x-2019.11.10, wazuh-alerts-3.x-2019.11.16, wazuh-alerts-3.x-2019.11.15, wazuh-alerts-3.x-2019.11.14, wazuh-alerts-3.x-2019.11.13, wazuh-alerts-3.x-2020.02.01, wazuh-alerts-3.x-2020.01.30, wazuh-alerts-3.x-2020.01.31, wazuh-alerts-3.x-2020.02.08, wazuh-alerts-3.x-2020.02.09, wazuh-alerts-3.x-2020.02.06, wazuh-alerts-3.x-2020.02.07, wazuh-alerts-3.x-2020.02.04, wazuh-alerts-3.x-2020.02.02, wazuh-alerts-3.x-2020.02.03, wazuh-alerts-3.x-2019.11.09, wazuh-alerts-3.x-2019.11.08, wazuh-alerts-3.x-2019.11.07, wazuh-alerts-3.x-2019.11.06, wazuh-alerts-3.x-2019.11.01, wazuh-alerts-3.x-2019.10.30, wazuh-alerts-3.x-2019.10.31, wazuh-alerts-3.x-2019.11.05, wazuh-alerts-3.x-2019.11.04, wazuh-alerts-3.x-2019.11.03, wazuh-alerts-3.x-2019.11.02, wazuh-alerts-3.x-2020.01.21, wazuh-alerts-3.x-2020.01.22, wazuh-alerts-3.x-2020.01.20, wazuh-alerts-3.x-2020.01.29, wazuh-alerts-3.x-2020.01.27, wazuh-alerts-3.x-2020.01.28, wazuh-alerts-3.x-2020.01.25, wazuh-alerts-3.x-2020.01.26, wazuh-alerts-3.x-2020.01.23, wazuh-alerts-3.x-2020.01.24, wazuh-alerts-3.x-2019.10.29, wazuh-alerts-3.x-2019.10.27, wazuh-alerts-3.x-2019.10.28, wazuh-alerts-3.x-2019.10.21, wazuh-alerts-3.x-2019.10.22, wazuh-alerts-3.x-2019.10.20, wazuh-alerts-3.x-2019.10.25, wazuh-alerts-3.x-2019.10.26, wazuh-alerts-3.x-2019.10.23, wazuh-alerts-3.x-2019.10.24, wazuh-alerts-3.x-2019.09.29, wazuh-alerts-3.x-2019.10.18, 
wazuh-alerts-3.x-2019.10.19, wazuh-alerts-3.x-2019.10.16, wazuh-alerts-3.x-2019.10.17, wazuh-alerts-3.x-2019.10.10, wazuh-alerts-3.x-2019.10.11, wazuh-alerts-3.x-2019.10.14, wazuh-alerts-3.x-2019.10.15, wazuh-alerts-3.x-2019.10.12, wazuh-alerts-3.x-2019.10.13, wazuh-alerts-3.x-2019.09.30, wazuh-alerts-3.x-2019.09.18, wazuh-alerts-3.x-2019.10.07, wazuh-alerts-3.x-2019.09.19, wazuh-alerts-3.x-2019.10.08, wazuh-alerts-3.x-2019.10.05, wazuh-alerts-3.x-2019.10.06, wazuh-alerts-3.x-2019.10.09, wazuh-alerts-3.x-2019.10.03, wazuh-alerts-3.x-2019.10.04, wazuh-alerts-3.x-2019.10.01, wazuh-alerts-3.x-2019.10.02, wazuh-alerts-3.x-2019.09.20, wazuh-alerts-3.x-2019.09.21, wazuh-alerts-3.x-2019.09.22, wazuh-alerts-3.x-2019.09.23, wazuh-alerts-3.x-2019.09.24, wazuh-alerts-3.x-2019.09.25, wazuh-alerts-3.x-2019.09.26, wazuh-alerts-3.x-2019.09.27, wazuh-alerts-3.x-2019.09.28, wazuh-alerts-3.x-2019.09.07, wazuh-alerts-3.x-2019.09.08, wazuh-alerts-3.x-2019.09.09, wazuh-alerts-3.x-2019.09.10, wazuh-alerts-3.x-2019.09.11, wazuh-alerts-3.x-2019.09.12, wazuh-alerts-3.x-2019.09.13, wazuh-alerts-3.x-2019.09.14, wazuh-alerts-3.x-2019.09.15, wazuh-alerts-3.x-2019.09.16, wazuh-alerts-3.x-2019.09.17, wazuh-alerts-3.x-2019.08.28, wazuh-alerts-3.x-2019.08.29, wazuh-alerts-3.x-2019.08.30, wazuh-alerts-3.x-2019.08.31, wazuh-alerts-3.x-2019.09.01, wazuh-alerts-3.x-2019.09.02, wazuh-alerts-3.x-2019.09.03, wazuh-alerts-3.x-2019.09.04, wazuh-alerts-3.x-2019.09.05, wazuh-alerts-3.x-2019.09.06, wazuh-alerts-3.x-2019.08.17, wazuh-alerts-3.x-2019.08.18, wazuh-alerts-3.x-2019.08.19, wazuh-alerts-3.x-2019.08.20, wazuh-alerts-3.x-2019.08.21, wazuh-alerts-3.x-2019.08.22, wazuh-alerts-3.x-2019.08.23, wazuh-alerts-3.x-2019.08.24, wazuh-alerts-3.x-2019.08.25, wazuh-alerts-3.x-2019.08.26, wazuh-alerts-3.x-2019.08.27, wazuh-alerts-3.x-2019.08.06, wazuh-alerts-3.x-2019.08.07, wazuh-alerts-3.x-2019.08.10, wazuh-alerts-3.x-2019.08.11, wazuh-alerts-3.x-2019.08.12, wazuh-alerts-3.x-2019.08.13, 
wazuh-alerts-3.x-2019.08.14, wazuh-alerts-3.x-2019.08.15, wazuh-alerts-3.x-2019.08.16, wazuh-alerts-3.x-2019.07.31, wazuh-alerts-3.x-2019.08.01, wazuh-alerts-3.x-2019.08.02, wazuh-alerts-3.x-2019.08.03, wazuh-alerts-3.x-2019.08.04, wazuh-alerts-3.x-2019.08.05], indicesOptions=IndicesOptions[ignore_unavailable=true, allow_no_indices=true, expand_wildcards_open=true, expand_wildcards_closed=false, allow_aliases_to_multiple_indices=true, forbid_closed_indices=true, ignore_aliases=false, ignore_throttled=true], types=[], routing='null', preference='1582562395082', requestCache=null, scroll=null, maxConcurrentShardRequests=0, batchedReduceSize=512, preFilterShardSize=128, allowPartialSearchResults=true, localClusterAlias=null, getOrCreateAbsoluteStartMillis=-1, ccsMinimizeRoundtrips=true, source={"size":0,"timeout":"30000ms","query":{"bool":{"must":[{"query_string":{"query":"data.id:4625 and NOT data.account_name:*$* and NOT data.account_name:\"ADMINISTRATOR\" and NOT data.account_name:\"Administrator\"","fields":[],"type":"best_fields","default_operator":"or","max_determinized_states":10000,"enable_position_increments":true,"fuzziness":"AUTO","fuzzy_prefix_length":0,"fuzzy_max_expansions":50,"phrase_slop":0,"analyze_wildcard":true,"time_zone":"America/Denver","escape":false,"auto_generate_synonyms_phrase_query":true,"fuzzy_transpositions":true,"boost":1.0}}],"filter":[{"match_all":{"boost":1.0}},{"range":{"timestamp":{"from":"2020-02-24T07:00:00.000Z","to":"2020-02-25T06:59:59.999Z","include_lower":true,"include_upper":true,"format":"strict_date_optional_time","boost":1.0}}}],"adjust_pure_negative":true,"boost":1.0}},"_source":{"includes":[],"excludes":["@timestamp"]},"stored_fields":"*","docvalue_fields":[{"field":"data.CreationTime","format":"date_time"},{"field":"data.ExchangeDetails.MessageTime","format":"date_time"},{"field":"data.ExchangeMetaData.Sent","format":"date_time"},{"field":"data.FilteringDate","format":"date_time"},{"field":"data.FirstSeen","format":"dat
e_time"},{"field":"data.LastSeen","format":"date_time"},{"field":"data.MessageDate","format":"date_time"},{"field":"data.StartTime","format":"date_time"},{"field":"data.Timestamp","format":"date_time"},{"field":"data.activityDate","format":"date_time"},{"field":"data.aws.end","format":"date_time"},{"field":"data.aws.start","format":"date_time"},{"field":"event.created","format":"date_time"},{"field":"file.ctime","format":"date_time"},{"field":"file.mtime","format":"date_time"},{"field":"process.start","format":"date_time"},{"field":"syscheck.mtime_after","format":"date_time"},{"field":"syscheck.mtime_before","format":"date_time"},{"field":"system.audit.host.boottime","format":"date_time"},{"field":"timestamp","format":"date_time"},{"field":"winlog.event_data.DeviceTime","format":"date_time"},{"field":"winlog.event_data.NewTime","format":"date_time"},{"field":"winlog.event_data.OldTime","format":"date_time"},{"field":"winlog.event_data.PreviousTime","format":"date_time"},{"field":"winlog.event_data.StartTime","format":"date_time"},{"field":"winlog.event_data.StopTime","format":"date_time"},{"field":"winlog.user_data.UTCStartTime","format":"date_time"}],"script_fields":{"hourofday":{"script":{"source":"doc['@timestamp'].value.getHour() -7","lang":"painless"},"ignore_failure":false}},"track_total_hits":2147483647,"aggregations":{"2":{"terms":{"field":"data.account_name","size":75,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]},"aggregations":{"3":{"terms":{"field":"data.account_domain","size":50,"min_doc_count":1,"shard_min_doc_count":0,"show_term_doc_count_error":false,"order":[{"_count":"desc"},{"_key":"asc"}]}}}}}}}]
org.elasticsearch.transport.RemoteTransportException: [HOST2][x.x.x.x:9300][indices:data/read/search[phase/query]]

Any ideas on what I need to do to resolve this?

What is the full output of the ‘_cluster/stats’ API?

{
  "_nodes" : {
    "total" : 2,
    "successful" : 2,
    "failed" : 0
  },
  "cluster_name" : "Cluster",
  "cluster_uuid" : "XXXXXXXXXXXXXXXXXXXXXX",
  "timestamp" : 1582567725850,
  "status" : "green",
  "indices" : {
    "count" : 395,
    "shards" : {
      "total" : 1594,
      "primaries" : 834,
      "replication" : 0.9112709832134293,
      "index" : {
        "shards" : {
          "min" : 2,
          "max" : 6,
          "avg" : 4.035443037974684
        },
        "primaries" : {
          "min" : 1,
          "max" : 3,
          "avg" : 2.1113924050632913
        },
        "replication" : {
          "min" : 0.0,
          "max" : 1.0,
          "avg" : 0.9063291139240506
        }
      }
    },
    "docs" : {
      "count" : 3880120152,
      "deleted" : 6735348
    },
    "store" : {
      "size_in_bytes" : 1803954856210
    },
    "fielddata" : {
      "memory_size_in_bytes" : 37952,
      "evictions" : 0
    },
    "query_cache" : {
      "memory_size_in_bytes" : 44783,
      "total_count" : 2547901,
      "hit_count" : 1264786,
      "miss_count" : 1283115,
      "cache_size" : 52,
      "cache_count" : 52,
      "evictions" : 0
    },
    "completion" : {
      "size_in_bytes" : 0
    },
    "segments" : {
      "count" : 25789,
      "memory_in_bytes" : 3304781648,
      "terms_memory_in_bytes" : 2380198704,
      "stored_fields_memory_in_bytes" : 846014128,
      "term_vectors_memory_in_bytes" : 0,
      "norms_memory_in_bytes" : 15621696,
      "points_memory_in_bytes" : 0,
      "doc_values_memory_in_bytes" : 62947120,
      "index_writer_memory_in_bytes" : 824387374,
      "version_map_memory_in_bytes" : 1654803,
      "fixed_bit_set_memory_in_bytes" : 10547792,
      "max_unsafe_auto_id_timestamp" : 1582564863810,
      "file_sizes" : { }
    }
  },
  "nodes" : {
    "count" : {
      "total" : 2,
      "coordinating_only" : 0,
      "data" : 2,
      "ingest" : 2,
      "master" : 2,
      "ml" : 2,
      "voting_only" : 0
    },
    "versions" : [
      "7.6.0"
    ],
    "os" : {
      "available_processors" : 16,
      "allocated_processors" : 16,
      "names" : [
        {
          "name" : "Linux",
          "count" : 2
        }
      ],
      "pretty_names" : [
        {
          "pretty_name" : "Ubuntu 18.04.4 LTS",
          "count" : 2
        }
      ],
      "mem" : {
        "total_in_bytes" : 50359697408,
        "free_in_bytes" : 3467444224,
        "used_in_bytes" : 46892253184,
        "free_percent" : 7,
        "used_percent" : 93
      }
    },
    "process" : {
      "cpu" : {
        "percent" : 62
      },
      "open_file_descriptors" : {
        "min" : 12454,
        "max" : 32718,
        "avg" : 22586
      }
    },
    "jvm" : {
      "max_uptime_in_millis" : 5533899,
      "versions" : [
        {
          "version" : "13.0.2",
          "vm_name" : "OpenJDK 64-Bit Server VM",
          "vm_version" : "13.0.2+8",
          "vm_vendor" : "AdoptOpenJDK",
          "bundled_jdk" : true,
          "using_bundled_jdk" : true,
          "count" : 2
        }
      ],
      "mem" : {
        "heap_used_in_bytes" : 13771398520,
        "heap_max_in_bytes" : 25647710208
      },
      "threads" : 317
    },
    "fs" : {
      "total_in_bytes" : 19641735970816,
      "free_in_bytes" : 17473815744512,
      "available_in_bytes" : 16483782324224
    },
    "plugins" : [ ],
    "network_types" : {
      "transport_types" : {
        "security4" : 2
      },
      "http_types" : {
        "security4" : 2
      }
    },
    "discovery_types" : {
      "zen" : 2
    },
    "packaging_types" : [
      {
        "flavor" : "default",
        "type" : "deb",
        "count" : 2
      }
    ],
    "ingest" : {
      "number_of_pipelines" : 4,
      "processor_stats" : {
        "geoip" : {
          "count" : 244275,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 932
        },
        "gsub" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        },
        "script" : {
          "count" : 0,
          "failed" : 0,
          "current" : 0,
          "time_in_millis" : 0
        }
      }
    }
  }
}

Do both nodes have the same amount of disk space? Is it possible one of the nodes is filling up and this has caused your indices to become read-only?
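
Both points can be checked from Kibana Dev Tools. A minimal sketch, assuming the affected indices match wazuh-alerts-3.x-*:

```
# Disk used and available per node
GET _cat/allocation?v

# Any index-level blocks (for example read_only_allow_delete) set on the indices
GET wazuh-alerts-3.x-*/_settings/index.blocks.*
```

If the second request returns empty settings for every index, no read-only block is in place.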

Each node has about 8TB remaining.

CPU and RAM utilization also seem to be normal.

I have also configured the second node to be the location of the WARM phase and the initial node to be the location of the HOT phase.
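
For readers unfamiliar with this kind of setup, a hot/warm split in 7.x is commonly built from a custom node attribute plus an ILM allocate action. The sketch below is an assumption about how such a setup might look, not the configuration actually used here: the attribute name "data", the policy name and the 7d threshold are all hypothetical.

```
# Assumed in elasticsearch.yml on the second node:  node.attr.data: warm
# Assumed in elasticsearch.yml on the first node:   node.attr.data: hot
PUT _ilm/policy/hypothetical-hot-warm
{
  "policy": {
    "phases": {
      "hot": {
        "actions": {
          "set_priority": { "priority": 100 }
        }
      },
      "warm": {
        "min_age": "7d",
        "actions": {
          "allocate": { "require": { "data": "warm" } }
        }
      }
    }
  }
}
```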

Logs are still flowing in, and I can search at will using "Discover" in Kibana as well as the API.

Alright, I was able to fix this. The problem was that the type of the field data.account_name changed from keyword to text at some point, which broke the search: the visualization runs a terms aggregation on that field, and text fields can't be aggregated on unless fielddata is enabled. I'm guessing the maintainers of the template broke that.

I've resolved this by enabling fielddata on that field in the indices where it was mapped as text. I've also prevented future issues by changing the template myself so that the field is mapped as keyword.
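
A mapping conflict like this can be confirmed with the field capabilities API, which reports every type a field has across a set of indices. A minimal sketch, assuming the indices match wazuh-alerts-3.x-*:

```
# Lists each type seen for the field; when types conflict,
# every type entry carries an "indices" list showing where it applies
GET wazuh-alerts-3.x-*/_field_caps?fields=data.account_name

# Alternative: show the field's mapping index by index
GET wazuh-alerts-3.x-*/_mapping/field/data.account_name
```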

At some point I'll have to bite the bullet and reindex everything, but I'm hoping to kick that can down the road a bit further.
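
When that time comes, a per-index reindex might look like the sketch below. The destination index name is hypothetical; it is chosen so that it still matches the template's pattern and therefore picks up the corrected keyword mapping when it is created.

```
# Copy one daily index into a new index created from the fixed template
POST _reindex
{
  "source": { "index": "wazuh-alerts-3.x-2020.02.24" },
  "dest": { "index": "wazuh-alerts-3.x-2020.02.24-reindexed" }
}
```

Once the document counts of the two indices match, the old index can be deleted.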

Examples of the API calls I made to resolve this issue:

PUT _template/[template]/
{
  "index_patterns": [
    "[pattern-*]"
  ],
  "mappings": {
    "properties": {
      "data.account_name": {
        "type": "keyword"
      }
    }
  }
}

PUT [pattern-*]/_mapping
{
  "properties": {
    "data.account_name": {
      "type": "text",
      "fielddata": true
    }
  }
}
