Failed to derive xcontent - status 400 & path.log is empty errors!

Fun one for you all:

So I'm trying to take a snapshot of an Elasticsearch 1.7 environment (yes... 1.7) using the following command:

curl -XPUT 'http://localhost:9200/_snapshot/elasticrepo'

But I keep getting this error:

{"error":"ElasticsearchParseException[Failed to derive xcontent]","status":400}

Upon further investigation I noticed that elasticsearch.yml did not contain a path.repo setting, so I created a directory as the Elasticsearch user (and gave it the good old chmod 777 to make sure permissions were fine) and added it like so:

path.repo: /usr/share/elasticsearch/elasticsearch-backup/
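
For the record, the directory setup was roughly this (the user/group name is whatever the package created, elasticsearch on my box):

    sudo mkdir -p /usr/share/elasticsearch/elasticsearch-backup
    sudo chown -R elasticsearch:elasticsearch /usr/share/elasticsearch/elasticsearch-backup
    sudo chmod 777 /usr/share/elasticsearch/elasticsearch-backup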

I restarted the node and retried the command, but still got the same error. Digging further, I tried this:

 curl -XPUT 'http://localhost:9200/_snapshot/elasticrepo' -d '{
    "type": "fs",
    "settings": {
        "location": "/user/share/elasticsearch/elasticsearch-backup",
        "compress": false
   }
}'

I now get this error:

location [/usr/share/elasticsearch/elasticsearch-backup] doesn't match any of the locations specified by path.repo]; ","status":500}

...except it should!

I've tried variations of the path.repo value:

[/usr/share/elasticsearch/elasticsearch-backup]
["/usr/share/elasticsearch/elasticsearch-backup"]
[ "/usr/share/elasticsearch/elasticsearch-backup" ]

I've also tried the above commands as root AND as the elasticsearch user, with the same result. I'm at a loss as to how to simply create a repo and take a snapshot!
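
As far as I understand the YAML, both a plain string and a bracketed list should be accepted, so I'd expect either of these forms to work:

    path.repo: /usr/share/elasticsearch/elasticsearch-backup
    path.repo: ["/usr/share/elasticsearch/elasticsearch-backup"]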

Anyone have any further ideas on what's wrong and how I can resolve it?

Any help would be much appreciated!

Regards

Did you set it on every node?
What does the elasticsearch.yml file look like? Could you share it (and format it using backticks or the </> button)?
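
If it helps, you can also check what every node actually loaded with the node info API and look for path.repo under each node's settings block, something like:

    curl 'http://localhost:9200/_nodes?pretty'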

Thanks for the prompt reply! I haven't set it on all nodes, maybe? I didn't even realise there were other nodes to set it on, given the environment!

If I do "_cat/nodes" I get:

DE-CI-ES "IPADDRESS" 14 82 0.00 d * DE-CI-ES
DE-CI-ES "IPADDRESS" 9 82 0.00 d m DE-CI-ES
DE-CI-ES "IPADDRESS" 7 82 0.00 d m DE-CI-ES

(NOTE: I have removed the IP addresses, but they all displayed the same IP)

Does that mean I have 3 nodes? If so, how would I find their .yml files, or can I set the snapshot repo with a curl command?
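
For what it's worth, I think re-running the cat API with the verbose flag gives column headers, which would make the role/master columns clearer (assuming ?v works the same on 1.x):

    curl 'http://localhost:9200/_cat/nodes?v'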

I'll post the yml in a separate reply due to the character limit.

Thanks


################################### Cluster ###################################

# Cluster name identifies your cluster for auto-discovery. If you're running
# multiple clusters on the same network, make sure you're using unique names.
#
cluster.name: de-ci 


#################################### Node #####################################

# Node names are generated dynamically on startup, so you're relieved
# from configuring them manually. You can tie this node to a specific name:
#
# node.name: "Franz Kafka"
node.name: "DE-CI-ES"

# Every node can be configured to allow or deny being eligible as the master,
# and to allow or deny to store the data.
#
# Allow this node to be eligible as a master node (enabled by default):
#
# node.master: true
#
# Allow this node to store data (enabled by default):
#
# node.data: true

# You can exploit these settings to design advanced cluster topologies.
#
# 1. You want this node to never become a master node, only to hold data.
#    This will be the "workhorse" of your cluster.
#
# node.master: false
# node.data: true
#
# 2. You want this node to only serve as a master: to not store any data and
#    to have free resources. This will be the "coordinator" of your cluster.
#
# node.master: true
# node.data: false
#
# 3. You want this node to be neither master nor data node, but
#    to act as a "search load balancer" (fetching data from nodes,
#    aggregating results, etc.)
#
# node.master: false
# node.data: false

# Use the Cluster Health API [http://localhost:9200/_cluster/health], the
# Node Info API [http://localhost:9200/_nodes] or GUI tools
# such as <http://www.elasticsearch.org/overview/marvel/>,
# <http://github.com/karmi/elasticsearch-paramedic>,
# <http://github.com/lukas-vlcek/bigdesk> and
# <http://mobz.github.com/elasticsearch-head> to inspect the cluster state.

# A node can have generic attributes associated with it, which can later be used
# for customized shard allocation filtering, or allocation awareness. An attribute
# is a simple key value pair, similar to node.key: value, here is an example:
#
# node.rack: rack314

# By default, multiple nodes are allowed to start from the same installation location
# to disable it, set the following:
# node.max_local_storage_nodes: 1


#################################### Index ####################################

# You can set a number of options (such as shard/replica options, mapping
# or analyzer definitions, translog settings, ...) for indices globally,
# in this file.
#
# Note, that it makes more sense to configure index settings specifically for
# a certain index, either when creating it or by using the index templates API.
#
# See <http://elasticsearch.org/guide/en/elasticsearch/reference/current/index-modules.html> and
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/indices-create-index.html>
# for more information.

# Set the number of shards (splits) of an index (5 by default):
#
# index.number_of_shards: 5

# Set the number of replicas (additional copies) of an index (1 by default):
#
# index.number_of_replicas: 1

# Note, that for development on a local machine, with small indices, it usually
# makes sense to "disable" the distributed features:
#
# index.number_of_shards: 1
# index.number_of_replicas: 0

# These settings directly affect the performance of index and search operations
# in your cluster. Assuming you have enough machines to hold shards and
# replicas, the rule of thumb is:
#
# 1. Having more *shards* enhances the _indexing_ performance and allows to
#    _distribute_ a big index across machines.
# 2. Having more *replicas* enhances the _search_ performance and improves the
#    cluster _availability_.
#
# The "number_of_shards" is a one-time setting for an index.
#
# The "number_of_replicas" can be increased or decreased anytime,
# by using the Index Update Settings API.
#
# Elasticsearch takes care about load balancing, relocating, gathering the
# results from nodes, etc. Experiment with different settings to fine-tune
# your setup.

# Use the Index Status API (<http://localhost:9200/A/_status>) to inspect
# the index status.


#################################### Paths ####################################

# Path to directory containing configuration (this file and logging.yml):
#
path.conf: /etc/elasticsearch

# Path to directory where to store index data allocated for this node.
#
path.data: /opt/de/elasticsearch/data
#
# Can optionally include more than one location, causing data to be striped across
# the locations (a la RAID 0) on a file level, favouring locations with most free
# space on creation. For example:
#
# path.data: /path/to/data1,/path/to/data2

# Path to temporary files:
#
path.work: /opt/de/elasticsearch/tmp

# Path to log files:
#
# path.logs: /path/to/logs
path.logs: /var/log/deps/elasticsearch

# Path to where plugins are installed:
#
path.plugins: /usr/share/elasticsearch/plugins

#Path to snapshot repository
#
path.repo: ["/user/share/elasticsearch/elasticsearch-backup"]

#################################### Plugin ###################################

# If a plugin listed here is not installed for current node, the node will not start.
#
# plugin.mandatory: mapper-attachments,lang-groovy


################################### Memory ####################################

# Elasticsearch performs poorly when JVM starts swapping: you should ensure that
# it _never_ swaps.
#
# Set this property to true to lock the memory:
#
# bootstrap.mlockall: true

# Make sure that the ES_MIN_MEM and ES_MAX_MEM environment variables are set
# to the same value, and that the machine has enough memory to allocate
# for Elasticsearch, leaving enough memory for the operating system itself.
#
# You should also make sure that the Elasticsearch process is allowed to lock
# the memory, eg. by using `ulimit -l unlimited`.


############################## Network And HTTP ###############################

# Elasticsearch, by default, binds itself to the 0.0.0.0 address, and listens
# on port [9200-9300] for HTTP traffic and on port [9300-9400] for node-to-node
# communication. (the range means that if the port is busy, it will automatically
# try the next port).

# Set the bind address specifically (IPv4 or IPv6):
#
# network.bind_host: 192.168.0.1

# Set the address other nodes will use to communicate with this node. If not
# set, it is automatically derived. It must point to an actual IP address.
#
# network.publish_host: 192.168.0.1

# Set both 'bind_host' and 'publish_host':
#
# network.host: 192.168.0.1

# Set a custom port for the node to node communication (9300 by default):
#
# transport.tcp.port: 9300

# Enable compression for all communication between nodes (disabled by default):
#
# transport.tcp.compress: true

# Set a custom port to listen for HTTP traffic:
#
# http.port: 9200

# Set a custom allowed content length:
#
# http.max_content_length: 100mb

# Disable HTTP completely:
#
# http.enabled: false
################################### Gateway ###################################

# The gateway allows for persisting the cluster state between full cluster
# restarts. Every change to the state (such as adding an index) will be stored
# in the gateway, and when the cluster starts up for the first time,
# it will read its state from the gateway.

# There are several types of gateway implementations. For more information, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-gateway.html>.

# The default gateway type is the "local" gateway (recommended):
#
# gateway.type: local

# Settings below control how and when to start the initial recovery process on
# a full cluster restart (to reuse as much local data as possible when using shared
# gateway).

# Allow recovery process after N nodes in a cluster are up:
#
# gateway.recover_after_nodes: 1

# Set the timeout to initiate the recovery process, once the N nodes
# from previous setting are up (accepts time value):
#
# gateway.recover_after_time: 5m

# Set how many nodes are expected in this cluster. Once these N nodes
# are up (and recover_after_nodes is met), begin recovery process immediately
# (without waiting for recover_after_time to expire):
#
# gateway.expected_nodes: 2


############################# Recovery Throttling #############################

# These settings allow to control the process of shards allocation between
# nodes during initial recovery, replica allocation, rebalancing,
# or when adding and removing nodes.

# Set the number of concurrent recoveries happening on a node:
#
# 1. During the initial recovery
#
# cluster.routing.allocation.node_initial_primaries_recoveries: 4
#
# 2. During adding/removing nodes, rebalancing, etc
#
# cluster.routing.allocation.node_concurrent_recoveries: 2

# Set to throttle throughput when recovering (eg. 100mb, by default 20mb):
#
# indices.recovery.max_bytes_per_sec: 20mb

# Set to limit the number of open concurrent streams when
# recovering a shard from a peer:
#
# indices.recovery.concurrent_streams: 5


################################## Discovery ##################################

# Discovery infrastructure ensures nodes can be found within a cluster
# and master node is elected. Multicast discovery is the default.

# Set to ensure a node sees N other master eligible nodes to be considered
# operational within the cluster. Its recommended to set it to a higher value
# than 1 when running more than 2 nodes in the cluster.
#
# discovery.zen.minimum_master_nodes: 1

# Set the time to wait for ping responses from other nodes when discovering.
# Set this option to a higher value on a slow or congested network
# to minimize discovery failures:
#
# discovery.zen.ping.timeout: 3s

# For more information, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-zen.html>

# Unicast discovery allows to explicitly control which nodes will be used
# to discover the cluster. It can be used when multicast is not present,
# or to restrict the cluster communication-wise.
#
# 1. Disable multicast discovery (enabled by default):
#
# discovery.zen.ping.multicast.enabled: false
#
# 2. Configure an initial list of master nodes in the cluster
#    to perform discovery when new nodes (master or data) are started:
#
# discovery.zen.ping.unicast.hosts: ["host1", "host2:port"]

# EC2 discovery allows to use AWS EC2 API in order to perform discovery.
#
# You have to install the cloud-aws plugin for enabling the EC2 discovery.
#
# For more information, see
# <http://elasticsearch.org/guide/en/elasticsearch/reference/current/modules-discovery-ec2.html>
#
# See <http://elasticsearch.org/tutorials/elasticsearch-on-ec2/>
# for a step-by-step tutorial.
# GCE discovery allows to use Google Compute Engine API in order to perform discovery.
#
# You have to install the cloud-gce plugin for enabling the GCE discovery.
#
# For more information, see <https://github.com/elasticsearch/elasticsearch-cloud-gce>.

# Azure discovery allows to use Azure API in order to perform discovery.
#
# You have to install the cloud-azure plugin for enabling the Azure discovery.
#
# For more information, see <https://github.com/elasticsearch/elasticsearch-cloud-azure>.

################################## Slow Log ##################################

# Shard level query and fetch threshold logging.

#index.search.slowlog.threshold.query.warn: 10s
#index.search.slowlog.threshold.query.info: 5s
#index.search.slowlog.threshold.query.debug: 2s
#index.search.slowlog.threshold.query.trace: 500ms

#index.search.slowlog.threshold.fetch.warn: 1s
#index.search.slowlog.threshold.fetch.info: 800ms
#index.search.slowlog.threshold.fetch.debug: 500ms
#index.search.slowlog.threshold.fetch.trace: 200ms

#index.indexing.slowlog.threshold.index.warn: 10s
#index.indexing.slowlog.threshold.index.info: 5s
#index.indexing.slowlog.threshold.index.debug: 2s
#index.indexing.slowlog.threshold.index.trace: 500ms

################################## GC Logging ################################

#monitor.jvm.gc.young.warn: 1000ms
#monitor.jvm.gc.young.info: 700ms
#monitor.jvm.gc.young.debug: 400ms

#monitor.jvm.gc.old.warn: 10s
#monitor.jvm.gc.old.info: 5s
#monitor.jvm.gc.old.debug: 2s

# DE config
index.refresh_interval: 250
# To fix error 'Scripts of type [inline], operation [search] and lang [groovy] are disabled' when searching
script.engine.groovy.inline.search: on
script.engine.groovy.inline.update: on
# enable cors
http.cors.enabled: true
#http.cors.allow-origin: "/.*/"
http.cors.allow-origin: "*"

Yes.
It should not be the case in production, as each node should be on its own machine.

You should probably stop all the nodes and restart them.
Or stop them all and restart just one node.

No, this is a dev machine (but I will eventually have to do this on production, which... is also on just one machine. Unfortunately I did not build it like this).

So, I use the "service elasticsearch stop/start" wrapper. I'm assuming that because these 3 nodes are all on the one machine, they all get restarted? If not, and I figure out how to stop all 3 and restart one (I'm guessing there's a way to find the node ID and target it directly with a stop command?), I'm going to need to restart the other two at some point.
How do I make sure this repo path replicates to the other nodes (provided that just having one node up initially resolves my repo issue)?

I'll have to look into moving all the nodes onto separate machines; I would probably suggest they rebuild on a newer version anyway. But I'm not sure if that would even fix the issue I'm encountering!

Sorry for all the questions! I've not really dabbled in Elasticsearch before, and all the documentation I've read and tried to implement has led me to this point!

Thanks for the help so far!

If you don't really need the other nodes, just stop them.

If it's a dev machine, why are you still using version 1.7?
I would not recommend putting this in production at all.

In old versions like this one, there is no check to stop you launching another instance on the same machine by mistake. That's probably what happened in your case.

As it's just a dev platform, stop all the nodes. The _shutdown API may still exist in this version, so call it on each node.

Then restart the service. I believe that only one node will start.
If not, edit your service scripts.
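
From memory, the 1.x shutdown endpoints look like this (they were removed in 2.0, so double check on your version):

    # shut down every node in the cluster
    curl -XPOST 'http://localhost:9200/_shutdown'
    # or shut down only the node you are talking to
    curl -XPOST 'http://localhost:9200/_cluster/nodes/_local/_shutdown'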

It's in use on an old system that has not been (and is unlikely to be) given the time to be upgraded, and I suspect it'll stay that way for a while longer.

Thanks for suggesting the shutdown API!

To be honest I'm not sure if I needed the other two nodes or not, as I never set these environments up, but using the shutdown API seems to have killed off the other two: when I started Elasticsearch back up I went from 3 nodes to one. I was able to register a repository, but now snapshots are failing. When I query my test snapshot it shows this:

"state" : "FAILED",
    "reason" : "Indices don't have primary shards 

Cluster health shows this:

{
  "cluster_name" : "deps-ci",
  "status" : "red",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "active_primary_shards" : 103,
  "active_shards" : 103,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 207,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0
}

I'm assuming it's because of the unassigned shards. Does that mean I need to reindex them all, or get rid of the unassigned ones? Does this also mean that those shards were originally assigned to the other two (erroneous) nodes?
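
I was going to list which shards are unassigned with something like this (assuming the cat shards API behaves the same on 1.x):

    # list every shard and keep only the unassigned ones
    curl 'http://localhost:9200/_cat/shards?v' | grep UNASSIGNED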

Cheers

If the other nodes all went down at once, Elasticsearch had no time to replicate their data to the remaining node.

The easiest thing to do IMO is to get rid of the data dir and restart from scratch (reindexing the data), if possible.
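
If you go that way, on your box it would be something along these lines (double check path.data in your yml before deleting anything):

    # stop the node, wipe the data directory from path.data, then start again
    sudo service elasticsearch stop
    sudo rm -rf /opt/de/elasticsearch/data/*
    sudo service elasticsearch start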

OK, I can see how there were 3 nodes now. Someone had modified the startup wrapper to basically start each instance as a new data node, but all of them point to the single yml file! So the 3 nodes on one system was intentional!

I've got a successful snapshot and the cluster has gone back up to green as well. Thank you for all your help!

