No_shard_available_action_exception in integration tests

rdehuyss · February 2, 2021, 8:28am

Hi,

I'm the author of JobRunr, a distributed job scheduling library that also supports Elasticsearch as a storage provider.

For my integration tests, I use Testcontainers to start an Elasticsearch container:

@Container
private static final ElasticsearchContainer elasticSearchContainer = new ElasticsearchContainer("docker.elastic.co/elasticsearch/elasticsearch:7.10.1")
        .withNetwork(network)
        .withNetworkAliases("elasticsearch")
        .withExposedPorts(9200);

There is one case where I always get a no_shard_available_action_exception even though the cluster health is yellow.

My code is as follows:

@Override
protected boolean isNewMigration(NoSqlMigration noSqlMigration) {
    try {
        System.out.println("Testing for new migration...");
        waitForHealthyCluster(client);
        GetResponse migration = client.get(new GetRequest(JOBRUNR_MIGRATIONS_INDEX_NAME, substringBefore(noSqlMigration.getClassName(), "_")), RequestOptions.DEFAULT);
        return !migration.isExists();
    } catch (IOException e) {
        throw new StorageException(e);
    }
}

And the logs:

========================================================================
Cluster health:YELLOW
========================================================================
Testing for new migration...
========================================================================
Cluster health:YELLOW
========================================================================
ERROR StatusLogger Log4j2 could not find a logging implementation. Please add log4j-core to the classpath. Using SimpleLogger to log to the console...
Exception in thread "main" ElasticsearchStatusException[Elasticsearch exception [type=no_shard_available_action_exception, reason=No shard available for [get [jobrunr_migrations][_doc][M001]: routing [null]]]]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_index_shard_state_exception, reason=CurrentState[RECOVERING] operations only allowed when shard state is one of [POST_RECOVERY, STARTED]]];

How can I be sure that ES is ready to receive GetRequests?

dadoonet · February 2, 2021, 10:38am

IIRC there's an issue on that.

What I do on my side is wait whenever an error like this happens and retry until I hit a timeout.

You can see that here:

https://github.com/dadoonet/fscrawler/blob/f2d21ea4e49e480400b6bcd421efbdf9681ca32a/integration-tests/it-common/src/main/java/fr/pilato/elasticsearch/crawler/fs/test/integration/AbstractITCase.java#L337-L354

long hits = awaitBusy(() -> {
    long totalHits;
    // Let's search for entries
    try {
        // Make sure we refresh indexed docs before counting
        refresh();
        response[0] = documentService.getClient().search(request);
    } catch (RuntimeException|IOException e) {
        staticLogger.warn("error caught", e);
        return -1;
    }
    totalHits = response[0].getTotalHits();
    staticLogger.debug("got so far [{}] hits on expected [{}]", totalHits, expected);
    return totalHits;
}, expected, timeout.millis(), TimeUnit.MILLISECONDS);

This is certainly not ideal, but at least my integration tests are no longer flaky.
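For readers who want to apply the same wait-and-retry idea outside of fscrawler, here is a minimal sketch using only the JDK. It is not the awaitBusy helper from the snippet above; the class name RetryUntilTimeout, the method retry and the pause parameter are made up for illustration. It retries on any exception, whereas in a real test you would probably only retry on the no_shard_available_action_exception.

```java
import java.time.Duration;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of Elasticsearch, JobRunr or fscrawler.
final class RetryUntilTimeout {

    /**
     * Calls the action until it returns without throwing.
     * Once the timeout has elapsed, the last failure is rethrown.
     */
    static <T> T retry(Callable<T> action, Duration timeout, Duration pause) throws Exception {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return action.call();
            } catch (Exception e) {
                if (System.nanoTime() - deadline >= 0) {
                    throw e; // out of time: surface the last error
                }
                Thread.sleep(pause.toMillis()); // wait a bit before the next attempt
            }
        }
    }

    public static void main(String[] args) throws Exception {
        int[] attempts = {0};
        // Simulates a call that fails twice (shard still recovering) and then succeeds.
        String result = retry(() -> {
            if (++attempts[0] < 3) {
                throw new IllegalStateException("shard still recovering");
            }
            return "ok";
        }, Duration.ofSeconds(5), Duration.ofMillis(50));
        System.out.println(result + " after " + attempts[0] + " attempts"); // prints "ok after 3 attempts"
    }
}
```

In the isNewMigration method from the first post, the client.get(...) call would be the action passed to retry.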

One thing you could do is check the index status (waitForHealthyIndex(JOBRUNR_MIGRATIONS_INDEX_NAME)) instead of the cluster status, though I guess you might hit the same issue.

DavidTurner · February 2, 2021, 10:49am

An index reports yellow health if it's newly created, because we don't want clusters to indicate they're unhealthy just because a new index was created. I'm guessing it's that. Most of the Elasticsearch integration tests wait for a newly-created index to be green before proceeding.
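As a rough sketch of waiting for a newly-created index to be green, the cluster health API accepts an index name plus the wait_for_status and timeout parameters. The example below calls it with the JDK's own HTTP client so that it stays self-contained. The class IndexHealth and its method names are hypothetical, and the response is checked with a simple pattern match rather than a real JSON parser.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.util.regex.Pattern;

// Hypothetical helper for integration tests; names are illustrative only.
final class IndexHealth {

    private static final Pattern NOT_TIMED_OUT = Pattern.compile("\"timed_out\"\\s*:\\s*false");

    /** Builds the cluster health URI that blocks until the index is green or the timeout expires. */
    static URI greenHealthUri(String baseUrl, String index, int timeoutSeconds) {
        return URI.create(baseUrl + "/_cluster/health/" + index
                + "?wait_for_status=green&timeout=" + timeoutSeconds + "s");
    }

    /** The health response carries "timed_out": false when the requested status was reached in time. */
    static boolean reachedGreen(String healthResponseBody) {
        return NOT_TIMED_OUT.matcher(healthResponseBody).find();
    }

    /** Sends the request and reports whether the index became green within the timeout. */
    static boolean waitForGreenIndex(String baseUrl, String index, int timeoutSeconds)
            throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(greenHealthUri(baseUrl, index, timeoutSeconds))
                .GET()
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return reachedGreen(response.body());
    }
}
```

With the Testcontainers setup from the first post, a call could look like IndexHealth.waitForGreenIndex("http://" + elasticSearchContainer.getHttpHostAddress(), JOBRUNR_MIGRATIONS_INDEX_NAME, 30). If you stay on the high-level REST client, ClusterHealthRequest with waitForGreenStatus() should give you the same behaviour.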

rdehuyss · February 3, 2021, 11:13am

Hi,

Thank you both for the answers.

Will the cluster become green if there is only 1 node participating?

DavidTurner · February 3, 2021, 11:49am

Yes, as long as you set number_of_replicas: 0 on any indices you create.
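Some background on why this matters: number_of_replicas defaults to 1, and a replica is never allocated on the same node as its primary, so on a single node the replica stays unassigned and the index stays yellow. Below is a minimal sketch of creating the index with zero replicas through the create index API, again using only the JDK HTTP client. The class SingleNodeIndex and the method createIndexRequest are made-up names.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical helper; only the request body and the PUT /<index> endpoint come from Elasticsearch.
final class SingleNodeIndex {

    // Index settings without replicas, so a one-node cluster can allocate every shard.
    static final String SETTINGS_BODY = "{\"settings\":{\"number_of_replicas\":0}}";

    /** Builds a create-index request (PUT /<index>) that sets number_of_replicas to 0. */
    static HttpRequest createIndexRequest(String baseUrl, String index) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/" + index))
                .header("Content-Type", "application/json")
                .PUT(HttpRequest.BodyPublishers.ofString(SETTINGS_BODY))
                .build();
    }
}
```

The request can be sent with HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()). The same setting can also be passed when creating the index through whichever client you already use.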

