Default _ttl causes MapperParsingException due to already expired document

mallox · May 8, 2014, 5:17pm

I have a river importing data from BigQuery into an index via bulk
requests. The index has a default _ttl of 30 days configured. I don't set
the TTL anywhere on the documents when importing, so every document should
simply get the default value.

Unfortunately though I keep getting exceptions such as this one:

[2014-05-08 00:04:54,819][DEBUG][action.index ] [prod_log_3] [prod_-2014.05.08][4], node[7iHEb2ciTsGSaR3LxKbw8w], [P], s[STARTED]: Failed to execute [index {[prod_-2014.05.08][logging][g-6CRDhSS2OksWfF3OpCTg], source[{"message":"Message returned successfully. Size: 2","timestamp":"1399507397000","level":"INFO","mdc":"{"time_received":"1399507397320","time_responded":"1399507397333","user_device":""xxx"","response_length":"2","user_anchor":"0","response_size":"2","returned_models":"0","user_tag":""production"","user_model":""5.06""}","thread":"Request 717C91C3","logger":my.pkg.Servlet"}]}]
org.elasticsearch.index.mapper.MapperParsingException: failed to parse [_ttl]
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:418)
    at org.elasticsearch.index.mapper.internal.TTLFieldMapper.postParse(TTLFieldMapper.java:177)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:523)
    at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:462)
    at org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:363)
    at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:215)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:556)
    at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:426)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:744)
Caused by: org.elasticsearch.index.AlreadyExpiredException: already expired [prod_context_eng-2014.05.08]/[logging]/[g-6CRDhSS2OksWfF3OpCTg] due to expire at [3991507494] and was processed at [1399507494819]
    at org.elasticsearch.index.mapper.internal.TTLFieldMapper.innerParseCreateField(TTLFieldMapper.java:215)
    at org.elasticsearch.index.mapper.core.NumberFieldMapper.parseCreateField(NumberFieldMapper.java:215)
    at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:408)
    ... 10 more

The mapping for the index looks like this:

logging: {
  _timestamp: {
    enabled: true
  },
  _ttl: {
    enabled: true,
    default: 2592000000
  },
  properties: {
    timestamp: {
      type: string
    },
    message: {
      type: string
    },
    level: {
      type: string
    },
    mdc: {
      type: string
    },
    thread: {
      type: string
    },
    logger: {
      type: string
    }
  }
}

I've checked that the clocks on the three nodes are in sync; there was
only negligible skew.
The cluster is running ES version 1.1.1 on GCE, using standard n1 instances
with dedicated disks.

The connector used for nodes to find each other
is https://github.com/mallocator/Elasticsearch-GCE-Discovery

The river used to import data
is https://github.com/mallocator/Elasticsearch-BigQuery-River

Any suggestions on what I can do to fix/improve this issue would be very
welcome.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/dc0d6f32-fc06-4598-9f85-f78e6d342cb3%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Benjamin_Deveze · May 8, 2014, 6:17pm

Hi Ravi,

After a quick investigation I would say that the problem is here:

https://github.com/mallocator/Elasticsearch-BigQuery-River/blob/master/src/main/java/org/elasticsearch/river/bigquery/BigQueryRiver.java#L391


		/**
		 * Mostly just write the data you got from BigQuery into ElasticSearch.
		 * 
		 * @throws ElasticsearchException
		 * @throws IOException
		 * @throws InterruptedException
		 */
		private void parse(@Nonnull final List<TableRow> rows) throws ElasticsearchException, IOException, InterruptedException {
			final int size = rows.size();
			int count = 0;
			final String timestamp = String.valueOf(System.currentTimeMillis());
			int progress = 0;
			logger.info("Got {} results from BigQuery database", size);


			while (!stopThread && !rows.isEmpty()) {
				final TableRow row = rows.remove(0);
				final String[] jsonResult = getJson(row);
				final String source = jsonResult[0];
				final IndexRequestBuilder builder = esClient.prepareIndex(index, type);
				if (jsonResult[1] != null) {
					builder.setId(jsonResult[1]);

The timestamp should be set in milliseconds, not seconds, so removing
the / 1000 should solve your issue.
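
The numbers in the exception seem to bear this out. Here is a minimal sketch of the arithmetic, assuming the expiry is computed as the document timestamp plus the TTL and compared against the current time in milliseconds (my reading of the error message, not a copy of the Elasticsearch source). The class and method names are made up for illustration; the constants are taken from the log and mapping above.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Elasticsearch or the river.
public class TtlExpiryCheck {
    // "default: 2592000000" from the mapping, i.e. 30 days in milliseconds.
    static final long DEFAULT_TTL_MS = TimeUnit.DAYS.toMillis(30);
    // "was processed at [1399507494819]" from the exception.
    static final long PROCESSED_AT_MS = 1399507494819L;

    // Assumed rule: expiry = document timestamp + TTL.
    static long expireAt(long timestamp, long ttlMs) {
        return timestamp + ttlMs;
    }

    // Assumed rule: a document whose expiry is not in the future is rejected.
    static boolean alreadyExpired(long timestamp, long ttlMs, long nowMs) {
        return expireAt(timestamp, ttlMs) <= nowMs;
    }

    public static void main(String[] args) {
        // Timestamp divided by 1000 (seconds), as the river appears to send it.
        long timestampSeconds = PROCESSED_AT_MS / 1000;
        // Prints 3991507494, the "expire at" value from the exception.
        System.out.println(expireAt(timestampSeconds, DEFAULT_TTL_MS));
        // Prints true: that expiry is far below "now" in milliseconds.
        System.out.println(alreadyExpired(timestampSeconds, DEFAULT_TTL_MS, PROCESSED_AT_MS));

        // Same timestamp left in milliseconds.
        // Prints 1402099494819, which is 30 days after processing time.
        System.out.println(expireAt(PROCESSED_AT_MS, DEFAULT_TTL_MS));
        // Prints false: the document would be accepted.
        System.out.println(alreadyExpired(PROCESSED_AT_MS, DEFAULT_TTL_MS, PROCESSED_AT_MS));
    }
}
```

A timestamp in seconds plus a TTL in milliseconds gives exactly the 3991507494 shown in the log, which is why every document looks expired the moment it arrives.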

Hope this helps.

--
Benjamin DEVEZE


mallox · May 8, 2014, 8:05pm

Wow, awesome! That was quick.

Thanks a lot.

