App Search logs all API requests and responses. Most of the time it returns a 200 status code, even when there were errors ingesting the payload. That's fine, because I understand that the request itself was successful. However, there is no (optimal) way to identify ingest errors from the App Search/Elastic side: it relies on the sender noticing the errors and logging them, and in this case I don't have that control.
I am ingesting data in batches of 10 records. Some records succeed and have an 'errors:' in the response; some don't and have an 'errors[xxxxx]' in the response.
There is no way in the App Search UI to filter for these. There is "a" way to find these records in Elasticsearch, but you need to know the error message, and when I tried it, my node ran out of memory and restarted. Yikes.
So, to summarise: there is no way (that I know of) to find App Search ingest errors so that I can take action to remediate the data being sent.
I would appreciate a field like "ingest_errors": ["abc", "efg"].
You're correct in that the best way to handle indexing errors is in the logic making the indexing requests. It's going to be cumbersome to catch them by inspecting data after the fact.
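For anyone who does control the sending side, here is a minimal sketch of catching these errors at request time. It assumes the indexing response is a JSON array with one object per document, each carrying an "id" and an "errors" list, which is why a 200 status alone says nothing about individual documents. The function name, the sample ids, and the sample error message are made up for illustration.

```python
from typing import Any, Dict, List


def collect_ingest_errors(batch_response: List[Dict[str, Any]]) -> Dict[str, List[str]]:
    """Return a mapping of document id -> error messages for failed documents."""
    failures: Dict[str, List[str]] = {}
    for position, result in enumerate(batch_response):
        errors = result.get("errors") or []
        if errors:
            # Fall back to the position in the batch if the document has no id.
            doc_id = result.get("id") or f"<no id, position {position}>"
            failures[str(doc_id)] = list(errors)
    return failures


# Hypothetical parsed response body for a batch of two documents.
sample_response = [
    {"id": "doc_1", "errors": []},
    {"id": "doc_2", "errors": ["Invalid field type"]},  # made-up message
]

ingest_errors = collect_ingest_errors(sample_response)
print(ingest_errors)
```

The sender could log or retry whatever ends up in ingest_errors instead of relying on the HTTP status code.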
There's no way to filter in the API Logs UI on indexing requests where one or more documents had an error. The only way I can think of is manually searching over the Elasticsearch index that contains the API Logs data. Note that based on your App Search deploy's ILM policies, I'd expect the data to be routinely aged out. To complicate matters more, the index name will change often based on that same ILM activity, so you'll probably have to search over a pattern instead. Finally, because we're referencing "under the hood" access to data, there's no guarantee that this will continue to work in new versions.
On a fresh Cloud deploy, this is where I could find document indexing errors from my indexing API requests: .ent-search-api-ecs-ilm-logs-production-2021.09.06-000001.
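As a rough sketch of "searching over a pattern", the snippet below only builds a search body and prints it; it does not contact a cluster. The index pattern is derived from the index name above. The field names ("@timestamp" and "http.response.body.content") are assumptions taken from ECS naming, so check the actual mapping first. The time window and the small hit count are meant to keep the search narrow, but nothing here guarantees it will be cheap.

```python
import json
from typing import Any, Dict

# Pattern derived from the index name above; the date and counter suffix rotate with ILM.
INDEX_PATTERN = ".ent-search-api-ecs-ilm-logs-production-*"


def build_error_log_query(
    error_phrase: str,
    body_field: str = "http.response.body.content",  # assumed field name, verify in the mapping
    lookback: str = "now-1h",
    max_hits: int = 20,
) -> Dict[str, Any]:
    """Build a search body limited to a recent time window and a small page size."""
    return {
        "size": max_hits,
        "sort": [{"@timestamp": "desc"}],
        "query": {
            "bool": {
                "filter": [
                    {"range": {"@timestamp": {"gte": lookback}}},
                    {"match_phrase": {body_field: error_phrase}},
                ]
            }
        },
    }


query = build_error_log_query("Invalid field type")  # made-up error message
print(f"POST {INDEX_PATTERN}/_search")
print(json.dumps(query, indent=2))
```

The printed body can be pasted into a console against a non-production copy of the logs to see whether the assumed fields match.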
Thanks for this. The problem is less with ILM and more with the way the error is stored in the response. It is not very searchable at all, and searching it puts a lot of strain on the cluster, to the point that it's effectively not searchable.
Anyone reading this: don't try to search it on a production cluster, you'll break it!