App Search crashing after a few million records

We have set up a self-managed App Search instance on a Linux machine. It runs fine, but it crashes after a few million records. Below are the issues:

  1. We were able to write up to 14 million records; after that, App Search crashes with "too many open connections" errors. I have rebooted the server, but no luck.

  2. We are writing data into App Search with a .NET Core application running on the same machine. The job has two parts:
    it fetches data from a document-based DB -- up to here it is very quick;
    then it writes the data to App Search -- this is taking too long. More frustratingly, it is processing only 100,000 records per day.

  3. logstasher.log is growing very rapidly. Is there any way we can restrict this?

  4. Is there any limit in App Search?

Any suggestions please.
Thank you in advance.

@Srini12

We were able to write up to 14 million records; after that, App Search crashes with "too many open connections" errors. I have rebooted the server, but no luck.

Are you able to provide us with a more specific error message?

which is taking too long. More frustratingly, it is processing only 100,000 records per day.

How are you writing the records into App Search?

logstasher.log is growing very rapidly. Is there any way we can restrict this?

EDITED: Removed my advice in favor of @orhantoy's response below

Is there any limit in App Search?

There are no document limits in App Search.

Hi @Srini12 :wave:

Re 1: The documents are indexed into Elasticsearch - have you looked into the health of your Elasticsearch cluster?
Re 3: What version of App Search are you on? I recall this being fixed not too long ago.
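In case it helps, here is a quick way to check the health of the Elasticsearch cluster backing App Search. This assumes the default `localhost:9200`; adjust the host and add credentials if your cluster is secured:

```shell
# Overall cluster status: green, yellow, or red
curl -s 'http://localhost:9200/_cluster/health?pretty'

# List indices that are not healthy, to spot unassigned shards
curl -s 'http://localhost:9200/_cat/indices?v&health=yellow'
curl -s 'http://localhost:9200/_cat/indices?v&health=red'
```

A yellow or red status, or a growing number of unassigned shards, would point at the cluster rather than App Search itself.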

Hello Srinivas,

Sorry you're experiencing issues with indexing large amounts of data into App Search. I hope we can help you resolve those issues ASAP.

As Jason and Orhan have already pointed out, we may need a bit more information to troubleshoot your specific situation.

We were able to write up to 14 million records; after that, App Search crashes with "too many open connections" errors. I have rebooted the server, but no luck.

This is concerning since we've never seen anything like that before and it'd be extremely helpful to get more information on the following:

  1. What specific errors are you seeing and where?

  2. Could you run the following command on the server running App Search when the issue is happening?

    $ ps axuww | grep java | grep app-search

    Run this and note the process id (the numeric value in the 
    second column). Then run the following and share the results 
    with us if possible (PROCESS_ID is the process id from the 
    previous command):

    $ lsof -n -p PROCESS_ID


it is processing only 100,000 records per day

Ingest rates are extremely dependent on the code performing the ingestion. If you need to ingest a lot of data relatively quickly, you need to ensure you're using batch indexing requests (up to 100 docs in one batch), using multiple parallel indexing requests (processes, threads, etc., depending on the code that does the ingestion), and monitoring the health of your Elasticsearch cluster to ensure it is able to keep up with the data you're pushing into it. I'm fairly certain App Search could handle a lot more in a day than the numbers you're seeing, so I'd recommend not settling for that number and looking into ways to dramatically increase it.
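To make the batching-plus-parallelism idea concrete, here is a minimal sketch in Python (your job is in .NET Core, but the same pattern applies there). The engine name, endpoint URL, and API key below are placeholders for illustration; the point is splitting documents into batches of at most 100 and sending batches from several workers:

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Placeholders -- substitute your own App Search host, engine, and private key
APP_SEARCH_URL = "http://localhost:3002/api/as/v1/engines/my-engine/documents"
API_KEY = "private-xxxxxxxx"

def chunks(items, size=100):
    """Split a list of documents into batches of at most `size`
    (App Search accepts up to 100 documents per indexing request)."""
    for i in range(0, len(items), size):
        yield items[i:i + size]

def index_batch(batch):
    """POST one batch of documents to the documents endpoint."""
    req = urllib.request.Request(
        APP_SEARCH_URL,
        data=json.dumps(batch).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {API_KEY}",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

def index_all(docs, workers=4):
    """Index all documents using several parallel workers,
    each sending full batches of up to 100 documents."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(index_batch, chunks(docs)))
```

Tune the number of workers to what your Elasticsearch cluster can absorb; more parallelism only helps while the cluster stays healthy.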

Is there any limit in App Search?

There are no limits on the number of records indexed into the system or on the ingestion rate. It all depends on available resources and proper sizing of components (mainly Elasticsearch).

Thank you for using our product!

--
Oleksiy Kovyrin
App Search Tech Lead

Oh, and one more thing: if you are able to upgrade to the latest version (Enterprise Search 7.8.0 at the moment), please do. The product is relatively young, and we've made tremendous improvements over the past few minor versions (for example, the logstasher log no longer exists, and all other logs now have proper automatic log rotation configured).
