Is it possible to use GCP Dataflow to export BigQuery data to App Search?

Total newbie to the product, so bear with me. I'd like to use App Search for a web app, but we have 1.5 million records in BigQuery that we would like to search. I've been using the GCP Marketplace integration and the Dataflow template to export records into App Search, with mixed results. I am using Elastic Cloud and the Enterprise Search features.

My question(s):

  1. Is the App Search import API different from the standard Elasticsearch one, and can that be used to import data?
  2. For high-volume data export operations, is App Search the correct platform for supporting end-user search?
  3. Is it possible to export to Elasticsearch and then migrate the documents to the App Search engine?

Thanks so much. As I said, I'm totally new to the platform and am learning by failing.

Hi @David_Robbins,

Good questions.

The App Search Documents API is the API to add/update documents in App Search. It is different from Elasticsearch's APIs. While App Search uses Elasticsearch under the hood, there is not currently a way for you to index data directly into Elasticsearch and then use App Search to query that data, or vice versa.

App Search's platform is capable of supporting high volumes of data, and provides powerful end-user search capabilities. However, there are at least two "gotchas" that I can think of off the top of my head:

  1. The App Search Documents API limits you to 100 documents per batch. If it's taking too long to index all your documents in 100-document batches, you can have multiple threads/processes all sending batches at once.
  2. If you are observing performance issues, you may want to check your cluster size and resources. In Elastic Cloud, the default cluster size may be too small for high volumes. You should be able to easily scale up. Since you are an Elastic Cloud customer, you're entitled to official support, so don't hesitate to reach out to them if you can't find what you need here on the discuss forums.
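
To make the first point concrete: at 100 documents per request, 1.5 million records works out to 15,000 requests, so sending batches concurrently matters. Below is a minimal sketch of that approach, not an official client. It assumes the Documents API is reachable at `/api/as/v1/engines/<engine>/documents` with a Bearer private key and returns one `{"id": ..., "errors": [...]}` entry per document; please verify both against the docs for your version. `base_url`, `engine`, `api_key`, and the helper names are placeholders made up for illustration.

```python
import json
import urllib.request
from concurrent.futures import ThreadPoolExecutor

BATCH_SIZE = 100  # Documents API limit per request, as noted above


def chunked(records, size=BATCH_SIZE):
    """Yield successive lists of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def post_batch(batch, base_url, engine, api_key):
    """POST one batch of documents and return the parsed JSON response.

    The endpoint path and auth header are assumptions; check your docs.
    """
    request = urllib.request.Request(
        f"{base_url}/api/as/v1/engines/{engine}/documents",
        data=json.dumps(batch).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def index_all(records, send, workers=4):
    """Send every batch through `send` using a small thread pool.

    `send` takes one batch (a list of dicts) and returns a list of
    per-document results. Results are flattened in batch order.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(send, chunked(records))
        return [item for batch_result in results for item in batch_result]
```

To use it, bind the connection details with something like `functools.partial(post_batch, base_url=..., engine=..., api_key=...)` and pass that as `send`. Filtering the returned list for entries whose `errors` list is non-empty is a quick way to spot documents that were rejected. Keep `workers` modest at first so you don't overwhelm a small cluster.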

Does that help?

Thanks Sean, yes it does. I am trying to cheat and use the Dataflow template from BigQuery to avoid monitoring Python apps running in parallel, since pushing 1.5 million records through at 100 per batch is the big obstacle.

As I said, I can import in some fashion with Dataflow, but the documents in App Search are not behaving as they should, so I suspect I have schema issues as an unintended result. It sounds like the Dataflow template is not a supported avenue, even though it seems to somewhat function.

Thanks again.
