Replacing SQL geography intersections with ElasticSearch

Hello,

I am looking for a way to intersect data within ElasticSearch from two separate indexes. Both indexes contain documents with shapes. I want to intersect them geographically and get back a list of IDs showing which documents intersect documents from the other index.

Here is a little background on my issue:

The company I work for has been transitioning its data from MS SQL into Elastic Search for our web application. It is a heavily geographic, data-driven application. We have ETL jobs that pre-process our data before we index it in ElasticSearch. The ETLs run a list of geographic intersections in MS SQL, but they are pretty slow.

For example, we intersect 10 million land parcel shapes with 100,000 flood zone shapes. If we manually run this intersection in MS SQL it can take up to seven days of consecutive processing. This is with proper indexing and such.

In our application we do use ElasticSearch to intersect shapes from the indexes with GeoJSON passed in, so I could just iterate through all of the records and do the intersection that way. However, I am curious whether there is a simple way in ElasticSearch to take two indexes and give me their intersecting documents.

-Jamie

You can search on multiple indices at once (see "Search" in the Elasticsearch Guide [2.3]). Or is that not what you want, and am I missing something?

Elasticsearch :)

I am not sure I understand the process fully, but if you are looking to identify matching records of one type while ingesting records of the other type you might want to look into using the percolator functionality.
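A rough sketch of what the percolator approach might look like, written as Python functions that only build the request bodies. It assumes Elasticsearch 5.0 or later, where percolation is a `percolator` field type plus a `percolate` query (the 2.x releases exposed percolation through a different API). The field names `query` and `geometry` are hypothetical. The idea is to store each flood zone as a `geo_shape` query, then percolate each parcel as it is ingested to see which stored zones match it.

```python
from typing import Any, Dict

QUERY_FIELD = "query"      # hypothetical name of the percolator field
SHAPE_FIELD = "geometry"   # hypothetical name of the geo_shape field


def percolator_properties() -> Dict[str, Any]:
    """Field mappings for the index that holds the stored queries.

    How this is wrapped in a full mapping depends on the Elasticsearch version.
    """
    return {
        QUERY_FIELD: {"type": "percolator"},
        SHAPE_FIELD: {"type": "geo_shape"},
    }


def stored_zone_query(zone_shape: Dict[str, Any]) -> Dict[str, Any]:
    """Document to index once per flood zone: a geo_shape query on the zone's shape."""
    return {
        QUERY_FIELD: {
            "geo_shape": {
                SHAPE_FIELD: {"shape": zone_shape, "relation": "intersects"}
            }
        }
    }


def percolate_parcel(parcel_shape: Dict[str, Any]) -> Dict[str, Any]:
    """Search body that asks which stored zone queries match this parcel.

    Note: 5.x also expects a "document_type" entry here; later versions do not.
    """
    return {
        "query": {
            "percolate": {
                "field": QUERY_FIELD,
                "document": {SHAPE_FIELD: parcel_shape},
            }
        }
    }
```

The hits returned for a `percolate_parcel` body would be the stored zone documents, so their `_id` values give the intersecting flood zones for that parcel.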

Nice catch on my casing, ha. Doing a search on multiple indices would definitely be useful, but not quite what I am hoping for.

Basically, I just need to go through one index's documents and see which of them intersect documents of another index. I could loop through each of the documents on one side and use its shape to write a query for the other index, but I am concerned that the performance of this may not be ideal.
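For reference, the loop described above could be sketched roughly like this in Python. The index name `flood_zones`, the field name `geometry`, and the `search` callable are assumptions for illustration. In practice `search` would wrap whichever Elasticsearch client is in use. Only the `geo_shape` query body follows the documented query DSL.

```python
from typing import Any, Callable, Dict, Iterable, List, Tuple

# Hypothetical signature: search(index, body) -> Elasticsearch-style response dict.
SearchFn = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def build_intersects_query(field: str, shape: Dict[str, Any]) -> Dict[str, Any]:
    """Build a geo_shape query matching documents whose `field` intersects `shape`."""
    return {
        "_source": False,  # only the ids are needed, so skip the document bodies
        "query": {
            "geo_shape": {
                field: {"shape": shape, "relation": "intersects"}
            }
        },
    }


def intersect_indexes(
    parcels: Iterable[Tuple[str, Dict[str, Any]]],
    search: SearchFn,
    other_index: str = "flood_zones",  # assumed index name
    field: str = "geometry",           # assumed geo_shape field name
) -> Dict[str, List[str]]:
    """For each (parcel_id, shape) pair, collect ids of intersecting documents.

    Paging is omitted: a real search returns only the first page of hits,
    so a full implementation would need a larger size or a scroll.
    """
    results: Dict[str, List[str]] = {}
    for parcel_id, shape in parcels:
        response = search(other_index, build_intersects_query(field, shape))
        results[parcel_id] = [hit["_id"] for hit in response["hits"]["hits"]]
    return results
```

This issues one query per parcel, so with 10 million parcels the cost is dominated by round trips. Batching several queries per request would likely be the first thing to try.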

I was hoping something was already set up for this before I start testing with this method. The geographic searches are so fast in Elasticsearch that it would probably be pretty efficient, but I figured I'd look into it more beforehand.

Thanks!

A close example is at the URL below, in the section on "Pre-Indexed Shapes". It basically says to write a query that intersects a specific document's shape. I want to accomplish this, but without having to loop through every shape in one index and query the other.

https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-geo-shape-query.html
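As a minimal sketch, a pre-indexed shape query body could be built like this in Python. The `indexed_shape` keys (`index`, `id`, `path`, and `type` on older versions) come from the `geo_shape` query documentation. The concrete index, field, and id values used in the usage line are hypothetical.

```python
from typing import Any, Dict, Optional


def build_indexed_shape_query(
    field: str,
    shape_index: str,
    shape_id: str,
    shape_path: str,
    shape_type: Optional[str] = None,
) -> Dict[str, Any]:
    """Build a geo_shape query that reuses a shape already stored in another index.

    `shape_type` is the mapping type of the stored document; older Elasticsearch
    versions expect it, newer ones do not, so it is only included when given.
    """
    indexed_shape: Dict[str, Any] = {
        "index": shape_index,  # index holding the pre-indexed shape
        "id": shape_id,        # id of the document that contains the shape
        "path": shape_path,    # field in that document that holds the shape
    }
    if shape_type is not None:
        indexed_shape["type"] = shape_type
    return {
        "query": {
            "geo_shape": {
                field: {"indexed_shape": indexed_shape, "relation": "intersects"}
            }
        }
    }


# Hypothetical usage: find parcels intersecting the flood zone stored under id "zone-42".
body = build_indexed_shape_query("geometry", "flood_zones", "zone-42", "geometry")
```

This still needs one query per stored shape, but the shape itself never has to be fetched and sent back over the wire.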

You need to do that in your code.

Or use percolation as Christian mentioned.