I have location data (geo points) that is updated every 5 minutes for millions of users. We need to search for users with specific attributes (age, interests, languages) within a particular geo range. I would like to understand the right strategy for storing such data in Elasticsearch.
Option 1
Create a user document with the following keys:
- User metadata and attributes (age, interests, languages, salary, etc.; around 8-10 searchable attributes)
- Live location (changes every few minutes):
"liveLocation" : {
  "type" : "Point",
  "coordinates" : [-72.333077, 30.856567]
}
- Location data: multiple addresses (home address, work address, etc.) along with their geo points (not updated frequently):
"addresses" :
[
  {
    "type" : "home",
    "address" : "first floor, xyz, near landmark",
    "city" : "New York",
    "country" : "Country",
    "zipcode" : "US1029",
    "location" : {
      "type" : "Point",
      "coordinates" : [-73.856077, 40.848447]
    }
  },
  {
    ... more types of addresses
  }
]
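To make the Option 1 layout concrete, here is a rough sketch of an index mapping for such a document, written as a Python dict that could be sent as the index-creation body. The field types are assumptions, not a settled design: `geo_shape` is chosen because it accepts the GeoJSON-style `Point` objects shown above, and the `keyword`/`integer`/`text` types for the other attributes are guesses at how they would be searched.

```python
def build_user_mapping():
    """Return a hypothetical index mapping for the Option 1 user document."""
    return {
        "mappings": {
            "properties": {
                # Searchable attributes (types are assumptions).
                "age": {"type": "integer"},
                "interests": {"type": "keyword"},
                "languages": {"type": "keyword"},
                "salary": {"type": "integer"},
                # Frequently updated GeoJSON point.
                "liveLocation": {"type": "geo_shape"},
                # Rarely updated list of addresses, each with its own point.
                "addresses": {
                    "properties": {
                        "type": {"type": "keyword"},
                        "address": {"type": "text"},
                        "city": {"type": "keyword"},
                        "country": {"type": "keyword"},
                        "zipcode": {"type": "keyword"},
                        "location": {"type": "geo_shape"},
                    }
                },
            }
        }
    }
```

Note that with a plain object mapping like this, the entries of the `addresses` array are flattened, so a query cannot tie `"type" : "home"` to that same entry's `location`. If that correlation matters, `addresses` would need to be mapped as `nested` instead.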
We want to perform geo search queries over all the geo-type fields. My worry is that the live location for users will be updated quite frequently.
Q1. Will this be a viable option, considering the frequent updates?
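For reference, the kind of search Option 1 has to support (attribute filters plus a geo polygon over every geo field) could look roughly like the sketch below. It only builds the request body as a Python dict. The function name and parameters are hypothetical, and it assumes the geo fields are mapped so that a `geo_shape` query can run against them.

```python
def build_user_geo_query(polygon, min_age, max_age, languages):
    """Build a search body: attribute filters AND (live OR any address in polygon).

    polygon is a closed ring of [lon, lat] pairs (first point equals last point).
    """
    shape = {"type": "Polygon", "coordinates": [polygon]}

    def in_polygon(field):
        # geo_shape query: match documents whose field intersects the shape.
        return {"geo_shape": {field: {"shape": shape, "relation": "intersects"}}}

    return {
        "query": {
            "bool": {
                "filter": [
                    {"range": {"age": {"gte": min_age, "lte": max_age}}},
                    {"terms": {"languages": languages}},
                    {
                        # At least one of the geo fields must fall in the polygon.
                        "bool": {
                            "should": [
                                in_polygon("liveLocation"),
                                in_polygon("addresses.location"),
                            ],
                            "minimum_should_match": 1,
                        }
                    },
                ]
            }
        }
    }
```

Everything sits in `filter` context because none of these clauses needs relevance scoring.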
Option 2
- Treat every location update as time-series data and insert a new document. This avoids updating existing documents; instead, a new document is inserted for each user every few minutes.
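A minimal sketch of what one such append-only document might contain is shown below. The `userId` and `timestamp` field names are hypothetical; only `liveLocation` and its GeoJSON shape come from the Option 1 example.

```python
from datetime import datetime, timezone


def build_location_event(user_id, lon, lat, when=None):
    """Build one append-only location document for a user (Option 2)."""
    when = when or datetime.now(timezone.utc)
    return {
        "userId": user_id,              # hypothetical field name
        "timestamp": when.isoformat(),  # hypothetical field name
        "liveLocation": {"type": "Point", "coordinates": [lon, lat]},
    }
```

At one update every 5 minutes, each user produces 288 such documents per day, so one million users would add 288 million documents per day.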
Q2. While searching for all users (home/office/live location) in a particular geo polygon, I have to consider only the most recently updated document for each user. How can I do that in Elasticsearch?
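To be precise about the semantics Q2 is asking for, here is a plain Python sketch (not an Elasticsearch query). It first reduces the updates to the newest one per user and only then applies the geo test, so a user who was inside the area earlier but has since moved out is not returned. The event shape and the `contains` callback are hypothetical stand-ins for the stored documents and the polygon check.

```python
def latest_per_user(events):
    """Keep only the newest event for each userId."""
    latest = {}
    for event in events:
        current = latest.get(event["userId"])
        if current is None or event["timestamp"] > current["timestamp"]:
            latest[event["userId"]] = event
    return latest


def users_currently_in(events, contains):
    """Return sorted ids of users whose most recent location satisfies contains(lon, lat)."""
    return sorted(
        user_id
        for user_id, event in latest_per_user(events).items()
        if contains(*event["coordinates"])
    )
```

Applying the geo filter first and picking the newest matching document afterwards would give a different, incorrect result: it would return the last time each user was seen inside the polygon, even if they have since left.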
Q3. We have to search for users with specific attributes (age, interests, languages) within a particular geo polygon. If Option 2 is preferable, should the user attributes/metadata and the location updates be modeled as a parent-child relationship?
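For context, a parent-child layout as asked about in Q3 might look roughly like the sketch below, which uses the `join` field type and a `has_parent` query. The `relation` field name and the `user`/`location` relation names are hypothetical, and this only illustrates the shape of the model, not whether it performs well at this update rate.

```python
def build_join_mapping():
    """Mapping with a join field relating user (parent) to location (child)."""
    return {
        "mappings": {
            "properties": {
                "relation": {  # hypothetical field name
                    "type": "join",
                    "relations": {"user": "location"},
                },
                "age": {"type": "integer"},
                "languages": {"type": "keyword"},
                "liveLocation": {"type": "geo_shape"},
            }
        }
    }


def build_child_location(user_id, lon, lat):
    """Child document body; it must be indexed with routing set to the parent id."""
    return {
        "relation": {"name": "location", "parent": user_id},
        "liveLocation": {"type": "Point", "coordinates": [lon, lat]},
    }


def build_has_parent_query(geo_filter, languages):
    """Find location children matching geo_filter whose parent user matches."""
    return {
        "query": {
            "bool": {
                "filter": [
                    geo_filter,
                    {
                        "has_parent": {
                            "parent_type": "user",
                            "query": {"terms": {"languages": languages}},
                        }
                    },
                ]
            }
        }
    }
```

Parent and child documents have to live on the same shard, which is why the child is routed by the parent's id.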
Q4. Conclusion: what would be the right approach?