How to model a live web application in ElasticSearch?

(bdb) #1

In our web application we use a denormalized data mart in SQL Server for geo-based user project content.

Users have 1..* projects and 1..* geo areas. Content is stored (in the data mart) with UserID, ProjectID, text values for geo areas, and a title and description (both indexed for free-text search):

UserID, ProjectID, Geo, Title, Description, Timestamp

We now want to move this over to ElasticSearch. What would be a good data modeling approach?

For the data mart alone, I was thinking of simply serializing the data object (we currently use .NET and Entity Framework) to get its JSON representation and indexing that into ES. Is this a good approach? It would also require the least rework.
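To make the idea concrete, here is a minimal sketch of that serialization step, written in Python rather than .NET purely for illustration. The class name `ContentRow`, the snake_case field names, and the index name `project_content` are all assumptions; only the column list comes from the data mart described above. The output is newline-delimited JSON in the shape the ES bulk API expects (an action line followed by the document).

```python
import json
from dataclasses import dataclass, asdict
from datetime import datetime, timezone


@dataclass
class ContentRow:
    # Mirrors the data mart columns; the Python field names are assumptions.
    user_id: int
    project_id: int
    geo: str
    title: str
    description: str
    timestamp: datetime


def to_document(row: ContentRow) -> dict:
    """Turn one data mart row into a JSON-serializable dict."""
    doc = asdict(row)
    # datetime is not JSON-serializable, so store it as an ISO 8601 string.
    doc["timestamp"] = row.timestamp.isoformat()
    return doc


def to_bulk_lines(rows, index_name="project_content"):
    """Build a newline-delimited body: an action line, then the document.

    The index name is a hypothetical placeholder.
    """
    lines = []
    for row in rows:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(to_document(row)))
    # Bulk bodies must end with a trailing newline.
    return "\n".join(lines) + "\n"
```

Each row becomes one flat document, so this keeps the existing denormalized shape and leaves the mapping decisions (which fields are analyzed text, which are exact-match values) as a separate step.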

With regards to modeling the entire application, I have seen examples where an ES type would be organized by, say Users, so the model may look something like this:

  User ID, Name, etc...
     Setting1, Setting2, etc...           
     GeoID, GeoName
     ProjectID, ProjectName
        Key (UserID:ProjectID:ProjectContentID), GeoName, Title, Description, Timestamp

So this looks like the whole web application could run off of one index/type. A bit scary, no? I'm just trying to wrap my head around creating a denormalized data model in ES for a web app.
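As a rough illustration of what one such user-centric document could look like, here is a small Python sketch that assembles it. The function names, the snake_case field names, and the shape of the input lists are assumptions made for the example; only the composite key format (UserID:ProjectID:ProjectContentID) comes from the outline above.

```python
def content_key(user_id, project_id, content_id):
    """Composite key in the UserID:ProjectID:ProjectContentID format."""
    return f"{user_id}:{project_id}:{content_id}"


def build_user_document(user_id, name, settings, geos, projects):
    """Assemble one nested document per user.

    `projects` is assumed to be a list of dicts, each with "project_id",
    "project_name" and a "content" list whose items carry "content_id".
    """
    return {
        "user_id": user_id,
        "name": name,
        "settings": settings,
        "geos": geos,
        "projects": [
            {
                "project_id": p["project_id"],
                "project_name": p["project_name"],
                "content": [
                    # Copy the item and attach its composite key.
                    {**c, "key": content_key(user_id, p["project_id"], c["content_id"])}
                    for c in p["content"]
                ],
            }
            for p in projects
        ],
    }
```

One consequence worth noting: with this shape, every change to a single content item means rewriting the whole user document.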

I would like to use Kibana and other analysis tools in the future, and I have read that they come with data modeling limitations, such as not being able to use parent/child types.

What would a good ElasticSearch data model look like for something like this?

Another way of asking would be, how would one model a live web application using ElasticSearch, and/or would it be better to store user configs and profiles in a separate RDBMS?

Thank you.

(Mark Walkom) #2

The best way to map your data depends on what you want to look at. Your flattened structure looks good, but just pick a starting point and then iterate, because you will never pick the right version the first time :smile:

Parent/child is good as the ES version of relationships, but as you point out, it won't work with Kibana. Instead, you could build multiple flattened views of the data that suit how you want to analyse things.
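A minimal sketch of what "multiple flattened views" could mean in practice, assuming a nested per-user structure as input. All field names and both view functions are hypothetical: one view emits a document per content item, the other a document per project, and each could go into its own index.

```python
def content_view(user):
    """One flat document per content item, with user and project fields copied in."""
    docs = []
    for project in user["projects"]:
        for item in project["content"]:
            docs.append({
                "user_id": user["user_id"],
                "user_name": user["name"],
                "project_id": project["project_id"],
                "project_name": project["project_name"],
                "geo": item["geo"],
                "title": item["title"],
                "description": item["description"],
                "timestamp": item["timestamp"],
            })
    return docs


def project_view(user):
    """One flat document per project, with simple precomputed summaries."""
    return [
        {
            "user_id": user["user_id"],
            "project_id": project["project_id"],
            "project_name": project["project_name"],
            "content_count": len(project["content"]),
            # Distinct geo areas covered by this project's content.
            "geos": sorted({item["geo"] for item in project["content"]}),
        }
        for project in user["projects"]
    ]
```

The same source data is duplicated across views on purpose: each view is shaped for one kind of question, so no join or parent/child lookup is needed at query time.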
