Elasticsearch and picklists?

esearch · July 26, 2017, 9:02pm

Is there a way to use Elasticsearch to implement picklists WITHOUT storing all the data?

For example, say I have the following data:

POST food/dessert/1
{
    "name" : "banana creme pie",
    "type" : "pie"
}

POST food/dessert/2
{
    "name" : "tiramisu",
    "type" : "cake"
}

POST food/dessert/3
{
    "name" : "black forest",
    "type" : "cake"
}

POST food/dessert/4
{
    "name" : "tiramisu",
    "type" : "cake"
}

I'm making an application that shows all the possible values for any field. If you look up "type", it shows "cake" and "pie". If you look up "name", it shows "banana creme pie", "tiramisu", "black forest", etc.

Note that there's a lot of duplicate data here. "cake" is repeated multiple times. All I really need to store is a set of strings for each field.
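To make "a set of strings for each field" concrete, here is a minimal in-memory sketch in plain Python, with no Elasticsearch involved. The helper name `build_picklists` is hypothetical and the records are just the four desserts above.

```python
from collections import defaultdict


def build_picklists(records):
    """Collapse records into a sorted list of distinct values per field."""
    picklists = defaultdict(set)
    for record in records:
        for field, value in record.items():
            # A set keeps each value once, however often it repeats.
            picklists[field].add(value)
    return {field: sorted(values) for field, values in picklists.items()}


desserts = [
    {"name": "banana creme pie", "type": "pie"},
    {"name": "tiramisu", "type": "cake"},
    {"name": "black forest", "type": "cake"},
    {"name": "tiramisu", "type": "cake"},
]

picklists = build_picklists(desserts)
print(picklists["type"])  # ['cake', 'pie']
print(picklists["name"])  # ['banana creme pie', 'black forest', 'tiramisu']
```

This is the result I'd like to get back, ideally without keeping every original record around.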

I'm not sure if I'm trying to force Elasticsearch onto my problem or I'm not seeing how to use Elasticsearch correctly.

Please advise. Thanks in advance!

warkolm · July 26, 2017, 9:40pm

It depends on what the problem is. Elasticsearch can do this without any trouble. It also compresses stored fields (though it does not deduplicate them), so it will store things efficiently.
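One common way to get the distinct values of a field is a terms aggregation. Below is a rough sketch that only builds the request body as a Python dict, so it runs without a cluster. The helper name `picklist_query`, the aggregation name `picklist`, and the default of 1000 values are assumptions for illustration. The field name `type.keyword` assumes the default dynamic mapping, which indexes strings as text with a `.keyword` sub-field.

```python
import json


def picklist_query(field, max_values=1000):
    """Build a search body that returns only the distinct values of `field`."""
    return {
        "size": 0,  # no hits needed, only the aggregation result
        "aggs": {
            "picklist": {  # assumed aggregation name
                "terms": {"field": field, "size": max_values}
            }
        },
    }


# Body for a request such as: POST food/dessert/_search
print(json.dumps(picklist_query("type.keyword"), indent=2))
```

In the response, the distinct values should appear as the `key` of each bucket under `aggregations.picklist.buckets`, alongside a `doc_count`.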

esearch · July 26, 2017, 9:47pm

Thanks for your reply. The problem is implementing picklists with Elasticsearch without storing all the data. The data above was just a small example. The real data will have tens of millions of entries, with dozens of fields per record and a lot of duplicate values within each field, so I don't want to store the source for each individual record. Does that make sense?

warkolm · July 26, 2017, 11:01pm

It does.

However, even with tens of millions of records, I don't think there's any problem here. If you are so resource-constrained that you cannot store this data, then look at setting the field mappings to "store": false and at disabling _source in the mapping.
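As a rough sketch of what such a mapping could look like, the snippet below builds an index-creation body as a Python dict. It assumes a typed mapping as used in the 2017-era releases (the `dessert` type from the example), and the helper name `picklist_mapping` is hypothetical. Note that "store" is already false by default for individual fields. The setting that keeps the original JSON of each record is `_source`, so that is the one disabled here.

```python
import json


def picklist_mapping(doc_type, fields):
    """Build an index-creation body that keeps no _source for documents."""
    return {
        "mappings": {
            doc_type: {
                # Do not keep the original JSON document.
                "_source": {"enabled": False},
                # keyword fields are indexed as-is, which suits aggregations.
                "properties": {
                    name: {"type": "keyword", "store": False} for name in fields
                },
            }
        }
    }


# Body for a request such as: PUT food
print(json.dumps(picklist_mapping("dessert", ["name", "type"]), indent=2))
```

The trade-off is that with `_source` disabled, search hits no longer return the original document, and features that rely on it, such as the update API and reindexing, are not available. Terms aggregations on keyword fields should still work, since they read from doc values rather than from `_source`.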
