Mapping advice

Hello,

I am extremely new to Elasticsearch, so be gentle, please.

I would like to get some advice on a mapping problem I am facing. I know that a field's type, once set, cannot be changed, so if it is a string it has to be a string throughout the same index.

Let's say I have recipes to store in ES. Recipes have ingredients, and each ingredient has a name and a quantity:

"ingredients": [
    {"name": "Eggs", "quantity": 3},
    ...
]

The problem is that a quantity can be expressed either numerically or as a string (e.g. 'a tablespoon').

So the question is: should I treat everything as a string, or is there some clever solution to this kind of mapping problem? And if I treat everything as a string, would I lose the ability to run a query like "give me recipes where eggs > 3"?

Thank you for your advice.
m.

We are always happy to welcome new users! :slight_smile:

Well. My guess is that it won't work unless you send "quantity": "3".

Yes. That's a problem.

What you could potentially do is create a sub-field and use the ignore_malformed option on it (see the ignore_malformed page of the Elasticsearch Guide [2.3]).

So something like:

"quantity": {
  "type": "string",
  "fields": {
    "numeric": {
      "type": "integer",
      "ignore_malformed": true
    }
  }
}

I did not test it but that could work.

It means that you would be able to search for numerical values in quantity.numeric.
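
To make concrete what ignore_malformed would skip here, below is a rough plain-Python model of the mapping above, not Elasticsearch code. It assumes (untested, like the mapping itself) that every value is kept in the string field, that a value which parses as a whole number also lands in quantity.numeric, and that anything else is silently left out of the sub-field only. The function name index_quantity is made up for the illustration.

```python
def index_quantity(raw):
    """Rough model of what the multi-field mapping keeps for one quantity value."""
    indexed = {"quantity": str(raw)}  # the string field keeps every value
    try:
        # A value that parses as a whole number is also stored in the sub-field.
        indexed["quantity.numeric"] = int(str(raw).strip())
    except ValueError:
        # Malformed for the integer sub-field: only that sub-field is skipped.
        pass
    return indexed


for value in ["3", "a pinch", "200 gr."]:
    print(index_quantity(value))
```

Under these assumptions only the eggs entry would be reachable through quantity.numeric, while "a pinch" and "200 gr." would stay searchable as text in quantity.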

In 5.0.0, you can use an ingest node, which lets you transform your source by applying some mutations before indexing. And you can ignore failures as well.

Hello @dadoonet,

thank you for your reply.

After reading the ignore_malformed docs, I am not sure I understand what is going to be ignored in this case.

To store a recipe I would do something like this I suppose:

PUT recipes/tarte/1
{
    ...
    "ingredients": [
        {"name": "Eggs", "quantity": 3},
        {"name": "Salt", "quantity": "a pinch"},
        ...
    ]
    ...
}

So, shouldn't ES complain that I put "quantity": 3, since my mapping declares it as a string? Do I have to add ignore_malformed to quantity as well?

Or maybe I should do something like:

PUT recipes/tarte/1
{
    ...
    "ingredients": [
        {"name": "Eggs", "quantity": "3"},
        {"name": "Salt", "quantity": "a pinch"},
        ...
    ]
    ...
}

So now the numeric field won't complain because it has ignore_malformed, but does this mean that the quantity for eggs has not been indexed?

I am a bit confused, as you might have noticed :smiley:
m

I believe you have to do the latter, but I did not test it...

Well. It should not be difficult to test both scenarios :stuck_out_tongue:

UPDATE

I just realised that I was running the search on quantity and not on quantity.numeric... damn you "Cut & Paste"...

It works! nicely.
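
For anyone hitting the same thing, the only difference is the field name in the range clause. Here is a small sketch in Python that builds such a search body. The full path ingredients.quantity.numeric is an assumption about where the sub-field sits in the document, and range_body is just a helper name for this example.

```python
import json


def range_body(field, greater_than):
    """Build a search body with a range clause on the given field."""
    return {"query": {"range": {field: {"gt": greater_than}}}}


# Assumed field path: adjust it to wherever quantity lives in your mapping.
body = range_body("ingredients.quantity.numeric", 5)
print(json.dumps(body, indent=2))
```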

Ignore anything below this line!


Ok, I've run some tests and they are not satisfactory at all.

First I set up the mapping more or less as @dadoonet suggested:

"quantity": {
  "type": "string",
  "fields": {
    "numeric": {
      "type": "integer",
      "ignore_malformed": true
    }
  }
}

Then I index some data like so:

PUT recipes/tarte/1
{
    ...
    "ingredients": [
        {"name": "Eggs", "quantity": "3"},
        {"name": "Salt", "quantity": "a pinch"},
        ...
    ]
    ...
}

PUT recipes/tarte/2
{
    ...
    "ingredients": [
        {"name": "Eggs", "quantity": "6"},
        {"name": "Sugar", "quantity": "200 gr."},
        ...
    ]
    ...
}

If I now query for recipes with any ingredient whose quantity is greater than 5, I get the second one, as expected. Unfortunately, the quantity.numeric field, even though it is an integer, still seems to be compared lexicographically: if I add a recipe with 20 eggs and query again for a quantity greater than 5, I still get only one recipe back. Only if I query for gte 2 do I also get the 20 eggs recipe.

So, although it was promising, the trick is useless because the 'integer' is still treated as a 'string'.

I will need to rethink the structure, I guess.
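
These symptoms match what a range comparison on string values produces, which fits the UPDATE at the top (the search ran on quantity, not on quantity.numeric). A quick illustration in plain Python, where ordinary string comparison stands in for how a string field would order its values; that stand-in is an assumption made for illustration only:

```python
egg_quantities = ["3", "6", "20"]

# Compared as strings: "20" sorts before "5" because '2' < '5'.
gt_5_as_strings = [q for q in egg_quantities if q > "5"]
gte_2_as_strings = [q for q in egg_quantities if q >= "2"]

# Compared as integers: 20 is greater than 5, as intended.
gt_5_as_integers = [q for q in egg_quantities if int(q) > 5]

print(gt_5_as_strings)
print(gte_2_as_strings)
print(gt_5_as_integers)
```

As strings, only "6" passes the "greater than 5" test and "20" shows up only for "gte 2", which is the pattern described above. As integers, both 6 and 20 pass.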