Matching by array elements

There isn't an efficient way to do this, but I can think of two OK ways, and I'm not sure which one is better:

  1. Index the documents with the string array as it is and search using a bool query (bool filter in 1.x) with each term as a should (OR) clause, probably wrapped in a constant_score query. Then use a script to recheck every document that matched against your restriction.

  2. Index the documents with an extra field that smooshes together the string array. So it'd be {"stringArray": ["A", "B"], "smooshedStringArray": "A_B"}. Then use an appropriate bool query on the smooshed field. So if the user input is ["A", "B", "C"] then the query is:

"bool": {
  "should": [
    { "term": {"smooshedStringArray": "A"} },
    { "term": {"smooshedStringArray": "B"} },
    { "term": {"smooshedStringArray": "C"} },
    { "term": {"smooshedStringArray": "A_B"} },
    { "term": {"smooshedStringArray": "A_C"} },
    { "term": {"smooshedStringArray": "B_C"} },
    { "term": {"smooshedStringArray": "A_B_C"} }
  ]
}
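The clause list above is every non-empty subset of the user input, joined with "_". Here is a minimal Python sketch of how it could be generated. The helper names (smoosh, subset_terms, build_query) are made up for illustration, and the sketch assumes that values are smooshed in sorted order at index time, that the separator never occurs inside a value, and that smooshedStringArray is indexed as a single unanalyzed token so a term query matches the whole string.

```python
from itertools import combinations


def smoosh(values, sep="_"):
    """Join the distinct values in sorted order so equal sets give equal strings.

    Assumption: this is what gets stored in the smooshed field at index time.
    """
    return sep.join(sorted(set(values)))


def subset_terms(user_input, sep="_"):
    """Return the smooshed form of every non-empty subset of user_input."""
    items = sorted(set(user_input))
    return [
        sep.join(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def build_query(user_input, field="smooshedStringArray"):
    """Build the bool/should fragment with one term clause per subset."""
    return {
        "bool": {
            "should": [{"term": {field: term}} for term in subset_terms(user_input)]
        }
    }


if __name__ == "__main__":
    print(subset_terms(["A", "B", "C"]))
```

For ["A", "B", "C"] this yields the same seven terms as the hand-written query. Ten input values already produce 1023 clauses.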

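The first way stores less data and is fast when the term query alone is already selective, so the script only has to recheck a few documents. The second way is faster if some of the terms are really common, but it won't work at all on large arrays, because the number of term clauses roughly doubles with every element in the user input (2^n - 1 clauses for n elements).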
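For comparison, the two phases of the first way can be sketched in Python. This is only an illustration under assumptions. The helper names are hypothetical, and the query body uses the post-1.x shape, with constant_score wrapping a bool filter. The restriction is taken to be "every element of the document's array appears in the user input", which is inferred from the second way's query. The recheck is written as a plain client-side function standing in for the server-side script, which is not shown here.

```python
def candidate_query(user_input, field="stringArray"):
    """Phase 1: match any document containing at least one input term."""
    return {
        "query": {
            "constant_score": {
                "filter": {
                    "bool": {
                        "should": [{"term": {field: value}} for value in user_input]
                    }
                }
            }
        }
    }


def passes_restriction(doc_values, user_input):
    """Phase 2 stand-in for the script.

    Assumed restriction: the document's array is a subset of the input.
    """
    return set(doc_values) <= set(user_input)


if __name__ == "__main__":
    user_input = ["A", "B", "C"]
    # Hypothetical arrays from documents that the phase 1 query returned.
    candidates = [["A", "B"], ["A", "D"], ["C"]]
    kept = [doc for doc in candidates if passes_restriction(doc, user_input)]
    print(kept)
```

The cost of the recheck grows with the number of candidates, which is why this way suffers when a common term drags in many documents that the restriction then throws away.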

You might be able to think of a way to combine them if you think harder than I have about the problem.