Super-long wildcard query (must be a better way!)

Hi All,

I am playing with some flight data at the moment and want to view the Eastbourne Airshow data I captured with my aerial and dump1090 integration. To do that, I want to exclude all commercial flights from the dataset, but there are roughly 5,400 ICAO three-letter airline codes registered at the moment.

I constructed the following query:

{"query":{"bool": {"should": [
{"wildcard":{"aircraft.flight":"AAA*"}},
{"wildcard":{"aircraft.flight":"AAB*"}},
{"wildcard":{"aircraft.flight":"AAC*"}},
{"wildcard":{"aircraft.flight":"AAD*"}},
{"wildcard":{"aircraft.flight":"AAF*"}},
{"wildcard":{"aircraft.flight":"AAG*"}},
{"wildcard":{"aircraft.flight":"AAH*"}},
{"wildcard":{"aircraft.flight":"AAI*"}},
{"wildcard":{"aircraft.flight":"AAJ*"}},
{"wildcard":{"aircraft.flight":"AAK*"}},
{"wildcard":{"aircraft.flight":"AAL*"}},
{"wildcard":{"aircraft.flight":"AAM*"}},
{"wildcard":{"aircraft.flight":"AAO*"}},
{"wildcard":{"aircraft.flight":"AAP*"}},
**snipped for brevity**
{"wildcard":{"aircraft.flight":"ZZM*"}}
]}}}

Doing a quick ciggy packet calc, there are 17,576 possible combinations (26^3), so turning the list around into an exclude isn't going to be any more efficient.
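For anyone checking the arithmetic, here is a minimal Python sketch of the same calculation. The 5,400 figure is the rough count quoted above, not an exact registry number, and the variable names are only illustrative.

```python
import string

# Every possible three-letter prefix: 26 choices in each of 3 positions.
total_prefixes = len(string.ascii_uppercase) ** 3

# Rough number of registered ICAO airline codes quoted in the post (assumption).
registered_codes = 5400

# Prefixes that would be left over if the list were inverted.
unregistered = total_prefixes - registered_codes

print(total_prefixes)  # 17576
print(unregistered)    # 12176
```

Either way the list has thousands of entries, which is why enumerating prefixes one clause at a time does not scale in either direction.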

How on earth should I break this down into a meaningful dataset and exclude with ES?

Additional: I want to exclude anything which is [A-Z][A-Z][A-Z] followed by some numbers, although (awkwardly) some airlines then use letters after that too, like NZE1234A.
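To make that pattern concrete, here is a small Python sketch that tests a few callsigns against it. Only NZE1234A comes from this post; the other sample values are hypothetical, and stripping whitespace assumes the feed may pad callsigns with spaces.

```python
import re

# Assumed shape of a commercial callsign: three letters, one or more digits,
# then optional trailing letters (e.g. NZE1234A).
COMMERCIAL = re.compile(r"[A-Z]{3}[0-9]+[A-Z]*")


def looks_commercial(callsign: str) -> bool:
    """Return True if the whole callsign fits the three-letters-then-digits shape."""
    # Strip padding first; whether the feed pads with spaces is an assumption.
    return COMMERCIAL.fullmatch(callsign.strip()) is not None


# NZE1234A is from the post; the rest are made-up examples.
samples = ["NZE1234A", "BAW123  ", "GABCD", "REDS1"]
for callsign in samples:
    print(repr(callsign), looks_commercial(callsign))
```

The first two samples match and the last two do not, because their fourth character is a letter rather than a digit.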

Cheers,

Richie

In the end I decided to just go with the best I could:

{
  "query": {
    "regexp": {
      "aircraft.flight.keyword": "[A-Z]{3}[0-9]{1}.*"
    }
  }
}
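Note that a regexp query on its own returns the documents that match the pattern, i.e. the commercial-looking callsigns. If the goal is still to exclude them, one option is to wrap the same clause in a `bool` query under `must_not`. The sketch below only builds the request body as a plain Python dict. The field name and pattern are taken from the query above, and the function name is hypothetical.

```python
import json

FIELD = "aircraft.flight.keyword"   # field name from the post
PATTERN = "[A-Z]{3}[0-9]{1}.*"      # pattern from the post, unchanged


def exclude_commercial_query(field: str = FIELD, pattern: str = PATTERN) -> dict:
    """Build a search body that drops documents whose field matches the pattern."""
    return {
        "query": {
            "bool": {
                # must_not removes every document matched by the regexp clause.
                "must_not": [
                    {"regexp": {field: pattern}}
                ]
            }
        }
    }


# Print the JSON body, ready to paste into a search request.
print(json.dumps(exclude_commercial_query(), indent=2))
```

This keeps the single regexp clause instead of thousands of wildcard clauses. Anything that happens to start with three letters and a digit is treated as commercial, which is the same approximation the original query makes.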
