Hi there,
It's my first post here and I'm still a beginner with ES, so please feel free to point out any guidelines I may have missed when posting this.
I took over a rather old project (late 2017) that was using ES 5.5 and was working fine at the time, and have to update its components (this is my first time using ES). ES is used there to handle indexing and searching documents, in what is basically a book collection; two indices handle searching into the "structure" (titles, chapters, etc.) and four other indices, more comprehensive, handle full-text indexing.
While fixing the breaking changes from 5.5 to 7.10 has gone smoothly so far (including splitting mapping types into separate indices), and my indexing scripts run fine, I run into issues when performing full-text searches in my application.
I get the following errors, observed when looking at my server logs (Tomcat/Catalina):
Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [ordnum] in order to load field data by uninverting the inverted index. Note that this can use significant memory.
Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [typedoc] in order to load field data by uninverting the inverted index. Note that this can use significant memory.
Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [ordnum] in order to load field data by uninverting the inverted index. Note that this can use significant memory.
"ordnum" is an ordering number, and "typedoc" is a short string used as an internal "type" descriptor. Both are defined as follows in the indices ("typedoc" is only used in one index; "ordnum" is used in three indices with the exact same definition every time):
{
  "settings": {
    [...]
  },
  "mappings": {
    "properties": {
      "typedoc": {
        "type": "keyword",
        "index": "true"
      },
      [...]
      "ordnum": {
        "type": "long"
      },
      [...]
    }
  }
}
Note that both of these fields previously had "index" set to "not_analyzed". Since that value has disappeared in more recent versions of ES, I changed it to "true" for the keyword and removed it for the long. (I also tried changing the long to a keyword, but that didn't solve anything.)
While the entire query is too convoluted and "private" to disclose here, I can tell you that "typedoc" is aggregated as follows (among numerous other entries in the "aggregations" field of the query):
"typedoc": { "terms" : { "field" : "typedoc", "size" : 5 } },
while "ordnum" is used for sorting in the "sort" field, like this:
"sort": [{"numvol": { "order": "asc" }}, {"ordnum": { "order": "asc" }}],
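To make the shape of the request easier to picture, here is a minimal sketch that puts the two fragments above into one search body. Only the "aggregations" and "sort" parts come from my actual query; the "match_all" clause is a placeholder I am assuming for illustration, since I can't share the real query.

```python
import json

# Fragments quoted from the real query.
aggregations = {"typedoc": {"terms": {"field": "typedoc", "size": 5}}}
sort = [{"numvol": {"order": "asc"}}, {"ordnum": {"order": "asc"}}]

# Placeholder: the real "query" clause is not disclosed in this post.
body = {
    "query": {"match_all": {}},
    "aggregations": aggregations,
    "sort": sort,
}

# Print the JSON body that would be sent to the _search endpoint.
print(json.dumps(body, indent=2))
```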
While I do understand some of the problems here (e.g. "ordnum" is a numeric field and sorting is thus disabled), I don't really grasp some things at play here.
- I've seen many possible ways to solve these issues: changing the numeric fields to keywords, setting the "fielddata" attribute to 'true' (although it's not available on numeric types), setting the "doc_values" attribute to 'true', using multi-fields... What would be the best solution here? What are the main differences between all of those?
- Why does aggregating on "typedoc" raise an error even though it is a keyword-type field?
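Since the error message talks about text fields while my mapping files declare keyword and long, one thing that seems worth checking is whether the live indices actually carry the mapping I intended. Below is a rough sketch of such a check, assuming a dictionary shaped like the response of the 7.x get-mapping API (index name, then "mappings", then "properties"). The index names and the sample data are hypothetical, and the helper only looks at top-level fields.

```python
def field_types(mapping_response, field):
    """Return {index_name: type} for a top-level field in a get-mapping style response."""
    result = {}
    for index_name, index_body in mapping_response.items():
        props = index_body.get("mappings", {}).get("properties", {})
        if field in props:
            # Fields without an explicit "type" are object fields.
            result[index_name] = props[field].get("type", "object")
    return result


# Hypothetical sample, NOT my real indices: one index has the intended
# mapping, the other has "ordnum" mapped as text.
sample = {
    "fulltext_a": {"mappings": {"properties": {
        "ordnum": {"type": "long"},
        "typedoc": {"type": "keyword"},
    }}},
    "fulltext_b": {"mappings": {"properties": {
        "ordnum": {"type": "text"},
    }}},
}

for name in ("ordnum", "typedoc"):
    for index_name, es_type in field_types(sample, name).items():
        note = "  <-- text field: sorting/aggregating needs fielddata" if es_type == "text" else ""
        print(f"{index_name}.{name}: {es_type}{note}")
```

If a field shows up as text in any index targeted by the search, that would at least match the wording of the errors above.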
Thanks in advance for your help!