I want to set the type of all fields to wildcard when searching logs, no matter how large the values are or how many unique values there are. Is this possible?
I ran a small performance test, and it showed roughly a 30% increase in storage and write costs. That looks fine to me, but I'm not sure whether the cost goes up more for some special kinds of logs.
There is also a known problem using wildcard fields with Kibana: KQL doesn't support wildcard fields.
Making all your fields wildcard type would typically not be considered a best practice. Typically you would keep the original event and perhaps make that a wildcard field to search, but not every field.
This is not to say you cannot make all your fields wildcard type, just that it would be a bit unusual.
There are performance and capability advantages to correctly typing your data.
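To make the two options concrete, here is a rough sketch of what the index mappings might look like, built as Python dicts that serialize to the JSON body of a create-index request. The field names (status_code, service, message, raw_event) are made up for illustration and are not from your logs; the dynamic template is just one way to map every string field to wildcard.

```python
import json

# Option A: map every dynamically detected string field to wildcard.
# "strings_as_wildcard" is an arbitrary template name chosen for this sketch.
all_wildcard_mapping = {
    "mappings": {
        "dynamic_templates": [
            {
                "strings_as_wildcard": {
                    "match_mapping_type": "string",
                    "mapping": {"type": "wildcard"},
                }
            }
        ]
    }
}

# Option B: type each field for how it is searched, and keep only the
# original event as a wildcard field. All field names are hypothetical.
typed_mapping = {
    "mappings": {
        "properties": {
            "@timestamp": {"type": "date"},
            "status_code": {"type": "integer"},
            "service": {"type": "keyword"},
            "message": {"type": "match_only_text"},
            "raw_event": {"type": "wildcard"},
        }
    }
}

# Print the JSON bodies that would be sent when creating the index.
print(json.dumps(all_wildcard_mapping, indent=2))
print(json.dumps(typed_mapping, indent=2))
```

With option B, range queries on status_code and aggregations on service stay cheap, while substring searches are still possible against raw_event.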
Wildcard fields can take quite a bit of storage and are not always the most performant from a search perspective.
It really depends on the kind of searches you're going to do.
Do you really need a wildcard type, or might match_only_text be a better fit? See here.
The difference is whether you really need to run a search like *string*. If you use a text type, which tokenizes the input, you can just search on your string.
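A toy illustration of that difference, in plain Python rather than Elasticsearch: the tokenize function below is only a rough stand-in for an analyzer (lowercase, split on non-alphanumeric characters), and the log line is invented. It is meant to show the behavior, not how Lucene implements either field type.

```python
import re

def tokenize(text):
    # Rough stand-in for a text analyzer: lowercase, split on non-alphanumerics.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def text_match(value, term):
    # Text-style match: the term must equal one whole token.
    return term.lower() in tokenize(value)

def wildcard_contains(value, fragment):
    # Wildcard-style *fragment* match: the fragment may appear anywhere.
    return fragment in value

log = "Connection timeout while calling payment-service"

print(text_match(log, "timeout"))              # whole token, so it matches
print(text_match(log, "time"))                 # partial token, so it does not
print(wildcard_contains(log, "time"))          # substring, so it matches
print(wildcard_contains(log, "ment-serv"))     # substring spanning tokens
```

If your searches are always for whole words, the tokenized text type covers them; only the last two cases need something like a wildcard field.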
The statement that KQL doesn't support wildcard fields is not actually correct, but it is a common misunderstanding. KQL supports wildcards; see here. You just can't use double quotes with them.
KQL correct:
my_field : string*
KQL incorrect (the * is interpreted as a literal):
my_field : "string*"
Perhaps if you provided some samples of the log lines and the types of searches you want to perform, we could provide some guidance. Again, you can absolutely set every field to wildcard; it just may have consequences depending on the type of data and as your data grows.
One thing to remember with wildcard or regex searches that match anywhere in the string:
Keyword field search costs are linear with the number of unique terms.
Wildcard field search costs are linear with the number of candidate matches.
So if your field has a small number of values (e.g. month-of-the-year would have 12), then each of those values will have many associated docs. Each one of those docs' values will need to be decompressed to check if it matches the regex.
TL;DR: don't use a wildcard field for fields with a small number of possible values.
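A back-of-the-envelope sketch of that cost difference, using the month-of-the-year example. This is a toy model with invented numbers, not a benchmark: it assumes the keyword field checks each unique term once, and that the wildcard field has to verify the stored value of every candidate doc (treated here as every doc whose value contains the fragment).

```python
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]

# 120,000 hypothetical docs, spread evenly over the 12 month values.
values = [MONTHS[i % 12] for i in range(120_000)]

def keyword_checks(values):
    # Toy model: one pattern check per unique term in the field.
    return len(set(values))

def wildcard_checks(values, fragment):
    # Toy model: one value verification per candidate doc.
    return sum(1 for v in values if fragment in v)

# Searching for *ember* (September, November, December).
print(keyword_checks(values))            # 12
print(wildcard_checks(values, "ember"))  # 30000
```

With only 12 unique values the keyword side does a dozen checks, while the wildcard side scales with the number of matching docs, which is where the advice above comes from.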