Do you need to count how many documents contain the string "查询async-hbase异常 table: fraud:general_feature_m_v2,"?
If so, consider indexing your data with a whitespace tokenizer, using multi-fields.
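
A minimal sketch of that suggestion, written as Python dicts that mirror the JSON request bodies. It assumes the field is called `message`, an Elasticsearch version with typeless mappings (7.x or later), and the built-in `whitespace` analyzer. The sub-field name `ws` is a hypothetical choice, and documents already in the index would have to be reindexed before the sub-field becomes searchable.

```python
# Index body: keep "message" as it is and add a multi-field that is
# analyzed on whitespace only. The sub-field name "ws" is hypothetical.
INDEX_BODY = {
    "mappings": {
        "properties": {
            "message": {
                "type": "text",
                "fields": {
                    "ws": {"type": "text", "analyzer": "whitespace"}
                },
            }
        }
    }
}


def count_body(phrase):
    """Build a count request body matching `phrase` as consecutive whitespace tokens."""
    return {"query": {"match_phrase": {"message.ws": phrase}}}


COUNT_BODY = count_body("查询async-hbase异常 table: fraud:general_feature_m_v2,")
```

`INDEX_BODY` would be sent when creating the index, and `COUNT_BODY` would go to the `_count` endpoint of that index. Because the whitespace analyzer splits only on whitespace, the phrase has to be written with the same punctuation as the log line.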
Hi,
As @warkolm said, you can use the ingest API or Logstash.
It would help if you gave more information about the structure of your documents. As I understand from your first question, you have a "message" field holding a comma-separated string that contains something like a JSON dict, and your search is for a substring of a value inside that field.
Is that correct?
Also keep in mind that, depending on the structure of your documents, you may not be able to parse them if they are not normalized enough: you would end up with too many fields.
You can read about it here:
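
To make the ingest suggestion above a bit more concrete, here is a rough sketch of a pipeline body with a single grok processor, written as a Python dict. The target field name `hbase_table` is hypothetical, and the pattern assumes the table name ends at the first comma or whitespace character; both would need adjusting to the real log format. The helper only checks the regex locally and does not talk to Elasticsearch.

```python
import re

PREFIX = "查询async-hbase异常 table:"

# Grok accepts Oniguruma-style named groups, written (?<name>...).
# "hbase_table" is a hypothetical field name chosen for this sketch.
GROK_PATTERN = PREFIX + r"\s*(?<hbase_table>[^,\s]+)"

PIPELINE_BODY = {
    "description": "Extract the HBase table name from async-hbase error logs",
    "processors": [
        {
            "grok": {
                "field": "message",
                "patterns": [GROK_PATTERN],
                # Most log lines will not match; leave those documents untouched.
                "ignore_failure": True,
            }
        }
    ],
}


def extract_table(message):
    """Local check of the same regex, converted to Python's (?P<name>...) syntax."""
    match = re.search(GROK_PATTERN.replace("(?<", "(?P<"), message)
    return match.group("hbase_table") if match else None
```

Extracting one well-defined value like this keeps the number of fields small, which avoids the "too many fields" problem of trying to parse the whole message.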
After reading the doc you suggested,
I still have some doubts, so let me describe things in more detail.
My message field holds the log output of my application, i.e. whatever is passed to LOGGER.info(message), and it has already been indexed into Elasticsearch (changing how it is indexed is not my first choice), so it does not have a normalized structure.
Now I want to know whether I can aggregate only on message fields that contain “查询async-hbase异常 table”.
My expected result would look like this:
[{"key":"查询async-hbase异常 table:xx",
"docCount":55},
{"key":"查询async-hbase异常 table:yy",
"docCount":99}]