Datafeed query for Machine Learning: getting more than 10,000 partitions

Hi, I'm currently using this query in the datafeed of my ML jobs:

{
  "bool": {
    "filter": [
      {
        "bool": {
          "should": [
            {
              "exists": {
                "field": "cpu_per"
              }
            }
          ],
          "minimum_should_match": 1
        }
      },
      {
        "bool": {
          "should": [
            {
              "match_phrase": {
                "grupo.keyword": "Datacenter"
              }
            }
          ],
          "minimum_should_match": 1
        }
      }
    ]
  }
}

but I still get about 18,000 partitions, so I need a way to split the documents so that each job sees fewer than 10,000 partitions, running two jobs against the same index with the same detector. What would be the best way to do this?
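One idea I had is to add a script filter that hashes the partition field and keeps only half the documents in each job. This is just a sketch: `host.keyword` below is a placeholder for whatever field actually has the ~18,000 values, and I've collapsed the single-clause `should` wrappers into plain filters since they behave the same. The first job's query would be:

{
  "bool": {
    "filter": [
      {
        "exists": {
          "field": "cpu_per"
        }
      },
      {
        "match_phrase": {
          "grupo.keyword": "Datacenter"
        }
      },
      {
        "script": {
          "script": {
            "lang": "painless",
            "source": "Math.floorMod(doc['host.keyword'].value.hashCode(), 2) == 0"
          }
        }
      }
    ]
  }
}

The second job would use the same query with `== 1`, so between them the two jobs cover every document exactly once, each with roughly half of the partition values. Would something like this work?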

Your question needs a lot more context for us to understand it. Please describe your use case, what you want to accomplish, what the data looks like, what your proposed ML job config is, etc.

What field has 18,000 values?
