Help needed with MUST_NOT in a nested query


(Carlos Castellanos) #1

Hi there,

I have a recipients index containing a nested activity object which stores event types and campaign ids.

Here is an example:

{
    email: 'm1@example.com',
    userId: 'user-id'
    listId: 'list-id'
    campaignActivity: [
       { event: 'received', campaignId: 'c1', timestamp: 1 },
       { event: 'received', campaignId: 'c3', timestamp: 3 }
    ]
}

I'm not able to get working a query to get all recipients who didn't receive some of the campaigns.

Here is my current mapping:

{
  "recipients_index_test": {
    "mappings": {
      "recipients": {
        "properties": {
          "campaignActivity": {
            "type": "nested",
            "properties": {
              "campaignId": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "event": {
                "type": "text",
                "fields": {
                  "keyword": {
                    "type": "keyword",
                    "ignore_above": 256
                  }
                }
              },
              "timestamp": {
                "type": "long"
              }
            }
          },
          "email": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "id": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "listId": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          },
          "userId": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "ignore_above": 256
              }
            }
          }
        }
      }
    }
  }
}

Here is the sample data:

{
  email: 'nm1@example.com',
  userId: 'user-id',
  listId: 'list-id',
  campaignActivity: [
    { event: 'received', campaignId: 'c1', timestamp: 1 },
    { event: 'received', campaignId: 'c2', timestamp: 2 },
    { event: 'received', campaignId: 'c3', timestamp: 3 },
    { event: 'received', campaignId: 'c4', timestamp: 4 }
  ]
},
{
  email: 'nm2@example.com',
  userId: 'user-id',
  listId: 'list-id',
  campaignActivity: [
    { event: 'received', campaignId: 'c1', timestamp: 1 }
  ]
},
{
  email: 'm1@example.com',
  userId: 'user-id',
  listId: 'list-id',
  campaignActivity: []
},
{
  email: 'm2@example.com',
  userId: 'user-id',
  listId: 'list-id',
  campaignActivity: [
    { event: 'received', campaignId: 'c1', timestamp: 1 },
    { event: 'received', campaignId: 'c3', timestamp: 3 }
  ]
},
{
  email: 'm3@example.com',
  userId: 'user-id',
  listId: 'list-id',
  campaignActivity: [
    { event: 'received', campaignId: 'c2', timestamp: 2 },
    { event: 'received', campaignId: 'c3', timestamp: 3 }
  ]
}

I want to get recipients who didn't receive some of c2, c3 or c4 campaigns (by excluding recipients who received all campaigns c2, c3 and c4) This is what I've been trying without luck:

{
  "query": {
    "bool": {
      "must_not": [
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "filter": [
                  {
                    "term": {
                      "campaignActivity.campaignId.keyword": "c4"
                    }
                  },
                  {
                    "term": {
                      "campaignActivity.event.keyword": "received"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "filter": [
                  {
                    "term": {
                      "campaignActivity.campaignId.keyword": "c3"
                    }
                  },
                  {
                    "term": {
                      "campaignActivity.event.keyword": "received"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "filter": [
                  {
                    "term": {
                      "campaignActivity.campaignId.keyword": "c2"
                    }
                  },
                  {
                    "term": {
                      "campaignActivity.event.keyword": "received"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10
}

Above query returns nm2@example.com and m1@example.com but I'm looking for a result with m1@example.com, m2@example.com, and m3@example.com

Any help it's appreciated.
Carlos


(Carlos Castellanos) #2

UPDATE 1:

This is the closest I've gotten so far:

{
  "query": {
    "bool": {
      "filter": [
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "filter": [
                  {
                    "term": {
                      "campaignActivity.event.keyword": "received"
                    }
                  },
                  {
                    "terms": {
                      "campaignActivity.campaignId.keyword": [
                        "c4",
                        "c3",
                        "c2"
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      ],
      "should": [
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "must_not": [
                  {
                    "term": {
                      "campaignActivity.campaignId.keyword": "c4"
                    }
                  },
                  {
                    "term": {
                      "campaignActivity.event.keyword": "received"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "must_not": [
                  {
                    "term": {
                      "campaignActivity.campaignId.keyword": "c3"
                    }
                  },
                  {
                    "term": {
                      "campaignActivity.event.keyword": "received"
                    }
                  }
                ]
              }
            }
          }
        },
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "must_not": [
                  {
                    "term": {
                      "campaignActivity.campaignId.keyword": "c2"
                    }
                  },
                  {
                    "term": {
                      "campaignActivity.event.keyword": "received"
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10
}

But it still returns nm1@example.com which shouldn't be part of the result because it received all of the campaigns (c2,c3,c4) and is missing m1@example.com.


(Carlos Castellanos) #3

UPDATE 2: I managed to get it working with this query

{
  "query": {
    "bool": {
      "should": [
        {
          "bool": {
            "must_not": [
              {
                "nested": {
                  "path": "campaignActivity",
                  "query": {
                    "bool": {
                      "filter": [
                        {
                          "term": {
                            "campaignActivity.campaignId.keyword": "c4"
                          }
                        }
                      ]
                    }
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "nested": {
                  "path": "campaignActivity",
                  "query": {
                    "bool": {
                      "filter": [
                        {
                          "term": {
                            "campaignActivity.campaignId.keyword": "c3"
                          }
                        }
                      ]
                    }
                  }
                }
              }
            ]
          }
        },
        {
          "bool": {
            "must_not": [
              {
                "nested": {
                  "path": "campaignActivity",
                  "query": {
                    "bool": {
                      "filter": [
                        {
                          "term": {
                            "campaignActivity.campaignId.keyword": "c2"
                          }
                        }
                      ]
                    }
                  }
                }
              }
            ]
          }
        }
      ]
    }
  },
  "post_filter": {
    "bool": {
      "should": [
        {
          "bool": {
            "must_not": [
              {
                "nested": {
                  "path": "campaignActivity",
                  "query": {
                    "bool": {
                      "filter": {
                        "exists": {
                          "field": "campaignActivity"
                        }
                      }
                    }
                  }
                }
              }
            ]
          }
        },
        {
          "nested": {
            "path": "campaignActivity",
            "query": {
              "bool": {
                "filter": [
                  {
                    "terms": {
                      "campaignActivity.campaignId.keyword": [
                        "c4",
                        "c3",
                        "c2"
                      ]
                    }
                  }
                ]
              }
            }
          }
        }
      ]
    }
  },
  "from": 0,
  "size": 10
}

It works but I wonder if there are better ways to do this. Hope it helps to someone in the future.


(Rahul Desai) #4

You can try something like

{
	"query" : {
		"bool" : {
			"must" : [
				{
					"bool" : {
						"should" : [
							{
								//match c2
							},
							{
								//match c3
							},
							{
								//match c4
							}
						]
					}
				},
				{
					"bool" : {
						"must_not" : [
							{
								"bool" : {
									"must" : [
										{
											//match c2
										},
										{
											//match c3
										},
										{
											//match c4
										}
									]
								}
							}
						]
					}
				}
			]
		}
	}
}

(system) #5

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.