Common fields

Hi!

My proxy logs provide "URL" and "user" fields.

I would like to find all the "URL" values that two users have in common.
Is that possible?

Thank you,

Florent

If you are talking about any pair of users, this could be possible using a combination of the terms and cardinality aggregations, but it would require some tricks to scale if you have millions of unique URLs and distributed indices/shards. Is that the case?
If you are talking about a specific pair of users, then a query for those two users with a terms aggregation on the url field and a cardinality aggregation on the user field nested under it should suffice.

Yes, I am talking about a specific pair of users.
For instance, which "URL" values are common to "source_login":user1 and "source_login":user2?

The problem is that the OR request ("source_login":user1 OR "source_login":user2) returns the UNION of the accessed "URL" values.
I don't know how to get the INTERSECTION of the "URL" values accessed by these two users.

Thank you for your help,

Regards,

Florent

Try this:

DELETE test
PUT test
{
  "settings": {
    "number_of_shards": 1,
    "number_of_replicas": 0
  },
  "mappings": {
    "_doc": {
      "properties": {
        "url": {
          "type": "keyword"
        },
        "user": {
          "type": "keyword"
        }
      }
    }
  }
}
POST test/_doc/_bulk
{"index":{}}
{"user":"user1",  "url":"url1"}
{"index":{}}
{"user":"user1",  "url":"url2"}
{"index":{}}
{"user":"user2",  "url":"url2"}
{"index":{}}
{"user":"user2",  "url":"url2"}
{"index":{}}
{"user":"user2",  "url":"url3"}
{"index":{}}
{"user":"user3",  "url":"url3"}
{"index":{}}
{"user":"user3",  "url":"url4"}

GET test/_search
{
  "query": {
    "terms": {
      "user": ["user1", "user2"]
    }
  },
  "size": 0,
  "aggs": {
    "urls": {
      "terms": {
        "field": "url",
        "min_doc_count": 2,
        "order": {
          "numUsers": "desc"
        }
      },
      "aggs": {
        "numUsers": {
          "cardinality": {
            "field": "user"
          }
        }
      }
    }
  }
}
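
Note that "min_doc_count": 2 on its own does not guarantee that both users hit a URL: a URL that one user visited twice also passes it, and you have to read numUsers in each bucket to tell the difference. If you would rather have Elasticsearch drop those buckets for you, one option is a bucket_selector pipeline aggregation next to the cardinality aggregation. This is only a sketch on top of the request above; the names "bothUsers" and "n" are made up for illustration.

```
# "bothUsers" and "n" are illustrative names, not part of the request above
GET test/_search
{
  "query": {
    "terms": {
      "user": ["user1", "user2"]
    }
  },
  "size": 0,
  "aggs": {
    "urls": {
      "terms": {
        "field": "url",
        "min_doc_count": 2,
        "order": {
          "numUsers": "desc"
        }
      },
      "aggs": {
        "numUsers": {
          "cardinality": {
            "field": "user"
          }
        },
        "bothUsers": {
          "bucket_selector": {
            "buckets_path": {
              "n": "numUsers"
            },
            "script": "params.n == 2"
          }
        }
      }
    }
  }
}
```

On the sample data above this should leave only the url2 bucket. The terms aggregation returns 10 buckets by default, and the selector only sees the buckets that made it into that list, so raise "size" on the terms aggregation if you expect more common URLs than that.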
