Concepts suggester?

jy95 · October 11, 2019, 1:08am

Hello,

I would like to implement a Google-like search with Elasticsearch, like this example:

For example, suppose I have this collection of documents, where the properties are stored in a "tags" field (an array):

[
   {
      "title":"Some Example 1",
      "tags":[
         "tag1",
         "tag2",
         "tag3"
      ],
      "difficulty":"easy"
   },
   {
      "title":"Some Example 2",
      "tags":[
         "tag3"
      ],
      "difficulty":"normal"
   },
   {
      "title":"Some Example 3",
      "tags":[
         "tag3",
         "tag1"
      ],
      "difficulty":"normal"
   }
]

In formal concept analysis, this gives a table like:

Document	tag1	tag2	tag3
Some Example 1	X	X	X
Some Example 2			X
Some Example 3	X		X

I would like to have a suggester that performs a formal concept analysis on the "tags" field of each document, in addition to applying filters on other fields (like the "difficulty" one).
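To make the goal concrete, here is a minimal brute-force sketch in plain Python (not Elasticsearch) of what formal concept analysis yields for the three documents above. The function name `formal_concepts` is made up for illustration, and enumerating every subset of documents is exponential, so this only suits tiny examples.

```python
from itertools import combinations

# The example context: document title -> set of tags.
docs = {
    "Some Example 1": {"tag1", "tag2", "tag3"},
    "Some Example 2": {"tag3"},
    "Some Example 3": {"tag3", "tag1"},
}


def formal_concepts(context):
    """Return all (extent, intent) pairs of a small object -> attributes map."""
    all_tags = set().union(*context.values())
    names = list(context)
    concepts = set()
    for size in range(len(names) + 1):
        for group in combinations(names, size):
            # Intent: tags shared by every document in the group
            # (the empty group shares all tags by convention).
            intent = set(all_tags)
            for name in group:
                intent &= context[name]
            # Extent: every document that carries all tags of the intent.
            extent = {n for n, tags in context.items() if intent <= tags}
            concepts.add((frozenset(extent), frozenset(intent)))
    return concepts


for extent, intent in sorted(formal_concepts(docs), key=lambda c: len(c[0])):
    print(sorted(extent), sorted(intent))
```

For this context there are three concepts: all documents share "tag3", examples 1 and 3 share "tag1" and "tag3", and example 1 alone carries all three tags.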

From what I have read in the Elasticsearch docs, the adjacency matrix aggregation doesn't seem to handle this kind of search algorithm. Do you have any alternatives?

Thanks for your help

Mark_Harwood · October 11, 2019, 9:21am

I'm not too familiar with "formal concept analysis", but based on a quick skim of this description it relies on studying set intersections.
I think you should be able to use Elasticsearch to discover much of the required stats efficiently.

The adjacency matrix can be used to describe intersections of arbitrary sets. You use filters to define the elements you want to intersect, and each filter can be a single term or a more complex bool expression that combines multiple tags.
To get stats outside of intersections, the significant_terms aggregation might be of use: it returns counts of terms intersecting with your query and also the "background" stats of uses outside of your query set.
If you can say more about what your starting query is and what you hope to discover, that might guide the discussion further.
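As a rough sketch of the adjacency matrix idea, the Python below builds a search body with one named filter per tag. It assumes the "tags" field is mapped as a keyword; the helper name `adjacency_matrix_request` and the aggregation name `tag_intersections` are hypothetical.

```python
import json


def adjacency_matrix_request(tags, field="tags"):
    """Build a search body with one named term filter per tag."""
    filters = {tag: {"term": {field: tag}} for tag in tags}
    return {
        "size": 0,  # only the aggregation buckets are needed, not the hits
        "aggs": {
            "tag_intersections": {
                "adjacency_matrix": {"filters": filters}
            }
        },
    }


body = adjacency_matrix_request(["tag1", "tag2", "tag3"])
print(json.dumps(body, indent=2))
```

The response should contain one bucket per filter and one per non-empty pairwise intersection (keyed like "tag1&tag3"), each with a doc_count.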

jy95 · October 11, 2019, 10:53am

What I hope to discover is documents that are related to the same "concepts" (for example, in the programming world, if I search with the tags "for" and "while", the concept is "loops").

Concretely, my ideal query should have all these properties:

1. Have basic filters for fields that have known boundaries (like the "difficulty" field).
2. Have a dynamic filter / suggest for the "tags" field (their domain is open):
   - At the beginning, I should get all the tags (or the 20 most used) in descending order of their use (here, "tag3", "tag1", "tag2").
   - Then, if I choose the "tag3" tag, I should get the tags commonly used with it in descending order (with my given example: "tag1" and "tag2") so I can restrict the search.
3. Sort the documents according to their result with what is explained in point 2 (most relevant documents first).
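The behaviour described in the first two points can be simulated locally on the three example documents. This is only a plain-Python illustration of the expected results, not an Elasticsearch query; the function name `suggest_tags` and its parameters are invented for the sketch.

```python
from collections import Counter

docs = [
    {"title": "Some Example 1", "tags": ["tag1", "tag2", "tag3"], "difficulty": "easy"},
    {"title": "Some Example 2", "tags": ["tag3"], "difficulty": "normal"},
    {"title": "Some Example 3", "tags": ["tag3", "tag1"], "difficulty": "normal"},
]


def suggest_tags(documents, selected=(), difficulty=None, size=20):
    """Count the tags that co-occur with the already selected tags."""
    selected = set(selected)
    counts = Counter()
    for doc in documents:
        # Basic filter on a field with a known, closed domain.
        if difficulty is not None and doc["difficulty"] != difficulty:
            continue
        # Keep only documents that carry every selected tag.
        if not selected <= set(doc["tags"]):
            continue
        # Do not suggest tags the user has already picked.
        counts.update(t for t in doc["tags"] if t not in selected)
    return counts.most_common(size)


print(suggest_tags(docs))                     # initial list of tags
print(suggest_tags(docs, selected=["tag3"]))  # after choosing "tag3"
```

The first call ranks "tag3", "tag1", "tag2"; the second ranks "tag1" before "tag2", matching the example in point 2.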

Mark_Harwood · October 11, 2019, 12:24pm

I think you just want to use the significant_terms aggregation on the tags field.
Here are the significant tags for a search on StackOverflow questions talking about loops/looping:

[Image: Kibana-35]

If your query is fuzzy in any way (things match to varying degrees) then you probably want to use significant terms in conjunction with the sampler aggregation.
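A minimal sketch of that combination, written as a Python dict that mirrors the request body. It assumes a `title` text field and a `tag` keyword field; the helper name, the `best_matches` aggregation name and the `shard_size` of 200 are arbitrary choices for illustration.

```python
import json


def sampled_significant_terms_request(text, shard_size=200):
    """Wrap significant_terms in a sampler so only top-scoring hits are analysed."""
    return {
        "query": {"match": {"title": text}},
        "aggs": {
            "best_matches": {
                # Only the best-scoring documents per shard feed the sub-aggregation.
                "sampler": {"shard_size": shard_size},
                "aggs": {
                    "my_refinement_suggestions": {
                        "significant_terms": {"field": "tag"}
                    }
                },
            }
        },
    }


print(json.dumps(sampled_significant_terms_request("loop loops looping"), indent=2))
```

The point of the sampler is that loosely matching documents at the tail of the result set do not dilute the suggested tags.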

jy95 · October 11, 2019, 6:14pm

Thanks for the clarification.
Can you illustrate that with a query for my given example?

Mark_Harwood · October 14, 2019, 9:44am

Something like this:

GET stackoverflow/_search
{
  "query": {
    "match": {
      "title": "loop loops looping"
    }
  },
  "aggs": {
    "my_refinement_suggestions": {
      "significant_terms": {
        "field": "tag"
      }
    }
  }
}
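Adapted to the three-document example from the first post, the same idea might look like the sketch below. It is an assumption-laden adaptation, not the query given above: it assumes "tags" and "difficulty" are keyword fields, the helper name `refinement_request` is hypothetical, and the already selected tags are excluded so they are not suggested again.

```python
import json


def refinement_request(selected_tags, difficulty=None):
    """Build a search body that suggests tags related to the selected ones."""
    filters = [{"term": {"tags": tag}} for tag in selected_tags]
    if difficulty is not None:
        filters.append({"term": {"difficulty": difficulty}})
    return {
        "query": {"bool": {"filter": filters}},
        "aggs": {
            "my_refinement_suggestions": {
                "significant_terms": {
                    "field": "tags",
                    # Skip tags the user has already chosen.
                    "exclude": list(selected_tags),
                }
            }
        },
    }


print(json.dumps(refinement_request(["tag3"], difficulty="normal"), indent=2))
```

With no tag selected, the foreground set equals the background, so a plain terms aggregation is probably a better fit for the initial list of most-used tags.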

