Elasticsearch graph query not working

James_Crone · May 31, 2016, 6:39am

I am new to Elasticsearch Graph. I installed it successfully, but when I choose an index name, fields, etc., no data is returned.
Here's my query:
{
  "query": {
    "query_string": {
      "default_field": "_all",
      "query": "male"
    }
  },
  "controls": {
    "use_significance": true,
    "sample_size": 100,
    "timeout": 100000
  },
  "connections": {
    "vertices": [
      {
        "field": "gender",
        "size": 5,
        "min_doc_count": 3
      },
      {
        "field": "persona_fname",
        "size": 5,
        "min_doc_count": 3
      },
      {
        "field": "persona_lname",
        "size": 5,
        "min_doc_count": 3
      }
    ]
  },
  "vertices": [
    {
      "field": "gender",
      "size": 5,
      "min_doc_count": 3
    },
    {
      "field": "persona_fname",
      "size": 5,
      "min_doc_count": 3
    },
    {
      "field": "persona_lname",
      "size": 5,
      "min_doc_count": 3
    }
  ]
}

If I change the query to the following and post it with Postman:

{
  "query": {
    "match": {
      "gender": "male."
    }
  },
  "controls": {
    "use_significance": true,
    "sample_size": 100,
    "timeout": 100000
  },
  "connections": {
    "vertices": [
      {
        "field": "gender",
        "size": 5,
        "min_doc_count": 3
      },
      {
        "field": "persona_fname",
        "size": 5,
        "min_doc_count": 3
      },
      {
        "field": "persona_lname",
        "size": 5,
        "min_doc_count": 3
      }
    ]
  },
  "vertices": [
    {
      "field": "gender",
      "size": 5,
      "min_doc_count": 3
    },
    {
      "field": "persona_fname",
      "size": 5,
      "min_doc_count": 3
    },
    {
      "field": "persona_lname",
      "size": 5,
      "min_doc_count": 3
    }
  ]
}

It returns data with vertices and weights. What should I do in the Graph settings, or what am I missing?

Mark_Harwood · May 31, 2016, 9:18am

Looking at the first example query, I'm not sure what problem you're trying to solve using Graph. The fields don't look like useful ones to draw in a network. It's important to start with an idea of what might be a useful thing to do with your data. I'll give a real case of something useful at the end of this post but for now let's break down what is happening in your example.

I presume you have one doc per person with gender, fname and lname.
Breaking your request down the steps are:

1. Query for all males.
2. Find significant values in these fields:
   a) gender - clearly male will be the only gender you'd expect to find in a query for males
   b) first name - I'd expect to see "dave", "john" etc. here as significantly associated with males
   c) last name - individual surnames are not aligned to the male gender so these will be wholly insignificant selections e.g. smith
3. For the values found in 2), find significant others. We would expect a query for male, dave, john, smith etc. to match docs that were mostly males (although with a handful of common surnames and a big sample size that will match many females too). This is not a particularly well-focused set of docs in which to go looking for significant connections. The results, if any, are going to be spurious.
   A small sample will consist entirely of johns and daves and so there are no new first names to find. The genders will all be male (which we already knew from 2a). Any last names are equally going to be weak connections.

A more practical example based on this sort of data is creating a forename thesaurus which can be used to discover name abbreviations and common typos. This is something I did using billions of a bank's records and was able to derive a name thesaurus using this simple file structure as input:

customerID Name

534242131 Bob
534242131 Robert
657464534 Alice
657464534 Sue

So the data format is the bank's unique ID for a person and every recorded name that person had ever used in interactions with the bank. Each person can then be represented with a JSON doc like this:

{
   "id": 657464534,
   "names": ["Alice", "Sue"]
}
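
For readers who want to try this, here is a minimal Python sketch of the grouping step, i.e. turning the flat customerID/Name rows into one document per person. The function name `group_names_by_customer` and the in-memory `rows` list are illustrative assumptions, not something from the original pipeline, which presumably ran at a very different scale.

```python
from collections import defaultdict


def group_names_by_customer(rows):
    """Collapse (customer_id, name) rows into one document per customer.

    Hypothetical helper for illustration; not the original tooling.
    """
    names_by_id = defaultdict(list)
    for customer_id, name in rows:
        # Record each name once per customer, keeping first-seen order.
        if name not in names_by_id[customer_id]:
            names_by_id[customer_id].append(name)
    return [{"id": cid, "names": names} for cid, names in names_by_id.items()]


# The four sample rows from the table above.
rows = [
    (534242131, "Bob"),
    (534242131, "Robert"),
    (657464534, "Alice"),
    (657464534, "Sue"),
]
docs = group_names_by_customer(rows)
```

Each element of `docs` has the same shape as the JSON document shown above and could then be indexed with whatever client you normally use.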

Now not many people who called themselves Alice also called themselves Sue at some point so this is not an example of significant connection between names. However, when you look at enough examples of these records Graph will draw out that "Bob" and "Robert" are indeed strongly connected. This is reinforced through many examples (or as many as min_doc_count requires for weight-of-evidence). The resulting weighted graph is pretty interesting (small example below):

The weights of associations (not shown) tell us for example that "janes" is much more likely to be "james" than "jane". This is a behavioural side-effect of the m and n keys being next to each other on the keyboard.

This sort of data analysis is much more the sort of thing that Graph is tuned to help with. The default configuration is trying to identify Bob->Robert connections and tune out the Alice->Sue noise so if you want to just explore all connections, follow the setting suggestions in [1]
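
As a rough sketch of what "explore all connections" might look like in practice: my understanding of the troubleshooting page in [1] is that it suggests turning off `use_significance`, raising `sample_size`, and lowering `min_doc_count` to 1. The Python helper below applies those three changes to a request body shaped like the ones earlier in this thread. The helper name `relax_graph_controls` and the `sample_size` value of 2000 are arbitrary assumptions for illustration, so check the linked page for the current recommendations.

```python
import copy


def relax_graph_controls(request_body, sample_size=2000):
    """Return a copy of a Graph explore request with the noise filters loosened.

    Hypothetical helper; the sample_size default is an arbitrary example value.
    """
    relaxed = copy.deepcopy(request_body)
    controls = relaxed.setdefault("controls", {})
    # Keep terms even when they are not statistically significant.
    controls["use_significance"] = False
    # Look at more documents per shard.
    controls["sample_size"] = sample_size
    # Require only one supporting document per vertex, for both the
    # top-level vertices and the ones under "connections".
    vertex_lists = [
        relaxed.get("vertices", []),
        relaxed.get("connections", {}).get("vertices", []),
    ]
    for vertices in vertex_lists:
        for vertex in vertices:
            vertex["min_doc_count"] = 1
    return relaxed


original = {
    "query": {"match": {"names": "bob"}},
    "controls": {"use_significance": True, "sample_size": 100},
    "vertices": [{"field": "names", "size": 5, "min_doc_count": 3}],
    "connections": {"vertices": [{"field": "names", "size": 5, "min_doc_count": 3}]},
}
relaxed = relax_graph_controls(original)
```

The original request is left untouched, so you can run both versions and compare how many vertices and connections come back.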

Hope this helps

Mark

[1] https://www.elastic.co/guide/en/graph/current/graph-troubleshooting.html#_why_are_results_missing

James_Crone · May 31, 2016, 9:52am

Thank you for your reply. I understand your example.
Actually, my problem is related to this question:

If users bought this type of gardening gloves, what other products might they be interested in?

Here is my index data:

{"tran_date": "2011-12-13","amount": 3540.92,"venue": "St.101 Wales","voucher": "NX6RMQ","voucher_value": 9,"points": 5,"data_source": "user","discount": 6,"payment_by": "Card","tax_type": "vat","tax_value": "12.0","currency": "$","category": "flight","product_name": "Ipod","sub_product_name": "Ipod","product_des": "asdf","product_type": "Ipod","product_unit_price": 3540.92,"product_qty": "1"},
{"tran_date": "2012-03-01","amount": 637.6,"venue": "St.101 Wales","voucher": "3MZLYD","voucher_value": 9,"points": 4,"data_source": "user","discount": 5,"payment_by": "Card","tax_type": "vat","tax_value": "12.0","currency": "$","category": "flight","product_name": "Ipod","sub_product_name": "Ipod","product_des": "asdf","product_type": "Ipod","product_unit_price": 637.6,"product_qty": "1"}

I want to create a graph to predict products.
For example: user 'A' purchases an iPod. If I later add a laptop, how can I predict that user 'A' will want to buy it, because he is interested in electronics such as the iPod? Is Graph the right way to approach this problem? If yes, how can I achieve it?

Mark_Harwood · May 31, 2016, 2:19pm

You can start from either end (laptop or User A) and find the other.
Let's assume you have a buyer-centric index like the doc I used for figuring out which first names are strongly related but instead of person names you have an array of SKUs (product codes) that each user has purchased. I see you also have product types and categories - these could also be stored in each user's purchase history.
Using these documents you can then draw out the strong connections e.g. people who buy ipods have a tendency to buy Beats headphones. This is the same principle as people who call themselves Robert also tend to call themselves Bob.
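
To make the buyer-centric idea concrete, here is a hedged Python sketch. It assumes each transaction carries a buyer identifier, here called `user_id`. That field is hypothetical, because the sample transactions posted above do not include one. The output field name `products` and both helper names are also illustrative choices. The request body only mirrors the `query` / `vertices` / `connections` layout of the queries earlier in this thread. The sample transactions are invented for the example.

```python
from collections import defaultdict


def build_buyer_docs(transactions):
    """Group transactions into one document per buyer.

    Assumes a hypothetical "user_id" field on every transaction.
    """
    products_by_user = defaultdict(set)
    for tx in transactions:
        products_by_user[tx["user_id"]].add(tx["product_name"])
    return [
        {"user_id": uid, "products": sorted(products)}
        for uid, products in products_by_user.items()
    ]


def build_explore_body(product):
    """Build a Graph explore body that starts from buyers of one product."""
    return {
        "query": {"match": {"products": product}},
        # Vertices and connections both use the (hypothetical) products field,
        # so the graph links products to other products.
        "vertices": [{"field": "products"}],
        "connections": {"vertices": [{"field": "products"}]},
    }


# Made-up transactions for illustration only.
transactions = [
    {"user_id": "A", "product_name": "Ipod"},
    {"user_id": "A", "product_name": "Beats headphones"},
    {"user_id": "B", "product_name": "Ipod"},
]
buyer_docs = build_buyer_docs(transactions)
body = build_explore_body("Ipod")
```

After indexing `buyer_docs`, sending `body` to the Graph explore endpoint for that index should surface products that co-occur with the starting product, subject to the significance and `min_doc_count` defaults discussed earlier in the thread.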
