Search for a value across multiple fields


(Luca Pau) #1

Hello everyone,
I'm learning to use Elasticsearch and I've run into a small problem.

My goal is to search for a value across multiple fields and return each
distinct matching value together with its count.

As far as I can tell, I have to use facets to do this.

These are the index settings and the mapping:

index:
  analysis:
    analyzer:
      custom_search_analyzer:
        type: custom
        tokenizer: standard
        filter: [standard, snowball, lowercase, asciifolding]
      custom_index_analyzer:
        type: custom
        tokenizer: standard
        filter: [standard, snowball, lowercase, asciifolding, custom_filter]
    filter:
      custom_filter:
        type: edgeNGram
        side: front
        min_gram: 1
        max_gram: 20

{
  "structure": {
    "properties": {
      "name": {"type": "string", "search_analyzer": "custom_search_analyzer", "index_analyzer": "custom_index_analyzer"},
      "locality": {"type": "string", "search_analyzer": "custom_search_analyzer", "index_analyzer": "custom_index_analyzer"},
      "province": {"type": "string", "search_analyzer": "custom_search_analyzer", "index_analyzer": "custom_index_analyzer"},
      "region": {"type": "string", "search_analyzer": "custom_search_analyzer", "index_analyzer": "custom_index_analyzer"}
    }
  }
}
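
To picture what custom_index_analyzer stores for these fields, here is a rough Python sketch. It is not Elasticsearch code: it only imitates lowercasing and a front edge n-gram filter with min_gram 1 and max_gram 20, and it leaves out the snowball and asciifolding steps. The function names are made up for illustration.

```python
def edge_ngrams(token, min_gram=1, max_gram=20):
    """Return the front edge n-grams (prefixes) of a single token."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text):
    """Simplified stand-in for the index analyzer (assumption):
    whitespace split, lowercase, then edge n-grams per token."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(token))
    return terms


for value in ["Bologna", "Bolognano"]:
    # Each value is stored as all of its prefixes, shortest first here.
    print(value, sorted(index_terms(value), key=len))
```

Under these assumptions both "Bologna" and "Bolognano" are stored with the term "bolo", which is why a partial input such as "bolo" can match either of them.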

and this is the query that I tried to use:

{
  "query": {
    "bool": {
      "should": [
        { "match": { "locality": "bolo" } },
        { "match": { "region": "bolo" } },
        { "match": { "name": "bolo" } }
      ]
    }
  },
  "facets": {
    "region": {
      "query": { "term": { "region": "bolo" } }
    },
    "locality": {
      "query": { "term": { "locality": "bolo" } }
    },
    "name": {
      "query": { "term": { "name": "bolo" } }
    }
  }
}

Of everything I have tried, this query comes closest to the result I want.
However, it does not give me a count per distinct field value; it only
gives the total number of matches for each field.

For example, the above query returns the following result:

facets: {
  region: {
    _type: query
    count: 0
  }
  locality: {
    _type: query
    count: 2
  }
  name: {
    _type: query
    count: 0
  }
}

I would like to have a result like this (the syntax is obviously not
exact, but it should show what I need):

facets: {
  ....
  locality: {
    _type: query
    "terms": [
      {"term": "Bologna", "count": 1},
      {"term": "Bolognano", "count": 1}
    ]
  }
}

How can I do this?

I have already tried using "terms" instead of "query" in the facets and
setting "index: not_analyzed" on the searched fields, but then a result is
only returned when I search for the exact value, not for part of it!
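
The two attempts described above pull in opposite directions: the analyzed field matches partial input but is stored as fragments, while a not_analyzed field keeps whole values but only matches exact input. The Python sketch below separates the two roles, matching on prefixes and counting on the untouched value. It is only a simulation under simplifying assumptions (single-word values, lowercasing plus edge n-grams as the whole analysis chain), and all names are hypothetical. In Elasticsearch this would presumably correspond to faceting on a not_analyzed copy of the field kept alongside the analyzed one; that mapping is an assumption and does not appear in this thread.

```python
from collections import Counter

# Two sample documents, mirroring the localities mentioned in the post.
docs = [{"locality": "Bologna"}, {"locality": "Bolognano"}]


def prefixes(value, max_gram=20):
    """Lowercased front edge n-grams of a single-word value."""
    token = value.lower()
    return {token[:n] for n in range(1, min(len(token), max_gram) + 1)}


def matches(doc, text):
    """Partial match: the input must equal one of the stored prefixes."""
    return text.lower() in prefixes(doc["locality"])


def facet_on_analyzed(docs, text):
    """Count the stored terms of matching docs: yields prefix fragments."""
    counts = Counter()
    for doc in docs:
        if matches(doc, text):
            counts.update(prefixes(doc["locality"]))
    return counts


def facet_on_raw(docs, text):
    """Count the original, untouched values of matching docs."""
    return Counter(doc["locality"] for doc in docs if matches(doc, text))


print(facet_on_analyzed(docs, "bolo"))
print(facet_on_raw(docs, "bolo"))
```

The second function produces one count per whole value, which is the shape asked for in the desired result, while the first one counts fragments such as "b" and "bo".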

Thanks

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/9f74b8be-b17a-4c44-b738-7f0b1ea54750%40googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.


(Binh Ly) #2

The closest you can probably get is to use a script and then sum the
results (but you need to know the exact terms you are querying up front).
It could be slow if you have lots of documents, but here is more information:

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/modules-advanced-scripting.html

For example:

POST _search
{
  "script_fields": {
    "locality_bologna": {
      "script": "_index['locality']['bologna'].tf()"
    }
  }
}
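
As a rough illustration of "script and then sum", the Python sketch below computes a per-document term frequency for one known term and adds the values up afterwards. It is a simulation, not the scripting API: the document ids and token lists are hypothetical, and the tokens are plain lowercased words rather than the edge n-grams produced by the analyzer in the first post.

```python
def term_frequency(tokens, term):
    """How many times `term` occurs among the tokens of one field of one document."""
    return tokens.count(term)


# Hypothetical documents: id -> field name -> indexed tokens.
documents = {
    1: {"locality": ["bologna"]},
    2: {"locality": ["bologna"]},
    3: {"locality": ["bolognano"]},
}

# The script field yields one value per document ...
per_document = {
    doc_id: term_frequency(fields["locality"], "bologna")
    for doc_id, fields in documents.items()
}

# ... and the sum has to be taken afterwards, for a term known up front.
total = sum(per_document.values())
print(per_document, total)
```

This also shows the limitation mentioned above: the term "bologna" has to be chosen in advance, so other values such as "bolognano" are not discovered or counted on their own.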



(Luca Pau) #3

Hello,
Thank you for your response.
Unfortunately this is not what I wanted.

I'll explain what I'm trying to do, so maybe you can suggest a different
route from the one I'm taking.

I need to build a "smart" autocomplete, in the sense that every suggestion
has to carry some extra information with it.

For example:

I search for "bologna" (my city) and I have to run a query (or several,
though I'd like to keep the number as low as possible) that looks for
"bologna" in the fields "name", "locality" and "region".
If it is found, I need to count how many structures there are in Bologna
(which is why I was trying to use facets).

Obviously I could do this on the server side, for example with a loop in
PHP, but the response would be very slow and all the speed of
Elasticsearch would be wasted.

At this point I wonder how I could do this.

Thanks

In reply to Binh Ly (binh@hibalo.com), 2014-02-13 20:51 GMT+01:00.


(Binh Ly) #4

I'm still not 100% sure I understand. Is this something that might work?

{
  "query": {
    "multi_match": {
      "query": "bologna",
      "fields": [
        "locality",
        "region"
      ]
    }
  },
  "script_fields": {
    "bologna_count": {
      "script": "_index['locality']['bologna'].tf() + _index['region']['bologna'].tf()"
    }
  }
}



(Luca Pau) #5

We're almost there!
This is the result of the query that I have posted:

hits: {
  total: 3
  max_score: 4.724929
  hits: [
    {
      _index: website
      _type: structure
      _id: 7
      _score: 4.724929
      fields: {
        bologna_count: 0
      }
    }
    {
      _index: website
      _type: structure
      _id: 8
      _score: 4.724929
      fields: {
        bologna_count: 0
      }
    }
    {
      _index: website
      _type: structure
      _id: 6
      _score: 4.724929
      fields: {
        bologna_count: 0
      }
    }
  ]
}

It did find all three records that contain "bologna" in "locality" and
"region", but bologna_count is always 0 and the result is repeated 3
times.
For example, these are the localities of the three records found:
"bologna"
"bologna"
"Bolognano"

What I would like to get back is something like this:
hits: [
  {
    _index: website
    _type: structure
    _score: 4.724929
    fields: {
      count: 2
      locality: "bologna"
    }
  }
  {
    _index: website
    _type: structure
    _score: 4.724929
    fields: {
      count: 1
      locality: "bolognano"
    }
  }
]

so that I know there are 2 records with the locality "bologna" and 1 with
"bolognano".

Thanks
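
To pin down the grouped result described in this post, here is a small Python sketch that collapses hits into one entry per distinct locality with a count. It runs on the client side and only illustrates the desired output shape; it is not an Elasticsearch feature. The hit structure is a simplified assumption, and the helper name is hypothetical.

```python
from collections import Counter


def count_by_field(hits, field):
    """Group hits by the lowercased value of `field` and count each group."""
    return Counter(hit["fields"][field].lower() for hit in hits)


# Simplified hits carrying the three localities listed in the post.
hits = [
    {"fields": {"locality": "bologna"}},
    {"fields": {"locality": "bologna"}},
    {"fields": {"locality": "Bolognano"}},
]

# One entry per distinct value, most frequent first.
for value, count in count_by_field(hits, "locality").most_common():
    print({"count": count, "locality": value})
```

Doing this over every hit in application code is the slow loop the post wants to avoid, so the sketch is meant as a specification of the output rather than a recommended approach.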

In reply to Binh Ly, Friday 14 February 2014, 13:42:14 UTC+1.


(Luca Pau) #6

Sorry, typo:
"This is the result of the query you posted:"

In reply to Luca Pau, Friday 14 February 2014, 14:29:38 UTC+1.

