Missing terms API


(Tomislav Poljak) #1

Hi,
I have a requirement to display all (distinct) terms from one or more
fields, sorted alphabetically and paginated (all terms need to be
displayed, not only the most popular ones), for something like a simple
auto suggest/dictionary.

I checked the API earlier and I think the Terms API was ideal for this. Now
that the Terms API has been removed, I don't know whether or how I can
achieve the same functionality (especially 'prefix', 'pagination' and
'sort = term') with facets.

I understand that executing a faceted search with a result size of 0, like this:

curl -XGET 'http://localhost:9200/index/test/_search' -d '
{
  "size" : 0,
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "tag" : {
      "terms" : {
        "field" : "text",
        "size" : 10000
      }
    }
  }
}
'

will give me the terms from field 'text', but how can I get all terms (not
only the first "size" of them), sorted alphabetically and paginated
(without the Terms API)?

Thanks,
Tomislav


(ppearcy) #2

I've seen this asked a few times on the list and almost asked it once
myself :)

It was replaced by the facets API. Specifically, the terms facet will give
you what you need.


(ppearcy) #3

Sorry, I didn't read the whole question. Disregard!


(Clinton Gormley) #4

will give me terms from field 'text', but how can I get all terms (not
only "size"), alphabetically sorted and with pagination (without Terms
API)?

I don't think you can, any more.

You would have to use a very large 'size', then sort them alphabetically
in your client.
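
As a rough illustration of that client-side approach, here is a minimal Python sketch. It assumes the facet response has the shape {"facets": {<name>: {"terms": [{"term": ..., "count": ...}]}}}; the function name paginate_terms and the sample data are hypothetical, and it only pages over whatever terms the facet actually returned (so 'size' still has to be large enough).

```python
def paginate_terms(facet_response, facet_name, page, page_size, prefix=""):
    """Return one alphabetically sorted page of distinct facet terms.

    Assumes each facet entry is a dict with a "term" key (an assumption
    about the response shape, not something confirmed in this thread).
    """
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be >= 1")
    entries = facet_response["facets"][facet_name]["terms"]
    # De-duplicate, optionally narrow by prefix, then sort alphabetically.
    terms = sorted({e["term"] for e in entries if e["term"].startswith(prefix)})
    start = (page - 1) * page_size
    return terms[start:start + page_size]


# Hypothetical response, trimmed to the keys this sketch reads.
sample_response = {
    "facets": {
        "tag": {
            "terms": [
                {"term": "search", "count": 12},
                {"term": "apple", "count": 7},
                {"term": "index", "count": 5},
                {"term": "alpha", "count": 2},
            ]
        }
    }
}

print(paginate_terms(sample_response, "tag", page=1, page_size=2))
print(paginate_terms(sample_response, "tag", page=1, page_size=10, prefix="a"))
```

Sorting in the client keeps the request simple, at the cost of transferring every term on each call, so caching the sorted list between page requests is probably worthwhile.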

This seems an unusual use case though. Why do you need to display
(potentially) millions of terms?

Surely the user really just wants a list of (eg) 10, which are the most
frequently used?

clint


(Shay Banon) #5

I have been enhancing the terms facet in master (0.9.1). For example, you
can now sort by term and not just by count, and you can use a regex to
control which terms get included.

-shay.banon


(Tomislav Poljak) #6

Hi Clint,

I understand your point of view when considering faceting, but this
requirement is more in the direction of a simple auto suggest/dictionary
(with terms from the index), where you need to display all terms, not only
the most popular ones (and narrow the results by prefix).

As the Terms API definition put it: 'This can be very handy to implement
things like tag clouds or simple auto suggest.'

(tag clouds can be implemented by facets as they are)

Tomislav


(Tomislav Poljak) #7

Hi Shay,
sounds great!

I've tested the regex param in the terms facet on the latest master and it
works as expected.

Is the 'sort' feature released into current master? I'm asking because I
downloaded the latest master, built it from source, and for a query like:

curl -XGET 'http://localhost:9200/index/type/_search' -d '
{
  "query" : {
    "match_all" : { }
  },
  "facets" : {
    "tag" : {
      "terms" : {
        "field" : "text",
        "regex" : "a.*",
        "size" : 1000,
        "sort" : "term"
      }
    }
  }
}
'

I didn't get the results (terms) sorted alphabetically in the response
(they were sorted by count). Did I maybe misplace or misuse the 'sort'
param in the request?

Tomislav


(Shay Banon) #8

It's actually called "order"; here is the issue:
http://github.com/elasticsearch/elasticsearch/issues/closed#issue/303.
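
To make the rename concrete, here is a minimal Python sketch that builds the same request body as the query in this thread, with "order" in place of "sort". The helper name build_terms_facet_body and the idea of turning a prefix into a regex with re.escape are assumptions for illustration only; Python's escaping is assumed to be compatible with the server's regex syntax for simple alphanumeric prefixes.

```python
import json
import re


def build_terms_facet_body(field, prefix="", size=1000):
    """Build a terms facet request body sorted by term.

    Hypothetical helper: only the "order", "regex", "field" and "size"
    keys come from the thread; everything else is illustrative.
    """
    terms = {"field": field, "size": size, "order": "term"}
    if prefix:
        # Escape the prefix so it is matched literally, then allow any suffix.
        terms["regex"] = re.escape(prefix) + ".*"
    return {
        "size": 0,
        "query": {"match_all": {}},
        "facets": {"tag": {"terms": terms}},
    }


body = build_terms_facet_body("text", prefix="a")
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the -d payload of the curl command shown earlier in the thread.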

