Aggregation of hierarchical elements possible?


(Markus Breuer) #1

The index has a field named "path" which contains the canonical file name, e.g.:

/a/file1
/a/file2
/a/b/file3

Is it possible to create a bucket aggregation that counts all files per path, including files in subfolders?

Something like this:

/a => 3 files
/a/b => 1 file
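
For illustration, the desired counts can be reproduced outside Elasticsearch with a minimal Python sketch. It is not part of the original post: the function name count_files_per_folder is hypothetical, and it assumes every value is an absolute file path that uses "/" as the separator.

```python
from collections import Counter
from pathlib import PurePosixPath


def count_files_per_folder(paths):
    """Count each file once for its folder and once for every ancestor folder."""
    counts = Counter()
    for path in paths:
        # The parents of /a/b/file3 are /a/b, /a and /; the root is skipped.
        for folder in PurePosixPath(path).parents:
            if str(folder) != "/":
                counts[str(folder)] += 1
    return dict(counts)


paths = ["/a/file1", "/a/file2", "/a/b/file3"]
print(count_files_per_folder(paths))
```

Each file contributes to its own folder and to all folders above it, which is what "including subfolders" asks for.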

regards,
markus


(vineeth mohan-2) #2

Hello Markus,

I can't think of a straightforward method, but you could try the
following:

  1. Apply a source transform script to convert /a/b/c => [ /a, /a/b,
    /a/b/c ] -
    http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/mapping-transform.html#mapping-transform
  2. Then apply a normal terms aggregation.
  3. Queries on this field will then also match /a and /a/b, so keep a
    raw (not analyzed) copy of the field as well.
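
The three steps above can be sketched in plain Python. This is only an illustration under stated assumptions, not the transform script itself: expand_path, match_tree and match_raw are hypothetical helpers, and the field names path_tree and path_raw are invented for the example.

```python
def expand_path(path, delimiter="/"):
    """Step 1: turn /a/b/c into ['/a', '/a/b', '/a/b/c']."""
    parts = [p for p in path.split(delimiter) if p]
    return [delimiter + delimiter.join(parts[: i + 1]) for i in range(len(parts))]


# Each document keeps the expanded list (for aggregating) and the raw value
# (for exact matching). Both field names are assumptions for this sketch.
docs = [
    {"path_raw": p, "path_tree": expand_path(p)}
    for p in ["/a/file1", "/a/file2", "/a/b/file3"]
]


def match_tree(docs, term):
    """Step 3, the problem: a term lookup on the expanded field also hits descendants."""
    return [d["path_raw"] for d in docs if term in d["path_tree"]]


def match_raw(docs, term):
    """Step 3, the remedy: the raw field only matches the exact path."""
    return [d["path_raw"] for d in docs if d["path_raw"] == term]


print(expand_path("/a/b/c"))
print(match_tree(docs, "/a"))
print(match_raw(docs, "/a/file1"))
```

Looking up /a on the expanded field returns all three documents, while the raw field returns a document only when the whole path is equal.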

Thanks
Vineeth

On Mon, Sep 1, 2014 at 9:21 PM, skippi1 skippi1@gmx.de wrote:


--
View this message in context:
http://elasticsearch-users.115913.n3.nabble.com/aggregation-of-hierchical-elements-possible-tp4062768.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an
email to elasticsearch+unsubscribe@googlegroups.com.
To view this discussion on the web visit
https://groups.google.com/d/msgid/elasticsearch/1409586703001-4062768.post%40n3.nabble.com
.
For more options, visit https://groups.google.com/d/optout.



(Markus Breuer) #3

Hello Vineeth,

thanks for your response. Your proposal 1 seems to be similar to the
path tokenizer that I already use, doesn't it?

"settings" : {
  "index" : {
    "analysis" : {
      "analyzer" : {
        "path-analyzer" : {
          "type" : "custom",
          "tokenizer" : "path-tokenizer"
        }
      },
      "tokenizer" : {
        "path-tokenizer" : {
          "type" : "path_hierarchy",
          "delimiter" : "/"
        }
      }
    }
  }
}

But the result of the terms aggregation does not look correct. The
following query should aggregate per folder and sum the length of all
files in that folder and its subfolders. The query returns some buckets,
but the result seems to be incomplete. Can you explain how you would
apply point 3 of your proposal?

{
  "aggs" : {
    "file_count" : {
      "terms" : {
        "field" : "path",
        "order" : {
          "_term" : "asc"
        }
      },
      "aggs" : {
        "file_size" : {
          "sum" : {
            "field" : "length"
          }
        }
      }
    }
  },
  "size" : 0
}
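
A rough Python model of what this request computes may help when checking the returned buckets. It is only a sketch and does not call Elasticsearch: the path_hierarchy tokenizer is emulated by a hypothetical helper path_tokens, the sample file lengths are invented, and size=10 mirrors the default bucket limit of the terms aggregation. Since the request sets no "size" inside "terms" (the top-level "size" : 0 only suppresses hits), just the first 10 buckets in the requested order would come back. That is one possible reason why the result looks incomplete, but it is an assumption and not confirmed in this thread.

```python
from collections import defaultdict


def path_tokens(path, delimiter="/"):
    """Emulate path_hierarchy: /a/b/file3 -> /a, /a/b, /a/b/file3."""
    parts = [p for p in path.split(delimiter) if p]
    return [delimiter + delimiter.join(parts[: i + 1]) for i in range(len(parts))]


def terms_with_sum(docs, size=10):
    """Model a terms aggregation on 'path' with a sum of 'length' per bucket."""
    buckets = defaultdict(lambda: {"doc_count": 0, "file_size": 0})
    for doc in docs:
        # A document is counted once per distinct token it contains.
        for token in set(path_tokens(doc["path"])):
            buckets[token]["doc_count"] += 1
            buckets[token]["file_size"] += doc["length"]
    ordered = sorted(buckets.items())  # corresponds to "_term" : "asc"
    return ordered[:size]  # only the first `size` buckets are returned


# Sample documents; the lengths are made up for this illustration.
docs = [
    {"path": "/a/file1", "length": 10},
    {"path": "/a/file2", "length": 20},
    {"path": "/a/b/file3", "length": 30},
]
for term, bucket in terms_with_sum(docs):
    print(term, bucket)
```

In this model the full file paths appear as buckets too, because the tokenizer emits the complete path as its last token. Indexing only the parent folder of each file would be one way to get folder buckets alone.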

These are my mappings:

{
  "properties" : {
    "path" : {
      "type" : "string",
      "analyzer" : "path-analyzer"
    },
    "full_path" : {
      "type" : "string",
      "index" : "not_analyzed"
    },
    "is_dir" : {
      "type" : "boolean"
    }
  }
}

regards,
markus



(vineeth mohan-2) #4

Hello Markus,

Can you also paste what is returned?
Also, this is what I had in mind:
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/analysis-pathhierarchy-tokenizer.html#analysis-pathhierarchy-tokenizer

Thanks
Vineeth

On Tue, Sep 2, 2014 at 11:36 PM, Markus Breuer skippi1@gmx.de wrote:



(system) #5