Batch update documents with new fields

I am looking to add new fields based on the parsing of an existing field.

The documents are Apache access log entries and are already indexed. I
want to add new fields derived from the parsed user agent, such as
browser, OS, type, etc.

  1. Run a facet query by user agent to return the top user agents.
  2. Parse the returned facet user agents and create new fields for browser,
    OS and type for each.
  3. Query each user agent from #1 to obtain a list of document ids.
  4. Use the update API to add the fields from #2 to each document from #3
    (loop through the list).

Is there a better means to accomplish this?

Thanks

--

  1. Perform a match_all scan search
  2. For each doc/log:
    a. get the user agent field
    b. parse out the additional information and create a partial update
    "doc"
    c. perform the partial update
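
The steps above can be sketched roughly as follows. This is only an illustration, not a tested reindexing script: it assumes the user agent is stored in a field called "agent" (hypothetical here), that each search hit carries the usual "_index", "_id" and "_source" keys, and it uses a deliberately crude regex parser in place of a real user agent library. It only fills in browser and OS, and instead of sending one update per document it emits bulk API lines with a partial "doc" for each hit.

```python
import json
import re

# Hypothetical, deliberately crude patterns. Order matters: Chrome user agents
# also contain "Safari", iPhone ones contain "Mac OS X", Android ones "Linux".
BROWSERS = [("Firefox", r"Firefox/"), ("Chrome", r"Chrome/"),
            ("Safari", r"Safari/"), ("MSIE", r"MSIE ")]
OSES = [("Windows", r"Windows NT"), ("Android", r"Android"),
        ("iOS", r"iPhone|iPad"), ("Mac OS X", r"Mac OS X"), ("Linux", r"Linux")]


def first_match(table, ua):
    """Return the name of the first pattern in `table` found in `ua`."""
    for name, pattern in table:
        if re.search(pattern, ua):
            return name
    return "unknown"


def parse_user_agent(ua):
    """Step 2b: derive the new fields from the raw user agent string."""
    return {"browser": first_match(BROWSERS, ua), "os": first_match(OSES, ua)}


def partial_update_lines(hits, ua_field="agent"):
    """Yield bulk API lines (action line, then partial doc) for each hit.

    `hits` is any iterable of search hits, e.g. the output of a scan search.
    Hits without a user agent are skipped.
    """
    for hit in hits:
        ua = hit.get("_source", {}).get(ua_field)  # step 2a
        if not ua:
            continue
        meta = {"_index": hit["_index"], "_id": hit["_id"]}
        yield json.dumps({"update": meta})
        yield json.dumps({"doc": parse_user_agent(ua)})  # steps 2b and 2c
```

Joined with newlines and terminated by a trailing newline, these lines form a bulk request body, which avoids one HTTP round trip per document.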

Better yet, do this additional parsing at original index time. BTW, have
you looked into logstash? (http://logstash.net/)

On Fri, Jan 18, 2013 at 1:26 PM, Kubes philip@freepricealerts.com wrote:

--

Matt,
Thanks. I am using logstash, but these are already indexed documents. In
the future I was thinking of creating an LS filter plug-in to fix the issue.
As for parsing the user agent, I was planning to use an external lookup
service via an API (http://useragentstring.com)
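
Since access logs tend to repeat the same user agents many times, an external lookup would probably want a cache in front of it so each distinct string is resolved only once. A minimal sketch, assuming the lookup is some function that takes a user agent string and returns a dict of fields; the service's request format is not shown in this thread, so `lookup` and `make_cached_parser` are hypothetical names and the actual HTTP call is left to the caller:

```python
from functools import lru_cache


def make_cached_parser(lookup, maxsize=10000):
    """Wrap `lookup` (user agent string -> dict of fields) with an LRU cache.

    `lookup` stands in for whatever calls the external service; it is
    invoked at most once per distinct user agent while the entry is cached.
    """
    @lru_cache(maxsize=maxsize)
    def cached(ua):
        # Keep an immutable snapshot so callers cannot alter cached results.
        return tuple(sorted(lookup(ua).items()))

    def parse(ua):
        # Hand back a fresh dict on every call.
        return dict(cached(ua))

    return parse
```

The returned `parse` function can be used anywhere a plain parser would be, for example inside a loop over scanned documents.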

On Friday, January 18, 2013 4:41:29 PM UTC-5, Matt Weber wrote:

--