Processing Search Result

I can run a search like the following,

curl -XGET 'http://localhost:9200/cms-2016-03-30/job/_search?pretty=true&size=1000' -d '{ "_source":"DESIRED_CMSDataset"}'

which gives results like

{
  "_index" : "cms-2016-03-30",
  "_type" : "job",
  "_id" : "crab3-7@vocms0114.cern.ch#6472621.0#1459313328",
  "_score" : 1.0,
  "_source" : {
    "DESIRED_CMSDataset" : "/BTagCSV/Run2015D-16Dec2015-v1/MINIAOD"
  }
}

I would like to process this result to only get the MINIAOD part of the DESIRED_CMSDataset string. Various tokenizers exist, but I can't figure out the syntax for any of them.

Alternatively, if you can tell me precisely how to do this at index time, that is an acceptable answer. I mostly need the syntax. I don't understand the ElasticSearch syntax.

You'd be better off separating that out before indexing to be honest.
How are you sending the data to ES?

It is in the database already. The database is being updated automatically, so a nice way of separating the strings at index time would be ideal. It is a simple tokenization problem: split on the '/' and take the last piece. But I don't know how to do this in Elasticsearch.
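For reference, the "split on '/' and take the last one" step can be sketched on the client side, after the search returns. This is a minimal Python sketch, not an Elasticsearch feature: the function name `last_dataset_segment` is made up for illustration, and the `hit` dictionary is simply the sample result from the question.

```python
def last_dataset_segment(hit):
    """Return the final '/'-separated part of DESIRED_CMSDataset, or None if absent."""
    dataset = hit.get("_source", {}).get("DESIRED_CMSDataset")
    if not dataset:
        return None
    # Drop any trailing slash, then split once from the right and keep the last piece.
    return dataset.rstrip("/").rsplit("/", 1)[-1]


# Sample hit copied from the search result shown in the question.
hit = {
    "_index": "cms-2016-03-30",
    "_type": "job",
    "_id": "crab3-7@vocms0114.cern.ch#6472621.0#1459313328",
    "_score": 1.0,
    "_source": {"DESIRED_CMSDataset": "/BTagCSV/Run2015D-16Dec2015-v1/MINIAOD"},
}

print(last_dataset_segment(hit))  # MINIAOD
```

The same function could be applied to each entry of the `hits.hits` array in the JSON that the curl command returns.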

Maybe https://www.elastic.co/guide/en/elasticsearch/reference/2.3/analysis-pattern-tokenizer.html then
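To make the pattern tokenizer suggestion concrete, here is a rough sketch of what the index settings might look like, written as a Python dictionary so it can be printed as JSON. The `type`, `pattern` and `group` parameters are the ones documented for the pattern tokenizer; the names `last_segment` and `dataset_tier` are invented for this example, and the settings have not been tested against a 2.3 cluster. The small helper only emulates the regular expression locally so the expected token can be checked.

```python
import json
import re

# Hypothetical names: "last_segment" (tokenizer) and "dataset_tier" (analyzer).
index_settings = {
    "settings": {
        "analysis": {
            "tokenizer": {
                "last_segment": {
                    "type": "pattern",
                    "pattern": "[^/]+$",  # run of non-slash characters at the end
                    "group": 0,           # emit the whole match instead of splitting
                }
            },
            "analyzer": {
                "dataset_tier": {"type": "custom", "tokenizer": "last_segment"}
            },
        }
    }
}


def emulate_last_segment(text):
    """Local stand-in for the tokenizer: return what the regex matches in text."""
    pattern = index_settings["settings"]["analysis"]["tokenizer"]["last_segment"]["pattern"]
    return re.findall(pattern, text)


print(emulate_last_segment("/BTagCSV/Run2015D-16Dec2015-v1/MINIAOD"))  # ['MINIAOD']
print(json.dumps(index_settings, indent=2))
```

The printed JSON is meant as the body of an index-creation request. Note that an analyzer only changes the terms that get indexed for searching and aggregating; the `_source` returned by a search still contains the full original string.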

Thanks. I know those exist. I have looked at many tokenizers. But, the page is totally useless for me because I don't know how to write the query!! I am very new to ES. I would appreciate it if you could spell the entire query out using the base one I had above.

You want to do this on indexing, so create a field that splits it out via mappings - https://www.elastic.co/guide/en/elasticsearch/guide/master/mapping-intro.html
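As a sketch of what "a field that splits it out via mappings" could look like for the `job` type in the 2.x mapping syntax: a sub-field under `DESIRED_CMSDataset` that uses a custom analyzer. This assumes an analyzer named `dataset_tier` (a hypothetical name) is already defined in the index settings and emits only the last path segment; the sub-field name `tier` and the aggregation name `tiers` are also made up. Treat it as an untested starting point rather than a verified configuration.

```python
import json

# Mapping with a multi-field: the original string stays as-is, and the
# hypothetical "tier" sub-field is analyzed with the assumed "dataset_tier" analyzer.
mapping = {
    "mappings": {
        "job": {
            "properties": {
                "DESIRED_CMSDataset": {
                    "type": "string",
                    "fields": {
                        "tier": {"type": "string", "analyzer": "dataset_tier"}
                    },
                }
            }
        }
    }
}

# Terms aggregation that counts documents per indexed token of the sub-field.
tier_counts_query = {
    "size": 0,
    "aggs": {"tiers": {"terms": {"field": "DESIRED_CMSDataset.tier"}}},
}

print(json.dumps(mapping, indent=2))
print(json.dumps(tier_counts_query, indent=2))
```

The first JSON body would go with the request that creates the index, and the second would replace the `-d` body of the `_search` request from the question. Documents indexed before the sub-field existed would need to be re-indexed before they show up in the aggregation.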

Sorry. I don't think you understand how new I am. I don't understand any of the documentation you are handing me. I know how to submit queries and I know what an index is. I am testing this product for my research group. I could be wrong, but I believe you work for ElasticSearch, so consider us a customer.

You are going to have to spell it out very clearly (like literally write the query). I have scoured the internet for solutions to simple problems in ElasticSearch and all I get are these documents that don't tell me the answer I want (or at least it is sufficiently cryptic). I am supposed to assemble a set of commands for my advisor to use as a cheatsheet and believe me, he will NEVER read those documentation pages ... ever. And he decides if we continue using your product.

I really do like ElasticSearch and Kibana, but you need to start providing solutions at my fingertips. For example, if I Google the following, my answer is at the given result (i.e. the 3rd).

  • "python split url" - 3rd result
  • "python split string" - 1st result

The fact that I have to scour the internet for a solution to an incredibly simple problem that I have solved in dozens of other languages and systems demonstrates a severe problem with your product!! I will immediately turn to another solution because this is too difficult. And then you lose a lot of customers.

Sorry to lecture you, but this is really too difficult.

My time here is my own, i.e. this is volunteer time for me, and I don't have enough of that time to write everything for you to pass to your boss. I'm sorry.

I'm happy to put you in contact with our presales or consulting teams if that will help you with your testing phase. Otherwise, hopefully someone else can drop in to help out! :)