Searching documents indexed via ingest-attachment

(Rohan Dodeja) #1

Hi All,

I'm new to Elasticsearch and want to index files together with attributes such as Author, Title, Subject, Category, Community, etc.

How far I got:

I was able to create an attachment pipeline and ingest different documents into Elasticsearch together with their attributes. Here is how I did it:

  1. Created the pipeline with the following request:
    {
      "description" : "Extract attachment information",
      "processors" : [
        {
          "attachment" : {
            "field" : "data"
          }
        }
      ]
    }

  2. Uploaded an attachment with the following request body:
    {
      "filename": "Presentations-Tips.ppt",
      "Author": "Jaspreet",
      "Category": "uploading ppt",
      "Subject": "testing ppt",
      "fileUrl": "",
      "attributes": {"attr11": "attr11value", "attr22": "attr22value", "attr33": "attr33value"},
      "data": "here_base64_string_of_file"
    }

  3. After that I could search freely across all of the above attributes and the file content:
    {
      "query": {
        "query_string": {
          "query": "test"
        }
      }
    }
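For anyone following along, steps 1 and 2 above can be sketched roughly as below. This only builds and prints the request bodies; the pipeline id `attachment`, the document id `1` and the `_doc` type are placeholders, since the post does not say which ones were used, and `build_document` is a hypothetical helper.

```python
import base64
import json

# Step 1: pipeline definition, same body as in the post.
PIPELINE_BODY = {
    "description": "Extract attachment information",
    "processors": [{"attachment": {"field": "data"}}],
}


def build_document(filename, raw_bytes, **attributes):
    """Return an index request body with the file base64-encoded in 'data'.

    The attachment processor reads the base64 string from the field named
    in the pipeline ("data" here).
    """
    doc = {"filename": filename}
    doc.update(attributes)
    doc["data"] = base64.b64encode(raw_bytes).decode("ascii")
    return doc


# Step 2: dummy bytes stand in for the real file content.
doc = build_document(
    "Presentations-Tips.ppt",
    b"dummy file bytes",
    Author="Jaspreet",
    Category="uploading ppt",
)

# Assumed request lines (ids and type are placeholders, not from the post).
print("PUT /_ingest/pipeline/attachment")
print(json.dumps(PIPELINE_BODY, indent=2))
print("PUT /my_index/_doc/1?pipeline=attachment")
print(json.dumps(doc, indent=2))
```

The `?pipeline=` parameter on the index request is what routes the document through the attachment processor.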

What I want now:
I want to narrow the searches down with filters such as:

  1. Search on a specific field, e.g. all documents whose author is "Rohan".
  2. Search all documents whose author is "Rohan" and whose category is "Education".
  3. Search all documents whose author contains the letters "han" and whose category contains the letters "Tech".
  4. Search all documents whose author is "Rohan" and that contain "progress" in any field, i.e. first narrow the search down by author and then run a full-text search over that result set.

Please help me with the proper query syntax and the URL to call. For the full-text search above I used 'GET /my_index/_search'.

(David Pilato) #2

Have a look at bool queries: https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-bool-query.html
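As a rough illustration of the first two cases (author only, then author plus category), a bool query with `match` clauses in `must` could look like the sketch below. It assumes `Author` and `Category` are analyzed text fields, which is what dynamic mapping would normally produce for these values; `author_category_query` is a hypothetical helper that only builds the body to send to `GET /my_index/_search`.

```python
import json


def author_category_query(author, category=None):
    """Build a bool query in which every 'must' clause has to match."""
    must = [{"match": {"Author": author}}]
    if category is not None:
        must.append({"match": {"Category": category}})
    return {"query": {"bool": {"must": must}}}


# Case 1: author only. Case 2: author and category.
print(json.dumps(author_category_query("Rohan"), indent=2))
print(json.dumps(author_category_query("Rohan", "Education"), indent=2))
```

If the clauses should only narrow the result set without affecting scoring, they can go under `filter` instead of `must`.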

then search all whose author has letters like "han" and categories has letters "Tech"

For this, if you want fast response times, you will need to adapt your mapping and generate subfields at index time. An ngram-based analyzer will help with that. Have a look at https://www.elastic.co/guide/en/elasticsearch/reference/6.6/analysis-ngram-tokenizer.html
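A minimal sketch of that idea, under several assumptions: the analyzer, tokenizer and subfield names (`substring_analyzer`, `substring_tokenizer`, `ngram`) are made up for illustration, fixed-size trigrams are used, and the mapping is nested under a `_doc` type as in 6.x (pass `doc_type=None` for the typeless layout of newer versions). The small `ngrams` helper only shows in plain Python why "han" would match "Rohan".

```python
import json


def ngrams(text, n=3):
    """Plain-Python illustration of the lowercase n-grams of a word."""
    text = text.lower()
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def ngram_index_settings(fields=("Author", "Category"), gram=3, doc_type="_doc"):
    """Index creation body with an ngram subfield for each given field."""
    settings = {
        "analysis": {
            "tokenizer": {
                "substring_tokenizer": {
                    "type": "ngram",
                    "min_gram": gram,
                    "max_gram": gram,
                    "token_chars": ["letter", "digit"],
                }
            },
            "analyzer": {
                "substring_analyzer": {
                    "type": "custom",
                    "tokenizer": "substring_tokenizer",
                    "filter": ["lowercase"],
                }
            },
        }
    }
    properties = {
        name: {
            "type": "text",
            "fields": {"ngram": {"type": "text", "analyzer": "substring_analyzer"}},
        }
        for name in fields
    }
    mappings = {"properties": properties}
    if doc_type:
        # 6.x nests the properties under a type name.
        mappings = {doc_type: mappings}
    return {"settings": settings, "mappings": mappings}


def substring_query(author_part, category_part):
    """Require all trigrams of each search string on the ngram subfields."""
    return {"query": {"bool": {"must": [
        {"match": {"Author.ngram": {"query": author_part, "operator": "and"}}},
        {"match": {"Category.ngram": {"query": category_part, "operator": "and"}}},
    ]}}}


print(ngrams("Rohan"))
print(json.dumps(ngram_index_settings(), indent=2))
print(json.dumps(substring_query("han", "Tech"), indent=2))
```

With fixed trigrams, search strings shorter than three characters produce no tokens, and a longer string matches when all of its trigrams occur in the field, which approximates a substring match rather than guaranteeing one.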

search all whose author is "Rohan" and can search full text search on all fields which can have "progress" in any field, means first narrow down search for author and then full text search on those resultset fields.

Same as before. Use a bool query and add a should or must clause with a simple query string query inside. See https://www.elastic.co/guide/en/elasticsearch/reference/6.6/query-dsl-simple-query-string-query.html
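One way this could look, as a sketch only: the author condition goes into `filter` so it narrows the result set without influencing the score, and the full-text part goes into `must`. The assumption that `Author` is an analyzed text field and the helper name `author_then_fulltext` are mine.

```python
import json


def author_then_fulltext(author, text):
    """Restrict to one author, then run a full-text search on that subset."""
    return {
        "query": {
            "bool": {
                # filter: must match, does not contribute to scoring
                "filter": [{"match": {"Author": author}}],
                # must: must match, contributes to scoring
                "must": [{"simple_query_string": {"query": text}}],
            }
        }
    }


print(json.dumps(author_then_fulltext("Rohan", "progress"), indent=2))
```

Moving the `simple_query_string` clause from `must` to `should` would make the text match optional: documents by the author would still be returned, and those containing the text would rank higher.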

HTH

(Rohan Dodeja) #3

Thanks David,

That fits my scenario. I ended up with the following cURL request:

curl -X POST \
  http://localhost:9200/my_index/_search \
  -H 'Content-Type: application/json' \
  -d '{
    "query": {
        "bool": {
            "must": [
                {
                    "query_string": {
                        "query": "progress"
                    }
                },
                {
                    "wildcard": {
                        "Author": "Rohan"
                    }
                },
                {
                    "wildcard": {
                        "Title": "q*"
                    }
                }
            ]
        }
    }
}'
(system) closed #4

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.