How to index an interview

handris · September 19, 2018, 8:04pm

Hi all,

I would like to index interviews into Elasticsearch.

An interview is made up of paragraphs, each spoken by either the interviewer or the interviewee. My problem is that I am only interested in what the interviewee says, so I don't want to index the interviewer's questions.

As far as I understand, I can achieve this by adding "index": false to a property when creating the mapping, so that the field is not indexed and not searchable. However, when I run a search query, I need the whole interview as the result, so somehow I need to send the interviewer's questions in the response as well.

Let me clarify it with an example:

This is an interview:

[interviewer]: How are you?
{interviewee}: I am fine. And you?
[interviewer]: Fine, thanks.

I only want to make the sentence "I am fine. And you?" searchable, because it is the only paragraph said by the interviewee, but when Elasticsearch returns a response, I want to get the whole interview back.

My question is, how should I go about achieving this? I am not asking how to create the exact mapping (though any input there is also appreciated), but rather what properties I should have in the mapping. How can I make one part of the conversation searchable and the other part not searchable, while search responses still contain the whole interview?

For reference, here is the mapping I came up with so far:

PUT /interviews
{
  "mappings": {
    "_doc": {
      "properties": {
        "title": {"type": "text"},
        "lead": {"type": "text"},
        "body": {
          "type": "text",
          "analyzer": "hungarian",
          "index_phrases": true
        },
        "interviewerBody": {
          "type": "text",
          "index": false
        }
      }
    }
  }
}

The problem with this mapping is that I put the questions and the answers into separate properties, so I do not know in which order the questions by the interviewer and the answers by the interviewee came.

Here is an example interview for reference. In its HTML, the p elements with a direct em child element are said by the interviewer, and the p elements with no child elements are said by the interviewee.

warkolm · September 19, 2018, 9:14pm

You probably want to look at something like parent/child, where the parent is the interview "event" and the children are the questions and answers. Then you can control things with a lot more ease.
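
A rough, untested sketch of what the parent/child idea could look like: in the 6.x releases that the `_doc` mapping above targets, parent/child is modelled with a `join` field. The field names `relation`, `speaker`, `text` and `position` are made up for illustration, and the document IDs are arbitrary.

```console
# Hypothetical mapping: one parent "interview" document, many child "paragraph" documents.
PUT /interviews
{
  "mappings": {
    "_doc": {
      "properties": {
        "title":    {"type": "text"},
        "speaker":  {"type": "keyword"},
        "position": {"type": "integer"},
        "text":     {"type": "text", "analyzer": "hungarian"},
        "relation": {
          "type": "join",
          "relations": {"interview": "paragraph"}
        }
      }
    }
  }
}

# A child paragraph; it has to be routed to the shard of its parent (ID 1 here).
PUT /interviews/_doc/2?routing=1
{
  "speaker": "interviewee",
  "position": 2,
  "text": "I am fine. And you?",
  "relation": {"name": "paragraph", "parent": "1"}
}
```

With this layout the order of paragraphs is not implicit, which is why the sketch stores an explicit `position` on each child.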

handris · September 24, 2018, 2:42pm

After a bit of research, I found that I needed to use the nested type, which is intended for indexing arrays of objects.
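
For anyone finding this later, a minimal sketch of how the nested approach might look, assuming each paragraph becomes one object in a `paragraphs` array. The names `paragraphs`, `speaker` and `text` are illustrative only, not the mapping actually used.

```console
# Hypothetical mapping: every paragraph is a nested object that records its speaker.
PUT /interviews
{
  "mappings": {
    "_doc": {
      "properties": {
        "title": {"type": "text"},
        "lead":  {"type": "text"},
        "paragraphs": {
          "type": "nested",
          "properties": {
            "speaker": {"type": "keyword"},
            "text": {
              "type": "text",
              "analyzer": "hungarian",
              "index_phrases": true
            }
          }
        }
      }
    }
  }
}

# Search only paragraphs whose speaker is the interviewee.
GET /interviews/_search
{
  "query": {
    "nested": {
      "path": "paragraphs",
      "query": {
        "bool": {
          "must":   {"match": {"paragraphs.text": "fine"}},
          "filter": {"term": {"paragraphs.speaker": "interviewee"}}
        }
      }
    }
  }
}
```

Each hit's `_source` contains the `paragraphs` array exactly as it was sent, so the whole interview comes back in its original order. Note that in this sketch the interviewer's paragraphs are still indexed; they are only excluded at query time by the `speaker` filter.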

