Indexing three different documents as logically one

Hi, I’m new to ES. I want to index OAS APIs. For each of these APIs I’ve got the OAS3 contract, a Markdown documentation file, and a JSON metadata file. I’d like to submit all three of them (for each API) and be able to search for any content in any of the three documents. Any hit should return the API name contained in the OAS3 spec. So far I’ve been able to index the OAS3 contract. Any suggestions or pointers on how I could achieve this?

Welcome!

It would be easier to understand if you can provide a concrete example of what you would like to do. What are the source documents?

There are three documents for each "submission to the index":

  • An OAS3 specification
  • A Markdown document with additional technical documentation for the API spec
  • A JSON metadata file containing metadata about the API.

API spec

openapi: 3.0.0
servers:
  # Added by API Auto Mocking Plugin
  - description: SwaggerHub API Auto Mocking
    url: https://virtserver.swaggerhub.com/mtedone/api-with-extensions/1.0.0
info:
  description: This is a simple API
  version: "1.0.0"
  title: Simple Inventory API
  contact:
    email: you@your-company.com
  license:
    name: Apache 2.0
    url: 'http://www.apache.org/licenses/LICENSE-2.0.html'
tags:
  - name: admins   
    description: Secured Admin-only calls
  - name: developers
    description: Operations available to regular developers
paths:
  /inventory:
    get:
      tags:
        - developers
      summary: searches inventory
      operationId: searchInventory
      description: |
        By passing in the appropriate options, you can search for
        available inventory in the system
      parameters:
        - in: query
          name: searchString
          description: pass an optional search string for looking up inventory
          required: false
          schema:
            type: string
        - in: query
          name: skip
          description: number of records to skip for pagination
          schema:
            type: integer
            format: int32
            minimum: 0
        - in: query
          name: limit
          description: maximum number of records to return
          schema:
            type: integer
            format: int32
            minimum: 0
            maximum: 50
      responses:
        '200':
          description: search results matching criteria
          content:
            application/json:
              schema:
                type: array
                items:
                  $ref: '#/components/schemas/InventoryItem'
        '400':
          description: bad input parameter
components:
  schemas:
    InventoryItem:
      type: object
      required:
        - id
        - name
        - manufacturer
        - releaseDate
      properties:
        id:
          type: string
          format: uuid
          example: d290f1ee-6c54-4b01-90e6-d701748f0851
        name:
          type: string
          example: Widget Adapter
        releaseDate:
          type: string
          format: date-time
          example: '2016-08-29T09:12:33.001Z'
        manufacturer:
          $ref: '#/components/schemas/Manufacturer'
    Manufacturer:
      required:
        - name
      properties:
        name:
          type: string
          example: ACME Corporation
        homePage:
          type: string
          format: url
          example: 'https://www.acme-corp.com'
        phone:
          type: string
          example: 408-867-5309
      type: object

Example of the metadata file:

    {
      "domain": "inventory",
      "team": "gurus"
    }

Example of the Markdown file:

    It's very easy to make some words **bold** and other words *italic* with Markdown. You can even [link to Google!](http://google.com)

I receive these three files as a single submission in a zip file. I'd like to index all of them and then whenever someone searches for any of the content in any of the three files, I'd like to return the API name in the results.
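As a rough illustration of the starting point described here, the three files can be read out of one zip submission before anything is sent to Elasticsearch. This is only a sketch: the member names `openapi.yaml`, `README.md` and `metadata.json` are hypothetical (the thread does not say how the archive is laid out), and the spec is kept as raw text rather than parsed.

```python
import io
import json
import zipfile

# Hypothetical member names; the real archive layout is not given in the thread.
SPEC_NAME = "openapi.yaml"
DOCS_NAME = "README.md"
META_NAME = "metadata.json"


def read_submission(zip_bytes: bytes) -> dict:
    """Return the three files of one submission: two as text, metadata parsed."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        spec_text = archive.read(SPEC_NAME).decode("utf-8")
        docs_text = archive.read(DOCS_NAME).decode("utf-8")
        metadata = json.loads(archive.read(META_NAME).decode("utf-8"))
    return {"spec_text": spec_text, "docs_text": docs_text, "metadata": metadata}


def make_example_zip() -> bytes:
    """Build a tiny in-memory archive so the sketch runs without real files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            SPEC_NAME, "openapi: 3.0.0\ninfo:\n  title: Simple Inventory API\n"
        )
        archive.writestr(DOCS_NAME, "Some words are **bold**.")
        archive.writestr(
            META_NAME, json.dumps({"domain": "inventory", "team": "gurus"})
        )
    return buffer.getvalue()


submission = read_submission(make_example_zip())
print(submission["metadata"]["team"])
```

Extracting the API name (`info.title`) from the spec would need a YAML parser, which is left out here to keep the sketch standard-library only.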

I hope this clarifies things.

How are you indexing?
One option is to insert all the data into the same index and the same document. You create the document ID yourself and reuse ("remember") it when indexing each of the three files.
Another option is to index into different indices but with a shared ID, i.e. a field that is common to all indices. Then, when you filter by ID=NNN, all 3 indices are filtered and the relevant documents are returned.
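A minimal sketch of the first option, under stated assumptions: the three files are merged into one JSON document, the document ID is derived from the API name, and a `multi_match` query searches all parts while returning only the API name. The field names (`api_name`, `spec_text`, `docs_text`, `metadata`) and the SHA-1 based ID are illustrative choices, not something prescribed in this thread; the metadata fields in the query are the two from the example file.

```python
import hashlib


def make_doc_id(api_name: str) -> str:
    """Derive a stable ID from the API name (an assumption: names are unique)."""
    return hashlib.sha1(api_name.encode("utf-8")).hexdigest()


def build_combined_doc(api_name: str, spec_text: str, docs_text: str, metadata: dict) -> dict:
    """One document per API, holding all three source files."""
    return {
        "api_name": api_name,    # info.title from the OAS3 spec
        "spec_text": spec_text,  # raw spec, indexed as full text
        "docs_text": docs_text,  # raw Markdown
        "metadata": metadata,    # parsed JSON metadata file
    }


def build_search_body(user_query: str) -> dict:
    """Search every part of the document but return only the API name."""
    return {
        "_source": ["api_name"],
        "query": {
            "multi_match": {
                "query": user_query,
                "fields": [
                    "api_name",
                    "spec_text",
                    "docs_text",
                    "metadata.domain",
                    "metadata.team",
                ],
            }
        },
    }


doc = build_combined_doc(
    "Simple Inventory API",
    "openapi: 3.0.0 ...",
    "It's very easy to make some words **bold** ...",
    {"domain": "inventory", "team": "gurus"},
)
doc_id = make_doc_id(doc["api_name"])
search_body = build_search_body("gurus")
```

Because the ID depends only on the API name, re-submitting the same API would overwrite the same document rather than create a duplicate. If several versions of one API must coexist, the version would have to become part of the ID.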

Thank you for your reply. This actually looks useful. I'd prefer option 1. For that, do I need to create a document ID before indexing? If so, do I do that through a POST request? Then, once I've got the ID, would I index all the other documents with the same ID through a PUT?

Generally speaking, there are a few ways to create and update documents in an index.
Please test the solution beforehand.
More details: Document APIs, Elasticsearch Reference [7.11].
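To make the ID question concrete: in the Document APIs, `PUT /<index>/_doc/<id>` indexes a document under an ID you choose, so no prior POST is needed to obtain one (`POST /<index>/_doc` without an ID lets Elasticsearch generate it). A second PUT to the same ID replaces the whole document, so adding the Markdown or metadata later is usually done with `POST /<index>/_update/<id>` instead. The sketch below only builds the requests and does not send them; the base URL, the index name `apis` and the ID `simple-inventory-api` are assumptions for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumption: local cluster without security
INDEX = "apis"                      # hypothetical index name


def index_request(doc_id: str, doc: dict) -> urllib.request.Request:
    """PUT /<index>/_doc/<id>: create or fully replace a document with a chosen ID."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{INDEX}/_doc/{doc_id}",
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def upsert_request(doc_id: str, partial_doc: dict) -> urllib.request.Request:
    """POST /<index>/_update/<id>: merge fields in, creating the document if missing."""
    body = {"doc": partial_doc, "doc_as_upsert": True}
    return urllib.request.Request(
        url=f"{BASE_URL}/{INDEX}/_update/{doc_id}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# First the spec, then the other two files merged into the same document.
first = index_request("simple-inventory-api", {"api_name": "Simple Inventory API"})
second = upsert_request("simple-inventory-api", {"metadata": {"team": "gurus"}})
```

Each request could then be sent with `urllib.request.urlopen(first)` against a running cluster. If all three files are available at once, a single PUT with the complete document is simpler than one PUT followed by updates.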
