Ingestion pipeline processor error - input field does not exist

Joy_yang · February 22, 2024, 12:51am

Hi, I'm working on a RAG project where we use Elasticsearch to search for relevant documents. The documents come from the web crawler. However, because most LLMs have a token limit, I'm trying to chunk the documents into smaller pieces so they fit within the prompt token limit. Rather than going straight to a custom, coding-heavy method, I found that I can set up a custom ingest pipeline in the Kibana UI. However, after I set up the script + foreach processors, I kept getting this error:

Processor 'inference' in pipeline 'search-test@custom' failed with message 'Input field [body_content_field] does not exist in the source document

Documents from the web crawler usually have body_content, title, etc., but I'm not sure where this "body_content_field" comes from, and I couldn't find a way to debug this. Could anyone share some insights? Thanks!
**The method I tried follows this doc: "Chunking Large Documents via Ingest pipelines plus nested vectors equals easy passage search" (Elastic Search Labs)

stephenb · February 22, 2024, 2:36am

Hi @Joy_yang, welcome to the community, and it's cool that you're building a RAG application. Very cool.

There's other good content over at Elasticsearch Labs.

So you need to replace that field name with the field in your source document that actually contains the text...

In this line:

String[] envSplit = /((?<!M(r|s|rs)\.)(?<=\.) |(?<=\!) |(?<=\?) )/.split(ctx['body_content']);

Replace body_content with the name of your field.
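
If you want to sanity-check the split outside Elasticsearch first, here is a rough Python sketch of the same idea. It is only an approximation of the Painless line above, not the pipeline itself: the names `split_sentences` and `SENTENCE_BOUNDARY` are made up for illustration, and the lookbehind is rewritten because Python's `re` module only accepts fixed-width lookbehinds. It also fails loudly when the field is missing and lists the fields the document does have, which is usually the quickest way to spot a wrong field name.

```python
import re

# Approximation of the Painless pattern. Python's re module needs
# fixed-width lookbehinds, so (?<!M(r|s|rs)\.) becomes three separate checks.
SENTENCE_BOUNDARY = re.compile(
    r"(?:(?<!Mr\.)(?<!Ms\.)(?<!Mrs\.)(?<=\.) |(?<=!) |(?<=\?) )"
)


def split_sentences(doc, field="body_content"):
    """Split doc[field] into sentences; hypothetical helper for local testing."""
    if field not in doc:
        # Show which fields the document really has, to catch a wrong name.
        raise KeyError(
            f"field {field!r} not in document; available fields: {sorted(doc)}"
        )
    return SENTENCE_BOUNDARY.split(doc[field])


if __name__ == "__main__":
    sample = {"title": "Example", "body_content": "Mr. Smith went home. Did he? Yes!"}
    for sentence in split_sentences(sample):
        print(sentence)
```

Calling `split_sentences(sample, field="body_content_field")` on the same sample raises a `KeyError` that names the fields actually present, which mirrors the situation in the error message above.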

system · March 21, 2024, 2:36am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
