Confluence (Server) Connector 302 Errors

ekan · December 17, 2020, 12:54pm

We're having issues using the Confluence (Server) connector in Workplace Search. It starts indexing documents correctly, but at some point, it gets 302s in the form of:

[####-##-##T##:##:##.###+##:##][######][####][connectors][WARN]: ContentSource[<ID_OMITTED>, confluence_server]: Encountered error during extraction of 'Confluence ID: <ID_OMITTED>': Connectors::ContentSources::Atlassian::CustomClient::ClientError: <html>
<head><title>302 Found</title></head>
<body bgcolor="white">
<center><h1>302 Found</h1></center>
<hr><center>nginx/#.##.##</center>
</body>
</html>

I don't actually care if this document doesn't get indexed. However, I do not want the connector to restart the entire job. How can I make it continue despite errors? Our (very old) current search engine has no such issues with our Confluence instance, so I'm not sure why there are no configuration options for the provided Confluence connector.

goodroot · December 17, 2020, 7:53pm

Hello, Ekansh.

Which version of Workplace Search are you running?

Let's start there.

Thanks,

Kellen

ekan · December 17, 2020, 8:37pm

Hey Kellen!

I'm running Workplace Search 7.10.1 (I believe that is the latest one?).

I know why the Confluence pages are erroring now - they can't be loaded in the browser either due to some macros. However, I still want Workplace Search to ignore those errors and continue indexing.

Vadim_Yakhin · December 17, 2020, 9:41pm

Hi Ekansh!

There's a known issue in Confluence regarding 302 redirects. We will try to handle it more gracefully in future versions.

For now, you could use several config options in config/enterprise-search.yml to increase tolerance to errors:

    #workplace_search.content_source.sync.max_errors: 1000
    #
    # Configure how many errors in a row to tolerate in a sync job.
    # If the job encounters more errors in a row than this value, the job will fail.
    # NOTE: this only applies to errors tied to individual documents.
    #
    #workplace_search.content_source.sync.max_consecutive_errors: 10
    #
    # Configure the ratio of <errored documents> / <total documents> to tolerate in a sync job
    # or in a rolling window (see `workplace_search.content_source.sync.error_ratio_window_size`).
    # If the job encounters an error ratio greater than this value in a given window, or overall
    # at the end of the job, the job will fail.
    # NOTE: this only applies to errors tied to individual documents.
    #
    #workplace_search.content_source.sync.max_error_ratio: 0.15
    #
    # Configure how large of a window to consider when calculating an error ratio
    # (see `workplace_search.content_source.sync.max_error_ratio`).
    #
    #workplace_search.content_source.sync.error_ratio_window_size: 100
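
To make the interaction between these thresholds easier to picture, here is a rough Python model of them. It is only a sketch built from the wording of the comments above, not the actual Enterprise Search implementation: the class and function names are hypothetical, the strict "greater than" comparisons are read off the comments, and checking the ratio only once the window is full is an assumption.

```python
from collections import deque


class SyncErrorTracker:
    """Hypothetical model of the per-document error thresholds (not Elastic's code)."""

    def __init__(self, max_errors=1000, max_consecutive_errors=10,
                 max_error_ratio=0.15, error_ratio_window_size=100):
        self.max_errors = max_errors
        self.max_consecutive_errors = max_consecutive_errors
        self.max_error_ratio = max_error_ratio
        # Rolling window holding the most recent document results.
        self.window = deque(maxlen=error_ratio_window_size)
        self.total_errors = 0
        self.consecutive_errors = 0

    def record(self, failed):
        """Record one document result; return the tripped setting name or None."""
        self.window.append(failed)
        if failed:
            self.total_errors += 1
            self.consecutive_errors += 1
        else:
            self.consecutive_errors = 0

        if self.total_errors > self.max_errors:
            return "max_errors"
        if self.consecutive_errors > self.max_consecutive_errors:
            return "max_consecutive_errors"
        # Assumption: the ratio is only evaluated once the window is full.
        if len(self.window) == self.window.maxlen:
            if sum(self.window) / len(self.window) > self.max_error_ratio:
                return "max_error_ratio"
        return None


def first_failure(results, **settings):
    """Return (document number, setting name) for the first tripped threshold."""
    tracker = SyncErrorTracker(**settings)
    for index, failed in enumerate(results, start=1):
        reason = tracker.record(failed)
        if reason:
            return index, reason
    return None
```

In this model, with the default values, `first_failure([True] * 11)` returns `(11, "max_consecutive_errors")`, because the eleventh failure in a row exceeds the limit of 10. A source where every fifth document fails never trips the consecutive limit, but trips `max_error_ratio` once 100 documents have been seen (20 errors out of 100 is above 0.15).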

Let us know if it helps!

ekan · December 18, 2020, 11:54am

Hi Vadim - thanks for finding the issue!

Yesterday, I went ahead and updated the configuration as follows:

    workplace_search.content_source.sync.max_errors: 10000000
    workplace_search.content_source.sync.max_consecutive_errors: 100000
    workplace_search.content_source.sync.max_error_ratio: 1
    workplace_search.content_source.sync.error_ratio_window_size: 100

We don't have anywhere near this many errors, so the job shouldn't stop. It may be that the UI simply reports the errors while the job still finishes indexing everything, but I'm skeptical because indexing still runs every 15-30 minutes instead of at the documented 2-hour interval.

Do you know if this is simply a UI issue or if the indexing is actually restarting more often than necessary (and how I can check)? I also wonder whether my parameters are wrong.

Thanks for the help!

ekan · January 21, 2021, 11:32am

Any updates on this?

Sean_Story · February 2, 2021, 4:33pm

Hi @ekan - these settings as you have them:

    workplace_search.content_source.sync.max_error_ratio: 1
    workplace_search.content_source.sync.error_ratio_window_size: 100

mean that if 100 documents/pages fail in a row, the job will fail. Is it possible you have a group of pages like that?
Are you using self-managed tarball/docker to run Enterprise Search? Or are you running on Elastic Cloud?
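
One rough way to see how many pages are failing is to count the extraction warnings in the Enterprise Search log. The sketch below assumes the warnings look like the log line quoted in the first post; the function names are hypothetical and the log file path is left for you to supply, since it depends on how you deploy.

```python
import re
from collections import Counter

# Matches warnings shaped like the log line quoted in the first post.
ERROR_PATTERN = re.compile(
    r"ContentSource\[(?P<source_id>[^,\]]+), (?P<service>[^\]]+)\]: "
    r"Encountered error during extraction of '(?P<item>[^']*)'"
)


def count_extraction_errors(lines):
    """Return a Counter of extraction warnings keyed by (source_id, service)."""
    counts = Counter()
    for line in lines:
        match = ERROR_PATTERN.search(line)
        if match:
            counts[(match["source_id"], match["service"])] += 1
    return counts


def summarize_log(path):
    """Print one summary line per content source found in the log at `path`."""
    with open(path, encoding="utf-8") as handle:
        counts = count_extraction_errors(handle)
    for (source_id, service), total in counts.items():
        print(f"{source_id} ({service}): {total} extraction errors")
```

Call `summarize_log` with the path to your own log file. Note that this only counts failures: the warnings alone do not show which documents succeeded in between, so they cannot prove that 100 pages failed in a row.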

sirmavid · January 8, 2022, 10:21am

I think this issue is resolved now.

system · October 31, 2022, 2:48am

This topic was automatically closed 28 days after the last reply. New replies are no longer allowed.
