Elastic PostgreSQL connector 8.12.2 with track_commit_timestamp off

Hi all,

I am using ELK 8.12.2 along with Elastic PostgreSQL connector 8.12.2. Both have been setup on premises, in my infra.

When track_commit_timestamp is off, the connector fails with:

DBAPIError: (sqlalchemy.dialects.postgresql.asyncpg.Error) <class 'asyncpg.exceptions.ObjectNotInPrerequisiteStateError'>: could not get commit timestamp data HINT: Make sure the configuration parameter "track_commit_timestamp" is set. [SQL: SELECT MAX(pg_xact_commit_timestamp(xmin)) FROM myschema."mytable"] (Background on this error at: https://sqlalche.me/e/20/dbapi).

My Postgres setup consists of Primary and Read-Only replicas and Elastic PostgreSQL connector 8.12.2 will be connected to one of the Read-Only replicas.

Billions of transactions take place in the database and I am concerned about enabling track_commit_timestamp. Enabling track_commit_timestamp will incur a cost of adding 12 bytes per transaction and this kind of data will be written inside $PostgresDir/main/pg_commit_ts. Postgres should truncate old data in pg_commit_ts during checkpoints and when old transactions ids are cleaned-up because of vacuuming. Despite that, I'd prefer not to enable track_commit_timestamp.

Is there any way to make Elastic Postgres Connector 8.12.2 not failing when track_commit_timestamp is off, even with the cost of indexing the same data on every connector sync? If there is a way to do that, then I will be deleting all documents from my index each time before the connector performs a full sync and keep track_commit_timestamp off in Postgres.

I am aware about using alternative approaches like replacing Elastic Postgres Connector with Logstash.

Thank you,
Apostolos

Hey @apostolos.e yeah, the error occurs because the connector implementation tries to use pg_xact_commit_timestamp to track the last update time of tables (this is approach to be more efficient at subsequent syncs of data)

As of now, there is no way to disable this in the configuration. If you really want to disable this, you would need to edit the connector implementation function that relies on the timestamp data:

class PostgreSQLClient:
    # ... other methods ...

    async def get_table_last_update_time(self, table):
        """Always returns current timestamp instead of querying for it"""
        return iso_utc()

If there is a way to do that, then I will be deleting all documents from my index each time before the connector performs a full sync and keep

The connector should delete all your outdated docs anyway upon completing the full sync, so manual deletion of docs in index should not be needed.

Hope this helps!

Much appreciated. Thank you