Even though all services are registered on the appliance and vpostgres is actually up and running, you might still see an exception in VAMI stating that the database status is inconsistent and down.
One possible reason is that replication from the MASTER to the REPLICAs is not happening properly.
When we check the status of Postgres on a replica, we get a message stating that pgdata is not a database cluster directory and that postgresql.auto.conf is missing.
[replica] vranode2:/storage/db/pgdata # service vpostgres status
Last login: Fri Jul 24 10:31:31 UTC 2020
LOG: skipping missing configuration file "/var/vmware/vpostgres/current/pgdata/postgresql.auto.conf"
pg_ctl: directory "/var/vmware/vpostgres/current/pgdata" is not a database cluster directory
When we check the MASTER's pgdata structure, we see the following files:
[master] mum01-2-vra01:/storage/db/pgdata # ls -ltrh
total 192K
drwx------ 2 postgres users 4.0K Mar 28 2019 pg_twophase
drwx------ 2 postgres users 4.0K Mar 28 2019 pg_tblspc
drwx------ 2 postgres users 4.0K Mar 28 2019 pg_snapshots
drwx------ 2 postgres users 4.0K Mar 28 2019 pg_serial
drwx------ 2 postgres users 4.0K Mar 28 2019 pg_replslot
drwx------ 4 postgres users 4.0K Mar 28 2019 pg_multixact
drwx------ 4 postgres users 4.0K Mar 28 2019 pg_logical
drwx------ 2 postgres users 4.0K Mar 28 2019 pg_dynshmem
drwx------ 2 postgres users 4.0K Mar 28 2019 pg_commit_ts
-rw------- 1 postgres users 4 Mar 28 2019 PG_VERSION
-rw------- 1 postgres users 1.6K Mar 28 2019 pg_ident.conf
-rw------- 1 postgres users 1.7K Jun 17 14:32 server.key
-rw-r--r-- 2 postgres users 4.4K Jun 17 14:32 server.crt
-rw------- 1 postgres users 22K Jun 17 14:32 postgresql.conf.bak
-rw------- 1 postgres users 5.0K Jun 17 14:35 pg_hba.conf
drwx------ 8 postgres users 4.0K Jun 23 07:58 base
-rw------- 1 root root 22K Jul 22 13:53 postgresql.conf.bak22072020
-rw------- 1 postgres users 272 Jul 22 14:02 postgresql.auto.conf
drwx------ 2 postgres users 4.0K Jul 23 13:00 pg_log
-rw------- 1 postgres users 22K Jul 24 09:25 postgresql.conf
-rw------- 1 postgres users 85 Jul 24 09:54 postmaster.pid
-rw------- 1 postgres users 83 Jul 24 09:54 postmaster.opts
drwx------ 2 postgres users 4.0K Jul 24 09:54 pg_stat
drwx------ 2 postgres users 4.0K Jul 24 09:54 pg_notify
-rw------- 1 postgres users 6.8K Jul 24 09:54 serverlog
drwx------ 2 postgres users 4.0K Jul 24 10:19 global
drwx------ 2 postgres users 4.0K Jul 24 10:30 pg_subtrans
drwx------ 2 postgres users 4.0K Jul 24 10:30 pg_clog
drwx------ 3 postgres users 4.0K Jul 24 10:31 pg_xlog
drwx------ 2 postgres users 4.0K Jul 24 10:33 pg_stat_tmp
[master] mum01-2-vra01:/storage/db/pgdata #
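One file is the odd man out: postgresql.conf.bak<date> (postgresql.conf.bak22072020 in the listing above). It is the only file in the directory owned by root rather than postgres, and it looks like a copy created by an admin trying to back up postgresql.conf.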
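A quick way to spot this kind of stray file is to list regular files directly under pgdata that are not owned by the database user. The following is only a sketch: list_foreign_files is a hypothetical helper name, and it assumes a find that supports -maxdepth (GNU and BSD find do).

```shell
#!/bin/sh
# Hypothetical helper (not part of the appliance tooling):
# print regular files directly under DIR that are NOT owned by OWNER.
# Subdirectories are not descended into and directories are not listed.
list_foreign_files() {
    dir=$1
    owner=$2
    find "$dir" -maxdepth 1 -type f ! -user "$owner"
}
```

On the master above it would be called as list_foreign_files /storage/db/pgdata postgres. Going by the ls output, the only file it should flag is the root-owned postgresql.conf.bak22072020. Ownership is just a hint here: a manual copy made as the postgres user would not be caught.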
Remember: whenever you take a backup of a file, place the copy in a folder in a separate location rather than in the same directory as the original.
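As a rough illustration of that habit, the sketch below copies a file into an existing backup directory and refuses to do so if that directory is the file's own directory or sits beneath it. backup_outside is a hypothetical helper name, and the timestamp suffix is just one possible naming choice.

```shell
#!/bin/sh
# Hypothetical helper: copy SRC into DEST_DIR with a timestamp suffix,
# but only if DEST_DIR already exists and is outside SRC's directory.
backup_outside() {
    src=$1
    dest_dir=$2
    [ -f "$src" ] || { echo "no such file: $src" >&2; return 1; }
    [ -d "$dest_dir" ] || { echo "no such directory: $dest_dir" >&2; return 1; }
    # Resolve both directories to absolute physical paths before comparing.
    src_dir=$(cd "$(dirname "$src")" && pwd -P) || return 1
    dest_abs=$(cd "$dest_dir" && pwd -P) || return 1
    case "$dest_abs/" in
        "$src_dir"/*)
            echo "refusing: $dest_abs is inside $src_dir" >&2
            return 1
            ;;
    esac
    target="$dest_abs/$(basename "$src").$(date +%Y%m%d%H%M%S)"
    cp -p "$src" "$target" || return 1
    # Print where the copy went so the caller can record it.
    echo "$target"
}
```

For example, backup_outside /storage/db/pgdata/postgresql.conf /root/pg-backups would leave pgdata untouched, where /root/pg-backups is an assumed directory that has to be created beforehand.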
The moment we removed this manual backup file, the database-inconsistent message under VAMI's Cluster tab disappeared and both replica nodes showed a status of UP.
Moral of the story: do not place any manual backups under the pgdata folder.