Handling Postgres Startup and Cert Regeneration After a VCF 9.0.1 Fleet Management Reboot
- Arun Nukula
- 10 hours ago
Overview
Applying the VCF Operations fleet management appliance 9.0.1.0 patch on top of version 9.0.0.0 is straightforward.
The patch does not require a reboot of the appliance; it only triggers a service restart.
In our case, the VCF Operations fleet management appliance was rebooted for an unexplained reason, and we then saw two issues:
Postgres Service fails to start
Certificate of VCF Operations fleet management is regenerated
If the appliance is not rebooted, you won't encounter these issues at all.
Postgres Exception from vmware_vrlcm.log
Sep 18 15:33:07 <<hostname>> postgres[12859]: pg_ctl: could not open PID file "/var/vmware/vpostgres/current/pgdata/postmaster.pid": Permission denied
Sep 18 15:33:07 <<hostname>> systemd[1]: vpostgres.service: Control process exited, code=exited, status=1/FAILURE
Certificate Issue
Browsing to /opt/vmware/vlcm/cert/ shows a newly generated certificate and key. The existing files are backed up with a timestamp suffix: server.key.<<timestamp>> and server.crt.<<timestamp>>
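To see which backup pair is the most recent, a small helper can pick the newest timestamped copy of each file. This is only a sketch: the function name latest_backup is hypothetical, and it assumes the suffix is a fixed-width numeric timestamp (as in server.crt.250930102056), so that sorted order matches chronological order.

```shell
#!/bin/sh
# latest_backup DIR NAME
# Prints the newest "NAME.<digits>" file in DIR; returns 1 if none exist.
# Assumption: fixed-width numeric suffixes, so the shell's sorted glob
# expansion lists the newest file last.
latest_backup() {
    latest=""
    for f in "$1/$2".[0-9]*; do
        [ -e "$f" ] || continue   # an unmatched glob stays literal; skip it
        latest=$f
    done
    if [ -n "$latest" ]; then
        printf '%s\n' "$latest"
    else
        return 1
    fi
}

CERT_DIR=/opt/vmware/vlcm/cert
if [ -d "$CERT_DIR" ]; then
    latest_backup "$CERT_DIR" server.key || echo "no server.key backup found"
    latest_backup "$CERT_DIR" server.crt || echo "no server.crt backup found"
fi
```

If the key and certificate report different timestamps, it is worth checking the directory by hand before restoring anything.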
Remediation
Log in to the VCF Operations fleet management appliance via SSH
Execute the command
systemctl status vpostgres
If the service is down, execute the following command to fix the permissions on the data directory
chmod 700 /var/vmware/vpostgres/current/pgdata/
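To confirm the permission change took effect, the directory mode can be read back. This is a minimal sketch that assumes GNU coreutils stat is available on the appliance; the helper name dir_mode is hypothetical.

```shell
#!/bin/sh
# dir_mode DIR: prints the octal permission bits of DIR (GNU coreutils stat).
dir_mode() {
    stat -c '%a' "$1"
}

PGDATA=/var/vmware/vpostgres/current/pgdata
if [ -d "$PGDATA" ]; then
    mode=$(dir_mode "$PGDATA")
    if [ "$mode" = "700" ]; then
        echo "pgdata mode is 700"
    else
        echo "pgdata mode is $mode, expected 700" >&2
    fi
fi
```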
Navigate to the /opt/vmware/vlcm/cert directory. The original key and certificate are the backed-up files with a timestamp in their names (e.g., server.crt.250930102056). Run the following commands to move the timestamped files back into place, replacing the filenames with the ones in your directory:
mv server.key.250930102056 server.key
mv server.crt.250930102056 server.crt
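A slightly more cautious variant of the two mv commands keeps the regenerated files instead of overwriting them, and refuses to run unless both backups exist. This is a sketch under assumptions: the function name restore_cert_pair and the ".regenerated" suffix are hypothetical choices, not anything the appliance defines.

```shell
#!/bin/sh
# restore_cert_pair DIR TIMESTAMP
# Moves server.key.TIMESTAMP and server.crt.TIMESTAMP back into place in DIR.
# The regenerated files are kept with a ".regenerated" suffix (a naming
# choice made for this sketch) rather than being overwritten.
restore_cert_pair() {
    dir=$1
    ts=$2
    # Refuse to touch anything unless both backups exist.
    for name in server.key server.crt; do
        if [ ! -f "$dir/$name.$ts" ]; then
            echo "missing backup: $dir/$name.$ts" >&2
            return 1
        fi
    done
    for name in server.key server.crt; do
        if [ -f "$dir/$name" ]; then
            mv "$dir/$name" "$dir/$name.regenerated" || return 1
        fi
        mv "$dir/$name.$ts" "$dir/$name" || return 1
    done
}

# Example (timestamp taken from the text above; substitute your own):
# restore_cert_pair /opt/vmware/vlcm/cert 250930102056
```

Keeping the regenerated pair around makes it easy to roll back if the restored certificate turns out not to be the one you wanted.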
Restart the NGINX service
systemctl restart nginx
Restart the VCF Operations fleet management appliance service
systemctl restart vrlcm-server.service
Check the status of the service
systemctl status vrlcm-server.service
Once the service has finished starting, you should be good to go.
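Rather than re-running the status command by hand, a small polling helper can wait for the unit to become active. This is a sketch, not an official procedure: the helper name wait_until and the 300-second timeout are arbitrary choices, and a unit reporting "active" in systemd does not by itself guarantee the application has finished initializing.

```shell
#!/bin/sh
# wait_until TIMEOUT_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND about once per second until it succeeds (returns 0)
# or roughly TIMEOUT_SECONDS have elapsed (returns 1).
wait_until() {
    timeout=$1
    shift
    elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example (the timeout value is an assumption; adjust to your environment):
# wait_until 300 systemctl is-active --quiet vrlcm-server.service \
#     && echo "vrlcm-server is active"
```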