Finding a working configuration for building a vIDM 8.x cluster turned out to be surprisingly hard.
So I thought I'd share what worked for me while building a cluster in my lab.
Let's start.
Creating CA Signed Certificates
The first step is to have the CA-signed certificate for vIDM ready. You can generate a CSR (Certificate Signing Request) from vRLCM under Locker > Certificate > Generate CSR.

Then fill in all the details requested in the Generate CSR form.
Remember, the Common Name should always be your Load Balancer FQDN.
Server Domain/Hostname should contain all the hostnames that will be part of the vIDM cluster.
All IP addresses corresponding to the vIDM nodes in this cluster must be listed as well.
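To keep the three rules above straight before filling in the form, here is a minimal Python sketch that collects the values in one place. The function name and dictionary keys are my own, not anything vRLCM exposes, and adding the LB FQDN to the hostname list is an assumption on my part rather than a documented requirement.

```python
import ipaddress


def build_csr_fields(lb_fqdn, node_fqdns, node_ips):
    """Collect the Generate CSR form values (hypothetical helper, not a vRLCM API)."""
    # Parsing each address catches typos before the CSR is generated.
    ips = [str(ipaddress.ip_address(ip)) for ip in node_ips]
    # Assumption: list the LB FQDN first, then every node; drop duplicates in order.
    hostnames = list(dict.fromkeys([lb_fqdn, *node_fqdns]))
    return {
        "common_name": lb_fqdn,      # Common Name = Load Balancer FQDN
        "hostnames": hostnames,      # Server Domain/Hostname entries
        "ip_addresses": ips,         # one IP per vIDM node
    }
```

For example, build_csr_fields("premidm.prem.com", ["idm1.prem.com", "idm2.prem.com", "idm3.prem.com"], ["10.0.0.11", "10.0.0.12", "10.0.0.13"]) returns the values to type into the form. The node names and IP addresses here are made up for illustration.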

Once we click GENERATE after filling in all the details as shown above, a PEM file is downloaded.

This PEM file must be given to your certificate authority to be signed. The resulting CA-signed certificate is what we will use to deploy the vIDM cluster.
Your authority will hand back a key file and a signed certificate.

Once we have these files, we import the certificate into vRLCM.
To import the certificate into vRLCM, go to Locker, click Certificate, and then Import Certificate.

We have two files at hand: the first is vidm.key and the second is the premidm certificate.
Open vidm.key in Notepad and paste its contents into the Private Key section.
Open the premidm certificate in Notepad and paste its contents into the Certificate Chain section.

Then click IMPORT to import the certificate into Locker.
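Pasting the wrong file into the wrong field is an easy mistake, so here is a small Python sketch that checks the two pieces of text before you paste them. It only looks at the PEM BEGIN/END markers; it does not verify that the key matches the certificate or that the chain is valid. The function names are hypothetical.

```python
import re

# Matches one PEM block and captures its label, e.g. "CERTIFICATE".
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\s+(.*?)\s+-----END \1-----",
    re.DOTALL,
)


def pem_block_types(text):
    """Return the labels of all PEM blocks found in text, in order."""
    return [match.group(1) for match in PEM_BLOCK.finditer(text)]


def check_import_inputs(private_key_text, chain_text):
    """Return a list of problems; an empty list means the markers look right."""
    problems = []
    key_types = pem_block_types(private_key_text)
    if len(key_types) != 1 or not key_types[0].endswith("PRIVATE KEY"):
        problems.append("Private Key field should hold exactly one PRIVATE KEY block")
    chain_types = pem_block_types(chain_text)
    if not chain_types or any(label != "CERTIFICATE" for label in chain_types):
        problems.append("Certificate Chain field should hold only CERTIFICATE blocks")
    return problems
```

Read vidm.key and the certificate file into strings and pass them to check_import_inputs; anything it returns is worth fixing before clicking IMPORT.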



With the CA-signed certificate for the vIDM cluster imported into vRLCM, the next step is to import the same certificate into the NSX-V load balancer.
Upload the vIDM certificate chain and the corresponding root CA certificate to the NSX-V Edge.
Browse to the NSX-V Edge, then under the Configure tab click Certificates and add a new certificate.
Enter the certificate details the same way we did before; this imports the certificate onto the Edge.

Once the above step is done, the server certificate is imported into the Edge.
The root certificate has to be added the same way, after exporting it from the server certificate.

Paste this content into a separate file and save it as root.cer. Then, on the NSX Edge, go to Configure and click Add new CA certificate.

That's it. You now have both the ROOT and the SERVER certificates in place.

Configuring the NSX-V Load Balancer to support a clustered vIDM
Application Profiles
Here's the configuration the Application Profile needs in order to support the vIDM LB:
Application Profile Type: HTTPS End to End
Persistence: Cookie
Cookie Name: JSESSIONID
Mode: App Session
Expires in: 3600
Insert X-Forwarded-For HTTP header: Enabled

Under Client SSL tab
Client Authentication: Ignore


Under Server SSL, Server Authentication must be enabled


Service Monitoring
Under Service Monitoring
Interval: 5
Timeout: 10
Max Retries: 3
Type: HTTPS
Expected: 200
Method: GET
URL: /SAAS/API/1.0/REST/system/health/heartbeat

Make sure there are no mistakes when typing or copying the URL.
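Since a typo in the monitor URL is enough to mark every pool member down, it can help to request the same URL yourself. Below is a minimal Python sketch that does a GET against the heartbeat path and reports the status code, which should be 200 to match the monitor's Expected value. The function names are my own, and the verify_tls switch is a lab-only shortcut I added for clients that do not trust the CA yet.

```python
import ssl
import urllib.error
import urllib.request

HEARTBEAT_PATH = "/SAAS/API/1.0/REST/system/health/heartbeat"


def heartbeat_url(host):
    """Build the same URL the service monitor requests."""
    return f"https://{host}{HEARTBEAT_PATH}"


def heartbeat_status(host, timeout=10, verify_tls=True):
    """Return the HTTP status code of the heartbeat URL, or None if unreachable."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Lab-only: skip certificate checks when the client lacks the root CA.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(
            heartbeat_url(host), timeout=timeout, context=context
        ) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, just not with a success code (for example 503).
        return err.code
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, TLS failure or timeout.
        return None
```

Call heartbeat_status once per vIDM node and once for the LB FQDN. A node that returns anything other than 200 would be marked down by a monitor configured as above.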
Pools
Algorithm: LEASTCONN
IP Filter: IPv4
Monitor Port: 443


Virtual Servers
Virtual Server: Enable
Protocol: HTTPS
Port/Port Range: 443

Creating a Request in vRLCM for a Clustered vIDM deployment
This is not a complicated task, so I won't go into much detail. One important aspect is the DELEGATE IP: ensure this IP is not resolvable.
It will be used during the PGPOOL configuration of your vIDM cluster.
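One way to check the delegate IP before submitting the request is a reverse DNS lookup. This is a minimal Python sketch under the assumption that "not resolvable" means the IP has no reverse (PTR) record; the function name is hypothetical, and the reverse_lookup parameter exists only so the check can be exercised without real DNS.

```python
import ipaddress
import socket


def delegate_ip_is_unresolvable(ip, reverse_lookup=socket.gethostbyaddr):
    """Return True when a reverse lookup of ip finds no hostname."""
    # Raises ValueError early if the address itself is malformed.
    ipaddress.ip_address(ip)
    try:
        reverse_lookup(ip)
    except (socket.herror, socket.gaierror):
        # No record found: this is what we want for the delegate IP.
        return True
    return False
```

Run it from a machine that uses the same DNS servers as the vIDM nodes; a False result means some record still points at that address.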
Before we submit the request, we need to ensure all pre-validations are successful.
If the certificate steps above are not done, your vIDM deployment will fail at Stage 3, as shown below.

com.vmware.vrealize.lcm.common.exception.EngineException: vIDM install prevalidation failed
at com.vmware.vrealize.lcm.vidm.core.task.VidmInstallPrecheckTask.execute(VidmInstallPrecheckTask.java:62)
at com.vmware.vrealize.lcm.automata.core.TaskThread.run(TaskThread.java:45)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
A vIDM precheck report is attached below for reference.
Once all the prechecks are successful, we can submit the request.
Post-Submission Stages during Cluster Deployment
There are 20 stages in a clustered vIDM deployment:
Stage 1: validateenvironmentdata
As the name suggests, this stage validates the environment data submitted.

Stage 2: infraprevalidation
The infrastructure details provided are validated, with the same checks that were performed during the prechecks.

Stage 3: vidmprevalidation
The vIDM validations are performed, the same as the ones in the prechecks.

Stage 4: deployvidm
Deploys the vIDM OVAs on vCenter.

Stage 5: vidmconfiguremaster,vidmprepareslave,vidmprepareslave
During this stage, the master (the first node in your vIDM cluster) is configured, and the slaves (the second and third nodes) are prepared.

Note: Stage 5 includes a phase called VidmFQDNUpdate. If it fails, as shown in the screenshot below, your primary vIDM appliance is trying to reach your vIDM LB and expecting a valid response, which it is not getting.

The following entries will be seen in /opt/vmware/horizon/workspace/configurator.log:
2020-09-03T12:37:37,599 INFO (Thread-146) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:06:20,493 INFO (Thread-3) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:14:39,678 INFO (Thread-189) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:14:49,407 INFO (Thread-189) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:17:01,765 INFO (Thread-189) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
At this point you need to check whether your primary vIDM is responsive.
I first check that https://<<primaryvidmfqdn>>/ responds and the vIDM landing page opens.
If this works, I then check whether https://<<vidmlbhostname>>/ responds or redirects me to the available node and gives me the vIDM landing page. If it does, you would not see the above failure in Stage 5; if it does not, you will have to fix your load balancer configuration before you can proceed. No matter how many retries you perform, you will only see that one line in configurator.log and nothing else.
At this stage, your LB pool status should show UP for your primary vIDM appliance.
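Because configurator.log only ever shows that one repeated line, a quick count of which URL and status code keep failing confirms that the LB is the culprit. Here is a minimal Python sketch that picks out the "Invalid status code validating FQDN" entries in the format shown above; the regular expression is fitted to those sample lines only, and the function names are my own.

```python
import re
from collections import Counter

# Fitted to the sample lines above: timestamp, level, ..., message, URL, status.
FQDN_FAILURE = re.compile(
    r"^(?P<timestamp>\S+) \w+ .*Invalid status code validating FQDN: "
    r"(?P<url>\S+) : (?P<status>\d{3})\s*$"
)


def fqdn_failures(lines):
    """Return (timestamp, url, status) for every FQDN validation failure."""
    hits = []
    for line in lines:
        match = FQDN_FAILURE.match(line)
        if match:
            hits.append(
                (match.group("timestamp"), match.group("url"), int(match.group("status")))
            )
    return hits


def summarize(lines):
    """Count failures per (url, status) pair."""
    return Counter((url, status) for _, url, status in fqdn_failures(lines))
```

Feed it the lines of configurator.log (for example by iterating over the open file). A steady run of 503 against the LB FQDN while the primary node itself answers fine points at the load balancer configuration.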

Stage 6: vidmconfigureslave

Stage 7: vidmclusterverify

Stage 8: vidmclusterverify

Stage 9: vidmenableconnector

Stage 10:

Stage 11: vidmpreparemasterpgpool

Stage 12: vidmconfigurepgpool

Note:
You might encounter an exception in the VidmAddSSHPostgresKeys task during this stage. What actually happens is that the vIDM appliances are rebooted, and if they do not come back up in time, LCM fails to execute the scripts that complete the deployment. All we need to do is confirm that SSH to the nodes is working and then perform a Retry.
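Rather than guessing when the appliances are back, you can poll the SSH port on each node and click Retry once all of them answer. This is a minimal Python sketch; it only checks that the TCP port accepts connections, not that an SSH login succeeds, and the function name, timeouts and intervals are my own choices.

```python
import socket
import time


def wait_for_ssh(hosts, port=22, timeout=600, interval=10, connect_timeout=5):
    """Poll until every host accepts TCP connections on port.

    Returns the hosts still unreachable at the deadline (empty list = all up).
    """
    pending = list(hosts)
    deadline = time.monotonic() + timeout
    while pending:
        still_down = []
        for host in pending:
            try:
                # Opening and closing the connection is enough for this check.
                with socket.create_connection((host, port), timeout=connect_timeout):
                    pass
            except OSError:
                still_down.append(host)
        pending = still_down
        if not pending or time.monotonic() >= deadline:
            break
        time.sleep(interval)
    return pending
```

Pass it the FQDNs of the three vIDM nodes; when it returns an empty list, go back to vRLCM and perform the Retry.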
Stage 13: vidmstartmasterpgpoolservices

Stage 14: vidmstartslavepgpoolservices

Stage 15: vidminitialconfigprep

Stage 16: savevmoidtoinventory

Stage 17: environmentupdate

Stage 18: notificationschedules

Stage 19: setauthprovider

Stage 20: vidmClusterHealthScheduler

That's it. Your vIDM cluster is now completely deployed.
Post Deployment Checks
When we browse to Environments, we can see the globalenvironment fully configured.

You may also use the "Trigger Cluster Health" request.



The pool status also shows that all the nodes of the vIDM cluster are fully functional and UP.

And when you browse to https://<<vidmlbhostname>>/ you should see the landing page.
