Comprehensive Guide to Deploying a vIDM 8.x Cluster with an NSX-V Load Balancer

Sep 4, 2020 (4 min read)

Updated: Apr 27, 2021

Finding a working configuration for building a vIDM 8.x cluster was a real struggle for me.

So I thought I'd share what worked while I was building a cluster in my lab.

Let's start.

Creating CA Signed Certificates

The first step is to get the CA-signed certificate for vIDM ready. To do this, generate a CSR (Certificate Signing Request) from vRLCM using the Generate CSR option under Locker --- Certificate --- Generate CSR

Then fill in all the details requested in the Generate CSR form.

Remember that the Common Name should always be your load balancer FQDN

Server Domain/Hostname should contain all the hostnames that will be part of the vIDM cluster

The IP addresses of all the vIDM nodes in this cluster must be listed as well
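The three rules above are easy to get wrong when filling in the form, so here is a minimal sketch of a sanity check you could run over your planned values before generating the CSR. The `CsrRequest` class and `check_csr` function are my own illustration, not part of vRLCM, and the node hostnames and IP addresses in the usage note are made-up lab values.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CsrRequest:
    """Values you plan to type into the Generate CSR form (hypothetical model)."""
    common_name: str
    san_hostnames: List[str]
    ip_addresses: List[str]


def check_csr(req, lb_fqdn, node_fqdns, node_ips):
    """Return a list of problems; an empty list means the inputs look consistent."""
    problems = []
    # Rule 1: the Common Name must be the load balancer FQDN.
    if req.common_name.lower() != lb_fqdn.lower():
        problems.append(f"Common Name should be {lb_fqdn}")
    # Rule 2: every cluster node hostname must be listed.
    listed = {name.lower() for name in req.san_hostnames}
    for host in node_fqdns:
        if host.lower() not in listed:
            problems.append(f"missing hostname: {host}")
    # Rule 3: every cluster node IP address must be listed.
    for ip in node_ips:
        if ip not in req.ip_addresses:
            problems.append(f"missing IP address: {ip}")
    return problems
```

For example, calling `check_csr(CsrRequest("premidm.prem.com", ["idm1.prem.com"], ["10.0.0.11"]), "premidm.prem.com", ["idm1.prem.com", "idm2.prem.com"], ["10.0.0.11"])` would report the second node as a missing hostname.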

Once we click GENERATE after filling in all the details as shown above, a PEM file is downloaded.

This PEM file must be given to your certificate authority to be signed. The result is the CA-signed certificate we will use to deploy the vIDM cluster

Your certificate authority will give you a key file and a signed certificate

Once we have these files, we can import the certificate into vRLCM

To import the certificate into vRLCM, go to Locker, click Certificate, and then click Import Certificate

We have two files at hand: the first is vidm.key and the second is the premidm certificate.

Open vidm.key in Notepad, copy its content, and paste it into the Private Key section

Open the premidm certificate in Notepad, copy its content, and paste it into the Certificate Chain section

Then click IMPORT to import the certificate into Locker
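Pasting the wrong file into the wrong field is an easy mistake here. The sketch below is only an illustration of a pre-paste check: it looks at the PEM block labels in the two pieces of text and does not validate the certificate itself. The function names are my own.

```python
import re

# Matches one PEM block and captures its label, e.g. "CERTIFICATE" or "RSA PRIVATE KEY".
PEM_BLOCK = re.compile(
    r"-----BEGIN ([A-Z0-9 ]+)-----\r?\n(.+?)\r?\n-----END \1-----",
    re.DOTALL,
)


def pem_blocks(text):
    """Return (label, full_block) tuples for every PEM block found in text."""
    return [(m.group(1), m.group(0)) for m in PEM_BLOCK.finditer(text)]


def check_locker_inputs(key_text, chain_text):
    """Return a list of problems with the text destined for the two Locker fields."""
    key_labels = [label for label, _ in pem_blocks(key_text)]
    chain_labels = [label for label, _ in pem_blocks(chain_text)]
    problems = []
    if len(key_labels) != 1 or not key_labels[0].endswith("PRIVATE KEY"):
        problems.append("Private Key should hold exactly one private key block")
    if not chain_labels or any(label != "CERTIFICATE" for label in chain_labels):
        problems.append("Certificate Chain should hold only CERTIFICATE blocks")
    return problems
```

You could read vidm.key and the certificate file with `open(...).read()` and pass both strings to `check_locker_inputs`; an empty list means the labels look right.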

Now that the CA-signed certificate for our vIDM cluster is imported into vRLCM, the next step is to import the same certificate into the NSX-V load balancer

Upload vIDM certificate chain and the corresponding root CA certificates onto NSX-V Edge

Browse to the NSX-V Edge, then under the Configure tab, click Certificates and add a new certificate

Enter the certificate details the same way we did before; this imports the certificate onto the Edge

Once the above step is done, the server certificate is imported into the Edge

The root certificate has to be added in the same way, after exporting it from the server certificate

Paste the exported content into a separate file and save it as root.cer. Then, on the NSX Edge under Configure, click Add new CA certificate

That's it. You now have both the root and the server certificates in place

Configuring NSX-V Loadbalancer to support a clustered vIDM

Application Profiles

Here is the Application Profile configuration needed to support the vIDM load balancer

Application Profile Type: HTTPS End to End

Persistence: Cookie

Cookie Name: JSESSIONID

Mode: App Session

Expires in: 3600

Insert X-Forwarded-For HTTP header: Enabled

Under the Client SSL tab

Client Authentication: Ignore

Under Server SSL, Server Authentication must be enabled

Service Monitoring

Under Service Monitoring

Interval: 5

Timeout: 10

Max Retries: 3

Type: HTTPS

Expected: 200

Method: GET

URL: /SAAS/API/1.0/REST/system/health/heartbeat

Make sure there are no mistakes when typing or copying the URL
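Before relying on the monitor, it can help to issue the same GET yourself against each node. This is a minimal sketch using only the Python standard library; the function names are mine, and `verify_tls=False` is an assumption for lab setups where the client does not yet trust the CA.

```python
import ssl
import urllib.error
import urllib.request

# Health check path used by the service monitor in this post.
HEARTBEAT_PATH = "/SAAS/API/1.0/REST/system/health/heartbeat"


def heartbeat_url(host):
    """Build the URL the monitor requests for a given node."""
    return f"https://{host}{HEARTBEAT_PATH}"


def heartbeat_status(host, timeout=10, verify_tls=True):
    """Return the HTTP status code of the heartbeat GET, or None if unreachable."""
    context = ssl.create_default_context()
    if not verify_tls:
        # Lab-only shortcut: skip certificate verification.
        context.check_hostname = False
        context.verify_mode = ssl.CERT_NONE
    try:
        with urllib.request.urlopen(
            heartbeat_url(host), timeout=timeout, context=context
        ) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered, but with an error status such as 503.
        return err.code
    except (urllib.error.URLError, OSError):
        return None
```

A node is healthy from the monitor's point of view when `heartbeat_status(node)` returns 200, which is the Expected value configured above.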

Pools

Algorithm: LEASTCONN

IP Filter: IPv4

Monitor Port: 443

Virtual Servers

Virtual Server: Enable

Protocol: HTTPS

Port/Port Range: 443
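To keep all of these values in one place, the settings above can be written down as data and compared against whatever you read off the Edge. This is only a sketch: the key names are my own shorthand, not NSX-V API field names, and `drift` is a hypothetical helper.

```python
# Settings from this post, keyed by my own shorthand names (not NSX-V API fields).
EXPECTED = {
    "application_profile": {
        "type": "HTTPS End to End",
        "persistence": "Cookie",
        "cookie_name": "JSESSIONID",
        "mode": "App Session",
        "expires_in": 3600,
        "insert_x_forwarded_for": True,
        "client_authentication": "Ignore",
        "server_authentication": True,
    },
    "service_monitor": {
        "interval": 5,
        "timeout": 10,
        "max_retries": 3,
        "type": "HTTPS",
        "expected": "200",
        "method": "GET",
        "url": "/SAAS/API/1.0/REST/system/health/heartbeat",
    },
    "pool": {"algorithm": "LEASTCONN", "ip_filter": "IPv4", "monitor_port": 443},
    "virtual_server": {"enabled": True, "protocol": "HTTPS", "port": 443},
}


def drift(expected, actual):
    """Return (section, key, expected, actual) for every setting that differs."""
    diffs = []
    for section, settings in expected.items():
        current = actual.get(section, {})
        for key, want in settings.items():
            have = current.get(key)
            if have != want:
                diffs.append((section, key, want, have))
    return diffs
```

Fill a second dictionary with the values you actually configured and pass both to `drift`; an empty list means nothing deviates from the settings listed here.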

Creating a Request in vRLCM for a Clustered vIDM deployment

This is not a complicated task, so I won't discuss it in detail. One important aspect is the DELEGATE IP: make sure this IP is not resolvable

This will be used during the PGPOOL configuration of your vIDM cluster.
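One way to check that the delegate IP is not resolvable is a reverse DNS lookup. The sketch below assumes that "not resolvable" means no hostname comes back for the IP, which is my reading of the requirement; the `resolves` helper is hypothetical and the IP in the usage note is a made-up lab value.

```python
import socket


def resolves(ip, reverse_lookup=socket.gethostbyaddr):
    """Return True if a reverse lookup of ip returns a hostname, False otherwise."""
    try:
        reverse_lookup(ip)
        return True
    except (socket.herror, socket.gaierror):
        # No PTR record (or no usable answer) for this address.
        return False
```

For a delegate IP such as 10.0.0.20 you would want `resolves("10.0.0.20")` to return False before submitting the request. The `reverse_lookup` parameter only exists so the function can be exercised without a DNS server.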

Before we submit the request, we need to make sure all pre-validations are successful.

If the certificate steps above are not done, your vIDM deployment will fail at Stage 3 as shown below

com.vmware.vrealize.lcm.common.exception.EngineException: vIDM install prevalidation failed
	at com.vmware.vrealize.lcm.vidm.core.task.VidmInstallPrecheckTask.execute(VidmInstallPrecheckTask.java:62)
	at com.vmware.vrealize.lcm.automata.core.TaskThread.run(TaskThread.java:45)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)

A vIDM precheck report is attached below for reference.

Once all the prechecks are successful, we can submit the request.

Post-Submission Stages during Cluster Deployment

A clustered vIDM deployment has 20 stages. They are:

Stage 1: validateenvironmentdata

As the name suggests, this stage validates the environment data submitted

Stage 2: infraprevalidation

The infrastructure details provided are validated; these are the same checks performed during the prechecks

Stage 3: vidmprevalidation

vIDM validations are performed, the same as the ones in the prechecks

Stage 4: deployvidm

Deploys the vIDM OVAs on vCenter

Stage 5: vidmconfiguremaster,vidmprepareslave,vidmprepareslave

During this stage, the master (the first node in your vIDM cluster) is configured, and the slaves (the second and third nodes) are prepared

Note: Stage 5 includes a phase called VidmFQDNUpdate. If a failure is observed in this phase, as in the screenshot below, your primary vIDM appliance is trying to reach your vIDM LB and expects a valid response, which it is not getting.

The following entries will be seen in /opt/vmware/horizon/workspace/configurator.log


2020-09-03T12:37:37,599 INFO  (Thread-146) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:06:20,493 INFO  (Thread-3) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:14:39,678 INFO  (Thread-189) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:14:49,407 INFO  (Thread-189) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
2020-09-03T13:17:01,765 INFO  (Thread-189) [;;;] com.vmware.horizon.svadmin.service.ApplicationSetupService - Invalid status code validating FQDN: https://premidm.prem.com : 503
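If the log is long, a small script can pull out just these entries. This is a minimal sketch based only on the line format shown above; the regular expression and function names are my own.

```python
import re
from collections import Counter

# Matches the "Invalid status code validating FQDN" entries shown above.
FQDN_FAILURE = re.compile(
    r"^(?P<timestamp>\S+)\s+INFO\s+\((?P<thread>[^)]+)\).*"
    r"Invalid status code validating FQDN: (?P<url>\S+) : (?P<status>\d{3})"
)


def fqdn_failures(lines):
    """Return (timestamp, url, status) for every FQDN validation failure."""
    hits = []
    for line in lines:
        match = FQDN_FAILURE.search(line)
        if match:
            hits.append((match["timestamp"], match["url"], int(match["status"])))
    return hits


def summarize(lines):
    """Count failures per (url, status) pair."""
    return Counter((url, status) for _, url, status in fqdn_failures(lines))
```

Passing the open log file to `summarize` gives a quick count of how many times each FQDN returned each status, for example how often the LB answered 503.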

At this point, you need to check whether your primary vIDM is responsive

I first check whether https://<<primaryvidmfqdn>>/ is responsive and the vIDM landing page opens

If this works, I then check whether https://<<vidmlbhostname>>/ responds or redirects me to an available node and gives me the vIDM landing page. If this also works, you would not see the above failure in Stage 5. If it does not, you will have to fix your load balancer configuration before you can proceed. No matter how many retries you perform, you will only see that one line in configurator.log and nothing else.
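The same two checks can be scripted. The sketch below is an illustration only: `landing_status` fetches the root page without following redirects, and `diagnose` encodes the order of checks described above. Both names are my own, and the hostnames you pass in are whatever your primary node and LB are called.

```python
import http.client
import ssl


def landing_status(host, timeout=10, context=None):
    """Return the status of GET https://host/ (redirects not followed), or None."""
    conn = http.client.HTTPSConnection(
        host, timeout=timeout, context=context or ssl.create_default_context()
    )
    try:
        conn.request("GET", "/")
        return conn.getresponse().status
    except (OSError, http.client.HTTPException):
        return None
    finally:
        conn.close()


def diagnose(primary_status, lb_status):
    """Say where to look next, given the two status codes (None = no response)."""
    def ok(status):
        # A 2xx answer or a 3xx redirect both count as a valid response.
        return status is not None and 200 <= status < 400

    if not ok(primary_status):
        return "primary"
    if not ok(lb_status):
        return "load_balancer"
    return "ok"
```

Calling `diagnose(landing_status(primary), landing_status(lb))` returns "load_balancer" in the situation from the log above, where the primary answers but the LB returns 503.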

At this stage, your LB pool status should show your primary vIDM appliance as up

Stage 6: vidmconfigureslave

Stage 7: vidmclusterverify

Stage 8: vidmclusterverify

Stage 9: vidmenableconnector

Stage 10:

Stage 11: vidmpreparemasterpgpool

Stage 12: vidmconfigurepgpool

Note:

You might encounter an exception in the VidmAddSSHPostgresKeys task during this stage. Your vIDM appliances are rebooted at this point, and if they do not come back up in time, LCM will fail to execute the scripts that complete the deployment. All we need to do is confirm that SSH to the nodes is working and then perform a Retry
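Rather than checking each node by hand, you can poll until the SSH port answers on all of them and only then hit Retry. This is a minimal sketch under the assumption that an open TCP port 22 is a good enough signal; it does not attempt a login. The function names and default timings are my own.

```python
import socket
import time


def ssh_port_open(host, port=22, timeout=3):
    """Return True if host accepts a TCP connection on the SSH port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_ssh(hosts, attempts=30, delay=10, probe=ssh_port_open, sleep=time.sleep):
    """Poll until every host passes the probe; return the hosts still failing."""
    pending = list(hosts)
    for attempt in range(attempts):
        pending = [host for host in pending if not probe(host)]
        if not pending:
            break
        if attempt < attempts - 1:
            sleep(delay)
    return pending
```

Call `wait_for_ssh` with your three node hostnames; an empty list means all nodes are reachable and it is reasonable to retry the failed request in vRLCM.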

Stage 13: vidmstartmasterpgpoolservices