Search Results

  • Implementing vRA 7.5 HF 16 through vRLCM

    Patching vRealize Automation takes real effort. It needs planning, and the appropriate prerequisite steps must be completed before the patch is applied. vRealize Suite Lifecycle Manager (vRLCM) eases this process by helping us manage and patch vRA environments.

    Let's look at how we use vRLCM to patch vRA. In this case, I am using the following versions:

      • vRA 7.5 GA, which will be patched with HF16
      • vRLCM 2.1 Patch 2

    My vRealize Automation 7.5 environment is managed by vRLCM. Before we start patching, vRLCM provides a feature to take a snapshot of the vRA environment it manages.

    Note: It does not take a snapshot of the database node (IAAS DB); we have to take that one manually.

    Once the snapshots are taken:

      1. Go to the Product Support pane and download the patch to apply. In our case, that is vRealize Automation 7.5 HF16.
      2. Once the download is complete, click the three dots on the right side of the environment, then click Patches -> History. Nothing is listed, because no patches are installed yet.
      3. Click the three dots on the right side of the environment again, then click Patches -> Install Patch.
      4. Select the patch to install. That is HF16, the only one we downloaded.
      5. Review, then click Install.

    Under Requests, you can see that a patching request has just started. After 1 hour and 7 minutes, the patching completes and the request is marked as completed inside vRLCM. The same is visible in VAMI, where the patch history pane shows that the environment is running HF16. The environment pane is updated as well. This is how a vRA environment is patched using vRLCM.

    The logs to monitor while the patch is being applied are on the vRA appliance (MASTER node):

      /var/log/vmware/vcac/vcac-config.log
      /var/log/messages

    To filter only the patch logs, run:

      tailf /var/log/vmware/vcac/vcac-config.log | grep -i cluster.patch

    Note: Ensure the timezone on the appliance is set to UTC; this is mandatory.

    The .txt file below has the complete snippet, from the point the patch activity started to the point it finished.
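For offline review of a copied vcac-config.log, the same filter as the grep above can be sketched in Python. This is a minimal illustration, not part of the original procedure: the function name `patch_lines` and the sample log lines in the usage note are hypothetical. As with `grep -i cluster.patch`, the `.` matches any single character and the match is case-insensitive.

```python
import re
from typing import Iterable, Iterator

# Same pattern as `grep -i cluster.patch`: "." matches any character,
# and the match ignores case.
PATCH_PATTERN = re.compile(r"cluster.patch", re.IGNORECASE)


def patch_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that mention the cluster patch activity."""
    for line in lines:
        if PATCH_PATTERN.search(line):
            yield line.rstrip("\n")
```

A typical use would be `for line in patch_lines(open("vcac-config.log")): print(line)` against a copy of the log taken from the appliance.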

  • High CPU utilization and OutOfMemory issue with vIDM 3.3.x integrated with vRA 8.x

    Environment

    vIDM 3.3.1 / vIDM 3.3.2 integrated with vRA 8.0 / vRA 8.1 and LCM.

    Cause

    The initial analysis indicates potential issues around:

      • The internal postgres query cache
      • A considerable number of SCIM API calls triggered by the vRA "DB Cache replication Thread"
      • The number of concurrent logins and the number of users synced

    Resolution

    For the OutOfMemory issue, the immediate fix is to increase the memory size and the number of CPUs. We found that the vIDM setup is stable with 4 CPUs and 16 GB of RAM. The advice is to make 4 CPUs and 16 GB RAM the default configuration instead of 2 CPUs and 6 GB RAM (the existing default hardware configuration in the vIDM 3.3.1 and vIDM 3.3.2 OVAs).

  • vRA fails to deploy from vRSLCM if Secondary or Tertiary DNS servers are unable to resolve hostnames

    I attempted to install vRA 8.x quite a number of times, but was never successful because of a simple problem. Let me explain what it was.

    Every time, the install failed at the point where it was installing client-secrets:

      Release "client-secrets" does not exist. Installing it now.
      Error: Job failed: BackoffLimitExceeded
      helm failed to upgrade 'client-secrets' in namespace 'prelude'

    Note: The above snippet is taken from deploy.log.

    When we check csp-fixture-job-XXXX.log under /services-logs/csp-clients-fixture, we see that curl could not resolve the host:

      Logging in
        % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                       Dload  Upload   Total   Spent    Left  Speed
        0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
      curl: (6) Could not resolve host: premvra.prem.com

    But before we started the install, we did cross-check that nslookup against my DNS was working fine, so why this problem?

    The nodes are:

      • premvra, which is our vRA node
      • premidm, which is our vIDM node
      • premlcm, which is our vRLCM node

    When you trigger the Easy Installer, it asks for network information. The first DNS server in my case is my Windows Active Directory server, which has forward and reverse lookup zones configured and contains all the DNS records for premlcm, premidm and premvra as well as the rest of the VMware environment. The second DNS server, 10.yy.yy.yy, is our router, which also functions as a DNS server for all other systems outside my lab environment. This router cannot resolve anything within the DNS zone hosted on the MS DNS server, but it is reachable from all systems.

    During the stage of the vRA installation in which client-secrets is installed, certain POST calls are made in the background for a few registrations. From my research, it looks like a round-robin load-balancing mechanism is used when multiple DNS servers are configured. In my case, the servers (premlcm, premvra and premidm) can only be resolved through my primary DNS. If a POST call goes through the secondary DNS for name resolution, it fails and throws the exception below:

      2020-04-28 10:03:41.430+0000 ERROR 43 --- [or-http-epoll-1] c.v.i.common.util.HealthUtilComponent : premidm.prem.com: Name or service not known
      java.net.UnknownHostException: premidm.prem.com: Name or service not known
          at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[na:1.8.0_241]
          Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
      Error has been observed at the following site(s):
          |_ checkpoint ⇢ Request to POST https://premidm.prem.com/SAAS/API/1.0/oauth2/token?grant_type=client_credentials [DefaultWebClient]
      Stack trace:
          at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method) ~[na:1.8.0_241]
          at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:929) ~[na:1.8.0_241]
          at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1324) ~[na:1.8.0_241]
          at java.net.InetAddress.getAllByName0(InetAddress.java:1277) ~[na:1.8.0_241]

    After scrapping the existing deployment, I started the installation again with only one DNS server, which has entries for all the nodes and was able to resolve them, and finally the installation was successful.

    This scenario might occur in a lab where not all DNS servers are configured for name resolution, or even in production environments where DNS replication has issues.

    After numerous attempts, it was heartening to see the screen that says "INSTALLED".

    Every DNS server specified during installation must be able to resolve all three nodes; otherwise the installation will fail.
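A pre-install check along these lines can be sketched in Python: query every configured DNS server for every node name and report the pairs that fail. This is a rough sketch under stated assumptions, not something the installer provides. The helper names `nslookup_resolves` and `find_unresolvable` are hypothetical, and the default lookup assumes a BIND-style `nslookup` that exits non-zero when resolution fails (the Windows `nslookup` may not behave that way).

```python
import subprocess
from typing import Callable, Iterable, List, Tuple


def nslookup_resolves(server: str, host: str, timeout: float = 5.0) -> bool:
    """Return True if `nslookup <host> <server>` exits with status 0.

    Assumption: a BIND-style nslookup that returns non-zero on failure.
    """
    try:
        result = subprocess.run(
            ["nslookup", host, server],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


def find_unresolvable(
    servers: Iterable[str],
    hosts: Iterable[str],
    lookup: Callable[[str, str], bool] = nslookup_resolves,
) -> List[Tuple[str, str]]:
    """Return every (server, host) pair for which the lookup fails."""
    host_list = list(hosts)
    return [(s, h) for s in servers for h in host_list if not lookup(s, h)]
```

Calling `find_unresolvable` with both DNS server addresses and the three FQDNs (premlcm, premidm, premvra) should return an empty list before the install is started. Any pair it returns identifies a server that cannot resolve a node. The `lookup` parameter exists so the check can be exercised without touching the network.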

  • vRLI Cluster unresponsive as / partition full on 1 node due to multiple .hints files

    Recently we saw a situation where the root partition was full on a vRLI appliance that was part of a 3-node vRLI cluster. When this issue occurs, the cassandra service gets into a hung state, and the issue then starts impacting the other nodes in the cluster as well.

    cassandra.log shows the service unresponsive due to the space issue on the root partition:

      INFO [HANDSHAKE-XXXXXXX] 2020-03-04 10:47:57,384 OutboundTcpConnection.java:560 - Handshaking version with XXXXXXX
      INFO [RequestResponseStage-3] 2020-03-04 10:47:57,400 Gossiper.java:1019 - InetAddress /ZZZZZZZ is now UP
      INFO [GossipStage:1] 2020-03-04 10:47:58,379 StorageService.java:2292 - Node /ZZZZZZZ state jump to NORMAL
      ERROR [HintsWriteExecutor:1] 2020-03-04 10:48:24,194 CassandraDaemon.java:228 - Exception in thread Thread[HintsWriteExecutor:1,5,main]
      org.apache.cassandra.io.FSWriteError: java.io.IOException: No space left on device
          at org.apache.cassandra.hints.HintsWriteExecutor.flushInternal(HintsWriteExecutor.java:232) ~[apache-cassandra-3.11.2.jar:3.11.2]
          at org.apache.cassandra.hints.HintsWriteExecutor.flush(HintsWriteExecutor.java:203) ~[apache-cassandra-3.11.2.jar:3.11.2]
          at org.apache.cassandra.hints.HintsWriteExecutor.lambda$flush$1(HintsWriteExecutor.java:195) ~[apache-cassandra-3.11.2.jar:3.11.2]

    The root partition was occupied by a .hprof file, along with multiple .hints files and crc32 files being created in the /usr/lib/loginsight/application/lib/apache-cassandra-*/data/hints directory.

    Background on hints

    Hints are one of three ways to support consistency in the system. When a replica node is not available, the coordinator stores the mutating data in temporary hint files, to be replayed once the replica is available again. For details look here - https://cassandra.apache.org/doc/latest/operating/hints.html

    Ideally, in all vRLI deployments, hints are configured to be deleted after the default 3 hours. In some environments this does not work, and the hint files seem to stay there forever.

    Repair also runs automatically and is an additional way to support consistency in the system. Manual deletion of the hint files is the solution in this situation.

    This is a bug and will be addressed in upcoming releases of vRLI.
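Before deleting anything by hand, it can help to list which hint files are actually older than the 3-hour window mentioned above. The following is a minimal, read-only sketch. The function name `stale_hint_files` is hypothetical, and treating every `.hints` and `.crc32` file older than 3 hours as stale is an assumption based on the description in this post, not vRLI or Cassandra behaviour. The sketch only reports files and never removes them.

```python
import time
from pathlib import Path
from typing import List, Optional

# File types the post reports piling up in the hints directory.
HINT_SUFFIXES = {".hints", ".crc32"}


def stale_hint_files(
    directory: str,
    max_age_seconds: float = 3 * 3600,
    now: Optional[float] = None,
) -> List[Path]:
    """Return hint-related files whose mtime is older than max_age_seconds."""
    current = time.time() if now is None else now
    stale = []
    for path in Path(directory).iterdir():
        if path.is_file() and path.suffix in HINT_SUFFIXES:
            if current - path.stat().st_mtime > max_age_seconds:
                stale.append(path)
    return sorted(stale)
```

Pointing it at the resolved hints directory (the `apache-cassandra-*` part of the path has to be expanded first) gives a list to review before any manual cleanup.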

  • Roles in vRealize Automation 8.x

    Roles define a set of privileges associated with users. These privileges are tasks that a user can perform. Depending on responsibilities, a specific role will be assigned to a user.

    Organization roles are defined at the topmost vRealize Automation layer. The following organization roles exist in vRealize Automation:

      • Organization Owner
      • Organization Member

    Service roles define user access to individual services offered by vRealize Automation. The following services exist in vRealize Automation:

      • Cloud Assembly
      • Code Stream
      • Orchestrator
      • Service Broker

    Organization and Services

    All services offered by vRealize Automation belong to the only possible global organization. Any given service role must be associated with an organization role. Only one organization can exist in a vRealize Automation deployment.

    Each service in vRealize Automation has more than one role. For example, the Cloud Assembly service includes the following roles:

      • Cloud Assembly Administrator
      • Cloud Assembly User

    Any user in vRealize Automation must be assigned an organization role with an associated service role.
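The relationship between organization roles and service roles can be pictured as a small data model. The sketch below is illustrative only and is not the vRealize Automation API. The class `UserAssignment` and its `problems` method are hypothetical names, and the rule it checks is just the statement above: a user needs an organization role plus at least one associated service role.

```python
from dataclasses import dataclass, field
from typing import Dict, List

ORG_ROLES = {"Organization Owner", "Organization Member"}
SERVICES = {"Cloud Assembly", "Code Stream", "Orchestrator", "Service Broker"}


@dataclass
class UserAssignment:
    """One user's organization role and service roles (service -> role name)."""

    org_role: str
    service_roles: Dict[str, str] = field(default_factory=dict)

    def problems(self) -> List[str]:
        """Return a list of rule violations; an empty list means valid."""
        issues = []
        if self.org_role not in ORG_ROLES:
            issues.append(f"unknown organization role: {self.org_role}")
        if not self.service_roles:
            issues.append("no service role assigned")
        for service in self.service_roles:
            if service not in SERVICES:
                issues.append(f"unknown service: {service}")
        return issues
```

For example, `UserAssignment("Organization Member", {"Cloud Assembly": "Cloud Assembly User"}).problems()` returns an empty list, while an assignment with no service roles is flagged.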

  • vIDM Architecture

    The identity service runs as a pod in Kubernetes. If a user tries to log in to vRealize Automation:

      1. The identity service redirects the request to the VMware Identity Manager URL.
      2. The Identity Manager appliance validates the user credentials with Active Directory.
      3. The user can log in to the vRealize Automation console.

    The identity-db is a dedicated PostgreSQL database for the identity service. The URL to access the VMware Identity Manager appliance is set as a VIDM_HOST environment variable during installation. All requests to authorize credentials are forwarded to the VMware Identity Manager appliance.

    Administrators can use access policies to configure features such as mobile single sign-on (SSO), conditional access to applications based on enrollment and compliance status, and multifactor authentication. VMware products can use VMware Identity Manager as an enterprise SSO solution. VMware Identity Manager is based on the OAuth 2.0 authorization framework.

  • Authentication and Authorization

    Authentication

      • Confirms your identity
      • Verifies who you are
      • Requires login credentials

    Authorization

      • Determines what you have access to
      • Grants permission to access a resource
      • Requires a user role

  • Version table below vRA cluster tab showing incorrect status

    I was working on a vRA upgrade that was being performed using vRLCM. Most of the components were upgraded, apart from the DEM Orchestrator, which kept failing due to some unknown issue. Since this was the last component, we decided to go ahead and install it manually. It did install, and the whole upgrade was complete.

    But when we went to VAMI and checked the versions under the Cluster tab, the DEM Orchestrator status on the node was set to "Upgrade Failed".

    The information under the Cluster tab of VAMI comes from two places:

      1. The Management Agent installed on the IAAS node where the respective components are installed
      2. The cluster_nodes table of vPostgres

    Inspecting cluster_nodes, we find a ping_info column that holds the above information:

      {
        "installationDrive":"C:\\",
        "name":"XXXXXXX.adDEO",
        "certificate":{ },
        "metrics":{ },
        "metrics":{
          "cpuPercentage":0.0,
          "memory":52379648
        },
        "state":"Started",
        "version":"7.6.0.16195",
        "type":"DemOrchestrator",
        "condition":"Failed"
      }

    As you can see above, there is a "condition" tag inside the ping_info column of the cluster_nodes table. The Management Agent writes this value into the table, and it pulls the information from the registry of the IAAS node where the components are installed. To resolve the issue, we have to remove this condition tag.

    To remediate:

      1. Go to the following path: HKEY_LOCAL_MACHINE\SOFTWARE\WOW6432Node\VMware, Inc.\VMware vCloud Automation Center DEM
      2. Click DemInstance01 or DemInstance02 (depending on your environment).
      3. On the right side, check whether a ComponentCondition registry key exists.
      4. If it does, delete the registry key.
      5. Restart the Management Agent on the node.

    Now if you go back to VAMI, the version column and the data inside the ping_info column should have the appropriate information, and VAMI should show the correct data.
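To spot this state quickly, a ping_info value copied out of cluster_nodes can be checked for a "condition" of "Failed". This is a small sketch, assuming the value has already been exported as a JSON string. The function name `failed_condition` is hypothetical, and the sample in the usage note is an abridged form of the value shown in this post.

```python
import json
from typing import Optional


def failed_condition(ping_info: str) -> Optional[str]:
    """Describe the component if ping_info carries "condition": "Failed".

    Returns None when there is no failed condition. If a key appears
    twice in the JSON (as "metrics" does in the post), json.loads keeps
    the last occurrence.
    """
    info = json.loads(ping_info)
    if info.get("condition") == "Failed":
        return f'{info.get("type")} {info.get("version")}: condition=Failed'
    return None
```

For instance, passing `'{"state":"Started","version":"7.6.0.16195","type":"DemOrchestrator","condition":"Failed"}'` flags the DEM Orchestrator entry, and the same value without the "condition" key returns `None`, which is the state expected after the registry key is removed and the Management Agent is restarted.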

  • vSphere Web (Flash) Client Supportability and End of Life

    The vSphere Web Client uses Adobe Flash, and Adobe Flash is going End of Life (EOL) in Dec 2020. All browsers have aligned their efforts to disable or stop running Flash applications by this date. VMware vSphere 6.5 and 6.7 are supported until Nov 2021, and both versions ship with the vSphere Web Client (Flash). The vSphere Client (HTML5), starting with vSphere 6.7 Update 1, has become feature complete and supports all vSphere management capabilities. VMware recommends that customers upgrade VMware vCenter Server(s) to 6.7 Update 3 by Dec 2020 and use the vSphere Client (HTML5) to manage the vSphere environment.

    For more information, please review the following resources:

      • KB Article 78589
      • VMware vSphere Blog - vSphere Web Client Support beyond Adobe Flash EOL

  • Disable health broker service ( vrhb-service ) on vRA 7.x

    Perform the following steps to ensure that the vrhb service is disabled and not monitored by cron jobs.

    Note: Take snapshots (no memory, no quiescing).

      1. Stop the health service monitor by commenting out the cron job in /etc/cron.d/monitor-vrhb-cron.

      2. Kill any instances of the monitor that might be running:

           ps -A | grep monitor-vrhb.sh | awk '{print $1}' | xargs --no-run-if-empty kill -9

      3. Stop the health service:

           service vrhb-service stop

      4. Verify that the service is stopped. If a process is found, kill it manually:

           ps aux | grep Quorum

      5. Clean up the Health Service datastores (aka sandboxes):

           rm -r /var/lib/vrhb/service-host/sandbox
           rm -r /var/lib/vrhb/vra-tests-host/sandbox

         Note: I have not configured the health broker service in my lab, so you would not see the vra-tests-host/sandbox command being executed.

      6. As a last step, turn off vrhb-service using chkconfig:

           chkconfig vrhb-service off
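The "commenting out the cron job" step can be sketched as a small text transformation. This is a hedged illustration rather than a supported tool. The function name `comment_out_cron` is hypothetical, and the cron entry used in the usage note is made up, since the post does not show the real contents of /etc/cron.d/monitor-vrhb-cron. Every active line is prefixed with `#`, which also disables any environment assignments in the file.

```python
def comment_out_cron(text: str) -> str:
    """Prefix every active (non-blank, non-comment) line with '#'.

    Lines that are already comments are left untouched, so running the
    function twice gives the same result as running it once.
    """
    out = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped and not stripped.startswith("#"):
            out.append("#" + line)
        else:
            out.append(line)
    trailing = "\n" if text.endswith("\n") else ""
    return "\n".join(out) + trailing
```

Given a hypothetical entry such as `*/5 * * * * root /path/to/monitor-vrhb.sh`, the function returns the same line with a leading `#`. Writing the result back to the cron file is left to the operator, after the snapshot has been taken.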

  • The remote server returned an error: (403) Forbidden

    You might end up in a situation where your IAAS service is not REGISTERED on VAMI.

    Repository.log

      [UTC:2020-03-18 03:20:56 Local:2020-03-18 03:20] [Error]: [sub-thread-Id="51"  context=""  token=""] System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation. ---> System.AggregateException: One or more errors occurred. ---> System.Security.Authentication.AuthenticationException: OAuth token request failed. URL: https://<>/SERVICE: endpoints/types/sso ---> System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: The remote server returned an error: (403) Forbidden.
         at System.Net.HttpWebRequest.EndGetRequestStream(IAsyncResult asyncResult, TransportContext& context)
         at System.Net.Http.HttpClientHandler.GetRequestStreamCallback(IAsyncResult ar)
         --- End of inner exception stack trace ---
         at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
         at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
         at DynamicOps.Common.Client.RestClient.<>c__DisplayClassc9`2.<b__c8>d__cb.MoveNext()
         --- End of stack trace from previous location where exception was thrown ---
         at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
         * * * *
         --- End of inner exception stack trace ---
         at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)
         at DynamicOps.Repository.Runtime.SecurityModel.CafeSecurityProvider.LoadSecurityInformation(UserIdentity userIdentity)
         at DynamicOps.Repository.Runtime.SecurityModel.SecurityModelContext.GetIdentityTasksFromCache(UserIdentity userIdentity)
         at DynamicOps.Repository.Runtime.SecurityModel.SecurityModelContext.get_IdentityTasks()
         at DynamicOps.Repository.Runtime.ServiceModel.Data.RepositoryDataService`2.CalculateWritePermissionScopes(Int32 entityId)
         at DynamicOps.Repository.Runtime.ServiceModel.Data.RepositoryDataService`2.InternalOnChangeEntity[TEntity](Int32 entityId, TEntity entity, IQueryable`1 entitySet, UpdateOperations operation)
         at DynamicOps.Repository.Runtime.ServiceModel.Data.TrackingModelDataService.OnChangeTrackingLogItems(TrackingLogItem entity, UpdateOperations operation) in c:\Windows\Temp\0bxcpk4c.0.cs:line 105
         --- End of inner exception stack trace ---
         at System.Data.Services.DataService`1.BatchDataService.HandleBatchContent(Stream responseStream)
      INNER EXCEPTION: System.AggregateException: One or more errors occurred. ---> System.Security.Authentication.AuthenticationException: OAuth token request failed. URL: https://<>/SERVICE: endpoints/types/sso ---> System.Net.Http.HttpRequestException: An error occurred while sending the request. ---> System.Net.WebException: The remote server returned an error: (403) Forbidden.
         at System.Net.HttpWebRequest.EndGetRequestStream(IAsyncResult asyncResult, TransportContext& context)
         at System.Net.Http.HttpClientHandler.GetRequestStreamCallback(IAsyncResult ar)
         --- End of inner exception stack trace ---
         at System.Runtime.ExceptionServices.ExceptionDispatchInfo.Throw()
         at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
         at DynamicOps.Common.Client.RestClient.<>c__DisplayClassc9`2.<b__c8>d__cb.MoveNext()

    Web_Admin.log

      [UTC:2020-03-17 18:41:08 Local:2020-03-17 18:41] [Error]: [sub-thread-Id="12"  context=""  token=""] Error occurred writing to the repository tracking log System.Net.WebException: The remote server returned an error: (403) Forbidden.
         at System.Net.HttpWebRequest.GetRequestStream(TransportContext& context)
         at System.Net.HttpWebRequest.GetRequestStream()
         at System.Data.Services.Client.ODataRequestMessageWrapper.SetRequestStream(ContentStream requestStreamContent)
         at System.Data.Services.Client.BatchSaveResult.BatchRequest()
         at System.Data.Services.Client.DataServiceContext.SaveChanges(SaveChangesOptions options)
         at DynamicOps.Repository.RepositoryServiceContext.SaveChanges(SaveChangesOptions options)
         at DynamicOps.Repository.Tracking.RepoLoggingSingleton.WriteExceptionToLogs(String message, Exception exceptionObject, Boolean writeAsWarning)

    These messages clearly indicate that your IAAS is trying to fetch an auth token from the Manager but is unable to get it. The expected output would be as below:

      [UTC:2020-03-11 12:33:36 Local:2020-03-11 00:33] [VMware.Cafe]: [sub-thread-Id="1" context="" token=""] Setting CafeClientCacheDuration: 00:05:00
      [UTC:2020-03-11 12:33:36 Local:2020-03-11 00:33] [VMware.Cafe]: [sub-thread-Id="1" context="" token=""] (1) GET endpoints/types/sso
      [UTC:2020-03-11 12:33:36 Local:2020-03-11 00:33] [VMware.Cafe]: [sub-thread-Id="10" context="" token=""] (1) Response: OK 0:00.105
      [UTC:2020-03-11 12:33:37 Local:2020-03-11 00:33] [VMware.Cafe]: [sub-thread-Id="8" context="" token=""] (2) POST SAAS/t/vsphere.local/auth/oauthtoken?grant_type=client_credentials
      [UTC:2020-03-11 12:33:37 Local:2020-03-11 00:33] [VMware.Cafe]: [sub-thread-Id="11" context="" token=""] (2) Response: OK 0:00.118
      [UTC:2020-03-11 12:33:37 Local:2020-03-11 00:33] [VMware.Cafe]: [sub-thread-Id="8" context="" token=""] (3) GET endpoints/types/com.vmware.csp.cafe.authentication.api/default

    To resolve this problem:

      1. Take snapshots (MANDATORY). Note: no memory or quiescing.
      2. Validate that all the certificates are in place and valid. You may do this from VAMI.
      3. Reinitiate trust under the Actions section of the Certificate tab in the vRA appliance's VAMI.
      4. Reboot the environment systematically, as per the documentation.

    Once the environment is up, you should see all services coming back appropriately.

  • VAMI login lockouts

    If an incorrect password is entered a few times on VAMI, it locks you out from logging in again for a few minutes. Here is the procedure to get out of the lockout immediately, so that you don't waste any more time.

    Step 1: Identify how many login failures occurred for the account "root"

      pam_tally2 -u root

    Step 2: Reset this value to 0

      pam_tally2 -u root --reset

    Step 3: Verify that the value is back to 0

      pam_tally2 -u root

    Step 4: Test your login to VAMI; it should work now.
