
Search Results


  • Failures during vRA patch installation

    I was working on a brand-new installation of vRA 7.4, which we were attempting to patch with the latest patch under KB 56618. During implementation, it threw an exception right after the upload completed.

    The exception was logged in /var/log/vmware/vcac/vcac-config.log, but the actual reason for the failure was in the IaaS node's Management Agent All.log. The failure occurred because IaaS was unable to fetch binaries from the vRA appliance.

    On the vRA virtual appliance nodes, open the /etc/hosts file and find the entry for the IPv4 loopback address (127.0.0.1). Make sure that the fully qualified domain name of the node immediately follows 127.0.0.1, before localhost.

    Example:

        127.0.0.1 FQDN_HOSTNAME_OF_NODE localhost

    In our case, /etc/hosts was:

        # VAMI_EDIT_BEGIN
        # Generated by Studio VAMI service. Do not modify manually.
        127.0.0.1 localhost
        10.36.19.50 vraapp01.pslab.org vraapp01
        127.0.0.1 vraapp01.pslab.org load-balancer-host
        # VAMI_EDIT_END

    It should have been:

        # VAMI_EDIT_BEGIN
        # Generated by Studio VAMI service. Do not modify manually.
        127.0.0.1 vraapp01.pslab.org vraapp01 localhost
        ::1 vraapp01.pslab.org vraapp01 localhost ipv6-localhost ipv6-loopback
        # VAMI_EDIT_END
        127.0.0.1 localhost.localdom
        127.0.0.1 vrava-lb.pslab.org load-balancer-host

    This change has been documented in the patch KB as well. Here are the screenshots of the failed and the successful patch installation. #vRealizeAutomation
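    The /etc/hosts rule described above can be checked before patching. The following is a minimal sketch, assuming the rule applies to the first 127.0.0.1 entry in the file; the function name and the sample constants are illustrative and not part of any VMware tooling.

```python
def loopback_fqdn_first(hosts_text: str, fqdn: str) -> bool:
    """Return True if the first 127.0.0.1 entry lists `fqdn` right after
    the IP address and lists localhost somewhere after it."""
    for raw in hosts_text.splitlines():
        # Drop comments, then split the entry into whitespace-separated fields.
        fields = raw.split("#", 1)[0].split()
        if len(fields) >= 2 and fields[0] == "127.0.0.1":
            names = fields[1:]
            return names[0] == fqdn and "localhost" in names[1:]
    return False  # no IPv4 loopback entry found at all


# Sample data taken from the failing and corrected files in the post.
BAD_HOSTS = """# VAMI_EDIT_BEGIN
127.0.0.1 localhost
10.36.19.50 vraapp01.pslab.org vraapp01
127.0.0.1 vraapp01.pslab.org load-balancer-host
# VAMI_EDIT_END
"""

GOOD_HOSTS = """# VAMI_EDIT_BEGIN
127.0.0.1 vraapp01.pslab.org vraapp01 localhost
::1 vraapp01.pslab.org vraapp01 localhost ipv6-localhost ipv6-loopback
# VAMI_EDIT_END
127.0.0.1 localhost.localdom
"""

print(loopback_fqdn_first(BAD_HOSTS, "vraapp01.pslab.org"))
print(loopback_fqdn_first(GOOD_HOSTS, "vraapp01.pslab.org"))
```

    On an appliance, the same function could be fed the contents of /etc/hosts read with open("/etc/hosts").read().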

  • VAMI portal changes in vRealize Automation 7.5

    I just upgraded my lab to vRA 7.5, so I thought I would blog about the changes in VAMI from 7.4 to 7.5.

    TIP 1: When you open an SSH session to a vRA node, the session itself now shows which node is MASTER and which is REPLICA.

    TIP 2: In previous versions, the "Cluster" and "Database" settings were under the vRA tab of the VAMI portal. In 7.5, "Cluster" is a separate tab that includes the "Database" options as well. (Screenshots: VAMI 7.4, VAMI 7.5.) As the screenshots show, the Cluster and Database settings are now merged.

    There is also an additional option for taking a database dump. Clicking this option generates a dump of the vRA appliance's Postgres database under /tmp, named in the format vcac_dbdump_YYYY_MM_DD_HH_MM_SS.sql.zip. This is a useful feature from a VMware Support perspective, and the dump can be taken without any downtime or stopping services.

    TIP 3: The support bundle option has been removed from the Cluster tab, where it was in 7.4, and moved under the Logs tab in 7.5. (Screenshots: log bundle collection in 7.4, log bundle collection in 7.5.) #vRealizeAutomation
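    Because the dump name embeds its creation time, dumps collected under /tmp can be sorted or aged out by name alone. A minimal sketch, assuming the name follows exactly the vcac_dbdump_YYYY_MM_DD_HH_MM_SS.sql.zip pattern quoted above; the function name and the sample file name are made up for illustration.

```python
import re
from datetime import datetime
from typing import Optional

# Pattern built from the format quoted in the post.
DUMP_NAME = re.compile(
    r"^vcac_dbdump_(\d{4}_\d{2}_\d{2}_\d{2}_\d{2}_\d{2})\.sql\.zip$"
)


def dump_timestamp(filename: str) -> Optional[datetime]:
    """Return the timestamp encoded in a dump file name, or None if the
    name does not match the expected pattern or is not a valid date."""
    match = DUMP_NAME.match(filename)
    if match is None:
        return None
    try:
        return datetime.strptime(match.group(1), "%Y_%m_%d_%H_%M_%S")
    except ValueError:
        return None


# Hypothetical file name, used only to exercise the parser.
print(dump_timestamp("vcac_dbdump_2018_09_25_10_15_30.sql.zip"))
```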

  • What's New in vRealize Automation 7.5

    vRealize Automation 7.5 was released on September 20, 2018. Below is the list of new features and enhancements in this release.

    Revamped UI and User Experience
      - A completely new look and feel for vRealize Automation, along with streamlined flows for common self-service tasks
      - UI based on VMware Clarity standards
      - Larger catalog cards show more of the description
      - Cleaner catalog view
      - Multiple instances of the same catalog item across business groups are now rolled up; the user selects the business group at request time
      - Items and Requests tabs merged into the new Deployments tab
      - Request details for decommissioned resources moved to the Administration tab
      - Improved status of in-progress requests
      - History view shows all requests associated with a single deployment over time
      - Improved search capabilities across product menus and objects
      - Contextual access to documentation from the product UI
      - Home page and portlets are deprecated in this release
      - Save button on requests is deprecated in this release

    Improved Integration with vRealize Operations Manager
      This release introduces deployment dashboards for application owners and enhancements to intelligent workload placement capabilities via integration with vRealize Operations.
      - Show deployment alerts and key metrics (CPU, memory, IOPS, and network) for machines in the deployment details view
      - Enable optimization of vRealize Automation managed workloads to align with the vRealize Operations placement policy
      - This builds on an earlier integration for optimizing initial placement to allow ongoing optimization of existing workloads

    Configuration Automation Framework
      - Native integration with the external Ansible Tower management tool
      - OOTB support for Ansible Tower as a first-class citizen in vRealize Automation
      - Drag and drop the Ansible Tower object onto the blueprint design canvas
      - Parameterize and support early and late binding/request time
      - Dynamically select Ansible job templates, including playbooks, for application configuration
      - Support Day 2 actions to register or decommission a machine

    Troubleshooting Improvements
      - Improvements to Force Delete/Re-Submit (failed or orphaned deployments)
      - Post-migration validation
      - Consistent log tracing across the solution
      - Expose trace-id to the vRealize Orchestrator plug-in API

    vRO Database Clustering and Configuration
      - vRO database configuration moved to the vRA VAMI
      - The embedded vRO database (Postgres) can now be clustered and supports failover

    Microsoft Azure Blueprint Enhancements
      - Support for Azure Managed Disks
      - Enhanced support for Azure regions

    NSX-T Datacenter Native Integration
      - vRealize Automation now has native integration with NSX-T Datacenter
      - OOTB support for NSX-T Datacenter as a first-class citizen in vRealize Automation
      - Drag and drop NSX-T Datacenter services onto the blueprint design canvas
      - Support for Day 2 actions

    Event Broker & Custom Forms Improvements

    #vRealizeAutomation

  • Installing vRealize Automation 7.5

    There isn't a great deal of difference in deploying a simple instance of vRA 7.5; it's the same as in previous versions. I'm sharing screenshots from my lab implementation, as they might benefit others.

    1. Run the prerequisite checks. If they show a failure, click Fix so that it runs a script on the IaaS node mapped to this appliance and gets it ready for the install.
    2. Once the prerequisites are done, enter the vRA appliance hostname and click Next.
    3. Configure SSO (administrator@vsphere.local).
    4. Configure SQL Server.
    5. Configure the DEMs.
    6. Configure the agents.
    7. Generate certificates. Web and Manager are on the same node, so no separate certificate is needed in this case (simple install).
    8. After the certificates, a screen prompts you to take snapshots, and then the installation starts. It took around 30 minutes for the installation to complete.
    9. After installation, license vRA; the wizard then presents the post-installation options.

    That's it. Installing vRA 7.5 is that simple, as has been the case since 7.x was launched. I will explore various features in upcoming posts. Stay tuned! #vRealizeAutomation

  • IaaS node upgrade fails with exception "The archive period of resource <> i

    Once the IaaS upgrade kicks off, in some environments you might see an error stating that it cannot continue upgrading ModelManagerWeb. The exception looks like this:

        Executing: E:\Program Files (x86)\VMware\vCAC\Server\Model Manager Data\Cafe\Vcac-Config.exe UpgradeArchiveDayTo62 -v
        [12:26:33.812] Error: [sub-thread-Id="6" context="" token=""] The archive period of resource NUKES02 is not up to date.
        VMware.Cafe.JsonResponseException: Child resource must have the same lease as its parent.
        PUT https://vrasimple/catalog-service/api/provider/providers/<>/resources/<>
        Request: { "id": "VirtualMachineID", "name": "NUKES02", "description": null, "resourceTypeId": "Infrastructure.Virtual", "catalogItemId": null, "catalogItemRequestId": "<<", "organization": { "tenantLabel": "nukes", "tenantRef": "nukes",
        * * *
        Response: {"errors":[{"code":20148,"source":null,"message":"Child resource must have the same lease as its parent.","systemMessage":"Child resource must have the same lease as its parent.","moreInfoUrl":null}]}
        at VMware.Cafe.JsonRestClient.d__2`1.MoveNext()
        --- End of stack trace from previous location where exception was thrown ---

    This exception occurs when there is a discrepancy between the SQL and Postgres databases with respect to archive dates.

    Analysis

    From All.log of the Management Agent on the IaaS node, find out how many virtual machines are at fault. Note down their names first, then execute this query against the IaaS SQL database:

        select * from dbo.VirtualMachine where VirtualMachineName in ('<>','<>')

    This gives you the VirtualMachineID of each virtual machine showing the discrepancy. Make a note of the value in the ExpireDays column of the output.

    Log in to the vRA Postgres database:

        su - postgres
        /opt/vmware/vpostgres/current/bin/psql vcac
        \x

    Then execute:

        select * from cat_resource where name = 'vm01';

    Examine the archive_days value in the output. ExpireDays in the SQL database and archive_days in Postgres must be the same for a given virtual machine; otherwise you will have problems during the upgrade.

    Remediation

    To resolve the problem, update ExpireDays for the virtual machines that show the discrepancy so that it matches the value shown in Postgres. Use the VirtualMachineID values captured by the query above:

        UPDATE VirtualMachine SET ExpireDays = 0 WHERE VirtualMachineID in ('XXXXX-XXXXX-XXXXX-XXXX')

    Once done, retry the failed IaaS upgrade and it should work like a charm. This error occurs when you're almost done upgrading ModelManagerData, so after it is resolved only the DEM Orchestrator and agents remain to be upgraded. #vRealizeAutomation
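    When several machines are involved, the comparison between the two databases can be done offline once the values are exported. A minimal sketch, assuming the ExpireDays values (IaaS SQL) and archive_days values (Postgres) have already been collected into dictionaries keyed by machine name; the function name and the sample numbers are hypothetical.

```python
def find_archive_mismatches(iaas_expire_days, postgres_archive_days):
    """Return (name, expire_days, archive_days) tuples for machines whose
    IaaS ExpireDays differs from the Postgres archive_days value."""
    mismatches = []
    for name in sorted(iaas_expire_days):
        if name not in postgres_archive_days:
            continue  # machine not present in cat_resource; nothing to compare
        expire_days = iaas_expire_days[name]
        archive_days = postgres_archive_days[name]
        if expire_days != archive_days:
            mismatches.append((name, expire_days, archive_days))
    return mismatches


# Hypothetical values, used only to show the shape of the result.
iaas = {"NUKES02": 30, "vm01": 0}
postgres = {"NUKES02": 0, "vm01": 0}
print(find_archive_mismatches(iaas, postgres))
```

    Each tuple in the result names a machine whose ExpireDays would need the UPDATE shown in the remediation step.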

  • Migrating vRA IaaS database to a NEW SERVER

    This article details the steps required to migrate the vRA IaaS database from one server to another.

    There are three places where you need to make a change:
      - ManagerService.exe.config under ...\Program Files (x86)\VMware\vCAC\Server\ (Manager Service nodes)
      - Web.config under ...\Program Files (x86)\VMware\vCAC\Server\Model Manager Web\ (Web nodes)
      - The IaaS database

    Note: In a distributed environment, make the changes on all nodes that have these components installed. Stop all IaaS-related services on all nodes before starting this activity.

    Step 1: Making changes to ManagerService.exe.config
      1. Take a backup of the ManagerService.exe.config file.
      2. Edit ManagerService.exe.config using WordPad, as seen in the screenshot. Search for the existing database server name and change the "Source=<>" value to the new SQL Server name.
      3. Save the file.

    Step 2: Making changes to Web.config under Model Manager Web
      1. Take a backup of Web.config.
      2. Find the current database hostname, change it to the new one, and save the file.

    Step 3: Making changes inside your IaaS database
      Connect with SQL Server Management Studio and open the vRA IaaS database. In the IaaS database on the new server, update the DynamicOps.RepositoryModel.Models table. This table contains loopback connection strings (ConnectionString column) for each of the vRealize Automation models, and these need to be updated with the new Data Source and Initial Catalog values. Replace the Data Source with your new server FQDN and the Initial Catalog with your new database name (if different).

      Verify the existing values using the following query:

        SELECT * FROM [<>].[DynamicOps.RepositoryModel].[Models]

      Modify the values using the following query:

        update [<>].[DynamicOps.RepositoryModel].[Models] set ConnectionString='Data Source=<>;Initial Catalog=<>;Integrated Security=True;Pooling=True;Max Pool Size=200;MultipleActiveResultSets=True;Connect Timeout=200'

    Start all IaaS-related services on all nodes after completing this activity.

    Post-Change Validations
      - Perform health checks across the whole environment. Click here for detailed instructions.
      - Ensure data collection is working as expected.
      - Provision a VM and check that it goes through completely.

    You might encounter the following issue. If you see errors similar to the stack below:

        Error processing workflow creation
        Error executing query usp_SearchInitializingRequestVirtualMachines
        Inner Exception: Error executing query usp_SelectGroup

    then it's a problem with MSDTC, most likely on the SQL nodes. On the SQL node you might see an MSDTC exception as well:

        2018-08-21 05:41:31.620 spid62 Enlist operation failed: 0x8004d01c(XACT_E_CONNECTION_DOWN). SQL Server could not register with Microsoft Distributed Transaction Coordinator (MS DTC) as a resource manager for this transaction. The transaction may have been stopped by the client

    Reconfigure MSDTC on the SQL node and these exceptions should stop occurring. For MSDTC troubleshooting, follow KB 2089503. #vRealizeAutomation
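    If the existing connection strings carry settings you want to keep, it can be safer to change only the Data Source and Initial Catalog keys rather than overwrite the whole value. A minimal sketch of that string rewrite, assuming the semicolon-separated key=value layout shown in the query above; the function name and the server and database names in the example are placeholders.

```python
def retarget_connection_string(conn, server, database=None):
    """Return `conn` with Data Source set to `server` and, if given,
    Initial Catalog set to `database`. All other settings are preserved."""
    parts = []
    for item in conn.split(";"):
        if not item.strip():
            continue  # skip empty segments such as a trailing ';'
        key, _, value = item.partition("=")
        key = key.strip()
        if key.lower() == "data source":
            value = server
        elif key.lower() == "initial catalog" and database is not None:
            value = database
        parts.append(f"{key}={value}")
    return ";".join(parts)


# Placeholder names; substitute your own server FQDN and database name.
old = ("Data Source=oldsql.example.com;Initial Catalog=vra;"
       "Integrated Security=True;Pooling=True;Max Pool Size=200;"
       "MultipleActiveResultSets=True;Connect Timeout=200")
print(retarget_connection_string(old, "newsql.example.com"))
```

    The resulting string is what would go into the ConnectionString column; the sketch itself does not touch the database.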

  • !!! Rabbitmq reported unrecoverable state , recovery.dets corrupted !!!

    Unable to start RabbitMQ after an outage? Are you seeing an exception similar to the one below?

        2018-07-26T09:39:17.273888+00:00 <> [cluster-rabbitmq-monitor] - ERROR - Rabbitmq reported unrecoverable state: [Error]:
        {could_not_start,rabbit,
        {{badmatch,
        {error,
        {{{badmatch,
        {error,
        {not_a_dets_file,
        "/var/lib/rabbitmq/mnesia/rabbit@<>/recovery.dets"}}},
        [{rabbit_recovery_terms,open_table,0,
        [{file,"src/rabbit_recovery_terms.erl"},{line,126}]},
        {rabbit_recovery_terms,init,1,
        [{file,"src/rabbit_recovery_terms.erl"},{line,107}]},
        {gen_server,init_it,6,[{file,"gen_server.erl"},{line,328}]},
        {proc_lib,init_p_do_apply,3,
        [{file,"proc_lib.erl"},{line,247}]}]},
        {child,undefined,rabbit_recovery_terms,
        {rabbit_recovery_terms,start_link,[]},
        transient,30000,worker,
        [rabbit_recovery_terms]}}}},
        [{rabbit_queue_index,start,1,
        [{file,"src/rabbit_queue_index.erl"},{line,464}]},
        {rabbit_variable_queue,start,1,
        [{file,"src/rabbit_variable_queue.erl"},{line,455}]},
        {rabbit_priority_queue,start,1,
        [{file,"src/rabbit_priority_queue.erl"},{line,92}]},
        {rabbit_amqqueue,recover,0,
        [{file,"src/rabbit_amqqueue.erl"},{line,239}]},
        {rabbit,recover,0,[{file,"src/rabbit.erl"},{line,756}]},
        {rabbit_boot_steps,'-run_step/2-lc$^1/1-1-',1,
        [{file,"src/rabbit_boot_steps.erl"},{line,49}]},
        {rabbit_boot_steps,run_step,2,
        [{file,"src/rabbit_boot_steps.erl"},{line,49}]},
        {rabbit_boot_steps,'-run_boot_steps/1-lc$^0/1-0-',1,
        [{file,"src/rabbit_boot_steps.erl"},{line,26}]}]}}
        2018-07-26T09:39:17.898241+00:00 vasydp161 su: (to rabbitmq) root on /dev/pts/4

    The exception states that RabbitMQ could not start because it failed to read the recovery.dets file.

    If you browse to /var/lib/rabbitmq/mnesia and run ls -ltrh, you will see that the recovery.dets file is corrupt or 0 bytes. The recovery.dets file contains recovery metadata written when the node is stopped gracefully, so there is a high chance of corruption if the RabbitMQ node is stopped abruptly.

    To remediate, delete the 0-byte file or move it to another location (e.g. /tmp/) and then reboot the node, in this case the vRealize Automation appliance. Once this was done, all services, including RabbitMQ, started successfully during boot. #vRealizeAutomation
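    The "move the 0-byte file aside" step can be scripted so the file is preserved rather than deleted. A minimal sketch, assuming the layout from the log above (one rabbit@<node> directory per node under the mnesia directory); it only handles the 0-byte case, does not detect other corruption, and does not reboot anything. The function name is made up, and the demo runs against a scratch directory rather than /var/lib/rabbitmq.

```python
import shutil
import tempfile
from pathlib import Path


def quarantine_empty_recovery_files(mnesia_dir, backup_dir):
    """Move every zero-byte <node>/recovery.dets under `mnesia_dir` into
    `backup_dir` and return the list of new locations."""
    moved = []
    for path in sorted(Path(mnesia_dir).glob("*/recovery.dets")):
        if path.is_file() and path.stat().st_size == 0:
            # Prefix with the node directory name so backups do not collide.
            target = Path(backup_dir) / f"{path.parent.name}.recovery.dets"
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved


# Demo on a scratch directory; the node name is a placeholder.
scratch = Path(tempfile.mkdtemp())
node_dir = scratch / "mnesia" / "rabbit@node1"
node_dir.mkdir(parents=True)
(node_dir / "recovery.dets").touch()  # simulate the 0-byte file
backup = scratch / "backup"
backup.mkdir()

moved_files = quarantine_empty_recovery_files(scratch / "mnesia", backup)
print([p.name for p in moved_files])
```

    On the appliance, the arguments would be /var/lib/rabbitmq/mnesia and a backup location such as /tmp, followed by the reboot described above.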

  • vRealize Suite LifeCycle Manager Introduction

    vRealize Suite Lifecycle Manager, a.k.a. vRLCM, is designed to streamline and simplify the deployment and ongoing management of the vRealize product portfolio. vRLCM automates installation, configuration, upgrade, and health management across vRealize Suite products within a single pane of glass.

    Features

    vRealize Suite Lifecycle Manager can install and manage the following vRealize Suite products:
      - vRealize Automation
      - vRealize Business for Cloud
      - vRealize Operations
      - vRealize Log Insight

    Installation: The user is prompted up front for the hostnames/IPs and license keys. vRLCM then proceeds to deploy and configure the products with no further user interaction.

    Configuration Management and Drift Remediation: Once a product is configured to the desired state, we can use vRLCM to capture a baseline configuration. Over time, configuration changes may occur, causing the product configuration to drift from the baseline. vRLCM displays the configuration drift that has occurred and provides the capability to remediate it, returning the product to the baseline configuration.

    Health and Marketplace: Working in conjunction with vRealize Operations, vRLCM can display the health status of the products it manages.

    Upgrade: One-click upgrade of the vRealize Suite products it manages.

    Content Management: A feature to capture, test, and release software-defined content such as blueprints, templates, and workflows.

    Benefits
      - Simplified installation and configuration that saves time and effort
      - Easy alignment with VMware recommended reference architectures and validated designs
      - Minimized ongoing management effort by leveraging automated configuration and drift management with health monitoring
      - Simplified upgrades for all supported vRealize Suite products

    System Requirements
      - vRLCM runs as a single virtual appliance running VMware's Photon OS
      - Clustering is not possible; use vSphere HA
      - 2 vCPUs if content management is disabled
      - 4 vCPUs if content management is enabled
      - 16 GB memory
      - 135 GB storage

    Supported vRealize Suite Products: Click here

    Installation
      1. Download the OVA from the My VMware portal: VMware-vLCM-Appliance-1.3.0.14-9069107_OVF10.ova
      2. Deploy it and pass in the necessary parameters, such as hostname and network information.
      3. Once the appliance is powered on and the boot process completes, use a supported browser to connect to your vRealize Suite Lifecycle Manager appliance by its IP address or FQDN: https://<>/vrlcm
      4. If you're logging in for the first time, use username admin@localhost and password vmware.

    That's it. Welcome to the world of vRealize Suite Lifecycle Manager. I will blog more about vRLCM as I configure it in my lab, so stay tuned.

  • Patching vRealize Automation after upgrade to version 7.4

    VMware recently released HF3, as it's called internally, to remediate a few important bugs. Click here to check out which bugs are fixed.

    Getting Ready
      - Take snapshots.
      - Verify that all nodes in your vRealize Automation installation are up and running.
      - If your environment uses load balancers for HA, disable traffic to all secondary nodes and disable service monitoring until after installing or removing patches and all services are showing REGISTERED.
      - Obtain the patch file from the knowledge base article and copy it to the file system from which you will access the VAMI interface of the vRA appliances.

    Note: Click here for additional prerequisites. Do read them; do not skip.

    Implementing the Patch
      1. Log in to the master node's VAMI page.
      2. Click vRA Settings -> Patches.
         Note: To enable or disable Patch Management, log in to the vRealize Automation appliance using the console or SSH as root, and enter one of the following commands:
           /opt/vmware/share/htdocs/service/hotfix/scripts/hotfix.sh enable
           /opt/vmware/share/htdocs/service/hotfix/scripts/hotfix.sh disable
      3. Click Upload Patch, select the location, and upload. Do not refresh the page once you start an upload; wait until it completes. Once the upload is complete, you get an option to install the patch.
      4. Click "INSTALL" to start installing the patch.
      5. Once installed, the status should change to "Success.Install Complete".
      6. You can check which patches are installed by browsing the "Installed Patches" section.

    Post-Installation Procedures
      - Verify that all services are in the "REGISTERED" state on all nodes (the other two nodes are not shown here, as I know they are running).
      - Re-enable load balancer traffic to the secondary nodes.
      - Finish testing by provisioning a few workloads. Once done, consolidate the snapshots taken during the prerequisite steps.

    #vRealizeAutomation

  • ehcache replication errors in horizon.log

    Are you seeing similar exceptions in horizon.log?

    The first step in troubleshooting an ehcache problem is to check whether the runtime-config.properties file is properly configured.

    From the first vRA appliance in a 3-node cluster:

        # ehcache configuration properties
        ehcache.replication.rmi.registry.port=40002
        ehcache.replication.rmi.remoteObject.port=40003
        # Overrides the list of ehcache replication peers. FQDNs separated by ":", e.g. server1.example.com:server2.example.com
        ehcache.replication.rmi.servers=node2fqdn:node3fqdn

    From the second vRA appliance in a 3-node cluster:

        # ehcache configuration properties
        ehcache.replication.rmi.registry.port=40002
        ehcache.replication.rmi.remoteObject.port=40003
        # Overrides the list of ehcache replication peers. FQDNs separated by ":", e.g. server1.example.com:server2.example.com
        ehcache.replication.rmi.servers=node1fqdn:node3fqdn

    From the third vRA appliance in a 3-node cluster:

        # ehcache configuration properties
        ehcache.replication.rmi.registry.port=40002
        ehcache.replication.rmi.remoteObject.port=40003
        # Overrides the list of ehcache replication peers. FQDNs separated by ":", e.g. server1.example.com:server2.example.com
        ehcache.replication.rmi.servers=node1fqdn:node2fqdn

    Next, we checked that the port was open and connectivity was established:

        node1:~ # curl -v telnet://node1fqdn:40003
        * Rebuilt URL to: telnet://node1fqdn:40003/
        * Trying 10.37.79.15...
        * TCP_NODELAY set
        * Connected to node1fqdn (XX.XX.XX.XX) port 40003 (#0)

    We also ran the Elasticsearch health check, and its output was promising as well:

        https://hostname/SAAS/API/1.0/REST/system/health/

    We approached Engineering, who suggested the following steps:

      1) Back up the existing file: cp /opt/vmware/horizon/workspace/bin/setenv.sh /opt/vmware/horizon/workspace/bin/setenv_bak.sh
      2) vi /opt/vmware/horizon/workspace/bin/setenv.sh
      3) Import the utils.inc file by adding this line to setenv.sh: . /usr/local/horizon/scripts/utils.inc
      4) Search for JVM_OPTS in setenv.sh and ensure this property is set exactly like this: -Djava.rmi.server.hostname=$(myip)
      5) Repeat the above steps on all appliances.
      6) Restart the vIDM service on all appliances: service horizon-workspace restart

    By default, JVM_OPTS looks like this:

        JVM_OPTS="-server -Djdk.tls.ephemeralDHKeySize=1024 -XX:+AggressiveOpts \
        -XX:MaxMetaspaceSize=768m -XX:MetaspaceSize=768m \
        -Xss1m -Xmx3419m -Xms2564m \
        -XX:+UseParallelGC -XX:+UseParallelOldGC \
        -XX:NewRatio=3 -XX:SurvivorRatio=12 \
        -XX:+DisableExplicitGC \
        -XX:+UseBiasedLocking -XX:-LoopUnswitching"

    and we need to change it to:

        JVM_OPTS="-server -Djdk.tls.ephemeralDHKeySize=1024 -Djava.rmi.server.hostname=$(myip) -XX:+AggressiveOpts \
        -XX:MaxMetaspaceSize=768m -XX:MetaspaceSize=768m \
        -Xss1m -Xmx3419m -Xms2564m \
        -XX:+UseParallelGC -XX:+UseParallelOldGC \
        -XX:NewRatio=3 -XX:SurvivorRatio=12 \
        -XX:+DisableExplicitGC \
        -XX:+UseBiasedLocking -XX:-LoopUnswitching"

    After making the change on all vRA appliances and restarting them, no more exceptions were seen in horizon.log.

    The change sets the correct hostname and IP address in the Java environment so that the application can form a cluster correctly. This has already been fixed in IDM and should be added in vRA 7.4. The root cause is that with an IPv6 address in the /etc/hosts file, the hostname and IP address are not set correctly for the application. #vRealizeAutomation
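    The pattern in the three runtime-config.properties excerpts above is that each node lists every other cluster member, separated by ":", and never itself. A minimal sketch that derives the expected ehcache.replication.rmi.servers value per node, so it can be compared against what is actually configured; the function name is illustrative and the node names are the placeholders used in the post.

```python
def expected_ehcache_peers(nodes):
    """Map each node FQDN to the value its ehcache.replication.rmi.servers
    property should hold: all other nodes, joined by ':'."""
    return {
        node: ":".join(peer for peer in nodes if peer != node)
        for node in nodes
    }


cluster = ["node1fqdn", "node2fqdn", "node3fqdn"]
for node, peers in expected_ehcache_peers(cluster).items():
    print(f"{node}: ehcache.replication.rmi.servers={peers}")
```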

  • Removing stale Puppet entries from Configuration Items

    When a provisioning or destroy operation fails or gets stuck for whatever reason, there is every chance that we have to clean up the leftovers manually. In our scenario, the stale entries were present under vRA Items -> Configuration Management.

    Note: Before removing these entries from the database, take a full backup of the vRA vPostgres database. Also ensure that the virtual machine tagged to the entry is no longer present on the endpoint and is no longer managed by vRA.

    Steps to remove these entries from the database

      1. After the database backup has been taken, take a snapshot of the vRA appliance.
      2. Connect to the vRA Postgres database. The entries are present in the cat_resource and cat_resource_owners tables.
      3. Filter the active Puppet entries using the query below:

           select * FROM public.cat_resource WHERE resourcetype_id = 'ConfigManagement.Puppet' AND status = 'ACTIVE';

         Once the query is executed, you are presented with all active Puppet entries, which should match the entries in the UI.
      4. Before deleting entries from cat_resource, remove the references from cat_resource_owners. Using the value in the id column of the previous query, execute the following query on cat_resource_owners:

           select * FROM public.cat_resource_owners WHERE resource_id = 'XXXXX';

         You should be presented with one result. Then delete it:

           delete FROM public.cat_resource_owners WHERE resource_id = 'XXXXX';

      5. For the resource whose reference was just removed from cat_resource_owners, take its binding ID and select the entry in the cat_resource table:

           select * FROM public.cat_resource WHERE binding_id = 'YYYYY';

         Then delete the entry from the database:

           delete FROM public.cat_resource WHERE binding_id = 'YYYYY';

      6. Refresh the vRA portal. The removed Puppet entry should no longer be present.

    #vRealizeAutomation
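    The order of operations matters here: the owner reference goes first, then the resource row. The sketch below rehearses that order against an in-memory SQLite database, not the real vPostgres database. The two tables are stand-ins that contain only the columns mentioned in the post, and the function name and sample IDs are hypothetical.

```python
import sqlite3


def remove_stale_entries(conn, resource_ids):
    """Delete owner references first, then the resource rows, inside one
    transaction. Returns the number of cat_resource rows removed."""
    removed = 0
    with conn:  # commits on success, rolls back if any statement fails
        for resource_id in resource_ids:
            row = conn.execute(
                "SELECT binding_id FROM cat_resource WHERE id = ?",
                (resource_id,),
            ).fetchone()
            if row is None:
                continue  # already gone
            conn.execute(
                "DELETE FROM cat_resource_owners WHERE resource_id = ?",
                (resource_id,),
            )
            cur = conn.execute(
                "DELETE FROM cat_resource WHERE binding_id = ?", (row[0],)
            )
            removed += cur.rowcount
    return removed


# Stand-in schema and rows (assumptions, not the real vRA schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE cat_resource (id TEXT PRIMARY KEY, resourcetype_id TEXT,
                           status TEXT, binding_id TEXT);
CREATE TABLE cat_resource_owners (resource_id TEXT);
INSERT INTO cat_resource VALUES
    ('XXXXX', 'ConfigManagement.Puppet', 'ACTIVE', 'YYYYY'),
    ('KEEP1', 'ConfigManagement.Puppet', 'ACTIVE', 'KEEP-BINDING');
INSERT INTO cat_resource_owners VALUES ('XXXXX'), ('KEEP1');
""")

removed_count = remove_stale_entries(conn, ["XXXXX"])
remaining = conn.execute("SELECT id FROM cat_resource").fetchall()
print(removed_count, remaining)
```

    Rehearsing the statements this way does not replace the backup and snapshot called for above.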

  • Removing an existing vRA License

    I came across a situation where, even though the product was licensed, it reported that the license was not found in the database. Re-applying the same license did not work; it was reported as invalid.

    While going through /var/log/vmware/vcac/catalina.log, I found JDBC connection errors while trying to access the database entry where the licensed assets were stored. Also, /var/log/vmware/messages clearly stated that the license was missing or not found.

    Follow the steps below to delete an existing license in vRA.

    ** Before attempting these steps, take a snapshot and a valid backup of the vRA database (vPostgres) **

      1. SSH or PuTTY into the vRealize Automation appliance as root.
      2. Take a backup of the vRealize Automation database:
         - Stop the vRealize Automation service:
             service vcac-server stop
         - Change directory:
             cd /tmp
         - Create a copy of the database in /tmp:
             su -m -c "/opt/vmware/vpostgres/current/bin/pg_dumpall -c -f /tmp/vcac.sql" postgres
         - Compress the database copy:
             bzip2 -z /tmp/vcac.sql
      3. Connect to the vPostgres database:
             su - postgres
             psql vcac
      4. Verify that the embeddedlicenseentry table is present in the database:
             \dt embeddedlicenseentry
      5. Review how many rows are present in the table. There should be 36 to 37 entries:
             SELECT * FROM embeddedlicenseentry;
      6. Delete all information from the embeddedlicenseentry table:
             delete FROM embeddedlicenseentry;
      7. Verify that the table is now empty:
             SELECT * FROM embeddedlicenseentry;
      8. Exit from the database:
             \q
      9. Restart the services on the appliance:
             vcac-vami service-manage stop vco-server vcac-server horizon-workspace elasticsearch
             vcac-vami service-manage start vco-server vcac-server horizon-workspace elasticsearch
     10. Once the services are back, go to the VAMI portal. Under vRA Settings --> Licensing, you should no longer see your previous license.
     11. Re-apply your existing license. It should now be accepted.

    For distributed environments

    Two vRA appliances:
      - Perform the database steps on the MASTER node. You can identify the MASTER from vRA Settings --> Database.
      - Restart services on both nodes.

    Three vRA appliances:
      - Change the database to ASYNCHRONOUS mode.
      - Once done, shut down both non-MASTER appliances.
      - Since you're now left with only the MASTER node, follow the same steps as above.
      - Once the key is accepted, bring the other nodes online.
      - Once all services are registered, change the database mode back to SYNCHRONOUS.

    Ensure there are proper backups before performing any of these steps. #vRealizeAutomation
