To understand the role of RabbitMQ in vRealize Automation, let's first look at what RabbitMQ is
RabbitMQ
It's a message broker: it gives applications a common place to send and receive messages, and gives messages a safe place to live until they are delivered
Centralized messaging lets software applications connect as components of a larger application
Applications can be written around the state the other applications are in, which lets the workload be distributed across multiple systems for performance and reliability
RabbitMQ Architecture
The basic architecture of a message queue is simple. Client applications called producers create messages and deliver them to the broker (the message queue). Other applications, called consumers, connect to the queue and subscribe to the messages to be processed. An application can be a producer, a consumer, or both. Messages placed onto the queue are stored until a consumer retrieves them
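The producer, broker, and consumer roles described above can be sketched in a few lines of Python. This is only an illustration: the standard library's queue.Queue stands in for the broker, and the function names (producer, consumer, run_demo) are made up for this example. A real deployment would talk to RabbitMQ over AMQP instead.

```python
import queue
import threading


def producer(broker, messages):
    """Create messages and deliver them to the broker."""
    for body in messages:
        broker.put(body)


def consumer(broker, expected_count, results):
    """Retrieve messages from the broker and process them one by one."""
    for _ in range(expected_count):
        body = broker.get()           # blocks until a message is available
        results.append(body.upper())  # "processing" is just upper-casing here
        broker.task_done()


def run_demo(messages):
    """Run one producer and one consumer against an in-memory broker."""
    broker = queue.Queue()            # stand-in for the message queue
    results = []
    worker = threading.Thread(target=consumer,
                              args=(broker, len(messages), results))
    worker.start()
    producer(broker, messages)
    worker.join()
    return results
```

Messages stay in the queue until the consumer picks them up, so the producer never needs to know whether the consumer is currently busy.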
RabbitMQ's usage in vRealize Automation
Keeps clustered appliances in sync
Makes sure only one appliance takes action on a given message, which prevents race conditions
Powers the event-broker-service (EBS)
All of this is done through a series of queues, one for each action that has to be kept in sync between appliances
A few example queues are listed below
ebs.com.vmware.csp.iaas.blueprint.service.machine.lifecycle.active__
ebs.com.vmware.csp.iaas.blueprint.service.machine.lifecycle.provision__
vmware.vcac.core.software-service.taskRequestSubmitted
vmware.vcac.core.iaas-proxy-provider.catalogRequestSubmitted
vmware.vcac.core.catalog-service.requestSubmitted
vmware.vcac.core.event-broker-service.publishReplyEvent
RabbitMQ and vRA Clustering
Pre-Requisites
Host short names and FQDNs must be resolvable among all the appliances that are being clustered
This DNS requirement is mandatory because RabbitMQ uses the short name in its node naming convention
Ports 4369, 5672 and 25672 must be open between appliances
4369 is used by RabbitMQ's peer discovery service
5672 is used by AMQP
25672 is used for inter-node and CLI communication (Erlang distribution server port)
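A quick way to confirm the port requirement is a plain TCP connect test from one appliance to another. The sketch below is an assumption about how one might script it, not a vRA tool; the helper names are invented for illustration, and a successful connect only shows the port is reachable, not that RabbitMQ is healthy.

```python
import socket

# Ports listed in the prerequisites above, with their purpose.
REQUIRED_PORTS = {
    4369: "peer discovery",
    5672: "AMQP",
    25672: "inter-node and CLI communication",
}


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name resolution failures.
        return False


def check_required_ports(host, ports=REQUIRED_PORTS):
    """Map each required port to True (reachable) or False (not reachable)."""
    return {port: port_is_open(host, port) for port in ports}
```

For example, check_required_ports("vra-node2") (a hypothetical host name) returns a dictionary such as {4369: True, 5672: True, 25672: False}, depending on what is reachable. Because a name resolution failure also yields False, the same check loosely covers the DNS requirement.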
When RabbitMQ is configured as a cluster, unlike other clustered applications, there is no master-slave relationship. The last node to receive a message is considered the "Leading Cluster Node"
The only time this becomes an issue is when all nodes in the vRA cluster have to be stopped. In that case, shut down all nodes except one, then restart that remaining node so that it has all the latest messages from the queues. After that, bring the stopped nodes back up
Listing Message Queues
Message Queues are used to ensure multiple clustered vRealize Automation appliances are kept in sync and also to power EBS.
From an SSH session, running rabbitmqctl list_queues shows all currently configured queues.
Two pieces of data are returned by default
Queue name and number of messages in the queue
As the screenshot above shows
The queues that start with ebs.com.vmware.xxxxx.xxxxx are used by the event-broker-service
The queues that start with vmware.vcac.core.xxxxxx.xxxxx are used for other vRealize Automation functions
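The naming convention above makes it easy to sort the output of rabbitmqctl list_queues by owner. The following sketch assumes the default two-column output (queue name, then message count, separated by whitespace) and simply skips any line that does not end in a number, such as the "Listing queues ..." header. The function names are hypothetical.

```python
def parse_list_queues(output):
    """Turn `rabbitmqctl list_queues` text into {queue_name: message_count}."""
    queues = {}
    for line in output.splitlines():
        parts = line.rsplit(None, 1)  # split off the last whitespace-separated field
        if len(parts) != 2 or not parts[1].isdigit():
            continue                  # header, footer or blank line
        queues[parts[0].strip()] = int(parts[1])
    return queues


def group_by_owner(queues):
    """Group queues by the prefixes described above."""
    groups = {"event-broker-service": {}, "other vRA functions": {}, "unrecognized": {}}
    for name, count in queues.items():
        if name.startswith("ebs.com.vmware."):
            groups["event-broker-service"][name] = count
        elif name.startswith("vmware.vcac.core."):
            groups["other vRA functions"][name] = count
        else:
            groups["unrecognized"][name] = count
    return groups
```

A queue whose message count keeps growing between runs is usually the first thing to look at, since it suggests messages are arriving faster than they are being consumed.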
Configuration Files
RabbitMQ uses two main configuration files to set the required variables. Both files are stored under /etc/rabbitmq
/etc/rabbitmq/rabbitmq.config
SSL Information
TCP Listening Ports
Connection Timeouts
Heartbeat Interval
/etc/rabbitmq/rabbitmq-env.conf
NODENAME=<Host_ShortName>
Note: If USE_LONGNAME is set to true, the FQDN is used for naming instead of the short name
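To check how a node is named without reading the file by eye, the KEY=VALUE lines of rabbitmq-env.conf can be read with a small parser. This is a simplified sketch: the real file is interpreted by a shell, so anything beyond plain assignments is ignored here, and the helper names are invented for this example.

```python
def parse_env_conf(text):
    """Read simple KEY=VALUE assignments, ignoring comments and blank lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip().strip("\"'")
    return settings


def uses_long_names(settings):
    """True when USE_LONGNAME is set to true, i.e. FQDN-based naming."""
    return settings.get("USE_LONGNAME", "false").lower() == "true"
```

On an appliance, the text would come from open("/etc/rabbitmq/rabbitmq-env.conf").read().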
RabbitMQ Server Service
RabbitMQ is controlled with service rabbitmq-server <<options>>
Log Locations and Their Usage
All RabbitMQ logs are stored under /var/log/rabbitmq/*
The main operational log is /var/log/rabbitmq/rabbit@<host_shortname>.log
This log contains messages about
Startup
Shutdown
Plugin Information
Queue Sync
RabbitMQ is only a broker; it has no information on what other systems are doing with the messages. Its log only shows content about messages received or processed
Command-Line Options for Troubleshooting
From an SSH session on the vRA appliance, the rabbitmqctl command can be used to control the RabbitMQ system.
Some of the options commonly used in troubleshooting are
rabbitmqctl cluster_status gives the definitive RabbitMQ clustering status
The running nodes line in the output of this command should contain all the nodes that are part of the cluster
rabbitmqctl list_policies lists all currently enforced policies. In vRA, only one policy should be returned: ha-all
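The running nodes check can also be scripted. The sketch below assumes the Erlang-term style output that older rabbitmqctl releases print, with a line such as {running_nodes,[rabbit@vra01,rabbit@vra02]}. Other releases format the status differently, so treat the regular expression as an assumption to adapt. The node names and function names are hypothetical.

```python
import re


def running_nodes(cluster_status_output):
    """Extract node names from the {running_nodes,[...]} entry, if present."""
    match = re.search(r"\{running_nodes,\s*\[(.*?)\]\}",
                      cluster_status_output, re.DOTALL)
    if match is None:
        return []
    # Node names containing special characters may be wrapped in single quotes.
    return [n.strip().strip("'") for n in match.group(1).split(",") if n.strip()]


def missing_nodes(cluster_status_output, expected):
    """Return the expected nodes that are not listed as running."""
    running = set(running_nodes(cluster_status_output))
    return sorted(set(expected) - running)
```

An empty list from missing_nodes means every expected node appears as running; anything else names the nodes to investigate.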
Re-Join a node to RabbitMQ cluster
If a node reports fewer running nodes than expected, it can be re-joined to the RabbitMQ cluster through the VAMI
The node being joined to the cluster is reset, which removes all messages and metadata on that node. Because the ha-all policy is set, as discussed above, all messages and metadata are also held on the other nodes. So even though the node is reset, the messages and metadata are copied back to it once it is back in the cluster.
Reset RabbitMQ Usage
As a last resort, RabbitMQ can be reset to a default state by clicking Reset Rabbitmq Cluster in the VAMI.
Keep in mind that this
Clears all messages out of the queues
Should only be used as a last resort
Destroys all historical data
!! Happy Learning !!