WildFly HornetQ Collocated Message-Replication Based HA Configuration

From NovaOrdis Knowledge Base

Overview

This article describes the steps required to implement a highly available collocated HornetQ topology with EAP 6 and higher, using message replication. The concepts behind such a topology are presented here:

HornetQ Collocated HA Topology Concepts

For high availability purposes, the live server and the backup server instances must be installed on two separate physical (or virtual) hosts, provisioned in such a way as to minimize the probability of both hosts failing at the same time. For this specific configuration, the state replication between the active and the backup node is done over the network, so a shared filesystem is not required.
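
As a sketch of what makes a pair replicated, the fragment below reduces one live/backup pair to the elements that control replication. The values are copied from the full node configurations in this article; the comments are explanatory additions and all other elements are omitted:

    <!-- live server of pair A (runs on node 1) -->
    <hornetq-server name="active-node-pair-A">
        <backup>false</backup>
        <!-- false: state is replicated over the network, no shared store is used -->
        <shared-store>false</shared-store>
        <!-- the live and the backup server of a pair carry the same group name -->
        <backup-group-name>pair-A</backup-group-name>
        ...
    </hornetq-server>

    <!-- backup server of pair A (runs on node 2) -->
    <hornetq-server name="backup-node-pair-A">
        <backup>true</backup>
        <shared-store>false</shared-store>
        <backup-group-name>pair-A</backup-group-name>
        ...
    </hornetq-server>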

Collocation Considerations

In a collocated configuration, we might end up in a situation where two active HornetQ nodes run within the same JVM. This happens if one of the hosts HornetQ is deployed on fails, and the stand-by HornetQ node on the surviving host becomes active. That is why we need to make sure that the acceptor IDs and ports do not overlap for instances running on the same host. The configuration examples below comply with this requirement.

In-VM Acceptor ID

Note that the collocated nodes must have different in-vm acceptor IDs, otherwise on activating the stand-by node we'll get:

15:42:46,651 ERROR [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=2c99027a-e569-11e5-987a-0981dfa56d61) HQ224000: Failure in initialisation: java.lang.IllegalArgumentException: HQ119062: Acceptor with id 0 already registered
	at org.hornetq.core.remoting.impl.invm.InVMRegistry.registerAcceptor(InVMRegistry.java:36) [hornetq-server-2.3.25.Final-redhat-1.jar:2.3.25.Final-redhat-1]
        ...
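
A minimal sketch of how the two collocated servers avoid this error, using the IDs that appear in the node configurations in this article (0 for the active server, 50 for the backup). The specific values are only the ones chosen here; the requirement is that they differ within one JVM:

    <!-- active server on this host -->
    <connectors>
        <in-vm-connector name="in-vm" server-id="0"/>
    </connectors>
    <acceptors>
        <in-vm-acceptor name="in-vm" server-id="0"/>
    </acceptors>

    <!-- collocated backup server on the same host: a different server-id -->
    <connectors>
        <in-vm-connector name="in-vm" server-id="50"/>
    </connectors>
    <acceptors>
        <in-vm-acceptor name="in-vm" server-id="50"/>
    </acceptors>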

Netty Acceptor Port

Note that the collocated nodes must have different netty acceptor ports, otherwise on activating the stand-by node we'll get:

15:55:44,035 ERROR [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=f50df054-e56f-11e5-b2e2-ab1fe58dc3e0) HQ224000: Failure in initialisation: org.jboss.netty.channel.ChannelException: Failed to bind to: /172.31.19.27:5445
	at org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) [netty-3.6.10.Final-redhat-1.jar:3.6.10.Final-redhat-1]
        [...]
Caused by: java.net.BindException: Address already in use
	at java.net.PlainSocketImpl.socketBind(Native Method) [rt.jar:1.8.0_74]
        [...]
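
A minimal sketch of the port separation, using the socket binding names and ports from the node configurations in this article. Each collocated server's netty acceptor refers to its own socket binding, so the two acceptors never try to bind the same port:

    <!-- in the socket binding group: one port per collocated server -->
    <socket-binding name="messaging-pair-A" port="5445"/>
    <socket-binding name="messaging-pair-B" port="5465"/>

    <!-- in the pair A server on this host -->
    <acceptors>
        <netty-acceptor name="netty" socket-binding="messaging-pair-A"/>
    </acceptors>

    <!-- in the pair B server on this host -->
    <acceptors>
        <netty-acceptor name="netty" socket-binding="messaging-pair-B"/>
    </acceptors>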

Procedure

Node 1 Configuration

Filesystem Configuration Paths on Node 1

We need two different local filesystem directories to store messaging state for the two messaging nodes that run on each JBoss instance. For node 1, this is configured as follows (the configuration is identical for node 2):

    <paths>
      <path name="hornetq.pairA.dir" path="${jboss.server.data.dir}/messaging-pairA"/>
      <path name="hornetq.pairB.dir" path="${jboss.server.data.dir}/messaging-pairB"/>
    </paths>

Messaging Subsystem Configuration on Node 1

        <subsystem xmlns="urn:jboss:domain:messaging:1.4">
            <hornetq-server name="active-node-pair-A">
                <backup>false</backup>
                <shared-store>false</shared-store>
                <backup-group-name>pair-A</backup-group-name>
                <check-for-live-server>true</check-for-live-server>
                <failover-on-shutdown>true</failover-on-shutdown>
                <cluster-user>hq</cluster-user>
                <cluster-password>hq123</cluster-password>
                <persistence-enabled>true</persistence-enabled>
                <journal-type>NIO</journal-type>
                <journal-min-files>2</journal-min-files>
                <create-bindings-dir>true</create-bindings-dir>
                <create-journal-dir>true</create-journal-dir>
                <paging-directory path="paging" relative-to="hornetq.pairA.dir"/>
                <bindings-directory path="bindings" relative-to="hornetq.pairA.dir"/>
                <journal-directory path="journal" relative-to="hornetq.pairA.dir"/>
                <large-messages-directory path="large-messages" relative-to="hornetq.pairA.dir"/>

                <connectors>
                    <netty-connector name="netty" socket-binding="messaging-pair-A"/>
                    <netty-connector name="pair" socket-binding="messaging-backup-node-pair-A"/>
                    <in-vm-connector name="in-vm" server-id="0"/>
                </connectors>
                <acceptors>
                    <netty-acceptor name="netty" socket-binding="messaging-pair-A"/>
                    <in-vm-acceptor name="in-vm" server-id="0"/>
                </acceptors>
                <cluster-connections>
                    <cluster-connection name="pair-A-message-replication-ha-cluster">
                        <address>jms</address>
                        <connector-ref>netty</connector-ref>
                        <retry-interval>500</retry-interval>
                        <use-duplicate-detection>true</use-duplicate-detection>
                        <forward-when-no-consumers>true</forward-when-no-consumers>
                        <max-hops>1</max-hops>
                        <static-connectors>
                            <connector-ref>pair</connector-ref>
                        </static-connectors>
                    </cluster-connection>
                </cluster-connections>
                <security-settings>
                    <security-setting match="#">
                        <permission type="send" roles="guest"/>
                        <permission type="consume" roles="guest"/>
                        <permission type="createNonDurableQueue" roles="guest"/>
                        <permission type="deleteNonDurableQueue" roles="guest"/>
                    </security-setting>
                </security-settings>
                <address-settings>
                    <!--default for catch all-->
                    <address-setting match="#">
                        <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                        <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                        <redelivery-delay>0</redelivery-delay>
                        <redistribution-delay>1000</redistribution-delay>
                        <max-size-bytes>10485760</max-size-bytes>
                        <address-full-policy>PAGE</address-full-policy>
                        <page-size-bytes>2097152</page-size-bytes>
                        <message-counter-history-day-limit>10</message-counter-history-day-limit>
                    </address-setting>
                </address-settings>
                <jms-connection-factories>
                    <connection-factory name="InVmConnectionFactory">
                        <connectors>
                            <connector-ref connector-name="in-vm"/>
                        </connectors>
                        <entries>
                            <entry name="java:/ConnectionFactory"/>
                        </entries>
                    </connection-factory>
                    <connection-factory name="RemoteConnectionFactory">
                        <ha>true</ha>
                        <retry-interval>1000</retry-interval>
                        <retry-interval-multiplier>1.0</retry-interval-multiplier>
                        <reconnect-attempts>-1</reconnect-attempts>
                        <block-on-acknowledge>true</block-on-acknowledge>
                        <connectors>
                            <connector-ref connector-name="netty"/>
                        </connectors>
                        <entries>
                            <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/>
                        </entries>
                    </connection-factory>
                    <pooled-connection-factory name="hornetq-ra">
                        <transaction mode="xa"/>
                        <connectors>
                            <connector-ref connector-name="in-vm"/>
                        </connectors>
                        <entries>
                            <entry name="java:/JmsXA"/>
                        </entries>
                    </pooled-connection-factory>
                </jms-connection-factories>
                <jms-destinations>
                    <jms-queue name="ExpiryQueue">
                        <entry name="java:/jms/queue/ExpiryQueue"/>
                    </jms-queue>
                    <jms-queue name="DLQ">
                        <entry name="java:/jms/queue/DLQ"/>
                    </jms-queue>
                </jms-destinations>
            </hornetq-server>
            <hornetq-server name="backup-node-pair-B">
                <backup>true</backup>
                <shared-store>false</shared-store>
                <backup-group-name>pair-B</backup-group-name>
                <check-for-live-server>true</check-for-live-server>
                <failover-on-shutdown>true</failover-on-shutdown>
                <cluster-user>hq</cluster-user>
                <cluster-password>hq123</cluster-password>
                <persistence-enabled>true</persistence-enabled>
                <journal-type>NIO</journal-type>
                <journal-min-files>2</journal-min-files>
                <create-bindings-dir>true</create-bindings-dir>
                <create-journal-dir>true</create-journal-dir>
                <paging-directory path="paging" relative-to="hornetq.pairB.dir"/>
                <bindings-directory path="bindings" relative-to="hornetq.pairB.dir"/>
                <journal-directory path="journal" relative-to="hornetq.pairB.dir"/>
                <large-messages-directory path="large-messages" relative-to="hornetq.pairB.dir"/>

                <connectors>
                    <netty-connector name="netty" socket-binding="messaging-pair-B"/>
                    <netty-connector name="pair" socket-binding="messaging-active-node-pair-B"/>
                    <in-vm-connector name="in-vm" server-id="50"/>
                </connectors>
                <acceptors>
                    <netty-acceptor name="netty" socket-binding="messaging-pair-B"/>
                    <in-vm-acceptor name="in-vm" server-id="50"/>
                </acceptors>
                <cluster-connections>
                    <cluster-connection name="pair-B-message-replication-ha-cluster">
                        <address>jms</address>
                        <connector-ref>netty</connector-ref>
                        <retry-interval>500</retry-interval>
                        <use-duplicate-detection>true</use-duplicate-detection>
                        <forward-when-no-consumers>true</forward-when-no-consumers>
                        <max-hops>1</max-hops>
                        <static-connectors>
                            <connector-ref>pair</connector-ref>
                        </static-connectors>
                    </cluster-connection>
                </cluster-connections>
                <security-settings>
                    <security-setting match="#">
                        <permission type="send" roles="guest"/>
                        <permission type="consume" roles="guest"/>
                        <permission type="createNonDurableQueue" roles="guest"/>
                        <permission type="deleteNonDurableQueue" roles="guest"/>
                    </security-setting>
                </security-settings>
                <address-settings>
                    <!--default for catch all-->
                    <address-setting match="#">
                        <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                        <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                        <redelivery-delay>0</redelivery-delay>
                        <redistribution-delay>1000</redistribution-delay>
                        <max-size-bytes>10485760</max-size-bytes>
                        <address-full-policy>PAGE</address-full-policy>
                        <page-size-bytes>2097152</page-size-bytes>
                        <message-counter-history-day-limit>10</message-counter-history-day-limit>
                    </address-setting>
                </address-settings>
            </hornetq-server>
        </subsystem>

Socket Binding Group Configuration on Node 1

Delete all existing "messaging" socket bindings and add these new ones:

    <socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}">
        ...
        <socket-binding name="messaging-pair-A" port="5445"/>
        <socket-binding name="messaging-pair-B" port="5465"/>
        <!-- delete all other "messaging" bindings -->
        ...
        <outbound-socket-binding name="messaging-backup-node-pair-A">
            <remote-destination host="172.31.24.234" port="5445"/>
        </outbound-socket-binding>
        <outbound-socket-binding name="messaging-active-node-pair-B">
            <remote-destination host="172.31.24.234" port="5465"/>
        </outbound-socket-binding>
    </socket-binding-group>

JMS Connection Factories for Pair B on Node 1

A backup HornetQ instance does not need the <jms-connection-factories> and <jms-destinations> sections, as any JMS components are created from the replicated journal when the backup server becomes live. That is why the backup-node-pair-B server above omits them.

Node 2 Configuration

Filesystem Configuration Paths on Node 2

We need two different local filesystem directories to store messaging state for the two messaging nodes that run on each JBoss instance. For node 2, the configuration is identical to node 1:

    <paths>
      <path name="hornetq.pairA.dir" path="${jboss.server.data.dir}/messaging-pairA"/>
      <path name="hornetq.pairB.dir" path="${jboss.server.data.dir}/messaging-pairB"/>
    </paths>

Messaging Subsystem Configuration on Node 2

        <subsystem xmlns="urn:jboss:domain:messaging:1.4">
            <hornetq-server name="active-node-pair-B">
                <backup>false</backup>
                <shared-store>false</shared-store>
                <backup-group-name>pair-B</backup-group-name>
                <check-for-live-server>true</check-for-live-server>
                <failover-on-shutdown>true</failover-on-shutdown>
                <cluster-user>hq</cluster-user>
                <cluster-password>hq123</cluster-password>
                <persistence-enabled>true</persistence-enabled>
                <journal-type>NIO</journal-type>
                <journal-min-files>2</journal-min-files>
                <create-bindings-dir>true</create-bindings-dir>
                <create-journal-dir>true</create-journal-dir>
                <paging-directory path="paging" relative-to="hornetq.pairB.dir"/>
                <bindings-directory path="bindings" relative-to="hornetq.pairB.dir"/>
                <journal-directory path="journal" relative-to="hornetq.pairB.dir"/>
                <large-messages-directory path="large-messages" relative-to="hornetq.pairB.dir"/>

                <connectors>
                    <netty-connector name="netty" socket-binding="messaging-pair-B"/>
                    <netty-connector name="pair" socket-binding="messaging-backup-node-pair-B"/>
                    <in-vm-connector name="in-vm" server-id="0"/>
                </connectors>
                <acceptors>
                    <netty-acceptor name="netty" socket-binding="messaging-pair-B"/>
                    <in-vm-acceptor name="in-vm" server-id="0"/>
                </acceptors>
                <cluster-connections>
                    <cluster-connection name="pair-B-message-replication-ha-cluster">
                        <address>jms</address>
                        <connector-ref>netty</connector-ref>
                        <retry-interval>500</retry-interval>
                        <use-duplicate-detection>true</use-duplicate-detection>
                        <forward-when-no-consumers>true</forward-when-no-consumers>
                        <max-hops>1</max-hops>
                        <static-connectors>
                            <connector-ref>pair</connector-ref>
                        </static-connectors>
                    </cluster-connection>
                </cluster-connections>
                <security-settings>
                    <security-setting match="#">
                        <permission type="send" roles="guest"/>
                        <permission type="consume" roles="guest"/>
                        <permission type="createNonDurableQueue" roles="guest"/>
                        <permission type="deleteNonDurableQueue" roles="guest"/>
                    </security-setting>
                </security-settings>
                <address-settings>
                    <!--default for catch all-->
                    <address-setting match="#">
                        <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                        <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                        <redelivery-delay>0</redelivery-delay>
                        <redistribution-delay>1000</redistribution-delay>
                        <max-size-bytes>10485760</max-size-bytes>
                        <address-full-policy>PAGE</address-full-policy>
                        <page-size-bytes>2097152</page-size-bytes>
                        <message-counter-history-day-limit>10</message-counter-history-day-limit>
                    </address-setting>
                </address-settings>
                <jms-connection-factories>
                    <connection-factory name="InVmConnectionFactory">
                        <connectors>
                            <connector-ref connector-name="in-vm"/>
                        </connectors>
                        <entries>
                            <entry name="java:/ConnectionFactory"/>
                        </entries>
                    </connection-factory>
                    <connection-factory name="RemoteConnectionFactory">
                        <ha>true</ha>
                        <retry-interval>1000</retry-interval>
                        <retry-interval-multiplier>1.0</retry-interval-multiplier>
                        <reconnect-attempts>-1</reconnect-attempts>
                        <block-on-acknowledge>true</block-on-acknowledge>
                        <connectors>
                            <connector-ref connector-name="netty"/>
                        </connectors>
                        <entries>
                            <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/>
                        </entries>
                    </connection-factory>
                    <pooled-connection-factory name="hornetq-ra">
                        <transaction mode="xa"/>
                        <connectors>
                            <connector-ref connector-name="in-vm"/>
                        </connectors>
                        <entries>
                            <entry name="java:/JmsXA"/>
                        </entries>
                    </pooled-connection-factory>
                </jms-connection-factories>
                <jms-destinations>
                    <jms-queue name="ExpiryQueue">
                        <entry name="java:/jms/queue/ExpiryQueue"/>
                    </jms-queue>
                    <jms-queue name="DLQ">
                        <entry name="java:/jms/queue/DLQ"/>
                    </jms-queue>
                </jms-destinations>
            </hornetq-server>
            <hornetq-server name="backup-node-pair-A">
                <backup>true</backup>
                <shared-store>false</shared-store>
                <backup-group-name>pair-A</backup-group-name>
                <check-for-live-server>true</check-for-live-server>
                <failover-on-shutdown>true</failover-on-shutdown>
                <cluster-user>hq</cluster-user>
                <cluster-password>hq123</cluster-password>
                <persistence-enabled>true</persistence-enabled>
                <journal-type>NIO</journal-type>
                <journal-min-files>2</journal-min-files>
                <create-bindings-dir>true</create-bindings-dir>
                <create-journal-dir>true</create-journal-dir>
                <paging-directory path="paging" relative-to="hornetq.pairA.dir"/>
                <bindings-directory path="bindings" relative-to="hornetq.pairA.dir"/>
                <journal-directory path="journal" relative-to="hornetq.pairA.dir"/>
                <large-messages-directory path="large-messages" relative-to="hornetq.pairA.dir"/>

                <connectors>
                    <netty-connector name="netty" socket-binding="messaging-pair-A"/>
                    <netty-connector name="pair" socket-binding="messaging-active-node-pair-A"/>
                    <in-vm-connector name="in-vm" server-id="50"/>
                </connectors>
                <acceptors>
                    <netty-acceptor name="netty" socket-binding="messaging-pair-A"/>
                    <in-vm-acceptor name="in-vm" server-id="50"/>
                </acceptors>
                <cluster-connections>
                    <cluster-connection name="pair-A-message-replication-ha-cluster">
                        <address>jms</address>
                        <connector-ref>netty</connector-ref>
                        <retry-interval>500</retry-interval>
                        <use-duplicate-detection>true</use-duplicate-detection>
                        <forward-when-no-consumers>true</forward-when-no-consumers>
                        <max-hops>1</max-hops>
                        <static-connectors>
                            <connector-ref>pair</connector-ref>
                        </static-connectors>
                    </cluster-connection>
                </cluster-connections>
                <security-settings>
                    <security-setting match="#">
                        <permission type="send" roles="guest"/>
                        <permission type="consume" roles="guest"/>
                        <permission type="createNonDurableQueue" roles="guest"/>
                        <permission type="deleteNonDurableQueue" roles="guest"/>
                    </security-setting>
                </security-settings>
                <address-settings>
                    <!--default for catch all-->
                    <address-setting match="#">
                        <dead-letter-address>jms.queue.DLQ</dead-letter-address>
                        <expiry-address>jms.queue.ExpiryQueue</expiry-address>
                        <redelivery-delay>0</redelivery-delay>
                        <redistribution-delay>1000</redistribution-delay>
                        <max-size-bytes>10485760</max-size-bytes>
                        <address-full-policy>PAGE</address-full-policy>
                        <page-size-bytes>2097152</page-size-bytes>
                        <message-counter-history-day-limit>10</message-counter-history-day-limit>
                    </address-setting>
                </address-settings>
            </hornetq-server>
        </subsystem>

Socket Binding Group Configuration on Node 2

Delete all existing "messaging" socket bindings and add these new ones:

    <socket-binding-group name="standard-sockets" default-interface="public" port-offset="${jboss.socket.binding.port-offset:0}">
        ...
        <socket-binding name="messaging-pair-A" port="5445"/>
        <socket-binding name="messaging-pair-B" port="5465"/>
        <!-- delete all other "messaging" bindings -->
        ...
        <outbound-socket-binding name="messaging-backup-node-pair-B">
            <remote-destination host="172.31.28.249" port="5465"/>
        </outbound-socket-binding>
        <outbound-socket-binding name="messaging-active-node-pair-A">
            <remote-destination host="172.31.28.249" port="5445"/>
        </outbound-socket-binding>
    </socket-binding-group>

JMS Connection Factories for Pair A on Node 2

A backup HornetQ instance does not need the <jms-connection-factories> and <jms-destinations> sections, as any JMS components are created from the replicated journal when the backup server becomes live. That is why the backup-node-pair-A server above omits them.

Log Messages

Log messages on node 1 after node 2 comes on-line:

00:17:04,329 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221109: HornetQ Backup Server version 2.3.25.Final (2.3.x, 123) [null] started, waiting live to fail before it gets active
00:17:08,143 INFO  [org.hornetq.core.server] (Old I/O client worker ([id: 0xe37c96f3, /172.31.28.249:43339 => /172.31.24.234:5465])) HQ221024: Backup server HornetQServerImpl::serverUUID=8f496daf-fc77-11e5-b0d8-6f0d7be39c22 is synchronized with live-server.
00:17:08,146 INFO  [org.hornetq.core.server] (Thread-1 (HornetQ-server-HornetQServerImpl::serverUUID=null-1263094150)) HQ221031: backup announced

CLI Procedure

TODO: https://access.redhat.com/solutions/400873