WildFly HornetQ Message Replication-Based HA Configuration

From NovaOrdis Knowledge Base

Overview

For high availability, the live server and the backup server must be installed on two separate physical (or virtual) hosts, provisioned so as to minimize the probability of both hosts failing at the same time.

Configuration

Procedure

Active Node Configuration

...
<subsystem xmlns="urn:jboss:domain:messaging:1.4"> 
   <hornetq-server> 

      <backup>false</backup>
      <shared-store>false</shared-store>
      <backup-group-name>pair-A</backup-group-name>
      <check-for-live-server>true</check-for-live-server>
      <failover-on-shutdown>true</failover-on-shutdown>

      <persistence-enabled>true</persistence-enabled>

      <cluster-user>hq</cluster-user>
      <cluster-password>hq123</cluster-password>
 
      ...
      
      <connectors>
         ...
         <netty-connector name="backup-node-connector" socket-binding="backup-node-hornetq-binding"/>
      </connectors>

      ...
      
      <cluster-connections>
          <cluster-connection name="pair-A-high-availability-cluster">
             <address>jms</address>
             <connector-ref>netty</connector-ref>
             <retry-interval>500</retry-interval>
             <use-duplicate-detection>true</use-duplicate-detection>
             <forward-when-no-consumers>true</forward-when-no-consumers>
             <max-hops>1</max-hops>
             <static-connectors>
                <connector-ref>backup-node-connector</connector-ref>
             </static-connectors>
          </cluster-connection>
      </cluster-connections>

      ...

      <jms-connection-factories>
         ...
         <connection-factory name="RemoteConnectionFactory">
            <ha>true</ha>
            <retry-interval>1000</retry-interval>
            <retry-interval-multiplier>1.0</retry-interval-multiplier>
            <reconnect-attempts>-1</reconnect-attempts> 
            <connectors> 
               <connector-ref connector-name="netty"/>
            </connectors> 
            <entries> 
               <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/> 
            </entries> 
         </connection-factory>
         ...
      </jms-connection-factories>
   </hornetq-server>
</subsystem>

...

<socket-binding-group name="standard-sockets"...>
        ...
        <outbound-socket-binding name="backup-node-hornetq-binding">
            <remote-destination host="1.2.3.5" port="5445"/>
        </outbound-socket-binding>
</socket-binding-group>

Backup Node Configuration

...
<subsystem xmlns="urn:jboss:domain:messaging:1.4"> 
   <hornetq-server> 

      <backup>true</backup>
      <shared-store>false</shared-store>
      <backup-group-name>pair-A</backup-group-name>
      <check-for-live-server>true</check-for-live-server>
      <failover-on-shutdown>true</failover-on-shutdown>

      <persistence-enabled>true</persistence-enabled>

      <cluster-user>hq</cluster-user>
      <cluster-password>hq123</cluster-password>

      ...
      
      <connectors>
         ...
         <netty-connector name="active-node-connector" socket-binding="active-node-hornetq-binding"/>
      </connectors>

      ...
      
      <cluster-connections>
          <cluster-connection name="pair-A-high-availability-cluster">
             <address>jms</address>
             <connector-ref>netty</connector-ref>
             <retry-interval>500</retry-interval>
             <use-duplicate-detection>true</use-duplicate-detection>
             <forward-when-no-consumers>true</forward-when-no-consumers>
             <max-hops>1</max-hops>
             <static-connectors>
                <connector-ref>active-node-connector</connector-ref>
             </static-connectors>
          </cluster-connection>
      </cluster-connections>

      ...

      <!--
      <jms-connection-factories>
         ...
      </jms-connection-factories>
      -->
   </hornetq-server>
</subsystem>

...

<socket-binding-group name="standard-sockets"...>
        ...
        <outbound-socket-binding name="active-node-hornetq-binding">
            <remote-destination host="1.2.3.4" port="5445"/>
        </outbound-socket-binding>
</socket-binding-group>
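
The failback sequence captured under Log Output relies on the backup handing control back once the original live server returns. In HornetQ this behavior is governed by allow-failback, which is not set in the listings above and is assumed here to be left at its default of true. A minimal sketch that makes the setting explicit on both nodes, assuming the same urn:jboss:domain:messaging:1.4 schema, could look like this:

...
<subsystem xmlns="urn:jboss:domain:messaging:1.4">
   <hornetq-server>
      ...
      <!-- assumption: shown explicitly for clarity, not part of the original listings -->
      <allow-failback>true</allow-failback>
      ...
   </hornetq-server>
</subsystem>

Setting it to false on the backup node would, under the same assumption, leave the backup live after the original active node restarts.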

JMS Connection Factories and JMS Destinations on the Backup Server

A backup HornetQ instance does not need the <jms-connection-factories> and <jms-destinations> sections. With <shared-store>false</shared-store> there is no shared journal. The backup recreates its JMS components from the journal replicated from the live server when it becomes live.
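
By contrast, destinations are declared only on the active node. As an illustration, a sketch with a hypothetical queue named exampleQueue (the queue name and JNDI entries are placeholders, not taken from the configuration above) could look like this inside the active node's <hornetq-server> element:

<jms-destinations>
   <!-- hypothetical queue; the name and JNDI entries are placeholders -->
   <jms-queue name="exampleQueue">
      <entry name="queue/exampleQueue"/>
      <entry name="java:jboss/exported/jms/queue/exampleQueue"/>
   </jms-queue>
</jms-destinations>

The backup node would carry no equivalent section and would be expected to pick the queue up from the replicated bindings when it activates.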

Log Output

Active server starting:

00:54:10,313 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: live server is starting with configuration HornetQ Configuration (
clustered=true,
backup=false,
sharedStore=false,
journalDirectory=/opt/jboss/standalone/data/messagingjournal,
bindingsDirectory=/opt/jboss/standalone/data/messagingbindings,
largeMessagesDirectory=/opt/jboss/standalone/data/messaginglargemessages,
pagingDirectory=/opt/jboss/standalone/data/messagingpaging)
00:54:17,461 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221013: Using NIO Journal
00:54:19,925 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221007: Server is now live
00:54:19,925 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221001: HornetQ Server version 2.3.25.Final (2.3.x, 123) [5802038e-e5bb-11e5-89d2-3d869d769af8]
...

Stand-by server starting:

01:05:23,765 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: backup server is starting with configuration HornetQ Configuration (
clustered=true,
backup=true,
sharedStore=false,
journalDirectory=/opt/jboss/standalone/data/messagingjournal,
bindingsDirectory=/opt/jboss/standalone/data/messagingbindings,
largeMessagesDirectory=/opt/jboss/standalone/data/messaginglargemessages,
pagingDirectory=/opt/jboss/standalone/data/messagingpaging)
01:05:23,784 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingjournal to /opt/jboss/standalone/data/messagingjournal1
01:05:23,841 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221013: Using NIO Journal
01:05:24,068 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221109: HornetQ Backup Server version 2.3.25.Final (2.3.x, 123) [null] started, waiting live to fail before it gets active
01:05:28,335 INFO  [org.hornetq.core.server] (Old I/O client worker ([id: 0x241e2d9c, /172.31.22.24:43128 => /172.31.24.172:5445])) HQ221024: Backup server HornetQServerImpl::serverUUID=5802038e-e5bb-11e5-89d2-3d869d769af8 is synchronized with live-server.
01:05:28,339 INFO  [org.hornetq.core.server] (Thread-1 (HornetQ-server-HornetQServerImpl::serverUUID=null-213578370)) HQ221031: backup announced

Upon backup server startup, the live server log shows this:

01:05:25,144 INFO  [org.hornetq.core.server] (Thread-81) HQ221025: Replication: sending JournalFileImpl: (hornetq-data-2.hq id = 2, recordID = 2) (size=10,485,760) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingjournal/hornetq-data-2.hq
01:05:27,573 INFO  [org.hornetq.core.server] (Thread-81) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-3.bindings id = 1, recordID = 1) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-3.bindings
01:05:27,604 INFO  [org.hornetq.core.server] (Thread-81) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-2.bindings id = 2, recordID = 2) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-2.bindings

Failover to stand-by server:

01:08:16,869 WARN  [org.hornetq.core.client] (Thread-6 (HornetQ-client-global-threads-926703232)) HQ212004: Failed to connect to server.
01:08:16,884 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221037: HornetQServerImpl::serverUUID=5802038e-e5bb-11e5-89d2-3d869d769af8 to become live
01:08:17,334 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221020: Started Netty Acceptor version 3.6.10.Final-266dbdf 172.31.22.24:5445 for CORE protocol
01:08:17,346 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221020: Started Netty Acceptor version 3.6.10.Final-266dbdf 172.31.22.24:5455 for CORE protocol

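Failback to the active server (log from the stand-by node, which synchronizes the restarted active node, stops, and restarts as backup):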

01:09:12,140 INFO  [org.hornetq.core.server] (Thread-78) HQ221025: Replication: sending JournalFileImpl: (hornetq-data-3.hq id = 9, recordID = 9) (size=10,485,760) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingjournal/hornetq-data-3.hq
01:09:14,168 INFO  [org.hornetq.core.server] (Thread-78) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-3.bindings id = 1, recordID = 1) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-3.bindings
01:09:14,192 INFO  [org.hornetq.core.server] (Thread-78) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-2.bindings id = 8, recordID = 8) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-2.bindings
01:09:19,221 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED enabled=true
01:09:19,221 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED true
01:09:19,249 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=FAIL_OVER enabled=true
01:09:19,250 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=FAIL_OVER true
01:09:19,276 INFO  [org.hornetq.core.server] (Thread-78) HQ221002: HornetQ Server version 2.3.25.Final (2.3.x, 123) [5802038e-e5bb-11e5-89d2-3d869d769af8] stopped
01:09:19,277 INFO  [org.hornetq.core.server] (Thread-78) HQ221039: Restarting as Replicating backup server after live restart
01:09:19,277 INFO  [org.hornetq.core.server] (Thread-78) HQ221000: backup server is starting with configuration HornetQ Configuration (
clustered=true,
backup=true,
sharedStore=false,
journalDirectory=/opt/jboss/standalone/data/messagingjournal,
bindingsDirectory=/opt/jboss/standalone/data/messagingbindings,
largeMessagesDirectory=/opt/jboss/standalone/data/messaginglargemessages,
pagingDirectory=/opt/jboss/standalone/data/messagingpaging)
01:09:19,278 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingbindings to /opt/jboss/standalone/data/messagingbindings2
01:09:19,279 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingjournal to /opt/jboss/standalone/data/messagingjournal2
01:09:19,279 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingpaging to /opt/jboss/standalone/data/messagingpaging2
01:09:19,279 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messaginglargemessages to /opt/jboss/standalone/data/messaginglargemessages2
01:09:19,279 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221013: Using NIO Journal
01:09:21,293 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221109: HornetQ Backup Server version 2.3.25.Final (2.3.x, 123) [null] started, waiting live to fail before it gets active
01:09:24,890 INFO  [org.hornetq.core.server] (Old I/O client worker ([id: 0x45afcbfa, /172.31.22.24:43263 => /172.31.24.172:5445])) HQ221024: Backup server HornetQServerImpl::serverUUID=5802038e-e5bb-11e5-89d2-3d869d769af8 is synchronized with live-server.
01:09:24,895 INFO  [org.hornetq.core.server] (Thread-1 (HornetQ-server-HornetQServerImpl::serverUUID=null-1956723314)) HQ221031: backup announced