WildFly HornetQ Message Replication-Based HA Configuration

From NovaOrdis Knowledge Base

Overview

For high availability purposes, the live server and the backup server must be installed on two separate physical (or virtual) hosts, provisioned in such a way as to minimize the probability of both hosts failing at the same time.

Configuration

backup-group-name

This is the unique name which identifies a live/backup pair that should replicate with each other. Example: "pair-A".

check-for-live-server

Whether a replicated live server should check the current cluster to see if there is already a live server with the same node ID. Default is false.

TODO: ?

failover-on-shutdown

Whether failover occurs on a normal (clean) server shutdown, that is, whether the backup server becomes the live server when the live server is stopped normally rather than crashing. Default is false.

allow-failback

Whether this server, once it has become live after a failover, will automatically shut down when the original live server comes back up. Default is true.

max-saved-replicated-journal-size

The maximum number of backup journals to keep after failback occurs. Specifying this attribute is only necessary if allow-failback is true. The default value is 2, which means that after 2 failbacks the backup server must be restarted in order to be able to replicate the journal from the live server and become a backup again.
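
The attributes described above can be combined as in the following minimal sketch of the HA-related part of a backup server's configuration. The values are illustrative only. The allow-failback and max-saved-replicated-journal-size elements do not appear in the procedure below and are shown here with their documented defaults. Whether they are accepted, and in which order, depends on the messaging subsystem schema version in use, so treat this as an assumption to validate against your installation.

```xml
<!-- Sketch only: HA-related elements of a backup server in a replicated pair.
     Values are illustrative; element order may have to follow the subsystem schema. -->
<hornetq-server>
   <backup>true</backup>
   <shared-store>false</shared-store>
   <backup-group-name>pair-A</backup-group-name>
   <check-for-live-server>true</check-for-live-server>
   <failover-on-shutdown>true</failover-on-shutdown>
   <!-- documented default: the backup steps down when the original live server returns -->
   <allow-failback>true</allow-failback>
   <!-- documented default: keep at most 2 saved journal copies after failback -->
   <max-saved-replicated-journal-size>2</max-saved-replicated-journal-size>
   <!-- remaining configuration elided -->
</hornetq-server>
```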

Procedure

Active Node Configuration

...
<subsystem xmlns="urn:jboss:domain:messaging:1.4"> 
   <hornetq-server> 

      <backup>false</backup>
      <shared-store>false</shared-store>
      <backup-group-name>pair-A</backup-group-name>
      <check-for-live-server>true</check-for-live-server>
      <failover-on-shutdown>true</failover-on-shutdown>

      <persistence-enabled>true</persistence-enabled>

      <cluster-user>hq</cluster-user>
      <cluster-password>hq123</cluster-password>
 
      ...
      
      <connectors>
         ...
         <netty-connector name="backup-node-connector" socket-binding="backup-node-hornetq-binding"/>
      </connectors>

      ...
      
      <cluster-connections>
          <cluster-connection name="pair-A-high-availability-cluster">
             <address>jms</address>
             <connector-ref>netty</connector-ref>
             <retry-interval>500</retry-interval>
             <use-duplicate-detection>true</use-duplicate-detection>
             <forward-when-no-consumers>true</forward-when-no-consumers>
             <max-hops>1</max-hops>
             <static-connectors>
                <connector-ref>backup-node-connector</connector-ref>
             </static-connectors>
          </cluster-connection>
      </cluster-connections>

      ...

      <jms-connection-factories>
         ...
         <connection-factory name="RemoteConnectionFactory">
            <ha>true</ha>
            <retry-interval>1000</retry-interval>
            <retry-interval-multiplier>1.0</retry-interval-multiplier>
            <reconnect-attempts>-1</reconnect-attempts> 
            <connectors> 
               <connector-ref connector-name="netty"/>
            </connectors> 
            <entries> 
               <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/> 
            </entries> 
         </connection-factory>
         ...
      </jms-connection-factories>
   </hornetq-server>
</subsystem>

...

<socket-binding-group name="standard-sockets"...>
        ...
        <outbound-socket-binding name="backup-node-hornetq-binding">
            <remote-destination host="1.2.3.5" port="5445"/>
        </outbound-socket-binding>
</socket-binding-group>
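
The cluster connection and the RemoteConnectionFactory above reference a connector named netty, whose definition falls in the elided parts of the configuration. In a stock WildFly full profile it is typically declared along the following lines. This fragment is an assumption based on the default configuration, not taken from the servers documented here. The port matches the 5445 used by the remote-destination above and seen in the acceptor log lines further down.

```xml
<!-- Assumed stock definitions, not copied from the configuration documented on this page. -->
<connectors>
   <!-- local connector referenced as "netty" by the cluster connection and connection factory -->
   <netty-connector name="netty" socket-binding="messaging"/>
   <!-- the node-specific connector to the peer (backup-node-connector or active-node-connector) goes here -->
</connectors>
<acceptors>
   <netty-acceptor name="netty" socket-binding="messaging"/>
</acceptors>

<!-- Inside the socket-binding-group: the local port the peer's outbound binding points at. -->
<socket-binding name="messaging" port="5445"/>
```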

Backup Node Configuration

...
<subsystem xmlns="urn:jboss:domain:messaging:1.4"> 
   <hornetq-server> 

      <backup>true</backup>
      <shared-store>false</shared-store>
      <backup-group-name>pair-A</backup-group-name>
      <check-for-live-server>true</check-for-live-server>
      <failover-on-shutdown>true</failover-on-shutdown>

      <persistence-enabled>true</persistence-enabled>

      <cluster-user>hq</cluster-user>
      <cluster-password>hq123</cluster-password>

      ...
      
      <connectors>
         ...
         <netty-connector name="active-node-connector" socket-binding="active-node-hornetq-binding"/>
      </connectors>

      ...
      
      <cluster-connections>
          <cluster-connection name="pair-A-high-availability-cluster">
             <address>jms</address>
             <connector-ref>netty</connector-ref>
             <retry-interval>500</retry-interval>
             <use-duplicate-detection>true</use-duplicate-detection>
             <forward-when-no-consumers>true</forward-when-no-consumers>
             <max-hops>1</max-hops>
             <static-connectors>
                <connector-ref>active-node-connector</connector-ref>
             </static-connectors>
          </cluster-connection>
      </cluster-connections>

      ...

      <!--
      <jms-connection-factories>
         ...
      </jms-connection-factories>
      -->
   </hornetq-server>
</subsystem>

...

<socket-binding-group name="standard-sockets"...>
        ...
        <outbound-socket-binding name="active-node-hornetq-binding">
            <remote-destination host="1.2.3.4" port="5445"/>
        </outbound-socket-binding>
</socket-binding-group>

JMS Connection Factories and JMS Destinations on the Backup Server

A backup HornetQ instance does not need the <jms-connection-factories> and <jms-destinations> sections, as any JMS components are created from the journal (in this configuration, the journal replicated from the live server) when the backup server becomes live.
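
As an illustration, a destination would then be declared on the active node only, for example as sketched below. The queue name and JNDI entries are hypothetical placeholders and do not come from the configuration documented here.

```xml
<!-- Active node only. The queue name and JNDI entries are hypothetical examples. -->
<jms-destinations>
   <jms-queue name="exampleQueue">
      <entry name="queue/example"/>
      <!-- entries under java:jboss/exported are visible to remote clients -->
      <entry name="java:jboss/exported/jms/queue/example"/>
   </jms-queue>
</jms-destinations>
```

On the backup node the corresponding section stays commented out or absent, as in the backup configuration above.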

Log Output

Active server starting:

00:54:10,313 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: live server is starting with configuration HornetQ Configuration (
clustered=true,
backup=false,
sharedStore=false,
journalDirectory=/opt/jboss/standalone/data/messagingjournal,
bindingsDirectory=/opt/jboss/standalone/data/messagingbindings,
largeMessagesDirectory=/opt/jboss/standalone/data/messaginglargemessages,
pagingDirectory=/opt/jboss/standalone/data/messagingpaging)
00:54:17,461 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221013: Using NIO Journal
00:54:19,925 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221007: Server is now live
00:54:19,925 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221001: HornetQ Server version 2.3.25.Final (2.3.x, 123) [5802038e-e5bb-11e5-89d2-3d869d769af8]
...

Stand-by server starting:

01:05:23,765 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: backup server is starting with configuration HornetQ Configuration (
clustered=true,
backup=true,
sharedStore=false,
journalDirectory=/opt/jboss/standalone/data/messagingjournal,
bindingsDirectory=/opt/jboss/standalone/data/messagingbindings,
largeMessagesDirectory=/opt/jboss/standalone/data/messaginglargemessages,
pagingDirectory=/opt/jboss/standalone/data/messagingpaging)
01:05:23,784 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingjournal to /opt/jboss/standalone/data/messagingjournal1
01:05:23,841 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221013: Using NIO Journal
01:05:24,068 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221109: HornetQ Backup Server version 2.3.25.Final (2.3.x, 123) [null] started, waiting live to fail before it gets active
01:05:28,335 INFO  [org.hornetq.core.server] (Old I/O client worker ([id: 0x241e2d9c, /172.31.22.24:43128 => /172.31.24.172:5445])) HQ221024: Backup server HornetQServerImpl::serverUUID=5802038e-e5bb-11e5-89d2-3d869d769af8 is synchronized with live-server.
01:05:28,339 INFO  [org.hornetq.core.server] (Thread-1 (HornetQ-server-HornetQServerImpl::serverUUID=null-213578370)) HQ221031: backup announced

Upon backup server startup, the live server log shows this:

01:05:25,144 INFO  [org.hornetq.core.server] (Thread-81) HQ221025: Replication: sending JournalFileImpl: (hornetq-data-2.hq id = 2, recordID = 2) (size=10,485,760) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingjournal/hornetq-data-2.hq
01:05:27,573 INFO  [org.hornetq.core.server] (Thread-81) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-3.bindings id = 1, recordID = 1) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-3.bindings
01:05:27,604 INFO  [org.hornetq.core.server] (Thread-81) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-2.bindings id = 2, recordID = 2) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-2.bindings

Failover to stand-by server:

01:08:16,869 WARN  [org.hornetq.core.client] (Thread-6 (HornetQ-client-global-threads-926703232)) HQ212004: Failed to connect to server.
01:08:16,884 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221037: HornetQServerImpl::serverUUID=5802038e-e5bb-11e5-89d2-3d869d769af8 to become live
01:08:17,334 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221020: Started Netty Acceptor version 3.6.10.Final-266dbdf 172.31.22.24:5445 for CORE protocol
01:08:17,346 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221020: Started Netty Acceptor version 3.6.10.Final-266dbdf 172.31.22.24:5455 for CORE protocol

Failback to the active server (as seen in the stand-by server log):

01:09:12,140 INFO  [org.hornetq.core.server] (Thread-78) HQ221025: Replication: sending JournalFileImpl: (hornetq-data-3.hq id = 9, recordID = 9) (size=10,485,760) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingjournal/hornetq-data-3.hq
01:09:14,168 INFO  [org.hornetq.core.server] (Thread-78) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-3.bindings id = 1, recordID = 1) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-3.bindings
01:09:14,192 INFO  [org.hornetq.core.server] (Thread-78) HQ221025: Replication: sending JournalFileImpl: (hornetq-bindings-2.bindings id = 8, recordID = 8) (size=1,048,576) to backup. NIOSequentialFile /opt/jboss/standalone/data/messagingbindings/hornetq-bindings-2.bindings
01:09:19,221 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED enabled=true
01:09:19,221 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=STOP_CALLED true
01:09:19,249 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=FAIL_OVER enabled=true
01:09:19,250 WARN  [org.hornetq.core.server] (Thread-78) HQ222015: LIVE IS STOPPING?!? message=FAIL_OVER true
01:09:19,276 INFO  [org.hornetq.core.server] (Thread-78) HQ221002: HornetQ Server version 2.3.25.Final (2.3.x, 123) [5802038e-e5bb-11e5-89d2-3d869d769af8] stopped
01:09:19,277 INFO  [org.hornetq.core.server] (Thread-78) HQ221039: Restarting as Replicating backup server after live restart
01:09:19,277 INFO  [org.hornetq.core.server] (Thread-78) HQ221000: backup server is starting with configuration HornetQ Configuration (
clustered=true,
backup=true,
sharedStore=false,
journalDirectory=/opt/jboss/standalone/data/messagingjournal,
bindingsDirectory=/opt/jboss/standalone/data/messagingbindings,
largeMessagesDirectory=/opt/jboss/standalone/data/messaginglargemessages,
pagingDirectory=/opt/jboss/standalone/data/messagingpaging)
01:09:19,278 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingbindings to /opt/jboss/standalone/data/messagingbindings2
01:09:19,279 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingjournal to /opt/jboss/standalone/data/messagingjournal2
01:09:19,279 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messagingpaging to /opt/jboss/standalone/data/messagingpaging2
01:09:19,279 WARN  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ222162: Moving data directory /opt/jboss/standalone/data/messaginglargemessages to /opt/jboss/standalone/data/messaginglargemessages2
01:09:19,279 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221013: Using NIO Journal
01:09:21,293 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=null) HQ221109: HornetQ Backup Server version 2.3.25.Final (2.3.x, 123) [null] started, waiting live to fail before it gets active
01:09:24,890 INFO  [org.hornetq.core.server] (Old I/O client worker ([id: 0x45afcbfa, /172.31.22.24:43263 => /172.31.24.172:5445])) HQ221024: Backup server HornetQServerImpl::serverUUID=5802038e-e5bb-11e5-89d2-3d869d769af8 is synchronized with live-server.
01:09:24,895 INFO  [org.hornetq.core.server] (Thread-1 (HornetQ-server-HornetQServerImpl::serverUUID=null-1956723314)) HQ221031: backup announced