WildFly HornetQ Message Replication-Based HA Configuration

From NovaOrdis Knowledge Base

Overview

For high availability, the live server and the backup server must be installed on two separate physical (or virtual) hosts, provisioned so as to minimize the probability of both hosts failing at the same time.

Configuration

shared-store

Whether this server is using shared store or not. Default is false.

backup-group-name

A unique name that identifies a live/backup pair whose members replicate with each other. Example: "pair-A".

check-for-live-server

Whether a replicated live server should check the current cluster, when starting up, for an existing live server with the same node ID. Default is false.

TODO: ?

failover-on-shutdown

Whether this backup server (if it is a backup server) becomes the live server on a normal server shutdown. Default is false.

allow-failback

Whether this server will automatically shut down if the original live server comes back up. Default is true.

max-saved-replicated-journal-size

The maximum number of backup journals to keep after failback occurs. Specifying this attribute is only necessary if allow-failback is true. The default value is 2, which means that after 2 failbacks the backup server must be restarted before it can replicate the journal from the live server and become a backup again.
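
As a rough illustration of how the attributes above fit together, the following sketch shows them as child elements of <hornetq-server> on the backup server of a replicated pair. The values are illustrative assumptions: pair-A is the example name used above, and allow-failback and max-saved-replicated-journal-size are simply set to their documented defaults. It is not meant as a recommended configuration.

```xml
<hornetq-server>
   <!-- this node starts as the backup of the pair -->
   <backup>true</backup>
   <!-- replication-based HA: the journal is replicated, not shared -->
   <shared-store>false</shared-store>
   <!-- must match the value configured on the live server -->
   <backup-group-name>pair-A</backup-group-name>
   <!-- step down when the original live server returns -->
   <allow-failback>true</allow-failback>
   <!-- documented default, spelled out for visibility -->
   <max-saved-replicated-journal-size>2</max-saved-replicated-journal-size>
   <!-- remaining configuration elided -->
</hornetq-server>
```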

Procedure

Active Node Configuration

...
<subsystem xmlns="urn:jboss:domain:messaging:1.4"> 
   <hornetq-server> 

      <backup>false</backup>
      <shared-store>false</shared-store>
      <backup-group-name>pair-A</backup-group-name>
      <check-for-live-server>true</check-for-live-server>
      <failover-on-shutdown>true</failover-on-shutdown>

      <persistence-enabled>true</persistence-enabled>

      ...
      
      <cluster-connections>
          <cluster-connection name="pair-A-high-availability-cluster">
             <address>jms</address>
             <connector-ref>netty</connector-ref>
             <retry-interval>500</retry-interval>
             <use-duplicate-detection>true</use-duplicate-detection>
             <forward-when-no-consumers>true</forward-when-no-consumers>
             <max-hops>1</max-hops>
             <static-connectors>
                <connector-ref>backup-node-connector</connector-ref>
             </static-connectors>
          </cluster-connection>
      </cluster-connections>

      ...

      <jms-connection-factories>
         ...
         <connection-factory name="RemoteConnectionFactory">
            <ha>true</ha>
            <retry-interval>1000</retry-interval>
            <retry-interval-multiplier>1.0</retry-interval-multiplier>
            <reconnect-attempts>-1</reconnect-attempts> 
            <connectors> 
               <connector-ref connector-name="netty"/>
            </connectors> 
            <entries> 
               <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/> 
            </entries> 
         </connection-factory>
         ...
      </jms-connection-factories>
   </hornetq-server>
</subsystem>
...
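
The cluster connection above refers to two connectors, netty and backup-node-connector, whose declarations are elided from the listing. A minimal sketch of how they might be declared follows, assuming the stand-by node is reached through an outbound socket binding. The binding names messaging and messaging-backup, the host name standby-host and the port 5445 are hypothetical placeholders, not values taken from this configuration.

```xml
<!-- inside <hornetq-server> -->
<connectors>
   <!-- connector advertised by this node; "messaging" is an assumed socket binding name -->
   <netty-connector name="netty" socket-binding="messaging"/>
   <!-- connector pointing at the other member of the pair -->
   <netty-connector name="backup-node-connector" socket-binding="messaging-backup"/>
</connectors>

<!-- inside the <socket-binding-group> used by the server -->
<outbound-socket-binding name="messaging-backup">
   <!-- hypothetical host and port of the stand-by node -->
   <remote-destination host="standby-host" port="5445"/>
</outbound-socket-binding>
```

The stand-by node would need the mirror-image declaration, with its static connector pointing back at the active node.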

Shared Path Declaration

The path is usually the same for the entire domain, so it can be declared once in the top-level section of the domain configuration.

   ...
   <paths>
      <path name="hornetq.shared.dir" path="/nfs/hornetq-shared-storage"/>
   </paths>
   ...

Live Server Configuration

jboss.messaging.hornetq.backup is false by default, but setting it explicitly makes the configuration obvious. Add the following to the active node's host.xml:

<host ...>
   <system-properties>
      <property name="jboss.messaging.hornetq.backup" value="false"/>
   </system-properties>
   ...
</host>

Stand-By Server Configuration

jboss.messaging.hornetq.backup should be set to "true" in the stand-by node's host.xml:

<host ...>
   <system-properties>
      <property name="jboss.messaging.hornetq.backup" value="true"/>
   </system-properties>
   ...
</host>
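
The host-level property only takes effect if the messaging subsystem actually reads it. One way to arrange that, sketched here as an assumption rather than taken from the active node configuration above (which hardcodes <backup>false</backup>), is to replace the literal value with an expression, so that the same profile can serve both nodes:

```xml
<hornetq-server>
   <!-- resolves to the value set in each host's host.xml; falls back to false if unset -->
   <backup>${jboss.messaging.hornetq.backup:false}</backup>
   <!-- remaining configuration elided -->
</hornetq-server>
```

This assumes that the backup element accepts expressions in the messaging subsystem version in use; it is worth verifying against the subsystem schema before relying on it.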

JMS Connection Factories

A backup HornetQ instance does not need the <jms-connection-factories> and <jms-destinations> sections, because any JMS components are created from the shared journal when the backup server becomes live.

Log Output

Active server starting:

13:14:00,312 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: live server is starting with configuration HornetQ Configuration (clustered=false,backup=false,sharedStore=true,journalDirectory=/nfs/hornetq-shared-storage/journal,bindingsDirectory=/nfs/hornetq-shared-storage/bindings,largeMessagesDirectory=/nfs/hornetq-shared-storage/large-messages,pagingDirectory=/nfs/hornetq-shared-storage/paging)
13:14:00,313 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221006: Waiting to obtain live lock
[...]
13:14:00,614 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221035: Live Server Obtained live lock
[...]
13:14:01,800 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221007: Server is now live
13:14:01,801 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221001: HornetQ Server version 2.3.25.Final (2.3.x, 123) [db446058-de41-11e5-aea0-174ba3e38330] 

Stand-by server starting:

13:18:19,380 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: backup server is starting with configuration HornetQ Configuration (clustered=false,backup=true,sharedStore=true,journalDirectory=/nfs/hornetq-shared-storage/journal,bindingsDirectory=/nfs/hornetq-shared-storage/bindings,largeMessagesDirectory=/nfs/hornetq-shared-storage/large-messages,pagingDirectory=/nfs/hornetq-shared-storage/paging)
13:18:19,402 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221032: Waiting to become backup node
13:18:19,449 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221033: ** got backup lock
[...]
13:18:19,680 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221109: HornetQ Backup Server version 2.3.25.Final (2.3.x, 123) [db446058-de41-11e5-aea0-174ba3e38330] started, waiting live to fail before it gets active

Failover to stand-by server:

[...]
13:20:21,911 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221010: Backup Server is now live