WildFly HornetQ Shared Filesystem-Based Dedicated HA Configuration

From NovaOrdis Knowledge Base
Latest revision as of 21:21, 5 September 2017

External

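Internal

  • WildFly HornetQ HA Configuration
  • HornetQ-Based Messaging Configuration
  • Concepts: Shared Filesystem-based Replication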

Overview

For high availability, the live server and the backup server must be installed on two separate physical (or virtual) hosts, provisioned so as to minimize the probability of both hosts failing at the same time. Highly available HornetQ requires access to reliable shared file system storage, so shared storage such as a GFS2 file system on a SAN must be made available to both hosts. Among other things, the HornetQ instances store their bindings and journal files in the shared directory. Appropriately configured NFS v4 is also an option.

WildFly Clustering and HornetQ High Availability

This document contains instructions for setting up a configuration where HornetQ HA is configured independently of WildFly clustering.

For more details see:

Concepts: WildFly Clustering and HornetQ High Availability

Procedure

Common Configuration

A common configuration that uses system properties to externalize per-server differences makes sense for WildFly instances running in domain mode, because it permits the same configuration to be used for both the live server and the backup server. The differences in behavior are specified via system properties. In standalone mode, the sequences below can be copied and pasted into the respective standalone*.xml files.

Shared-Storage Based High Availability

Use the following "messaging" subsystem configuration on both live and stand-by servers. This is convenient because the servers can be made part of the same server group.

...
<subsystem xmlns="urn:jboss:domain:messaging:1.4"> 
   <hornetq-server> 

      <persistence-enabled>true</persistence-enabled>
      ...
      <shared-store>true</shared-store>
      <backup>${jboss.messaging.hornetq.backup:false}</backup>
      <create-bindings-dir>true</create-bindings-dir>
      <create-journal-dir>true</create-journal-dir>
      <failover-on-shutdown>true</failover-on-shutdown>

      <paging-directory path="paging" relative-to="hornetq.shared.dir"/>
      <bindings-directory path="bindings" relative-to="hornetq.shared.dir"/> 
      <journal-directory path="journal" relative-to="hornetq.shared.dir"/>
      <large-messages-directory path="large-messages" relative-to="hornetq.shared.dir"/>
      
      ...

      <jms-connection-factories>
         ...
         <connection-factory name="RemoteConnectionFactory">
            <ha>true</ha>
            <retry-interval>1000</retry-interval>
            <retry-interval-multiplier>1.0</retry-interval-multiplier>
            <reconnect-attempts>-1</reconnect-attempts> 
            <connectors> 
               <connector-ref connector-name="netty"/>
            </connectors> 
            <entries> 
               <entry name="java:jboss/exported/jms/RemoteConnectionFactory"/> 
            </entries> 
         </connection-factory>
         ...
      </jms-connection-factories>
   </hornetq-server>
</subsystem>
...

HornetQ JMS ConnectionFactory Configuration

For details on the connection factory settings, see: HornetQ JMS ConnectionFactory Configuration

Shared Path Declaration

The shared path is usually common to the entire domain, so it can be declared in the top-level section of the domain configuration.

   ...
   <paths>
      <path name="hornetq.shared.dir" path="/nfs/hornetq-shared-storage"/>
   </paths>
   ...
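
If the shared storage is not mounted at the same location on every host, the declaration above can probably be split: the domain configuration declares only the path name, and each host's host.xml supplies the actual location. The following is a minimal sketch under that assumption; the mount point /mnt/hornetq-shared-storage is a hypothetical value, not one taken from the tested setup.

```xml
<!-- domain.xml: declare the logical path name only (assumption: each host defines the location) -->
<paths>
   <path name="hornetq.shared.dir"/>
</paths>

<!-- host.xml on a host with a different mount point (hypothetical location) -->
<host ...>
   ...
   <paths>
      <path name="hornetq.shared.dir" path="/mnt/hornetq-shared-storage"/>
   </paths>
   ...
</host>
```

Both the live and the stand-by server must still resolve hornetq.shared.dir to the same underlying shared directory, otherwise the backup cannot see the live server's journal and lock files.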

Live Server Configuration

jboss.messaging.hornetq.backup defaults to false (the default comes from the ${jboss.messaging.hornetq.backup:false} expression in the common configuration), but it is a good idea to make the configuration explicit. Add the following to the active node's host.xml:

<host ...>
   <system-properties>
      <property name="jboss.messaging.hornetq.backup" value="false"/>
   </system-properties>
   ...
</host>

Stand-By Server Configuration

jboss.messaging.hornetq.backup should be set to "true" in the stand-by node's host.xml:

<host ...>
   <system-properties>
      <property name="jboss.messaging.hornetq.backup" value="true"/>
   </system-properties>
   ...
</host>
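
The two host.xml fragments above assume one host controller per node. For experimentation, both servers can also be managed by a single host controller, in which case the property can be set per server instead of per host. This is a sketch only: the server names hornetq-live and hornetq-backup and the group name messaging-group are hypothetical, and running both servers on one machine defeats the purpose of HA.

```xml
<host ...>
   ...
   <servers>
      <!-- hypothetical server and group names, for illustration only -->
      <server name="hornetq-live" group="messaging-group">
         <system-properties>
            <!-- resolves <backup>${jboss.messaging.hornetq.backup:false}</backup> to false -->
            <property name="jboss.messaging.hornetq.backup" value="false"/>
         </system-properties>
      </server>
      <server name="hornetq-backup" group="messaging-group">
         <system-properties>
            <!-- same server group, same profile, but this instance starts as backup -->
            <property name="jboss.messaging.hornetq.backup" value="true"/>
         </system-properties>
      </server>
   </servers>
</host>
```

Two servers on the same machine would also need distinct ports (for example via a port offset), which is omitted here.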

JMS Connection Factories

A backup HornetQ instance does not need the <jms-connection-factories> and <jms-destinations> sections as any JMS components are created from the shared journal when the backup server becomes live.
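
In standalone mode, where the configuration is not shared between servers, the stand-by server's standalone*.xml can therefore carry a reduced "messaging" subsystem. The following is a sketch under that assumption: it sets backup literally instead of through the system property, and it assumes that hornetq.shared.dir is also declared as a path in the same standalone*.xml. The allow-failback element is optional and is shown only as an illustration.

```xml
<subsystem xmlns="urn:jboss:domain:messaging:1.4">
   <hornetq-server>
      <persistence-enabled>true</persistence-enabled>
      ...
      <!-- this instance is always the backup, so no property expression is needed -->
      <backup>true</backup>
      <shared-store>true</shared-store>
      <!-- optional: hand control back to the original live server when it returns -->
      <allow-failback>true</allow-failback>

      <paging-directory path="paging" relative-to="hornetq.shared.dir"/>
      <bindings-directory path="bindings" relative-to="hornetq.shared.dir"/>
      <journal-directory path="journal" relative-to="hornetq.shared.dir"/>
      <large-messages-directory path="large-messages" relative-to="hornetq.shared.dir"/>
      ...
      <!-- no <jms-connection-factories> or <jms-destinations>: they are created from the shared journal on activation -->
   </hornetq-server>
</subsystem>
```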

Log Output

Active server starting:

13:14:00,312 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: live server is starting with configuration HornetQ Configuration (clustered=false,backup=false,sharedStore=true,journalDirectory=/nfs/hornetq-shared-storage/journal,bindingsDirectory=/nfs/hornetq-shared-storage/bindings,largeMessagesDirectory=/nfs/hornetq-shared-storage/large-messages,pagingDirectory=/nfs/hornetq-shared-storage/paging)
13:14:00,313 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221006: Waiting to obtain live lock
[...]
13:14:00,614 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221035: Live Server Obtained live lock
[...]
13:14:01,800 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221007: Server is now live
13:14:01,801 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221001: HornetQ Server version 2.3.25.Final (2.3.x, 123) [db446058-de41-11e5-aea0-174ba3e38330] 

Stand-by server starting:

13:18:19,380 INFO  [org.hornetq.core.server] (ServerService Thread Pool -- 60) HQ221000: backup server is starting with configuration HornetQ Configuration (clustered=false,backup=true,sharedStore=true,journalDirectory=/nfs/hornetq-shared-storage/journal,bindingsDirectory=/nfs/hornetq-shared-storage/bindings,largeMessagesDirectory=/nfs/hornetq-shared-storage/large-messages,pagingDirectory=/nfs/hornetq-shared-storage/paging)
13:18:19,402 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221032: Waiting to become backup node
13:18:19,449 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221033: ** got backup lock
[...]
13:18:19,680 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221109: HornetQ Backup Server version 2.3.25.Final (2.3.x, 123) [db446058-de41-11e5-aea0-174ba3e38330] started, waiting live to fail before it gets active

Failover to stand-by server:

[...]
13:20:21,911 INFO  [org.hornetq.core.server] (HQ119000: Activation for server HornetQServerImpl::serverUUID=db446058-de41-11e5-aea0-174ba3e38330) HQ221010: Backup Server is now live

Other Examples

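  • An isolated domain that starts an active and a standby HornetQ node with shared filesystem-based HA: domain.xml (https://github.com/NovaOrdis/playground/blob/master/jboss/hornetq/configuration-examples/domain-shared-filesystem-based-dedicated-ha/domain.xml), host.xml (https://github.com/NovaOrdis/playground/blob/master/jboss/hornetq/configuration-examples/domain-shared-filesystem-based-dedicated-ha/host.xml).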