HornetQ Persistence Concepts: Difference between revisions

From NovaOrdis Knowledge Base

Latest revision as of 05:42, 6 May 2016

Overview

This article provides a high-level overview of HornetQ persistence concepts. It describes what kind of data is persisted, as well as where and when. It also discusses paging, a protection mechanism against running out of memory; persistence is relevant in this context because messages that do not fit in memory go to the filesystem, even if the messages themselves are marked as non-persistent. Large messages, which are fragmented and stored on disk, are also covered.

There is No Database

Unlike other messaging systems, which offer the option of storing message data in a relational database, HornetQ does not. For the reasons that led to this decision, see https://developer.jboss.org/thread/153581. More details are available in "Messaging persistence in EAP 6.x": https://access.redhat.com/solutions/226743.

What Does HornetQ Persist?

Naturally, HornetQ persists persistent messages, as required by the JMS specification. It also persists some topology information (bindings and JMS information). HornetQ allows sending large messages (a message can be larger than the total amount of memory available to a broker) by fragmenting the messages and storing the fragments on the filesystem, so large message storage is another type of persistence managed by HornetQ. Finally, HornetQ can store any message, including non-persistent messages, on the filesystem when the memory available to the broker is not sufficient to handle all messages for a specific address in memory. This mechanism is known as paging and it is described in the HornetQ Paging article.

Persistent Message Journal

All persistent messages must be stored on persistent storage, as mandated by the JMS specification. This is necessary to protect against messaging system failure: after a failure, a persistent message can presumably be recovered from persistent storage and re-sent.

HornetQ stores messages in an append-only file system journal, optimized for message-specific use cases. The journal consists of a set of fixed-size files. Initially, the files are filled with padding, which is progressively replaced with message data, delete records and transactional information. Duplicate ID caches are also stored here. When a journal file is full, the next one is used, and so on. A garbage collection algorithm determines whether a specific file is still needed or can be reused. HornetQ can also compact the space in the journal files.

$JBOSS_HOME/standalone/data/messagingjournal
-rw-r--r--. 1 root root 10485760 Mar  7 03:11 hornetq-data-33.hq
-rw-r--r--. 1 root root 10485760 Mar  7 03:25 hornetq-data-34.hq
-rw-r--r--. 1 root root 10485760 Mar  7 03:09 hornetq-data-35.hq
-rw-r--r--. 1 root root 10485760 Mar 11 16:43 hornetq-data-36.hq
-rw-r--r--. 1 root root       19 Mar 11 16:43 server.lock
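
The journal mechanics described above (fixed-size files pre-filled with padding, records appended in place of the padding, rollover to the next file when the current one is full) can be illustrated with a minimal sketch. This is not HornetQ's implementation or record format: the class name `JournalSketch`, the `sketch-data-N.hq` file names, the zero-byte padding and the 4-byte length prefix are all assumptions made for illustration.

```python
import os
import tempfile

FILE_SIZE = 1024    # hypothetical; the listing above shows 10485760-byte files
PADDING = b"\x00"   # hypothetical fill byte, not HornetQ's actual padding


class JournalSketch:
    """Toy append-only journal made of fixed-size, pre-padded files."""

    def __init__(self, directory, file_size=FILE_SIZE):
        self.directory = directory
        self.file_size = file_size
        self.file_index = 0
        self.offset = 0
        self._create_file()

    def path(self, index):
        return os.path.join(self.directory, f"sketch-data-{index}.hq")

    def _create_file(self):
        # Pre-fill the whole file with padding so its size never changes.
        with open(self.path(self.file_index), "wb") as f:
            f.write(PADDING * self.file_size)
        self.offset = 0

    def append(self, record: bytes):
        """Overwrite padding with a length-prefixed record; roll over when full."""
        framed = len(record).to_bytes(4, "big") + record
        if len(framed) > self.file_size:
            raise ValueError("record larger than a journal file")
        if self.offset + len(framed) > self.file_size:
            # Current file is full: move on to the next fixed-size file.
            self.file_index += 1
            self._create_file()
        with open(self.path(self.file_index), "r+b") as f:
            f.seek(self.offset)
            f.write(framed)
            f.flush()
            os.fsync(f.fileno())  # explicit sync to disk after each write
        position = (self.file_index, self.offset)
        self.offset += len(framed)
        return position


if __name__ == "__main__":
    journal = JournalSketch(tempfile.mkdtemp(), file_size=64)
    for body in (b"a" * 20, b"b" * 20, b"c" * 20):
        print(journal.append(body))
```

The sketch leaves out garbage collection and compaction; it only shows why every file in the directory keeps the same size regardless of how much data it holds.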

Disk Write Cache

If disk write cache is enabled and the disk's cache memory is volatile (this depends on the disk type), data can be lost on power failure even if HornetQ correctly synced the data to disk. If data integrity is important, disable the disk write cache for disks with volatile cache.

More: hdparm, sdparm.

Spinning Disks

A great deal of effort has been spent on optimizing the journal for spinning hard drives. However, these are increasingly being replaced with solid state drives, for which head mechanics are irrelevant.

Journal Implementations

HornetQ ships with two journal implementations: a pure Java, NIO-based implementation and a native Linux Asynchronous IO (AIO) implementation, available with Linux kernel 2.6 and higher. With AIO, HornetQ is called back when the data has been written to disk, which allows it to avoid explicit syncs.

More details are available here:

http://docs.jboss.org/hornetq/2.4.0.Final/docs/user-manual/html/persistence.html#installing-aio
http://docs.jboss.org/hornetq/2.4.0.Final/docs/user-manual/html/libaio.html

Node ID

When a node is started for the first time it persists a unique identifier into its journal directory. This ID is needed for proper formation of clusters.

Message Journal Configuration

Message Journal Configuration

Bindings Journal

This is a relatively low-throughput journal (compared to the message journal) used to store core queue data and ID sequence counters:

$JBOSS_HOME/standalone/data/messagingbindings:
-rw-r--r--. 1 root root 1048576 Mar 11 16:43 hornetq-bindings-1.bindings
-rw-r--r--. 1 root root 1048576 Mar 11 16:43 hornetq-bindings-2.bindings

The implementation is always NIO.

Bindings Journal Configuration

Bindings Journal Configuration

JMS Journal

This is also a low-throughput journal used to store JMS-related data: JMS queues and topics, JMS ConnectionFactories, JNDI bindings:

$JBOSS_HOME/standalone/data/messagingbindings:
-rw-r--r--. 1 root root 1048576 Mar 11 16:43 hornetq-jms-1.jms
-rw-r--r--. 1 root root 1048576 Mar 11 16:43 hornetq-jms-2.jms

Note that only the JMS resources created via the management interface will be persisted in this journal. The resources specified in configuration will not be persisted here.

JMS Journal Configuration

JMS Journal Configuration

Large Messages

With HornetQ it is possible to send and receive messages larger than the total amount of memory available to the broker and the client. The size of the message is limited only by the amount of disk space available. This is implemented by breaking the message into smaller fragments, which are persisted on disk by the broker. At no time is the entire message stored in memory, either on the client or on the server.
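
The fragmentation idea can be sketched as follows: the message body is read from a stream one fragment at a time, and each fragment is written to disk before the next one is read, so memory use is bounded by the fragment size. The function names, the `.frag` file naming and the one-file-per-fragment layout are assumptions of this sketch and do not reflect how HornetQ actually lays out large messages on disk.

```python
import io
import os
import tempfile

FRAGMENT_SIZE = 4096  # hypothetical fragment size


def store_fragments(stream, directory, message_id, fragment_size=FRAGMENT_SIZE):
    """Read a stream piece by piece and persist each piece as its own file."""
    paths = []
    while True:
        chunk = stream.read(fragment_size)  # at most one fragment held in memory
        if not chunk:
            break
        path = os.path.join(directory, f"{message_id}-{len(paths)}.frag")
        with open(path, "wb") as f:
            f.write(chunk)
        paths.append(path)
    return paths


def read_fragments(paths):
    """Yield the fragments back in order, again one at a time."""
    for path in paths:
        with open(path, "rb") as f:
            yield f.read()


if __name__ == "__main__":
    body = io.BytesIO(b"x" * 10000)
    stored = store_fragments(body, tempfile.mkdtemp(), "msg1")
    print(len(stored), "fragments written")
```

A consumer would iterate over `read_fragments` and forward each piece as it arrives, rather than joining the pieces into a single in-memory buffer.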

Large Message Configuration

Large Message Configuration

Non Persistent Messages and Paging

Even non-persistent messages will be written to storage when a specific address is configured for paging. For more details see:

HornetQ Paging
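
As a rough illustration of why non-persistent messages can still end up on disk, the sketch below keeps messages for one address in memory up to a byte limit and writes everything beyond that limit to a page file, without consulting the message's durability flag. The class `PagingAddressSketch`, the `max_size_bytes` parameter and the page file format are hypothetical and are not HornetQ's paging implementation or configuration names.

```python
import os
import tempfile
from collections import deque


class PagingAddressSketch:
    """Toy model of one address that pages to disk once memory is exhausted."""

    def __init__(self, page_file, max_size_bytes):
        self.page_file = page_file
        self.max_size_bytes = max_size_bytes  # hypothetical per-address limit
        self.in_memory = deque()
        self.memory_bytes = 0
        self.paged_count = 0

    def route(self, body: bytes, durable: bool = False):
        """Return 'memory' or 'paged' depending on where the message went."""
        fits = self.memory_bytes + len(body) <= self.max_size_bytes
        # Once paging has started, keep paging so that message order is kept.
        if fits and self.paged_count == 0:
            self.in_memory.append((durable, body))
            self.memory_bytes += len(body)
            return "memory"
        # The durable flag is stored but does not influence the decision:
        # a non-persistent message is written to the page file just the same.
        with open(self.page_file, "ab") as f:
            f.write(bytes([durable]) + len(body).to_bytes(4, "big") + body)
        self.paged_count += 1
        return "paged"


if __name__ == "__main__":
    page_path = os.path.join(tempfile.mkdtemp(), "sketch.page")
    address = PagingAddressSketch(page_path, max_size_bytes=10)
    for payload in (b"12345", b"12345", b"1"):
        print(address.route(payload, durable=False))
```

The sketch omits depaging (reading messages back into memory as consumers catch up), which is covered in the paging article.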

Zero Persistence

No message data, bindings, etc. will be persisted.

  ...
  <persistence-enabled>false</persistence-enabled>
  ...

Sending Messages and the Journal

Writing to Journal