JGroups Protocol MERGE2

From NovaOrdis Knowledge Base
Latest revision as of 04:20, 5 March 2016

External

Internal

Relevance

  • JGroups 3.4.3 (in this version MERGE2 is broken: it does not handle "re-incarnation" well)

Overview

MERGE2 merges split groups.

Merging Process

If a group splits, MERGE2 detects the split and merges the subgroups back into one group.

Split situations:

  • A member becomes inactive (for example, during a long GC) and is excluded from the group. When it comes back to life, it is still part of an "older" view: the new view multicast by the coordinator does not include it.
  • A real network partition that leads to two subgroups with two coordinators.

The merge protocol is run by the coordinator.

On the coordinator, the FindSubgroupsTask of the MERGE2 protocol periodically sends a FIND_ALL_VIEWS event down the stack, where it is handled by the implementation of the Discovery protocol (PING, MPING, JDBC_PING, etc.). Upon receiving a FIND_ALL_VIEWS event, the discovery protocol sends a GET_MBRS_REQ message.

For more details on how the Discovery protocols handle GET_MBRS_REQ requests, see:

GET_MBRS_REQ requests handling by the Discovery protocols

The FIND_ALL_VIEWS events (and consequently GET_MBRS_REQ messages) are sent at random intervals between 'min_interval' and 'max_interval' milliseconds.

The result of GET_MBRS_REQ is a list of PingData.
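
The way discovery results can reveal a split may be sketched as follows. This is a simplified illustration, not the JGroups source: DiscoveryResponse is a hypothetical stand-in for PingData, view ids are plain strings, and the rule "more than one distinct view means a merge is needed" is an assumption about how the comparison works.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch only, not the JGroups source. */
final class ViewComparisonSketch {

    /** Hypothetical, simplified stand-in for PingData: who answered and which view it reported. */
    static final class DiscoveryResponse {
        final String member;
        final String viewId;

        DiscoveryResponse(String member, String viewId) {
            this.member = member;
            this.viewId = viewId;
        }
    }

    /** Collects the distinct view ids reported by the responders, in arrival order. */
    static Set<String> distinctViews(List<DiscoveryResponse> responses) {
        Set<String> views = new LinkedHashSet<>();
        for (DiscoveryResponse response : responses) {
            // Assumption: responses that carry no view id are ignored.
            if (response.viewId != null) {
                views.add(response.viewId);
            }
        }
        return views;
    }

    /** Assumption: more than one distinct view indicates a split that needs merging. */
    static boolean mergeNeeded(List<DiscoveryResponse> responses) {
        return distinctViews(responses).size() > 1;
    }

    public static void main(String[] args) {
        List<DiscoveryResponse> responses = Arrays.asList(
                new DiscoveryResponse("host01", "host01|3"),
                new DiscoveryResponse("host02", "host01|2"));
        System.out.println("views: " + distinctViews(responses));
        System.out.println("merge needed: " + mergeNeeded(responses));
    }
}
```

In this sketch, two members that report the same coordinator but different view versions still count as a split, because the comparison is made on the view id and not on the coordinator.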

TODO more here.

The merge process does not merge application state. The application has to handle the merge callback and merge the state itself.

Configuration

JGroups

  <MERGE2 max_interval="100000" min_interval="20000"/>

WildFly

<subsystem xmlns="urn:jboss:domain:jgroups:2.0" default-stack="tcp">
   <stack name="tcp">
      ...
     <protocol type="MERGE2"/>
     ...
   </stack>
</subsystem>

Parameters

min_interval

max_interval

If 'max_interval' is less than or equal to 'min_interval', we get a configuration error:

21:14:16,436 ERROR [MERGE2] @main max_interval has to be greater than min_interval

and the stack won't start.
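
How the two parameters interact can be illustrated with a minimal sketch. It is not the JGroups implementation: the class and method names are hypothetical, and treating the upper bound as exclusive is an assumption. It only shows a random delay being drawn between 'min_interval' and 'max_interval', and the constraint that rejects an invalid pair.

```java
import java.util.concurrent.ThreadLocalRandom;

/** Illustrative sketch only, not the JGroups source. */
final class MergeIntervalSketch {

    /**
     * Returns a random delay in milliseconds in the range [minInterval, maxInterval).
     * Rejects the same invalid configuration that MERGE2 reports at startup.
     */
    static long nextDelayMillis(long minInterval, long maxInterval) {
        if (maxInterval <= minInterval) {
            throw new IllegalArgumentException("max_interval has to be greater than min_interval");
        }
        // nextLong(origin, bound) returns a value >= origin and < bound.
        return ThreadLocalRandom.current().nextLong(minInterval, maxInterval);
    }

    public static void main(String[] args) {
        // Values taken from the JGroups configuration example above.
        for (int i = 0; i < 3; i++) {
            System.out.println("next discovery round in " + nextDelayMillis(20_000L, 100_000L) + " ms");
        }
    }
}
```

With the values from the configuration example (20000 and 100000), each discovery round in this sketch would be scheduled between 20 and 100 seconds after the previous one.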

Also See

MERGE3