Mod cluster Concepts

Latest revision as of 21:07, 8 February 2017

Internal

  • mod_proxy Concepts

Architecture

The general idea behind mod_cluster is that the httpd process instantiates balancers, which are mechanisms capable of sending requests into a set of back end nodes. Each balancer has a set of worker nodes, each worker node corresponding to a back end node. The node population associated with a balancer is dynamic (TODO: explain that). Virtual hosts work with balancers: a virtual host can either send requests into one or more balancers, which in turn send the requests into their node population, or serve the request locally from its DocumentRoot. The balancers are dynamically updated with contexts over the network.

Questions:

  • Understand the relationship between manager and a virtual host - do we need a virtual host to wrap around a manager?
  • Is the top level balancer common to all virtual hosts while one declared inside a proxy is private to that proxy? Try to use a balancer declared in VH1 from VH2.
  • Clarify the relationship between balancer and virtual host.
  • Understand how a back end node dynamically updates its own context on the front-end. This makes ProxyPass superfluous.

httpd advertises itself over multicast as soon as it starts. The JBoss mod_cluster service listens for proxy advertisements:

21:08:31,156 INFO  [org.jboss.modcluster] (ServerService Thread Pool -- 62) MODCLUSTER000032: Listening to proxy advertisements on /224.0.1.105:23364

Requests flow from httpd to AS nodes, and the AS nodes open feedback channels back to httpd.
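
The listening side of the advertisement exchange can be sketched in a few lines of Python. This is only an illustration under assumptions: the group address and port are taken from the log line above, the function names are hypothetical, and the format of the advertisement payload is not described on this page, so the sketch prints the datagram as-is instead of parsing it.

```python
import socket

# Multicast group and port copied from the log line above.
ADVERTISE_GROUP = "224.0.1.105"
ADVERTISE_PORT = 23364


def membership_request(group: str) -> bytes:
    """Build the 8-byte ip_mreq value: group address + any local interface."""
    return socket.inet_aton(group) + socket.inet_aton("0.0.0.0")


def open_advertisement_listener(group: str = ADVERTISE_GROUP,
                                port: int = ADVERTISE_PORT) -> socket.socket:
    """Open a UDP socket that has joined the advertisement multicast group."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("", port))
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                    membership_request(group))
    return sock


def print_one_advertisement() -> None:
    """Block until one datagram arrives, then print the sender and raw text."""
    with open_advertisement_listener() as listener:
        payload, sender = listener.recvfrom(4096)
        print(sender, payload.decode("ascii", errors="replace"))
```

Calling print_one_advertisement() on a host in the same multicast domain as a running httpd should show what the proxy advertises; it blocks until a datagram arrives.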


Manager

The mod_cluster manager runs inside a virtual host. The manager is available over HTTP at http://<virtual-host-ip-address>:<virtual-host-port>/mod_cluster-manager

Balancer

TODO: what is a "balancer"? There is a default one, whose default name is "mycluster". The name of the balancer can be specified by the back-end nodes. What is it? What is its function? Represent it on the diagram.


balancer://<balancer-name> is used in ProxyPass directives.


Places in which a balancer can be configured:

  • mod_proxy_cluster CreateBalancer

Load Balancing and Node Busyness

mod_cluster can load balance requests to back end nodes based on various load balancing algorithms, one of those being "node busyness". The mod_cluster httpd plugin constantly receives "busyness" data from the back end nodes and uses this information to send requests to the nodes perceived to be under the lowest load.


"busyness" is used to decide which node to send the request to only for new requests, or when the node that owns the session the request is associated with is not available. If the request is associated with a session that is already "owned" by a back end node, and that node appears "live" to the load balancer, that node will always be used as the target, regardless of how "busy" it is.

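The rule in the note above can be sketched as a small selection function. This is a simplified model, not mod_cluster's actual implementation: the Node class, the choose_node function and the busyness field (higher means busier here) are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Node:
    route: str         # identity of the back end node
    busyness: int      # hypothetical score: higher means busier
    live: bool = True  # whether the node appears "live" to the balancer


def choose_node(nodes: Sequence[Node],
                session_owner: Optional[str] = None) -> Optional[Node]:
    """Pick a target node: session owner first, least busy node otherwise."""
    # A live session owner always wins, regardless of how busy it is.
    if session_owner is not None:
        for node in nodes:
            if node.route == session_owner and node.live:
                return node
    # New request, or the owner is unavailable: fall back to busyness.
    live_nodes = [node for node in nodes if node.live]
    if not live_nodes:
        return None
    return min(live_nodes, key=lambda node: node.busyness)
```

With nodes a (busyness 90), b (busyness 10) and a dead node c, a new request goes to b, a request whose session is owned by a still goes to a, and a request whose session was owned by c falls back to b.
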
Worker Node

Worker nodes can register with a list of httpd proxy nodes at startup. If the socket connections between the EAP nodes and the httpd proxy are severed later, the EAP nodes should automatically attempt to re-register with the proxy.

Alias

Context

Domain

What is a domain?

Nodes might be partitioned into mod-cluster 'domains'. Each domain can run on different versions of EAP / JGroups. Sticky sessions per mod-cluster domain are done by appending the node and domain identity to the jsessionid. Sessions are sticky per domain: so if a node crashes, a node from the same domain is picked. If none is available, then any domain is picked. Via the mod-cluster-manager app, you could de-activate entire domains, which means that new sessions were not created in that domain, but existing sessions would continue to be served. When all sessions of a given domain had timed out, the domain could be shut down (and updated, for example).

  • It looks as simple as adding a "domain" attribute.
  • More details: https://developer.jboss.org/wiki/ModClusterDesign?_sscc=t
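
The per-domain failover rule quoted above (prefer a node from the same domain, otherwise take any node) can be sketched as follows. DomainNode and pick_failover are hypothetical names, and the sketch assumes the first live candidate is acceptable; it says nothing about how the real balancer orders candidates.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class DomainNode:
    route: str
    domain: str
    live: bool = True


def pick_failover(nodes: Sequence[DomainNode],
                  failed_domain: str) -> Optional[DomainNode]:
    """Choose a replacement for a crashed node that belonged to failed_domain."""
    live_nodes = [node for node in nodes if node.live]
    same_domain = [node for node in live_nodes if node.domain == failed_domain]
    # Stay inside the domain when possible, otherwise accept any live node.
    candidates = same_domain or live_nodes
    return candidates[0] if candidates else None
```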

Worker Node - Proxy Communication

The STATUS command is used to report the current load of the servlet container to the proxy. The load is reported as a byte value between 0 and 100, where 0 represents maximum load and 100 represents no load. The STATUS command is sent from the EAP node to the httpd proxy every 10 seconds (default). If the connection is broken, the EAP node will reset its configuration with the proxy.
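
Because the scale is inverted (100 means idle), it is easy to misread. A minimal sketch of the mapping, assuming the container load is available as a utilization fraction between 0.0 and 1.0; status_load is a hypothetical helper, not part of mod_cluster, and the real load metrics are computed differently.

```python
STATUS_INTERVAL_SECONDS = 10  # default reporting period mentioned above


def status_load(utilization: float) -> int:
    """Map a utilization fraction (0.0 idle, 1.0 saturated) to the 0..100
    value described above, where 0 is maximum load and 100 is no load."""
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return round(100 * (1.0 - utilization))
```

For example, a node running at a quarter of its capacity would report 75 under this model, while a saturated node would report 0.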

Modules

mod_manager

mod_manager, also known as the cluster manager module, manages the communication with the worker nodes. It receives and acknowledges node messages concerning registration, node load data and life cycle events of the applications deployed on worker nodes.

mod_manager configuration options are described here:

mod_manager Configuration

PersistSlots

The metadata associated with nodes, aliases and contexts is sent by the worker nodes during the registration process, and then subsequently updated via messages, but it is not persisted by default. Thus, if an httpd node is stopped and then restarted, it loses the metadata, so it "forgets" about the backend nodes until the worker nodes explicitly re-register.

During that time, httpd does not have complete knowledge about each EAP node in the backend cluster, so it is not able to load-balance correctly when it receives valid (but unknown to it) JVM Routes.

This behavior can be avoided by setting PersistSlots to "on".

There are no damaging side-effects to setting "PersistSlots" to "on".

For more details on configuration, see:

PersistSlots Configuration

mod_proxy_cluster

This is known as the proxy balancer module, and it is a replacement for the standard mod_proxy_balancer that comes with mod_proxy. Note that mod_proxy_cluster cannot work correctly if mod_proxy's mod_proxy_balancer is loaded, so mod_proxy_balancer must be removed from the httpd configuration.

mod_proxy_cluster handles the routing of requests to cluster nodes. The proxy balancer selects the appropriate node to forward the request to, based on application location in the cluster, current state of each of the cluster nodes, and the Session ID (if a request is part of an established session).

For more details on configuration, see:

mod_proxy_cluster Configuration
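
The first two routing criteria (application location and node state) amount to filtering the node population before a target is chosen. The sketch below is a simplified model with hypothetical names (ClusterNode, serves, eligible_nodes), and it assumes contexts are matched by path prefix.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class ClusterNode:
    route: str
    contexts: Tuple[str, ...]  # context paths deployed on this node
    live: bool = True


def serves(context: str, path: str) -> bool:
    """True if the request path falls under the given context path."""
    context = context.rstrip("/")
    return context == "" or path == context or path.startswith(context + "/")


def eligible_nodes(nodes: Sequence[ClusterNode], path: str) -> List[ClusterNode]:
    """Keep only live nodes that have an application deployed for the path."""
    return [node for node in nodes
            if node.live and any(serves(c, path) for c in node.contexts)]
```

Session affinity and load would then be applied to the nodes returned by eligible_nodes.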

mod_advertise

mod_advertise is also known as the proxy advertisement module; it advertises the existence of the proxy server via UDP multicast messages. The server advertisement messages contain the IP address and port number where the proxy is listening for responses from nodes that wish to join the load-balancing cluster. This module must be defined alongside mod_manager in the VirtualHost element of the httpd configuration. Note that mod_advertise advertises the VirtualHost in which it is configured, so it must be the same VirtualHost where mod_manager is defined.

For more details on configuration, see:

mod_advertise Configuration

mod_cluster_slotmem

Available in mod_slotmem.so.

mod_proxy

Note that this is not a mod_cluster module, but a standard httpd server module, and it is required by mod_cluster, as mod_cluster delegates to it. For more details see:

mod_proxy

mod_proxy directives such as ProxyIOBufferSize are used to configure mod_cluster.

mod_proxy_ajp

Note that this is not a mod_cluster module, but a standard httpd server module, and it is required by mod_cluster, as mod_cluster delegates to it. For more details see:

mod_proxy_ajp

jboss.node.name

JBoss mod_cluster service uses the "jboss.node.name" system property as identity information when registering with httpd.

Sticky Session

Sticky session configuration is described here: mod_cluster Sticky Session Configuration.

Organizatorium

Initialization

Upon the initialization of the mod_cluster service on the JBoss node, it sends the following sequence of requests:

  • INFO /
  • CONFIG /JVMRoute=webr01&Port=8009&Host=192.168.0.147&Type=ajp&StickySessionForce=No&Maxattempts=1
  • At this moment the balancer is created and the corresponding worker is initialized in each existing httpd child process.
  • STATUS /JVMRoute=webr01&Load=-1 - periodic status updates + periodic ajp_cping_cpong initiated by the web server. Not sure at this point who initiates what and who answers whom - they seem to be related.
  • proxy_cluster_watchdog_func (?)
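
The parameter strings in the messages above are plain key=value pairs joined by "&", so they can be split with the standard library. The helper below parses lines in the shorthand used in this list (command, a space, "/" followed by the parameters); parse_note_line is a hypothetical name, and the shorthand is a note-taking convention rather than the exact wire format.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qsl


def parse_note_line(line: str) -> Tuple[str, Dict[str, str]]:
    """Split 'COMMAND /k1=v1&k2=v2' into the command and a parameter dict."""
    command, _, rest = line.strip().partition(" ")
    parameters = dict(parse_qsl(rest.lstrip("/"), keep_blank_values=True))
    return command, parameters


command, config = parse_note_line(
    "CONFIG /JVMRoute=webr01&Port=8009&Host=192.168.0.147"
    "&Type=ajp&StickySessionForce=No&Maxattempts=1"
)
print(command, config["JVMRoute"], config["Host"], config["Port"])
```

All values are kept as strings; converting Port or Load to integers is left to the caller.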

Upon the deployment of a new application, the node sends the following sequence: