Java Memory Concepts
Internal
Heap
The heap is the runtime data area in the JVM where all class instances and arrays are allocated. The heap is allocated when the JVM starts, and it may be of a fixed or variable size. The objects that are no longer in use are automatically reclaimed by a garbage collector. Metrics reflecting heap usage are exposed by the Memory MBean, which can be obtained from the platform MBean server.
The maximum amount of memory that can be allocated to the heap can be set on the command line with:
java ... -Xmx1g ...
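The heap metrics exposed by the Memory MBean can also be read from inside the JVM. The following is a minimal sketch that uses the standard java.lang.management API; the class name HeapUsageExample is made up for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapUsageExample {

    // Returns the current heap usage as reported by the Memory MBean.
    static MemoryUsage heapUsage() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        return memory.getHeapMemoryUsage();
    }

    public static void main(String[] args) {
        MemoryUsage heap = heapUsage();
        System.out.println("used      = " + heap.getUsed() + " bytes");
        System.out.println("committed = " + heap.getCommitted() + " bytes");
        // getMax() returns -1 when the maximum is undefined.
        System.out.println("max       = " + heap.getMax() + " bytes");
    }
}
```

When the JVM is started with -Xmx1g, the reported maximum should be close to 1 GB. The exact value can differ slightly, depending on the garbage collector in use.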
The heap usually consists of several memory pools, depending on what garbage collector is in use:
Eden
Eden statistics are exposed by the "java.lang:type=MemoryPool,name=PS Eden Space" MemoryPool MBean.
Survivor Space
Survivor Space statistics are exposed by the "java.lang:type=MemoryPool,name=PS Survivor Space" MemoryPool MBean.
Old Generation
The Old (or Tenured) Generation is the area of the heap where object instances that survive a certain number of new generation collections are promoted and stored. Once promoted to the Old Generation, object instances are checked for reachability, and discarded if unreachable, less often than in the new generation; how often depends on the specific garbage collection algorithm. Aside from algorithm-specific operations, all known garbage collection algorithms can initiate a full garbage collection, during which all the JVM threads are suspended, the entire heap, including the Old Generation, is inspected for reachability, and unreachable objects are discarded. Objects that survive a full garbage collection have a good chance of staying on the heap for the entire lifetime of the JVM. Memory leaks are caused by object instances of this kind.
The following graph shows memory accumulating on the Old Generation during a long running test. The graph represents the Old Generation occupancy and capacity for two clustered JVMs (a1 and a2):
The size of the Old Generation cannot be set directly, but it is the difference between the size of the heap (-Xms/-Xmx) and the size of the new generation (-XX:NewSize/-XX:MaxNewSize). The actual size of the Old Generation is reported in the GC log on Full GC, if -XX:+PrintGCDetails is used:
2017-08-18T18:00:02.439-0400: 1900.971: [Full GC (Metadata GC Threshold) ... [ParOldGen: 725452K->870337K(8192000K)] ...
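The ParOldGen entry in the log line above has the form "before->after(capacity)". As an illustration only, the sketch below extracts the three values with a regular expression. The class name OldGenLogParser is made up, and the pattern assumes the exact entry format shown above, which is specific to the parallel collector and to this logging style.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class OldGenLogParser {

    // Matches entries such as "[ParOldGen: 725452K->870337K(8192000K)]".
    private static final Pattern PAR_OLD_GEN =
            Pattern.compile("\\[ParOldGen: (\\d+)K->(\\d+)K\\((\\d+)K\\)\\]");

    // Returns {occupancy before GC, occupancy after GC, capacity} in KB,
    // or null if the line contains no ParOldGen entry.
    static long[] parse(String line) {
        Matcher m = PAR_OLD_GEN.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {
            Long.parseLong(m.group(1)),
            Long.parseLong(m.group(2)),
            Long.parseLong(m.group(3))
        };
    }

    public static void main(String[] args) {
        String line = "2017-08-18T18:00:02.439-0400: 1900.971: "
                + "[Full GC (Metadata GC Threshold) ... "
                + "[ParOldGen: 725452K->870337K(8192000K)] ...";
        long[] v = parse(line);
        System.out.println("before=" + v[0] + "K after=" + v[1]
                + "K capacity=" + v[2] + "K");
    }
}
```

For the sample line, the capacity value (8192000K) is the actual size of the Old Generation at the time of the Full GC.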
Old Generation statistics are exposed by the "java.lang:type=MemoryPool,name=PS Old Gen" MemoryPool MBean.
Non-Heap Memory
Anything else that is not a class instance or an array is allocated by the JVM in memory areas different from the heap. This includes internal JVM memory management, which is performed through system calls like malloc or mmap. This non-heap system memory is referred to as native memory or non-heap memory. The non-heap memory includes:
Metaspace
A method area that stores per-class metadata. This includes structures such as a runtime constant pool, per-class field and method definition data, and the code for methods and constructors. The method area is shared among all threads. The method area is created at JVM startup. The JVM manages the space used for the metadata. Space is requested from the O/S and then divided into chunks. A class loader allocates space for metadata from its chunks - a chunk is bound to a specific class loader. When classes are unloaded from a class loader, its chunks are recycled for reuse or returned to the O/S. Metadata uses mmap-ed space and not malloc-ed space.
The metaspace can be characterized by the following metrics, reported when verbose GC logging is enabled:
Metaspace used 2684K, capacity 4486K, committed 4864K, reserved 1056768K
class space used 291K, capacity 386K, committed 512K, reserved 1048576K
where:
- used - the amount of space used for loaded classes and other metadata.
- capacity - space available for metadata in currently allocated chunks.
- committed - the amount of space available for chunks.
- reserved - amount of space reserved, but not necessarily committed for metadata.
On the "class space" line, the reported values are the corresponding values for the class area in Metaspace when compressed class pointers are used.
Metaspace is dynamically managed by the JVM. Metadata is deallocated when the corresponding class loader is garbage collected. A high water mark is used for inducing a GC: when the committed memory of all metaspaces reaches this level, a GC is triggered. The level can be set with -XX:MetaspaceSize.
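Some of the Metaspace metrics are also available at runtime through the memory pool MBeans. The following sketch assumes a HotSpot JVM (Java 8 or later), where the pools are named "Metaspace" and "Compressed Class Space"; other JVMs may use different names or not expose these pools at all. The class name MetaspaceUsage is made up for illustration. Only used and committed values are printed, since the MBean does not expose the "capacity" and "reserved" values from the GC log.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MetaspaceUsage {

    // Looks up a memory pool by name; returns null if the JVM has no such pool.
    static MemoryPoolMXBean findPool(String name) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getName().equals(name)) {
                return pool;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Pool names are an assumption that holds for HotSpot.
        for (String name : new String[] {"Metaspace", "Compressed Class Space"}) {
            MemoryPoolMXBean pool = findPool(name);
            if (pool == null) {
                System.out.println(name + ": not present in this JVM");
                continue;
            }
            MemoryUsage u = pool.getUsage();
            System.out.println(name + ": used=" + u.getUsed() / 1024
                    + "K, committed=" + u.getCommitted() / 1024 + "K");
        }
    }
}
```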
Metaspace Tuning
-XX:MaxMetaspaceSize
-XX:MaxMetaspaceSize=256m
Sets the maximum amount of native memory that can be allocated for class metadata. By default, the size is not limited.
Note: there were cases when the max metaspace size was set with -XX:MaxMetaspaceSize but GC logging reported a larger reserved metaspace size.
-XX:MetaspaceSize
-XX:MetaspaceSize=256m
Sets the size of the allocated class metadata space that will trigger a garbage collection the first time it is exceeded. This threshold for a garbage collection is increased or decreased depending on the amount of metadata used. The default size depends on the platform.
Code Cache
An area where the JIT compiler stores native machine code translated from Java bytecode.
Compressed Class Space
Thread Stacks
A thread stack storage area.
Thread Stack Memory Management
The default thread stack size on 64-bit systems is 1024K. 64k is the least amount of stack space allowed per thread.
It can be modified with the -Xss option:
java ... -Xss<size> ...
where "<size>" represents the amount of memory and the unit of measure (for example, "2048k").
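Besides the JVM-wide -Xss option, a stack size can be requested for an individual thread through the four-argument Thread constructor. The sketch below measures how deep a recursion gets before the stack overflows, for two requested sizes. The class name StackDepthProbe is made up for illustration. According to the Thread API documentation, the requested size is only a suggestion and some platforms may ignore it, so the two depths are not guaranteed to differ.

```java
class StackDepthProbe implements Runnable {

    private int depth;

    private void recurse() {
        depth++;
        recurse();
    }

    @Override
    public void run() {
        try {
            recurse();
        } catch (StackOverflowError expected) {
            // The stack is exhausted; 'depth' now holds the depth reached.
        }
    }

    // Runs the probe on a new thread created with the requested stack size
    // and returns the recursion depth reached before the stack overflowed.
    static int depthWithStack(long stackSizeBytes) {
        StackDepthProbe probe = new StackDepthProbe();
        Thread t = new Thread(null, probe, "stack-probe", stackSizeBytes);
        t.start();
        try {
            t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while probing", e);
        }
        return probe.depth;
    }

    public static void main(String[] args) {
        System.out.println("256K stack: depth " + depthWithStack(256 * 1024));
        System.out.println("4M stack:   depth " + depthWithStack(4L * 1024 * 1024));
    }
}
```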
Native Memory Tracking
JVM offers facilities to track native memory usage:
-XX:NativeMemoryTracking
-XX:NativeMemoryTracking=off|summary|detail
where the options are:
- off - do not track. This is the default.
- summary - only track memory usage by JVM subsystems, such as Java heap, class, code and thread.
- detail - in addition to tracking memory usage by JVM subsystems, track memory usage by individual CallSite, individual virtual memory region and its committed regions.
-XX:+PrintNMTStatistics
Enables printing of collected native memory tracking data at JVM exit when native memory tracking is enabled (see -XX:NativeMemoryTracking). By default, this option is disabled and native memory tracking data is not printed. The option is diagnostic and must be enabled via -XX:+UnlockDiagnosticVMOptions.
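To check from inside a running JVM which tracking mode was requested, the JVM input arguments can be inspected. This is a sketch only: the class name NmtFlagCheck is made up for illustration, and the check looks solely at the arguments reported by the Runtime MXBean, so it reflects what was requested at startup rather than what the JVM actually does.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class NmtFlagCheck {

    private static final String PREFIX = "-XX:NativeMemoryTracking=";

    // Returns the tracking mode found in the given JVM arguments,
    // or "off" (the default) if the option is absent.
    static String nmtMode(List<String> jvmArgs) {
        String mode = "off";
        for (String arg : jvmArgs) {
            if (arg.startsWith(PREFIX)) {
                // If the option is repeated, the last occurrence is kept.
                mode = arg.substring(PREFIX.length());
            }
        }
        return mode;
    }

    public static void main(String[] args) {
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Native memory tracking mode: " + nmtMode(jvmArgs));
    }
}
```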
Memory Pool
A memory pool represents a memory area managed by the JVM, and it can belong either to the heap or the non-heap memory. A memory pool is managed by one or more memory managers. Statistics about specific memory pools are exposed by the corresponding Memory Pool MBean. Examples of memory pools:
- Eden (heap)
- Survivor Space (heap)
- Old Generation (heap)
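The memory pools of a running JVM, together with their type and the memory managers that manage them, can be listed through the Memory Pool MBeans. The following is a minimal sketch; the class name MemoryPoolLister is made up for illustration, and the pool names printed depend on the JVM and on the garbage collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class MemoryPoolLister {

    // Returns the names of all memory pools of the given type (HEAP or NON_HEAP).
    static List<String> poolNames(MemoryType type) {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == type) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            System.out.println(pool.getName()
                    + " type=" + pool.getType()
                    + " managers=" + Arrays.toString(pool.getMemoryManagerNames())
                    + " usage=" + pool.getUsage());
        }
    }
}
```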
Memory Manager
A memory manager is the JVM runtime component that manages one or more memory pools.
The garbage collectors are a type of memory manager.
Other non-heap memory managers:
- The CodeCache manager - manages the Code Cache memory pool.
- The Metaspace manager - manages the Metaspace memory pool.
The memory managers are exposed over JMX by MemoryManager MBeans and the Garbage Collector MBeans.
Runtime Constant Pool
Garbage Collector
A set of one or more memory managers responsible for reclaiming memory that is occupied by unreachable objects.
The parallel collector, for example, consists of two distinct memory managers:
- "Scavenge" memory manager, that interacts with the Eden and the Survivor Space memory pools.
- "Mark and Sweep" memory manager, that interacts with all three heap memory pools (Eden, Survivor Space and Old Generation).
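The memory managers that make up the active garbage collector, and the pools each of them interacts with, can be inspected through the Garbage Collector MBeans. The sketch below is illustrative; the class name GcManagerLister is made up, and the manager names printed depend on the collector in use (with the parallel collector, two managers are expected, as described above).

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.Arrays;

class GcManagerLister {

    // Formats the name, managed pools and cumulative statistics of one collector.
    static String describe(GarbageCollectorMXBean gc) {
        return gc.getName()
                + " pools=" + Arrays.toString(gc.getMemoryPoolNames())
                + " collections=" + gc.getCollectionCount()
                + " timeMs=" + gc.getCollectionTime();
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(describe(gc));
        }
    }
}
```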
Garbage Collection Operations
Full Garbage Collection
GC Roots
The GC (Garbage Collector) Roots are special objects from the perspective of the Garbage Collector: the Garbage Collector collects only the objects that are not GC Roots and are not accessible by transitive references from GC Roots. An object can be transitively referred from more than one GC Root.
There are several kinds of GC Roots:
- Classes Loaded by a System Class Loader - such classes can never be unloaded. They can hold objects via static fields. Classes loaded by custom class loaders are not roots, unless the corresponding instance of java.lang.Class happens to be a root of another kind.
- Thread - a live thread.
- Stack Local - local variables or parameters of a Java method
- JNI Local - local variable or parameter of a JNI method
- JNI Global - global JNI reference
- Monitor Used - objects used as a monitor for synchronization.
- Held by JVM - objects held from garbage collection by JVM for its own purposes. The actual list of such objects depends on the JVM implementations. Possible known cases are: the system class loader, a few important exception classes, a few pre-allocated objects for exception handling, and custom class loaders when they are in process of loading classes.
Also see:
Reachability Scopes
YourKit exposes aggregated statistics for objects in different reachability scopes, see the Reachability view.
Strongly Reachable
Strongly reachable objects are objects reachable from GC roots via at least one chain of strong references, or that are GC roots themselves. Such objects will not be garbage collected as long as a strong reference to them remains or they remain GC roots. They tend to accumulate in the Old Generation, and memory leaks should be searched for among them.
Weakly Reachable
Weakly reachable objects are objects reachable from GC roots via weak references only. Such objects can be deleted by the garbage collector when it decides to free some memory.
Softly Reachable
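Softly reachable objects are objects reachable from GC roots via soft references only. Such objects can be deleted by the garbage collector when it decides to free some memory, typically when memory runs low. All soft references to softly reachable objects are guaranteed to be cleared before the JVM throws an OutOfMemoryError.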
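The difference between strong and weak reachability can be illustrated with java.lang.ref.WeakReference. The sketch below is illustrative only, and the class name WeakReachabilityExample is made up. System.gc() is just a hint to the JVM, so the second result printed by main depends on the JVM and is not guaranteed.

```java
import java.lang.ref.WeakReference;

class WeakReachabilityExample {

    // While 'strong' still refers to the object, it is strongly reachable
    // and must not be collected, so the weak reference keeps its referent.
    static boolean survivesWhileStronglyHeld() {
        Object strong = new Object();
        WeakReference<Object> weak = new WeakReference<>(strong);
        System.gc(); // a hint only; the referent must survive regardless
        return weak.get() == strong;
    }

    public static void main(String[] args) {
        System.out.println("held strongly, still present: "
                + survivesWhileStronglyHeld());

        // No strong reference is kept: the object is only weakly reachable.
        WeakReference<Object> weak = new WeakReference<>(new Object());
        System.gc(); // a hint only; clearing is likely but not guaranteed
        System.out.println("weakly reachable only, cleared: "
                + (weak.get() == null));
    }
}
```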
Unreachable
These are "dead" objects - objects unreachable from GC roots but not yet collected. Once the Garbage Collector decides to delete them, they will be deleted.
Pending Finalization
All unreachable objects that override the Object.finalize() method are placed in the finalizer queue before actual deletion.
Memory Leak
A memory leak is the phenomenon of accumulating object instances that are not needed anymore according to the application logic, but are retained on the heap and cannot be collected because there is at least one transitive reference to them from one of the GC roots. In other words, for each leaked object there is always a path that starts from a GC Root and contains, or ends with, the leaked object. As time passes, the heap becomes increasingly occupied, until no more memory is available for new allocations. At that point the application ceases to function, which is why memory leaks are considered application defects.
The same symptoms occur if the JVM has a heap that is too small for the steady state memory level of the application.
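A typical leak pattern is a long-lived collection that only grows. The sketch below is a made-up example (the class LeakExample and its methods are hypothetical, not taken from any real application): each simulated request adds a buffer to a static list and never removes it, so every buffer stays strongly reachable through the static field even though the application no longer needs it.

```java
import java.util.ArrayList;
import java.util.List;

class LeakExample {

    // Static field of a class loaded by the system class loader:
    // everything stored here stays strongly reachable.
    static final List<byte[]> CACHE = new ArrayList<>();

    // Simulates handling one request.
    static void handleRequest() {
        byte[] buffer = new byte[1024];
        // Defect: the buffer is remembered but never removed.
        CACHE.add(buffer);
    }

    public static void main(String[] args) {
        int before = CACHE.size();
        for (int i = 0; i < 1000; i++) {
            handleRequest();
        }
        System.out.println("buffers retained by this run: "
                + (CACHE.size() - before)); // prints 1000
    }
}
```

In a heap dump, the path from the GC root to each leaked buffer would go through the LeakExample class and its CACHE field.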
A procedure to diagnose a memory leak using the YourKit profiler is described here:
Steady State Memory Level
For applications that do not leak memory and that are subjected to constant load, the old generation occupancy usually stabilizes at a certain level, referred to as the steady state memory level. If the JVM heap is sized too low for a certain (otherwise normal) steady state level, the symptoms may be similar to those of a memory leak.
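One way to observe the steady state level from inside the JVM is to sample the old generation occupancy measured right after a collection, which the memory pool MBean exposes as "collection usage". The following is a sketch under stated assumptions: the class name OldGenAfterGc is made up, and the old generation pool is located with a name heuristic ("Old" or "Tenured" in the pool name) that matches names such as "PS Old Gen" but may not match every collector.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

class OldGenAfterGc {

    // Heuristic (an assumption): old generation pools carry "Old" or
    // "Tenured" in their name, for example "PS Old Gen" or "Tenured Gen".
    static boolean looksLikeOldGen(String poolName) {
        return poolName.contains("Old") || poolName.contains("Tenured");
    }

    // Returns the first heap pool that looks like an old generation, or null.
    static MemoryPoolMXBean findOldGenPool() {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP && looksLikeOldGen(pool.getName())) {
                return pool;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        MemoryPoolMXBean oldGen = findOldGenPool();
        if (oldGen == null) {
            System.out.println("No old generation pool found in this JVM");
            return;
        }
        // Occupancy measured after the most recent collection of this pool;
        // null if the JVM does not support this metric for the pool.
        MemoryUsage afterGc = oldGen.getCollectionUsage();
        System.out.println(oldGen.getName() + " after last GC: "
                + (afterGc == null ? "n/a" : afterGc.getUsed() + " bytes"));
    }
}
```

Sampled periodically under constant load, a value that levels off suggests a steady state, while a value that keeps growing across collections suggests a leak.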
A procedure to diagnose elevated steady state level is described here:
HPROF Format
Java has a built-in capability for dumping heap snapshots in the HPROF binary format. Dump files usually have the *.hprof extension. Various profilers can capture heap dumps in extended formats that include additional troubleshooting information, for example the YourKit memory snapshot format.
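On HotSpot-based JVMs, an HPROF dump can be triggered programmatically through the com.sun.management.HotSpotDiagnosticMXBean. The sketch below assumes such a JVM; the class name HeapDumper and the file name snapshot.hprof are made up for illustration. The target file must not already exist, and recent JDK versions require the file name to end in ".hprof".

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Files;
import java.nio.file.Path;

class HeapDumper {

    // Writes an HPROF heap dump to the given file.
    // If liveOnly is true, only reachable objects are dumped.
    static void dump(Path file, boolean liveOnly) throws IOException {
        HotSpotDiagnosticMXBean diagnostic =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        diagnostic.dumpHeap(file.toString(), liveOnly);
    }

    public static void main(String[] args) throws IOException {
        // A fresh temporary directory guarantees the file does not exist yet.
        Path dir = Files.createTempDirectory("heapdump");
        Path file = dir.resolve("snapshot.hprof");
        dump(file, true);
        System.out.println("Wrote " + file + " (" + Files.size(file) + " bytes)");
    }
}
```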