YourKit Concepts

From NovaOrdis Knowledge Base

Internal

Profiler Agent

Remote Profiling

https://www.yourkit.com/docs/java/help/remote_profiling.jsp

Remote profiling means that the profiled application and the profiler UI run on different machines, and the profiler UI communicates with the profiler agent over the network. Remote profiling is only possible if the application JVM loads the profiler agent. There are two methods to load the profiler agent into the target JVM:

  1. Manual configuration: the target JVM is started with a configuration that loads the profiler agent. Starting the JVM with the agent is recommended, because attaching the agent to a running JVM has limitations in profiling functionality and is not always possible. The detailed procedure to configure a JVM for remote profiling is described here: Manually Configure Target JVM for Remote Profiling.
  2. The "attach" technique: the profiler agent is loaded into a running JVM, without a restart. For more details see https://www.yourkit.com/docs/java/help/attach_agent.jsp and https://www.yourkit.com/docs/java/help/attach_wizard.jsp.
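
A quick way to check whether a JVM was started with the agent (method 1) is to inspect its startup arguments. The sketch below is illustrative only: the class name AgentArgumentCheck is hypothetical, and it assumes that the agent is loaded with the standard -agentpath JVM option and that the agent library file name contains "yjpagent". An agent attached to an already running JVM (method 2) does not appear among the startup arguments, so this check does not detect it.

```java
import java.lang.management.ManagementFactory;
import java.util.List;
import java.util.Optional;

/** Sketch: detect whether a JVM argument list loads a native agent that looks like the profiler agent. */
final class AgentArgumentCheck {

    // Assumption: the agent library file name contains this string.
    static final String AGENT_NAME_HINT = "yjpagent";

    /** Returns the first -agentpath argument that mentions the agent library, if any. */
    static Optional<String> findAgentArgument(List<String> jvmArgs) {
        return jvmArgs.stream()
                .filter(arg -> arg.startsWith("-agentpath:"))
                .filter(arg -> arg.contains(AGENT_NAME_HINT))
                .findFirst();
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with (excludes agents attached later).
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        Optional<String> agent = findAgentArgument(jvmArgs);
        System.out.println(agent
                .map(a -> "Agent configured at startup: " + a)
                .orElse("No matching -agentpath argument found."));
    }
}
```

Running the main method inside the target application prints the matching argument, if one is present.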

Memory Profiling

YourKit can be used to diagnose several types of memory problems: an elevated steady-state memory level, memory leaks, and excessive garbage collection. The memory telemetry information is maintained in a circular buffer in the profiler agent's memory.
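
The practical consequence of a circular buffer is that only the most recent telemetry samples are retained: once the buffer is full, each new sample overwrites the oldest one. The sketch below shows the general technique; it is not the profiler agent's actual implementation, and the class name TelemetryRingBuffer and its methods are hypothetical.

```java
/** Sketch of a fixed-capacity circular buffer of telemetry samples (hypothetical, not YourKit code). */
final class TelemetryRingBuffer {
    private final long[] samples;
    private int next; // index where the next sample will be written
    private int size; // number of valid samples, at most samples.length

    TelemetryRingBuffer(int capacity) {
        if (capacity <= 0) {
            throw new IllegalArgumentException("capacity must be positive");
        }
        samples = new long[capacity];
    }

    /** Stores a sample, overwriting the oldest one when the buffer is full. */
    void add(long usedHeapBytes) {
        samples[next] = usedHeapBytes;
        next = (next + 1) % samples.length;
        if (size < samples.length) {
            size++;
        }
    }

    /** Returns the retained samples, oldest first. */
    long[] snapshot() {
        long[] out = new long[size];
        int start = (next - size + samples.length) % samples.length;
        for (int i = 0; i < size; i++) {
            out[i] = samples[(start + i) % samples.length];
        }
        return out;
    }
}
```

With a capacity of 3, adding the values 1, 2, 3 and 4 leaves 2, 3 and 4 in the buffer: memory use stays bounded, at the cost of losing the oldest data.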

Memory Snapshot in YourKit Format

A memory snapshot represents the memory state of the profiled application at the moment the snapshot was captured. The snapshot contains information about all loaded classes, all existing object instances, the values of their primitive fields and arrays of primitive types, and references between objects. Optionally, a memory snapshot in YourKit format may contain information about object allocations. By default, YourKit captures memory snapshots in its own format. Optionally, memory snapshots can be captured in HPROF format.

HPROF Memory Snapshot

HPROF format

Object Generations

Object Allocation Recording

Object allocation recording consists of tracking and recording the method calls where objects are created. Object allocation recording is also known as "allocation telemetry". Allocation recording is based on bytecode instrumentation, but it is not enabled by default, as the operation may have performance implications. When allocation recording is not enabled, there is almost no overhead, even if the bytecode instrumentation is present. Bytecode instrumentation may be eliminated with the "disablealloc" and "disableall" startup options. When object allocation recording is enabled, the "Object Allocation Recording" graph in the "Memory" tab shows the number of objects created per second. Object allocation can be recorded in two modes: recording of the thread and stack where objects are allocated (default), and object counting.

Recording of the thread and stack where objects are allocated provides the most detail about object allocation, but it also comes with the highest overhead. The full stack and the thread where a particular object is created are determined and stored in the memory snapshot for each recorded object. The collection overhead can be kept under control by recording only a certain percentage of object allocations and skipping all others. It is also possible to configure the profiler to only record allocations of objects over a certain size. These options are configured in the "Reduce overhead by not recording all objects" section of the control UI.
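
The two overhead-reduction options can be pictured as a filter applied to each allocation. The sketch below is a hypothetical illustration, not the profiler's actual logic: the class AllocationRecordingFilter, its parameters, and in particular the way the two rules are combined (an allocation is recorded if it is large enough, or if it is the N-th allocation seen) are assumptions made for the example.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch of a filter that records only a subset of allocations (hypothetical, not YourKit code). */
final class AllocationRecordingFilter {
    private final int everyNth;        // record one allocation out of every everyNth
    private final long minSizeBytes;   // always record objects at least this large
    private final AtomicLong seen = new AtomicLong();

    AllocationRecordingFilter(int everyNth, long minSizeBytes) {
        if (everyNth <= 0) {
            throw new IllegalArgumentException("everyNth must be positive");
        }
        this.everyNth = everyNth;
        this.minSizeBytes = minSizeBytes;
    }

    /** Assumed combination rule: large objects are always recorded, others only every N-th time. */
    boolean shouldRecord(long objectSizeBytes) {
        long n = seen.incrementAndGet();
        return objectSizeBytes >= minSizeBytes || n % everyNth == 0;
    }
}
```

For example, new AllocationRecordingFilter(10, 1024) records roughly 10% of small allocations and every allocation of 1024 bytes or more, so the expensive stack capture runs for only a fraction of the created objects.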

It is also possible to use a sampling thread to obtain the stacks of running threads, instead of recording the exact stack trace of each recorded new object. Just as with CPU sampling, the sampled object allocation recording results are relevant only for methods that run longer than the sampling period. This mode requires expert skills in interpreting the results. It is enabled with the "Estimated (sampled) stacks instead of exact stacks" option in the control UI.

Object counting provides allocated object counts by class, then by immediate allocator method with the line number, if available. It does not provide stack traces, and does not track particular instances. Objects allocated in different threads are summed and cannot be distinguished. Counts are not guaranteed to be exact. To ensure minimal overhead, allocation counters are updated without taking any locks or using test-and-set style atomic operations. If the same method runs in parallel in different threads on different CPU cores and simultaneously creates instances of the same class, some of the allocations may be missed by the non-atomic counter. Object counting mode has the lowest (almost zero) overhead of the two modes.
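
The trade-off behind inexact counts can be reproduced with a plain counter. The sketch below is an illustration of the general effect, not the profiler's code; the class CounterRaceDemo and its members are hypothetical. Several threads increment both a plain int field (no locks, no atomics) and an AtomicLong. The atomic counter always reaches the exact total, while the plain counter may end up lower because concurrent read-modify-write updates can overwrite each other.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch: a non-atomic counter can lose updates under contention (hypothetical demo). */
final class CounterRaceDemo {
    static int plainCount; // updated without locks or atomic operations
    static final AtomicLong atomicCount = new AtomicLong();

    /** Returns {plain counter value, atomic counter value} after all threads finish. */
    static long[] run(int threads, int incrementsPerThread) {
        plainCount = 0;
        atomicCount.set(0);
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    plainCount++;                  // cheap, but increments may be lost
                    atomicCount.incrementAndGet(); // exact, but more expensive under contention
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return new long[] {plainCount, atomicCount.get()};
    }

    public static void main(String[] args) {
        long[] result = run(4, 1_000_000);
        System.out.println("plain=" + result[0] + " atomic=" + result[1]);
    }
}
```

How many increments the plain counter loses varies from run to run and from machine to machine, which mirrors why the counts in object counting mode are approximate.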

Recorded allocations are shown in the "Allocations" view of the "Memory" tab.

Automated Memory Snapshot Trigger

To Process
