Profiling (computer programming)


In software engineering, profiling is a form of dynamic program analysis that measures, for example, the space or time complexity of a program, the usage of particular instructions, or the frequency and duration of function calls. Most commonly, profiling information serves to aid program optimization.
Profiling is achieved by instrumenting either the program source code or its binary executable form using a tool called a profiler. Profilers may use a number of different techniques, such as event-based, statistical, instrumented, and simulation methods.

Gathering program events

Profilers use a wide variety of techniques to collect data, including hardware interrupts, code instrumentation, instruction set simulation, operating system hooks, and performance counters. Profilers are used in the performance engineering process.
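As an illustration of collecting data from hardware performance counters on Linux, the sketch below uses the perf_event_open system call to count the instructions retired while a region of code runs. The event choice and the measured loop are arbitrary, and error handling is kept minimal:

#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

int main(void) {
    struct perf_event_attr attr;
    memset(&attr, 0, sizeof attr);
    attr.type = PERF_TYPE_HARDWARE;           /* use a hardware counter */
    attr.size = sizeof attr;
    attr.config = PERF_COUNT_HW_INSTRUCTIONS; /* count retired instructions */
    attr.disabled = 1;
    attr.exclude_kernel = 1;                  /* user-mode instructions only */

    /* perf_event_open has no glibc wrapper; invoke it via syscall() */
    int fd = (int)syscall(SYS_perf_event_open, &attr, 0, -1, -1, 0);
    if (fd < 0) { perror("perf_event_open"); return 1; }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

    volatile long sum = 0;                    /* the region being measured */
    for (long i = 0; i < 1000000; i++) sum += i;

    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);
    long long count;
    read(fd, &count, sizeof count);
    printf("instructions retired: %lld\n", count);
    close(fd);
    return 0;
}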

Use of profilers

The output of a profiler may take several forms; one example is a source listing annotated with per-statement execution counts:
/* ------------ source ------------------------- count */
0001             IF X = "A"                      0055
0002                THEN DO
0003                  ADD 1 to XCOUNT            0032
0004                ELSE
0005             IF X = "B"                      0055
A profiler can be applied to an individual method or at the scale of a module or program, to identify performance bottlenecks by making long-running code obvious. A profiler can be used to understand code from a timing point of view, with the objective of optimizing it to handle various runtime conditions or various loads. Profiling results can be ingested by a compiler that provides profile-guided optimization. Profiling results can be used to guide the design and optimization of an individual algorithm; the Krauss matching wildcards algorithm is an example. Profilers are built into some application performance management systems that aggregate profiling data to provide insight into transaction workloads in distributed applications.
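For example, GCC implements profile-guided optimization as a compile-run-recompile cycle. The following sketch shows the typical steps; the file and input names are placeholders:

/* Profile-guided optimization with GCC (a sketch; names are placeholders):

   gcc -O2 -fprofile-generate app.c -o app   build with profiling instrumentation
   ./app typical-input                       run a representative workload; the
                                             run writes .gcda profile data files
   gcc -O2 -fprofile-use app.c -o app        rebuild; the recorded profile guides
                                             inlining, branch layout, and other
                                             optimization decisions */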

History

Performance-analysis tools existed on IBM System/360 and System/370 platforms from the early 1970s, usually based on timer interrupts which recorded the program status word at set timer intervals to detect "hot spots" in executing code. This was an early example of sampling. In early 1974 instruction-set simulators permitted full trace and other performance-monitoring features.
Profiler-driven program analysis on Unix dates back to 1973, when Unix systems included a basic tool, prof, which listed each function and how much of program execution time it used. In 1982 gprof extended the concept to a complete call graph analysis.
In 1994, Amitabh Srivastava and Alan Eustace of Digital Equipment Corporation published a paper describing ATOM. The ATOM platform converts a program into its own profiler: at compile time, it inserts code into the program to be analyzed. That inserted code outputs analysis data. This technique - modifying a program to analyze itself - is known as "instrumentation".
In 2004 both the gprof and ATOM papers appeared on the list of the 50 most influential PLDI papers for the 20-year period ending in 1999.

Profiler types based on output

Flat profiler

Flat profilers compute average call times from the recorded calls, and do not break the times down by callee or by calling context.
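As a minimal sketch of the idea, not any particular tool, a flat profile can be produced by accumulating a total time and a call count per function and reporting the average, with no record of who called whom. The function names here are illustrative:

#include <stdio.h>
#include <time.h>

static double total_ns[2];
static long   calls[2];
static const char *names[2] = { "parse", "render" };

static double now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1e9 + ts.tv_nsec;
}

static void parse(void)  { for (volatile int i = 0; i < 100; i++) ; }
static void render(void) { for (volatile int i = 0; i < 300; i++) ; }

/* wrap each call, charging its elapsed time to the function's bucket */
static void timed(int id, void (*fn)(void)) {
    double t0 = now_ns();
    fn();
    total_ns[id] += now_ns() - t0;
    calls[id]++;
}

int main(void) {
    for (int i = 0; i < 1000; i++) { timed(0, parse); timed(1, render); }
    for (int id = 0; id < 2; id++)      /* the "flat" report: no context */
        printf("%-8s %5ld calls, avg %7.1f ns\n",
               names[id], calls[id], total_ns[id] / calls[id]);
    return 0;
}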

Call-graph profiler

Call-graph profilers show the call times and frequencies of functions, as well as the call chains involved, broken down by callee. In some tools the full calling context is not preserved.
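As a concrete example, the classic gprof workflow produces both a flat profile and a call graph from an instrumented build; the program name below is a placeholder:

/* Call-graph profiling with gprof (a sketch):

   gcc -pg demo.c -o demo    compile with profiling instrumentation
   ./demo                    run normally; the run writes gmon.out
   gprof demo gmon.out       print a flat profile followed by a call graph
                             attributing each function's time to its callers */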

Input-sensitive profiler

Input-sensitive profilers add a further dimension to flat or call-graph profilers by relating performance measures to features of the input workloads, such as input size or input values. They generate charts that characterize how an application's performance scales as a function of its input.
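A sketch of the underlying measurement: timing the same routine at several input sizes yields a table of (size, time) pairs, which is the raw material an input-sensitive profiler turns into a scaling chart. Here the standard qsort stands in for the routine under study:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static int cmp(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

int main(void) {
    /* time qsort at growing input sizes to observe how cost scales with n */
    for (int n = 1 << 10; n <= 1 << 16; n <<= 2) {
        int *a = malloc(n * sizeof *a);
        if (!a) return 1;
        for (int i = 0; i < n; i++) a[i] = rand();
        clock_t t0 = clock();
        qsort(a, n, sizeof *a, cmp);
        printf("n=%7d  %.6f s\n", n, (double)(clock() - t0) / CLOCKS_PER_SEC);
        free(a);
    }
    return 0;
}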

Data granularity in profiler types

Profilers, which are themselves programs, analyze target programs by collecting information on their execution. Based on their data granularity (how they collect information), profilers are classified as event-based or statistical. Because profilers interrupt program execution to collect information, time measurements have limited resolution and should be taken with a grain of salt. Basic-block profilers report the number of machine clock cycles devoted to executing each line of code, or a timing derived by summing these; the timings reported per basic block may not reflect the difference between cache hits and misses.
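As an illustration of cycle-granularity measurement (x86-specific, and subject to the cache caveat above), the processor's time-stamp counter can be read around a small region of code; cache hits and misses are folded into a single cycle count:

#include <stdio.h>
#include <x86intrin.h>

int main(void) {
    volatile int sum = 0;
    unsigned long long t0 = __rdtsc();  /* read time-stamp counter before */
    for (int i = 0; i < 1000; i++)      /* the small region being measured */
        sum += i;
    unsigned long long t1 = __rdtsc();  /* and after */
    printf("region took ~%llu cycles (sum=%d)\n", t1 - t0, sum);
    return 0;
}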

Event-based profilers

Event-based profilers trap on well-defined runtime events, such as function calls and returns, class loads, or memory allocations; many language runtimes (for example, the Java Virtual Machine through the JVM Tool Interface) expose hooks for building such profilers.

Statistical profilers

Some profilers operate by sampling. A sampling profiler probes the target program's call stack at regular intervals using operating system interrupts. Sampling profiles are typically less numerically accurate and specific, but allow the target program to run at near full speed.
The resulting data are not exact, but a statistical approximation. "The actual amount of error is usually more than one sampling period. In fact, if a value is n times the sampling period, the expected error in it is the square-root of n sampling periods."
In practice, sampling profilers can often provide a more accurate picture of the target program's execution than other approaches, as they are not as intrusive to the target program and thus do not have as many side effects. Also, since they do not affect the execution speed as much, they can detect issues that would otherwise be hidden. They are also relatively immune to over-evaluating the cost of small, frequently called routines or 'tight' loops. They can show the relative amount of time spent in user mode versus interruptible kernel mode such as system call processing.
Still, the kernel code that handles the interrupts entails a minor loss of CPU cycles and diverted cache usage, and cannot distinguish among the various tasks occurring in uninterruptible kernel code.
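A minimal sketch of the sampling idea on POSIX systems: a profiling timer delivers SIGPROF at fixed intervals of consumed CPU time, and the handler records what was executing. Here it records only a phase flag, where a real profiler would capture the program counter or call stack:

#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/time.h>

static volatile sig_atomic_t current_phase;   /* set by the main code */
static volatile sig_atomic_t samples[2];      /* hits observed per phase */

static void on_sample(int sig) {
    (void)sig;
    samples[current_phase] = samples[current_phase] + 1;  /* one hit per tick */
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sample;
    sigaction(SIGPROF, &sa, NULL);

    /* fire SIGPROF every 10 ms of CPU time consumed by this process */
    struct itimerval it = { { 0, 10000 }, { 0, 10000 } };
    setitimer(ITIMER_PROF, &it, NULL);

    volatile double x = 0;
    current_phase = 0;
    for (long i = 0; i < 50000000L; i++) x += i;    /* phase 0: some work */
    current_phase = 1;
    for (long i = 0; i < 150000000L; i++) x += i;   /* phase 1: 3x the work */

    struct itimerval off = { { 0, 0 }, { 0, 0 } };
    setitimer(ITIMER_PROF, &off, NULL);             /* stop sampling */

    printf("phase 0: %ld samples, phase 1: %ld samples (expect roughly 1:3)\n",
           (long)samples[0], (long)samples[1]);
    return 0;
}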
Dedicated hardware can go beyond software sampling: the ARM Cortex-M3 and the JTAG interface of some recent MIPS processors provide a PCSAMPLE register, which samples the program counter in a truly undetectable manner, allowing non-intrusive collection of a flat profile.
Some commonly used statistical profilers for Java/managed code are SmartBear Software's AQtime and Microsoft's CLR Profiler. Those profilers also support native-code profiling, as do Apple Inc.'s Shark, OProfile, Intel VTune and Parallel Amplifier, and Oracle Performance Analyzer, among others.

Instrumentation

This technique effectively adds instructions to the target program to collect the required information. Note that instrumenting a program can cause performance changes, and may in some cases lead to inaccurate results and/or heisenbugs. The effect will depend on what information is being collected, on the level of timing detail reported, and on whether basic-block profiling is used in conjunction with instrumentation. For example, adding code to count every procedure/routine call will probably have less effect than counting how many times each statement is executed. A few computers have special hardware to collect information; in this case the impact on the program is minimal.
Instrumentation is key to determining the level of control and the time resolution available to a profiler.
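As a sketch of compiler-assisted instrumentation, GCC's -finstrument-functions option inserts calls to two user-supplied hooks at every function entry and exit (compile with: gcc -finstrument-functions demo.c -o demo). Here the hooks merely count events, where a real profiler would also record timestamps and function addresses:

#include <stdio.h>

static long enters, exits;

/* the hooks themselves must not be instrumented, or they would recurse */
__attribute__((no_instrument_function))
void __cyg_profile_func_enter(void *fn, void *call_site) {
    (void)fn; (void)call_site;
    enters++;                       /* called on every function entry */
}

__attribute__((no_instrument_function))
void __cyg_profile_func_exit(void *fn, void *call_site) {
    (void)fn; (void)call_site;
    exits++;                        /* called on every function return */
}

static void work(void) { }

int main(void) {
    for (int i = 0; i < 3; i++)
        work();
    printf("function entries: %ld, exits: %ld\n", enters, exits);
    return 0;                       /* main's own exit is counted after this */
}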