Using perf

From OMAPpedia

Revision as of 11:46, 21 July 2010 by H dupras

Performance counters are special hardware registers available on most modern CPUs. These registers count certain types of hardware events, such as instructions executed, cache misses suffered, or branches mispredicted, without slowing down the kernel or applications. They can also trigger an interrupt when a threshold number of events has passed, and can thus be used to profile the code that runs on that CPU.

The Linux Performance Counter subsystem provides rich abstractions over these hardware capabilities. It provides per-task, per-CPU, and per-workload counters and counter groups, with sampling capabilities on top of those, and more.

It also provides abstractions for 'software events', such as minor/major page faults, task migrations, task context switches, and tracepoints.

There is a new tool ('perf') that makes full use of this new kernel subsystem. It can be used to optimize, validate and measure applications, workloads or the full system.

'perf' is hosted in the upstream kernel repository and can be found under: tools/perf/

perf structure

Breakpoints requested from different sources (ptrace, kgdb, ftrace, or the perf syscall) are handled as perf events, which take care of register scheduling, thread/CPU attachment, etc.

     ptrace       kgdb      ftrace   perf syscall
         \          |          /         /
          \         |         /         /
                                       /
           Core breakpoint API        /
                                     /
                    |               /
                    |              /

             Breakpoints perf events

This is why, to make full use of perf, you have to enable all of these modules, such as Ftrace, in the kernel configuration.


Installation

Currently, the perf tool cannot be cross-compiled because of the libraries it depends on. For now, it can be built natively on an OMAP board running Ubuntu. Before compiling it, install the libelf library, which is needed both to build and to run perf.

# apt-get install libelf-dev
# make 
# make install

For other operating systems, a prebuilt version of the perf tool is available here

Furthermore, the following flags have to be enabled in the kernel configuration:

* PERF_EVENTS
* PERF_COUNTERS
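
The check for these two flags can be scripted. The following is only a sketch: the function name check_perf_config is hypothetical, and it assumes you have a plain-text kernel configuration file in which enabled options appear as CONFIG_<NAME>=y.

```shell
# Hypothetical helper (not part of perf): report whether the two flags
# listed above are enabled in a plain-text kernel config file.
# Assumption: enabled options appear as "CONFIG_<NAME>=y".
check_perf_config() {
    config_file="$1"
    rc=0
    for opt in PERF_EVENTS PERF_COUNTERS; do
        if grep -q "^CONFIG_${opt}=y" "$config_file"; then
            echo "CONFIG_${opt}: enabled"
        else
            echo "CONFIG_${opt}: not enabled"
            rc=1
        fi
    done
    # Non-zero exit status if at least one flag is not enabled.
    return "$rc"
}
```

On a system that installs the running kernel's configuration under /boot (an assumption; the location varies between distributions), it could be called as: check_perf_config "/boot/config-$(uname -r)"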

Getting Started

Once you have installed 'perf' on your system, the simplest way to start profiling a userspace program is to use the "perf record" and "perf report" commands as follows:

$ perf record -f -- git gc
 
Counting objects: 1283571, done.
Compressing objects: 100% (206724/206724), done.
Writing objects: 100% (1283571/1283571), done.
Total 1283571 (delta 1070675), reused 1281443 (delta 1068566)
[ perf record: Captured and wrote 31.054 MB perf.data (~1356768 samples) ]
 
$ perf report --sort comm,dso,symbol | head -10
# Samples: 1355726
#
# Overhead          Command                            Shared Object  Symbol
# ........  ...............  .......................................  ......
#
    31.53%              git  /usr/bin/git                             [.] 0x0000000009804f
    13.41%        git-prune  /usr/bin/git-prune                       [.] 0x000000000ad06d
    10.05%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] _nl_make_l10nflist
     5.36%        git-prune  /usr/lib/libz.so.1.2.3.3                 [.] 0x00000000009d51
     4.48%              git  /lib/tls/i686/cmov/libc-2.8.90.so        [.] memcpy
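
When the per-symbol listing is too fine-grained, the text report can be post-processed. The sketch below sums the overhead per command. The function name overhead_by_command is hypothetical, and the parsing assumes the column layout shown above (overhead first, command second, command names without spaces).

```shell
# Hypothetical helper (not part of perf): sum the "Overhead" column of a
# text report per command, largest total first.
# Assumption: field 1 is the overhead percentage and field 2 the command,
# as in the report layout shown above.
overhead_by_command() {
    awk '
        /^#/ || NF < 2 { next }                 # skip header and blank lines
        { sub(/%$/, "", $1); total[$2] += $1 }  # strip "%" and accumulate
        END { for (c in total) printf "%s %.2f\n", c, total[c] }
    ' | sort -k2,2nr
}
```

It reads the report on standard input, for example: perf report --sort comm,dso,symbol | overhead_by_command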


perf event tracepoint

# perf list
[...]
kmem:kmalloc                             [Tracepoint event]
kmem:kmem_cache_alloc                    [Tracepoint event]
kmem:kmalloc_node                        [Tracepoint event]
kmem:kmem_cache_alloc_node               [Tracepoint event]
kmem:kfree                               [Tracepoint event]
kmem:kmem_cache_free                     [Tracepoint event]
kmem:mm_page_free_direct                 [Tracepoint event]
kmem:mm_pagevec_free                     [Tracepoint event]
kmem:mm_page_alloc                       [Tracepoint event]
kmem:mm_page_alloc_zone_locked           [Tracepoint event]
kmem:mm_page_pcpu_drain                  [Tracepoint event]

Any (or all) of the above event sources can then be activated and measured. For example, the page alloc/free properties of a 'hackbench' run are:

# perf stat -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc -e kmem:mm_pagevec_free -e kmem:mm_page_free_direct ./hackbench 10
Time: 0.575
Performance counter stats for './hackbench 10':
         13857  kmem:mm_page_pcpu_drain 
         27576  kmem:mm_page_alloc      
          6025  kmem:mm_pagevec_free    
         20934  kmem:mm_page_free_direct
   0.613972165  seconds time elapsed


You can observe the statistical properties as well, by using the 'repeat the workload N times' feature of perf stat:

# perf stat --repeat 5 -e kmem:mm_page_pcpu_drain -e kmem:mm_page_alloc -e kmem:mm_pagevec_free -e kmem:mm_page_free_direct ./hackbench 10
Time: 0.627
Time: 0.644
Time: 0.564
Time: 0.559
Time: 0.626
Performance counter stats for './hackbench 10' (5 runs):
         12920  kmem:mm_page_pcpu_drain    ( +-   3.359% )
         25035  kmem:mm_page_alloc         ( +-   3.783% )
         6104  kmem:mm_pagevec_free       ( +-   0.934% )
        18376  kmem:mm_page_free_direct   ( +-   4.941% )
  0.643954516  seconds time elapsed   ( +-   2.363% )
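
The same kind of summary can be computed by hand for the 'Time:' lines that hackbench prints on each run. This is a minimal sketch: the function name summarize_times is hypothetical, and the spread it reports (sample standard deviation relative to the mean) is not necessarily the formula perf uses for its +- column.

```shell
# Hypothetical helper (not part of perf): mean and sample standard
# deviation of the values found on "Time: <seconds>" lines.
# Assumption: this is a generic summary, not perf's own +- computation.
summarize_times() {
    awk '
        /^Time:/ { x[++n] = $2; sum += $2 }
        END {
            if (n < 2) { print "need at least two Time: lines"; exit 1 }
            mean = sum / n
            for (i = 1; i <= n; i++) ss += (x[i] - mean) * (x[i] - mean)
            sd = sqrt(ss / (n - 1))   # sample standard deviation
            printf "runs=%d mean=%.3f stddev=%.3f (%.2f%% of mean)\n", n, mean, sd, 100 * sd / mean
        }'
}
```

Assuming the workload output has been saved to a file (hackbench.log is a made-up name), it can be used as: summarize_times < hackbench.log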


Scripting support for perf

Support was recently added for using Perl and Python scripts with the perf tool. Interpreters for both Perl and Python can be embedded into the perf executable, which allows the raw perf trace data stream to be processed in either of those languages.

Several example scripts are provided with perf and can be listed from perf itself:

# perf trace -l
List of available trace scripts:
syscall-counts [comm]                system-wide syscall counts
syscall-counts-by-pid [comm]         system-wide syscall counts, by pid
failed-syscalls-by-pid [comm]        system-wide failed syscalls, by pid
workqueue-stats                      workqueue stats (ins/exe/create/destroy)
check-perf-trace                     useless but exhaustive test script
failed-syscalls [comm]               system-wide failed syscalls
wakeup-latency                       system-wide min/max/avg wakeup latency
rw-by-file <comm>                    r/w activity for a program, by file
rw-by-pid                            system-wide r/w activity


This list is a mix of Perl and Python scripts that live in the tools/perf/scripts/{perl,python} directories.

The installed scripts can be used as follows:

# perf trace record failed-syscalls
   ^C[ perf record: Woken up 11 times to write data ]                         
   [ perf record: Captured and wrote 1.939 MB perf.data (~84709 samples) ] 
# perf trace report failed-syscalls
   perf trace started with Perl script /root/libexec/perf-core/scripts/perl/failed-syscalls.pl


   failed syscalls, by comm:

   comm                    # errors
   --------------------  ----------
   firefox                     1721
   claws-mail                   149
   konsole                       99
   X                             77
   emacs                         56
   [...]

   failed syscalls, by syscall:

   syscall                           # errors
   ------------------------------  ----------
   sys_read                              2042
   sys_futex                              130
   sys_mmap_pgoff                          71
   sys_access                              33
   sys_stat64                               5
   sys_inotify_add_watch                    4
   [...]

