Power Management Debug and Profiling
From OMAPpedia
| Line 14: | Line 14: | ||
* Initial discussion on LKML: cf. http://marc.info/?l=linux-kernel&m=128195697205096&w=4 | * Initial discussion on LKML: cf. http://marc.info/?l=linux-kernel&m=128195697205096&w=4 | ||
* Generic code + use of the tracepoints for ARM/OMAP: submitted to the LKML, cf. http://marc.info/?l=linux-kernel&m=128384411510589&w=2 | * Generic code + use of the tracepoints for ARM/OMAP: submitted to the LKML, cf. http://marc.info/?l=linux-kernel&m=128384411510589&w=2 | ||
| + | |||
| + | ==ToDo== | ||
| + | * profile the low level ASM code for OMAP. Because only the WKUP domain scratchpad survives the OFF mode a trick must be used to first store a timestamp in the scratchpad, then the event will be generated using the timestamp. | ||
| + | * pytimechart needs some changes to display the new PM events correctly, with the correct parameters (i.e. clock names for the clock_enable and clock_disable events...). | ||
| + | * add low power transition code events for fine profiling: POWER_ENTER/POWER_EXIT or POWER_ASLEEP/POWER_WAKEN? | ||
| + | * add low power transition code path latency stats in pytimechart | ||
| + | * various pytimechart bugfixes and re-write | ||
| + | * other tools: cpufrequtils, powertop ... | ||
==Compilation and usage== | ==Compilation and usage== | ||
| Line 421: | Line 429: | ||
* Close-up of the previous chart | * Close-up of the previous chart | ||
[[File:Pytimechart screenshot5 rs.jpg|center|thumb|320px]] | [[File:Pytimechart screenshot5 rs.jpg|center|thumb|320px]] | ||
| - | |||
| - | |||
| - | |||
| - | |||
| - | |||
==l-o kernel patch/RFC== | ==l-o kernel patch/RFC== | ||
Revision as of 08:15, 7 September 2010
Contents |
PM debug and profiling
Tracing of clock and power domains events
The perf and ftrace frameworks are used to generate the following events:
- cpuidle transitions to and from idle,
- system suspend,
- individual clocks enable, disable and set_rate,
- cpufreq/DVFS transitions,
- power domain transitions.
Cf. the patch/RFC below for code details (code is in WIP status).
Patches submissions:
- Initial discussion on LKML: cf. http://marc.info/?l=linux-kernel&m=128195697205096&w=4
- Generic code + use of the tracepoints for ARM/OMAP: submitted to the LKML, cf. http://marc.info/?l=linux-kernel&m=128384411510589&w=2
ToDo
- profile the low level ASM code for OMAP. Because only the WKUP domain scratchpad survives the OFF mode a trick must be used to first store a timestamp in the scratchpad, then the event will be generated using the timestamp.
- pytimechart needs some changes to display the new PM events correctly, with the correct parameters (i.e. clock names for the clock_enable and clock_disable events...).
- add low power transition code events for fine profiling: POWER_ENTER/POWER_EXIT or POWER_ASLEEP/POWER_WAKEN?
- add low power transition code path latency stats in pytimechart
- various pytimechart bugfixes and re-write
- other tools: cpufrequtils, powertop ...
Compilation and usage
Starting from the pm branch of Tony's tree (at http://git.kernel.org/?p=linux/kernel/git/tmlind/linux-omap-2.6.git), the patch below needs to be applied in order to get support for the new events and the OMAP specific trace points.
The config in use is omap3_pm_defconfig, with the following extra options to be enabled:
- CONFIG_PERF_EVENTS=y
- CONFIG_TRACEPOINTS=y
- CONFIG_HW_PERF_EVENTS=y
- CONFIG_EVENT_TRACING=y
- CONFIG_ENABLE_DEFAULT_TRACERS=y
- CONFIG_TRACING=y
- CONFIG_GENERIC_TRACER=y
- CONFIG_TRACING_SUPPORT=y
- CONFIG_FTRACE=y
Userspace filesystem
The filesystem archive File:Arago-console-image-omap3evm.tar.gz contains the necessary power management related utilities (cpufreq, perf) as well as a few useful utilities (bash, sshd).
To add the perf tool to the build, a new recipe must be added and a patch to kernel.bbclass is needed:
- 1: Patch to arago-oe-dev/kernel.bbclass
diff --git a/classes/kernel.bbclass b/classes/kernel.bbclass
index 9555f30..3636d46 100644
--- a/classes/kernel.bbclass
+++ b/classes/kernel.bbclass
@@ -156,7 +156,7 @@ kernel_do_install() {
mkdir -p $kerneldir/include/asm-generic
cp -fR include/asm-generic/* $kerneldir/include/asm-generic/
- for entry in drivers/crypto drivers/media include/generated include/linux include/net include/pcmcia include/media include/acpi include/sound include/video include/scsi include/trace include/mtd include/rdma include/drm include/xen; do
+ for entry in drivers/crypto drivers/media include/generated include/linux include/net include/pcmcia include/media include/acpi include/sound include/video include/scsi include/trace include/mtd include/rdma include/drm include/xen tools/perf lib; do
if [ -d $entry ]; then
mkdir -p $kerneldir/$entry
cp -fR $entry/* $kerneldir/$entry/
- 2: A linux-perf recipe has been added as arago/recipes/linux/linux-perf.bb
DESCRIPTION = "Linux perf tracing tool"
LICENSE = "GPL"
PR = "r0"
DEPENDS = "linux libelf"
RDEPENDS_${PN} = "libelf"
S = "${WORKDIR}"
PACKAGE_ARCH="${MACHINE_ARCH}"
do_compile() {
oe_runmake -C ${STAGING_KERNEL_DIR}/tools/perf NO_LIBPERL=1
}
do_install() {
oe_runmake -C ${STAGING_KERNEL_DIR}/tools/perf DESTDIR="${D}" install
}
do_stage () {
:
}
A few extra packages have been added by adding the following to arago/conf/local.conf
IMAGE_INSTALL += " bash sshd linux-perf cpufrequtils"
The filesystem has been built using Arago's arago-console-image task, a lot of patience and perferably a few GB's of RAM ;p
MACHINE=omap3evm bitbake arago-console-image
Here we go with the login invitation!!
_____ _____ _ _
| _ |___ ___ ___ ___ | _ |___ ___ |_|___ ___| |_
| | _| .'| . | . | | __| _| . | | | -_| _| _|
|__|__|_| |__,|_ |___| |__| |_| |___|_| |___|___|_|
|___| |___|
Arago Project http://arago-project.org omap3evm ttyS0
Arago 2010.07 omap3evm ttyS0
omap3evm login:
More info at http://arago-project.org/wiki/index.php/Setting_Up_Build_Environment.
Note about the kernel: Arago builds the TI PSP kernel for OMAP. The kernel used for PM Debug and Profiling comes from the pm branch of Tony's git tree.
Tracing instructions
Mounting debugfs
# mount -t debugfs nodev /sys/kernel/debug/
Enabling power events
- Enabling all power events
# echo 1 > /sys/kernel/debug/tracing/events/power/enable
- Per event enabling/disabling
# echo 1 > /sys/kernel/debug/tracing/events/power/power_end/enable # echo 0 > /sys/kernel/debug/tracing/events/power/power_end/enable
- Optionally the scheduling events can be enabled, in order to get the details on the CPU activity
# echo 1 > /sys/kernel/debug/tracing/events/sched/enable
Dump trace
- Consuming events from the ring buffer
# cat /sys/kernel/debug/tracing/trace_pipe
- Dump of the buffer, non comsuming
# cat /sys/kernel/debug/tracing/trace
PM specific commands reference
cpuidle
# echo 5 > /sys/devices/platform/serial8250.0/sleep_timeout # echo 5 > /sys/devices/platform/serial8250.1/sleep_timeout # echo 5 > /sys/devices/platform/serial8250.2/sleep_timeout # echo 1 > /sys/kernel/debug/pm_debug/sleep_while_idle # echo 1 > /sys/kernel/debug/pm_debug/enable_off_mode # uptime 06:33:42 up 15:35, load average: 0.88, 0.86, 0.70 # cat /sys/kernel/debug/pm_debug/count usbhost_pwrdm (OFF),OFF:1,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0 sgx_pwrdm (OFF),OFF:1,RET:0,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0 per_pwrdm (ON),OFF:89107,RET:14182,INA:0,ON:103290,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0 dss_pwrdm (OFF),OFF:1,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0 cam_pwrdm (OFF),OFF:1,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0 core_pwrdm (ON),OFF:57359,RET:31776,INA:0,ON:89136,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0,RET-MEMBANK2-OFF:0 neon_pwrdm (ON),OFF:93707,RET:9496,INA:240,ON:103444,RET-LOGIC-OFF:0 mpu_pwrdm (ON),OFF:93707,RET:9496,INA:240,ON:103444,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0 iva2_pwrdm (OFF),OFF:1,RET:1,INA:0,ON:1,RET-LOGIC-OFF:0,RET-MEMBANK1-OFF:0,RET-MEMBANK2-OFF:0,RET-MEMBANK3-OFF:0,RET-MEMBANK4-OFF:0 per_clkdm->per_pwrdm (11) usbhost_clkdm->usbhost_pwrdm (0) cam_clkdm->cam_pwrdm (0) dss_clkdm->dss_pwrdm (0) core_l4_clkdm->core_pwrdm (8) core_l3_clkdm->core_pwrdm (4) d2d_clkdm->core_pwrdm (0) sgx_clkdm->sgx_pwrdm (0) iva2_clkdm->iva2_pwrdm (0) neon_clkdm->neon_pwrdm (0) mpu_clkdm->mpu_pwrdm (0) prm_clkdm->wkup_pwrdm (0) cm_clkdm->core_pwrdm (0) # cat /sys/kernel/debug/pm_debug/time usbhost_pwrdm (OFF),OFF:55673971160888,RET:477995910645,INA:0,ON:687805175 sgx_pwrdm (OFF),OFF:56151967041016,RET:0,INA:0,ON:687866210 per_pwrdm (ON),OFF:54581824584529,RET:889348693710,INA:0,ON:681481628987 dss_pwrdm (OFF),OFF:55673971282959,RET:477967254639,INA:0,ON:716400146 cam_pwrdm (OFF),OFF:55673971282959,RET:477967376709,INA:0,ON:716278076 core_pwrdm (ON),OFF:49292550109366,RET:5300081390665,INA:0,ON:1560023437713 neon_pwrdm (ON),OFF:54474985870229,RET:994778198203,INA:532592784,ON:682358307045 mpu_pwrdm (ON),OFF:54475175811532,RET:994814758256,INA:533477796,ON:682130920677 iva2_pwrdm (OFF),OFF:55673971405029,RET:477995727540,INA:0,ON:687866210
cpufreq/DVFS
# echo 550000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed # echo 500000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed # echo 250000 > /sys/devices/system/cpu/cpu0/cpufreq/scaling_setspeed # echo ondemand > /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor
Trace logs
- Test setup:
- on-demand DVFS,
- 5s timeout on UARTs,
- sleep_while_idle=1,
- enable_off_mode=1.
# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 1606.593781: power_start: type=1 state=1 <= Here starts the idle loop
<idle>-0 [000] 1606.593781: power_domain_transition: mpu_pwrdm state=0 | Power domain transition, MPU domain to target state OFF
<idle>-0 [000] 1606.593781: power_domain_transition: core_pwrdm state=1 | Power domain transition, CORE domain to target state RET
<idle>-0 [000] 1606.593842: power_domain_transition: neon_pwrdm state=0 | Power domain transition, NEON domain to target state OFF
<idle>-0 [000] 1606.593872: clock_disable: gpio6_dbck state=0 | Clock gpio6_dbck disable
<idle>-0 [000] 1606.593903: clock_disable: uart3_ick state=0 | ...
<idle>-0 [000] 1606.593903: clock_disable: uart3_fck state=0 | All UARTs clocks disable
<idle>-0 [000] 1606.593903: clock_disable: uart1_ick state=0 | ...
<idle>-0 [000] 1606.593903: clock_disable: uart1_fck state=0 |
<idle>-0 [000] 1606.593933: clock_disable: uart2_ick state=0 |
<idle>-0 [000] 1606.593933: clock_disable: uart2_fck state=0 |
<idle>-0 [000] 1606.640106: clock_enable: uart1_ick state=1 | All UARTs clocks enable
<idle>-0 [000] 1606.640137: clock_enable: uart1_fck state=1 | ...
<idle>-0 [000] 1606.640228: clock_enable: uart2_ick state=1 |
<idle>-0 [000] 1606.640228: clock_enable: uart2_fck state=1 |
<idle>-0 [000] 1606.640259: clock_enable: gpio6_dbck state=1 |
<idle>-0 [000] 1606.640259: clock_enable: uart3_ick state=1 |
<idle>-0 [000] 1606.640259: clock_enable: uart3_fck state=1 |
<idle>-0 [000] 1606.640564: power_end: dummy=65535 <= Here ends the idle loop
kondemand/0-404 [000] 1606.640717: power_frequency: type=2 state=125000000 <= DVFS event triggered by the on-demand governor, MPU freq=125MHz
kondemand/0-404 [000] 1606.640778: clock_set_rate: dpll3_m2_ck state=166000000 | Clock tree updates: dpll3_m2 to 166MHz
kondemand/0-404 [000] 1606.641144: clock_set_rate: dpll1_ck state=125000000 | dpll1 to 125MHz
kondemand/0-404 [000] 1606.641357: clock_set_rate: dpll2_ck state=90000000 | dpll2 to 90MHz
<idle>-0 [000] 1606.642914: power_start: type=1 state=1 <= End of DVFS, idle loop entry
<idle>-0 [000] 1606.642914: power_domain_transition: mpu_pwrdm state=0
<idle>-0 [000] 1606.642944: power_domain_transition: core_pwrdm state=3
<idle>-0 [000] 1606.643127: power_domain_transition: neon_pwrdm state=0
<idle>-0 [000] 1606.643158: clock_disable: gpio6_dbck state=0
<idle>-0 [000] 1606.643188: power_domain_transition: per_pwrdm state=1
<idle>-0 [000] 1606.643219: clock_disable: uart3_ick state=0
<idle>-0 [000] 1606.643250: clock_disable: uart3_fck state=0
<idle>-0 [000] 1606.654236: clock_enable: gpio6_dbck state=1
<idle>-0 [000] 1606.654266: clock_enable: uart3_ick state=1
<idle>-0 [000] 1606.654266: clock_enable: uart3_fck state=1
<idle>-0 [000] 1606.654297: power_domain_transition: per_pwrdm state=0
<idle>-0 [000] 1606.654846: power_end: dummy=65535
<idle>-0 [000] 1606.677887: power_start: type=1 state=1
<idle>-0 [000] 1606.677948: power_domain_transition: mpu_pwrdm state=3
<idle>-0 [000] 1606.677948: power_domain_transition: core_pwrdm state=3
<idle>-0 [000] 1606.678131: power_domain_transition: neon_pwrdm state=3
<idle>-0 [000] 1606.678192: clock_disable: gpio6_dbck state=0
<idle>-0 [000] 1606.678223: power_domain_transition: per_pwrdm state=1
<idle>-0 [000] 1606.678253: clock_disable: uart3_ick state=0
<idle>-0 [000] 1606.678253: clock_disable: uart3_fck state=0
<idle>-0 [000] 1606.678314: clock_enable: gpio6_dbck state=1
<idle>-0 [000] 1606.678345: clock_enable: uart3_ick state=1
<idle>-0 [000] 1606.678375: clock_enable: uart3_fck state=1
<idle>-0 [000] 1606.678406: power_domain_transition: per_pwrdm state=0
<idle>-0 [000] 1606.678925: power_end: dummy=65535
<idle>-0 [000] 1606.682739: power_start: type=1 state=1
<idle>-0 [000] 1606.682770: power_domain_transition: mpu_pwrdm state=0
<idle>-0 [000] 1606.682800: power_domain_transition: core_pwrdm state=1
<idle>-0 [000] 1606.682953: power_domain_transition: neon_pwrdm state=0
<idle>-0 [000] 1606.683014: clock_disable: gpio6_dbck state=0
<idle>-0 [000] 1606.683105: clock_disable: uart3_ick state=0
<idle>-0 [000] 1606.683105: clock_disable: uart3_fck state=0
<idle>-0 [000] 1606.683136: clock_disable: uart1_ick state=0
<idle>-0 [000] 1606.683167: clock_disable: uart1_fck state=0
<idle>-0 [000] 1606.683197: clock_disable: uart2_ick state=0
<idle>-0 [000] 1606.683197: clock_disable: uart2_fck state=0
<idle>-0 [000] 1606.767761: clock_enable: uart1_ick state=1
<idle>-0 [000] 1606.767822: clock_enable: uart1_fck state=1
<idle>-0 [000] 1606.767975: clock_enable: uart2_ick state=1
<idle>-0 [000] 1606.768005: clock_enable: uart2_fck state=1
<idle>-0 [000] 1606.768066: clock_enable: gpio6_dbck state=1
<idle>-0 [000] 1606.768097: clock_enable: uart3_ick state=1
<idle>-0 [000] 1606.768097: clock_enable: uart3_fck state=1
<idle>-0 [000] 1606.768921: power_end: dummy=65535
...
kondemand/0-404 [000] 2043.842743: power_frequency: type=2 state=600000000
kondemand/0-404 [000] 2043.843628: clock_set_rate: dpll1_ck state=600000000
kondemand/0-404 [000] 2043.843903: clock_set_rate: dpll2_ck state=430000000
kondemand/0-404 [000] 2043.844299: clock_set_rate: dpll3_m2_ck state=332000000
- Test setup:
- on-demand DVFS,
- 5s timeout on UARTs,
- sleep_while_idle=1,
- enable_off_mode=1,
- triggering a suspend.
# echo mem > /sys/power/state
PM: Syncing filesystems ... done.
Freezing user space processes ... (elapsed 0.01 seconds) done.
Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
Suspending console(s) (use no_console_suspend to debug)
PM: suspend of devices complete after 8.514 msecs
PM: late suspend of devices complete after 3.112 msecs
Successfully put all powerdomains to target state
PM: early resume of devices complete after 0.457 msecs
PM: resume of devices complete after 9.918 msecs
Restarting tasks ... done.
# cat /sys/kernel/debug/tracing/trace
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<idle>-0 [000] 2153.873413: power_start: type=1 state=1 <= Here starts the idle loop
<idle>-0 [000] 2153.873444: power_domain_transition: mpu_pwrdm state=3 | Power domain transition, MPU domain to target state ON
<idle>-0 [000] 2153.873474: power_domain_transition: core_pwrdm state=3 | Power domain transition, CORE domain to target state ON
<idle>-0 [000] 2153.873688: power_domain_transition: neon_pwrdm state=3 | Power domain transition, NEON domain to target state ON
<idle>-0 [000] 2153.873749: clock_disable: gpio6_dbck state=0 | Clock gpio6_dbck disable
<idle>-0 [000] 2153.873749: power_domain_transition: per_pwrdm state=1 | ...
<idle>-0 [000] 2153.873779: clock_disable: uart3_ick state=0 |
<idle>-0 [000] 2153.873810: clock_disable: uart3_fck state=0 |
<idle>-0 [000] 2153.889099: clock_enable: gpio6_dbck state=1 |
<idle>-0 [000] 2153.889130: clock_enable: uart3_ick state=1 | Clock uart3_ick enable
<idle>-0 [000] 2153.889130: clock_enable: uart3_fck state=1 |
<idle>-0 [000] 2153.889160: power_domain_transition: per_pwrdm state=0 |
<idle>-0 [000] 2153.889740: power_end: dummy=65535 <= Here ends the idle loop
sh-1148 [000] 2153.915649: power_domain_transition: mpu_pwrdm state=0 <= Suspend entry: all power domains to OFF
sh-1148 [000] 2153.915710: power_domain_transition: neon_pwrdm state=0 |
sh-1148 [000] 2153.915741: power_domain_transition: core_pwrdm state=0 |
sh-1148 [000] 2153.915771: power_start: type=3 state=0 | state=0: suspend entry
sh-1148 [000] 2153.915985: power_domain_transition: neon_pwrdm state=0 | ...
sh-1148 [000] 2153.916016: clock_disable: gpio6_dbck state=0 | Clocks disable
sh-1148 [000] 2153.916138: clock_disable: uart3_ick state=0 | ...
sh-1148 [000] 2153.916138: clock_disable: uart3_fck state=0 |
sh-1148 [000] 2153.916199: clock_disable: uart1_ick state=0 |
sh-1148 [000] 2153.916199: clock_disable: uart1_fck state=0 |
sh-1148 [000] 2153.916229: clock_disable: uart2_ick state=0 |
sh-1148 [000] 2153.916260: clock_disable: uart2_fck state=0 | Board frozen for ~10s ...
sh-1148 [000] 2164.128876: clock_enable: hsotgusb_ick state=1 | Resuming from deep sleep, enabling clocks
sh-1148 [000] 2164.128967: clock_disable: hsotgusb_ick state=0 | ...
sh-1148 [000] 2164.128998: clock_enable: uart1_ick state=1 |
sh-1148 [000] 2164.129028: clock_enable: uart1_fck state=1 |
sh-1148 [000] 2164.129150: clock_enable: uart2_ick state=1 |
sh-1148 [000] 2164.129181: clock_enable: uart2_fck state=1 |
sh-1148 [000] 2164.129272: clock_enable: gpio6_dbck state=1 |
sh-1148 [000] 2164.129303: clock_enable: uart3_ick state=1 |
sh-1148 [000] 2164.129333: clock_enable: uart3_fck state=1 |
sh-1148 [000] 2164.129608: power_domain_transition: mpu_pwrdm state=3 | Power domain transition, MPU domain to target state ON
sh-1148 [000] 2164.129639: power_domain_transition: neon_pwrdm state=3 | Power domain transition, NEON domain to target state ON
sh-1148 [000] 2164.129669: power_domain_transition: core_pwrdm state=3 | Power domain transition, CORE domain to target state ON.
<idle>-0 [000] 2164.270233: power_start: type=1 state=1 <= System fully awake, idle loop entry
<idle>-0 [000] 2164.270294: power_domain_transition: mpu_pwrdm state=3 | ...
<idle>-0 [000] 2164.270325: power_domain_transition: core_pwrdm state=3
<idle>-0 [000] 2164.270538: power_domain_transition: neon_pwrdm state=3
<idle>-0 [000] 2164.270569: clock_disable: gpio6_dbck state=0
<idle>-0 [000] 2164.270599: power_domain_transition: per_pwrdm state=1
<idle>-0 [000] 2164.270660: clock_disable: uart3_ick state=0
<idle>-0 [000] 2164.270660: clock_disable: uart3_fck state=0
<idle>-0 [000] 2164.275848: clock_enable: gpio6_dbck state=1
<idle>-0 [000] 2164.275879: clock_enable: uart3_ick state=1
<idle>-0 [000] 2164.275909: clock_enable: uart3_fck state=1
<idle>-0 [000] 2164.275940: power_domain_transition: per_pwrdm state=0 | ...
<idle>-0 [000] 2164.276642: power_end: dummy=65535 <= Idle loop ends
...
<idle>-0 [000] 2167.608856: power_start: type=1 state=1
<idle>-0 [000] 2167.608887: power_domain_transition: mpu_pwrdm state=3
<idle>-0 [000] 2167.608887: power_domain_transition: core_pwrdm state=3
<idle>-0 [000] 2167.609131: power_domain_transition: neon_pwrdm state=3
<idle>-0 [000] 2167.609161: clock_disable: gpio6_dbck state=0
<idle>-0 [000] 2167.609192: power_domain_transition: per_pwrdm state=1
<idle>-0 [000] 2167.609222: clock_disable: uart3_ick state=0
<idle>-0 [000] 2167.609253: clock_disable: uart3_fck state=0
<idle>-0 [000] 2167.689941: clock_enable: gpio6_dbck state=1
<idle>-0 [000] 2167.689972: clock_enable: uart3_ick state=1
<idle>-0 [000] 2167.690002: clock_enable: uart3_fck state=1
<idle>-0 [000] 2167.690033: power_domain_transition: per_pwrdm state=0
<idle>-0 [000] 2167.690613: power_end: dummy=65535
...
!! Need to wait ~6s for idle to change states from ON to RET/OFF ??
...
<idle>-0 [000] 2170.605103: power_start: type=1 state=1
<idle>-0 [000] 2170.605133: power_domain_transition: mpu_pwrdm state=0
<idle>-0 [000] 2170.605133: power_domain_transition: core_pwrdm state=0
<idle>-0 [000] 2170.605316: power_domain_transition: neon_pwrdm state=0
<idle>-0 [000] 2170.605347: clock_disable: gpio6_dbck state=0
<idle>-0 [000] 2170.605438: clock_disable: uart3_ick state=0
<idle>-0 [000] 2170.605469: clock_disable: uart3_fck state=0
<idle>-0 [000] 2170.605499: clock_disable: uart1_ick state=0
<idle>-0 [000] 2170.605499: clock_disable: uart1_fck state=0
<idle>-0 [000] 2170.605530: clock_disable: uart2_ick state=0
<idle>-0 [000] 2170.605560: clock_disable: uart2_fck state=0
<idle>-0 [000] 2171.031067: clock_enable: hsotgusb_ick state=1
<idle>-0 [000] 2171.031158: clock_disable: hsotgusb_ick state=0
<idle>-0 [000] 2171.031189: clock_enable: uart1_ick state=1
<idle>-0 [000] 2171.031189: clock_enable: uart1_fck state=1
<idle>-0 [000] 2171.031311: clock_enable: uart2_ick state=1
<idle>-0 [000] 2171.031311: clock_enable: uart2_fck state=1
<idle>-0 [000] 2171.031403: clock_enable: gpio6_dbck state=1
<idle>-0 [000] 2171.031433: clock_enable: uart3_ick state=1
<idle>-0 [000] 2171.031433: clock_enable: uart3_fck state=1
<idle>-0 [000] 2171.032379: power_end: dummy=65535
<idle>-0 [000] 2171.033203: power_start: type=1 state=1
<idle>-0 [000] 2171.033234: power_domain_transition: mpu_pwrdm state=0
<idle>-0 [000] 2171.033264: power_domain_transition: core_pwrdm state=3
<idle>-0 [000] 2171.033447: power_domain_transition: neon_pwrdm state=0
<idle>-0 [000] 2171.033478: clock_disable: gpio6_dbck state=0
<idle>-0 [000] 2171.033508: power_domain_transition: per_pwrdm state=1
<idle>-0 [000] 2171.033539: clock_disable: uart3_ick state=0
<idle>-0 [000] 2171.033569: clock_disable: uart3_fck state=0
<idle>-0 [000] 2171.138824: clock_enable: gpio6_dbck state=1
<idle>-0 [000] 2171.138916: clock_enable: uart3_ick state=1
<idle>-0 [000] 2171.138916: clock_enable: uart3_fck state=1
<idle>-0 [000] 2171.138977: power_domain_transition: per_pwrdm state=0
<idle>-0 [000] 2171.139832: power_end: dummy=65535
Trace parsing and GUIs
pytimechart UI
Intro
PyTimechart is a tool that parses ftrace text traces, and display them with the help of a dynamic plot framework, Chaco (http://code.enthought.com/chaco/ ) The GUI makes the best it can to ease the browsing of huge traces.
Download and installation
Cf. http://gitorious.com/pytimechart/ for download of the latest code.
pytimechart depends on python-chaco. It can be installed on recent distributions (tested on Debian 6 and Ubuntu 10.04).
Usage
Cf. http://gitorious.com/pytimechart/pages/Home for usage instructions.
Screenshot
A good screenshot says it all!
- This chart shows the scheduling details, the ondemand cpufreq governor changing the CPU frequencies and cpuidle changing the CPU C-states (5 is for C6: MPU OFF + Core RET)
- Similar chart, with the clock and power_domain events added
- Close-up of the previous chart
l-o kernel patch/RFC
diff --git a/arch/arm/mach-omap2/cpuidle34xx.c b/arch/arm/mach-omap2/cpuidle34xx.c
index 3d3d035..6113bd9 100644
--- a/arch/arm/mach-omap2/cpuidle34xx.c
+++ b/arch/arm/mach-omap2/cpuidle34xx.c
@@ -24,6 +24,7 @@
#include <linux/sched.h>
#include <linux/cpuidle.h>
+#include <trace/events/power.h>
#include <plat/prcm.h>
#include <plat/irqs.h>
@@ -130,6 +131,7 @@ static int omap3_enter_idle(struct cpuidle_device *dev,
local_irq_disable();
local_fiq_disable();
+ trace_power_start(POWER_CSTATE, cx->type, smp_processor_id());
pwrdm_set_next_pwrst(mpu_pd, mpu_state);
pwrdm_set_next_pwrst(core_pd, core_state);
diff --git a/arch/arm/mach-omap2/pm34xx.c b/arch/arm/mach-omap2/pm34xx.c
index f25bc3d..7bf8a87 100644
--- a/arch/arm/mach-omap2/pm34xx.c
+++ b/arch/arm/mach-omap2/pm34xx.c
@@ -28,6 +28,7 @@
#include <linux/clk.h>
#include <linux/delay.h>
#include <linux/slab.h>
+#include <trace/events/power.h>
#include <plat/sram.h>
#include <plat/clockdomain.h>
@@ -588,6 +589,7 @@ static void omap3_pm_idle(void)
if (omap_irq_pending() || need_resched())
goto out;
+ trace_power_start(POWER_CSTATE, 1, smp_processor_id());
omap_sram_idle();
out:
@@ -628,6 +630,8 @@ static int omap3_pm_suspend(void)
omap2_pm_wakeup_on_timer(wakeup_timer_seconds,
wakeup_timer_milliseconds);
+ trace_power_start(POWER_SSTATE, 1, smp_processor_id());
+
/* Read current next_pwrsts */
list_for_each_entry(pwrst, &pwrst_list, node)
pwrst->saved_state = pwrdm_read_next_pwrst(pwrst->pwrdm);
diff --git a/arch/arm/mach-omap2/powerdomain.c b/arch/arm/mach-omap2/powerdomain.c
index 6527ec3..73cbe9a 100644
--- a/arch/arm/mach-omap2/powerdomain.c
+++ b/arch/arm/mach-omap2/powerdomain.c
@@ -23,6 +23,7 @@
#include <linux/errno.h>
#include <linux/err.h>
#include <linux/io.h>
+#include <trace/events/power.h>
#include <asm/atomic.h>
@@ -440,6 +441,8 @@ int pwrdm_set_next_pwrst(struct powerdomain *pwrdm, u8 pwrst)
pr_debug("powerdomain: setting next powerstate for %s to %0x\n",
pwrdm->name, pwrst);
+ trace_power_domain_target(pwrdm->name, pwrst, smp_processor_id());
+
prm_rmw_mod_reg_bits(OMAP_POWERSTATE_MASK,
(pwrst << OMAP_POWERSTATE_SHIFT),
pwrdm->prcm_offs, pwrstctrl_reg_offs);
diff --git a/arch/arm/plat-omap/clock.c b/arch/arm/plat-omap/clock.c
index 7190cbd..d6518f5 100644
--- a/arch/arm/plat-omap/clock.c
+++ b/arch/arm/plat-omap/clock.c
@@ -21,6 +21,7 @@
#include <linux/cpufreq.h>
#include <linux/debugfs.h>
#include <linux/io.h>
+#include <trace/events/power.h>
#include <plat/clock.h>
@@ -43,8 +44,10 @@ int clk_enable(struct clk *clk)
return -EINVAL;
spin_lock_irqsave(&clockfw_lock, flags);
- if (arch_clock->clk_enable)
+ if (arch_clock->clk_enable) {
+ trace_clock_enable(clk->name, 1, smp_processor_id());
ret = arch_clock->clk_enable(clk);
+ }
spin_unlock_irqrestore(&clockfw_lock, flags);
return ret;
@@ -66,8 +69,10 @@ void clk_disable(struct clk *clk)
goto out;
}
- if (arch_clock->clk_disable)
+ if (arch_clock->clk_disable) {
+ trace_clock_disable(clk->name, 0, smp_processor_id());
arch_clock->clk_disable(clk);
+ }
out:
spin_unlock_irqrestore(&clockfw_lock, flags);
@@ -120,8 +125,10 @@ int clk_set_rate(struct clk *clk, unsigned long rate)
return ret;
spin_lock_irqsave(&clockfw_lock, flags);
- if (arch_clock->clk_set_rate)
+ if (arch_clock->clk_set_rate) {
+ trace_clock_set_rate(clk->name, rate, smp_processor_id());
ret = arch_clock->clk_set_rate(clk, rate);
+ }
if (ret == 0) {
if (clk->recalc)
clk->rate = clk->recalc(clk);
diff --git a/arch/arm/plat-omap/cpu-omap.c b/arch/arm/plat-omap/cpu-omap.c
index df08829..cc4e41f 100644
--- a/arch/arm/plat-omap/cpu-omap.c
+++ b/arch/arm/plat-omap/cpu-omap.c
@@ -25,6 +25,7 @@
#include <linux/err.h>
#include <linux/clk.h>
#include <linux/io.h>
+#include <trace/events/power.h>
#include <mach/hardware.h>
#include <plat/clock.h>
@@ -116,8 +117,10 @@ static int omap_target(struct cpufreq_policy *policy,
cpufreq_notify_transition(&freqs, CPUFREQ_POSTCHANGE);
#elif defined(CONFIG_ARCH_OMAP3) && !defined(CONFIG_OMAP_PM_NONE)
freq = target_freq * 1000;
- if (opp_find_freq_ceil(mpu_dev, &freq))
+ if (opp_find_freq_ceil(mpu_dev, &freq)) {
+ trace_power_frequency(POWER_PSTATE, freq, smp_processor_id());
omap_pm_cpu_set_freq(freq);
+ }
#endif
return ret;
}
diff --git a/include/trace/events/power.h b/include/trace/events/power.h
index 35a2a6e..090fcfd 100644
--- a/include/trace/events/power.h
+++ b/include/trace/events/power.h
@@ -13,9 +13,14 @@ enum {
POWER_NONE = 0,
POWER_CSTATE = 1,
POWER_PSTATE = 2,
+ POWER_SSTATE = 3,
};
#endif
+/*
+ * The power events are used for cpuidle & suspend (power_start, power_end)
+ * and for cpufreq (power_frequency)
+ */
DECLARE_EVENT_CLASS(power,
TP_PROTO(unsigned int type, unsigned int state, unsigned int cpu_id),
@@ -70,6 +75,85 @@ TRACE_EVENT(power_end,
);
+/*
+ * The clock events are used for clock enable/disable and for
+ * clock rate change
+ */
+DECLARE_EVENT_CLASS(clock,
+
+ TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(name, state, cpu_id),
+
+ TP_STRUCT__entry(
+ __string( name, name )
+ __field( u64, state )
+ __field( u64, cpu_id )
+ ),
+
+ TP_fast_assign(
+ __assign_str(name, name);
+ __entry->state = state;
+ __entry->cpu_id = cpu_id;
+ ),
+
+ TP_printk("%s state=%lu cpu_id=%lu", __get_str(name),
+ (unsigned long)__entry->state, (unsigned long)__entry->cpu_id)
+);
+
+DEFINE_EVENT(clock, clock_enable,
+
+ TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(name, state, cpu_id)
+);
+
+DEFINE_EVENT(clock, clock_disable,
+
+ TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(name, state, cpu_id)
+);
+
+DEFINE_EVENT(clock, clock_set_rate,
+
+ TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(name, state, cpu_id)
+);
+
+/*
+ * The power domain events are used for power domains transitions
+ */
+DECLARE_EVENT_CLASS(power_domain,
+
+ TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(name, state, cpu_id),
+
+ TP_STRUCT__entry(
+ __string( name, name )
+ __field( u64, state )
+ __field( u64, cpu_id )
+ ),
+
+ TP_fast_assign(
+ __assign_str(name, name);
+ __entry->state = state;
+ __entry->cpu_id = cpu_id;
+),
+
+ TP_printk("%s state=%lu cpu_id=%lu", __get_str(name),
+ (unsigned long)__entry->state, (unsigned long)__entry->cpu_id)
+);
+
+DEFINE_EVENT(power_domain, power_domain_target,
+
+ TP_PROTO(const char *name, unsigned int state, unsigned int cpu_id),
+
+ TP_ARGS(name, state, cpu_id)
+);
+
#endif /* _TRACE_POWER_H */
/* This part must be outside protection */