Power Management Device Latencies Measurement
From OMAPpedia
Revision as of 16:11, 11 November 2010
PM devices constraints measurements
Introduction
To correctly implement device latency constraint support, accurate measurements of the overhead of the system low power modes are needed:
- The device wake-up latency is the total amount of time taken for a device to become accessible again, i.e. the time for the device to wake up from a given low power mode.
- It includes turning the clocks on, bringing the clockdomain out of the inactive state, and bringing the power domain out of the RET or OFF state (with context restore).
- This constraint mainly governs the deepest device idle state (only clocks cut, clockdomain inactive, power domain in RET or OFF) acceptable to the device at any given time.
This wiki page details the measurement setup and the results. The latency data is to be fed into the constraints latency patches.
Kernel patches & build
Some kernel changes are required for the kernel instrumentation. The patches and config are attached to this page.
- Starting point: linux-omap master branch at commit a83d12a47c9a8c78a184910150797045d69a1570: Linux-omap rebuilt: Updated to v2.6.36, add 24xx uart fix
- khilman's patch to fix the low power mode: 1e91c5f70da4d7d108cbcf026164d001e0e688b3: OMAP: bus-level PM: enable use of runtime PM API for suspend/resume
- Experimental workaround to allow the system to idle: a5a24bc82d3f98758f8fdd0cb0af71012b735477 OMAP-cpuidle-workaround
- Tracing instrumentation patches
- 2f1544b4db9e164b6954ed0888b0d6a6c5dcf8d4 tracing, perf: add more power related events
- d681364bf20082da41b0afa77eadba93b187f695 perf: add suspend tracepoint calls
- 07d6076194cd054382ef216d0e09ed597744a49a OMAP3: clean up ASM idle code
- b893c12126419f1cbc1bc692d7daa84830ee68ea OMAP3: add low power entry/exit latency trace points
- GPIO instrumentation
- 5c2a88d6c997fc2216bb22a95289fb6e9a6acede OMAP3: Add HW tracing code
- GPT instrumentation
- a6274e11bf0a4ad205318b611df71d98048e1fc8 OMAP3: Use GPT12 timer for low level PM instrumentation
- Kernel config for Beagleboard
Changes: DSS for Beagle, Initramfs Busybox root FS
HW traces details
The trace points are connected on Beagleboard rev B7.
- Trace A: on the USER button, at the connection to R36. This signal is the system wake-up event. The trigger is set on the rising edge of the signal.
- Trace B: USR1 LED (GPIO_149). This signal is set at the end of omap_sram_idle, along with trace_power_start(POWER_WAKEN, 7, smp_processor_id());. This makes it possible to synchronize the time between the HW and the SW traces.
Warning: the HW power supplies and external clocks are not cut off in this config (linux-omap, abbreviated l-o, has no support for System OFF), so the measured HW latencies are lower than the real ones. The HW measurements need to be repeated as soon as l-o supports System OFF. The measurements from TI are used for the real HW latency.
Here are some scope screenshots showing the time delta between the wake-up event (USER button press, trace A) and the end of omap_sram_idle (USR1 LED, trace B).
For RET mode, showing a delta of 408us:
For OFF mode, showing a delta of 2700us:
GPT tracer
Since GPT12 is used as a wake-up source from the idle mode, it can be used to track the timings during the wake-up sequence. A patch is needed to let the timer keep counting after it has overflowed and woken up the system.
The GPT runs on the 32 kHz clock, so the resolution is limited to 30.518us. Given the latencies to measure for OFF mode, this resolution is acceptable.
Four GPT measurements are performed during the wake-up:
- At the wake-up event: the GPT overflows and the counter value is 0,
- At the time the WFI instruction completes, before the MPU context restore code (in ASM),
- At the same time as the SW trace points 1 and 7 (one sample each). This makes it possible to synchronize the HW and SW traces.
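The effect of the timer resolution on the reported values can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the "32 kHz" clock is the usual 32.768 kHz one (which yields the 30.518us tick quoted above), and the helper names are hypothetical. It converts the GPT12 deltas reported in the trace outputs on this page back to timer ticks.

```python
# Assumption: the GPT12 "32 kHz" clock is 32.768 kHz, giving a 30.518 us tick.
GPT_CLOCK_HZ = 32768


def ticks_to_us(ticks):
    """Convert a number of GPT ticks to microseconds."""
    return ticks * 1_000_000 / GPT_CLOCK_HZ


def us_to_ticks(us):
    """Convert a duration in microseconds to the nearest whole GPT tick."""
    return round(us * GPT_CLOCK_HZ / 1_000_000)


tick_us = ticks_to_us(1)
print(f"one tick = {tick_us:.3f} us")

# GPT12 deltas (in us) as dumped in the RET and OFF mode trace outputs.
reported_us = (183, 244, 1495, 915, 488)
ticks = [us_to_ticks(us) for us in reported_us]
for us, t in zip(reported_us, ticks):
    print(f"{us} us ~ {t} ticks ({ticks_to_us(t):.1f} us)")
```

Each reported delta corresponds to a whole number of ticks, so any GPT12 figure carries an uncertainty of about one tick (roughly 30us).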
SW trace usage
Enable the power events and dump the trace:
# echo 1 > /debug/tracing/events/power/enable
# cat /debug/tracing/trace_pipe &
Enable the system idle in RET mode:
# echo 5 > /sys/devices/platform/omap/omap-hsuart.0/sleep_timeout
# echo 5 > /sys/devices/platform/omap/omap-hsuart.1/sleep_timeout
# echo 5 > /sys/devices/platform/omap/omap-hsuart.2/sleep_timeout
# echo 0 > /debug/pm_debug/enable_off_mode
# echo 1 > /debug/pm_debug/sleep_while_idle
Trace output:
[ 62.311462] *** GPT12 wake-up (HW wake-up, ASM restore, delta trace1-7): 183, 0, 244 us => Dump of GPT timing deltas
<idle>-0 [000] 62.241608: power_start: type=1 state=1 cpu_id=0 => Idle start
<idle>-0 [000] 62.241608: power_start: type=4 state=1 cpu_id=0 => First suspend SW trace in omap_sram_idle
<idle>-0 [000] 62.241638: power_start: type=4 state=2 cpu_id=0 => ...
<idle>-0 [000] 62.241669: power_start: type=4 state=3 cpu_id=0
<idle>-0 [000] 62.241699: power_domain_target: name=neon_pwrdm state=1 cpu_id=0
<idle>-0 [000] 62.241699: power_start: type=4 state=4 cpu_id=0
<idle>-0 [000] 62.241699: clock_disable: name=uart3_fck state=0 cpu_id=0
<idle>-0 [000] 62.241730: power_start: type=4 state=5 cpu_id=0
<idle>-0 [000] 62.241730: clock_disable: name=uart1_fck state=0 cpu_id=0
<idle>-0 [000] 62.241730: clock_disable: name=uart2_fck state=0 cpu_id=0
<idle>-0 [000] 62.241760: power_start: type=4 state=6 cpu_id=0
<idle>-0 [000] 62.241760: power_start: type=4 state=7 cpu_id=0
<idle>-0 [000] 62.241760: power_start: type=4 state=8 cpu_id=0 => Last suspend SW trace in omap_sram_idle
<idle>-0 [000] 62.311188: power_start: type=5 state=1 cpu_id=0 => First resume SW trace in omap_sram_idle
<idle>-0 [000] 62.311188: power_start: type=5 state=2 cpu_id=0 => ...
<idle>-0 [000] 62.311188: power_start: type=5 state=3 cpu_id=0
<idle>-0 [000] 62.311188: power_start: type=5 state=4 cpu_id=0
<idle>-0 [000] 62.311218: clock_enable: name=uart1_fck state=1 cpu_id=0
<idle>-0 [000] 62.311310: clock_enable: name=uart2_fck state=1 cpu_id=0
<idle>-0 [000] 62.311310: power_start: type=5 state=5 cpu_id=0
<idle>-0 [000] 62.311340: clock_enable: name=uart3_fck state=1 cpu_id=0
<idle>-0 [000] 62.311340: power_start: type=5 state=6 cpu_id=0
<idle>-0 [000] 62.311432: power_start: type=5 state=7 cpu_id=0 => Last resume SW trace in omap_sram_idle
<idle>-0 [000] 62.319885: power_end: cpu_id=0 => Idle end
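The software-side durations can be derived from such a trace by subtracting the timestamps of the first and last trace point of each type. The following is a minimal Python sketch, not part of the attached patches. It assumes the convention visible in the annotated output above (type=4 for suspend trace points, type=5 for resume trace points) and six-digit microsecond timestamps; the function names are hypothetical.

```python
import re

# Matches e.g. "62.241608: power_start: type=4 state=1"
LINE_RE = re.compile(r"(\d+)\.(\d{6}): power_start: type=(\d+) state=(\d+)")


def power_start_times(lines):
    """Map (type, state) to the timestamp in integer microseconds."""
    times = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m:
            sec, usec, typ, state = (int(g) for g in m.groups())
            times[(typ, state)] = sec * 1_000_000 + usec
    return times


def span_us(times, typ):
    """Time between the first and last trace point of the given type."""
    states = [s for (t, s) in times if t == typ]
    return times[(typ, max(states))] - times[(typ, min(states))]


# First and last suspend/resume trace points copied from the RET trace above.
ret_trace = [
    "<idle>-0 [000] 62.241608: power_start: type=4 state=1 cpu_id=0",
    "<idle>-0 [000] 62.241760: power_start: type=4 state=8 cpu_id=0",
    "<idle>-0 [000] 62.311188: power_start: type=5 state=1 cpu_id=0",
    "<idle>-0 [000] 62.311432: power_start: type=5 state=7 cpu_id=0",
]

times = power_start_times(ret_trace)
suspend_us = span_us(times, 4)
resume_us = span_us(times, 5)
print(f"suspend path in omap_sram_idle: {suspend_us} us")
print(f"resume path in omap_sram_idle: {resume_us} us")
```

For the RET trace this yields 152us for the suspend path and 244us for the resume path; the latter agrees with the "delta trace1-7" value in the GPT12 dump line.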
Enable the system idle in OFF mode:
# echo 5 > /sys/devices/platform/omap/omap-hsuart.0/sleep_timeout
# echo 5 > /sys/devices/platform/omap/omap-hsuart.1/sleep_timeout
# echo 5 > /sys/devices/platform/omap/omap-hsuart.2/sleep_timeout
# echo 1 > /debug/pm_debug/enable_off_mode
# echo 1 > /debug/pm_debug/sleep_while_idle
Trace output:
/ # echo 1 > /debug/pm_debug/enable_off_mode
/ #
sh-503 [000] 70.862366: power_domain_target: name=iva2_pwrdm state=0 cpu_id=0
sh-503 [000] 70.862396: power_domain_target: name=mpu_pwrdm state=0 cpu_id=0
sh-503 [000] 70.862396: power_domain_target: name=neon_pwrdm state=0 cpu_id=0
sh-503 [000] 70.862396: power_domain_target: name=core_pwrdm state=0 cpu_id=0
sh-503 [000] 70.862427: power_domain_target: name=cam_pwrdm state=0 cpu_id=0
sh-503 [000] 70.862457: power_domain_target: name=dss_pwrdm state=0 cpu_id=0
sh-503 [000] 70.862488: power_domain_target: name=per_pwrdm state=0 cpu_id=0
sh-503 [000] 70.862488: power_domain_target: name=usbhost_pwrdm state=0 cpu_id=0
/ #
[ 557.240020] *** GPT12 wake-up (HW wake-up, ASM restore, delta trace1-7): 1495, 915, 488 us => Dump of GPT timing deltas
<idle>-0 [000] 557.156769: power_start: type=1 state=1 cpu_id=0 => Idle start
<idle>-0 [000] 557.156769: power_start: type=4 state=1 cpu_id=0 => First suspend SW trace in omap_sram_idle
<idle>-0 [000] 557.156769: power_start: type=4 state=2 cpu_id=0 => ...
<idle>-0 [000] 557.156830: power_start: type=4 state=3 cpu_id=0
<idle>-0 [000] 557.156830: power_domain_target: name=neon_pwrdm state=0 cpu_id=0
<idle>-0 [000] 557.156830: power_start: type=4 state=4 cpu_id=0
<idle>-0 [000] 557.156860: clock_disable: name=uart3_fck state=0 cpu_id=0
<idle>-0 [000] 557.156891: power_start: type=4 state=5 cpu_id=0
<idle>-0 [000] 557.156891: clock_disable: name=uart1_fck state=0 cpu_id=0
<idle>-0 [000] 557.156921: clock_disable: name=uart2_fck state=0 cpu_id=0
<idle>-0 [000] 557.157013: power_start: type=4 state=6 cpu_id=0
<idle>-0 [000] 557.157013: power_start: type=4 state=7 cpu_id=0
<idle>-0 [000] 557.157898: power_start: type=4 state=8 cpu_id=0 => Last suspend SW trace in omap_sram_idle
<idle>-0 [000] 557.236084: power_start: type=5 state=1 cpu_id=0 => First resume SW trace in omap_sram_idle
<idle>-0 [000] 557.236145: power_start: type=5 state=2 cpu_id=0 => ...
<idle>-0 [000] 557.236206: power_start: type=5 state=3 cpu_id=0
<idle>-0 [000] 557.236267: power_start: type=5 state=4 cpu_id=0
<idle>-0 [000] 557.236389: clock_enable: name=uart1_fck state=1 cpu_id=0
<idle>-0 [000] 557.236450: clock_enable: name=uart2_fck state=1 cpu_id=0
<idle>-0 [000] 557.236450: power_start: type=5 state=5 cpu_id=0
<idle>-0 [000] 557.236481: clock_enable: name=uart3_fck state=1 cpu_id=0
<idle>-0 [000] 557.236511: power_start: type=5 state=6 cpu_id=0
<idle>-0 [000] 557.236572: power_start: type=5 state=7 cpu_id=0 => Last resume SW trace in omap_sram_idle
<idle>-0 [000] 557.248718: power_end: cpu_id=0 => Idle end
Timings results
Results interpretation
The low power transition sequence is pictured as nested calls to functions:
The measured results (from the HW and SW traces) are mapped to the pictured states according to the following table:
| Pictured state | Trace point | Performed SW action |
|---|---|---|
| Idle enter | start suspend | System ready to enter idle |
| omap_sram_idle 1 | suspend trace point 1 | Enter omap_sram_idle |
| omap_sram_idle 2 | suspend trace point 2 | calculation of next power domains modes |
| omap_sram_idle 3 | suspend trace point 3 | Power domains pre-transition: program power domains current state, clear status |
| omap_sram_idle 4 | suspend trace point 4 | Context save for NEON; IO pad and chain new state programmed |
| omap_sram_idle 5 | suspend trace point 5 | Context save for PER, GPIO; prepare UARTs 2&3 |
| omap_sram_idle 6 | suspend trace point 6 | Context save for CORE and PRCM; prepare UARTs 0&1 |
| omap_sram_idle 7 | suspend trace point 7 | Context save for INTC; program SDRC |
| WFI enter | suspend trace point 8 | GPIO HW trace; MPU context save in ASM (caches, registers, disable cache & prediction) |
| System OFF active | - sys_off_mode, external clocks and power supplies to be measured with System OFF support | - |
| Wake-up event: IO or GPT12 | HW trace A (if IO wake-up), GPT12=0 (if GPT wake-up) | - |
| System OFF inactive | - sys_off_mode, external clocks and power supplies to be measured with System OFF support | - |
| WFI exit | GPT12 sampling right after WFI | - |
| omap_sram_idle 1 | GPT12 sampling at return from ASM code; wake-up trace point 1 | SDRC errata for ES3.1; MPU context restore, MMU restore and enable |
| omap_sram_idle 2 | wake-up trace point 2 | cpu_init |
| omap_sram_idle 3 | wake-up trace point 3 | SDRC settings restore |
| omap_sram_idle 4 | wake-up trace point 4 | Restore MMU tables, enable caches and prediction |
| omap_sram_idle 5 | wake-up trace point 5 | Context restore for CORE, PRCM, SRAM, SMS; resume UARTs 0&1 |
| omap_sram_idle 6 | wake-up trace point 6 | Context restore for PER, INTC, GPIO, IO pad & chain; resume UARTs 2&3 |
| omap_sram_idle 7 | wake-up trace point 7, GPT sampling, HW trace B | Power domains post-transition: program power domains current state, clear status; restore SDRC settings |
| Idle exit |
Backup data
Some timing measurements were made during chip characterization. The following table gives the results:
| Characterization measurement | Full RET (us) | Full OFF (us) | Remark |
|---|---|---|---|
| HW sleep latency: from WFI enter till sys_off_mode active | 154 | - | Not measured in OFF mode, to be done once System OFF support is in l-o |
| HW total sleep latency: from WFI enter till System OFF (voltages and external clocks cut off) | 494 | 3784 | |
| HW wake-up latency: from sys_off_mode inactive till WFI exit | 245 | - | Not measured in OFF mode. The MPU context restore code is considered as part of the HW restore |
| HW total wake-up latency: from wake-up event till WFI exit | 8479* | 8749* | OK for RET since no MPU restore code is needed. OFF mode: it is assumed this contains the MPU context restore code |
*: The values of PRM_CLKSETUP (and possibly VOLTSETUP) need some optimization. A value of 0xFF for CLKSETUP means a clock stabilization time of about 8ms, while the recommended value is 5.25ms.
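The relation between the CLKSETUP value and the stabilization time can be sketched as below. This rests on an assumption that is not stated on this page: that the setup time is counted in cycles of the 32.768 kHz clock. Under that assumption 0xFF corresponds to roughly 7.8ms, which is consistent with the "8ms" quoted above. The helper names are hypothetical and the resulting register value should be checked against the TRM before use.

```python
import math

# Assumption: the CLKSETUP setup time is expressed in 32.768 kHz clock cycles.
SETUP_CLOCK_HZ = 32768


def cycles_to_ms(cycles):
    """Stabilization time in milliseconds for a given setup count."""
    return cycles * 1000 / SETUP_CLOCK_HZ


def ms_to_cycles(ms):
    """Smallest setup count that covers at least the requested time."""
    return math.ceil(ms * SETUP_CLOCK_HZ / 1000)


current_ms = cycles_to_ms(0xFF)
recommended_cycles = ms_to_cycles(5.25)
print(f"0xFF -> {current_ms:.2f} ms")
print(f"5.25 ms -> {recommended_cycles} cycles (0x{recommended_cycles:X})")
```

Rounding up is a deliberate choice here, so that the programmed time is never shorter than the recommended one.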
Full RET mode
- From idle start till omap_sram_idle entry: 0us
- From omap_sram_idle entry till WFI: 152us
- ... HW sleep...
- From WKUP event till WFI (HW wake-up - GPT12): 183us
- From WFI till return from omap34xx_save_cpu_context_wfi (MPU context restore in ASM): 0us
- From return from omap34xx_save_cpu_context_wfi till end of omap_sram_idle (System restore): 244us
- From end of omap_sram_idle till return from idle: 30us
Full OFF mode
- Suspend of devices: 112.396 ms
- Late suspend of devices: 1.739 ms
- From suspend start till omap_sram_idle entry: 61us
- From omap_sram_idle entry till WFI: 1190us
- ... HW sleep... including MPU context restore in ASM
- From WKUP event (USER button) till return from omap34xx_save_cpu_context_wfi: 2151us
- From return from omap34xx_save_cpu_context_wfi till end of omap_sram_idle: 457us
- From end of omap_sram_idle till return from suspend: 92us
- Early resume of devices: 0.488 ms
- Resume of devices: 429.962 ms
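As an illustration of how the figures above might be combined into a single wake-up latency per mode, the sketch below sums the low-level wake-up components listed for RET and OFF. Which components belong in a device latency constraint is an assumption made here for the example: only the steps from the wake-up event until the return from idle or suspend are counted, and the device resume times are left out. The dictionary keys are hypothetical labels.

```python
# Wake-up components in microseconds, copied from the lists above.
# Assumption: device early resume/resume times are not part of this sum.
wakeup_components_us = {
    "RET": {
        "hw_wakeup": 183,            # WKUP event till WFI (GPT12)
        "asm_context_restore": 0,    # WFI till return from ASM code
        "system_restore": 244,       # return from ASM till end of omap_sram_idle
        "return_from_idle": 30,      # end of omap_sram_idle till return from idle
    },
    "OFF": {
        "hw_wakeup_and_asm_restore": 2151,  # WKUP event till return from ASM code
        "system_restore": 457,              # till end of omap_sram_idle
        "return_from_suspend": 92,          # till return from suspend
    },
}

totals_us = {
    mode: sum(parts.values()) for mode, parts in wakeup_components_us.items()
}
for mode, total in totals_us.items():
    print(f"{mode}: {total} us")
```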
Timings results
Device constraint code patches, derived from the timings results.
Attachments
Kernel patches and config
File:OMAP latency measurements patches and config.tar.gz
--Jpihet 14:21, 5 November 2010 (UTC)