Power Management Device Latencies Measurement

From OMAPpedia

Revision as of 09:10, 2 September 2011

PM Devices constraints measurements

Introduction

To implement the device latency constraint support correctly, accurate measurements of the overhead of the system low power modes are needed:

* Total amount of time taken for a device to become accessible, that is, the time for the device to wake up from a given low power mode.
* It includes turning the clocks on, bringing the clock domain out of inactive, and bringing the power domain out of the RET or OFF state (with context restore).
* This constraint mainly governs the deepest device idle state (only clocks cut, clock domain in inactive, power domain in RET or OFF) acceptable to the device at any given time.

This wiki page details the measurements setup and the results. The latency data is to be fed into the constraints latency patches.

Kernel patches & build

Some kernel changes are required to instrument the kernel. The patches and config are attached to this page.

a5a24bc82d3f98758f8fdd0cb0af71012b735477 OMAP-cpuidle-workaround

Changes: enable IDLE, DSS for Beagle, Initramfs Busybox root FS

HW traces details

The trace points are connected on Beagleboard rev B7.

Warning: the HW power supplies and external clocks are not cut off in this configuration (linux-omap, l-o, has no System OFF support), so the measured HW latencies are lower than expected. The HW measurements need to be repeated as soon as l-o supports System OFF. Until then, the measurements from TI are used for the real HW latency.

Here are some scope screenshots showing the time delta between the wake-up event (USER button press, trace A) and the end of omap_sram_idle (USR1 LED).

For RET mode, showing a delta of 408us:

Scope capture ret.jpg

For OFF mode, showing a delta of 2700us:

Scope capture off.jpg

GPT tracer

Since GPT12 is used as a wake-up source from the idle mode, it can be used to track the timings during the wake-up sequence. A patch is needed to let the timer keep counting after it has overflowed and woken up the system.

The GPT runs on the 32 kHz clock, so the resolution is limited to 30.518us. Given the latencies to measure for OFF mode, the resolution is acceptable.
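The resolution figure follows from the clock period. The following is a minimal sketch, assuming the 32 kHz clock runs at 32768 Hz (which is what the 30.518us figure implies); the function names are illustrative and not part of the measurement patches. It also compares one timer tick with the scope deltas quoted above (408us for RET, 2700us for OFF).

```python
# Assumption: the GPT12 32 kHz source runs at 32768 Hz, which is what
# the quoted 30.518 us resolution implies.
GPT_CLOCK_HZ = 32768


def tick_period_us(clock_hz=GPT_CLOCK_HZ):
    """Duration of one timer tick in microseconds."""
    return 1_000_000 / clock_hz


def us_to_ticks(latency_us, clock_hz=GPT_CLOCK_HZ):
    """Nearest number of timer ticks for a latency in microseconds."""
    return round(latency_us * clock_hz / 1_000_000)


def relative_step(latency_us, clock_hz=GPT_CLOCK_HZ):
    """Size of one tick relative to the latency being measured."""
    return tick_period_us(clock_hz) / latency_us


if __name__ == "__main__":
    print(f"tick period: {tick_period_us():.3f} us")
    # Scope deltas quoted on this page: 408 us (RET) and 2700 us (OFF).
    for mode, latency_us in (("RET", 408), ("OFF", 2700)):
        print(f"{mode}: one tick is {relative_step(latency_us):.1%} "
              f"of {latency_us} us ({us_to_ticks(latency_us)} ticks)")
```

One tick is about 1% of the OFF mode delta but about 7% of the RET mode delta, which is why the resolution is better suited to OFF mode.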

4 GPT measurements are performed during the wake-up:

SW trace usage

Enable the power events and dump the trace:

# echo 1 > /debug/tracing/events/power/enable
# cat /debug/tracing/trace_pipe &

Enable the system idle in RET mode:

# echo 5 > /sys/devices/platform/omap/omap-hsuart.0/sleep_timeout 
# echo 5 > /sys/devices/platform/omap/omap-hsuart.1/sleep_timeout 
# echo 5 > /sys/devices/platform/omap/omap-hsuart.2/sleep_timeout 

# echo 0 > /debug/pm_debug/enable_off_mode
# echo 1 > /debug/pm_debug/sleep_while_idle

Trace output:

[   62.311462] *** GPT12 wake-up (HW wake-up, ASM restore, delta trace1-7): 183, 0, 244 us       => Dump of GPT timing deltas
          <idle>-0     [000]    62.241608: power_start: type=1 state=1 cpu_id=0                  => Idle start
          <idle>-0     [000]    62.241608: power_start: type=4 state=1 cpu_id=0                  => First suspend SW trace in omap_sram_idle
          <idle>-0     [000]    62.241638: power_start: type=4 state=2 cpu_id=0                  => ...
          <idle>-0     [000]    62.241669: power_start: type=4 state=3 cpu_id=0
          <idle>-0     [000]    62.241699: power_domain_target: name=neon_pwrdm state=1 cpu_id=0
          <idle>-0     [000]    62.241699: power_start: type=4 state=4 cpu_id=0
          <idle>-0     [000]    62.241699: clock_disable: name=uart3_fck state=0 cpu_id=0
          <idle>-0     [000]    62.241730: power_start: type=4 state=5 cpu_id=0
          <idle>-0     [000]    62.241730: clock_disable: name=uart1_fck state=0 cpu_id=0
          <idle>-0     [000]    62.241730: clock_disable: name=uart2_fck state=0 cpu_id=0
          <idle>-0     [000]    62.241760: power_start: type=4 state=6 cpu_id=0
          <idle>-0     [000]    62.241760: power_start: type=4 state=7 cpu_id=0
          <idle>-0     [000]    62.241760: power_start: type=4 state=8 cpu_id=0                  => Last suspend SW trace in omap_sram_idle
          <idle>-0     [000]    62.311188: power_start: type=5 state=1 cpu_id=0                  => First resume SW trace in omap_sram_idle
          <idle>-0     [000]    62.311188: power_start: type=5 state=2 cpu_id=0                  => ...
          <idle>-0     [000]    62.311188: power_start: type=5 state=3 cpu_id=0
          <idle>-0     [000]    62.311188: power_start: type=5 state=4 cpu_id=0
          <idle>-0     [000]    62.311218: clock_enable: name=uart1_fck state=1 cpu_id=0
          <idle>-0     [000]    62.311310: clock_enable: name=uart2_fck state=1 cpu_id=0
          <idle>-0     [000]    62.311310: power_start: type=5 state=5 cpu_id=0
          <idle>-0     [000]    62.311340: clock_enable: name=uart3_fck state=1 cpu_id=0
          <idle>-0     [000]    62.311340: power_start: type=5 state=6 cpu_id=0
          <idle>-0     [000]    62.311432: power_start: type=5 state=7 cpu_id=0                  => Last resume SW trace in omap_sram_idle
          <idle>-0     [000]    62.311462: power_end: cpu_id=0                                   => Idle end

Enable the system idle in OFF mode:

# echo 5 > /sys/devices/platform/omap/omap-hsuart.0/sleep_timeout 
# echo 5 > /sys/devices/platform/omap/omap-hsuart.1/sleep_timeout 
# echo 5 > /sys/devices/platform/omap/omap-hsuart.2/sleep_timeout 

# echo 1 > /debug/pm_debug/enable_off_mode
# echo 1 > /debug/pm_debug/sleep_while_idle

Trace output:

/ # echo 1 > /debug/pm_debug/enable_off_mode
/ #           
              sh-503   [000]    70.862366: power_domain_target: name=iva2_pwrdm state=0 cpu_id=0
              sh-503   [000]    70.862396: power_domain_target: name=mpu_pwrdm state=0 cpu_id=0
              sh-503   [000]    70.862396: power_domain_target: name=neon_pwrdm state=0 cpu_id=0
              sh-503   [000]    70.862396: power_domain_target: name=core_pwrdm state=0 cpu_id=0
              sh-503   [000]    70.862427: power_domain_target: name=cam_pwrdm state=0 cpu_id=0
              sh-503   [000]    70.862457: power_domain_target: name=dss_pwrdm state=0 cpu_id=0
              sh-503   [000]    70.862488: power_domain_target: name=per_pwrdm state=0 cpu_id=0
              sh-503   [000]    70.862488: power_domain_target: name=usbhost_pwrdm state=0 cpu_id=0
/ # 
[  557.240020] *** GPT12 wake-up (HW wake-up, ASM restore, delta trace1-7): 1495, 915, 488 us    => Dump of GPT timing deltas
          <idle>-0     [000]   557.156769: power_start: type=1 state=1 cpu_id=0                  => Idle start
          <idle>-0     [000]   557.156769: power_start: type=4 state=1 cpu_id=0                  => First suspend SW trace in omap_sram_idle
          <idle>-0     [000]   557.156769: power_start: type=4 state=2 cpu_id=0                  => ...
          <idle>-0     [000]   557.156830: power_start: type=4 state=3 cpu_id=0
          <idle>-0     [000]   557.156830: power_domain_target: name=neon_pwrdm state=0 cpu_id=0
          <idle>-0     [000]   557.156830: power_start: type=4 state=4 cpu_id=0
          <idle>-0     [000]   557.156860: clock_disable: name=uart3_fck state=0 cpu_id=0
          <idle>-0     [000]   557.156891: power_start: type=4 state=5 cpu_id=0
          <idle>-0     [000]   557.156891: clock_disable: name=uart1_fck state=0 cpu_id=0
          <idle>-0     [000]   557.156921: clock_disable: name=uart2_fck state=0 cpu_id=0
          <idle>-0     [000]   557.157013: power_start: type=4 state=6 cpu_id=0
          <idle>-0     [000]   557.157013: power_start: type=4 state=7 cpu_id=0
          <idle>-0     [000]   557.157898: power_start: type=4 state=8 cpu_id=0                  => Last suspend SW trace in omap_sram_idle
          <idle>-0     [000]   557.236084: power_start: type=5 state=1 cpu_id=0                  => First resume SW trace in omap_sram_idle
          <idle>-0     [000]   557.236145: power_start: type=5 state=2 cpu_id=0                  => ...
          <idle>-0     [000]   557.236206: power_start: type=5 state=3 cpu_id=0
          <idle>-0     [000]   557.236267: power_start: type=5 state=4 cpu_id=0
          <idle>-0     [000]   557.236389: clock_enable: name=uart1_fck state=1 cpu_id=0
          <idle>-0     [000]   557.236450: clock_enable: name=uart2_fck state=1 cpu_id=0
          <idle>-0     [000]   557.236450: power_start: type=5 state=5 cpu_id=0
          <idle>-0     [000]   557.236481: clock_enable: name=uart3_fck state=1 cpu_id=0
          <idle>-0     [000]   557.236511: power_start: type=5 state=6 cpu_id=0
          <idle>-0     [000]   557.236572: power_start: type=5 state=7 cpu_id=0                  => Last resume SW trace in omap_sram_idle
          <idle>-0     [000]   557.236602: power_end: cpu_id=0                                   => Idle end

Results interpretation

The low power transition sequence is pictured as nested calls to functions:

Low power transition sequence.png

The measured results (from the HW and SW traces) are mapped to the pictured states according to the following table:

{|border="1"
!Pictured state
!Trace point
!Performed SW action
|-
|Idle enter
|start suspend
|System ready to enter idle
|-
|omap_sram_idle 1
|suspend trace point 1
|Enter omap_sram_idle
|-
|omap_sram_idle 2
|suspend trace point 2
|Calculation of next power domains modes
|-
|omap_sram_idle 3
|suspend trace point 3
|Power domains pre-transition: program power domains current state, clear status
|-
|omap_sram_idle 4
|suspend trace point 4
|Context save for NEON; IO pad and chain new state programmed
|-
|omap_sram_idle 5
|suspend trace point 5
|Context save for PER, GPIO; prepare UARTs 2&3
|-
|omap_sram_idle 6
|suspend trace point 6
|Context save for CORE and PRCM; prepare UARTs 0&1
|-
|omap_sram_idle 7
|suspend trace point 7
|Context save for INTC; program SDRC
|-
|WFI enter
|suspend trace point 8; GPIO HW trace
|MPU context save in ASM (caches, registers, disable cache & prediction)
|-
|System OFF active - sys_off_mode, external clocks and power supplies
|to be measured with System OFF support
| -
|-
|Wake-up event: IO or GPT12
|HW trace A (if IO wake-up); GPT12=0 (if GPT wake-up)
| -
|-
|System OFF inactive - sys_off_mode, external clocks and power supplies
|to be measured with System OFF support
| -
|-
|WFI exit
|GPT12 sampling right after WFI
| -
|-
|omap_sram_idle 1
|GPT12 sampling at return from ASM code; wake-up trace point 1
|SDRC errata for ES3.1; MPU context restore; MMU restore and enable
|-
|omap_sram_idle 2
|wake-up trace point 2
|cpu_init
|-
|omap_sram_idle 3
|wake-up trace point 3
|SDRC settings restore
|-
|omap_sram_idle 4
|wake-up trace point 4
|Restore MMU tables; enable caches and prediction
|-
|omap_sram_idle 5
|wake-up trace point 5
|Context restore for CORE, PRCM, SRAM, SMS; resume UARTs 0&1
|-
|omap_sram_idle 6
|wake-up trace point 6
|Context restore for PER, INTC, GPIO; IO pad & chain; resume UARTs 2&3
|-
|omap_sram_idle 7
|wake-up trace point 7; GPT sampling; HW trace B
|Power domains post-transition: program power domains current state, clear status; restore SDRC settings
|-
|Idle exit
|exit suspend
|System out of idle
|}
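The SW trace points listed above can be turned into phase durations by subtracting timestamps. The following is a minimal sketch, not the tooling used for this page: it assumes the trace_pipe line layout shown in the trace outputs, and it relies on the annotations there that power_start events with type=4 are the suspend trace points and type=5 the resume trace points. The function names and the embedded sample are illustrative.

```python
import re

# Matches ftrace lines such as
#   <idle>-0  [000]  62.241608: power_start: type=4 state=1 cpu_id=0
LINE_RE = re.compile(r"\[\d+\]\s+(\d+)\.(\d{6}):\s+(\w+):\s*(.*)")

# Four lines copied from the RET trace on this page (spacing condensed).
SAMPLE_RET = """\
<idle>-0 [000] 62.241608: power_start: type=4 state=1 cpu_id=0
<idle>-0 [000] 62.241760: power_start: type=4 state=8 cpu_id=0
<idle>-0 [000] 62.311188: power_start: type=5 state=1 cpu_id=0
<idle>-0 [000] 62.311432: power_start: type=5 state=7 cpu_id=0
"""


def parse_events(text):
    """Return a list of (timestamp_us, event_name, fields) tuples."""
    events = []
    for line in text.splitlines():
        m = LINE_RE.search(line)
        if not m:
            continue  # skip printk lines and shell prompts
        sec, usec, name, rest = m.groups()
        rest = rest.split("=>")[0]  # drop the "=> ..." annotations
        fields = dict(kv.split("=", 1) for kv in rest.split() if "=" in kv)
        # Integer microseconds avoid floating point rounding.
        events.append((int(sec) * 1_000_000 + int(usec), name, fields))
    return events


def phase_durations(events):
    """Suspend, sleep and resume durations in microseconds."""
    def stamps(type_):
        return [t for t, name, f in events
                if name == "power_start" and f.get("type") == type_]

    suspend, resume = stamps("4"), stamps("5")
    if not suspend or not resume:
        raise ValueError("no complete suspend/resume sequence in the trace")
    return {
        "suspend_us": suspend[-1] - suspend[0],  # omap_sram_idle entry till WFI
        "sleep_us": resume[0] - suspend[-1],     # time spent in the low power mode
        "resume_us": resume[-1] - resume[0],     # system restore
    }


if __name__ == "__main__":
    print(phase_durations(parse_events(SAMPLE_RET)))
```

For the RET sample this gives 152 us for the suspend phase and 244 us for the resume phase, the same values as the RET column of the HW and SW measurements results.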

Results

PSI measurements results

Some timing measurements have been made by the TI PSI team. The following tables give the sleep and wake-up latencies for the C-states:

image sleep image wake-up

Note: the Linux code does not define C7/C8/C9 as the table does. In the Linux code, C7 is MPU OFF + CORE OFF, which is identical to C9 in the table.

A model with the energy spent in the C-states has been built from the measured numbers. Here is the graph of the energy vs time:

image graph

Taking the minimum energy from the graph identifies the four C-states that are worthwhile energy-wise (C1, C3, C5 and C9) and the threshold time above which each of them becomes efficient:

image table 4 C-states

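Notes:

* The measurements have been performed at OPP50.
* No data has been measured for C9 (MPU OFF + CORE OFF). Data from the HW and SW trace points are used to fill in the results.
* The sys_offmode signal is not supported and so not used for the measurements. A value of 8 ms is used in the table. From the T2 scripts page the value should be 11.5 ms. The measurement data and the threshold for C9 need to be corrected.
* The sys_clkreq signal is not used and so a correction is needed. To be done.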
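The way thresholds fall out of an energy vs time model can be sketched as follows. This is an illustration only: it assumes a simple linear model (a fixed transition energy plus a residency power multiplied by the idle time), and every state name and number in it is a hypothetical placeholder, not a PSI measurement. The actual model behind the graph may differ.

```python
# HYPOTHETICAL placeholder values, NOT measured data:
# state name -> (transition energy in uJ, residency power in mW)
HYPOTHETICAL_STATES = {
    "shallow": (0.0, 100.0),
    "mid": (50.0, 40.0),
    "never_best": (200.0, 35.0),
    "deep": (300.0, 5.0),
}


def energy_uj(state, t_us):
    """Energy spent when idling for t_us in the given state (linear model)."""
    transition_uj, power_mw = state
    return transition_uj + power_mw * t_us / 1000.0  # mW * us = nJ


def best_state(states, t_us):
    """Name of the state with the minimum energy for an idle time of t_us."""
    return min(states, key=lambda name: energy_uj(states[name], t_us))


def thresholds(states, t_max_us, step_us=10):
    """Sweep the idle time and list (threshold_us, state) at each change
    of the minimum-energy state."""
    result, current = [], None
    for t_us in range(0, t_max_us + 1, step_us):
        name = best_state(states, t_us)
        if name != current:
            result.append((t_us, name))
            current = name
    return result


if __name__ == "__main__":
    for t_us, name in thresholds(HYPOTHETICAL_STATES, 20000):
        print(f"{name} is the most efficient from {t_us} us")
```

A state that never reaches the minimum for any idle time (here the placeholder "never_best") is absent from the result, which is the same reasoning that reduces the list of C-states to the worthwhile ones.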

HW and SW measurements results

Here are the results for full RET and full OFF modes:

{|border="1"
!Sequence
!Time (us) - RET = C5
!Time (us) - OFF = C9
|-
|From idle start till omap_sram_idle entry
|0
|0
|-
|From omap_sram_idle entry till WFI
|152
|1129
|-
|colspan="3"|... HW sleep ...
|-
|From WKUP event till WFI (HW wake-up - GPT12)
|183
|1495
|-
|From WFI till return from omap34xx_save_cpu_context_wfi (MPU context restore in ASM)
|0
|915
|-
|From return from omap34xx_save_cpu_context_wfi till end of omap_sram_idle (System restore)
|244
|488
|-
|From end of omap_sram_idle till return from idle
|30
|30
|}
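The three wake-up rows are the values dumped in the GPT12 lines of the traces. The following minimal sketch parses those lines and sums the wake-up side of the sequence. The regular expression assumes the exact message format shown in the trace outputs, the 30 us idle return time is taken from the last row of the table, and the function names are illustrative. The totals do not include the System OFF HW latencies, which are not captured by this setup.

```python
import re

# GPT12 dump lines copied from the RET and OFF traces on this page.
GPT12_LINES = {
    "RET": "[   62.311462] *** GPT12 wake-up (HW wake-up, ASM restore, "
           "delta trace1-7): 183, 0, 244 us",
    "OFF": "[  557.240020] *** GPT12 wake-up (HW wake-up, ASM restore, "
           "delta trace1-7): 1495, 915, 488 us",
}

GPT_RE = re.compile(r"\*\*\* GPT12 wake-up .*?:\s*(\d+),\s*(\d+),\s*(\d+)\s*us")


def parse_gpt12(line):
    """Extract the three GPT12 deltas (in us) from a dump line, or None."""
    m = GPT_RE.search(line)
    if m is None:
        return None
    hw, asm, system = (int(v) for v in m.groups())
    return {"hw_wakeup_us": hw, "asm_restore_us": asm,
            "system_restore_us": system}


def wakeup_total_us(gpt, idle_return_us=30):
    """Sum from the wake-up event till return from idle."""
    return sum(gpt.values()) + idle_return_us


if __name__ == "__main__":
    for mode, line in GPT12_LINES.items():
        gpt = parse_gpt12(line)
        print(mode, gpt, "total:", wakeup_total_us(gpt), "us")
```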

Aggregated timings results

From the various sources of data the following figures are used:


Device latency patches

Device constraint code patches, derived from the timings results and measurements with various low power modes combinations.

Attachments

Kernel patches and config

File:OMAP latency measurements patches and config.tar.gz

--Jpihet 12 November 2010
