A 10,000-Word Walkthrough | How systrace Is Implemented

Posted by 佈道師peter on 2022-09-06T23:05:10.364511+00:00

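About the author: Weilin (偉林) is a middle-aged programmer who has worked in the telecom, mobile phone, security, and chip industries. He still works on Linux development and enjoys sharing Linux knowledge. His personal blog is on CSDN under the name pwl999.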

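The figure above essentially explains the whole systrace framework:

  • 1. systrace calls atrace to capture trace data from the target device;

  • 2. systrace merges the trace data with 'prefix.html', 'suffix.html', and 'systrace_trace_viewer.html' into a single 'trace.html' file;

  • 3. Opening 'trace.html' in the Chrome browser lets you view and analyze the trace data graphically. Behind the scenes, the Trace-Viewer scripts do the work;

How trace data is stored on the kernel side and on the user side:

  • 1. Kernel trace information is recorded into the ftrace buffer through trace events;

  • 2. User space (app / Java framework / native) records trace information through the Trace class. This data also ends up in the kernel ftrace buffer, written through the "/sys/kernel/debug/tracing/trace_marker" interface.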
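As a rough illustration of the user-space path in item 2, the sketch below formats the begin/end markers that atrace-style tooling writes into trace_marker. The `B|<pid>|<name>` and `E` line formats are the conventional atrace marker formats; the helper names `begin_marker`, `end_marker`, and `write_markers` are hypothetical, and the demo writes to a scratch file so it can run without access to the real tracing node.

```python
import os
import tempfile

# Real node on a device (needs suitable permissions); not opened by this demo.
TRACE_MARKER = "/sys/kernel/debug/tracing/trace_marker"


def begin_marker(name, pid=None):
    """Format a 'begin section' marker line: B|<pid>|<name>."""
    return "B|%d|%s" % (os.getpid() if pid is None else pid, name)


def end_marker():
    """Format the matching 'end section' marker line."""
    return "E"


def write_markers(path, lines):
    """Append each marker with its own write() call, one event per write."""
    with open(path, "a") as f:
        for line in lines:
            f.write(line + "\n")
            f.flush()


if __name__ == "__main__":
    # Demonstrate against a scratch file instead of the real trace_marker node.
    with tempfile.NamedTemporaryFile("w", delete=False) as tmp:
        scratch = tmp.name
    write_markers(scratch, [begin_marker("doFrame", pid=1234), end_marker()])
    with open(scratch) as f:
        print(f.read())
    os.remove(scratch)
```

On a device, passing `TRACE_MARKER` as the path would place these events in the same ftrace buffer that the kernel trace events use, which is why both kinds of data show up in one trace.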

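1. Using systrace

When tracking down Android performance problems, we often reach for the systrace tool. With ADB set up and the target phone connected, start systrace on the host with a command like this: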

AOSP_ROOT/external/chromium-trace/catapult/systrace/bin$ ./systrace -o trace.html -t 10 gfx view wm am dalvik input sched freq idle
Starting tracing (10 seconds)
Tracing completed. Collecting output...
Outputting Systrace results...
Tracing complete, writing results

Wrote trace HTML file: file:///AOSP_ROOT/external/chromium-trace/catapult/systrace/bin/trace.html

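The systrace command lives under AOSP_ROOT/external/chromium-trace/catapult/systrace/bin in the AOSP source tree. The arguments used above are:

  • '-t 10': capture the trace for 10 seconds;

  • '-o trace.html': the output file for the trace;

  • 'gfx view wm am dalvik input sched freq idle': the event categories to capture;

Once the command is running, we can scroll, launch apps, and perform other operations on the target. Everything done within those 10 seconds is recorded and finally dumped into trace.html. We can then view and analyze trace.html in Google's Chrome browser: 'google-chrome trace.html'.

The '-l' option lists the full set of systrace categories the target supports: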

$ ./systrace -l
gfx - Graphics
input - Input
view - View System
webview - WebView
wm - Window Manager
am - Activity Manager
sm - Sync Manager
audio - Audio
video - Video
camera - Camera
hal - Hardware Modules
app - Application
res - Resource Loading
dalvik - Dalvik VM
rs - Renderscript
bionic - Bionic C Library
power - Power Management
pm - Package Manager
ss - System Server
database - Database
network - Network
adb - ADB
pdx - PDX services
sched - CPU Scheduling
irq - IRQ Events
freq - CPU Frequency
idle - CPU Idle
disk - Disk I/O
workq - Kernel Workqueues
memreclaim - Kernel Memory Reclaim
regulators - Voltage and Current Regulators
binder_driver - Binder Kernel driver
binder_lock - Binder global lock trace
pagecache - Page cache

NOTE: more categories may be available with adb root

The '--help' option shows the detailed options of the systrace command:

$ ./systrace --help
Usage: systrace [options] [category1 [category2 ...]]

Example: systrace -b 32768 -t 15 gfx input view sched freq

Options:
-h, --help show this help message and exit
-o FILE write trace output to FILE
-j, --json write a JSON file
--link-assets (deprecated)
--asset-dir=ASSET_DIR
(deprecated)
-e DEVICE_SERIAL_NUMBER, --serial=DEVICE_SERIAL_NUMBER
adb device serial number
--timeout=TIMEOUT timeout for start and stop tracing (seconds)
--collection-timeout=COLLECTION_TIMEOUT
timeout for data collection (seconds)
-t N, --time=N trace for N seconds
--target=TARGET choose tracing target (android or Linux)
-b N, --buf-size=N use a trace buffer size of N KB
-l, --list-categories
list the available categories and exit

Atrace options:
--atrace-categories=ATRACE_CATEGORIES
Select atrace categories with a comma-delimited list,
e.g. --atrace-categories=cat1,cat2,cat3
-k KFUNCS, --ktrace=KFUNCS
specify a comma-separated list of kernel functions to
trace
--no-compress Tell the device not to send the trace data in
compressed form.
-a APP_NAME, --app=APP_NAME
enable application-level tracing for comma-separated
list of app cmdlines
--from-file=FROM_FILE
read the trace from a file (compressed) rather than
running a live trace

BattOr trace options:
--battor-categories=BATTOR_CATEGORIES
Select battor categories with a comma-delimited list,
e.g. --battor-categories=cat1,cat2,cat3
--serial-map=SERIAL_MAP
File containing pregenerated map of phone serial
numbers to BattOr serial numbers.
--battor-path=BATTOR_PATH
specify a BattOr path to use
--battor Use the BattOr tracing agent.

Ftrace options:
--ftrace-categories=FTRACE_CATEGORIES
Select ftrace categories with a comma-delimited list,
e.g. --ftrace-categories=cat1,cat2,cat3

WALT trace options:
--walt Use the WALT tracing agent. WALT is a device for
measuring latency of physical sensors on phones and
computers. See https://github.com/google/walt

2. Implementation of the host-side systrace command (Python)

systrace is essentially a Python script. The whole Python package lives under AOSP_ROOT/external/chromium-trace/catapult/. Taking that path as the root (root=AOSP_ROOT/external/chromium-trace/catapult), let us walk through its implementation:

./systrace/bin/systrace:

import os
import sys

# (1) Add the './systrace/' directory to Python's module search path #
_SYSTRACE_DIR = os.path.abspath(
    os.path.join(os.path.dirname(__file__), os.path.pardir))
sys.path.insert(0, _SYSTRACE_DIR)

# (2) This makes the './systrace/systrace/run_systrace.py' module importable #
from systrace import run_systrace

# (3) Call the main function in run_systrace.py #
if __name__ == '__main__':
  sys.exit(run_systrace.main())

./systrace/systrace/run_systrace.py:


# (3.1) main simply calls main_impl #
def main():
  main_impl(sys.argv)



def main_impl(arguments):
  # Parse the command line options.

  # (3.1.1) Parse the user's arguments into options and categories #
  options, categories = parse_options(arguments)

  # Override --atrace-categories and --ftrace-categories flags if command-line
  # categories are provided.

  # (3.1.2) If trace categories were given on the command line, #
  # assign them to the option matching options.target ('android' or 'linux'). #
  # The platform is chosen with systrace's '--target=TARGET' option and defaults to 'android'. #
  if categories:
    if options.target == 'android':
      options.atrace_categories = categories
    elif options.target == 'linux':
      options.ftrace_categories = categories
    else:
      raise RuntimeError('Categories are only valid for atrace/ftrace. Target '
                         'platform must be either Android or Linux.')


./systrace/systrace/run_systrace.py (main_impl, continued):
  # Include atrace categories by default in Systrace.
  # (3.1.3) If the platform is 'android' and no categories were specified, use the defaults #
  if options.target == 'android' and not options.atrace_categories:
    options.atrace_categories = atrace_agent.DEFAULT_CATEGORIES

  # (3.1.4) If the platform is 'android' and data is not read from a file, it is read from a real target #
  if options.target == 'android' and not options.from_file:
    # Initialize adb #
    initialize_devil()
    # If no device serial number was given, try to detect one #
    if not options.device_serial_number:
      devices = [a.GetDeviceSerial() for a in adb_wrapper.AdbWrapper.Devices()]
      if len(devices) == 0:
        raise RuntimeError('No ADB devices connected.')
      elif len(devices) >= 2:
        raise RuntimeError('Multiple devices connected, serial number required')
      options.device_serial_number = devices[0]

  # If list_categories is selected, just print the list of categories.
  # In this case, use of the tracing controller is not necessary.
  # (3.1.5) For 'systrace -l', list every category the target supports and return #
  if options.list_categories:
    if options.target == 'android':
      # Calls list_categories in 'systrace/systrace/tracing_agents/atrace_agent.py', #
      # which ends up running 'adb shell atrace --list_categories' #
      atrace_agent.list_categories(options)
    elif options.target == 'linux':
      ftrace_agent.list_categories(options)
    return

  # Set up the systrace runner and start tracing.
  # (3.1.6) For a normal trace command, create a SystraceRunner object #
  # (the class is defined in 'systrace/systrace/systrace_runner.py') #
  controller = systrace_runner.SystraceRunner(
      os.path.dirname(os.path.abspath(__file__)), options)

  # (3.1.6.1) Start tracing #
  controller.StartTracing()

  # Wait for the given number of seconds or until the user presses enter.
  # pylint: disable=superfluous-parens
  # (need the parens so no syntax error if trying to load with Python 3)
  if options.from_file is not None:
    print('Reading results from file.')
  elif options.trace_time:
    print('Starting tracing (%d seconds)' % options.trace_time)
    time.sleep(options.trace_time)
  else:
    raw_input('Starting tracing (stop with enter)')

  # Stop tracing and collect the output.
  print('Tracing completed. Collecting output...')

  # (3.1.6.2) Stop tracing #
  controller.StopTracing()

  print('Outputting Systrace results...')

  # (3.1.6.3) Write the tracing results to a file #
  controller.OutputSystraceResults(write_json=options.write_json)
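The category routing in steps (3.1.2) and (3.1.3) can be isolated into a small standalone sketch. This is an illustration under assumptions, not the catapult code: `route_categories` is a hypothetical helper, `SimpleNamespace` stands in for the parsed options object, and `PLACEHOLDER_DEFAULTS` is a made-up list rather than the real `atrace_agent.DEFAULT_CATEGORIES`.

```python
from types import SimpleNamespace

# Placeholder default list for illustration only; the real value lives in
# atrace_agent.DEFAULT_CATEGORIES and is not reproduced here.
PLACEHOLDER_DEFAULTS = ["sched", "gfx", "view"]


def route_categories(options, categories, defaults=PLACEHOLDER_DEFAULTS):
    """Assign positional categories to the option matching the target."""
    if categories:
        if options.target == "android":
            options.atrace_categories = categories
        elif options.target == "linux":
            options.ftrace_categories = categories
        else:
            raise RuntimeError("Target platform must be android or linux.")
    # On Android, fall back to defaults when nothing was requested.
    if options.target == "android" and not options.atrace_categories:
        options.atrace_categories = list(defaults)
    return options


if __name__ == "__main__":
    opts = SimpleNamespace(target="android", atrace_categories=None,
                           ftrace_categories=None)
    route_categories(opts, ["gfx", "sched"])
    print(opts.atrace_categories)
```

The point of the design is that the same positional arguments mean different things depending on '--target', so the rest of the program only ever looks at the per-agent option.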

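As we can see, the core of the tracing process comes down to creating the SystraceRunner object and calling its .StartTracing/.StopTracing/.OutputSystraceResults methods.

The following subsections analyze each of these steps in detail.

2.1. Initializing the SystraceRunner class

The SystraceRunner class lives in systrace_runner.py. Let us look at its implementation.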

./systrace/systrace/systrace_runner.py:

AGENT_MODULES = [android_process_data_agent, atrace_agent,
                 atrace_from_file_agent, battor_trace_agent,
                 ftrace_agent, walt_agent]

class SystraceRunner(object):
  def __init__(self, script_dir, options):
    """Constructor.

    Args:
        script_dir: Directory containing the trace viewer script
                    (systrace_trace_viewer.html)
        options: Object containing command line options.
    """
    # Parse command line arguments and create agents.
    self._script_dir = script_dir
    self._out_filename = options.output_file

    # (1) #
    agents_with_config = tracing_controller.CreateAgentsWithConfig(
        options, AGENT_MODULES)

    # (2) #
    controller_config = tracing_controller.GetControllerConfig(options)

    # Set up tracing controller.

    # (3) #
    self._tracing_controller = tracing_controller.TracingController(
        agents_with_config, controller_config)

  • 1. Analysis of tracing_controller.CreateAgentsWithConfig:

./systrace/systrace/tracing_controller.py:

def CreateAgentsWithConfig(options, modules):
  """Create tracing agents.

  This function will determine which tracing agents are valid given the
  options and create those agents along with their corresponding configuration
  object.
  Args:
    options: The command-line options.
    modules: The modules for either Systrace or profile_chrome.
             TODO(washingtonp): After all profile_chrome agents are in
             Systrace, this parameter will no longer be valid.
  Returns:
    A list of AgentWithConfig options containing agents and their corresponding
    configuration object.
  """
  result = []

  # (1.1) Iterate over the agent modules and call each one's hooks. #
  # For 'android', essentially only atrace_agent and android_process_data_agent produce an agent #
  for module in modules:

    # (1.1.1) #
    config = module.get_config(options)

    # (1.1.2) #
    agent = module.try_create_agent(config)

    if agent and config:
      # (1.1.3) Create an AgentWithConfig object to hold the config and the agent #
      result.append(AgentWithConfig(agent, config))
  return [x for x in result if x and x.agent]
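The module-as-factory pattern used by CreateAgentsWithConfig can be tried out in isolation. In the minimal sketch below, everything is a stand-in: `make_fake_module` builds hypothetical agent modules that expose only the two hooks the loop needs (`get_config` and `try_create_agent`), and agents are represented by plain strings.

```python
from collections import namedtuple
from types import SimpleNamespace

AgentWithConfig = namedtuple("AgentWithConfig", ["agent", "config"])


def create_agents_with_config(options, modules):
    """Ask every module for a config and an agent; keep the ones that apply."""
    result = []
    for module in modules:
        config = module.get_config(options)
        agent = module.try_create_agent(config)
        if agent and config:
            result.append(AgentWithConfig(agent, config))
    return result


def make_fake_module(name, wanted_target):
    """Build a stand-in agent module that only activates for one target."""
    def get_config(options):
        return SimpleNamespace(target=options.target)

    def try_create_agent(config):
        # Returning None means "this agent does not apply to these options".
        return name if config.target == wanted_target else None

    return SimpleNamespace(get_config=get_config,
                           try_create_agent=try_create_agent)


if __name__ == "__main__":
    modules = [make_fake_module("atrace", "android"),
               make_fake_module("ftrace", "linux")]
    pairs = create_agents_with_config(SimpleNamespace(target="android"),
                                      modules)
    print([p.agent for p in pairs])
```

Each module decides for itself whether it is relevant, so adding a new agent type does not require touching the controller loop.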

|→

./systrace/systrace/tracing_agents/atrace_agent.py:

# (1.1.1) Create an AtraceConfig object to hold the relevant settings from options #
def get_config(options):
  return AtraceConfig(options.atrace_categories,
                      options.trace_buf_size, options.kfuncs,
                      options.app_name, options.compress_trace_data,
                      options.from_file, options.device_serial_number,
                      options.trace_time, options.target)



class AtraceConfig(tracing_agents.TracingConfig):
  def __init__(self, atrace_categories, trace_buf_size, kfuncs,
               app_name, compress_trace_data, from_file,
               device_serial_number, trace_time, target):
    tracing_agents.TracingConfig.__init__(self)
    self.atrace_categories = atrace_categories
    self.trace_buf_size = trace_buf_size
    self.kfuncs = kfuncs
    self.app_name = app_name
    self.compress_trace_data = compress_trace_data
    self.from_file = from_file
    self.device_serial_number = device_serial_number
    self.trace_time = trace_time
    self.target = target

# (1.1.2) Create an AtraceAgent object #
def try_create_agent(config):
  """Create an Atrace agent.

  Args:
    config: Command line config.
  """

  # Decide from the config whether an Atrace agent is needed at all #
  if config.target != 'android':
    return None
  if config.from_file is not None:
    return None

  if not config.atrace_categories:
    return None

  # Check device SDK version.
  device_sdk_version = util.get_device_sdk_version()
  if device_sdk_version < version_codes.JELLY_BEAN_MR2:
    print ('Device SDK versions < 18 (Jellybean MR2) not supported.\n'
           'Your device SDK version is %d.' % device_sdk_version)
    return None

  return AtraceAgent(device_sdk_version)



class AtraceAgent(tracing_agents.TracingAgent):

  def __init__(self, device_sdk_version):
    super(AtraceAgent, self).__init__()
    self._device_sdk_version = device_sdk_version
    self._adb = None
    self._trace_data = None
    self._tracer_args = None
    self._collection_thread = None
    self._device_utils = None
    self._device_serial_number = None
    self._config = None
    self._categories = None

|→

./systrace/systrace/tracing_agents/android_process_data_agent.py:

def try_create_agent(config):
  if config.target != 'android':
    return None
  if config.from_file is not None:
    return None
  return AndroidProcessDataAgent()



class AndroidProcessDataAgent(tracing_agents.TracingAgent):
  def __init__(self):
    super(AndroidProcessDataAgent, self).__init__()
    self._trace_data = ""
    self._device = None

  • 2. Analysis of tracing_controller.GetControllerConfig:

./systrace/systrace/tracing_controller.py:

# (2.1) Create a TracingControllerConfig object to hold the relevant settings from options #
def GetControllerConfig(options):
  return TracingControllerConfig(options.output_file, options.trace_time,
                                 options.write_json,
                                 options.link_assets, options.asset_dir,
                                 options.timeout, options.collection_timeout,
                                 options.device_serial_number, options.target)



class TracingControllerConfig(tracing_agents.TracingConfig):
  def __init__(self, output_file, trace_time, write_json,
               link_assets, asset_dir, timeout, collection_timeout,
               device_serial_number, target):
    tracing_agents.TracingConfig.__init__(self)
    self.output_file = output_file
    self.trace_time = trace_time
    self.write_json = write_json
    self.link_assets = link_assets
    self.asset_dir = asset_dir
    self.timeout = timeout
    self.collection_timeout = collection_timeout
    self.device_serial_number = device_serial_number
    self.target = target

  • 3. Analysis of tracing_controller.TracingController:

./systrace/systrace/tracing_controller.py:

# (3.1) Create a TracingController object; the later start/stop/output steps all go through its methods #
class TracingController(object):
  def __init__(self, agents_with_config, controller_config):
    """Create tracing controller.

    Create a tracing controller object. Note that the tracing
    controller is also a tracing agent.

    Args:
       agents_with_config: List of tracing agents for this controller with the
                           corresponding tracing configuration objects.
       controller_config:  Configuration options for the tracing controller.
    """
    self._child_agents = None

    # (3.1.1) Store the agent objects and their configs #
    self._child_agents_with_config = agents_with_config
    # (3.1.2) Create a TracingControllerAgent object #
    self._controller_agent = TracingControllerAgent()
    # (3.1.3) Store the controller-level config #
    self._controller_config = controller_config
    self._trace_in_progress = False
    self.all_results = None

2.2. SystraceRunner.StartTracing

The entry point is the .StartTracing method of the SystraceRunner class, in ./systrace/systrace/systrace_runner.py:

class SystraceRunner(object):

  def StartTracing(self):
    # (1) #
    self._tracing_controller.StartTracing()

This in turn calls the .StartTracing method of the TracingController class, in ./systrace/systrace/tracing_controller.py:

class TracingController(object):

  def StartTracing(self):
    """Start tracing for all tracing agents.

    This function starts tracing for both the controller tracing agent
    and the child tracing agents.

    Returns:
        Boolean indicating whether or not the start tracing succeeded.
        Start tracing is considered successful if at least the
        controller tracing agent was started.
    """

    # (1.1) Set the tracing-in-progress flag on the object #
    assert not self._trace_in_progress, 'Trace already in progress.'
    self._trace_in_progress = True

    # Start the controller tracing agents. Controller tracing agent
    # must be started successfully to proceed.
    # (1.2) Call .StartAgentTracing on the member object _controller_agent, #
    # which records some log information #
    if not self._controller_agent.StartAgentTracing(
        self._controller_config,
        timeout=self._controller_config.timeout):
      print 'Unable to start controller tracing agent.'
      return False

    # Start the child tracing agents.
    succ_agents = []
    # (1.3) Call .StartAgentTracing on each agent in the agent list #
    # For 'android' these are AtraceAgent and AndroidProcessDataAgent #
    for agent_and_config in self._child_agents_with_config:
      agent = agent_and_config.agent
      config = agent_and_config.config

      # Start the agent #
      if agent.StartAgentTracing(config,
                                 timeout=self._controller_config.timeout):
        # Add each successfully started agent to the succ_agents list #
        succ_agents.append(agent)
      else:
        print 'Agent %s not started.' % str(agent)

    # Print warning if all agents not started.
    na = len(self._child_agents_with_config)
    ns = len(succ_agents)
    if ns < na:
      print 'Warning: Only %d of %d tracing agents started.' % (ns, na)
    self._child_agents = succ_agents
    return True
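The "start everything, remember who succeeded" bookkeeping in step (1.3) is easy to exercise with stand-in agents. The sketch below is an illustration only: `FakeAgent` and `start_all` are hypothetical names, and the fake agent simply returns a preset success flag instead of talking to a device.

```python
class FakeAgent(object):
    """Stand-in agent whose start result is fixed at construction time."""

    def __init__(self, name, starts_ok):
        self.name = name
        self.starts_ok = starts_ok

    def StartAgentTracing(self, config, timeout=None):
        return self.starts_ok


def start_all(agents_with_config, timeout=None):
    """Start each (agent, config) pair; return only the agents that started."""
    started = []
    for agent, config in agents_with_config:
        if agent.StartAgentTracing(config, timeout=timeout):
            started.append(agent)
        else:
            print("Agent %s not started." % agent.name)
    if len(started) < len(agents_with_config):
        print("Warning: Only %d of %d tracing agents started."
              % (len(started), len(agents_with_config)))
    return started


if __name__ == "__main__":
    pairs = [(FakeAgent("atrace", True), None),
             (FakeAgent("process_data", False), None)]
    print([a.name for a in start_all(pairs)])
```

Keeping the list of started agents matters later: only those agents are asked to stop and return results.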

|→

This in turn calls the .StartAgentTracing method of the AtraceAgent class, in ./systrace/systrace/tracing_agents/atrace_agent.py:

class AtraceAgent(tracing_agents.TracingAgent):

  @py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
  def StartAgentTracing(self, config, timeout=None):
    assert config.atrace_categories, 'Atrace categories are missing!'
    self._config = config

    # Categories requested on the systrace command line #
    self._categories = config.atrace_categories
    if isinstance(self._categories, list):
      self._categories = ','.join(self._categories)

    # Categories the target supports, obtained via 'atrace --list_categories' #
    avail_cats = get_available_categories(config, self._device_sdk_version)

    # (1.3.1) Check whether the command asks for categories the target does not support #
    unavailable = [x for x in self._categories.split(',') if
        x not in avail_cats]
    self._categories = [x for x in self._categories.split(',') if
        x in avail_cats]
    if unavailable:
      print 'These categories are unavailable: ' + ' '.join(unavailable)
    self._device_utils = device_utils.DeviceUtils(config.device_serial_number)
    self._device_serial_number = config.device_serial_number

    # (1.3.2) Build the arguments: 'atrace ... categories' #
    # '...' may include the following options: #
    # '-z': compress_trace_data #
    # '-t trace_time' #
    # '-b trace_buf_size' #
    # '-a app_name' #
    # '-k kfuncs' #
    self._tracer_args = _construct_atrace_args(config,
                                               self._categories)
    # (1.3.3) Run: 'atrace ... categories --async_start' #
    self._device_utils.RunShellCommand(
        self._tracer_args + ['--async_start'], check_return=True)
    return True
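The body of `_construct_atrace_args` is not shown in this article, so the sketch below is a guess at its shape based only on the option list in comment (1.3.2) and on the command line observed later in the debug log. The function name `construct_atrace_args` and its keyword parameters are hypothetical.

```python
def construct_atrace_args(categories, compress=True, trace_time=None,
                          buf_size=None, app_name=None, kfuncs=None):
    """Build an atrace argument list: options first, then categories."""
    args = ["atrace"]
    if compress:
        args.append("-z")                      # compress the trace dump
    if trace_time is not None and trace_time > 0:
        args.extend(["-t", str(trace_time)])   # trace duration in seconds
    if buf_size is not None and buf_size > 0:
        args.extend(["-b", str(buf_size)])     # buffer size in KB
    if app_name:
        args.extend(["-a", app_name])          # app-level tracing
    if kfuncs:
        args.extend(["-k", kfuncs])            # kernel functions to trace
    args.extend(categories)
    return args


if __name__ == "__main__":
    cats = "gfx view wm am dalvik input sched freq idle".split()
    print(" ".join(construct_atrace_args(cats, trace_time=10, buf_size=4096)
                   + ["--async_start"]))
```

With the arguments from the example command, this reproduces the 'atrace -z -t 10 -b 4096 ... --async_start' line seen in the debug log; the same base list is reused with '--async_dump' and '--async_stop'.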

|→

This also calls the .StartAgentTracing method of the AndroidProcessDataAgent class, in ./systrace/systrace/tracing_agents/android_process_data_agent.py:

class AndroidProcessDataAgent(tracing_agents.TracingAgent):

  @py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
  def StartAgentTracing(self, config, timeout=None):
    self._device = device_utils.DeviceUtils(config.device_serial_number)
    self._trace_data += self._get_process_snapshot()
    return True



  def _get_process_snapshot(self):
    use_legacy = False
    try:
      dump = self._device.RunShellCommand( \
          PS_COMMAND_PROC, check_return=True, as_root=True, shell=True)
    except AdbShellCommandFailedError:
      use_legacy = True

    # Check length of 2 as we execute two commands, which in case of failure
    # on old devices output 1 line each.
    if use_legacy or len(dump) == 2:
      logging.debug('Couldn\'t parse ps dump, trying legacy method ...')
      dump = self._device.RunShellCommand( \
          PS_COMMAND_PROC_LEGACY, check_return=True, as_root=True, shell=True)
      if len(dump) == 2:
        logging.error('Unable to extract process data!')
        return ""

    return '\n'.join(dump) + '\n'

# In effect, the following ps command is run once at StartTracing and once at StopTracing #
PS_COMMAND_PROC = "ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM" \
    "&& ps -AT -o USER,PID,TID,CMD"

2.3. SystraceRunner.StopTracing

The entry point is the .StopTracing method of the SystraceRunner class, in ./systrace/systrace/systrace_runner.py:

class SystraceRunner(object):

  def StopTracing(self):
    # (1) #
    self._tracing_controller.StopTracing()

This in turn calls the .StopTracing method of the TracingController class, in ./systrace/systrace/tracing_controller.py:

class TracingController(object):

  def StopTracing(self):
    """Issue clock sync marker and stop tracing for all tracing agents.

    This function stops both the controller tracing agent
    and the child tracing agents. It issues a clock sync marker prior
    to stopping tracing.

    Returns:
        Boolean indicating whether or not the stop tracing succeeded
        for all agents.
    """

    # (1.1) Clear the tracing-in-progress flag on the object #
    assert self._trace_in_progress, 'No trace in progress.'
    self._trace_in_progress = False

    # Issue the clock sync marker and stop the child tracing agents.
    self._IssueClockSyncMarker()
    succ_agents = []
    # (1.2) Call .StopAgentTracing on each agent in the agent list #
    # For 'android' these are AtraceAgent and AndroidProcessDataAgent #
    for agent in self._child_agents:
      if agent.StopAgentTracing(timeout=self._controller_config.timeout):
        succ_agents.append(agent)
      else:
        print 'Agent %s not stopped.' % str(agent)

    # Stop the controller tracing agent. Controller tracing agent
    # must be stopped successfully to proceed.
    # (1.3) Call .StopAgentTracing on the member object _controller_agent, #
    # which records some log information #
    if not self._controller_agent.StopAgentTracing(
        timeout=self._controller_config.timeout):
      print 'Unable to stop controller tracing agent.'
      return False

    # Print warning if all agents not stopped.
    na = len(self._child_agents)
    ns = len(succ_agents)
    if ns < na:
      print 'Warning: Only %d of %d tracing agents stopped.' % (ns, na)
      self._child_agents = succ_agents

    # Collect the results from all the stopped tracing agents.
    all_results = []
    # (1.4) Gather the results from every agent in the list plus _controller_agent #
    for agent in self._child_agents + [self._controller_agent]:
      try:
        result = agent.GetResults(
            timeout=self._controller_config.collection_timeout)
        if not result:
          print 'Warning: Timeout when getting results from %s.' % str(agent)
          continue
        if result.source_name in [r.source_name for r in all_results]:
          print ('Warning: Duplicate tracing agents named %s.' %
                 result.source_name)
        all_results.append(result)
      # Check for exceptions. If any exceptions are seen, reraise and abort.
      # Note that a timeout exception will be swallowed by the timeout
      # mechanism and will not get to that point (it will return False instead
      # of the trace result, which will be dealt with above)
      except:
        print 'Warning: Exception getting results from %s:' % str(agent)
        print sys.exc_info()[0]
        raise
    self.all_results = all_results
    return all_results

|→

This in turn calls the .StopAgentTracing method of the AtraceAgent class, in ./systrace/systrace/tracing_agents/atrace_agent.py:

class AtraceAgent(tracing_agents.TracingAgent):

  @py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
  def StopAgentTracing(self, timeout=None):
    """Stops tracing and starts collecting results.

    To synchronously retrieve the results after calling this function,
    call GetResults.
    """

    # (1.2.1) Start a thread that collects the data and stops tracing #
    self._collection_thread = threading.Thread(
        target=self._collect_and_preprocess)
    self._collection_thread.start()
    return True



  def _collect_and_preprocess(self):
    """Collects and preprocesses trace data.

    Stores results in self._trace_data.
    """

    # (1.2.1.1) Dump the trace data and stop tracing #
    trace_data = self._collect_trace_data()

    # (1.2.1.2) Preprocess the collected trace data #
    self._trace_data = self._preprocess_trace_data(trace_data)



  def _collect_trace_data(self):
    """Reads the output from atrace and stops the trace."""

    # (1.2.1.1.1) Dump the trace data #
    # Run: 'atrace ... categories --async_dump' #
    dump_cmd = self._tracer_args + ['--async_dump']
    result = self._device_utils.RunShellCommand(
        dump_cmd, raw_output=True, large_output=True, check_return=True)

    # (1.2.1.1.2) Locate the 'TRACE\:' marker; the trace data starts right after it #
    data_start = re.search(TRACE_START_REGEXP, result)
    if data_start:
      data_start = data_start.end(0)
    else:
      raise IOError('Unable to get atrace data. Did you forget adb root?')

    # (1.2.1.1.3) Strip noise such as r'^capturing trace\.\.\. done|^capturing trace\.\.\.' from the data #
    output = re.sub(ADB_IGNORE_REGEXP, '', result[data_start:])

    # (1.2.1.1.4) Stop tracing #
    # Run: 'atrace ... categories --async_stop' #
    self._stop_trace()
    return output

  def _preprocess_trace_data(self, trace_data):
    """Performs various processing on atrace data.

    Args:
      trace_data: The raw trace data.
    Returns:
      The processed trace data.
    """

    # (1.2.1.2.1) Strip and decompress the trace data #
    if trace_data:
      trace_data = strip_and_decompress_trace(trace_data)

    if not trace_data:
      print >> sys.stderr, ('No data was captured.  Output file was not '
                            'written.')
      sys.exit(1)

    # (1.2.1.2.2) Fix up missing TGIDs #
    if _FIX_MISSING_TGIDS:
      # Gather proc data from device and patch tgids
      procfs_dump = self._device_utils.RunShellCommand(
          'echo -n /proc/[0-9]*/task/[0-9]*',
          shell=True, check_return=True)[0].split(' ')
      pid2_tgid = extract_tgids(procfs_dump)
      trace_data = fix_missing_tgids(trace_data, pid2_tgid)

    # (1.2.1.2.3) Fix up circular traces #
    if _FIX_CIRCULAR_TRACES:
      trace_data = fix_circular_traces(trace_data)

    return trace_data
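Two of the pure text-processing steps above can be sketched on their own: cutting the payload out of the raw dump (steps 1.2.1.1.2 and 1.2.1.1.3) and turning the '/proc/<tgid>/task/<tid>' path list into a thread-to-process map (step 1.2.1.2.2). The two regular expressions are the ones quoted in the comments above; the function names `extract_trace_payload` and `map_tid_to_tgid` are hypothetical and are not claimed to match catapult's own helpers.

```python
import re

TRACE_START_REGEXP = r"TRACE\:"
ADB_IGNORE_REGEXP = r"^capturing trace\.\.\. done|^capturing trace\.\.\."


def extract_trace_payload(raw):
    """Return everything after the 'TRACE:' marker, minus adb status noise."""
    match = re.search(TRACE_START_REGEXP, raw)
    if not match:
        raise IOError("Unable to get atrace data. Did you forget adb root?")
    return re.sub(ADB_IGNORE_REGEXP, "", raw[match.end(0):])


def map_tid_to_tgid(procfs_paths):
    """Map each thread id to its process id from /proc/<tgid>/task/<tid>."""
    tid_to_tgid = {}
    for path in procfs_paths:
        parts = path.split("/")
        # Expected shape: ['', 'proc', '<tgid>', 'task', '<tid>']
        if len(parts) == 5 and parts[1] == "proc" and parts[3] == "task":
            tid_to_tgid[parts[4]] = parts[2]
    return tid_to_tgid


if __name__ == "__main__":
    raw = "capturing trace... done\nTRACE:\n# tracer: nop\n"
    print(repr(extract_trace_payload(raw)))
    paths = "/proc/100/task/100 /proc/100/task/101".split(" ")
    print(map_tid_to_tgid(paths))
```

The tid-to-tgid map is what lets a viewer group threads under their owning process when the trace lines themselves lack the process id.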

|→

This also calls the .StopAgentTracing method of the AndroidProcessDataAgent class, in ./systrace/systrace/tracing_agents/android_process_data_agent.py:

class AndroidProcessDataAgent(tracing_agents.TracingAgent):

  @py_utils.Timeout(tracing_agents.START_STOP_TIMEOUT)
  def StopAgentTracing(self, timeout=None):
    self._trace_data += self._get_process_snapshot()
    return True

# _get_process_snapshot and PS_COMMAND_PROC are the same as listed in section 2.2: #
# the ps command is run once at StartTracing and once at StopTracing #

2.4. SystraceRunner.OutputSystraceResults

The entry point is the .OutputSystraceResults method of the SystraceRunner class, in ./systrace/systrace/systrace_runner.py:

class SystraceRunner(object):

  def OutputSystraceResults(self, write_json=False):
    """Output the results of systrace to a file.

    If output is necessary, then write the results of systrace to either (a)
    a standalone HTML file, or (b) a json file which can be read by the
    trace viewer.

    Args:
       write_json: Whether to output to a json file (if false, use HTML file)
    """
    print 'Tracing complete, writing results'
    if write_json:
      # (1) Write the output in JSON format #
      result = output_generator.GenerateJSONOutput(
                  self._tracing_controller.all_results,
                  self._out_filename)
    else:
      # (2) Write the output in HTML format #
      # all_results is the trace result obtained in the previous StopTracing step #
      # _out_filename is the file name given with the '-o' option #
      result = output_generator.GenerateHTMLOutput(
                  self._tracing_controller.all_results,
                  self._out_filename)
    print '\nWrote trace %s file: file://%s\n' % (('JSON' if write_json
                                                   else 'HTML'), result)

This in turn calls the GenerateHTMLOutput function in the output_generator module, in ./systrace/systrace/output_generator.py:

def GenerateHTMLOutput(trace_results, output_file_name):
  """Write the results of systrace to an HTML file.

  Args:
      trace_results: A list of TraceResults.
      output_file_name: The name of the HTML file that the trace viewer
          results should be written to.
  """
  def _ReadAsset(src_dir, filename):
    return open(os.path.join(src_dir, filename)).read()

  # TODO(rnephew): The tracing output formatter is able to handle a single
  # systrace trace just as well as it handles multiple traces. The obvious thing
  # to do here would be to use it all for all systrace output: however, we want
  # to continue using the legacy way of formatting systrace output when a single
  # systrace and the tracing controller trace are present in order to match the
  # Java version of systrace. Java systrace is expected to be deleted at a later
  # date. We should consolidate this logic when that happens.

  # (2.1) If the result list has more than 3 entries, use the new NewGenerateHTMLOutput path #
  if len(trace_results) > 3:
    NewGenerateHTMLOutput(trace_results, output_file_name)
    return os.path.abspath(output_file_name)

  systrace_dir = os.path.abspath(os.path.dirname(__file__))

  # (2.2) Try to update the ./systrace/systrace/systrace_trace_viewer.html file #
  try:
    from systrace import update_systrace_trace_viewer
  except ImportError:
    pass
  else:
    update_systrace_trace_viewer.update()

  trace_viewer_html = _ReadAsset(systrace_dir, 'systrace_trace_viewer.html')

  # Open the file in binary mode to prevent python from changing the
  # line endings, then write the prefix.

  # (2.3) Read the contents of 'prefix.html', 'suffix.html', and 'systrace_trace_viewer.html' for later use #
  systrace_dir = os.path.abspath(os.path.dirname(__file__))
  html_prefix = _ReadAsset(systrace_dir, 'prefix.html')
  html_suffix = _ReadAsset(systrace_dir, 'suffix.html')
  trace_viewer_html = _ReadAsset(systrace_dir,
                                  'systrace_trace_viewer.html')

  # Open the file in binary mode to prevent python from changing the
  # line endings, then write the prefix.
  # (2.4) Open the output file 'xxx.html' for writing #
  html_file = open(output_file_name, 'wb')

  # (2.4.1) First write the contents of 'prefix.html', #
  # replacing the '{{SYSTRACE_TRACE_VIEWER_HTML}}' placeholder with the contents of 'systrace_trace_viewer.html' #
  html_file.write(html_prefix.replace('{{SYSTRACE_TRACE_VIEWER_HTML}}',
                                      trace_viewer_html))

  # Write the trace data itself. There is a separate section of the form
  # <script class="trace-data" type="application/text"> ... </script>
  # for each tracing agent (including the controller tracing agent).

  # (2.4.2) Write each trace in trace_results in the required format #
  html_file.write('<!-- BEGIN TRACE -->\n')
  for result in trace_results:
    html_file.write('  <script class="trace-data" type="application/text">\n')
    html_file.write(_ConvertToHtmlString(result.raw_data))
    html_file.write('  </script>\n')
  html_file.write('<!-- END TRACE -->\n')

  # Write the suffix and finish.
  # (2.4.3) Finally write the contents of 'suffix.html' #
  html_file.write(html_suffix)
  html_file.close()

  final_path = os.path.abspath(output_file_name)
  return final_path
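The assembly in steps (2.4.1) to (2.4.3) amounts to string concatenation around a placeholder. The sketch below shows that layout in memory; `build_trace_html` is a hypothetical helper, the template strings in the demo are tiny stand-ins for the real prefix/suffix/viewer files, and the conversion done by `_ConvertToHtmlString` is deliberately left out (raw strings are embedded as-is).

```python
PLACEHOLDER = "{{SYSTRACE_TRACE_VIEWER_HTML}}"


def build_trace_html(prefix, viewer, suffix, raw_traces):
    """Assemble prefix + viewer, one <script> per trace, then the suffix."""
    parts = [prefix.replace(PLACEHOLDER, viewer)]
    parts.append("<!-- BEGIN TRACE -->\n")
    for raw in raw_traces:
        parts.append('  <script class="trace-data" type="application/text">\n')
        parts.append(raw)
        parts.append("  </script>\n")
    parts.append("<!-- END TRACE -->\n")
    parts.append(suffix)
    return "".join(parts)


if __name__ == "__main__":
    html = build_trace_html("<html>" + PLACEHOLDER + "\n", "<!-- viewer -->",
                            "</html>\n", ["# tracer: nop\n"])
    print(html)
```

Because every agent's data gets its own script element, the viewer can load the atrace data and the process snapshot as separate inputs from one self-contained file.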

2.5. Commands actually executed

We added debug prints to RunShellCommand.run in devil/devil/android/device_utils.py and to Popen in devil/devil/utils/cmd_helper.py. With those in place, the command './systrace -o trace.html -t 10 gfx view wm am dalvik input sched freq idle' executes as follows:

AOSP_ROOT/external/chromium-trace/catapult/systrace/bin/systrace -o trace.html -t 10 gfx view wm am dalvik input sched freq idle
Popen: [u'/usr/bin/adb', 'devices']
run: ( c=/data/local/tmp/cache_token;echo $EXTERNAL_STORAGE;cat $c 2>/dev/null||echo;echo "636d6fb8-d59c-11e8-ae24-b8ca3a959992">$c &&getprop )>/data/local/tmp/temp_file-4aec8d95e60eb
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ( c=/data/local/tmp/cache_token;echo $EXTERNAL_STORAGE;cat $c 2>/dev/null||echo;echo "636d6fb8-d59c-11e8-ae24-b8ca3a959992">$c &&getprop )>/data/local/tmp/temp_file-4aec8d95e60eb );echo %$?']
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'pull', '/data/local/tmp/temp_file-4aec8d95e60eb', '/tmp/tmple_FOq/tmp_ReadFileWithPull']
run: su 0 ls /root && ! ls /root
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( su 0 ls /root && ! ls /root );echo %$?']
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', 'rm -f /data/local/tmp/temp_file-4aec8d95e60eb']

run: ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD );echo %$?']
run: atrace --list_categories
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( atrace --list_categories );echo %$?']
run: atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_start
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_start );echo %$?']
Starting tracing (10 seconds)
Tracing completed. Collecting output...
run: ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ps -A -o USER,PID,PPID,VSIZE,RSS,WCHAN,ADDR=PC,S,NAME,COMM&& ps -AT -o USER,PID,TID,CMD );echo %$?']
run: ( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_dump )>/data/local/tmp/temp_file-57edea8625b51
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( ( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_dump )>/data/local/tmp/temp_file-57edea8625b51 );echo %$?']
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'pull', '/data/local/tmp/temp_file-57edea8625b51', '/tmp/tmp_preLk/tmp_ReadFileWithPull']
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', 'rm -f /data/local/tmp/temp_file-57edea8625b51']
run: atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_stop
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_stop );echo %$?']
run: echo -n /proc/[0-9]*/task/[0-9]*
Popen: [u'/usr/bin/adb', '-s', '872QADT5KWKRG', 'shell', '( echo -n /proc/[0-9]*/task/[0-9]* );echo %$?']
Outputting Systrace results...
Tracing complete, writing results

Wrote trace HTML file: file:///home/pengweilin/sdb2/m1872/code/repo/external/chromium-trace/catapult/systrace/bin/pwl1.html

Stripped down to the essentials, systrace simply runs the atrace command on the target device:

adb shell atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_start
adb shell "atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_dump > /data/local/tmp/temp_file-57edea8625b51"
adb pull /data/local/tmp/temp_file-57edea8625b51 /tmp/tmp_preLk/tmp_ReadFileWithPull
adb shell rm -f /data/local/tmp/temp_file-57edea8625b51
adb shell atrace -z -t 10 -b 4096 gfx view wm am dalvik input sched freq idle --async_stop
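The same start / dump / pull / clean up / stop sequence can be sketched in Python. This is only an illustration of the order of the commands in the log above, not the code systrace itself uses: the function name, the default temporary paths and the serial number in the demo are hypothetical, and the function only builds the argv lists without running adb.

```python
def atrace_adb_sequence(serial, categories, seconds=10, buffer_kb=4096,
                        device_tmp="/data/local/tmp/trace_dump",
                        host_tmp="/tmp/trace_dump"):
    """Return the adb argv lists for start, dump, pull, cleanup and stop.

    device_tmp and host_tmp are hypothetical file names chosen for this
    sketch; systrace generates its own temporary names.
    """
    base = ["atrace", "-z", "-t", str(seconds), "-b", str(buffer_kb)]
    base += list(categories)
    adb = ["adb", "-s", serial]

    def shell(cmd):
        # The whole device-side command is passed as one string so that the
        # redirection is interpreted on the device, not on the host.
        return adb + ["shell", cmd]

    start = " ".join(base + ["--async_start"])
    dump = " ".join(base + ["--async_dump"]) + " > " + device_tmp
    stop = " ".join(base + ["--async_stop"])
    return [
        shell(start),
        shell(dump),
        adb + ["pull", device_tmp, host_tmp],
        shell("rm -f " + device_tmp),
        shell(stop),
    ]


if __name__ == "__main__":
    for argv in atrace_adb_sequence("SERIAL", ["gfx", "sched"]):
        print(argv)
```

Each returned list could be handed to subprocess.run() as is; keeping command construction separate from execution makes the sequence easy to inspect.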

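3. Implementation of the atrace command on the target device (C++)

As the previous section showed, the host-side systrace command ultimately does its work by invoking atrace on the target device. An atrace command line has two parts:

  • [options]: the arguments that begin with '-' or '--';

  • [categories...]: the events to trace, such as 'gfx view wm am dalvik input sched freq idle'.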

$ atrace --help
usage: atrace [options] [categories...]
options include:
  -a appname      enable app-level tracing for a comma separated list of cmdlines
  -b N            use a trace buffer size of N KB
  -c              trace into a circular buffer
  -f filename     use the categories written in a file as space-separated
                    values in a line
  -k fname,...    trace the listed kernel functions
  -n              ignore signals
  -s N            sleep for N seconds before tracing [default 0]
  -t N            trace for N seconds [default 5]
  -z              compress the trace dump
  --async_start   start circular trace and return immediately
  --async_dump    dump the current contents of circular trace buffer
  --async_stop    stop tracing and dump the current contents of circular
                    trace buffer
  --stream        stream trace to stdout as it enters the trace buffer
                    Note: this can take significant CPU time, and is best
                    used for measuring things that are not affected by
                    CPU performance, like pagecache usage.
  --list_categories
                  list the available tracing categories
  -o filename     write the trace to the specified file instead
                    of stdout.
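To make the [options] / [categories...] split concrete, here is a small Python sketch that separates the two with the standard getopt module. The short-option string and the long-option names are the ones atrace passes to getopt_long() in its main(); the function name is hypothetical, and using gnu_getopt (which lets options follow the categories, as in the systrace log) is our choice for this sketch.

```python
import getopt

# Same option letters as atrace's getopt_long() call; a trailing ':' means
# the option takes an argument.
SHORT_OPTS = "a:b:cf:k:ns:t:zo:"
LONG_OPTS = ["async_start", "async_stop", "async_dump",
             "list_categories", "stream"]


def split_atrace_args(argv):
    """Split an atrace argument list into (options, categories)."""
    # gnu_getopt accepts options before or after the positional arguments.
    options, categories = getopt.gnu_getopt(argv, SHORT_OPTS, LONG_OPTS)
    return options, categories


if __name__ == "__main__":
    args = "-z -t 10 -b 4096 gfx view sched --async_start".split()
    options, categories = split_atrace_args(args)
    print("options:   ", options)
    print("categories:", categories)
```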

The atrace source lives in AOSP_ROOT/frameworks/native/cmds/atrace/atrace.cpp. Let us walk through its implementation:

int main(int argc, char **argv)
{
    bool async = false;
    bool traceStart = true;
    bool traceStop = true;
    bool traceDump = true;
    bool traceStream = false;

    /* (1) If atrace is invoked with "--help", print the help text */
    if (argc == 2 && 0 == strcmp(argv[1], "--help")) {
        showHelp(argv[0]);
        exit(0);
    }

    /* (2) Check that the ftrace directory exists */
    if (!findTraceFiles()) {
        fprintf(stderr, "No trace folder found\n");
        exit(-1);
    }

    /* (3) Parse the atrace command-line arguments one by one */
    for (;;) {
        int ret;
        int option_index = 0;
        static struct option long_options[] = {
            {"async_start",     no_argument, 0, 0 },
            {"async_stop",      no_argument, 0, 0 },
            {"async_dump",      no_argument, 0, 0 },
            {"list_categories", no_argument, 0, 0 },
            {"stream",          no_argument, 0, 0 },
            { 0,                0,           0, 0 }
        };

        /* (3.1) Try to parse the next argument as an option */
        ret = getopt_long(argc, argv, "a:b:cf:k:ns:t:zo:",
                          long_options, &option_index);

        /* (3.2) No options left: treat the remaining arguments as categories */
        if (ret < 0) {
            for (int i = optind; i < argc; i++) {
                if (!setCategoryEnable(argv[i], true)) {
                    fprintf(stderr, "error enabling tracing category \"%s\"\n", argv[i]);
                    exit(1);
                }
            }
            break;
        }

        /* (3.3) Set flags according to the parsed option */
        switch(ret) {
            case 'a':
                g_debugAppCmdLine = optarg;
            break;

            case 'b':
                g_traceBufferSizeKB = atoi(optarg);
            break;

            case 'c':
                g_traceOverwrite = true;
            break;

            case 'f':
                g_categoriesFile = optarg;
            break;

            case 'k':
                g_kernelTraceFuncs = optarg;
            break;

            case 'n':
                g_nohup = true;
            break;

            case 's':
                g_initialSleepSecs = atoi(optarg);
            break;

            case 't':
                g_traceDurationSeconds = atoi(optarg);
            break;

            case 'z':
                g_compress = true;
            break;

            case 'o':
                g_outputFile = optarg;
            break;

            case 0:
                if (!strcmp(long_options[option_index].name, "async_start")) {
                    async = true;
                    traceStop = false;
                    traceDump = false;
                    g_traceOverwrite = true;
                } else if (!strcmp(long_options[option_index].name, "async_stop")) {
                    async = true;
                    traceStart = false;
                } else if (!strcmp(long_options[option_index].name, "async_dump")) {
                    async = true;
                    traceStart = false;
                    traceStop = false;
                } else if (!strcmp(long_options[option_index].name, "stream")) {
                    traceStream = true;
                    traceDump = false;
                } else if (!strcmp(long_options[option_index].name, "list_categories")) {
                    listSupportedCategories();
                    exit(0);
                }
            break;

            default:
                fprintf(stderr, "\n");
                showHelp(argv[0]);
                exit(-1);
            break;
        }
    }

    registerSigHandler();

    if (g_initialSleepSecs > 0) {
        sleep(g_initialSleepSecs);
    }

    bool ok = true;
    /* (4) Start phase: runs for a plain atrace invocation and for "--async_start" */
    if (traceStart) {
        /* (4.1) Configure ftrace and the trace properties */
        ok &= setUpTrace();

        /* (4.2) Turn tracing on */
        ok &= startTrace();
    }

    if (ok && traceStart) {
        if (!traceStream) {
            printf("capturing trace...");
            fflush(stdout);
        }

        // We clear the trace after starting it because tracing gets enabled for
        // each CPU individually in the kernel. Having the beginning of the trace
        // contain entries from only one CPU can cause "begin" entries without a
        // matching "end" entry to show up if a task gets migrated from one CPU to
        // another.
        /* (4.3) Clear the trace contents */
        ok = clearTrace();

        /* (4.4) Write a clock-sync marker through the trace_marker file */
        writeClockSyncMarker();

        /* (4.5) Synchronous mode (async == false): sleep for the trace duration */
        if (ok && !async && !traceStream) {
            // Sleep to allow the trace to be captured.
            struct timespec timeLeft;
            timeLeft.tv_sec = g_traceDurationSeconds;
            timeLeft.tv_nsec = 0;
            do {
                if (g_traceAborted) {
                    break;
                }
            } while (nanosleep(&timeLeft, &timeLeft) == -1 && errno == EINTR);
        }

        /* (4.6) Streaming mode ("--stream") */
        if (traceStream) {
            streamTrace();
        }
    }

    // Stop the trace and restore the default settings.
    /* (5) Stop phase: runs for a plain atrace invocation and for "--async_stop" */
    if (traceStop)
        stopTrace();

    /* (6) Dump phase: runs for a plain atrace invocation, "--async_dump" and "--async_stop" */
    if (ok && traceDump) {
        if (!g_traceAborted) {
            printf(" done\n");
            fflush(stdout);
            int outFd = STDOUT_FILENO;
            if (g_outputFile) {
                outFd = open(g_outputFile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            }
            if (outFd == -1) {
                printf("Failed to open '%s', err=%d", g_outputFile, errno);
            } else {
                dprintf(outFd, "TRACE:\n");
                dumpTrace(outFd);
                if (g_outputFile) {
                    close(outFd);
                }
            }
        } else {
            printf("\ntrace aborted.\n");
            fflush(stdout);
        }
        clearTrace();
    } else if (!ok) {
        fprintf(stderr, "unable to start tracing\n");
    }

    // Reset the trace buffer size to 1.
    if (traceStop)
        cleanUpTrace();

    return g_traceAborted ? 1 : 0;
}
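The interplay of the traceStart / traceStop / traceDump flags is easy to lose in a listing of this size, so here is a minimal Python sketch that reproduces only the flag handling of the long options in main(). The function name and the dictionary keys are hypothetical; the flag values follow the listing.

```python
def atrace_phases(long_option=None):
    """Return the phase flags main() ends up with for one long option.

    long_option is the name without the leading "--", or None for a plain
    synchronous atrace run.
    """
    flags = {"async": False, "start": True, "stop": True,
             "dump": True, "stream": False}
    if long_option == "async_start":
        flags.update({"async": True, "stop": False, "dump": False})
    elif long_option == "async_stop":
        flags.update({"async": True, "start": False})
    elif long_option == "async_dump":
        flags.update({"async": True, "start": False, "stop": False})
    elif long_option == "stream":
        flags.update({"stream": True, "dump": False})
    elif long_option is not None:
        raise ValueError("unknown long option: %s" % long_option)
    return flags


if __name__ == "__main__":
    for opt in (None, "async_start", "async_dump", "async_stop", "stream"):
        phases = atrace_phases(opt)
        ran = [name for name in ("start", "stop", "dump") if phases[name]]
        print("%-12s -> %s" % (opt, ", ".join(ran)))
```

Reading the result row by row shows that '--async_stop' both stops and dumps, while '--async_dump' only dumps and leaves tracing running.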

3.1 categories

A category is a class of events that can be traced. The full set is:

/* Tracing categories */
static const TracingCategory k_categories[] = {
    { "gfx",        "Graphics",         ATRACE_TAG_GRAPHICS, {
        { OPT,      "events/mdss/enable" },
    } },
    { "input",      "Input",            ATRACE_TAG_INPUT, { } },
    { "view",       "View System",      ATRACE_TAG_VIEW, { } },
    { "webview",    "WebView",          ATRACE_TAG_WEBVIEW, { } },
    { "wm",         "Window Manager",   ATRACE_TAG_WINDOW_MANAGER, { } },
    { "am",         "Activity Manager", ATRACE_TAG_ACTIVITY_MANAGER, { } },
    { "sm",         "Sync Manager",     ATRACE_TAG_SYNC_MANAGER, { } },
    { "audio",      "Audio",            ATRACE_TAG_AUDIO, { } },
    { "video",      "Video",            ATRACE_TAG_VIDEO, { } },
    { "camera",     "Camera",           ATRACE_TAG_CAMERA, { } },
    { "hal",        "Hardware Modules", ATRACE_TAG_HAL, { } },
    { "app",        "Application",      ATRACE_TAG_APP, { } },
    { "res",        "Resource Loading", ATRACE_TAG_RESOURCES, { } },
    { "dalvik",     "Dalvik VM",        ATRACE_TAG_DALVIK, { } },
    { "rs",         "RenderScript",     ATRACE_TAG_RS, { } },
    { "bionic",     "Bionic C Library", ATRACE_TAG_BIONIC, { } },
    { "power",      "Power Management", ATRACE_TAG_POWER, { } },
    { "pm",         "Package Manager",  ATRACE_TAG_PACKAGE_MANAGER, { } },
    { "ss",         "System Server",    ATRACE_TAG_SYSTEM_SERVER, { } },
    { "database",   "Database",         ATRACE_TAG_DATABASE, { } },
    { "network",    "Network",          ATRACE_TAG_NETWORK, { } },
    { "adb",        "ADB",              ATRACE_TAG_ADB, { } },
    { k_coreServiceCategory, "Core services", 0, { } },
    { k_pdxServiceCategory, "PDX services", 0, { } },
    { "sched",      "CPU Scheduling",   0, {
        { REQ,      "events/sched/sched_switch/enable" },
        { REQ,      "events/sched/sched_wakeup/enable" },
        { OPT,      "events/sched/sched_waking/enable" },
        { OPT,      "events/sched/sched_blocked_reason/enable" },
        { OPT,      "events/sched/sched_cpu_hotplug/enable" },
        { OPT,      "events/cgroup/enable" },
    } },
    { "irq",        "IRQ Events",   0, {
        { REQ,      "events/irq/enable" },
        { OPT,      "events/ipi/enable" },
    } },
    { "i2c",        "I2C Events",   0, {
        { REQ,      "events/i2c/enable" },
        { REQ,      "events/i2c/i2c_read/enable" },
        { REQ,      "events/i2c/i2c_write/enable" },
        { REQ,      "events/i2c/i2c_result/enable" },
        { REQ,      "events/i2c/i2c_reply/enable" },
        { OPT,      "events/i2c/smbus_read/enable" },
        { OPT,      "events/i2c/smbus_write/enable" },
        { OPT,      "events/i2c/smbus_result/enable" },
        { OPT,      "events/i2c/smbus_reply/enable" },
    } },
    { "freq",       "CPU Frequency",    0, {
        { REQ,      "events/power/cpu_frequency/enable" },
        { OPT,      "events/power/clock_set_rate/enable" },
        { OPT,      "events/power/cpu_frequency_limits/enable" },
    } },
    { "membus",     "Memory Bus Utilization", 0, {
        { REQ,      "events/memory_bus/enable" },
    } },
    { "idle",       "CPU Idle",         0, {
        { REQ,      "events/power/cpu_idle/enable" },
    } },
    { "disk",       "Disk I/O",         0, {
        { OPT,      "events/f2fs/f2fs_sync_file_enter/enable" },
        { OPT,      "events/f2fs/f2fs_sync_file_exit/enable" },
        { OPT,      "events/f2fs/f2fs_write_begin/enable" },
        { OPT,      "events/f2fs/f2fs_write_end/enable" },
        { OPT,      "events/ext4/ext4_da_write_begin/enable" },
        { OPT,      "events/ext4/ext4_da_write_end/enable" },
        { OPT,      "events/ext4/ext4_sync_file_enter/enable" },
        { OPT,      "events/ext4/ext4_sync_file_exit/enable" },
        { REQ,      "events/block/block_rq_issue/enable" },
        { REQ,      "events/block/block_rq_complete/enable" },
    } },
    { "mmc",        "eMMC commands",    0, {
        { REQ,      "events/mmc/enable" },
    } },
    { "load",       "CPU Load",         0, {
        { REQ,      "events/cpufreq_interactive/enable" },
    } },
    { "sync",       "Synchronization",  0, {
        { REQ,      "events/sync/enable" },
    } },
    { "workq",      "Kernel Workqueues", 0, {
        { REQ,      "events/workqueue/enable" },
    } },
    { "memreclaim", "Kernel Memory Reclaim", 0, {
        { REQ,      "events/vmscan/mm_vmscan_direct_reclaim_begin/enable" },
        { REQ,      "events/vmscan/mm_vmscan_direct_reclaim_end/enable" },
        { REQ,      "events/vmscan/mm_vmscan_kswapd_wake/enable" },
        { REQ,      "events/vmscan/mm_vmscan_kswapd_sleep/enable" },
        { REQ,      "events/lowmemorykiller/enable" },
    } },
    { "regulators", "Voltage and Current Regulators", 0, {
        { REQ,      "events/regulator/enable" },
    } },
    { "binder_driver", "Binder Kernel driver", 0, {
        { REQ,      "events/binder/binder_transaction/enable" },
        { REQ,      "events/binder/binder_transaction_received/enable" },
        { OPT,      "events/binder/binder_set_priority/enable" },
    } },
    { "binder_lock", "Binder global lock trace", 0, {
        { OPT,      "events/binder/binder_lock/enable" },
        { OPT,      "events/binder/binder_locked/enable" },
        { OPT,      "events/binder/binder_unlock/enable" },
    } },
    { "pagecache",  "Page cache", 0, {
        { REQ,      "events/filemap/enable" },
    } },
};

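These categories fall roughly into three classes:

  • 1. Kernel events implemented with ftrace trace events. "sched" is an example; its .sysfiles field lists the paths of the trace events;

  • 2. User-space events (app / Java framework / native) implemented with the Trace class. "input" is an example; its .tags field gives the trace tag;

  • 3. Service-type events, namely k_coreServiceCategory and k_pdxServiceCategory.

A given target device may support only a subset of the full category list, so atrace checks whether each category named on the command line is supported on this device: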

main -> setCategoryEnable:

static bool setCategoryEnable(const char* name, bool enable)
{
    /* (1) Look up the category named on the command line in k_categories */
    for (size_t i = 0; i < arraysize(k_categories); i++) {
        const TracingCategory& c = k_categories[i];
        if (strcmp(name, c.name) == 0) {

            /* (2) Check whether this device supports the category */
            if (isCategorySupported(c)) {
                g_categoryEnables[i] = enable;
                return true;
            } else {
                /* (3) Not supported: is that only because we lack root privileges? */
                if (isCategorySupportedForRoot(c)) {
                    fprintf(stderr, "error: category \"%s\" requires root "
                            "privileges.\n", name);

                /* (4) Not supported on this device at all: report an error */
                } else {
                    fprintf(stderr, "error: category \"%s\" is not supported "
                            "on this device.\n", name);
                }
                return false;
            }
        }
    }
    fprintf(stderr, "error: unknown tracing category \"%s\"\n", name);
    return false;
}



static bool isCategorySupported(const TracingCategory& category)
{
    /* (2.1) The core-services category (name "core_services") is supported only if the "ro.atrace.core.services" property is set */
    if (strcmp(category.name, k_coreServiceCategory) == 0) {
        return !android::base::GetProperty(k_coreServicesProp, "").empty();
    }

    /* (2.2) The PDX category (name "pdx") is always supported */
    if (strcmp(category.name, k_pdxServiceCategory) == 0) {
        return true;
    }

    bool ok = category.tags != 0;
    /* (2.3) For categories backed by kernel trace events, check that every required enable file exists and is writable */
    for (int i = 0; i < MAX_SYS_FILES; i++) {
        const char* path = category.sysfiles[i].path;
        bool req = category.sysfiles[i].required == REQ;
        if (path != NULL) {
            if (req) {
                if (!fileIsWritable(path)) {
                    return false;
                } else {
                    ok = true;
                }
            } else {
                ok = true;
            }
        }
    }
    return ok;
}
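The rule in step (2.3) has a subtlety: a required (REQ) file that is not writable makes the category unsupported, whereas an optional (OPT) entry marks the category as supported without being checked at all. The Python sketch below restates just that rule so it can be tried on a host machine. It deliberately leaves out the core-services and PDX special cases, and the function names, the (required, path) tuple layout and the injectable is_writable callback are assumptions made for the sketch.

```python
import os

REQ, OPT = True, False          # stand-ins for the REQ/OPT markers
TRACE_ROOT = "/sys/kernel/debug/tracing/"


def _default_is_writable(path):
    # Paths in k_categories are relative to the tracing directory.
    return os.access(os.path.join(TRACE_ROOT, path), os.W_OK)


def is_category_supported(tags, sysfiles, is_writable=_default_is_writable):
    """tags: int bitmask; sysfiles: list of (required, path) pairs."""
    ok = tags != 0               # a user-space tag alone is enough
    for required, path in sysfiles:
        if required:
            if not is_writable(path):
                return False     # one missing REQ file rules the category out
            ok = True
        else:
            ok = True            # OPT entries are not checked here
    return ok


if __name__ == "__main__":
    # Pretend only the two scheduler events exist on this device.
    present = {"events/sched/sched_switch/enable",
               "events/sched/sched_wakeup/enable"}
    fake = lambda path: path in present
    sched = [(REQ, "events/sched/sched_switch/enable"),
             (REQ, "events/sched/sched_wakeup/enable"),
             (OPT, "events/cgroup/enable")]
    print("sched:", is_category_supported(0, sched, fake))
    print("mmc:  ", is_category_supported(0, [(REQ, "events/mmc/enable")], fake))
```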

We can also use atrace's "--list_categories" option to list every category the current target device supports:

main -> listSupportedCategories:

static void listSupportedCategories()
{
    /* (1) Walk through every category in the k_categories array */
    for (size_t i = 0; i < arraysize(k_categories); i++) {
        const TracingCategory& c = k_categories[i];

        /* (2) Print it only if the current target device supports it */
        if (isCategorySupported(c)) {
            printf(" %10s - %s\n", c.name, c.longname);
        }
    }
}

3.2 '--async_start'

When atrace is run with the '--async_start' option, it starts tracing for all the specified categories:

static bool setUpTrace()
{
    bool ok = true;

    // Set up the tracing options.
    /* (4.1.1) If the categories were given in a file ('-f'), parse the file and enable them */
    ok &= setCategoriesEnableFromFile(g_categoriesFile);

    /* (4.1.2) Apply the '-c' option to the ftrace overwrite setting: "/sys/kernel/debug/tracing/options/overwrite" */
    ok &= setTraceOverwriteEnable(g_traceOverwrite);

    /* (4.1.3) Apply the '-b' option to the ftrace buffer size: "/sys/kernel/debug/tracing/buffer_size_kb" */
    ok &= setTraceBufferSizeKB(g_traceBufferSizeKB);
    // TODO: Re-enable after stabilization
    //ok &= setCmdlineSize();

    /* (4.1.4) Select the trace clock: "/sys/kernel/debug/tracing/trace_clock" */
    ok &= setClock();

    /* (4.1.5) Enable "/sys/kernel/debug/tracing/options/print-tgid" if it is present */
    ok &= setPrintTgidEnableIfPresent(true);

    /* (4.1.6) For the kernel functions listed with '-k':
       set the current tracer to function_graph: "/sys/kernel/debug/tracing/current_tracer"
       and write the function names into the filter: "/sys/kernel/debug/tracing/set_ftrace_filter"
     */
    ok &= setKernelTraceFuncs(g_kernelTraceFuncs);

    // Set up the tags property.
    uint64_t tags = 0;
    /* (4.1.7) OR together the tags of all enabled categories and store the result in the "debug.atrace.tags.enableflags" property */
    for (size_t i = 0; i < arraysize(k_categories); i++) {
        if (g_categoryEnables[i]) {
            const TracingCategory &c = k_categories[i];
            tags |= c.tags;
        }
    }
    ok &= setTagsProperty(tags);

    bool coreServicesTagEnabled = false;
    for (size_t i = 0; i < arraysize(k_categories); i++) {
        if (strcmp(k_categories[i].name, k_coreServiceCategory) == 0) {
            coreServicesTagEnabled = g_categoryEnables[i];
        }

        // Set whether to poke PDX services in this session.
        if (strcmp(k_categories[i].name, k_pdxServiceCategory) == 0) {
            g_tracePdx = g_categoryEnables[i];
        }
    }

    /* (4.1.8) Write the app names given with '-a' (at most 16) into the
       "debug.atrace.app_%d" properties
     */
    std::string packageList(g_debugAppCmdLine);
    if (coreServicesTagEnabled) {
        if (!packageList.empty()) {
            packageList += ",";
        }
        packageList += android::base::GetProperty(k_coreServicesProp, "");
    }
    ok &= setAppCmdlineProperty(&packageList[0]);

    /* (4.1.9) Make every service registered with defaultServiceManager re-read the properties so that the new configuration takes effect */
    ok &= pokeBinderServices();

    /* (4.1.10) Make every service registered with ::android::hardware::defaultServiceManager re-read the properties so that the new configuration takes effect */
    pokeHalServices();

    /* (4.1.11) If the "pdx" category was requested, make the PDX services re-read the properties as well */
    if (g_tracePdx) {
        ok &= ServiceUtility::PokeServices();
    }

    // Disable all the sysfs enables. This is done as a separate loop from
    // the enables to allow the same enable to exist in multiple categories.
    /* (4.1.12) Disable all trace events */
    ok &= disableKernelTraceEvents();

    // Enable all the sysfs enables that are in an enabled category.
    /* (4.1.13) Enable the trace events listed for each enabled category in k_categories */
    for (size_t i = 0; i < arraysize(k_categories); i++) {
        if (g_categoryEnables[i]) {
            const TracingCategory &c = k_categories[i];
            for (int j = 0; j < MAX_SYS_FILES; j++) {
                const char* path = c.sysfiles[j].path;
                bool required = c.sysfiles[j].required == REQ;
                if (path != NULL) {
                    if (fileIsWritable(path)) {
                        ok &= setKernelOptionEnable(path, true);
                    } else if (required) {
                        fprintf(stderr, "error writing file %s\n", path);
                        ok = false;
                    }
                }
            }
        }
    }

    return ok;
}

static bool startTrace()
{
    return setTracingEnabled(true);
}

// Enable or disable kernel tracing.
static bool setTracingEnabled(bool enable)
{
    /* (4.2.1) Flip the master switch: "/sys/kernel/debug/tracing/tracing_on" */
    return setKernelOptionEnable(k_tracingOnPath, enable);
}
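Step (4.1.7) is what connects the command-line categories to the user-space Trace class: the tags of all enabled categories are OR-ed into one bitmask that is published through the "debug.atrace.tags.enableflags" property. Here is a minimal Python sketch of that computation. The bit positions are the ATRACE_TAG_* values as we recall them from <cutils/trace.h>, so treat them as assumptions and check them against your own tree; the function name is hypothetical.

```python
# Assumed bit positions of a few ATRACE_TAG_* constants (verify against
# <cutils/trace.h> in your source tree).
ATRACE_TAGS = {
    "gfx": 1 << 1,
    "input": 1 << 2,
    "view": 1 << 3,
    "webview": 1 << 4,
    "wm": 1 << 5,
    "am": 1 << 6,
    "dalvik": 1 << 14,
}


def enable_flags(categories, tag_table=ATRACE_TAGS):
    """OR together the tags of the given categories.

    Kernel-only categories such as "sched" have a tag of 0 in k_categories,
    so they do not contribute any bit.
    """
    tags = 0
    for name in categories:
        tags |= tag_table.get(name, 0)
    return tags


if __name__ == "__main__":
    cats = "gfx view wm am dalvik input sched freq idle".split()
    print(hex(enable_flags(cats)))
```

With the assumed bit values, the category list used throughout this article yields a mask in which only the six user-space tags are set; "sched", "freq" and "idle" are enabled purely through their trace event files.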

3.3 '--async_dump'

When atrace is run with the '--async_dump' option, it dumps the trace contents:

int main(int argc, char **argv)
{
    ...

    /* (6) Dump phase, which is all that "--async_dump" runs */
    if (ok && traceDump) {
        if (!g_traceAborted) {
            printf(" done\n");
            fflush(stdout);

            /* (6.1) By default the dump goes to stdout */
            int outFd = STDOUT_FILENO;

            /* (6.2) With '-o', the dump goes to the specified file instead */
            if (g_outputFile) {
                outFd = open(g_outputFile, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            }
            if (outFd == -1) {
                printf("Failed to open '%s', err=%d", g_outputFile, errno);
            } else {
                dprintf(outFd, "TRACE:\n");

                /* (6.3) Read "/sys/kernel/debug/tracing/trace" and write its contents to the output descriptor */
                dumpTrace(outFd);
                if (g_outputFile) {
                    close(outFd);
                }
            }
        } else {
            printf("\ntrace aborted.\n");
            fflush(stdout);
        }
        clearTrace();
    } else if (!ok) {
        fprintf(stderr, "unable to start tracing\n");
    }

    ...
}
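On the host side, the dumped bytes have to be turned back into text. The listing shows that the payload is preceded by a "TRACE:\n" header, and systrace passes '-z', so the payload is compressed. The Python sketch below assumes the compressed payload is a zlib stream that reaches the host unmodified (for example through the temporary file and 'adb pull' shown earlier); the function name is hypothetical.

```python
import zlib

TRACE_HEADER = b"TRACE:\n"


def extract_trace(raw, compressed=True):
    """Return the trace text contained in raw atrace output (bytes)."""
    start = raw.find(TRACE_HEADER)
    if start < 0:
        raise ValueError("no TRACE: header found in atrace output")
    # Everything after the header is the dump itself.
    payload = raw[start + len(TRACE_HEADER):]
    if compressed:
        # Assumption: '-z' produces a zlib stream.
        payload = zlib.decompress(payload)
    return payload.decode("utf-8", errors="replace")


if __name__ == "__main__":
    # Build a fake dump so the sketch can run without a device.
    body = b"# tracer: nop\n"
    fake_dump = b" done\n" + TRACE_HEADER + zlib.compress(body)
    print(extract_trace(fake_dump), end="")
```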



3.4 '--async_stop'

When atrace is run with the '--async_stop' option, it stops capturing trace data for all categories:

static void stopTrace()
{
    setTracingEnabled(false);
}

static bool setTracingEnabled(bool enable)
{
    /* (5.1) Turn off the master switch: "/sys/kernel/debug/tracing/tracing_on" */
    return setKernelOptionEnable(k_tracingOnPath, enable);
}
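Both startTrace and stopTrace therefore come down to writing a flag into the tracing_on file. A minimal Python sketch of that toggle follows, assuming the control file takes "1" to enable and "0" to disable tracing. The function names mirror the C++ ones but are our own, and the demo writes to a temporary file instead of the real tracefs node so that it can run anywhere.

```python
import os
import tempfile


def set_kernel_option_enable(path, enable):
    """Write "1" or "0" to a control file; return True on success."""
    try:
        # Note: open(..., "w") creates the file if it is missing, which is
        # convenient for the demo but differs from writing to a real
        # tracefs node that must already exist.
        with open(path, "w") as f:
            f.write("1" if enable else "0")
        return True
    except OSError:
        return False


def set_tracing_enabled(enable,
                        tracing_on_path="/sys/kernel/debug/tracing/tracing_on"):
    return set_kernel_option_enable(tracing_on_path, enable)


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "tracing_on")
    print(set_tracing_enabled(True, demo), open(demo).read())
    print(set_tracing_enabled(False, demo), open(demo).read())
```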

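4. Using the Trace class on the target device (Java/C++)

atrace is only a control command. The trace data itself is recorded along two paths:

  • 1. Kernel trace information is recorded into the ftrace buffer through trace events;

  • 2. User space (app / Java framework / native) records trace information with the Trace class, which in practice writes into the kernel ftrace buffer through the "/sys/kernel/debug/tracing/trace_marker" interface.

In the raw output dumped by atrace, these user-space records show up with the "tracing_mark_write" field: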

 ndroid.systemui-1856 ( 1856) [001] ...1 89888.553074: tracing_mark_write: S|1856|animator:alpha|62928891
ndroid.systemui-1856 ( 1856) [001] ...1 89888.553096: tracing_mark_write: F|1856|animator:alpha|62928891
ndroid.systemui-1856 ( 1856) [001] ...1 89888.553110: tracing_mark_write: S|1856|animator:scaleX|37096049
ndroid.systemui-1856 ( 1856) [001] ...1 89888.553131: tracing_mark_write: F|1856|animator:scaleX|37096049

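In this way both kernel and user-space trace data end up in the same ftrace buffer, which also puts their timestamps on a common clock, and everything is finally read out through the "/sys/kernel/debug/tracing/trace" interface.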
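To show how such records can be consumed, here is a small Python sketch that parses "tracing_mark_write" lines like the ones above and pairs each 'S' (start) record with its 'F' (finish) record by name and cookie. The regular expression is written against the sample lines in this article only, so other ftrace output formats may need adjustments; the function names are hypothetical.

```python
import re

LINE_RE = re.compile(
    r"^\s*(?P<task>.+?)-(?P<pid>\d+)\s+\(\s*(?P<tgid>\d+|-+)\)\s+"
    r"\[(?P<cpu>\d+)\]\s+(?P<flags>\S+)\s+(?P<ts>\d+\.\d+):\s+"
    r"tracing_mark_write:\s+(?P<payload>.*)$"
)


def parse_marker(line):
    """Parse one raw trace line; return a dict, or None if it does not match."""
    m = LINE_RE.match(line)
    if m is None:
        return None
    fields = m.group("payload").split("|")
    event = {
        "task": m.group("task"),
        "pid": int(m.group("pid")),
        "cpu": int(m.group("cpu")),
        "timestamp": float(m.group("ts")),
        "phase": fields[0],
    }
    # Async records look like S|pid|name|cookie and F|pid|name|cookie.
    if fields[0] in ("S", "F") and len(fields) >= 4:
        event["name"] = fields[2]
        event["cookie"] = int(fields[3])
    return event


def async_durations(lines):
    """Return (name, seconds) for every S record that has a matching F."""
    open_events, durations = {}, []
    for line in lines:
        ev = parse_marker(line)
        if ev is None or "cookie" not in ev:
            continue
        key = (ev["name"], ev["cookie"])
        if ev["phase"] == "S":
            open_events[key] = ev["timestamp"]
        elif key in open_events:
            durations.append((ev["name"], ev["timestamp"] - open_events.pop(key)))
    return durations


if __name__ == "__main__":
    sample = [
        " ndroid.systemui-1856 ( 1856) [001] ...1 89888.553074: tracing_mark_write: S|1856|animator:alpha|62928891",
        "ndroid.systemui-1856 ( 1856) [001] ...1 89888.553096: tracing_mark_write: F|1856|animator:alpha|62928891",
    ]
    for name, seconds in async_durations(sample):
        print("%s took %.1f us" % (name, seconds * 1e6))
```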

The Trace class is used as follows at each user-space layer:

4.1 App

import android.os.Trace;
Trace.beginSection(String sectionName)
Trace.endSection()

For a detailed example, see: systrace

public class MyAdapter extends RecyclerView.Adapter<MyViewHolder> {
    ...
    @Override
    public MyViewHolder onCreateViewHolder(ViewGroup parent, int viewType) {
        Trace.beginSection("MyAdapter.onCreateViewHolder");
        MyViewHolder myViewHolder;
        try {
            myViewHolder = MyViewHolder.newInstance(parent);
        } finally {
            // In 'try...catch' statements, always call endSection
            // in a 'finally' block to ensure it is invoked even when an exception
            // is thrown.
            Trace.endSection();
        }
        return myViewHolder;
    }

    @Override
    public void onBindViewHolder(MyViewHolder holder, int position) {
        Trace.beginSection("MyAdapter.onBindViewHolder");
        try {
            try {
                Trace.beginSection("MyAdapter.queryDatabase");
                RowItem rowItem = queryDatabase(position);
                mDataset.add(rowItem);
            } finally {
                Trace.endSection();
            }
            holder.bind(mDataset.get(position));
        } finally {
            Trace.endSection();
        }
    }
    ...
}

Then capture a systrace for analysis, either by naming the app with python systrace.py --app=<app package name>, or by selecting the app in DDMS.

4.2 Java framework

import android.os.Trace;
Trace.traceBegin(long traceTag, String methodName)
Trace.traceEnd(long traceTag)

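4.3 Native framework

#define ATRACE_TAG ATRACE_TAG_GRAPHICS  // example tag; must be defined before the include
#include <utils/Trace.h>                // C++ wrapper around <cutils/trace.h>; defines ATRACE_CALL()
ATRACE_CALL();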

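5. Implementation of Trace-Viewer on the host (JS)

As the walk-through of the systrace Python files showed, the trace data is finally merged with 'prefix.html', 'suffix.html' and 'systrace_trace_viewer.html' into a single 'trace.html' file. Opening 'trace.html' in the Chrome browser lets you view and analyze the trace data graphically.

The key component here is Trace-Viewer, and its core is the JavaScript it ships. See the project's GitHub home page.

My JavaScript is too weak for this: I read through a pile of material and still could not get to the bottom of it. Interested readers are welcome to dig deeper. Trace-Viewer is very capable and well suited to analyzing and presenting trace data, and being able to adapt it to your own needs would be very powerful.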
