3.8 KiB
Recorded Call Graph Metrics
also known as call graph tracing.
Execute a python program and for each call being made, record the call and callee. This allows us to compare call graph resolution from static analysis with actual data -- that is, can we statically determine the target of each actual call correctly.
Using the call graph tracer does incur a heavy toll on the performance. Expect 10x longer to execute the program.
Number of calls recorded vary a little from run to run. I have not been able to pinpoint why.
Running against real projects
Currently it's possible to gather metrics from traced runs of the standard test suite of a few projects (defined in projects.json): youtube-dl, wcwidth, and flask.
To run against all projects, use
$ ./helper.sh all $(./helper.sh projects)
To view the results, use
$ head -n 100 projects/*/Metrics.txt
Expanding set of projects
It should be fairly straightforward to expand the set of projects. Most projects use tox for running their tests against multiple python versions. I didn't look into any kind of integration, but have manually picked out the instructions required to get going.
As an example, compare the tox.ini file from flask with the configuration
"flask": {
"repo": "https://github.com/pallets/flask.git",
"sha": "21c3df31de4bc2f838c945bd37d185210d9bab1a",
"module_command": "pytest -c /dev/null tests examples",
"setup": [
"pip install -r requirements/tests.txt",
"pip install -q -e examples/tutorial[test]",
"pip install -q -e examples/javascript[test]"
]
}
Local development
Setup
-
Ensure you have at least Python 3.7
-
Create virtual environment
python3 -m venv venvand activate it -
Install dependencies
pip install -r --upgrade requirements.txt -
Install this codebase as an editable package
pip install -e . -
Setup your editor. If you're using VS Code, create a new project for this folder, and use these settings for correct autoformatting of code on save:
{
"python.pythonPath": "venv/bin/python",
"python.linting.enabled": true,
"python.linting.flake8Enabled": true,
"python.formatting.provider": "black",
"editor.formatOnSave": true,
"[python]": {
"editor.codeActionsOnSave": {
"source.organizeImports": true
}
},
"python.autoComplete.extraPaths": [
"src"
]
}
- Enjoy writing code, and being able to run
cg-traceon your command line 🎉
Using it
After following setup instructions above, you should be able to reproduce the example trace by running
cg-trace --xml example/simple.xml example/simple.py
You can also run traces for all tests and build a database by running tests/create-test-db.sh. Then run the queries inside the ql/ directory.
Tracing Limitations
Multi-threading
Should be possible by using threading.setprofile, but that hasn't been done yet.
Code that uses sys.setprofile
Since that is our mechanism for recording calls, any code that uses sys.setprofile will not work together with the call-graph tracer.
Class instantiation
Does not always fire off an event in the sys.setprofile function (neither in sys.settrace), so is not recorded. Example:
r = range(10)
when disassembled (python -m dis <file>):
9 48 LOAD_NAME 7 (range)
50 LOAD_CONST 5 (10)
52 CALL_FUNCTION 1
54 STORE_NAME 8 (r)
but no event 😞