codeql

mirror of https://github.com/github/codeql.git synced 2026-04-08 00:24:03 +02:00

Files

Rasmus Lerchedahl Petersen 11c71fdd18 Python: remove EssaNodes

This commit removes SSA nodes from the data flow graph. Specifically, for a definition and use such as
```python
  x = expr
  y = x + 2
```
we used to have flow from `expr` to an SSA variable representing `x` and from that SSA variable to the use of `x` in the definition of `y`. Now we instead have flow from `expr` to the control flow node for `x` at line 1 and from there to the control flow node for `x` at line 2.
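
As a rough illustration of the change described above, the two graph shapes can be modeled as plain adjacency lists. This is a toy sketch in Python, not CodeQL code; the node labels and the `reachable` helper are made up for illustration and do not correspond to any real API.

```python
# Toy model of the data flow edges for:
#   x = expr
#   y = x + 2
# Node names are illustrative labels only, not CodeQL classes.

# Before: flow goes through an SSA variable for x.
old_edges = {
    "expr": ["ssa(x)"],
    "ssa(x)": ["use x @ line 2"],
}

# After: flow goes directly between control flow nodes for x.
new_edges = {
    "expr": ["cfg x @ line 1"],
    "cfg x @ line 1": ["cfg x @ line 2"],
}


def reachable(edges, start):
    """Return the nodes reachable from `start`, in discovery order."""
    seen = []
    stack = [start]
    while stack:
        node = stack.pop()
        for succ in edges.get(node, []):
            if succ not in seen:
                seen.append(succ)
                stack.append(succ)
    return seen


old_path = reachable(old_edges, "expr")
new_path = reachable(new_edges, "expr")
```

In both shapes the use of `x` on the second line is still reachable from `expr`; only the intermediate node changes.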

Specific changes:
- `EssaNode` from the data flow layer no longer exists.
- Several glue steps between `EssaNode`s and `CfgNode`s have been deleted.
- Entry nodes are now admitted as `CfgNode`s in the data flow layer (they were filtered out before).
- Entry nodes now have a new `toString` taking into account that the module name may be ambiguous.
- Some tests have been rewritten to accommodate the changes, but only `python/ql/test/experimental/dataflow/basic/maximalFlowsConfig.qll` should have semantic changes.
- Comments have been updated.
- Test output has been updated, but apart from `python/ql/test/experimental/dataflow/basic/maximalFlows.expected` only `python/ql/test/experimental/dataflow/typetracking-summaries/summaries.py` should have a semantic change. This is a bonus fix, probably meaning that something was never connected up correctly.

2023-11-20 21:35:32 +01:00

| File | Last commit | Date |
| --- | --- | --- |
| NaiveModel.expected | Python: remove EssaNodes | 2023-11-20 21:35:32 +01:00 |
| NaiveModel.ql | Python: Move framework tests out of experimental | 2021-03-19 15:51:54 +01:00 |
| ProperModel.expected | Python: remove EssaNodes | 2023-11-20 21:35:32 +01:00 |
| ProperModel.ql | Python: Move framework tests out of experimental | 2021-03-19 15:51:54 +01:00 |
| README.md | Python: Move framework tests out of experimental | 2021-03-19 15:51:54 +01:00 |
| SharedCode.qll | Python: Add TypeTrackingNode | 2021-07-02 18:00:33 +00:00 |
| test.py | spelling: across | 2022-10-11 00:23:35 -04:00 |

README.md

This test illustrates that you need to be very careful when adding additional taint-steps or dataflow steps using TypeTracker.

The basic setup is that we're modeling the behavior of a (fictitious) external library class MyClass, and a (fictitious) source of such an instance (the source function).

```python
class MyClass:
    def __init__(self, value):
        self.value = value

    def get_value(self):
        return self.value
```

We want to extend our analysis so that obj.get_value() is also tainted if obj is a tainted instance of MyClass.
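
The kind of code the analysis should flag might look roughly like the sketch below. The bodies of `source` and `sink` are assumptions made for illustration (the README only says that a source function exists), and the actual `test.py` may be structured differently.

```python
class MyClass:
    def __init__(self, value):
        self.value = value

    def get_value(self):
        return self.value


def source():
    # Hypothetical stand-in for the fictitious library source:
    # it hands back an instance that the analysis treats as tainted.
    return MyClass("tainted data")


def sink(arg):
    # Hypothetical sink; a taint query would report a flow ending here.
    return arg


obj = source()                   # obj is a tainted instance of MyClass
result = sink(obj.get_value())   # so obj.get_value() should be tainted too
```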

The actual type-tracking is done in SharedCode.qll, but it's the way we use it that matters.

In NaiveModel.ql we add an additional taint step from an instance of MyClass to calls of the bound method get_value (that we have tracked). It provides us with the correct results, but the path explanations are not very useful, since we are now able to cross functions in one step.

In ProperModel.ql we split the additional taint step in two:

- From a tracked obj that is an instance of MyClass, to obj.get_value, but only exactly where the attribute is accessed (by an AttrNode). This is important, since if we allowed <any tracked qualifier>.get_value we would again be able to cross functions in one step.
- From the tracked get_value bound method to calls of it, but only exactly where the call is (by a CallNode), for the same reason as above.
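
To make "crossing functions in one step" concrete, consider a program where the attribute access and the call happen in different functions. This is an illustrative sketch only: `source`, `access_attribute`, and `call_method` are hypothetical names, and the comments describe how the two models would behave as I read the description above, not verified query output.

```python
class MyClass:
    def __init__(self, value):
        self.value = value

    def get_value(self):
        return self.value


def source():
    # Hypothetical source of a tainted MyClass instance.
    return MyClass("tainted data")


def access_attribute(obj):
    # Split model, step 1: obj -> obj.get_value, exactly at this attribute access.
    return obj.get_value


def call_method(bound_method):
    # Split model, step 2: bound_method -> bound_method(), exactly at this call.
    return bound_method()


obj = source()
bound = access_attribute(obj)
# A naive single step would connect the tainted instance straight to the call
# inside call_method, skipping over access_attribute in the path explanation.
value = call_method(bound)
```

With the split steps, each additional taint step stays local to one expression, and ordinary data flow (arguments and return values) carries the taint between the functions, so every hop can appear in the path explanation.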

Try running the queries in VS Code to see the difference.

Possible improvements

Using AttrNode directly in the code here means there is no easy way to add getattr support to all such predicates. Not really sure how to handle this in a generalized way though :|
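
For reference, the two spellings below behave identically at runtime, but only the first contains a syntactic attribute access. A predicate that matches attribute access expressions alone would presumably miss the second form; that consequence is my reading of the limitation described above, not something tested against the queries.

```python
class MyClass:
    def __init__(self, value):
        self.value = value

    def get_value(self):
        return self.value


obj = MyClass("tainted data")

# Plain attribute access followed by a call.
direct = obj.get_value()

# Same lookup via the getattr built-in: a call to getattr with a string
# argument, with no obj.get_value attribute expression in the source.
via_getattr = getattr(obj, "get_value")()
```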