Commit Graph

2905 Commits

Author SHA1 Message Date
Taus
b2cc91369a Python: Fix bad join in firstUse
This was what it looked like (at the point when I killed the evaluation):

```
Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#c5fa2be7#ff/2@i1#be98bwif after 1m50s:
274000     ~7%     {4} r1 = SCAN SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#c5fa2be7#ffff OUTPUT In.1, In.0 'def', In.2, In.3
2731768000 ~1%     {7} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::variableSourceUse#c5fa2be7#ffff ON FIRST 1 OUTPUT Rhs.0, Lhs.2, Lhs.3, Rhs.2, Rhs.3, Rhs.1 'use', Lhs.1 'def'
178000     ~4%     {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#c5fa2be7#fffff ON FIRST 5 OUTPUT Lhs.6 'def', Lhs.5 'use'
                    return r3
```

And this is what it looks like now:

```
Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#c5fa2be7#ff/2@i1#f9d6ewsi after 207ms:
931353  ~2%     {4} r1 = SCAN SsaCompute::SsaComputeImpl::AdjacentUsesImpl::variableSourceUse#c5fa2be7#ffff OUTPUT In.0, In.2, In.3, In.1 'use'
1050477 ~0%     {4} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#c5fa2be7#fffff_03412#join_rhs ON FIRST 3 OUTPUT Lhs.0, Rhs.3, Rhs.4, Lhs.3 'use'
506626  ~0%     {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#c5fa2be7#ffff_1230#join_rhs ON FIRST 3 OUTPUT Rhs.3 'def', Lhs.3 'use'
                return r3
```
2022-04-25 14:33:31 +00:00
Taus
49233268a9 Python: Fix bad join in getValue
We were building essentially a CP of all control flow nodes:

```
Tuple counts for Essa::AssignmentDefinition::getValue#dispred#f0820431#ff/2@dd1f67vl after 2m45s:
733365     ~6%     {3} r1 = JOIN Essa::TEssaNodeDefinition#24e22a14#ffff_30#join_rhs WITH Essa::EssaNodeDefinition::getDefiningNode#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Rhs.0
376588     ~0%     {2} r2 = JOIN r1 WITH SsaDefinitions::SsaSource::assignment_definition#9197156e#fff ON FIRST 2 OUTPUT Lhs.2 'this', Rhs.2 'result'
376588     ~0%     {3} r3 = JOIN r2 WITH Essa::TEssaNodeDefinition#24e22a14#ffff_30#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.0 'this', Lhs.1 'result'
6965593033 ~2%     {3} r4 = JOIN r3 WITH project#SsaDefinitions::SsaSource::assignment_definition#9197156e ON FIRST 1 OUTPUT Lhs.1 'this', Rhs.1, Lhs.2 'result'
376588     ~0%     {2} r5 = JOIN r4 WITH Essa::EssaNodeDefinition::getDefiningNode#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.0 'this', Lhs.2 'result'
                    return r5
```

We first tried preventing the join on `result`, but this caused the
characteristic predicate to blow up instead. Finally, we figured just
putting the `value` part in a field would be sufficient, and this did
the trick.
2022-04-25 14:28:00 +00:00
Anders Schack-Mulligen
c06efa1f42 Dataflow: Sync. 2022-04-25 13:11:04 +02:00
Tom Hvitved
bffa8fa7cb Merge pull request #8641 from hvitved/dataflow/interpret-read-store
Data flow: Introduce `ContentSet`
2022-04-25 12:17:34 +02:00
Tom Hvitved
2466288656 Data flow: Simplify revFlowStore 2022-04-25 10:11:54 +02:00
Tom Hvitved
cf0a1e748a Add change notes 2022-04-25 09:17:40 +02:00
Erik Krogh Kristensen
45080e7777 PY: add missing qldoc 2022-04-22 15:30:31 +02:00
Tom Hvitved
bc6ee10583 Data flow: Sync files 2022-04-22 15:10:00 +02:00
Tom Hvitved
b033f107df Merge remote-tracking branch 'upstream/main' into dataflow/interpret-read-store 2022-04-22 14:35:02 +02:00
Erik Krogh Kristensen
ff73dbc35c delete redundant imports 2022-04-22 12:55:28 +02:00
Erik Krogh Kristensen
a96489b23d delete duplicate imports 2022-04-22 12:41:30 +02:00
Rasmus Wriedt Larsen
650d57083b Python: Recognize path arguments to pathlib methods 2022-04-22 11:01:59 +02:00
Rasmus Wriedt Larsen
059dea713d Python: Fix os.path.samefile modeling 2022-04-22 11:01:59 +02:00
github-actions[bot]
1aecfc67c2 Post-release preparation for codeql-cli-2.9.0 2022-04-21 19:22:19 +00:00
Dave Bartolomeo
b224f81e24 Fix formatting in change log 2022-04-21 10:59:38 -04:00
Dave Bartolomeo
fb710cd944 Fix formatting in change log 2022-04-21 10:59:03 -04:00
github-actions[bot]
eeaf233c29 Release preparation for version 2.9.0 2022-04-21 14:49:00 +00:00
Erik Krogh Kristensen
aec8413487 PY: mention newtype constructors in API graph label classes 2022-04-20 18:38:44 +02:00
Anders Schack-Mulligen
677c436e99 Merge pull request #8703 from aschackmull/dataflow/revert-state-in-out-barriers
Dataflow: Revert support for flow-state based in-/out-barriers
2022-04-20 14:54:02 +02:00
Rasmus Wriedt Larsen
bb6969a175 Merge branch 'main' into promote-xxe 2022-04-20 13:42:02 +02:00
Rasmus Wriedt Larsen
888a38c060 Python: Add change-note 2022-04-20 11:46:09 +02:00
Rasmus Wriedt Larsen
d70f247001 Python: More private import python 2022-04-20 11:42:13 +02:00
Rasmus Wriedt Larsen
084c8eb22e Python: Don't re-export python under DataFlow:: 2022-04-20 11:42:10 +02:00
yoff
0c7130602a Merge pull request #8731 from RasmusWL/delete-old-readme
Python: Delete old dataflow readme
2022-04-20 10:36:12 +02:00
yoff
a66153d73e Merge pull request #8733 from RasmusWL/split-dataflow-private
Python: Split `DataFlowPrivate`
2022-04-20 10:21:05 +02:00
Anders Schack-Mulligen
48fbbf2531 Dataflow: Add change notes. 2022-04-19 15:29:35 +02:00
Anders Schack-Mulligen
b521d64156 Dataflow: Sync. 2022-04-19 15:29:35 +02:00
Mathias Vorreiter Pedersen
91b413d59f Dataflow: Sync identical files. 2022-04-19 09:57:21 +01:00
Rasmus Wriedt Larsen
a271e17f04 Python: Move dataflow call-graph to new qll file
Seems like all other languages use a file called `DataFlowDispatch`. I
want to introduce a setup where we have (old) points-to based approach
in one file, and can develop a type-tracking based approach in another
file, so that's the reason for the naming differing slightly.

For which predicates go in which files, I have taken mostly inspiration
from C# and Ruby.
2022-04-13 15:56:57 +02:00
Rasmus Wriedt Larsen
3d15205084 Python: Autoformat 2022-04-13 15:36:16 +02:00
Rasmus Wriedt Larsen
ded4e9250c Python: Move IterableUnpacking to own file 2022-04-13 15:36:05 +02:00
Rasmus Wriedt Larsen
c740894408 Python: Move MatchUnpacking to own file
I had hoped that git would be able to see this as a rename, and
therefore I haven't done autoformat
2022-04-13 15:36:05 +02:00
Rasmus Wriedt Larsen
2e60172bfa Python: Delete old dataflow readme 2022-04-13 12:09:38 +02:00
Rasmus Wriedt Larsen
bdadf2b445 Python: Fix warnings 2022-04-13 10:30:59 +02:00
Rasmus Wriedt Larsen
4927f0018b Merge branch 'main' into django-filefield-uploadto 2022-04-13 10:22:28 +02:00
Edoardo Pirovano
f25618eed6 Bump minor version of all packs 2022-04-08 15:38:58 +01:00
Edoardo Pirovano
ce82c54b94 Merge branch 'main' into edoardo/3.5-mergeback 2022-04-08 15:30:58 +01:00
Rasmus Wriedt Larsen
8191be9d75 Python: Move last XXE/XML bomb out of experimental 2022-04-07 15:37:56 +02:00
Anders Schack-Mulligen
4eaec3953a Merge pull request #8694 from aschackmull/dataflow/cleanup-unused
Dataflow: Cleanup unused column
2022-04-07 15:16:27 +02:00
Anders Schack-Mulligen
7beed570f2 Dataflow: Sync. 2022-04-07 13:53:48 +02:00
Erik Krogh Kristensen
7e4c76c63b revert API-graph change in Flask.qll 2022-04-07 13:52:14 +02:00
Erik Krogh Kristensen
bdfd6bdc79 fix a ql/field-only-used-in-charpred warning 2022-04-07 13:52:14 +02:00
Erik Krogh Kristensen
50bfc8eaa0 refactor uses of API::Node::getAUse() that should have been something else 2022-04-07 13:52:13 +02:00
Erik Krogh Kristensen
4e5afab082 refactor more python type-trackers to API-graphs 2022-04-07 13:51:40 +02:00
Rasmus Wriedt Larsen
7728b6cf1b Python: Change XmlBomb vulnerability kind 2022-04-07 10:56:35 +02:00
Rasmus Wriedt Larsen
f2f0873d91 Python: Use new API::CallNode for XML constant check
This also means that the detection of the values passed to these keyword
arguments will no longer just be from a local scope, but can also be
across function boundaries.
2022-04-06 15:49:06 +02:00
Rasmus Wriedt Larsen
c784f15762 Python: Rename more XML classes to follow convention
- `XMLEtree` to `XmlEtree`
- `XMLSax` to `XmlSax`
- `LXML` to `Lxml`
- `XMLParser` to `XmlParser`
2022-04-06 15:44:54 +02:00
Rasmus Wriedt Larsen
f8f41428df Python: Minor refactor for FlaskViewClass 2022-04-06 15:15:42 +02:00
Rasmus Wriedt Larsen
1c2323eb85 Python: Refactor how we find a Class from API::Node
Using `getAnImmediateUse` might give better performance than `getAUse`.

Since all the changed code is about `API::Node`s that are found after
doing `.getASubclass*()`, this change is OK.

It's also nice to align how we actually do this.
2022-04-06 15:12:24 +02:00
Tom Hvitved
4099d1318f Data flow: Tweak two join-orders
Before
```
[2022-04-06 13:19:29] (96s) Tuple counts for DataFlowImpl2::Stage1::revFlowConsCand#7ad53399#ff/2@i14#aa10f2wi after 4.4s:
                      10681    ~0%     {2} r1 = SCAN DataFlowImpl2::Stage1::revFlow#7ad53399#fff#prev_delta OUTPUT In.0, In.2 'config'
                      982      ~1%     {3} r2 = JOIN r1 WITH DataFlowImpl2::readSet#7ad53399#ffff_2301#join_rhs ON FIRST 2 OUTPUT Rhs.3, Lhs.1 'config', Rhs.2
                      83691528 ~2%     {3} r3 = JOIN r2 WITH DataFlowPublic::ContentSet::getAReadContent#dispred#f0820431#ff ON FIRST 1 OUTPUT Lhs.1 'config', Lhs.2, Rhs.1 'c'
                      83581763 ~2%     {3} r4 = r3 AND NOT DataFlowImpl2::Stage1::revFlowConsCand#7ad53399#ff#prev(Lhs.2 'c', Lhs.0 'config')
                      83581763 ~0%     {3} r5 = SCAN r4 OUTPUT In.2 'c', In.0 'config', In.1
                      0        ~0%     {3} r6 = JOIN r5 WITH DataFlowImpl2::Stage1::fwdFlowConsCand#7ad53399#ff ON FIRST 2 OUTPUT Lhs.2, Lhs.1 'config', Lhs.0 'c'
                      0        ~0%     {2} r7 = JOIN r6 WITH DataFlowImpl2::Stage1::fwdFlow#7ad53399#2#fff_02#join_rhs ON FIRST 2 OUTPUT Lhs.2 'c', Lhs.1 'config'
                                       return r7
```

After
```
[2022-04-06 13:44:38] (6s) Tuple counts for DataFlowImpl2::Stage1::revFlowConsCand#7ad53399#ff/2@i14#5abbf2wn after 6ms:
                      10681  ~0%     {2} r1 = SCAN DataFlowImpl2::Stage1::revFlow#7ad53399#fff#prev_delta OUTPUT In.0, In.2 'config'
                      982    ~1%     {3} r2 = JOIN r1 WITH DataFlowImpl2::readSet#7ad53399#ffff_2301#join_rhs ON FIRST 2 OUTPUT Rhs.3, Lhs.1 'config', Rhs.2
                      109765 ~0%     {3} r3 = JOIN r2 WITH DataFlowImpl2::Stage1::fwdFlowConsCandSet#7ad53399#fff#reorder_0_2_1 ON FIRST 2 OUTPUT Lhs.1 'config', Lhs.2, Rhs.2 'c'
                      0      ~0%     {3} r4 = r3 AND NOT DataFlowImpl2::Stage1::revFlowConsCand#7ad53399#ff#prev(Lhs.2 'c', Lhs.0 'config')
                      0      ~0%     {3} r5 = SCAN r4 OUTPUT In.1, In.0 'config', In.2 'c'
                      0      ~0%     {2} r6 = JOIN r5 WITH DataFlowImpl2::Stage1::fwdFlow#7ad53399#2#fff_02#join_rhs ON FIRST 2 OUTPUT Lhs.2 'c', Lhs.1 'config'
                                     return r6
```
2022-04-06 13:52:30 +02:00