Taus
a73e382dfe
Python: Prevent bad join in hashlib model
...
I'm not entirely sure what triggered this bad join order, but some
combination of the use of abstract classes and the exclusion of `new`
caused this to go really wrong:
```
WeakSensitiveDataHashing.ql-15:Stdlib::Stdlib::HashlibDataPassedToHashClass#class#ffff ......... 15.5s
```
with the following tuple counts:
```
[2021-07-12 13:20:15] (16s) Tuple counts for Stdlib::Stdlib::HashlibDataPassedToHashClass#class#ffff/4@217901:
148810 ~3% {3} r1 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArg_dispred#fff ON FIRST 1 OUTPUT "hashlib", Lhs.1 'node', Lhs.0 'this'
148810 ~4% {3} r2 = JOIN r1 WITH ApiGraphs::API::Impl::MkModuleImport#ff@staged_ext ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'node', Lhs.2 'this'
7589310 ~486% {4} r3 = JOIN r2 WITH ApiGraphs::API::Impl::edge#2#fff@staged_ext ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.2 'this', Rhs.1, InverseAppend("getMember(\"","\")",Rhs.1)
6994070 ~490% {4} r4 = SELECT r3 ON In.3 != "new"
6994070 ~4503% {2} r5 = SCAN r4 OUTPUT In.1 'this', In.0 'node'
22 ~4% {3} r6 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArgByName_dispred#fff ON FIRST 1 OUTPUT "hashlib", Lhs.1 'node', Lhs.0 'this'
22 ~0% {3} r7 = JOIN r6 WITH ApiGraphs::API::Impl::MkModuleImport#ff@staged_ext ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'node', Lhs.2 'this'
1122 ~437% {4} r8 = JOIN r7 WITH ApiGraphs::API::Impl::edge#2#fff@staged_ext ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.2 'this', Rhs.1, InverseAppend("getMember(\"","\")",Rhs.1)
1034 ~460% {4} r9 = SELECT r8 ON In.3 != "new"
1034 ~4549% {2} r10 = SCAN r9 OUTPUT In.1 'this', In.0 'node'
6995104 ~4503% {2} r11 = r5 UNION r10
5213851 ~4683% {3} r12 = JOIN r11 WITH ApiGraphs::API::Node::getACall_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'hashClass', Lhs.1 'node', Lhs.0 'this'
6478480 ~4646% {6} r13 = JOIN r12 WITH ApiGraphs::API::Impl::edge#2#fff_201#join_rhs ON FIRST 1 OUTPUT "hashlib", Rhs.1, Lhs.1 'node', Lhs.2 'this', Lhs.0 'hashClass', Rhs.2
1410 ~4693% {5} r14 = JOIN r13 WITH ApiGraphs::API::Impl::MkModuleImport#ff@staged_ext ON FIRST 2 OUTPUT Lhs.2 'node', Lhs.3 'this', Lhs.4 'hashClass', Lhs.5, InverseAppend("getMember(\"","\")",Lhs.5)
1222 ~4540% {5} r15 = SELECT r14 ON In.4 'hashName' != "new"
1222 ~4540% {4} r16 = SCAN r15 OUTPUT In.1 'this', In.4 'hashName', In.2 'hashClass', In.0 'node'
```
By factoring out the insides, the biggest iteration now looks like
```
[2021-07-12 14:17:36] (0s) Tuple counts for Stdlib::Stdlib::HashlibDataPassedToHashClass#class#ffff/4@85bb21:
148810 ~0% {2} r1 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArg_dispred#fff ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.0 'this'
148810 ~0% {2} r2 = JOIN r1 WITH Stdlib::Stdlib::hashlibMember#ff#nonempty CARTESIAN PRODUCT OUTPUT Lhs.1 'this', Lhs.0 'node'
22 ~0% {2} r3 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArgByName_dispred#fff ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.0 'this'
22 ~0% {2} r4 = JOIN r3 WITH Stdlib::Stdlib::hashlibMember#ff#nonempty CARTESIAN PRODUCT OUTPUT Lhs.1 'this', Lhs.0 'node'
148832 ~0% {2} r5 = r2 UNION r4
110933 ~2% {3} r6 = JOIN r5 WITH ApiGraphs::API::Node::getACall_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'hashClass', Lhs.1 'node', Lhs.0 'this'
26 ~0% {4} r7 = JOIN r6 WITH Stdlib::Stdlib::hashlibMember#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.2 'this', Rhs.1 'hashName', Lhs.0 'hashClass', Lhs.1 'node'
return r7
```
(The tuple counts themselves are not directly comparable.)
2021-07-12 14:22:21 +00:00
Rasmus Wriedt Larsen
47f5c977cf
Python: Port py/stack-trace-exposure to use proper source/sink customization
2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen
934007c811
Python: Port py/unsafe-deserialization to use proper source/sink customization
2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen
7c71223f7f
Python: Port py/url-redirection to use proper source/sink customization
2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen
b4c0b1b525
Python: Port py/reflective-xss to use proper source/sink customization
2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen
62e4445f45
Python: Port py/command-line-injection to use proper source/sink customization
2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen
7f53781ba7
Python: Port py/code-injection to use proper source/sink customization
2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen
0be280c608
Python: Port py/sql-injection to use proper source/sink customization
2021-07-12 16:22:10 +02:00
Erik Krogh Kristensen
c4f5009917
make explicit calls to member predicates
...
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com >
2021-07-12 14:22:08 +02:00
Taus
1e79091120
Python: Fix typo
2021-07-12 11:33:52 +00:00
Taus
32062d83ad
Python: Make deprecation warning more prominent
2021-07-12 10:00:21 +00:00
Taus
200da983d9
Python: Add change note
2021-07-12 09:59:17 +00:00
Erik Krogh Kristensen
440e4b9a92
enable unicode support in the Python ReDoS query
2021-07-11 21:28:40 +02:00
Taus
a65d40e36f
Merge branch 'main' into python-add-typetrackingnode
2021-07-02 20:55:37 +02:00
Taus
55d822cc56
Python: Add TypeTrackingNode
...
Splits `ModuleVariableNode` away from `LocalSourceNode`, instead
creating a class `TypeTrackingNode` that encapsulates both of these.
This means we no longer have module variable nodes as part of
`LocalSourceNode` (which is good, since they have no "local" aspect to
them), and hence we can have `LocalSourceNode` inherit directly from
`ExprNode` (which makes the API a bit nicer).
Unfortunately these are breaking changes, so we can't actually fulfil
the above two desiderata until the `track` and `backtrack` methods on
`LocalSourceNode` have been fully deprecated. For this reason, we
preserve the present implementation of `LocalSourceNode`, and instead
lay the foundation for switching over in the future, by deprecating
`track` and `backtrack` on `LocalSourceNode`.
2021-07-02 18:00:33 +00:00
CodeQL CI
1d56748eed
Merge pull request #6200 from yoff/pythonJS-make-expbtlib-private
...
Approved by RasmusWL, esbena
2021-07-02 09:09:18 -07:00
CodeQL CI
a25933aa56
Merge pull request #5926 from RasmusWL/small-cleanups
...
Approved by tausbn
2021-07-02 04:59:54 -07:00
Rasmus Wriedt Larsen
81fab487a4
Python: Apply suggestions from code review
...
Co-authored-by: Taus <tausbn@github.com >
2021-07-02 13:27:41 +02:00
Rasmus Wriedt Larsen
22c155687e
Python: Fix code after removing getPostUpdateNode
2021-07-02 13:25:25 +02:00
Rasmus Wriedt Larsen
7a6eee50ff
Revert "Python: Add getPostUpdateNode to DataFlow::Node"
...
This reverts commit 9137f04bd3 .
2021-07-02 13:23:02 +02:00
Rasmus Wriedt Larsen
e56dfe75bd
Python: AttrRef getOjbect/1 -> accesses/2
...
See this thread for discussion:
https://github.com/github/codeql/pull/5926#discussion_r635384981
2021-07-02 13:21:12 +02:00
Rasmus Lerchedahl Petersen
6f2642607e
Python: make the import of RedosUtil public
...
This mirrors `SuperlinearBacktracking.qll`
An alternative is to keep it private and import it again
in the query files.
2021-07-02 12:32:04 +02:00
Rasmus Lerchedahl Petersen
77c329fb0f
Python/JS: Make much more private
2021-07-02 12:13:52 +02:00
Rasmus Lerchedahl Petersen
1fc9638486
Python: port redos .qhelp from js
2021-07-02 11:36:46 +02:00
Taus
a9c1d3ba86
Python: Clean up LocalSourceNode charpred
...
This results in the same set of nodes, but is a bit more clear about
the reasons why. For instance, `ModuleVariableNode`s are included
directly, and not in a roundabout way by virtue of not having flow to
them. This should hopefully be a bit more robust as well.
2021-07-01 19:12:18 +00:00
Taus
f151338def
Merge pull request #6198 from RasmusWL/fix-cleartext-logging
...
Python: Some minor fixes to `py/clear-text-logging-sensitive-data`
2021-07-01 18:28:25 +02:00
Taus
336c0662ef
Python: Remove pointless LocalSourceNodes
...
This gets rid of a large number of nodes that seemingly have no impact.
2021-07-01 15:02:31 +00:00
Rasmus Lerchedahl Petersen
eee56e0156
Python/JS: Make most of the new library private
2021-07-01 15:34:06 +02:00
Anders Schack-Mulligen
37f8794d01
Merge pull request #6165 from edoardopirovano/fix-regression
...
Performance: Improve join order in data flow library
2021-07-01 14:13:18 +02:00
Rasmus Wriedt Larsen
b0309dd321
Python: Limit SensitiveDataSources to prevent _some_ cross-talk
2021-07-01 12:08:12 +02:00
Rasmus Wriedt Larsen
f64e58a21c
Python: Fix a QLDoc for SensitiveDataSources
2021-07-01 12:05:59 +02:00
Rasmus Wriedt Larsen
d7e3ebb15c
Python: Add tests showing sensitive data cross-talk
2021-07-01 12:05:51 +02:00
Rasmus Wriedt Larsen
d9e2f504f8
Python: Fix clear text logging sink
...
No need to restrict it to arguments that are calls
2021-06-30 20:31:17 +02:00
Taus
e4af14638b
Merge pull request #6175 from yoff/python-port-ReDoS
...
Python: port ReDoS queries from Javascript
2021-06-30 16:26:07 +02:00
yoff
6a77b890af
Merge pull request #6155 from RasmusWL/port-cleartext-queries
...
Python: Port cleartext queries
2021-06-30 15:52:34 +02:00
Rasmus Lerchedahl Petersen
a176e6ac30
Python: comment out temporarily unused predicate
2021-06-30 15:28:31 +02:00
Rasmus Lerchedahl Petersen
45e30b0c06
Python: comment out temporarily unused predicate
2021-06-30 15:04:37 +02:00
Rasmus Lerchedahl Petersen
c306cee04e
Python: mimic JS file hierarchy
2021-06-30 15:03:22 +02:00
Rasmus Lerchedahl Petersen
651f8abba0
Python: Avoid multiple results for toString
2021-06-30 14:39:49 +02:00
Rasmus Wriedt Larsen
c2708176b1
Python: Support %-style formatting for MarkupSafe
2021-06-30 14:15:41 +02:00
Rasmus Wriedt Larsen
0a4efd0e86
Python: Add %-style formatting tests for MarkupSafe
2021-06-30 14:13:59 +02:00
Rasmus Wriedt Larsen
c84658dff1
Python: Use MethodCallNode for MarkupSafe string-format
2021-06-30 13:58:09 +02:00
Rasmus Wriedt Larsen
d6e8fafdbd
Python: Proper sorting in Frameworks.qll
2021-06-30 13:55:26 +02:00
Rasmus Wriedt Larsen
075953860b
Merge branch 'main' into markupsafe-modeling
2021-06-30 13:55:08 +02:00
Rasmus Lerchedahl Petersen
72986e1e28
Python: Add some comments on the booelan sweep
...
pattern
2021-06-30 12:50:36 +02:00
Rasmus Lerchedahl Petersen
52d91917aa
Merge branch 'python-port-ReDoS' of github.com:yoff/codeql into python-port-ReDoS
2021-06-30 12:25:59 +02:00
Rasmus Lerchedahl Petersen
09e71cfdfd
Python: update test expectations
2021-06-30 12:25:29 +02:00
Rasmus Lerchedahl Petersen
6dfbf80494
Python: Disable use of toUnicode
...
until supporting CLI is released
2021-06-30 12:21:52 +02:00
Rasmus Wriedt Larsen
e5d65992b4
Python: Use DefinitionNode instead of Assign
...
Based on https://github.com/github/codeql/pull/6155#discussion_r660964666 :
> Hmm... Would it be better to do this using DefinitionNode instead of
> Assign? The latter is fairly limited in what it can represent, and also
> raises questions of whether this definition is sound with regard to
> control-flow splitting.
2021-06-30 12:08:32 +02:00
yoff
c19522e921
Apply suggestions from code review
...
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com >
2021-06-30 11:49:45 +02:00