Commit Graph

278 Commits

Author SHA1 Message Date
yoff
f10f053c36 Merge pull request #7228 from RasmusWL/fastapi-improvements
Python: FastAPI improvements
2021-12-02 12:58:53 +01:00
Anders Schack-Mulligen
3e914ef2ff Dataflow: Sync. 2021-11-30 13:52:52 +01:00
Anders Schack-Mulligen
00ee34c0a0 Merge pull request #7237 from hvitved/dataflow/consistency-config
Data flow: Introduce `ConsistencyConfiguration` class
2021-11-26 12:49:25 +01:00
Anders Schack-Mulligen
a06642944f Merge pull request #7232 from aschackmull/dataflow/perf
Data flow: Performance tuning
2021-11-25 15:01:01 +01:00
Tom Hvitved
6cb00992e8 Data flow: Introduce ConsistencyConfiguration class 2021-11-25 10:01:47 +01:00
Erik Krogh Kristensen
3bab8c6d1d Merge pull request #7173 from erik-krogh/getRubyInSync
JS/PY/RB: get ReDoSUtil in sync for ruby
2021-11-24 15:20:23 +01:00
Anders Schack-Mulligen
7ca3407c86 Dataflow: Sync. 2021-11-24 14:43:00 +01:00
Rasmus Wriedt Larsen
7dde52ced2 Merge pull request #7131 from RasmusWL/wsgiref.simple_server
Python: Model `wsgiref.simple_server` applications
2021-11-24 14:22:23 +01:00
Rasmus Wriedt Larsen
e2652591a5 Python: Change perf fix PoorMansFunctionResolution
Thanks @yoff, this leaves us with the following evaluation, which looks
very close to the one in the other fix (but with cleaner implementation)
-- both at 688k max tuples (although numbers are not exactly the same).

```
[2021-11-24 13:48:40] (14s) Tuple counts for PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass#ff/2@e5f05asv after 74ms:
                      47493  ~3%     {3} r1 = JOIN Class::Class::getAMethod_dispred#ff WITH py_Classes ON FIRST 1 OUTPUT Lhs.1, 0, Lhs.0
                      47335  ~0%     {2} r2 = JOIN r1 WITH AstGenerated::Function_::getArg_dispred#fff ON FIRST 2 OUTPUT Rhs.2, Lhs.2
                      46683  ~0%     {2} r3 = JOIN r2 WITH DataFlowPublic::ParameterNode::getParameter_dispred#fb_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1
                      259968 ~4%     {2} r4 = JOIN r3 WITH LocalSources::Cached::hasLocalSource#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1
                      161985 ~0%     {3} r5 = JOIN r4 WITH Attributes::AttrRef::accesses_dispred#bff_102#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result', Lhs.1, Rhs.2
                      161985 ~2%     {3} r6 = JOIN r5 WITH Attributes::AttrRead#class#f ON FIRST 1 OUTPUT Lhs.2, Lhs.1, Lhs.0 'result'
                      688766 ~0%     {3} r7 = JOIN r6 WITH Function::Function::getName_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1, Rhs.1 'func', Lhs.2 'result'
                      20928  ~0%     {2} r8 = JOIN r7 WITH Class::Class::getAMethod_dispred#ff ON FIRST 2 OUTPUT Lhs.1 'func', Lhs.2 'result'
                                     return r8
```
2021-11-24 13:52:05 +01:00
Rasmus Wriedt Larsen
1411804e58 Python: Allow custom fastapi.APIRouter subclasses 2021-11-24 13:46:38 +01:00
Rasmus Wriedt Larsen
47448d9efc Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-11-24 12:02:12 +01:00
Rasmus Wriedt Larsen
d493cfdf3a Python: Model FastAPI FileResponse as FileSystemAccess
This was an oversight from our initial FastAPI modeling work.
2021-11-24 11:44:51 +01:00
yoff
f9729bccef Merge pull request #7143 from RasmusWL/path-improvements
Python: Model `posixpath` and `os.stat`
2021-11-24 11:36:06 +01:00
Rasmus Wriedt Larsen
eaed870b31 Python: Fix performance problem in PoorMansFunctionResolution
Before these changes:

[2021-11-22 12:02:50] (8s) Tuple counts for PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass#ff/2@cbddf257 after 8.6s:
                      387565   ~0%     {3} r1 = JOIN Attributes::AttrRead#class#f WITH Attributes::AttrRef::accesses_dispred#bff ON FIRST 1 OUTPUT Rhs.2, Lhs.0 'result', Rhs.1
                      6548632  ~0%     {3} r2 = JOIN r1 WITH Function::Function::getName_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'func', Lhs.1 'result', Lhs.2
                      5640480  ~0%     {4} r3 = JOIN r2 WITH Class::Class::getAMethod_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'result', Lhs.2, Lhs.0 'func'
                      55660458 ~0%     {5} r4 = JOIN r3 WITH Class::Class::getAMethod_dispred#ff ON FIRST 1 OUTPUT Rhs.1, 0, Lhs.1 'result', Lhs.2, Lhs.3 'func'
                      55621412 ~0%     {4} r5 = JOIN r4 WITH AstGenerated::Function_::getArg_dispred#fff ON FIRST 2 OUTPUT Rhs.2, Lhs.2 'result', Lhs.3, Lhs.4 'func'
                      54467144 ~0%     {4} r6 = JOIN r5 WITH DataFlowPublic::ParameterNode::getParameter_dispred#fb_10#join_rhs ON FIRST 1 OUTPUT Lhs.2, Rhs.1, Lhs.1 'result', Lhs.3 'func'
                      20928    ~0%     {2} r7 = JOIN r6 WITH LocalSources::Cached::hasLocalSource#ff ON FIRST 2 OUTPUT Lhs.3 'func', Lhs.2 'result'
                                       return r7

With these changes:

[2021-11-22 11:54:25] (415s) Tuple counts for PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass_helper#fff/3@14db70a8 after 75ms:
                      388306 ~0%     {2} r1 = JOIN Attributes::AttrRead#class#f WITH Attributes::AttrRef::getObject_dispred#bf ON FIRST 1 OUTPUT Rhs.1, Lhs.0 'read'
                      379420 ~4%     {2} r2 = JOIN r1 WITH LocalSources::Cached::hasLocalSource#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'read'
                      175082 ~0%     {2} r3 = JOIN r2 WITH DataFlowPublic::ParameterNode#class#fff ON FIRST 1 OUTPUT Rhs.2, Lhs.1 'read'
                      175082 ~2%     {3} r4 = JOIN r3 WITH Essa::ParameterDefinition::getParameter_dispred#ff ON FIRST 1 OUTPUT 0, Rhs.1, Lhs.1 'read'
                      166798 ~0%     {2} r5 = JOIN r4 WITH AstGenerated::Function_::getArg_dispred#fff_120#join_rhs ON FIRST 2 OUTPUT Rhs.2 'func', Lhs.2 'read'
                      162096 ~0%     {3} r6 = JOIN r5 WITH Class::Class::getAMethod_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.0 'func', Rhs.1 'cls', Lhs.1 'read'
                                     return r6

[2021-11-22 11:54:25] (415s) Tuple counts for PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass_helper2#ffff/4@2b60f0s9 after 63ms:
                      162046 ~0%     {3} r1 = SCAN PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass_helper#fff OUTPUT In.2 'read', In.0 'func', In.1 'cls'
                      162046 ~0%     {3} r2 = JOIN r1 WITH Attributes::AttrRead#class#f ON FIRST 1 OUTPUT Lhs.1 'func', Lhs.2 'cls', Lhs.0 'read'
                      162046 ~1%     {3} r3 = JOIN r2 WITH py_Functions ON FIRST 1 OUTPUT Lhs.1 'cls', Lhs.2 'read', Lhs.0 'func'
                      162046 ~0%     {3} r4 = JOIN r3 WITH py_Classes ON FIRST 1 OUTPUT Lhs.1 'read', Lhs.2 'func', Lhs.0 'cls'
                      161935 ~5%     {4} r5 = JOIN r4 WITH Attributes::AttrRef::getAttributeName_dispred#bf ON FIRST 1 OUTPUT Rhs.1, Lhs.0 'read', Lhs.1 'func', Lhs.2 'cls'
                      688526 ~1%     {4} r6 = JOIN r5 WITH Function::Function::getName_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.2 'func', Lhs.3 'cls', Lhs.1 'read', Rhs.1 'readFunction'
                                     return r6

[2021-11-22 11:54:25] (415s) Tuple counts for PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass#ff/2@f73ae6dq after 58ms:
                      688526 ~0%     {4} r1 = SCAN PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass_helper2#ffff OUTPUT In.1, In.0, In.3 'func', In.2 'result'
                      688526 ~0%     {3} r2 = JOIN r1 WITH Class::Class::getAMethod_dispred#ff ON FIRST 2 OUTPUT Rhs.0, Lhs.2 'func', Lhs.3 'result'
                      20913  ~0%     {2} r3 = JOIN r2 WITH Class::Class::getAMethod_dispred#ff ON FIRST 2 OUTPUT Lhs.1 'func', Lhs.2 'result'
                                     return r3

We need the `pragma[only_bind_into]` in getSimpleMethodReferenceWithinClass_helper2, otherwise the tuple counts would look like, which is needlessly big.

[2021-11-22 17:14:34] (2s) Tuple counts for PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass_helper2#ffff/4@5f0505h7 after 711ms:
                      13570510 ~3%     {2} r1 = JOIN Function::Function::getName_dispred#ff_10#join_rhs WITH Attributes::AttrRef::getAttributeName_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'read', Lhs.1 'readFunction'
                      688526   ~1%     {4} r2 = JOIN r1 WITH PoorMansFunctionResolution::getSimpleMethodReferenceWithinClass_helper#fff_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'func', Rhs.2 'cls', Lhs.0 'read', Lhs.1 'readFunction'
                                       return r2
2021-11-22 17:22:39 +01:00
Rasmus Wriedt Larsen
f09f1c4c50 Python: Minor refactor in PoorMansFunctionResolution 2021-11-22 11:11:29 +01:00
Erik Krogh Kristensen
ee858d840e get ReDoSUtil in sync for ruby 2021-11-18 16:49:34 +01:00
Erik Krogh Kristensen
1cca377e7d Merge pull request #6561 from erik-krogh/htmlReg
JS/Py/Ruby: add a bad-tag-filter query
2021-11-18 09:39:13 +01:00
Anders Schack-Mulligen
c70d384d28 Merge pull request #7045 from aschackmull/dataflow/hidden-ret-subpaths
Data flow: Support hidden return nodes in subpaths predicate
2021-11-16 15:04:51 +01:00
Rasmus Wriedt Larsen
a980f26fda Python: Model os.stat (and friends) 2021-11-16 10:45:32 +01:00
Rasmus Wriedt Larsen
9f4107d211 Python: Model posixpath, ntpath, and genericpath modules 2021-11-16 10:45:14 +01:00
Rasmus Wriedt Larsen
6b7abacc5f Merge pull request #7135 from RasmusWL/b32hexencode
Python: Model `b32hexencode`/`b32hexdecode`
2021-11-15 15:51:46 +01:00
Rasmus Wriedt Larsen
39927fa613 Python: Model b32hexencode/b32hexdecode
New in Python 3.10

See
- https://devdocs.io/python~3.10/library/base64#base64.b32hexencode
- https://devdocs.io/python~3.10/library/base64#base64.b32hexdecode
2021-11-15 15:23:49 +01:00
Rasmus Wriedt Larsen
cfdfcaa3e8 Python: Support Path.hardlink_to (new in 3.10)
See https://docs.python.org/3.10/library/pathlib.html#pathlib.Path.hardlink_to
2021-11-15 14:57:59 +01:00
Rasmus Wriedt Larsen
5d60975f65 Python: Support aiter and anext (new in 3.10)
See
- https://docs.python.org/3/whatsnew/3.10.html#other-language-changes
- https://docs.python.org/3.10/library/functions.html#aiter
- https://docs.python.org/3.10/library/functions.html#anext
2021-11-15 14:55:34 +01:00
Rasmus Wriedt Larsen
9e097f5430 Python: Improve PoorMansFunctionResolution 2021-11-15 13:40:19 +01:00
Rasmus Wriedt Larsen
6eb4525ab2 Python: Model wsgiref.simple_server applications 2021-11-15 13:34:39 +01:00
Taus
c17560f948 Merge pull request #7096 from tausbn/python-fix-more-bad-joins
Python: Fix a bunch of performance issues
2021-11-15 12:10:27 +01:00
yoff
9f614b1d98 Merge pull request #7016 from RasmusWL/django-rest-framework
Python: Model Django REST framework
2021-11-12 14:27:56 +01:00
Rasmus Wriedt Larsen
491f72bb2a Python: Adjust generated code to be more familiar 2021-11-12 13:30:03 +01:00
Rasmus Wriedt Larsen
de69e4c645 Python: Expand on SubclassFinder implementation note 2021-11-12 13:29:03 +01:00
Rasmus Wriedt Larsen
f7b53321b9 Python: Remove copy-pasted comment 2021-11-12 13:19:20 +01:00
Taus
55ea715ce9 Merge pull request #7033 from RasmusWL/flask-admin 2021-11-12 12:18:56 +01:00
Rasmus Wriedt Larsen
860b1a5cc3 Python: Other minor QLDoc adjustment 2021-11-12 11:46:45 +01:00
Rasmus Wriedt Larsen
99081ea7e0 Python: Minor adjustment in QLDoc 2021-11-12 11:42:36 +01:00
Rasmus Wriedt Larsen
5e4b866f2b Python: Model rest_framework.exceptions.APIException 2021-11-12 11:37:54 +01:00
Rasmus Wriedt Larsen
62e58b534c Python: SubclassFinder: reorder + comment 2021-11-12 11:11:13 +01:00
Rasmus Wriedt Larsen
f48ecb1dc8 Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-11-12 10:57:56 +01:00
Rasmus Wriedt Larsen
06cae3dac2 Merge pull request #7104 from yoff/python/model-aiomysql
Python: model aiomysql
2021-11-11 16:58:01 +01:00
Rasmus Lerchedahl Petersen
e2a2a42d59 Python: Fix api references 2021-11-11 13:20:57 +01:00
Erik Krogh Kristensen
b513033e0f Merge pull request #7021 from erik-krogh/cwe326
JS: Add insufficient key size query
2021-11-11 12:17:04 +01:00
Anders Schack-Mulligen
7ffd9b4f9e Dataflow: Include read/store steps when finding non-hidden return. 2021-11-11 11:26:21 +01:00
Anders Schack-Mulligen
6d9fb3ca43 Dataflow: Sync. 2021-11-10 15:11:13 +01:00
yoff
d23a920ed4 Merge branch 'main' into python/model-aiomysql 2021-11-10 14:32:36 +01:00
Rasmus Lerchedahl Petersen
57e7bfbdba Python: model aiomysql 2021-11-10 14:29:39 +01:00
Rasmus Wriedt Larsen
de926dc2a1 Merge pull request #7085 from yoff/python/model-aiopg
Python: model aiopg
2021-11-10 13:10:30 +01:00
Rasmus Lerchedahl Petersen
92a7114b72 Python: Add API references 2021-11-10 11:06:58 +01:00
Taus
33135e909a Python: Add magic to named_argument_transfer
This predicate was materialised as a _big_, _cached_ relation:

```
(169s) Tuple counts for PointsTo::InterProceduralPointsTo::named_argument_transfer#ffff#join_rhs/4@38ce07 after 53.4s:
25212     ~4%     {3} r1 = SCAN Function::Function::getArgByName_dispred#fff OUTPUT In.1, In.0 'arg1', In.2 'arg2'
159751200 ~0%     {4} r2 = JOIN r1 WITH Flow::CallNode::getArgByName_dispred#fff_102#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0', Lhs.1 'arg1', Lhs.2 'arg2', Rhs.2 'arg3'
                  return r2
```

... However it's only used in a single place (where it is immediately
joined with the points-to relation to relate the caller and argument),
none of these joins were ever larger than 2000 tuples. This made it
pretty clear that we could gain something by pushing in that points-to
join as a bit of manual magic.

However, doing so didn't actually fix anything, since the join-orderer
then decided to join `func.getArgByName(name)` with
`call.getArgByName(name)` on `name` as the first thing (which caused a
join of the same size as above).

Unbinding didn't work, since `name` would then be an unbound `string`,
so instead I factored out relating the function, parameter, and name
thereof into its own predicate. (I could also have done this with the
call, but I would expect there to be more calls than function
definitions in general.)

Overall, this resulted in going from

```
(709s)
Definitions.ql-7:PointsTo::InterProceduralPointsTo::named_argument_transfer#ffff#join_rhs ......... 53.5s
Definitions.ql-7:Instances::InstanceObject::initializer_dispred#fbf ............................... 35.3s (456 evaluations with max 136ms in Instances::InstanceObject::initializer_dispred#fbf/3@i110#0508e8)
Definitions.ql-10:DefinitionTracking::jump_to_defn_attribute#fbf .................................. 27s (100 evaluations with max 12.8s in DefinitionTracking::jump_to_defn_attribute#fbf/3@i1#fc1f7x)
Definitions.ql-7:PointsTo::PointsToInternal::pointsTo#ffff ........................................ 16.1s (681 evaluations with max 2.5s in PointsTo::PointsToInternal::pointsTo#ffff/4@i4#0508eg)
Definitions.ql-7:Constants::ConstantObjectInternal::attribute#ffff ................................ 13.4s (505 evaluations with max 50ms in Constants::ConstantObjectInternal::attribute#ffff/4@i153#0508e5)
Definitions.ql-10:DefinitionTracking::assignment_jump_to_defn_attribute#fbf ....................... 12.4s (99 evaluations with max 11.8s in DefinitionTracking::assignment_jump_to_defn_attribute#fbf/3@i2#fc1f
7z)
...
```

to

```
(668s)
Definitions.ql-7:Instances::InstanceObject::initializer_dispred#fbf ................... 35.4s (456 evaluations with max 140ms in Instances::InstanceObject::initializer_dispred#fbf/3@i110#bf4328)
Definitions.ql-10:DefinitionTracking::jump_to_defn_attribute#fbf ...................... 27.4s (100 evaluations with max 13.3s in DefinitionTracking::jump_to_defn_attribute#fbf/3@i1#679d7x)
Definitions.ql-7:PointsTo::PointsToInternal::pointsTo#ffff ............................ 16.1s (681 evaluations with max 2.5s in PointsTo::PointsToInternal::pointsTo#ffff/4@i4#bf432g)
Definitions.ql-7:Constants::ConstantObjectInternal::attribute#ffff .................... 14.4s (505 evaluations with max 51ms in Constants::ConstantObjectInternal::attribute#ffff/4@i140#bf4325)
Definitions.ql-10:DefinitionTracking::assignment_jump_to_defn_attribute#fbf ........... 12.3s (99 evaluations with max 11.7s in DefinitionTracking::assignment_jump_to_defn_attribute#fbf/3@i2#679d
7z)
...
```
2021-11-09 21:39:32 +00:00
Taus
e2f79d8516 Python: Fix several bad getScope joins
It seems the optimiser has started getting the wrong end of the stick
whenever we write `foo.getScope() = bar.getScope()` for some expressions
`foo` and `bar`.

This lead to things like

```
(196s) Tuple counts for Definitions::ModuleVariable::global_variable_callnode#ff/2@5ab278 after 2m33s:
2952757013 ~0%     {2} r1 = JOIN Definitions::ModuleVariable::global_variable_callnode#ff#shared WITH Variables::Variable::getScope_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'this', Lhs.1 'result'
495693     ~0%     {2} r2 = JOIN r1 WITH Variables::GlobalVariable#class#f ON FIRST 1 OUTPUT Lhs.0 'this', Lhs.1 'result'
453589     ~0%     {2} r3 = JOIN r2 WITH Definitions::ModuleVariable#f ON FIRST 1 OUTPUT Lhs.0 'this', Lhs.1 'result'
                   return r3
```

and

```
(315s) Tuple counts for Definitions::SsaSourceVariable::getAUse_dispred#ff/2@a39328 after 1m57s:
...
1785275    ~3%       {2} r24 = Definitions::ModuleVariable::global_variable_callnode#ff#shared UNION Definitions::SsaSourceVariable::getAUse_dispred#ff#shared
3008614987 ~0%       {2} r25 = JOIN r24 WITH Variables::Variable::getScope_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'this', Lhs.1 'result'
127        ~1%       {2} r26 = JOIN r25 WITH Definitions::NonLocalVariable#class#f ON FIRST 1 OUTPUT Lhs.0 'this', Lhs.1 'result'
127        ~1%       {2} r27 = JOIN r26 WITH Variables::LocalVariable#f ON FIRST 1 OUTPUT Lhs.0 'this', Lhs.1 'result'
...
```

(Note the timings: 2m33s and 1m57s.)

Now we have the much more reasonable

```
(38s) Tuple counts for Definitions::ModuleVariable::global_variable_callnode#ff/2@c53031 after 42ms:
453589 ~0%     {2} r1 = JOIN Definitions::ModuleVariable::global_variable_callnode#ff#shared WITH Definitions::ModuleVariable::scope_as_global_variable#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'this', Lhs.1 'result'
               return r1
```

and

```
(46s) Tuple counts for Definitions::SsaSourceVariable::getAUse_dispred#ff/2@4b19de after 375ms:
...
```
2021-11-09 20:54:41 +00:00
Rasmus Wriedt Larsen
1e31416049 Merge pull request #7031 from yoff/python/taint-through-with
Python: Taint through `async with`
2021-11-09 14:08:07 +01:00
Rasmus Lerchedahl Petersen
a58c47b07b Python: model aiopg.sa 2021-11-09 12:49:57 +01:00