Commit Graph

9277 Commits

Author SHA1 Message Date
Rasmus Lerchedahl Petersen
27301edc28 Python: address more review comments 2024-06-27 16:05:21 +02:00
yoff
c2141b62e0 Apply suggestions from code review
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2024-06-27 14:53:03 +02:00
Rasmus Lerchedahl Petersen
da03237b32 Python: fix typo pointed out in review but missed by me 2024-06-27 11:21:28 +02:00
Rasmus Lerchedahl Petersen
a3076f4f72 Python: fix test expectations, add missing sanitizer 2024-06-26 13:27:32 +02:00
Rasmus Lerchedahl Petersen
b261145f43 Python: fix compilation 2024-06-26 10:46:38 +02:00
Joe Farebrother
6538d22d3f Fix tornado model of httheaders.add. 2024-06-26 09:21:53 +01:00
Rasmus Lerchedahl Petersen
571be8be3e Python: model more loggers 2024-06-26 01:00:38 +02:00
Rasmus Lerchedahl Petersen
eb32cbe8a5 Python: codecs.open 2024-06-26 00:57:59 +02:00
Rasmus Lerchedahl Petersen
bdc48088e6 Python: MaD summary models
Two of the generated summaries have been excluded:
 - ["re", "Member[split]", "Argument[0,pattern:]", "ReturnValue", "taint"]
   From the documentation, it is not clear why pattern should figure in the return value, as that is the part denoting split point and thus all those instances are filtered out.
   From the implementation
     Spit function: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L199
     _compile function being called by split: https://github.com/python/cpython/blob/3.12/Lib/re/__init__.py#L280
   We see that in case the pattern is already a compiled `Pattern`, it is returned directly from _compile and could thus be part of the return value from split. This is probably not possible to arrange for an attacker, and so an FP in practice.

 - ["urllib2", "Member[unquote]", "Argument[0,string:]", "ReturnValue", "taint"]
   urllib2 seems to be only in Python2 (e.g. https://docs.python.org/2.7/library/urllib2.html) and I cannot locate the function unquote.
2024-06-26 00:39:30 +02:00
yoff
58b6b3f601 Merge pull request #16789 from yoff/python/document-models-as-data
python: Document MaD format
2024-06-25 15:46:28 +02:00
Rasmus Lerchedahl Petersen
bc551174f9 Python: model copy.deepcopy as a value step 2024-06-25 14:53:06 +02:00
Rasmus Lerchedahl Petersen
501cda4e8c Python: model fnmatch.filter 2024-06-25 14:44:39 +02:00
Rasmus Lerchedahl Petersen
2118f233b9 Python: model optparse.OptionParser.parse_arg 2024-06-25 14:40:23 +02:00
Rasmus Lerchedahl Petersen
b80a711b27 python: undo changes to qlpack 2024-06-25 14:13:59 +02:00
Rasmus Lerchedahl Petersen
1e97600c4a Python: move models 2024-06-25 14:13:56 +02:00
Rasmus Lerchedahl Petersen
d410136852 python: compress models 2024-06-25 14:13:52 +02:00
Rasmus Lerchedahl Petersen
c004ffaca8 python: move model to Stdlib.yml
There is already a model there so we add to that one.

We did observe that this existing model was blocked by the external MaD model.
This is concerning and needs to be cleared up.
2024-06-25 14:13:48 +02:00
Rasmus Lerchedahl Petersen
281ac05868 python: add modelling for urlib.parse
- `quote` together with `re.compile` recover regex injection alerts on haiwen/seahub
- `quote_plus` recovers the URL redirection alert on DemocracyClub/EveryElection
- `unquote` recovers path injection alerts on `cloudera/hue`
- it was tedious finding justifications for the rest..
2024-06-25 14:13:44 +02:00
Rasmus Lerchedahl Petersen
df406b4fca python: Start modelling using MaD
- empty models for now
- `summaryModel` of `codeql/python-all` will be added to shortly.
2024-06-25 14:13:41 +02:00
Rasmus Lerchedahl Petersen
aa4fd1992e Python: compact types in type models 2024-06-25 11:59:55 +02:00
Rasmus Lerchedahl Petersen
b902dd5680 Python: add change note 2024-06-25 11:54:30 +02:00
github-actions[bot]
fd385736e6 Post-release preparation for codeql-cli-2.17.6 2024-06-25 06:39:45 +00:00
Joe Farebrother
0901b3d0a6 Add change note 2024-06-24 21:43:09 +01:00
Joe Farebrother
d0f735ac28 Update tests for restframework 2024-06-24 20:52:09 +01:00
Joe Farebrother
c404f00a9b Add additional header write models for aiohttp and tornado + added qldoc 2024-06-24 17:27:25 +01:00
Joe Farebrother
79c0ed6074 Add additional fastapi mheader write models 2024-06-24 17:27:21 +01:00
Joe Farebrother
5ced5c010c Add django header writes 2024-06-24 17:27:15 +01:00
Joe Farebrother
7704801e47 Change fastapi raw cookie header models to header write models 2024-06-24 17:27:12 +01:00
Joe Farebrother
a0201e9c4f Update tests for new cookie write from headers 2024-06-24 17:27:06 +01:00
Joe Farebrother
6b8080a5b3 Update concept tests for header writes 2024-06-24 17:27:02 +01:00
Joe Farebrother
d11f58f768 Add cookie header write concept from experimental. 2024-06-24 17:26:56 +01:00
Joe Farebrother
b71ba7c30f Move Header Write derrived concepts to Concepts 2024-06-24 17:26:51 +01:00
github-actions[bot]
e32a587078 Release preparation for version 2.17.6 2024-06-24 14:33:10 +00:00
Anders Schack-Mulligen
8c23e21073 Dataflow: Cache compatibleTypes. 2024-06-24 13:35:48 +02:00
Rasmus Lerchedahl Petersen
00fbada41d Python: recognize fabric.operations 2024-06-24 10:54:59 +02:00
Taus
4a448f445e Merge pull request #15715 from am0o0/am0o0-python-codeExec
Python: New command execution sinks
2024-06-21 14:26:33 +02:00
Taus
6db7e72fb8 Python: Fix bad join in DataFlowDispatch
A case of bad magic. Rather than evaluating separately whether a class
has a method of some name, the compiler opted to magick in the fact
that this was done as part of the `findFunctionAccordingToMro`
predicate. Hilarity ensued.

However, _we_ know that magic really isn't needed in this case (the
number of results is bounded by `Class.getAMethod` since methods have
only a single name), so by factoring it out into a helper predicate, we
can help the join-orderer along.

Before
```
(377s) Starting to evaluate predicate _DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_DataFlowDispatch::getNextClassInMro/1#__#shared/3@i6#L3#f893bw2h (iteration 6)
(377s) Tuple counts for _DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_DataFlowDispatch::getNextClassInMro/1#__#shared/3@i6#L3#f893bw2h after 16ms:
33363  ~0%     {2} r1 = SCAN `DataFlowDispatch::getNextClassInMro/1#e1ee596a#prev_delta` OUTPUT In.1, In.0 'arg1'
159696 ~4%     {3}    | JOIN WITH `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev` ON FIRST 1 OUTPUT Rhs.1 'arg0', Lhs.1 'arg1', Rhs.2 'arg2'
               return r1
(377s) Starting to evaluate predicate _Class::Class.getAMethod/0#dispred#66416e47_Function::Function.getName/0#dispred#033700ef_10#join_rh__#antijoin_rhs/3@i6#L4#f893bw2h (iteration 6)
(382s) Tuple counts for _Class::Class.getAMethod/0#dispred#66416e47_Function::Function.getName/0#dispred#033700ef_10#join_rh__#antijoin_rhs/3@i6#L4#f893bw2h after 4.4s:
1770825904 ~4%     {4} r1 = JOIN `_DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_DataFlowDispatch::getNextClassInMro/1#__#shared` WITH `Function::Function.getName/0#dispred#033700ef_10#join_rhs` ON FIRST 1 OUTPUT Lhs.1 'arg0', Rhs.1, Lhs.0 'arg1', Lhs.2 'arg2'
34558      ~3%     {3}    | JOIN WITH `Class::Class.getAMethod/0#dispred#66416e47` ON FIRST 2 OUTPUT Lhs.0 'arg0', Lhs.2 'arg1', Lhs.3 'arg2'
                   return r1
...
(382s) Starting to evaluate predicate DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3/3@i6#f893b1xh (iteration 6)
(382s)                     - DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3_delta has 125138 rows (order for disjuncts: delta=<standard>).
(382s) Tuple counts for DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3/3@i6#f893b1xh after 12ms:
33363  ~0%     {2} r1 = SCAN `DataFlowDispatch::getNextClassInMro/1#e1ee596a#prev_delta` OUTPUT In.1, In.0 'cls'
159696 ~0%     {3}    | JOIN WITH `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev` ON FIRST 1 OUTPUT Lhs.1 'cls', Rhs.1 'name', Rhs.2 'result'
125138 ~1%     {3}    | AND NOT `_Class::Class.getAMethod/0#dispred#66416e47_Function::Function.getName/0#dispred#033700ef_10#join_rh__#antijoin_rhs`(FIRST 3)

0      ~0%     {3} r2 = JOIN `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_delta` WITH `DataFlowDispatch::getNextClassInMro/1#e1ee596a#reorder_1_0#prev` ON FIRST 1 OUTPUT Lhs.1 'name', Lhs.2 'result', Rhs.1 'cls'
               {3}    | AND NOT `_Class::Class.getAMethod/0#dispred#66416e47_DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#__#antijoin_rhs`(FIRST 3)
0      ~0%     {3}    | SCAN OUTPUT In.2 'cls', In.0 'name', In.1 'result'

125138 ~1%     {3} r3 = r1 UNION r2
125138 ~1%     {3}    | AND NOT `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev`(FIRST 3)
               return r3
```

And now
```
(18s) Tuple counts for DataFlowDispatch::class_has_method/2#0d2ae9c0/2@ff66c1lr after 18ms:
202279 ~1%     {2} r1 = JOIN `Class::Class.getAMethod/0#dispred#66416e47_10#join_rhs` WITH `Function::Function.getName/0#dispred#033700ef` ON FIRST 1 OUTPUT Lhs.1 'cls', Rhs.1 'name'
               return r1
...
(490s) Tuple counts for DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3/3@i6#48b6c1xi after 54ms:
0      ~0%     {3} r1 = JOIN `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev_delta` WITH `DataFlowDispatch::getNextClassInMro/1#e1ee596a#reorder_1_0#prev` ON FIRST 1 OUTPUT Rhs.1 'cls', Lhs.1 'name', Lhs.2 'result'
0      ~0%     {3}    | AND NOT `DataFlowDispatch::class_has_method/2#0d2ae9c0`(FIRST 2)

33363  ~0%     {2} r2 = SCAN `DataFlowDispatch::getNextClassInMro/1#e1ee596a#prev_delta` OUTPUT In.1, In.0 'cls'
159696 ~0%     {3}    | JOIN WITH `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev` ON FIRST 1 OUTPUT Lhs.1 'cls', Rhs.1 'name', Rhs.2 'result'
125138 ~1%     {3}    | AND NOT `DataFlowDispatch::class_has_method/2#0d2ae9c0`(FIRST 2)

125138 ~1%     {3} r3 = r1 UNION r2
125138 ~1%     {3}    | AND NOT `DataFlowDispatch::findFunctionAccordingToMro/2#a610c0a3#prev`(FIRST 3)
               return r3
```
2024-06-21 12:16:27 +00:00
Rasmus Lerchedahl Petersen
280a9b4408 Python: Support Model Editor 2024-06-21 11:47:51 +02:00
Rasmus Lerchedahl Petersen
f0e68887d4 Python: autoformat 2024-06-20 10:59:39 +02:00
Rasmus Lerchedahl Petersen
5cb37f5c4c python: Document MaD format
- add a few tests reflecting the documentation
- make the mentioned sink-kinds have an effect on relevant queries
2024-06-19 17:00:15 +02:00
Paolo Tranquilli
b7a2ea8981 CI: accept other diagnostic format related test changes 2024-06-19 11:33:50 +02:00
am0o0
ccb923a436 fix formatting 2024-06-18 18:31:29 +02:00
am0o0
1f99559e9f Revert "update id of the query file"
This reverts commit 1f112467ce.
2024-06-18 17:33:07 +02:00
am0o0
8a7fdfa6fe fix conflict 2024-06-18 17:18:59 +02:00
Paolo Tranquilli
daea773fce Python: tests with false positives around match 2024-06-14 17:28:35 +02:00
Taus
b7b0f84e8b Python: Handle @pytest.fixture decorations with arguments as well
Not the prettiest of solutions, but it seems to work well enough.
2024-06-14 15:11:25 +00:00
Paolo Tranquilli
1046d03486 Python: update unused import test case for pytest 2024-06-14 16:55:05 +02:00
Taus
2f00a0d323 Python: Also test pytest fixture factories 2024-06-14 13:11:00 +00:00
Taus
78729180ad Python: Fix pytest fixture unused import FPs 2024-06-14 12:05:55 +00:00
Taus
f3a9c9a9dc Python: Add tests for pytest fixture unused import FPs 2024-06-14 12:03:43 +00:00