Commit Graph

4612 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
08f6d1ab80 Python: Clearer sourceType for client response body 2021-12-13 11:24:38 +01:00
Rasmus Wriedt Larsen
5de79b4ffe Python: Add HTTP::Client::Request concept
Taken from Ruby, except that `getURL` member predicate was changed to
`getUrl` to keep consistency with the rest of our concepts, and stick
to our naming convention.
2021-12-13 11:09:09 +01:00
Rasmus Wriedt Larsen
1e45fa9ed4 JS/Py/Ruby: Add more CWEs to bad-tag-filter queries
CWE-185: Incorrect Regular Expression

The software specifies a regular expression in a way that causes data to
be improperly matched or compared.

https://cwe.mitre.org/data/definitions/185.html

CWE-186: Overly Restrictive Regular Expression

> A regular expression is overly restrictive, which prevents dangerous values from being detected.
>
> (...) [this CWE] is about a regular expression that does not match all
> values that are intended. (...)

https://cwe.mitre.org/data/definitions/186.html

From my understanding,
CWE-625: Permissive Regular Expression, is not applicable. (since this
is about accepting a regex match where there should not be a match).
2021-12-13 10:23:24 +01:00
Andrew Eisenberg
66c1629974 Merge pull request #7285 from github/post-release-prep-2.7.3-ddd4ccbb
Post-release preparation 2.7.3
2021-12-10 09:59:45 -08:00
yoff
d8857c7ce8 Merge pull request #7246 from tausbn/python/import-star-flow
Python: Support flow through `import *`
2021-12-10 16:34:32 +01:00
Rasmus Wriedt Larsen
bd9b96e154 Merge pull request #7331 from tausbn/python-fix-bad-callsite-points-to-join
Python: Fix bad `callsite_points_to` join
2021-12-10 15:39:49 +01:00
Rasmus Wriedt Larsen
8ee020f79c Merge pull request #7332 from tausbn/python-fix-bad-scope-entry-points-to-join
Python: Fix bad `scope_entry_points_to` join
2021-12-10 15:33:13 +01:00
Taus
6d247bfdf9 Merge pull request #7330 from tausbn/python-fix-bad-adjacentuseuse-join
Python: Fix bad join in SSA
2021-12-09 20:55:45 +01:00
yoff
8e11c2c476 Merge pull request #7259 from RasmusWL/even-more-path-injection-sinks
Python: Add more path-injection sinks from `os` and `tempfile` modules
2021-12-09 14:46:41 +01:00
Taus
b871342e83 Python: A small further performance improvement
Unrolling the transitive closure had slightly better performance here.

Also, we exclude names of builtins, since those will be handled by a
separate case of `isDefinedLocally`.
2021-12-09 10:29:55 +00:00
Taus
8517eff0f7 Python: Fix bad performance
A few changes, all bundled together:

- We were getting a lot of magic applied to the predicates in the
  `ImportStar` module, and this was causing needless re-evaluation.
  To address this, the easiest solution was to simply cache the entire
  module.
- In order to separate this from the dataflow analysis and make it
  dependent only on control flow, `potentialImportStarBase` was changed
  to return a `ControlFlowNode`.
- `isDefinedLocally` was defined on control flow nodes, which meant we
  were duplicating a lot of tuples due to control flow splitting, to no
  actual benefit.

Finally, there was a really bad join in `isDefinedLocally` that was
fixed by separating out a helper predicate. This is a case where we
could use a three-way join, since the join between the `Scope`, the
`name` string and the `Name` is big no matter what.

If we join `scope_defines_name` with `n.getId()`, we'll get `Name`s
belonging to irrelevant scopes.

If we join `scope_defines_name` with the enclosing scope of the `Name`
`n`, then we'll get this also for `Name`s that don't share their `getId`
with the local variable defined in the scope.

If we join `n.getId()` with `n.getScope()...` then we'll get all
enclosing scopes for each `Name`.

The last of these is what we currently have. It's not terrible, but not
great either. (Though thankfully it's rare to have lots of enclosing
scopes.)
2021-12-08 22:53:45 +00:00
Anders Schack-Mulligen
38d0bb4a60 Merge pull request #7260 from hvitved/dataflow/argument-parameter-matching
Data flow: Introduce `ParameterPosition` and `ArgumentPosition`
2021-12-08 12:49:08 +01:00
Tom Hvitved
283173ad02 Address review comments 2021-12-08 11:26:44 +01:00
Tom Hvitved
490872173a Data flow: Sync files 2021-12-07 20:29:18 +01:00
Taus
e7c298d903 Python: Fix bad scope_entry_points_to join
From `pritomrajkhowa/LoopBound`:

```
Definitions.ql-7:PointsTo::PointsToInternal::scope_entry_points_to#ffff#antijoin_rhs#2 ........... 55.1s
```

specifically

```
(443s) Tuple counts for PointsTo::PointsToInternal::scope_entry_points_to#ffff#antijoin_rhs#2/3@74a7cart after 55.1s:
184070    ~0%        {3} r1 = JOIN PointsTo::PointsToInternal::scope_entry_points_to#ffff#shared#1 WITH Variables::GlobalVariable#class#f ON FIRST 1 OUTPUT Lhs.0 'arg2', Lhs.1 'arg0', Lhs.2 'arg1'
184070    ~0%        {3} r2 = STREAM DEDUP r1
919966523 ~2%        {4} r3 = JOIN r2 WITH Essa::EssaDefinition::getSourceVariable_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'arg0', Lhs.2 'arg1', Lhs.0 'arg2'
4281779   ~2293%     {3} r4 = JOIN r3 WITH Essa::EssaVariable::getScope_dispred#ff ON FIRST 2 OUTPUT Lhs.1 'arg0', Lhs.2 'arg1', Lhs.3 'arg2'
                    return r4
```

First, this is an `antijoin`, so there's likely some negation involved.
Also, there's mention of `GlobalVariable`, `getScope`, and
`getSourceVariable`, none of which appear in `scope_entry_points_to`, so
it's likely that something got inlined.

Taking a closer look at the predicates mentioned in the body, we spot
`undefined_variable` as a likely culprit.

Evaluating this predicate in isolation reveals that it's not terribly
big, so we could try just marking it with `pragma[noinline]` (I opted
for the slightly more solid `nomagic`) and see how that fares. I also
checked that `builtin_not_in_outer_scope` was similarly small, and
made that one un-inlineable as well.

The result? Well, I can't even show you. Both `scope_entry_points_to`
and `undefined_variable` are so fast that they don't appear in the
clause timing report (so they can at most take 3.5s each to evaluate, as
that is the smallest timing in the list).
2021-12-07 18:51:44 +00:00
Taus
b502ca1ea7 Python: Fix bad callsite_points_to join
From `pritomrajkhowa/LoopBound`:

```
Definitions.ql-7:PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#join_rhs#3 ........... 5m53s
```

specifically

```
(767s) Tuple counts for PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#join_rhs#3/3@f8f86764 after 5m53s:
832806293 ~0%     {4} r1 = JOIN PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared#1 WITH PointsTo::InterProceduralPointsTo::var_at_exit#fff ON FIRST 1 OUTPUT Lhs.0, Lhs.1 'arg1', Rhs.1 'arg2', Rhs.2 'arg0'
832806293 ~0%     {3} r2 = JOIN r1 WITH Essa::TEssaNodeRefinement#ffff_03#join_rhs ON FIRST 2 OUTPUT Lhs.3 'arg0', Lhs.1 'arg1', Lhs.2 'arg2'
                return r2
```

This one is a bit tricky to unpack. Where is this `shared#1` defined?

```
EVALUATE NONRECURSIVE RELATION:
SYNTHETIC PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared#1(int arg0, numbered_tuple arg1) :-
    SENTINEL PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared
    SENTINEL Definitions::EscapingAssignmentGlobalVariable#class#f
    SENTINEL Essa::TEssaNodeRefinement#ffff_03#join_rhs
    {2} r1 = JOIN PointsTo::InterProceduralPointsTo::callsite_points_to#ffff#shared WITH Definitions::EscapingAssignmentGlobalVariable#class#f ON FIRST 1 OUTPUT Lhs.0 'arg0', Lhs.1 'arg1'
    {2} r2 = STREAM DEDUP r1
    {2} r3 = JOIN r2 WITH Essa::TEssaNodeRefinement#ffff_03#join_rhs ON FIRST 2 OUTPUT Lhs.0 'arg0', Lhs.1 'arg1'
    {2} r4 = STREAM DEDUP r3
    return r4
```

Looking at `callsite_points_to`, we see a likely candidate in `srcvar`.
It is guarded with an `instanceof` check for
`EscapingAssignmentGlobalVariable` (which lines up nicely with the
sentinel on its charpred) and `getSourceVariable` is just a projection
of `TEssaNodeRefinement`.

So let's try unbinding `srcvar` to prevent an early join.

The timing is now:

```
Definitions.ql-7:PointsTo::InterProceduralPointsTo::callsite_points_to#ffff ...................... 31.3s (2554 evaluations with max 101ms in PointsTo::InterProceduralPointsTo::callsite_points_to#ffff/4@i516#581fap5w)
```
(Showing the tuple counts doesn't make sense here, since all of the
`shared` and `join_rhs` predicates have been smooshed around.)
2021-12-07 18:25:53 +00:00
Taus
a716482c1f Python: Fix bad join in SSA
On `pritomrajkhowa/LoopBound`:

```
Definitions.ql-3:SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUse#ff ................. 4m35s
```

specifically

```
(376s) Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUse#ff/2@be04e9kp after 4m58s:
388843     ~0%     {4} r1 = JOIN Essa::TPhiFunction#fff_2#join_rhs WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#ffff ON FIRST 1 OUTPUT Rhs.1, Lhs.0, Rhs.2, Rhs.3
3629812090 ~1%     {7} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::variableUse#ffff ON FIRST 1 OUTPUT Lhs.0, Rhs.2, Rhs.3, Lhs.2, Lhs.3, Lhs.1, Rhs.1 'use1'
0          ~0%     {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#fffff ON FIRST 5 OUTPUT Lhs.5, Lhs.6 'use1'
0          ~0%     {2} r4 = JOIN r3 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#ff ON FIRST 1 OUTPUT Lhs.1 'use1', Rhs.1 'use2'

897141     ~0%     {2} r5 = SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUseSameVar#ff UNION r4
                    return r5
```

Clearly we do not want to join on the variable so soon. So we unbind it
and get

```
(78s) Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUse#ff/2@40e0e6uv after 434ms:
3377959 ~2%     {4} r1 = SCAN SsaCompute::SsaComputeImpl::variableUse#ffff OUTPUT In.0, In.2, In.3, In.1 'use1'
1026855 ~2%     {4} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#fffff ON FIRST 3 OUTPUT Lhs.0, Rhs.3, Rhs.4, Lhs.3 'use1'
129484  ~0%     {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#ffff_1230#join_rhs ON FIRST 3 OUTPUT Rhs.3, Lhs.3 'use1'
0       ~0%     {2} r4 = JOIN r3 WITH Essa::TPhiFunction#fff_2#join_rhs ON FIRST 1 OUTPUT Lhs.0, Lhs.1 'use1'
0       ~0%     {2} r5 = JOIN r4 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#ff ON FIRST 1 OUTPUT Lhs.1 'use1', Rhs.1 'use2'

897141  ~0%     {2} r6 = SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentUseUseSameVar#ff UNION r5
                return r6
```
2021-12-07 18:19:47 +00:00
Taus
59bac04d8f Python: Fix Python 2 failures 2021-12-07 18:00:46 +00:00
Taus
ffc858e34d Python: Add missing file 2021-12-07 17:29:35 +00:00
Taus
7437cd4d85 Python: Fix syntax error locations 2021-12-07 16:51:33 +00:00
Erik Krogh Kristensen
3c59aa319e Merge pull request #7245 from erik-krogh/explicit-this-all-the-places
All langs: apply the explicit-this patch to all remaining code
2021-12-07 10:40:26 +01:00
Taus
7cd9369d91 Python: Autoformat 2021-12-07 09:29:24 +00:00
Taus
33a9f86f54 Python: Change integer in trois.py 2021-12-07 08:54:07 +00:00
Taus
dd33f4f4d2 Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-12-07 09:48:53 +01:00
Taus
7f44cebed7 Python: Add missing hidden flow
The easiest way to implement this was to change the definition of
`module_export` to account for chains of `import *`. We reuse the
machinery from `ImportStar.qll` for this, naturally.
2021-12-02 17:11:56 +00:00
Taus
4138296ec6 Python: Add test for "hidden" import * flow
TL;DR: We were missing out on flow in the following situation:

`mod1.py`:
```python
foo = SOURCE
```

`mod2.py`:
```python
from mod1 import *
```

`test.py`:
```python
from mod2 import foo
SINK(foo)
```

This is because there's no node at which a read of `foo` takes place
within `test.py`, and so the added reads make no difference.

Unfortunately, this means the previous test was a bit too simplistic,
since it only looks for module variable reads and writes. Because of
this, we change the test to be a more traditional "all flow" style
(though restricted to `CfgNode`s).
2021-12-02 17:05:54 +00:00
Nick Rolfe
05415768c9 Merge remote-tracking branch 'origin/main' into nickrolfe/regexp_g_anchor 2021-12-02 12:07:13 +00:00
yoff
f10f053c36 Merge pull request #7228 from RasmusWL/fastapi-improvements
Python: FastAPI improvements
2021-12-02 12:58:53 +01:00
yoff
4609b2060a Merge pull request #7217 from RasmusWL/more-path-injection-fps
Python: Add `x in <var>` test for StringConstCompare
2021-12-02 12:35:33 +01:00
github-actions[bot]
87b968f337 Post-release preparation 2.7.3 2021-12-02 00:46:55 +00:00
Anders Schack-Mulligen
cde853c095 Merge pull request #7270 from aschackmull/dataflow/stage2-refactor
Dataflow: Stage 2 refactor
2021-12-01 11:09:08 +01:00
Tom Hvitved
e410244fe0 Python: Implement ParameterPosition et al 2021-12-01 08:51:22 +01:00
github-actions[bot]
337ce65fe5 Release preparation for version 2.7.3 2021-11-30 20:39:35 +00:00
Tom Hvitved
540ecf3c21 Data flow: Sync files 2021-11-30 15:20:20 +01:00
Anders Schack-Mulligen
3e914ef2ff Dataflow: Sync. 2021-11-30 13:52:52 +01:00
Dave Bartolomeo
9f6c0991cf Catch up with recent change notes 2021-11-29 16:41:18 -05:00
Dave Bartolomeo
5ed9029143 Move change notes to correct directories 2021-11-29 16:31:11 -05:00
Dave Bartolomeo
cd8a10d0a5 Python change notes 2021-11-29 16:17:05 -05:00
Dave Bartolomeo
d0dac03bad Manually bump versions 2021-11-29 14:21:08 -05:00
Dave Bartolomeo
2dfcd1dd9c Add groups property
Also removed versions from test packs
2021-11-29 14:15:53 -05:00
Rasmus Wriedt Larsen
d557f6fd2e Merge pull request #7101 from RasmusWL/python-ids
Python: Fix some query-ids
2021-11-29 16:12:57 +01:00
yoff
41b7922c7d Merge pull request #7089 from RasmusWL/redos-cwe-1333
Python/C#: Add CWE-1333 to redos queries
2021-11-29 16:09:39 +01:00
yoff
19802ccb73 Merge pull request #7046 from RasmusWL/django-own-json-response
Python: Add test with custom django json response (FP)
2021-11-29 16:05:20 +01:00
yoff
e63f9141e5 Merge pull request #7233 from RasmusWL/fix-cleartext-logging-cwes
JS/Py: Fix cleartext logging CWEs
2021-11-29 15:58:10 +01:00
Rasmus Wriedt Larsen
cbd7434a7e Python: Add modeling of tempfile module 2021-11-29 15:08:36 +01:00
Rasmus Wriedt Larsen
b68538376c Python: Add tests of tempfile module 2021-11-29 15:08:36 +01:00
Rasmus Wriedt Larsen
3bcf6d68ce Python: Refactor os FileSystemAccess change-note
I think it's more readable to have only one to cover all of these
changes, even though they came in through different PRs.
2021-11-29 15:08:18 +01:00
Rasmus Wriedt Larsen
58f92764f7 Python: Model more file access from os module 2021-11-29 14:54:02 +01:00
Rasmus Wriedt Larsen
fd23fa94a5 Python: Remove dubious fstat* modeling
These operate on file descriptors, and not on paths. file descriptors
doesn't fit into the rest of our modeling, so I would rather remove them
than to make it look like it's properly handled.

I also did not include any of the functions that work on file
descriptors when looking through all of `os`. So this keeps everything
consistent at least ;)
2021-11-29 14:54:02 +01:00
Rasmus Wriedt Larsen
e79b8f3e23 Python: Treat os.exec*, os.spawn*, and os.posix_spawn* as FileSystemAccess 2021-11-29 14:54:02 +01:00