Commit Graph

618 Commits

Author SHA1 Message Date
Asger F
3335d48154 Sync files 2024-04-16 20:26:41 +02:00
Asger F
be64daf265 Merge branch 'main' into js/graph-export 2024-04-16 20:23:33 +02:00
Tom Hvitved
e7dc120456 Add deprecation comments 2024-04-12 13:40:15 +02:00
Tom Hvitved
ceb5b4c56e Python: No longer use models-as-data CSV interface 2024-04-12 13:40:15 +02:00
Tom Hvitved
fdb77457b3 Sync files 2024-04-12 13:40:14 +02:00
Anders Schack-Mulligen
a8fc100108 Python: Add alert provenance plumbing. 2024-04-12 09:20:08 +02:00
Rasmus Wriedt Larsen
4fed3cf12d Python: Fix RemoteFlowSourceFromCsv 2024-04-10 11:31:34 +02:00
Asger F
f5355cfa98 Dynamic: Sync ApiGraphModels.qll 2024-04-09 14:37:20 +02:00
Tom Hvitved
1dc13cc169 Merge pull request #15923 from hvitved/shared-xml-impl
Properly shared `XML.qll` implementation
2024-04-03 11:39:50 +02:00
Rasmus Wriedt Larsen
bfa8515b28 Python: Apply suggestions from code review
Co-authored-by: Anders Schack-Mulligen <aschackmull@users.noreply.github.com>
2024-03-21 14:51:45 +01:00
Rasmus Wriedt Larsen
2aa5ae41fb Python: Fix join-order problem in SqlAlchemy
No major performance impact, more of a learning example for myself (had +3000 join order badness).

Initial tuple counts

```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g in 1ms on iteration 1 (delta size: 4).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g on iteration 1 running pipeline base with tuple counts:
        37793   ~0%    {3} r1 = JOIN `ApiGraphs::API::Node.getACall/0#dispred#312deb92_10#join_rhs` WITH DataFlowPublic::CallCfgNode#b8ddbf81 ON FIRST 1 OUTPUT Lhs.1, Lhs.0, Rhs.1
            0   ~0%    {2}    | JOIN WITH `SqlAlchemy::SqlAlchemy::Connection::classRef/0#565fc3ad` ON FIRST 1 OUTPUT Lhs.1, Lhs.2

           30   ~0%    {5} r2 = JOIN DataFlowPublic::CallCfgNode#b8ddbf81 WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb` ON FIRST 1 OUTPUT Lhs.0, Lhs.1, Rhs.1, Rhs.2, _
                       {4}    | REWRITE WITH NOT [NOT [Tmp.4 := "begin", TEST InOut.3 = Tmp.4], NOT [Tmp.4 := "connect", TEST InOut.3 = Tmp.4]] KEEPING 4
           21   ~0%    {3}    | SCAN OUTPUT In.2, In.0, In.1
            4   ~0%    {2}    | JOIN WITH `SqlAlchemy::SqlAlchemy::Engine::instance/0#1828baef` ON FIRST 1 OUTPUT Lhs.1, Lhs.2

            4   ~0%    {2} r3 = r1 UNION r2
                       return r3
```

which is fixed by using `only_bind_out`

```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@49effxtg in 0ms on iteration 1 (delta size: 0).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@49effxtg on iteration 1 running pipeline base with tuple counts:
        0  ~0%    {1} r1 = JOIN `SqlAlchemy::SqlAlchemy::Connection::classRef/0#565fc3ad` WITH `ApiGraphs::API::Node.getACall/0#dispred#312deb92` ON FIRST 1 OUTPUT Rhs.1
        0  ~0%    {2}    | JOIN WITH DataFlowPublic::CallCfgNode#b8ddbf81 ON FIRST 1 OUTPUT Lhs.0, Rhs.1
                  return r1
```

We also had this initial problem

```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g in 1ms on iteration 4 (delta size: 0).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0@594cfx2g on iteration 4 running pipeline standard with tuple counts:
        48722   ~6%    {2} r1 = DataFlowPublic::CallCfgNode#b8ddbf81 AND NOT SqlAlchemy::SqlAlchemy::Connection::ConnectionConstruction#45e716e0#prev(FIRST 2)

        48722   ~3%    {3} r2 = SCAN r1 OUTPUT In.0, _, In.1
        48722   ~1%    {3}    | REWRITE WITH Out.1 := "connect"
           16   ~0%    {3}    | JOIN WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb_021#join_rhs` ON FIRST 2 OUTPUT Rhs.2, Lhs.0, Lhs.2
            0   ~0%    {2}    | JOIN WITH `SqlAlchemy::SqlAlchemy::Connection::instance/0#5ed87c17#prev_delta` ON FIRST 1 OUTPUT Lhs.1, Lhs.2

        48722   ~3%    {3} r3 = SCAN r1 OUTPUT In.0, _, In.1
        48722   ~2%    {3}    | REWRITE WITH Out.1 := "execution_options"
            9   ~0%    {3}    | JOIN WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb_021#join_rhs` ON FIRST 2 OUTPUT Rhs.2, Lhs.0, Lhs.2
            0   ~0%    {2}    | JOIN WITH `SqlAlchemy::SqlAlchemy::Connection::instance/0#5ed87c17#prev_delta` ON FIRST 1 OUTPUT Lhs.1, Lhs.2

            0   ~0%    {2} r4 = r2 UNION r3
                       return r4
```

which is fixed by `connectionConstruction_helper`

```
Evaluated recursive predicate SqlAlchemy::SqlAlchemy::Connection::helper/0#62cfc178#b@4f295yef in 1ms on iteration 4 (delta size: 0).
Evaluated relational algebra for predicate SqlAlchemy::SqlAlchemy::Connection::helper/0#62cfc178#b@4f295yef on iteration 4 running pipeline standard with tuple counts:
         4  ~0%    {1} r1 = JOIN `SqlAlchemy::SqlAlchemy::Connection::instance/1#029b4c87#prev_delta` WITH `TypeTrackingImpl::TypeTracker::end/0#2ac2cfd4` ON FIRST 1 OUTPUT Lhs.1
        16  ~0%    {1}    | JOIN WITH `LocalSources::Cached::hasLocalSource/2#8b3ee0ec_10#join_rhs` ON FIRST 1 OUTPUT Rhs.1
         0  ~0%    {3}    | JOIN WITH `DataFlowPublic::MethodCallNode.calls/2#dispred#1dd1e0f4#ffb_102#join_rhs` ON FIRST 1 OUTPUT Rhs.1, Rhs.2, _
         0  ~0%    {2}    | REWRITE WITH NOT [NOT [Tmp.2 := "connect", TEST InOut.1 = Tmp.2], NOT [Tmp.2 := "execution_options", TEST InOut.1 = Tmp.2]] KEEPING 2
         0  ~0%    {1}    | JOIN WITH DataFlowPublic::CallCfgNode#b8ddbf81 ON FIRST 1 OUTPUT Lhs.0
         0  ~0%    {1}    | AND NOT `SqlAlchemy::SqlAlchemy::Connection::helper/0#62cfc178#b#prev`(FIRST 1)
                   return r1
```
2024-03-21 11:55:49 +01:00
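The tuple counts in the commit above show the same result being computed with very different intermediate sizes depending on which relation is joined first. The following is a minimal Python sketch of that effect, not CodeQL and not the evaluator's actual algorithm. The relation names and sizes (`get_a_call`, `call_cfg_node`, `class_ref`) are hypothetical stand-ins chosen only to mirror the shape of the logged pipeline.

```python
def hash_join(left, right, left_col, right_col):
    """Join two lists of tuples on left[left_col] == right[right_col]."""
    index = {}
    for row in right:
        index.setdefault(row[right_col], []).append(row)
    return [l + r for l in left for r in index.get(l[left_col], [])]


# Hypothetical relations (assumed data, for illustration only).
get_a_call = [(f"api{i}", f"call{i}") for i in range(1000)]     # (api_node, call)
call_cfg_node = [(f"call{i}", f"cfg{i}") for i in range(1000)]  # (call, cfg_node)
class_ref = [("api7",)]                                         # (api_node,)


def plan_big_first():
    """Join the two large relations first, then filter with the small one."""
    step1 = hash_join(get_a_call, call_cfg_node, 1, 0)  # (api, call, call, cfg)
    step2 = hash_join(step1, class_ref, 0, 0)
    result = {(row[1], row[3]) for row in step2}
    return [len(step1), len(step2)], result


def plan_small_first():
    """Start from the small relation so every intermediate stays small."""
    step1 = hash_join(class_ref, get_a_call, 0, 0)      # (api, api, call)
    step2 = hash_join(step1, call_cfg_node, 2, 0)       # (..., call, call, cfg)
    result = {(row[2], row[4]) for row in step2}
    return [len(step1), len(step2)], result


big_counts, big_result = plan_big_first()
small_counts, small_result = plan_small_first()
print("big first:  ", big_counts, big_result)
print("small first:", small_counts, small_result)
```

Both plans return the same set; only the intermediate tuple counts differ, which is the kind of difference the logged pipelines show before and after the fix.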
Tom Hvitved
2e370e2ded Python: Switch to shared XML.qll implementation 2024-03-19 13:17:53 +01:00
yoff
adbcbefaa9 Merge pull request #15551 from yoff/python/avoid-duplicate-model-inclusions
python: Remove `TaintStepFromSummary`
2024-03-11 13:52:20 +01:00
Rasmus Wriedt Larsen
cbb9a64bbb Merge pull request #15457 from RasmusWL/psycopg
Python: Model the `psycopg` package
2024-02-12 15:59:16 +01:00
Rasmus Lerchedahl Petersen
45bb4a0ee5 python: remove TaintStepFromSummary
as it should be covered by `SummarizedCallableFromModel`

Also move things around, to look more like the Ruby code.
2024-02-08 12:48:15 +01:00
Rasmus Wriedt Larsen
c265c15f3f Merge pull request #15398 from RasmusWL/html-escape
Python: Add `html.escape` as HTML sanitizer
2024-01-30 16:06:01 +01:00
Rasmus Wriedt Larsen
c70b32f7eb Python: Require quote escaping for html.escape 2024-01-30 12:17:01 +01:00
Rasmus Wriedt Larsen
3f0dc2b022 Python: Model the psycopg package 2024-01-29 14:30:20 +01:00
Erik Krogh Kristensen
f1d6f56621 Merge pull request #15393 from erik-krogh/deps-jan-2024
All: delete outdated deprecations
2024-01-23 13:52:38 +01:00
Rasmus Wriedt Larsen
cbed6e861d Python: Add html.escape as HTML sanitizer 2024-01-22 17:32:28 +01:00
Max Schaefer
17e3a45ad7 Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2024-01-22 13:36:12 +00:00
Max Schaefer
98178458d0 Python: Add support for more URL redirect sanitisers.
Since some sanitisers don't handle backslashes correctly, I updated the data-flow configuration to incorporate a flow state tracking whether or not backslashes have been eliminated or converted to forward slashes.
2024-01-22 13:24:18 +00:00
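As a rough illustration of why backslashes matter for URL redirect sanitisers, here is a minimal Python sketch. It is not the CodeQL model from the commit above; the function names `naive_is_local` and `is_local_redirect` are hypothetical. It assumes the common browser behaviour of treating `\` like `/` in web URLs, so `/\example.com` can end up being interpreted like `//example.com`.

```python
from urllib.parse import urlparse


def naive_is_local(url: str) -> bool:
    """Treat a URL as local if it has no scheme and no network location."""
    parts = urlparse(url)
    return not parts.scheme and not parts.netloc


def is_local_redirect(url: str) -> bool:
    """Same check, but convert backslashes to forward slashes first."""
    normalized = url.replace("\\", "/")
    return naive_is_local(normalized)


for candidate in ["/account/settings", "https://example.com", "/\\example.com"]:
    print(candidate, naive_is_local(candidate), is_local_redirect(candidate))
```

The naive check accepts `/\example.com` because `urlparse` sees no network location in it, while the normalising variant rejects it. That difference is why a sanitiser may need to be tracked separately from whether backslashes have already been handled.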
erik-krogh
8be7eadace delete outdated deprecations 2024-01-22 09:11:35 +01:00
Rasmus Wriedt Larsen
72687e0368 Merge branch 'main' into automated-subclass-models 2023-12-19 17:08:25 +01:00
Rasmus Wriedt Larsen
9863309631 Python: auto subclass capture
(locally done with split + 5 x modeling runs + join, but squashed into one commit)
2023-12-19 17:07:40 +01:00
Rasmus Wriedt Larsen
de2a563a8e Python: Delete old auto subclass capture files
In the final git history this only deletes one file, but when working
locally I deleted ALL files.
2023-12-19 17:07:21 +01:00
Rasmus Wriedt Larsen
a78f13cb2e Python: Ignore known subclass models 2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
24a3a23c9c Python: Regenerate rest_framework models 2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
5c89c38c92 Python: Add the rest_framework models for demonstration purposes
Although the GitHub UI might hide it by default, it could be
interesting for a reviewer to see the effect that changes in the modeling
query have on the results in this file.
2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
13c2378b58 Python: Update a few QLdocs 2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
937af906fd Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
e050f2e998 Python: Adjust subclass finder to no ESSA nodes
But the new test results look very strange indeed!
2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
60b784a919 Python: Don't filter subclass tests away 2023-12-19 17:07:01 +01:00
Tom Hvitved
c8b4a215bc Merge pull request #14573 from hvitved/flow-summary-impl-param
Move `FlowSummaryImpl.qll` to `dataflow` pack
2023-12-14 12:24:15 +01:00
fossilet
1cc2f073c4 Fix typo in qll. 2023-12-14 16:05:14 +08:00
Tom Hvitved
a46964dfe8 Address review comments 2023-12-12 13:55:52 +01:00
Tom Hvitved
faaa558ed9 Python: Use FlowSummaryImpl from dataflow pack 2023-12-10 11:25:44 +01:00
Rasmus Wriedt Larsen
dc90411809 Python: Don't include docs/ folder 2023-12-08 11:27:53 +01:00
Rasmus Wriedt Larsen
004bb50ef2 Python: Disallow invalid path component 2023-12-08 11:27:53 +01:00
Rasmus Wriedt Larsen
6ce8cd38d8 Python: Disallow examples 2023-12-08 11:27:53 +01:00
Rasmus Wriedt Larsen
b24e565128 SubclassFinder: don't include site-packages 2023-12-08 11:27:53 +01:00
Rasmus Wriedt Larsen
aa5eee1eac Python: Revert manual pickle modeling
This reverts commit 62910f0cab525ca4d4901c4c27f6e6b22c3375fc.
This reverts commit 75a8197879ec47094d9b18f3dab7bcc1c1cdba28.

We don't find `kombu.serialization.pickle_load` since we respect
`__all__`. I think that was an attempt to not flood the captured
modeling with useless re-exports, but I think we've ended up doing that
anyway... we should consider removing that restriction!

see 21d7df29c7/kombu/serialization.py (L29)
2023-12-08 11:27:53 +01:00
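For readers unfamiliar with `__all__`, the following minimal Python sketch shows the language behaviour the commit above refers to: a name left out of `__all__` is skipped by `from module import *` but is still reachable as an attribute. The module `demo_serialization` and its contents are hypothetical and built in memory; this is not kombu's actual code.

```python
import sys
import types

# Build a hypothetical module whose __all__ omits one of its functions.
source = """
__all__ = ['dumps']

def dumps(obj):
    return repr(obj)

def pickle_load(data):
    return data
"""
module = types.ModuleType("demo_serialization")
exec(source, module.__dict__)
sys.modules["demo_serialization"] = module

# A star import only brings in the names listed in __all__.
namespace = {}
exec("from demo_serialization import *", namespace)
star_names = {name for name in namespace if not name.startswith("__")}

print(star_names)
print(hasattr(module, "pickle_load"))
```

A tool that only follows `__all__` sees just `dumps` here, even though `demo_serialization.pickle_load` remains callable.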
Rasmus Wriedt Larsen
f74581ad09 Revert "Python: Model owslib.etree.etree directly"
This reverts commit 1213e786519a11142746fd3a725c874181f3a42b.

By fixing a few bugs in the SubclassFinder + manually running Find.ql on the geonode DB from DCA, I found that the installed version of owslib had both: https://github.com/geopython/OWSLib/blob/0.27.2/owslib/etree.py
2023-12-08 11:27:53 +01:00
Rasmus Wriedt Larsen
6ef9a2b11e Python: Fix problem if import is used
I fixed it in both predicates... I think we might still be able to remove
`newDirectAlias` -- but with it being better, it will allow us to better test if `newImportAlias` actually covers everything we need!
2023-12-08 11:27:52 +01:00
Rasmus Wriedt Larsen
f1fd9b4c7a Python: Fix underlying problem of not using Alias 2023-12-08 11:27:52 +01:00
Taus
fa6aec7ae2 Python: Model owslib.etree.etree directly
Somehow, this alias did not get picked up by the tooling.
2023-12-08 11:27:52 +01:00
Taus
6d40e7e0fc Python: Add extensible modelling for lxml.etree 2023-12-08 11:27:52 +01:00
Taus
5b9d56774b Python: Refactor references to ElementTree
This would probably be better as a module, but I wanted to verify
first that this would yield the right results.
2023-12-08 11:27:52 +01:00
Taus
d29879a844 Python: Model kombu.serialization
More `pickle` wrappers.
2023-12-08 11:27:52 +01:00
Taus
a6dc6f3e42 Python: Add model for flask.restful
Not subclass-related -- just an alias.
2023-12-08 11:27:52 +01:00