Commit Graph

2002 Commits

Author SHA1 Message Date
am0o0
b20b733172 better structure for pandas DataFrame, it is now much better readable and also we can find much more DataFrame objects 2024-02-27 09:38:43 +04:00
amammad
ab219902a9 add jsonpickle and pexpect libs in case of unsafe decoding and secondary command execution, add proper test cases 2024-02-25 17:15:35 +04:00
amammad
385c3ba7ff continue to convert paramiko query to a more general query,
the proxy command is not a secondary command execution
so we can add proxy command to SystemCommandExecution::Range, update QLDocs,
add a proper Paramiko test case
fix a typo
2024-02-25 01:18:34 +04:00
amammad
d234a53c50 update Fabric models, add new sink to Fabric, add proper test cases 2024-02-24 17:43:51 +04:00
amammad
076faa3a4e add pyTorch :) code execution sinks, add proper tests 2024-02-24 15:55:33 +04:00
amammad
3d7db0e46b add panas code execution sinks, add proper tests 2024-02-24 14:44:06 +04:00
yoff
d3ee5f65db Merge pull request #15550 from yoff/python/remove-pointsto-from-module-getAnExport
python: remove a use of points-to
2024-02-20 19:04:46 +01:00
Rasmus Lerchedahl Petersen
22e72d2fed python: Move the rewrite out to Scope.qll 2024-02-20 10:39:29 +01:00
Rasmus Lerchedahl Petersen
de727bf1b5 Revert "python: remove a use of points-to"
This reverts commit 5cb71ce7e5.
2024-02-20 10:23:31 +01:00
Rasmus Wriedt Larsen
cbb9a64bbb Merge pull request #15457 from RasmusWL/psycopg
Python: Model the `psycopg` package
2024-02-12 15:59:16 +01:00
Tom Hvitved
1ea7717714 Capture flow: Take overwrites in nested scopes into account 2024-02-09 14:49:23 +01:00
Anders Schack-Mulligen
817aa7655f Python: Remove redundant IncludePostUpdateFlow and PhaseDependentFlow application. 2024-02-09 11:32:08 +01:00
Rasmus Lerchedahl Petersen
5cb71ce7e5 python: remove a use of points-to
This is used by `Scope::isPublic` which in turn is called by the framework model for `setuptools`.

On my current quesry, this had a dramatic effect on the most expensive predicates:

Before
```
Most expensive predicates for completed query FindUses.ql:
        time  | evals |   max @ iter | predicate
        ------|-------|--------------|----------
         1m9s |  2933 | 123ms @ 422  | PointsTo::Expressions::equalityEvaluatesTo/4#ebe72212@cab7d3xr
        43.1s |       |              | FlowSummaryImpl::Private::Steps::summaryLocalStep/3#900fb25e#ffb@8aa78a38
        41.3s |  2936 |  2.1s @ 409  | PointsTo::InterProceduralPointsTo::scope_entry_value_transfer_from_earlier/4#acb2199d@cab7ddxr
        30.2s |  2946 |  67ms @ 847  | PointsTo::PointsToInternal::multi_assignment_points_to/4#28782e93@cab7d0yr
        29.7s |  2930 |  1.9s @ 30   | Extensions::ReModulePointToExtension.pointsTo_helper/1#a84effde@cab7dn4w
        24.9s |  2933 |  84ms @ 414  | PointsTo::Expressions::inequalityEvaluatesTo/4#f0ecfab4@cab7d2xr
        17.9s |  2582 | 306ms @ 31   | MRO::ClassListList.getItem/1#b6c27115#reorder_2_0_1@cab7dw6r
         9.4s |   661 | 991ms @ 1    | SsaCompute::AdjacentUses::varBlockReaches/3#1824ad86@2b6af692
         9.2s |  2738 |  26ms @ 664  | MRO::ClassList.containsSpecial/0#c967dabb#fb@cab7dg4w
         8.9s |  2946 |  12ms @ 917  | PointsTo::Types::getBase/2#0ab04984@cab7du1w
         7.4s |  2946 | 287ms @ 3    | PointsTo::PointsToInternal::points_to_candidate/4#0a587a42@cab7d80w
         7.1s |  2934 |  14ms @ 2    | Constants::ConstantObjectInternal.attribute/3#6d9e12fc@cab7d6zr
         6.8s |  2946 |   9ms @ 48   | PointsTo::InterProceduralPointsTo::callsite_points_to/4#72419c70@cab7dqxr
         6.6s |   234 | 341ms @ 17   | ApiGraphs::API::Impl::rhs/3#2255afc6@a41b31w3
         6.6s |  2946 |  86ms @ 5    | PointsTo::Types::six_add_metaclass/4#f926a4cb@cab7da0w
         6.2s |  2930 | 341ms @ 30   | Extensions::RangeIterationVariableFact.pointsTo/3#662720c9#cpe#124@cab7di2w
         5.9s |   287 |  61ms @ 4    | DataFlowDispatch::TrackAttrReadInput::start/2#67f26627@cc7b56yn
         5.8s |       |              | DataFlowImplCommon::LambdaFlow::viableParamNonLambda/3#3123cc52_201#join_rhs@415f35h0
         5.6s |       |              | FlowSummaryImpl::Private::Steps::viableParam/4#49c13ab8@2c1fcdq1
         5.3s |       |              | FlowSummaryImpl::Private::Steps::viableParam/4#49c13ab8@22590ca9
         5.2s |   233 | 276ms @ 21   | ApiGraphs::API::Impl::use/3#e6c88b66@a41b30w3
         5.1s |  2945 | 177ms @ 4    | PointsTo::PointsToInternal::pointsTo/4#d99f16c6@cab7dj0w
         4.7s |       |              | Flow::ControlFlowNode.toString/0#dispred#e1af144b@410c23a7
         4.6s |   277 |  2.2s @ 6    | DataFlowDispatch::getCallArg/5#21589076@cc7b5vxn
         4.5s |       |              | DataFlowImplCommon::Cached::viableParam/3#61239ead@cc05a1fv
         4.3s |       |              | DataFlowImplCommon::LambdaFlow::viableParamNonLambda/3#3123cc52@cb992b2h
         4.1s |       |              | _AstExtended::AstNode.getLocation/0#dispred#6b4dcb62_10#join_rhs_DataFlowPublic::Node.getLocation/0#__#shared@6ae639js
           4s |       |              | Files::Location.toString/0#dispred#7e7e0516@b72abbo2
         3.7s |       |              | locations_ast_234501#join_rhs@0859685o
         3.7s |    10 |  1.7s @ 1    | ObjectInternal::ObjectInternal.toString/0#dispred#0b2e9429@6e8a4yh7
         3.6s |  2942 |  63ms @ 94   | PointsTo::InterProceduralPointsTo::call_points_to_from_callee/4#394022a8@cab7d90w
         3.6s |   232 | 213ms @ 18   | ApiGraphs::API::Impl::trackDefNode/2#8e3c4e6d@a41b33w3
         3.6s |  2933 |   7ms @ 884  | PointsTo::Types::getInheritedMetaclass/2#097d39df#bff@cab7dr1w
         3.6s |  2946 |  1.3s @ 13   | PointsTo::PointsToInternal::ssa_node_refinement_points_to/4#8ea6486b@cab7dnxr
         3.5s |  1319 | 387ms @ 3    | SsaCompute::SsaDefinitions::reachesEndOfBlock/4#214bd902@fce54web
         3.5s |  1320 | 385ms @ 2    | SsaCompute::SsaDefinitions::reachesEndOfBlockRec/4#63bb2cd4@fce54xeb
         3.4s |  4861 | 478ms @ 2    | SsaCompute::SsaComputeImpl::ssaDefReachesRank/4#f19c6fee@cc8515rd
         3.3s |       |              | _AstExtended::AstNode.getLocation/0#dispred#6b4dcb62_10#join_rhs_DataFlowPublic::Node.getLocation/0#__#higher_order_body@47ba63n6
         3.3s |       |              | DataFlowPublic::Node.toString/0#dispred#af9c307a@4d16e7m6
         3.3s |  2946 |  28ms @ 3    | PointsTo::PointsToInternal::reachableEdge/3#d3f53c12@cab7do7w
         2.9s |   233 | 110ms @ 19   | ApiGraphs::API::Impl::trackUseNode/2#a0b4384d@a41b32w3
         2.8s |    31 |  2.2s @ 9    | _Class::Class.getAMethod/0#dispred#66416e47_DataFlowDispatch::findFunctionAccordingToMroKnownStartin__#antijoin_rhs@L6#cc7b5
         2.8s |  2737 |  21ms @ 444  | MRO::ClassListList.removedClassParts/4#de59b06f#reorder_2_3_4_0_1@cab7d06w
         2.8s |  1322 | 462ms @ 4    | SsaCompute::Liveness::liveAtExit/2#b6aa63f4@6fd4cx73
         2.8s |  2946 | 187ms @ 5    | PointsTo::Expressions::builtinCallPointsTo/5#3aa7f48b@cab7dwwr
         2.8s |  2939 |  41ms @ 7    | PointsTo::PointsToInternal::use_points_to/4#ff1d0edd@cab7df0w
         2.7s |  2946 |  20ms @ 92   | PointsTo::Conditionals::evaluates/5#736734b2#fbffff#reorder_5_0_2_1_3_4@cab7dp5w
         2.6s |  2946 | 152ms @ 5    | Constants::callToBool/2#0b9b1e8d@cab7dn7w
         2.5s |   287 |  24ms @ 4    | DataFlowDispatch::resolveClassInstanceCall/3#6e09c292@cc7b53xn
         2.4s |  2946 |  31ms @ 5    | PointsTo::AttributePointsTo::variableAttributePointsTo/5#60adcc49@cab7dpwr

[2024-02-08 10:44:37] Total evaluation times for this run:
        * Wall-clock duration of evaluation run: 1231.1 seconds
        * Total time spent evaluating predicates: 1167.1 seconds
```

After
```
Most expensive predicates for completed query FindUses.ql:
        time  | evals |   max @ iter | predicate
        ------|-------|--------------|----------
        41.6s |       |              | FlowSummaryImpl::Private::Steps::summaryLocalStep/3#900fb25e#ffb@85aaaac1
         9.2s |   661 | 905ms @ 1    | SsaCompute::AdjacentUses::varBlockReaches/3#1824ad86@2b6af692
         7.6s |   234 | 502ms @ 19   | ApiGraphs::API::Impl::rhs/3#2255afc6@ce6d11wc
         6.7s |       |              | DataFlowImplCommon::LambdaFlow::viableParamNonLambda/3#3123cc52_201#join_rhs@fd1dc5mi
           6s |   287 |  80ms @ 113  | DataFlowDispatch::TrackAttrReadInput::start/2#67f26627@925826yr
         5.7s |       |              | FlowSummaryImpl::Private::Steps::viableParam/4#49c13ab8@851052bl
         5.6s |   233 | 289ms @ 21   | ApiGraphs::API::Impl::use/3#e6c88b66@ce6d10wc
         5.4s |       |              | FlowSummaryImpl::Private::Steps::viableParam/4#49c13ab8@f2c42d17
         4.8s |   277 |  2.4s @ 6    | DataFlowDispatch::getCallArg/5#21589076@92582vxr
         4.7s |       |              | DataFlowImplCommon::Cached::viableParam/3#61239ead@ac08e0nf
         4.7s |       |              | DataFlowImplCommon::LambdaFlow::viableParamNonLambda/3#3123cc52@82ff50ql
         4.6s |       |              | Files::Location.toString/0#dispred#7e7e0516@b72abbo2
         4.3s |       |              | Flow::ControlFlowNode.toString/0#dispred#e1af144b@410c23a7
         4.2s |   232 | 249ms @ 19   | ApiGraphs::API::Impl::trackDefNode/2#8e3c4e6d@ce6d13wc
         3.8s |       |              | _AstExtended::AstNode.getLocation/0#dispred#6b4dcb62_10#join_rhs_DataFlowPublic::Node.getLocation/0#__#shared@0ac73425
         3.6s |  1319 | 354ms @ 1    | SsaCompute::SsaDefinitions::reachesEndOfBlock/4#214bd902@fce54web
         3.6s |  1320 | 381ms @ 2    | SsaCompute::SsaDefinitions::reachesEndOfBlockRec/4#63bb2cd4@fce54xeb
         3.4s |       |              | _AstExtended::AstNode.getLocation/0#dispred#6b4dcb62_10#join_rhs_DataFlowPublic::Node.getLocation/0#__#higher_order_body@9e946ea8
         3.4s |  4861 | 474ms @ 2    | SsaCompute::SsaComputeImpl::ssaDefReachesRank/4#f19c6fee@cc8515rd
         3.1s |    31 |  2.5s @ 9    | _Class::Class.getAMethod/0#dispred#66416e47_DataFlowDispatch::findFunctionAccordingToMroKnownStartin__#antijoin_rhs@L6#92582
           3s |    53 | 114ms @ 48   | DataFlowDispatch::TrackAttrReadInput::start/2#67f26627@9ab38jw0
           3s |   233 | 126ms @ 20   | ApiGraphs::API::Impl::trackUseNode/2#a0b4384d@ce6d12wc
           3s |       |              | locations_ast_234501#join_rhs@0859685o
           3s |       |              | DataFlowPublic::Node.toString/0#dispred#af9c307a@a2145cqf
         2.8s |   234 | 206ms @ 21   | _ApiGraphs::API::Impl::MkDef#51c2f877#prev_ApiGraphs::API::Impl::trackDefNode/1#7e78e336#prev_delta___#antijoin_rhs#1@L9#ce6d1
         2.8s |  1322 | 447ms @ 4    | SsaCompute::Liveness::liveAtExit/2#b6aa63f4@6fd4cx73
         2.7s |   230 | 176ms @ 28   | ApiGraphs::API::Impl::MkDef#51c2f877@ce6d1w9c
         2.5s |   287 |  50ms @ 112  | DataFlowDispatch::resolveClassInstanceCall/3#6e09c292@925823xr
         2.4s |   234 | 246ms @ 19   | _ApiGraphs::API::Impl::MkDef#51c2f877#prev_ApiGraphs::API::Impl::trackDefNode/1#7e78e336#prev_delta___#antijoin_rhs@L4#ce6d1
         2.3s |       |              | TaintTrackingPrivate::localAdditionalTaintStep/2#a2ec8c9d@e31201hd
         2.2s |    53 |  72ms @ 15   | DataFlowDispatch::TrackAttrReadInput::start/2#67f26627@96b28jwo
         2.2s |       |              | SensitiveDataSources::SensitiveDataModeling::sensitiveString/1#fdc3ad40@41f6ee2g
           2s |       |              | DataFlowImplCommon::Cached::viableParamArg/3#4c55eddb@8f7f25oq
           2s |       |              | Flow::ControlFlowNode.getExprChild/1#e757d179#bbf@db51e8ed
         1.9s |       |              | project#FlowSummaryImpl::Private::Steps::viableParam/4#49c13ab8#2@e36c2dr8
         1.9s |       |              | DataFlowPublic::Node.hasLocationInfo/5#dispred#b79d995f@6e929dfv
         1.7s |    15 | 433ms @ 1    | PoorMansFunctionResolution::poorMansFunctionTracker/2#75430e01@e5202dnv
         1.7s |       |              | #ImportResolution::ImportResolution::allowedEssaImportStep/2#f4117c61Plus#swapped@60d9daea
         1.7s |    29 | 633ms @ 6    | _Class::Class.getAMethod/0#dispred#66416e47_Function::Function.getName/0#dispred#033700ef_10#join_rh__#antijoin_rhs@L4#92582
         1.5s |   233 |  79ms @ 24   | ApiGraphs::API::Impl::trackUseNode/1#1af3a9ea@ce6d16wc
         1.5s |       |              | ApiGraphs::API::Impl::edge/3#8453bf65@1bd8a6ja
         1.5s |       |              | ApiGraphs::API::Node.getAValueReachableFromSource/0#dispred#9a406fb1@5dbb806u
         1.3s |  1323 | 178ms @ 13   | SsaCompute::Liveness::liveAtEntry/2#bab3ea7c@6fd4cw73
         1.3s |       |              | SsaCompute::SsaComputeImpl::defUseRank/4#782a2f48@0f27919s
         1.3s |       |              | DataFlowDispatch::LibraryCallable.getACall/0#dispred#66a01171#fb@96b65frd
         1.3s |       |              | ApiGraphs::API::Node.getAValueReachableFromSource/0#dispred#9a406fb1_10#join_rhs@c1dd43nv
         1.3s |       |              | FlowSummaryImpl::Private::SummaryNode.toString/0#dispred#d499e234@63bd684g
         1.2s |       |              | DataFlowDispatch::LibraryCallable.getACall/0#dispred#66a01171#fb@eaebb27g
         1.2s |       |              | _DataFlowPublic::Node#da3b6093_DataFlowPublic::Node.asExpr/0#dispred#2845197a_py_exprs#antijoin_rhs@fcd8c3kj
         1.2s |       |              | #ImportResolution::ImportResolution::allowedEssaImportStep/2#f4117c61Plus#swapped@c3f634us

[2024-02-08 11:43:50] Total evaluation times for this run:
        * Wall-clock duration of evaluation run: 636.9 seconds
        * Total time spent evaluating predicates: 562.4 seconds
```
2024-02-08 12:20:56 +01:00
James Ockers
eb5e0123d6 exclude certification from maybeCertificate() regexes 2024-01-30 13:16:18 -08:00
Rasmus Wriedt Larsen
c265c15f3f Merge pull request #15398 from RasmusWL/html-escape
Python: Add `html.escape` as HTML sanitizer
2024-01-30 16:06:01 +01:00
Rasmus Wriedt Larsen
c70b32f7eb Python: Require quote escaping for html.escape 2024-01-30 12:17:01 +01:00
Rasmus Wriedt Larsen
3f0dc2b022 Python: Model the psycopg package 2024-01-29 14:30:20 +01:00
yoff
391ca5d8a6 Merge pull request #15390 from Marcono1234/marcono1234/python-ascii-regex-flag 2024-01-29 14:27:50 +01:00
Marcono1234
1ad08efe08 Python: Support a (ASCII) inline regex flag 2024-01-26 22:18:49 +01:00
Erik Krogh Kristensen
f1d6f56621 Merge pull request #15393 from erik-krogh/deps-jan-2024
All: delete outdated deprecations
2024-01-23 13:52:38 +01:00
Rasmus Wriedt Larsen
cbed6e861d Python: Add html.escape as HTML sanitizer 2024-01-22 17:32:28 +01:00
Max Schaefer
17e3a45ad7 Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2024-01-22 13:36:12 +00:00
Max Schaefer
98178458d0 Python: Add support for more URL redirect sanitisers.
Since some sanitisers don't handle backslashes correctly, I updated the data-flow configuration to incorporate a flow state tracking whether or not backslashes have been eliminated or converted to forward slashes.
2024-01-22 13:24:18 +00:00
erik-krogh
8be7eadace delete outdated deprecations 2024-01-22 09:11:35 +01:00
Tom Hvitved
f90201eb56 Data flow: Remove column from mayBenefitFromCallContext 2024-01-09 11:34:43 +01:00
Rasmus Wriedt Larsen
95c24275f2 Merge pull request #15044 from RasmusWL/automated-subclass-models
Python: Automated subclass models
2024-01-05 10:43:48 +01:00
Rasmus Lerchedahl Petersen
0f89f69555 Python: fix VariableWrite and remove unneded step 2023-12-20 15:45:18 +01:00
Rasmus Lerchedahl Petersen
215b146f06 Python: remove unused member predicate 2023-12-20 14:45:00 +01:00
Rasmus Lerchedahl Petersen
491ca3f1e6 Python: hide synthetic variable node 2023-12-20 14:42:45 +01:00
Rasmus Lerchedahl Petersen
afb3d1da6f Python: move capture node to DataFlowPrivate 2023-12-20 14:41:17 +01:00
Rasmus Lerchedahl Petersen
3cea46fe7b Python: fix typos 2023-12-20 14:35:10 +01:00
Rasmus Lerchedahl Petersen
f8417b0dd8 Merge branch 'main' of https://github.com/github/codeql into python/captured-variables-basic 2023-12-20 13:16:42 +01:00
Rasmus Lerchedahl Petersen
07c88dc0be Python: remove unnecessary post-processing
also, it is slightly incorrect...
2023-12-20 12:09:15 +01:00
Rasmus Lerchedahl Petersen
169d7a3c98 Python: Add scope entry definition nodes
otherwise we confuse captured variables
in the single scope entry cfg node. Now
we have one for each defined variable.
2023-12-20 12:09:00 +01:00
Rasmus Wriedt Larsen
72687e0368 Merge branch 'main' into automated-subclass-models 2023-12-19 17:08:25 +01:00
Rasmus Wriedt Larsen
56d86f9980 Revert "NEVER MERGE: Ensure we don't use site-packages stuff"
This reverts commit 0ed363bd79f9d3f9e9a905c1192adfe88f1faffb.
2023-12-19 17:07:40 +01:00
Rasmus Wriedt Larsen
9863309631 Python: auto subclass capture
(locally done with split + 5 x modeling runs + join, but squashed into one commit)
2023-12-19 17:07:40 +01:00
Rasmus Wriedt Larsen
ca7b69ec1f NEVER MERGE: Ensure we don't use site-packages stuff 2023-12-19 17:07:40 +01:00
Rasmus Wriedt Larsen
de2a563a8e Python: Delete old auto subclass capture files
In the final git history this only deletes one file, but when working
locally I deleted ALL files.
2023-12-19 17:07:21 +01:00
Rasmus Wriedt Larsen
a78f13cb2e Python: Ignore known subclass models 2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
24a3a23c9c Python: Regenerate rest_framework models 2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
5c89c38c92 Python: Add the rest_framework models for demonstration purposes
Although it might be hidden by github UI by default, it could be
interesting for a reviewer to notice the effect changes in the modeling
query has to the results in this file.
2023-12-19 17:07:02 +01:00
Rasmus Wriedt Larsen
13c2378b58 Python: Update a few QLdocs 2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
937af906fd Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2023-12-19 17:07:01 +01:00
Rasmus Lerchedahl Petersen
c563c7fbe4 Python: remove control flow nodes
for module entry definitions from the dataflow graph.
2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
e050f2e998 Python: Adjust subclass finder to no ESSA nodes
But the new test results looks very strange indeed!
2023-12-19 17:07:01 +01:00
Rasmus Wriedt Larsen
60b784a919 Python: Don't filter subclass tests away 2023-12-19 17:07:01 +01:00
yoff
a60c52b8b7 Merge branch 'main' into python/captured-variables-basic 2023-12-18 23:44:46 +01:00
Rasmus Lerchedahl Petersen
78c484faab Python: remove support for capturing callbacks
This will be added in a follow-up PR instead.
2023-12-18 23:24:57 +01:00
Rasmus Lerchedahl Petersen
6e4011d2ae Python: rename sythetic nodes
Avoid the term "closure" as it is somewhat academic.
2023-12-18 23:16:51 +01:00