Commit Graph

4678 Commits

Author SHA1 Message Date
Taus
4a29095e3b Python: Fix bad join order in TPythonTuple
TL;DR: Something introduced the following bad join order:
```
(227s) Tuple counts for dom#TObject::TPythonTuple#ff/2@i2#8f58670w after 3m46s:
25000      ~0%     {2} r1 = SCAN PointsToContext::PointsToContext::appliesToScope_dispred#ff#prev_delta OUTPUT In.1, In.0 'context'
24000      ~1%     {2} r2 = JOIN r1 WITH @py_scope#f ON FIRST 1 OUTPUT Lhs.1 'context', Lhs.0
1076876712 ~6%     {3} r3 = JOIN r2 WITH Flow::TupleNode#class#f CARTESIAN PRODUCT OUTPUT Rhs.0, Lhs.0 'context', Lhs.1
870129666  ~0%     {3} r4 = JOIN r3 WITH Flow::ControlFlowNode::isLoad_dispred#f ON FIRST 1 OUTPUT Lhs.1 'context', Lhs.2, Lhs.0 'origin'
870129000  ~0%     {3} r5 = r4 AND NOT dom#TObject::TPythonTuple#ff#prev(Lhs.2 'origin', Lhs.0 'context')
870129000  ~1%     {3} r6 = SCAN r5 OUTPUT In.2 'origin', In.1, In.0 'context'
9000       ~0%     {2} r7 = JOIN r6 WITH Flow::ControlFlowNode::getScope_dispred#ff ON FIRST 2 OUTPUT Lhs.0 'origin', Lhs.2 'context'
                    return r7
```
(...the above being the tuple counts _at the point when I cancelled the
query_!)

Rewriting the code to force a join between `TupleNode#class` and
`getScope` results in the following join orders:

```
(0s) Tuple counts for TObject::scope_loads_tuplenode#ff/2@b3cf0bo5 after 13ms:
37369 ~3%     {1} r1 = JOIN Flow::TupleNode#class#f WITH Flow::ControlFlowNode::isLoad_dispred#f ON FIRST 1 OUTPUT Lhs.0 'origin'
37369 ~3%     {2} r2 = JOIN r1 WITH Flow::ControlFlowNode::getScope_dispred#ff ON FIRST 1 OUTPUT Rhs.1 's', Lhs.0 'origin'
            return r2
```
and
```
(78s) Tuple counts for dom#TObject::TPythonTuple#ff/2@i53#121c440w after 6ms:
34736 ~3%     {2} r1 = SCAN PointsToContext::PointsToContext::appliesToScope_dispred#ff#prev_delta OUTPUT In.1, In.0 'context'
7370  ~5%     {2} r2 = JOIN r1 WITH TObject::scope_loads_tuplenode#ff ON FIRST 1 OUTPUT Lhs.1 'context', Rhs.1 'origin'
7370  ~5%     {2} r3 = r2 AND NOT dom#TObject::TPythonTuple#ff#prev(Lhs.1 'origin', Lhs.0 'context')
7370  ~1%     {2} r4 = SCAN r3 OUTPUT In.1 'origin', In.0 'context'
            return r4
```
the latter being the largest iteration of `dom#TPythonTuple` throughout
the log.

No other major performance issues were observed.
2022-01-31 16:59:50 +00:00
Taus
47a57e0c0a Merge pull request #7635 from github/python/support-match
Python/support match
2022-01-28 11:55:46 +01:00
yoff
74d57bbb1a Update python/ql/lib/semmle/python/dataflow/new/internal/DataFlowPrivate.qll
Co-authored-by: Taus <tausbn@github.com>
2022-01-28 11:38:29 +01:00
Dave Bartolomeo
cca74e925f Merge pull request #7724 from github/aeisenberg/examples-groups
Add new groups for examples packs
2022-01-27 12:11:26 -05:00
Rasmus Lerchedahl Petersen
c60df7d69c Merge branch 'main' of github.com:github/codeql into python/support-match 2022-01-27 16:45:17 +01:00
yoff
4632c14280 Merge pull request #7654 from RasmusWL/remove-old-pointsto-queries
Python: Cleanup: Remove old points-to versions of queries
2022-01-27 16:39:01 +01:00
Rasmus Lerchedahl Petersen
b93c04bb79 python: Add reverse flow in some patterns
Particularly in value and literal patterns.
This is getting a little bit into the guards aspect of matching.
We could similarly add reverse flow in terms of
sub-patterns storing to a sequence pattern,
a flow step from alternatives to an-or-pattern, etc..
It does not seem too likely that sources are embedded in patterns
to begin with, but for secrets perhaps?

It is illustrated by the literal test. The value test still fails.
I believe we miss flow in general from the static attribute.
2022-01-27 15:20:23 +01:00
Rasmus Lerchedahl Petersen
cb52ab669e python: address review comments
The comment about `py_scopes` was simply removed
2022-01-27 11:17:00 +01:00
yoff
e28669e487 Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2022-01-27 10:31:43 +01:00
Andrew Eisenberg
a7f755cf12 Add new groups for examples packs
Also, remove version numbers. Will make it easier to avoid publishing
the examples packs.
2022-01-26 14:49:18 -08:00
Rasmus Lerchedahl Petersen
47af3a69a5 Merge branch 'main' of github.com:github/codeql into python/support-match 2022-01-26 11:39:46 +01:00
Edoardo Pirovano
1b539eb4dc Merge branch rc/3.4 into main 2022-01-25 16:22:01 +00:00
Rasmus Wriedt Larsen
301318020f Merge pull request #7455 from haby0/py/add-shutil-module-path-injection-sinks
Python: Add shutil module sinks for path injection query
2022-01-24 20:06:36 +01:00
Erik Krogh Kristensen
a235f8f023 remove redundant inline type casts 2022-01-21 11:46:33 +01:00
Erik Krogh Kristensen
ddfc3bc00f use set literals instead of big disjunctions 2022-01-21 11:46:33 +01:00
yoff
5b9ae9cede Merge pull request #7659 from RasmusWL/move-regex-injection-files
Python: Move regex injection configuration files
2022-01-21 11:42:06 +01:00
yoff
4fd0ada9a8 Merge pull request #7652 from RasmusWL/cleartext-remove-fps
Python: Remove usernames as sensitive source for cleartext queries
2022-01-21 11:30:40 +01:00
Rasmus Wriedt Larsen
f53dce3a83 Python: Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2022-01-20 14:20:15 +01:00
github-actions[bot]
ab218421da Post-release preparation for codeql-cli-2.7.6 2022-01-20 12:59:20 +00:00
Erik Krogh Kristensen
4e8e3a7420 simplify expressions that could be type-casts 2022-01-20 10:41:35 +01:00
github-actions[bot]
4ce8ccc52b Release preparation for version 2.7.6 2022-01-20 08:21:18 +00:00
Rasmus Lerchedahl Petersen
32cbeae05f python: missing start tag for relation 2022-01-20 08:56:12 +01:00
Rasmus Lerchedahl Petersen
d10ad3bdd4 python: update stats for tables 2022-01-20 08:42:32 +01:00
Rasmus Lerchedahl Petersen
7e9a9e3d9a python: remove compiler warnings 2022-01-19 18:01:58 +01:00
Rasmus Wriedt Larsen
b9ee2960e2 Python: Add change-note 2022-01-19 17:24:53 +01:00
Rasmus Wriedt Larsen
aa10ad6a8a Python: Fix RegexInjection query, add old deprecated versions 2022-01-19 17:22:44 +01:00
Rasmus Wriedt Larsen
e82ea7ad17 Python: move regex injection configuration files
I did not notice that these went to the wrong location in
https://github.com/github/codeql/pull/6693. They should be in the
dataflow folder with the rest of the data-flow configurations files, the
injection folder is for old points-to based modeling.
2022-01-19 17:21:46 +01:00
Rasmus Lerchedahl Petersen
a0e79c1d7a update stats for types
- should still update stats for tables
2022-01-19 16:38:19 +01:00
Rasmus Wriedt Larsen
93b3cd669a Python: Cleanup: Remove old points-to versions of queries
Since we've internally agreed that we've reached the same or better set
of results.
2022-01-19 15:30:12 +01:00
Rasmus Wriedt Larsen
e82e648ca1 Python: Remove usernames as sensitive source for cleartext queries
Closes #6363, #6927, #6726, #7497, #7116
2022-01-19 15:25:21 +01:00
Rasmus Lerchedahl Petersen
db253e8939 python: upgrade and downgrade scripts 2022-01-19 15:22:57 +01:00
Rasmus Wriedt Larsen
f3daff4e5a Python: Add FP tests for cleartext logging 2022-01-19 15:13:06 +01:00
Rasmus Lerchedahl Petersen
ef9fb0873f python: tools for writing upgrades and downgrade
adapted from [the ruby instructions](https://github.com/github/codeql/blob/main/ruby/doc/prepare-db-upgrade.md)
2022-01-19 14:29:58 +01:00
Rasmus Lerchedahl Petersen
36e18d5d80 python: dataflow for match
- also update `validTest.py`, but commented out for now
  otherwise CI will fail until we force it to run with Python 3.10
- added debug utility for dataflow (`dataflowTestPaths.ql`)
2022-01-19 14:29:58 +01:00
Rasmus Lerchedahl Petersen
bb210f4172 pythos: SSA for match
- new SSA definition `PatternCaptureDefinition`
- new SSA definition `PatternAliasDefinition`
- implement `hasDefiningNode`
2022-01-19 14:29:58 +01:00
Rasmus Lerchedahl Petersen
de8ecb214f python: Wrappers for database classes
- new syntactic category `Pattern` (in `Patterns.qll`)
- subpatterns available on statments
- new statements `MatchStmt` and `Case`
  (`Match` would conflict with the shared ReDoS library)
- new expression `Guard`
- support for pattern lists
2022-01-19 14:29:58 +01:00
Rasmus Lerchedahl Petersen
b17f844f35 python: New generated files 2022-01-19 13:36:32 +01:00
Rasmus Wriedt Larsen
95e935e9c1 Python: Support SQLAlchemy scoped_session 2022-01-18 14:34:31 +01:00
Anders Schack-Mulligen
fff3b5c5b4 Dataflow: Add qldoc. 2022-01-18 10:39:55 +01:00
Anders Schack-Mulligen
71e39353ca Dataflow: Sync. 2022-01-18 10:36:52 +01:00
Anders Schack-Mulligen
dfa79f6119 Dataflow: Sync. 2022-01-18 10:30:09 +01:00
Chris Smowton
2c37885f6e Sync dataflow 2022-01-18 10:30:09 +01:00
Andrew Eisenberg
fbb5d7196f Merge branch 'main' into post-release-prep/codeql-cli-2.7.5 2022-01-14 08:23:43 -08:00
Anders Schack-Mulligen
0b24af901d Merge pull request #7349 from aschackmull/dataflow/state
Dataflow: Add support for flow state
2022-01-14 09:12:38 +01:00
Andrew Eisenberg
4ffd8c62ac Merge pull request #7579 from github/aeisenberg/changenote-upgrades-removal
Changenotes: Add changenotes for upgrades refactoring
2022-01-13 09:09:06 -08:00
Anders Schack-Mulligen
c44cf29992 Merge pull request #7587 from owen-mc/add-default-taint-sanitizer-guard
Dataflow: Add default taint sanitizer guard
2022-01-13 14:44:55 +01:00
Anders Schack-Mulligen
f7cf327e71 Dataflow: Sync 2022-01-13 13:28:43 +01:00
Erik Krogh Kristensen
89bab6ae12 Merge pull request #7097 from erik-krogh/railsReDoS
JS/PY/RB: support a limited number of ranges for ReDoS analysis
2022-01-13 11:04:36 +01:00
Andrew Eisenberg
e435a3e9c3 Changenotes: Add changenotes for upgrades refactoring 2022-01-12 11:36:31 -08:00
Owen Mansel-Chan
8e8278764b Add predicate defaultTaintSanitizerGuard for each language
This was done manually, as these files are not synced by sync-files.py.
2022-01-12 14:44:56 +00:00