Commit Graph

5331 Commits

Author SHA1 Message Date
Rasmus Lerchedahl Petersen
e51ba6f421 python: rename test directory 2022-02-08 11:20:10 +01:00
Rasmus Lerchedahl Petersen
e52dca0a35 python: move tests 2022-02-08 11:19:28 +01:00
Rasmus Wriedt Larsen
a8edd44a3c Python: Update .expected 2022-02-08 11:12:34 +01:00
Rasmus Wriedt Larsen
eb109828c0 Merge pull request #7252 from museljh/feature/cwe-338
Python: CWE-338 insecureRandomness
2022-02-07 19:30:06 +01:00
Rasmus Wriedt Larsen
62702d0ca9 Python: Fix setStoreStep to use SetElementContent 2022-02-07 13:18:36 +01:00
Rasmus Wriedt Larsen
b276b2d48c Python: Clean up taint steps for attributes 2022-02-07 13:12:31 +01:00
Rasmus Wriedt Larsen
59160eeb24 Python: Add test showing taint for attr store
In `x.arg = TAINTED_STRING` there is a store step to the attribute `arg`
of `x`. In our taint modeling, we allow _any_ store step with the code
below. This means that we also say there is a taint-step directly from
`TAINTED_STRING` to `x` :|

```codeql
  // construction by literal
  // TODO: Not limiting the content argument here feels like a BIG hack, but we currently get nothing for free :|
  DataFlowPrivate::storeStep(nodeFrom, _, nodeTo)
```
2022-02-07 13:12:28 +01:00
Rasmus Wriedt Larsen
32cd7d6fa7 Add groups to all consistency-queries/qlpack.yml
as discussed in PR review
2022-02-07 11:15:48 +01:00
github-actions[bot]
b4ab86c020 Post-release preparation for codeql-cli-2.8.0 2022-02-06 23:34:07 +00:00
jorgectf
43fde3561f Merge branch 'jorgectf/python/deserialization' of https://github.com/jorgectf/codeql into jorgectf/python/deserialization 2022-02-04 16:32:11 +01:00
Jorge
d96eb01b9c Merge branch 'github:main' into jorgectf/python/deserialization 2022-02-04 16:32:01 +01:00
yoff
182c62f5c3 Merge pull request #7838 from tausbn/python-fix-charset-performance-problem
Python: Fix performance issue in `charSet`
2022-02-04 14:18:13 +01:00
Taus
67be20f368 Python: Remove implied inequalities
Also gets rid of `inner_end`, since we're already doing `end - 1 = ...`
in the other fix (and so this is more consistent).
2022-02-04 12:46:06 +00:00
Rasmus Wriedt Larsen
c817ba5718 Python: Add consistency-queries/qlpack.yml
But no queries yet
2022-02-04 12:08:54 +01:00
Rasmus Wriedt Larsen
2e788ea86e Python: Accept deprecation warnings for old tests 2022-02-04 12:02:09 +01:00
Rasmus Wriedt Larsen
438a01e911 Python: Deprecate old bottle points-to extension 2022-02-04 12:02:09 +01:00
Rasmus Wriedt Larsen
c9e36aaf72 Python: Fix deprecated deprecated 2022-02-04 12:02:09 +01:00
Rasmus Wriedt Larsen
9ec531f040 Python: Add deprecation change-note 2022-02-04 12:02:09 +01:00
Rasmus Wriedt Larsen
84fdd8a739 Python: Add non-deprecated httpVerb to Concepts 2022-02-04 12:02:09 +01:00
Rasmus Wriedt Larsen
5a032d6f84 Python: deprecate old taint-tracking related predicates 2022-02-04 12:02:08 +01:00
Rasmus Wriedt Larsen
dba6b60c80 Python: Deprecate old library modeling 2022-02-04 12:02:08 +01:00
Rasmus Wriedt Larsen
a40fdf7a7c Python: Deprecate old web modeling 2022-02-04 12:02:08 +01:00
Rasmus Wriedt Larsen
14a1aa0c11 Python: Add change-note
I went with `minorAnalysis` instead of `majorAnalysis`, since I don't
think the impact of this change will be major (but that's just my gut
feeling).
2022-02-04 12:00:49 +01:00
Rasmus Wriedt Larsen
b2ce0fcb72 Python: Add post-update nodes to args of unresolved calls
Besides solving the problem with `setattr`, it also solved some old
problems with json library modeling (yay).
2022-02-04 11:51:53 +01:00
Rasmus Wriedt Larsen
e9b496ba73 Merge pull request #7831 from RasmusWL/printast-remove-regexp
Python: Remove `RegExpTerm` from PrintAST
2022-02-04 11:38:58 +01:00
Erik Krogh Kristensen
5e23da813f rename named-parameters to keyword-parameters 2022-02-03 23:10:39 +01:00
Erik Krogh Kristensen
e434f075fa introduce, and use, API::APICallNode 2022-02-03 23:10:39 +01:00
Erik Krogh Kristensen
3801a158a8 remove module exporst nodes from API graphs 2022-02-03 23:10:39 +01:00
Erik Krogh Kristensen
c3f4a851f0 remove some TODOs I won't do 2022-02-03 23:10:39 +01:00
Erik Krogh Kristensen
3be3da2eb6 add recursive API-graph test 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
ef5818e243 support import * in ApiGraphs 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
16774ba285 add support for named parameters in API graphs 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
095c73f1fe redo the ApiGraph testing framework 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
66fd43fc3b add def edge for function returns 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
d8eea7ba4c property writes are def nodes 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
a908b219e9 more backtracking of def nodes, and lots of tests 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
038b032a43 get basic module exports to work in API-graphs 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
df9efbe778 get mimimal def nodes to work in python 2022-02-03 23:10:38 +01:00
Erik Krogh Kristensen
52ca0d168b move API-graph tests out of the experimental test folder 2022-02-03 23:10:37 +01:00
Erik Krogh Kristensen
89786d9ce2 rename pr to ref in memberFromRef 2022-02-03 23:10:37 +01:00
Harry Maclean
ab7fd89653 Merge pull request #7663 from github/hmac/api-graph-subclass
Ruby: Add basic subclassing support to API Graphs
2022-02-04 10:19:07 +13:00
Taus
22aa4c9379 Python: Fix performance issue in charSet
Observed on `mozilla/bugbug` on the 2.8.0 CLI branch, we had the
following line in the timing report:
```
FullServerSideRequestForgery.ql-17:regex::RegexString::charSet_dispred#fff#antijoin_rhs ............... 1m13s
```

Inspecting the logs, we see the following join:

```
(644s) Tuple counts for regex::RegexString::charSet_dispred#fff#antijoin_rhs/5@f295d1bk after 1m13s:
1         ~0%         {1} r1 = CONSTANT(unique string)["]"]
2389      ~4%         {3} r2 = JOIN r1 WITH regex::RegexString::nonEscapedCharAt_dispred#fff_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0', Rhs.2 'arg1', (Rhs.2 'arg1' + 1)
668873    ~0%         {6} r3 = JOIN r2 WITH regex::RegexString::char_set_start_dispred#fff ON FIRST 1 OUTPUT Lhs.0 'arg0', "]", Lhs.1 'arg1', Lhs.2 'arg2', Rhs.1 'arg3', Rhs.2 'arg4'
537501371 ~4%         {7} r4 = JOIN r3 WITH regex::RegexString::nonEscapedCharAt_dispred#fff_021#join_rhs ON FIRST 2 OUTPUT Lhs.0 'arg0', Lhs.2 'arg1', Lhs.3 'arg2', Lhs.4 'arg3', Lhs.5 'arg4', "]", Rhs.2
269085087 ~0%         {7} r5 = SELECT r4 ON In.6 > In.4 'arg4'
89583155  ~3%         {7} r6 = SELECT r5 ON In.6 < In.1 'arg1'
89583155  ~26634%     {5} r7 = SCAN r6 OUTPUT In.0 'arg0', In.1 'arg1', In.2 'arg2', In.3 'arg3', In.4 'arg4'
                    return r7
```
Now, this is problematic not just because of the large intermediary join
but also because of the large number of tuples being materialised at the
end. The culprit in this case turns out to be this bit of `charSet`:
```
not exists(int mid | this.nonEscapedCharAt(mid) = "]" | mid > inner_start and mid < inner_end)
```

Rewriting this to instead look for the minimum index at which a `]`
appears resulted in a much nicer join.

I also fixed up a similar issue surrounding the `\N` unicode escape.
Not that I think this will necessarily be relevant, but the `min`-based
solution is more robust either way.
2022-02-03 20:42:04 +00:00
Chuan-kai Lin
c8bc5cfa75 Merge pull request #7825 from github/cklin/python-downgrade-scripts
Python: adjust downgrade script location and format
2022-02-03 11:40:07 -08:00
Rasmus Wriedt Larsen
8386b36217 Python: Apply suggestions from code review
Co-authored-by: Nick Rolfe <nickrolfe@github.com>
2022-02-03 15:00:04 +01:00
Rasmus Wriedt Larsen
5cd08b8e8c Python: Ignore .isAbsent() from ClassCall
This means that DataFlowCall is only for resolvable calls, which might not seem
like a big thing in itself, but enables the next commit to actually work :P
2022-02-03 14:58:30 +01:00
Rasmus Wriedt Larsen
a5c2341204 Python: Add simple test of DataFlowCall
Notice the strange thing with treating `mypkg.foo(42)` as a ClassCall,
but completely ignoring `mypkg.subpkg.bar(43)` -- due to having the two
`ClassValue`s:

- `Missing module attribute mypkg.foo`
- `Missing module attribute mypkg.subpkg`

But not `Missing module attribute mypkg.subpkg` with the current import
structure.
2022-02-03 14:58:30 +01:00
Rasmus Wriedt Larsen
48aa07d67a Python: Handle SyntheticPreUpdateNode in PrintNode 2022-02-03 14:58:30 +01:00
Rasmus Wriedt Larsen
49b5d60229 Python: Use AttrRead/AttrWrite for attr read/store steps
Note that this doesn't actually add the desired flow from setattr, due
to missing post-update note. This will be fixed in later commit.
2022-02-03 14:58:30 +01:00
Rasmus Wriedt Larsen
5774459dfb Python: restrict AttrRead with AttrNode.isLoad() 2022-02-03 14:58:23 +01:00
Rasmus Wriedt Larsen
cf68148316 Python: Add change-note 2022-02-03 14:29:02 +01:00