Commit Graph

3068 Commits

Author SHA1 Message Date
jorgectf
b207929e0a RegexExecution restructuring 2021-04-27 19:54:16 +02:00
jorgectf
3daec8e6a2 Enclose Sinks and ReMethods in a module 2021-04-27 19:54:15 +02:00
jorgectf
caaf5436c6 Attempt to restructuring ReMethods and RegexExecution's modules 2021-04-27 19:54:14 +02:00
jorgectf
6d5a0f2f84 Limit Sanitizer to re.escape(arg) 2021-04-27 19:54:13 +02:00
jorgectf
a1b5cc3bc6 Typo 2021-04-27 19:54:13 +02:00
jorgectf
e4736d064e Typo 2021-04-27 19:54:12 +02:00
jorgectf
f45307f990 Apply rebase 2021-04-27 19:54:12 +02:00
jorgectf
5dae920783 Edit filenames to match consistent naming 2021-04-27 19:54:11 +02:00
jorgectf
63f708dd57 Apply suggestions 2021-04-27 19:54:10 +02:00
Jorge
6cc714464c Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-04-27 19:54:09 +02:00
jorgectf
21f8135fa6 Move to experimental folder 2021-04-27 19:54:08 +02:00
jorgectf
afc4f51e9c Remove CWE references 2021-04-27 19:54:07 +02:00
jorgectf
bd3d2ec686 Update to match consistent naming across languages 2021-04-27 19:54:07 +02:00
jorgectf
7adc3c2fba Upload ReDoS query, qhelp and tests 2021-04-27 19:54:05 +02:00
Tom Hvitved
914184f3dd Data flow: Sync files 2021-04-27 19:06:39 +02:00
Rasmus Wriedt Larsen
523ed8272d Python: Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2021-04-27 14:42:05 +02:00
yoff
0509a12790 Merge pull request #5770 from tausbn/python-small-api-graph-fix
Python: Use only `TApiNode` in `API::Impl`
2021-04-27 14:06:09 +02:00
Chris Smowton
64a2320be7 Merge pull request #5757 from smowton/smowton/admin/fix-dead-qhelp-links
Fix all dead qhelp links
2021-04-27 12:17:08 +01:00
Rasmus Wriedt Larsen
37db21d269 Merge pull request #5284 from yoff/python-port-insecure-protocol
Python: port py/insecure-protocol
2021-04-27 09:30:18 +02:00
thank_you
62f3e8d64a Add sanitizer for ObjectId
ObjectId is a sanitizer used to sanitize strings into valid MongoDB ids. During research we've found that this method is used.

ObjectId returns a string representing an id. If at any time ObjectId can't parse it's input (like when a tainted dict in passed in), then ObjectId will throw an error preventing the query from running.
2021-04-26 15:35:42 -04:00
Taus
3889c8afec Python: Use only TApiNode in API::Impl
This ensures that changes to `API::Node` does not invalidate the cached
`module Impl`. At present, I don't expect this to have any effect (as
the `Node` class is also fairly static, though not explicitly cached),
but I can imagine us making some of the `Node` methods have
user-extensible behaviour, in which case we definitely do not want this
to result in reevaluation of `API::Impl`.
2021-04-26 13:10:15 +00:00
Rasmus Lerchedahl Petersen
7cc97836a9 Python: More cleanup from reviewer suggestions 2021-04-23 20:26:13 +02:00
Chris Smowton
455b840712 Fix all dead qhelp links
For those documents with no obvious new home I've pointed the links to the Internet Archive.
2021-04-23 15:20:21 +01:00
Rasmus Wriedt Larsen
deb3db3f95 Python: Add non-alert data for extractor diagnostics
This is basically just a port of the C++/JS queries added in:

- https://github.com/github/codeql/pull/5414 (C++)
- https://github.com/github/codeql/pull/5656 (JS)

SyntaxError should capture all errors we have information about. At least in
`python/ql/src/semmlecode.python.dbscheme` the only match for `error` is
`py_syntax_error_versioned` (which `SyntaxError` is based on).
2021-04-23 13:29:44 +02:00
Rasmus Wriedt Larsen
354dee1b09 Python: Add non-alert data for lines of code
`py/summary/lines-of-code` is just a port of the C++/JS queries added in:

- https://github.com/github/codeql/pull/5271 (C++)
- https://github.com/github/codeql/pull/5304 (JS)

We are the first to implement the `lines-of-user-code` query, so nothing to
compare with in other languages -- but it makes a lot of sense to do for Python 👍
2021-04-23 13:22:18 +02:00
yoff
1954c0ba84 Apply suggestions from code review
Co-authored-by: Taus <tausbn@github.com>
2021-04-23 10:20:18 +02:00
Rasmus Wriedt Larsen
f9383a31bf Python: Fix BrokenCryptoAlgorithm.qhelp 2021-04-22 15:58:28 +02:00
Rasmus Wriedt Larsen
222c087e8c Python: Remove type-tracking performance workaround
Since we shouldn't need it anymore (yay)
2021-04-22 15:31:49 +02:00
Rasmus Wriedt Larsen
fc1a6d0e32 Python: Say salting is not part of py/weak-sensitive-data-hashing 2021-04-22 15:23:41 +02:00
Rasmus Wriedt Larsen
ac83c695ad Python: Add py/weak-sensitive-data-hashing query 2021-04-22 15:23:41 +02:00
Rasmus Wriedt Larsen
794a86a6b0 Python: Add SensitiveDataSource 2021-04-22 15:23:39 +02:00
Rasmus Wriedt Larsen
56c409737d Python: Port py/weak-cryptographic-algorithm
The other query (py/weak-sensitive-data-hashing) is added in future commit
2021-04-22 15:23:38 +02:00
Rasmus Wriedt Larsen
1616975e06 Python: Model hashlib from standard library 2021-04-22 15:23:37 +02:00
Rasmus Lerchedahl Petersen
5a4e661e60 Merge branch 'main' of github.com:github/codeql into python-support-pathlib 2021-04-22 15:04:21 +02:00
Rasmus Wriedt Larsen
fa88f22453 Python: Model hashing operations in cryptography package 2021-04-22 14:51:20 +02:00
Rasmus Wriedt Larsen
c5f826580b Python: Model encrypt/decrypt in cryptography package
I introduced a InternalTypeTracking module, since the type-tracking code got so
verbose, that it was impossible to get an overview of the relevant predicates.
(this means the "first" type-tracking predicate that is usually private, cannot
be marked private anymore, since it needs to be exposed in the private module.
2021-04-22 14:51:19 +02:00
Rasmus Wriedt Larsen
23140dfb76 Python: Add CryptographicOperation modeling for Cryptodome 2021-04-22 14:51:17 +02:00
Rasmus Wriedt Larsen
a8de2aba3b Python: Move CryptoAlgorithms implementation 2021-04-22 14:51:15 +02:00
Rasmus Wriedt Larsen
65c8d9605e Python: Add CryptographicOperation Concept
I considered using `getInput` like in JS, but things like signature verification
has multiple inputs (message and signature).

Using getAnInput also aligns better with Decoding/Encoding.
2021-04-22 14:51:14 +02:00
Rasmus Lerchedahl Petersen
b724e51cab Python: Improvements from review suggestions 2021-04-22 10:40:42 +02:00
Rasmus Wriedt Larsen
5a9e27c6fc Merge branch 'main' into django-3.2 2021-04-21 17:15:47 +02:00
Taus
71780228ae Python: Rename TypeTrackerPrivate.qll 2021-04-21 13:08:26 +00:00
Rasmus Wriedt Larsen
2302c8d5fa Python: Model new alias method on django QuerySets 2021-04-21 13:52:38 +02:00
yoff
a19373ab54 Merge pull request #5727 from tausbn/python-use-localsource-in-stepsummary
Python: Use `LocalSourceNode` in `StepSummary::step`
2021-04-21 13:50:31 +02:00
Taus
489e1e94e4 Python: Prevent bad joins
Adds a few unbinds to prevent bad joins from occurring.

Firstly, we never want to join `StepSummary::step` with
`TypeTracker::append` on `summary` as the first join, as the resulting
relation is absolutely massive. So we decouple the two occurrences of
`summary` by unbinding each of them.

Secondly, in some cases the node we're stepping to (`nodeTo` for type
trackers, `nodeFrom` for type backtrackers) will get joined eagerly
with the typetracker one is defining, and again this produces an
uncomfortably large intermediate join. A bit of unbinding prevents this
as well.
2021-04-21 11:44:34 +00:00
Taus
9e95f6e7c1 Python: Remove typePreservingStep
This requires a bit of explanation, so strap in.

Firstly, because we use `LocalSourceNode`s as the start and end points
of our `StepSummary::step` relation, there's no need to include
`simpleLocalFlowStep` (via `typePreservingStep`) in `smallstep`. Indeed,
since the successor node for a `step` is a `LocalSourceNode`, and local
sources never have incoming flow, this is entirely futile -- we can find
values for `mid` and `nodeTo` that satisfy the body of `step`, but
`nodeTo` will never be a `LocalSourceNode`.

With this in mind, we can simplify `smallstep` to only refer to
`jumpStep`.

This then brings the other uses of `typePreservingStep` into question.
The only other place we use this predicate is in the `TypeTracker` and
`TypeBackTracker` `smallstep` predicates. Note, however, that here we
no longer need `jumpStep` to be part of `typeTrackingStep` (as it is
already accounted for in `StepSummary::smallstep`) so we can simplify
to `simpleLocalFlowStep`. At this point, `typePreservingStep` is unused.

Finally, because of the way `smallstep` is used in `step` (inside
`StepSummary`), `nodeTo` must always be a `LocalSourceNode`, so I have
propagated this restriction to `smallstep` as well. We can always lift
this restriction later, but for now it seems like it's likely to cause
fewer surprises to have made this explicit.
2021-04-21 11:12:06 +00:00
Rasmus Wriedt Larsen
775ed41592 Python: Update SensitiveDataHeuristics with newer JS version
which also prompted me to rewrite the QLDoc for `nameIndicatesSensitiveData`
2021-04-21 11:34:01 +02:00
Rasmus Wriedt Larsen
16b62486e9 Python: Extract SensitiveDataHeuristics to be shared with JS
Initially I had called `nameIndicatesSensitiveData` for `maybeSensitiveName`,
which made the relationship with `maybeSensitive` and `notSensitive` quite
strange -- and therefore I added the more informative `maybeSensitiveRegexp` and
`notSensitiveRegexp`.

Although I'm no longer using `maybeSensitiveName`, and I no longer have a strong
argument for making this name change, I still like it. If someone thinks this is
a terrible idea, I'm happy to change it though 👍
2021-04-21 11:31:28 +02:00
yoff
0c4181178d Update python/ql/src/semmle/python/frameworks/Stdlib.qll
Co-authored-by: Taus <tausbn@github.com>
2021-04-20 22:15:09 +02:00
yoff
ef0ea247c4 Merge pull request #5679 from tausbn/python-fix-bad-points-to-joins
Python: Fix bad points-to joins
2021-04-20 21:19:32 +02:00