Commit Graph

3478 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
65c526df86 Python: Model CookieWrite for tornado 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
9340d658a4 Python: Model CookieWrite for django 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
930ed0a712 Python: Minor django fixup 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
226425e831 Python: Model CookieWrite for aiohttp 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
e1af1f11ee Python: Add HTTP::Server::CookieWrite concept
along with tests, but no implementations (to ease reviewing).

---

I've put quite some thinking into what to call our concept for this.

[JS has `CookieDefinition`](581f4ed757/javascript/ql/src/semmle/javascript/frameworks/HTTP.qll (L148-L187)), but I couldn't find a matching concept in any other languages.

We used to call this [`CookieSet`](f07a7bf8cf/python/ql/src/semmle/python/web/Http.qll (L76)) (and had a corresponding `CookieGet`).

But for headers, [Go calls this `HeaderWrite`](cd1e14ed09/ql/src/semmle/go/concepts/HTTP.qll (L97-L131)) and [JS calls this `HeaderDefinition`](581f4ed757/javascript/ql/src/semmle/javascript/frameworks/HTTP.qll (L23-L46))

I think it would be really cool if we have a naming scheme that means the name for getting the value of a header on a incoming request is obvious. I think `HeaderWrite`/`HeaderRead` fulfils this best. We could go with `HeaderSet`/`HeaderGet`, but they feel a bit too vague to me. For me, I'm so used to talking about def-use, that I would immediately go for `HeaderDefinition` and `HeaderUse`, which could work, but is kinda strange.

So in the end that means I went with `CookieWrite`, since that allows using a consistent naming scheme for the future :)
2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
902b450b12 Python: Also model pathlib.Path().open().write()
And this transition to type-trackers also helped fix the missing path
through function calls 👍
2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
39ec8701ca Python: Add FileSystemWriteAccess concept
I made `FileSystemWriteAccess` be a subclass of `FileSystemAccess` (like in [JS](64001cc02c/javascript/ql/src/semmle/javascript/Concepts.qll (L68-L74))), but then I started wondering about how I could  give a good result for `getAPathArgument`, and what would a good result even be? The argument to the `open` call, or the object that the `write` method is called on? I can't see how doing either of these enables us to do anything useful...

So I looked closer at how JS uses `FileSystemWriteAccess`:

1. as sink for zip-slip: 7c51dff0f7/javascript/ql/src/semmle/javascript/security/dataflow/ZipSlipCustomizations.qll (L121)
2. as sink for downloading unsafe files (identified through their extension) through non-secure connections: 89ef6ea4eb/javascript/ql/src/semmle/javascript/security/dataflow/InsecureDownloadCustomizations.qll (L134-L150)
3. as sink for writing untrusted data to a local file  93b1e59d62/javascript/ql/src/semmle/javascript/security/dataflow/HttpToFileAccessCustomizations.qll (L43-L46)

for the 2 first sinks, it's important that `getAPathArgument` has a proper result... so that solves the problem, and highlights that it _can_ be important to give proper results for `getAPathArgument` (if possible).

So I'm trying to do best effort for `f = open(...); f.write(...)`, but with this current code we won't always be able to give a result (as highlighted by the tests). It will also be the case that there are multiple `FileSystemAccess` with the same path-argument, which could be a little strange.

overall, I'm not super confident about the way this new concept and implementation turned out, but it also seems like the best I could come up with right now...

The obvious alternative solution is to NOT make `FileSystemWriteAccess` a subclass of `FileSystemAccess`, but I'm not very tempted to go down this path, given the examples of this being useful above, and just the general notion that we should be able to model writes as being a specialized kind of `FileSystemAccess`.
2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
6a6d6fbe92 Python: Add leading space in some inline tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
13609b2888 Python: Move pathlib tests to Python 3 only tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
e2facd0981 Python: Expand cleartext query tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
5506365b0e Python: Split cleartext tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
c0964617d7 Merge pull request #6111 from tausbn/python-a-few-minor-cleanups
Python: A few minor bits of cleanup
2021-06-23 10:42:41 +02:00
Taus
317c6867aa Python: Fix sneaky semantic change
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2021-06-22 16:46:54 +02:00
Rasmus Wriedt Larsen
5db627042f Merge pull request #6091 from tausbn/python-exclude-main-py-files
Python: Avoid `__main__.py` files as entry points.
2021-06-22 11:29:02 +02:00
Rasmus Wriedt Larsen
e05d6e71b8 Merge pull request #6064 from tausbn/python-add-get-method-call
Python: Add `getAMethodCall` to `LocalSourceNode`
2021-06-22 11:16:39 +02:00
Taus
ba6ab8ff3d Python: Expand __main__.py comment
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
2021-06-21 18:14:03 +02:00
Taus
768cab3642 Python: Address review comments
- changes `getReceiver` to `getObject`
- fixes `calls` to avoid unwanted cross-talk
- adds some more documentation to highlight the above issue
2021-06-21 14:57:19 +00:00
CodeQL CI
565af1a879 Merge pull request #6071 from RasmusWL/fix-input-cwe
Approved by calumgrant, tausbn
2021-06-21 06:23:18 -07:00
yoff
baf8d0a990 Merge pull request #6045 from RasmusWL/twisted
Python: Model twisted
2021-06-21 14:52:57 +02:00
Anders Schack-Mulligen
9110dfaeb3 Merge pull request #6095 from hvitved/dataflow/local-cc-join
Data flow: Fix `getLocalCallContext` join-order
2021-06-21 12:53:38 +02:00
Rasmus Wriedt Larsen
d6ec4d30fc Python: Twisted refactor of getRequestParamIndex 2021-06-21 10:54:28 +02:00
Rasmus Wriedt Larsen
8208aebd7e Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-06-21 10:43:25 +02:00
Taus
3aea270e10 Python: Autoformat 2021-06-18 18:30:27 +00:00
Taus
aeac03663f Python: Remove old ClickHouseDriver.qll
The merge must've gone wrong some way, as this file is not supposed to
exist in `experimental` anymore.
2021-06-18 17:41:09 +00:00
Taus
348b20ca9d Merge branch 'main' of https://github.com/github/codeql into python-a-few-minor-cleanups 2021-06-18 17:38:43 +00:00
Taus
9351688da8 Python: asCfgNode cleanup 2021-06-18 17:22:42 +00:00
Taus
c386f4a009 Python: Clean up py/insecure-protocol
Going all the way to the AST layer seemed excessive to me, so I rewrote
it to do most of the logic at the data-flow layer. In principle this
_could_ result in more names being computed (due to splitting), but in
practice I don't expect this make a big difference.
2021-06-18 17:22:42 +00:00
Taus
f24a9a46d9 Python: add getAnAttributeWrite 2021-06-18 17:22:42 +00:00
Taus
c78ba476cf Python: Clean up a few verbose casts 2021-06-18 17:22:42 +00:00
Calum Grant
32f6a465b0 Merge pull request #6080 from github/calumgrant/security-severities
Update security-severity scores
2021-06-18 09:40:40 +01:00
Tom Hvitved
eb86bceb4d Address review comments 2021-06-18 10:18:47 +02:00
Anders Schack-Mulligen
b173b4141d Merge pull request #6096 from smowton/smowton/fix/inline-expectations-missing-prefix
Inline expectation tests: accept // $MISSING: and // $SPURIOUS:
2021-06-17 11:41:15 +02:00
Chris Smowton
558813acf7 Inline expectation tests: accept // $MISSING: and // $SPURIOUS:
Previously there had to be a space after the $ token, unlike ordinary expectations (i.e., // $xss was already accepted)
2021-06-17 09:44:39 +01:00
Tom Hvitved
0febf5a592 Merge pull request #6094 from hvitved/dataflow/consistency-compiler-too-smart
Data flow: Workaround for too clever compiler in consistency queries
2021-06-17 10:23:31 +02:00
Tom Hvitved
ffb2350a54 Data flow: Fix getLocalCallContext join-order 2021-06-17 10:02:31 +02:00
Tom Hvitved
cc383e0f6a Data flow: Workaround for too clever compiler in consistency queries 2021-06-17 09:43:36 +02:00
CodeQL CI
bcafe532ac Merge pull request #5944 from RasmusWL/async-api-graph-tests
Approved by tausbn
2021-06-16 08:46:26 -07:00
yoff
0ddeb7a8c1 Merge pull request #5950 from RasmusWL/promote-clickhouse
Python: Promote ClickHouse SQL models
2021-06-16 13:38:41 +02:00
Taus
e647403948 Python: Avoid __main__.py files as entry points.
According to the official documentation, the purpose of `__main__.py`
files is that their presence in a package (say, `foo`) means one can
execute the package directly using `python -m foo` (which will run the
aforementioned `foo/__main__.py` file).

In principle this means that adding `if __name__ == "__main__"` in these
files is superfluous, as they are only intended to be executed (and not
imported by some other file).

However, in practice people often _do_ include the above construct.
Here are some instances of this on LGTM.com:
https://lgtm.com/query/7521266095072095777/

In particular, 10 out of 33 files in `cpython` have this construct.

This causes some confusion in our module naming, as we usually see the
presence of `__name__ == "__main__"` as an indication that a file may
be run directly (and hence with "absolute import" semantics). However,
when run with `python -m`, the interpreter uses the usual package
semantics, and this leads to modules getting multiple names.

For this reason, I think it makes sense to simply exclude `__main__.py`
files from consideration. Note that if there is a `#!` line mentioning
the Python interpreter, then they will still be included as entry
points.
2021-06-16 10:59:56 +00:00
Taus
96d8fc78f8 Merge pull request #6078 from hvitved/type-tracker-caching
Python: Move cached predicates in type tracker library to same stage
2021-06-16 10:45:02 +02:00
Taus
359bc5eff9 Python: Autoformat 2021-06-15 15:56:40 +00:00
Taus
b55c034502 Python: Fix up getAMethodCall
Now that we have a `MethodCallNode` class, it would be silly not to use
that as the return type.
2021-06-15 15:13:54 +00:00
Taus
41ee325bc9 Python: Clean up Stdlib.qll
Not as many opportunities to clean stuff up here.
2021-06-15 15:04:30 +00:00
Taus
e90ec807ef Python: Clean up Ssl.qll 2021-06-15 15:04:29 +00:00
Taus
82fab3ba75 Python: Clean up Cryptography.qll 2021-06-15 15:04:29 +00:00
Taus
d4b05547ba Python: Add MethodCallNode class
Roughly patterned after the JS equivalent.
2021-06-15 15:04:29 +00:00
Taus
87ee7849a9 Merge pull request #6077 from RasmusWL/fix-pypi-names
Python: Fixup for names of supported PyPI packages
2021-06-15 15:01:35 +02:00
yoff
b19d64f173 Merge pull request #6013 from RasmusWL/sensitive-improvements
Python: Improve sensitive data modeling
2021-06-15 14:45:40 +02:00
Calum Grant
771e686946 Update security-severity scores 2021-06-15 13:25:17 +01:00
Tom Hvitved
c03ee32f02 Python: Move cached predicates in type tracker library to same stage 2021-06-15 13:42:43 +02:00