Commit Graph

2229 Commits

Author SHA1 Message Date
Rasmus Wriedt Larsen
9573048ee8 Python: Port py/clear-text-logging-sensitive-data 2021-06-25 14:35:31 +02:00
Rasmus Wriedt Larsen
68cfeb0b5c Python: Model logging from the logging module 2021-06-25 14:26:35 +02:00
Rasmus Wriedt Larsen
36c9ceb13b Python: Add Logging concept 2021-06-25 14:26:35 +02:00
Rasmus Wriedt Larsen
a9469b73d9 Python: Port py/clear-text-storage-sensitive-data 2021-06-24 17:39:08 +02:00
Rasmus Wriedt Larsen
7017beca47 Python: Model CookieWrite for twisted
Had to split the call to `request.cookies.append` since inline
expectation tests didn't like the expectation that contained `=` :(
2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
4606444b85 Python: Model CookieWrite for flask 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
65c526df86 Python: Model CookieWrite for tornado 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
9340d658a4 Python: Model CookieWrite for django 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
226425e831 Python: Model CookieWrite for aiohttp 2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
e1af1f11ee Python: Add HTTP::Server::CookieWrite concept
along with tests, but no implementations (to ease reviewing).

---

I've put quite some thinking into what to call our concept for this.

[JS has `CookieDefinition`](581f4ed757/javascript/ql/src/semmle/javascript/frameworks/HTTP.qll (L148-L187)), but I couldn't find a matching concept in any other languages.

We used to call this [`CookieSet`](f07a7bf8cf/python/ql/src/semmle/python/web/Http.qll (L76)) (and had a corresponding `CookieGet`).

But for headers, [Go calls this `HeaderWrite`](cd1e14ed09/ql/src/semmle/go/concepts/HTTP.qll (L97-L131)) and [JS calls this `HeaderDefinition`](581f4ed757/javascript/ql/src/semmle/javascript/frameworks/HTTP.qll (L23-L46))

I think it would be really cool if we had a naming scheme that makes the name for getting the value of a header on an incoming request obvious. I think `HeaderWrite`/`HeaderRead` fulfils this best. We could go with `HeaderSet`/`HeaderGet`, but they feel a bit too vague to me. For me, I'm so used to talking about def-use, that I would immediately go for `HeaderDefinition` and `HeaderUse`, which could work, but is kinda strange.

So in the end that means I went with `CookieWrite`, since that allows using a consistent naming scheme for the future :)
2021-06-24 17:34:43 +02:00
Rasmus Wriedt Larsen
686638a65f Merge pull request #6049 from RasmusWL/jmespath
Python: Add modeling of `jmespath`
2021-06-24 11:13:19 +02:00
CodeQL CI
469e709113 Merge pull request #6055 from RasmusWL/rsa-modeling
Approved by yoff
2021-06-23 08:35:25 -07:00
Rasmus Wriedt Larsen
447099a1df Python: Update jmespath tests 2021-06-23 13:32:19 +02:00
Rasmus Wriedt Larsen
902b450b12 Python: Also model pathlib.Path().open().write()
And this transition to type-trackers also helped fix the missing path
through function calls 👍
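
A minimal Python sketch of the kind of code this is about, assuming hypothetical helpers `write_report` and `save` (these names are illustrative and not taken from the tests): the file object comes from `pathlib.Path(...).open(...)` and the `write` call happens inside another function.

```python
import pathlib
import tempfile


def write_report(handle, text):
    # The file object reaches this function through a call;
    # the actual write happens here.
    return handle.write(text)


def save(path, text):
    # pathlib.Path(...).open(...) produces the file object that is written to later.
    with pathlib.Path(path).open("w") as handle:
        return write_report(handle, text)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = pathlib.Path(tmp) / "report.txt"
        save(target, "hello")
        print(target.read_text())
```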
2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
39ec8701ca Python: Add FileSystemWriteAccess concept
I made `FileSystemWriteAccess` be a subclass of `FileSystemAccess` (like in [JS](64001cc02c/javascript/ql/src/semmle/javascript/Concepts.qll (L68-L74))), but then I started wondering about how I could give a good result for `getAPathArgument`, and what would a good result even be? The argument to the `open` call, or the object that the `write` method is called on? I can't see how doing either of these enables us to do anything useful...

So I looked closer at how JS uses `FileSystemWriteAccess`:

1. as sink for zip-slip: 7c51dff0f7/javascript/ql/src/semmle/javascript/security/dataflow/ZipSlipCustomizations.qll (L121)
2. as sink for downloading unsafe files (identified through their extension) through non-secure connections: 89ef6ea4eb/javascript/ql/src/semmle/javascript/security/dataflow/InsecureDownloadCustomizations.qll (L134-L150)
3. as sink for writing untrusted data to a local file: 93b1e59d62/javascript/ql/src/semmle/javascript/security/dataflow/HttpToFileAccessCustomizations.qll (L43-L46)

For the first two sinks, it's important that `getAPathArgument` has a proper result... so that solves the problem, and highlights that it _can_ be important to give proper results for `getAPathArgument` (if possible).

So I'm trying to do best effort for `f = open(...); f.write(...)`, but with this current code we won't always be able to give a result (as highlighted by the tests). It will also be the case that there are multiple `FileSystemAccess` with the same path-argument, which could be a little strange.

Overall, I'm not super confident about the way this new concept and implementation turned out, but it also seems like the best I could come up with right now...

The obvious alternative solution is to NOT make `FileSystemWriteAccess` a subclass of `FileSystemAccess`, but I'm not very tempted to go down this path, given the examples of this being useful above, and just the general notion that we should be able to model writes as being a specialized kind of `FileSystemAccess`.
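
To make the `f = open(...); f.write(...)` case concrete, here is a small Python sketch (the function name `store` is hypothetical, not from the tests). It only shows why the path and the written data sit at two different calls; it is not a statement about what the model actually reports.

```python
import os
import tempfile


def store(path, data):
    f = open(path, "w")  # the path argument is only visible here, at `open`
    try:
        f.write(data)    # the data is written here, on the object `f`
    finally:
        f.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "out.txt")
        store(target, "payload")
        with open(target) as check:
            print(check.read())
```

Both the `open` call and the `write` call relate to the same `path`, which is the situation described above where several file system accesses can share one path argument.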
2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
6a6d6fbe92 Python: Add leading space in some inline tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
13609b2888 Python: Move pathlib tests to Python 3 only tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
e2facd0981 Python: Expand cleartext query tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
5506365b0e Python: Split cleartext tests 2021-06-23 10:50:04 +02:00
Rasmus Wriedt Larsen
0b767bb853 Merge branch 'main' into small-cleanups 2021-06-22 15:01:53 +02:00
Rasmus Wriedt Larsen
e05d6e71b8 Merge pull request #6064 from tausbn/python-add-get-method-call
Python: Add `getAMethodCall` to `LocalSourceNode`
2021-06-22 11:16:39 +02:00
Taus
768cab3642 Python: Address review comments
- changes `getReceiver` to `getObject`
- fixes `calls` to avoid unwanted cross-talk
- adds some more documentation to highlight the above issue
2021-06-21 14:57:19 +00:00
Rasmus Wriedt Larsen
1c48aca630 Merge branch 'main' into jmespath 2021-06-21 15:26:45 +02:00
yoff
baf8d0a990 Merge pull request #6045 from RasmusWL/twisted
Python: Model twisted
2021-06-21 14:52:57 +02:00
Rasmus Wriedt Larsen
8208aebd7e Python: Apply suggestions from code review
Co-authored-by: yoff <lerchedahl@gmail.com>
2021-06-21 10:43:25 +02:00
jorgectf
eac5254a88 Resolve merge conflict 2021-06-18 02:12:49 +02:00
jorgectf
1d7ddce8db Update .expected 2021-06-17 18:10:43 +02:00
jorgectf
eb16018446 Update .expected 2021-06-17 15:45:05 +02:00
jorgectf
8e3d5ff3f9 Rename mongoclient tests 2021-06-17 15:43:01 +02:00
Anders Schack-Mulligen
b173b4141d Merge pull request #6096 from smowton/smowton/fix/inline-expectations-missing-prefix
Inline expectation tests: accept // $MISSING: and // $SPURIOUS:
2021-06-17 11:41:15 +02:00
Chris Smowton
558813acf7 Inline expectation tests: accept // $MISSING: and // $SPURIOUS:
Previously there had to be a space after the $ token, unlike ordinary expectations (i.e., // $xss was already accepted)
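
As a rough illustration only (the real logic lives in the QL inline expectation test library, and this regular expression is an assumption, not a copy of it), accepting the prefixes with or without a space after `$` could look like this in Python. The helper `parse_expectation` and the `#` comment alternative are hypothetical.

```python
import re

# Hypothetical pattern: a comment marker, `$`, optional whitespace,
# an optional MISSING:/SPURIOUS: prefix, then the expectation tag.
EXPECTATION = re.compile(r"(?://|#)\s*\$\s*(?:(MISSING|SPURIOUS):\s*)?(\w+)")


def parse_expectation(line):
    """Return (prefix, tag) for an expectation comment, or None if there is none."""
    match = EXPECTATION.search(line)
    if match is None:
        return None
    return (match.group(1), match.group(2))


if __name__ == "__main__":
    for sample in ["sink(x) // $MISSING: xss", "sink(x) // $ MISSING: xss", "sink(x) // $xss"]:
        print(sample, "->", parse_expectation(sample))
```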
2021-06-17 09:44:39 +01:00
jorgectf
8527ccc6d6 Update .expected 2021-06-16 23:19:14 +02:00
jorgectf
81505fbd76 Normalize tests 2021-06-16 23:18:38 +02:00
Rasmus Wriedt Larsen
498703fc81 Python: Escaping only valid with both input/output defined
Problematic part is

```codeql
  /** A escape from string format with `markupsafe.Markup` as the format string. */
  private class MarkupEscapeFromStringFormat extends MarkupSafeEscape, Markup::StringFormat {
    override DataFlow::Node getAnInput() {
      result in [this.getArg(_), this.getArgByName(_)] and
      not result = Markup::instance()
    }

    override DataFlow::Node getOutput() { result = this }
  }
```

since the char-pred still holds even if `getAnInput` has no results...

I will say that doing it this way feels kinda dirty, and we _could_ fix
this by including the logic in `getAnInput` in the char-pred as well.
But as I see it, that would just lead to a lot of code duplication,
which isn't very nice.
2021-06-16 19:09:00 +02:00
Rasmus Wriedt Larsen
6539df6422 Python: Add ConceptsTest for MarkupSafe 2021-06-16 19:09:00 +02:00
Rasmus Wriedt Larsen
14de3bffb7 Python: Model MarkupSafe PyPI package
Since expectation tests had so many changes from ConceptsTest, I'm going
to do the changes for that in a separate commit. The important part
is the changes to taint-tracking, which are highlighted in this commit.
2021-06-16 19:09:00 +02:00
Rasmus Wriedt Larsen
bcef8d19e6 Python: Add Escaping concept 2021-06-16 19:09:00 +02:00
Rasmus Wriedt Larsen
d18b9a2704 Python: Add markupsafe tests 2021-06-16 19:09:00 +02:00
CodeQL CI
bcafe532ac Merge pull request #5944 from RasmusWL/async-api-graph-tests
Approved by tausbn
2021-06-16 08:46:26 -07:00
yoff
0ddeb7a8c1 Merge pull request #5950 from RasmusWL/promote-clickhouse
Python: Promote ClickHouse SQL models
2021-06-16 13:38:41 +02:00
jorgectf
5123b8f4e3 Update .expected 2021-06-15 20:29:33 +02:00
jorgectf
e61cf9a58d Simplify tests 2021-06-15 19:32:02 +02:00
yoff
b19d64f173 Merge pull request #6013 from RasmusWL/sensitive-improvements
Python: Improve sensitive data modeling
2021-06-15 14:45:40 +02:00
Rasmus Wriedt Larsen
156b10cb59 Merge branch 'main' into promote-clickhouse 2021-06-15 11:30:19 +02:00
jorgectf
c948970181 resolve merge conflicts 2021-06-15 01:24:04 +02:00
jorgectf
1662c5d113 resolve merge conflict 2021-06-15 01:22:11 +02:00
Rasmus Wriedt Larsen
cc311ac4cd Python: Re-introduce syntactic handling of str/bytes/unicode (again)
This reverts commit 870389addb.
2021-06-14 14:23:12 +02:00
Rasmus Wriedt Larsen
870389addb Revert "Python: Re-introduce syntactic handling of str/bytes/unicode"
This reverts commit c4987e94e0.

Hoping that our new handling of builtins would solve this problem... but
it did not :|
2021-06-14 14:22:40 +02:00
Rasmus Wriedt Larsen
af13064f6a Merge branch 'main' into pr/RasmusWL/5926 2021-06-14 14:17:33 +02:00
Rasmus Wriedt Larsen
53f7633662 Python: Model await request.post() as MultiDictProxy
as highlighted as being quite easy to do by @yoff 👍
2021-06-11 14:53:30 +02:00