codeql

mirror of https://github.com/github/codeql.git synced 2025-12-17 17:23:36 +01:00

Author	SHA1	Message	Date
Rasmus Wriedt Larsen	51b543c67c	Python: Model taint for django request methods	2021-07-21 16:35:09 +02:00
Rasmus Wriedt Larsen	bced467a88	Python: Refactor django additional step handling So it matches the new style we're using in aiohttp/twisted/...	2021-07-21 16:35:09 +02:00
Rasmus Wriedt Larsen	ce4b192caa	Python: Improve usefulness of RemoteFlowSourcesReach meta query
Before, results from `dca` would look something like
## + py/meta/alerts/remote-flow-sources-reach
- django/django@c2250cf_cb8f: tests/messages_tests/urls.py:38:16:38:48 reachable with taint-tracking from RemoteFlowSource
- django/django@c2250cf_cb8f: tests/messages_tests/urls.py:38:9:38:12 reachable with taint-tracking from RemoteFlowSource
now it should make it easier to spot _what_ it is that actually changed, since we pretty-print the node.	2021-07-21 16:35:09 +02:00
Rasmus Wriedt Larsen	6aabbf0b9a	Python: Add some alert meta queries Intended for use with dca	2021-07-21 14:53:01 +02:00
Taus	233ae5a54b	Python: Fix FP in `py/unused-local-variable` This is only a temporary fix, as indicated by the TODO comment. The real underlying issue is the fact that `isUnused` is defined in terms of the underlying SSA variables (as these are only created for variables that are actually used), and the fact that annotated assignments are always considered to redefine their targets, which may not actually be the case. Thus, the correct fix would be to change the extractor to _disregard_ mere type annotations for the purposes of figuring out whether an SSA variable should be created or not. However, in the short term the present fix is likely sufficient.	2021-07-20 12:13:44 +00:00
Taus	8b3fa789da	Python: Add `AnnAssign` `DefinitionNode` This was a source of false positives for the `py/uninitialized-local-variable` query, as exemplified by the test case.	2021-07-20 11:57:26 +00:00
Porcuiney Hairs	c6c925d67a	Python : Improve Xpath Injection Query	2021-07-20 03:31:30 +05:30
Sam Havron	733e5b45bf	Fix qhelp typo in RequestWithoutValidation	2021-07-19 16:01:06 -04:00
thank_you	9e01338500	Query only vulnerable methods	2021-07-18 17:13:10 -04:00
Rasmus Wriedt Larsen	a07de3faae	Merge branch 'main' into emptyRedos	2021-07-15 18:21:29 +02:00
CodeQL CI	d282f6a356	Merge pull request #6218 from tausbn/python-add-typetrackingnode Approved by RasmusWL	2021-07-15 07:04:50 -07:00
Taus	dd03d8102b	Merge pull request #6300 from RasmusWL/redos-tests Python: Fix `py/polynomial-redos`	2021-07-15 15:59:01 +02:00
Rasmus Wriedt Larsen	900cbc9a2f	Merge pull request #6265 from tausbn/python-performance-fixes Python: Fix a few performance issues.	2021-07-15 14:19:37 +02:00
Rasmus Wriedt Larsen	a5834c4d78	Python: Fix `py/polynomial-redos`	2021-07-15 14:16:19 +02:00
Anders Schack-Mulligen	8ccdd4fb9f	Merge pull request #6211 from aschackmull/dataflow/refactor-call-context-check Dataflow: Refactor call context check	2021-07-15 12:27:23 +02:00
Erik Krogh Kristensen	383b5f2ff2	implement RegExpSubPattern.getOperand in the Python regexp implementation	2021-07-15 09:41:53 +02:00
Erik Krogh Kristensen	de8f64c5be	sync with python	2021-07-14 23:40:06 +02:00
Taus	fb57c5f6f0	Merge pull request #6143 from RasmusWL/concepts-private-import-python Python: Make `import python` private in Concepts.qll	2021-07-14 17:49:06 +02:00
Taus	5c5ee85332	Merge pull request #6122 from RasmusWL/mention-mysqlclient Python: Mention modeling of `mysqlclient` PyPI package	2021-07-14 17:48:40 +02:00
Taus	30d61045d2	Python: Mention `nameIndicatesSensitiveData` Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>	2021-07-14 17:33:39 +02:00
Taus	2bb44d49d9	Python: Perform more deduplication This cut the evaluation time on `django` down from 1.2 seconds to ~0.8 seconds (but the impact will likely be greater on bigger projects).	2021-07-14 13:38:05 +00:00
Taus	09993406f1	Python: Add explanatory QLDoc comment	2021-07-14 10:42:07 +00:00
Anders Schack-Mulligen	0ccb213ec5	Dataflow: Sync.	2021-07-14 10:36:09 +02:00
CodeQL CI	f6f7020388	Merge pull request #6250 from erik-krogh/python-redos-unicode Approved by RasmusWL	2021-07-14 01:09:26 -07:00
Taus	6aec7f2c49	Merge pull request #6264 from RasmusWL/customization-files-for-path-problems Python: Provide proper source/sink customization for most path queries	2021-07-13 15:09:33 +02:00
Rasmus Wriedt Larsen	9ed61e7663	Python: Port `py/polynomial-redos` to use proper source/sink customization I noticed the configuration/customization files are in the `performance` folder in JS, but I just kept them in place, since that seems correct to me.	2021-07-13 14:39:44 +02:00
Rasmus Wriedt Larsen	cea2f82be9	Python: Port `py/path-injection` to use proper source/sink customization	2021-07-13 14:09:02 +02:00
Rasmus Wriedt Larsen	bf214ac3bb	Python: Apply suggestions from code review Co-authored-by: Taus <tausbn@github.com>	2021-07-13 13:41:26 +02:00
Rasmus Wriedt Larsen	1a59c9b64a	Merge pull request #6204 from tausbn/python-ensmallen-localsourcenode Python: Clean up `LocalSourceNode` charpred	2021-07-13 13:27:38 +02:00
Taus	1decf23785	Python: Fix bad join order for sensitive data
Not the prettiest of solutions, but it does the job.
Basically, we were calculating (and re-calculating) the same big relation between strings and regexes and then checking whether the latter matched the former. This resulted in tuple counts like the following:
```
[2021-07-12 16:09:24] (12s) Tuple counts for SensitiveDataSources::SensitiveDataModeling::SensitiveVariableAssignment#class#ff#shared/4@7489c6:
4918074 ~0% {4} r1 = JOIN SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp#ff WITH Flow::NameNode::getId_dispred#ff CARTESIAN PRODUCT OUTPUT Lhs.0 'arg0', Lhs.1 'arg1', Rhs.0, Rhs.1 'arg3'
2654 ~0% {4} r2 = JOIN r1 WITH PRIMITIVE regexpMatch#bb ON Lhs.3 'arg3',Lhs.1 'arg1'
return r2
```
(The above being just the bit that handles `DefinitionNode` in `SensitiveVariableAssignment`, and taking 12 seconds to evaluate.)
By applying a bit of manual inlining and magic, this becomes somewhat more manageable:
```
[2021-07-12 15:59:44] (1s) Tuple counts for SensitiveDataSources::SensitiveDataModeling::sensitiveString#ff/2@8830e2:
27671 ~2% {3} r1 = JOIN SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp#ff WITH SensitiveDataSources::SensitiveDataModeling::sensitiveParameterName#f CARTESIAN PRODUCT OUTPUT Lhs.0 'classification', Lhs.1, Rhs.0
334012 ~2% {3} r2 = JOIN SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp#ff WITH SensitiveDataSources::SensitiveDataModeling::sensitiveName#f CARTESIAN PRODUCT OUTPUT Lhs.0 'classification', Lhs.1, Rhs.0
361683 ~11% {3} r3 = r1 UNION r2
154644 ~0% {3} r4 = JOIN SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp#ff WITH SensitiveDataSources::SensitiveDataModeling::sensitiveFunctionName#f CARTESIAN PRODUCT OUTPUT Lhs.0 'classification', Lhs.1, Rhs.0
149198 ~1% {3} r5 = JOIN SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp#ff WITH SensitiveDataSources::SensitiveDataModeling::sensitiveStrConst#f CARTESIAN PRODUCT OUTPUT Lhs.0 'classification', Lhs.1, Rhs.0
124257 ~5% {3} r6 = JOIN SensitiveDataHeuristics::HeuristicNames::maybeSensitiveRegexp#ff WITH SensitiveDataSources::SensitiveDataModeling::sensitiveAttributeName#f CARTESIAN PRODUCT OUTPUT Lhs.0 'classification', Lhs.1, Rhs.0
273455 ~21% {3} r7 = r5 UNION r6
428099 ~30% {3} r8 = r4 UNION r7
789782 ~78% {3} r9 = r3 UNION r8
1121 ~77% {3} r10 = JOIN r9 WITH PRIMITIVE regexpMatch#bb ON Lhs.2 'result',Lhs.1
1121 ~70% {2} r11 = SCAN r10 OUTPUT In.0 'classification', In.2 'result'
return r11
```
(The above being the total for all the sensitive names we care about, taking only 1.2 seconds to evaluate.)
Incidentally, you may wonder why this has _fewer_ results than before. The answer is control flow splitting -- every sensitively-named `DefinitionNode` would have been matched in isolation previously. By pre-matching on just the names of these, we can subsequently join against those names that are known to be sensitive, which is a much faster operation. (We also get the benefit of deduplicating the strings that are matched, before actually performing the match, so if, say, an attribute name and a variable name are identical, then we'll only match them once.)
We also exclude all docstrings as relevant string constants, as these presumably don't actually flow anywhere.	2021-07-12 16:10:49 +00:00
Taus	a73e382dfe	Python: Prevent bad join in hashlib model
I'm not entirely sure what triggered this bad join order, but some combination of the use of abstract classes and the exclusion of `new` caused this to go really wrong:
```
WeakSensitiveDataHashing.ql-15:Stdlib::Stdlib::HashlibDataPassedToHashClass#class#ffff ......... 15.5s
```
with the following tuple counts:
```
[2021-07-12 13:20:15] (16s) Tuple counts for Stdlib::Stdlib::HashlibDataPassedToHashClass#class#ffff/4@217901:
148810 ~3% {3} r1 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArg_dispred#fff ON FIRST 1 OUTPUT "hashlib", Lhs.1 'node', Lhs.0 'this'
148810 ~4% {3} r2 = JOIN r1 WITH ApiGraphs::API::Impl::MkModuleImport#ff@staged_ext ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'node', Lhs.2 'this'
7589310 ~486% {4} r3 = JOIN r2 WITH ApiGraphs::API::Impl::edge#2#fff@staged_ext ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.2 'this', Rhs.1, InverseAppend("getMember(\"","\")",Rhs.1)
6994070 ~490% {4} r4 = SELECT r3 ON In.3 != "new"
6994070 ~4503% {2} r5 = SCAN r4 OUTPUT In.1 'this', In.0 'node'
22 ~4% {3} r6 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArgByName_dispred#fff ON FIRST 1 OUTPUT "hashlib", Lhs.1 'node', Lhs.0 'this'
22 ~0% {3} r7 = JOIN r6 WITH ApiGraphs::API::Impl::MkModuleImport#ff@staged_ext ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'node', Lhs.2 'this'
1122 ~437% {4} r8 = JOIN r7 WITH ApiGraphs::API::Impl::edge#2#fff@staged_ext ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.2 'this', Rhs.1, InverseAppend("getMember(\"","\")",Rhs.1)
1034 ~460% {4} r9 = SELECT r8 ON In.3 != "new"
1034 ~4549% {2} r10 = SCAN r9 OUTPUT In.1 'this', In.0 'node'
6995104 ~4503% {2} r11 = r5 UNION r10
5213851 ~4683% {3} r12 = JOIN r11 WITH ApiGraphs::API::Node::getACall_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'hashClass', Lhs.1 'node', Lhs.0 'this'
6478480 ~4646% {6} r13 = JOIN r12 WITH ApiGraphs::API::Impl::edge#2#fff_201#join_rhs ON FIRST 1 OUTPUT "hashlib", Rhs.1, Lhs.1 'node', Lhs.2 'this', Lhs.0 'hashClass', Rhs.2
1410 ~4693% {5} r14 = JOIN r13 WITH ApiGraphs::API::Impl::MkModuleImport#ff@staged_ext ON FIRST 2 OUTPUT Lhs.2 'node', Lhs.3 'this', Lhs.4 'hashClass', Lhs.5, InverseAppend("getMember(\"","\")",Lhs.5)
1222 ~4540% {5} r15 = SELECT r14 ON In.4 'hashName' != "new"
1222 ~4540% {4} r16 = SCAN r15 OUTPUT In.1 'this', In.4 'hashName', In.2 'hashClass', In.0 'node'
```
By factoring out the insides, the biggest iteration now looks like
```
[2021-07-12 14:17:36] (0s) Tuple counts for Stdlib::Stdlib::HashlibDataPassedToHashClass#class#ffff/4@85bb21:
148810 ~0% {2} r1 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArg_dispred#fff ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.0 'this'
148810 ~0% {2} r2 = JOIN r1 WITH Stdlib::Stdlib::hashlibMember#ff#nonempty CARTESIAN PRODUCT OUTPUT Lhs.1 'this', Lhs.0 'node'
22 ~0% {2} r3 = JOIN DataFlowPublic::CallCfgNode#class#ff#shared WITH project#DataFlowPublic::CallCfgNode::getArgByName_dispred#fff ON FIRST 1 OUTPUT Lhs.1 'node', Lhs.0 'this'
22 ~0% {2} r4 = JOIN r3 WITH Stdlib::Stdlib::hashlibMember#ff#nonempty CARTESIAN PRODUCT OUTPUT Lhs.1 'this', Lhs.0 'node'
148832 ~0% {2} r5 = r2 UNION r4
110933 ~2% {3} r6 = JOIN r5 WITH ApiGraphs::API::Node::getACall_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'hashClass', Lhs.1 'node', Lhs.0 'this'
26 ~0% {4} r7 = JOIN r6 WITH Stdlib::Stdlib::hashlibMember#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.2 'this', Rhs.1 'hashName', Lhs.0 'hashClass', Lhs.1 'node'
return r7
```
(The tuple counts themselves are not directly comparable.)	2021-07-12 14:22:21 +00:00
Rasmus Wriedt Larsen	47f5c977cf	Python: Port `py/stack-trace-exposure` to use proper source/sink customization	2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen	934007c811	Python: Port `py/unsafe-deserialization` to use proper source/sink customization	2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen	7c71223f7f	Python: Port `py/url-redirection` to use proper source/sink customization	2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen	b4c0b1b525	Python: Port `py/reflective-xss` to use proper source/sink customization	2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen	62e4445f45	Python: Port `py/command-line-injection` to use proper source/sink customization	2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen	7f53781ba7	Python: Port `py/code-injection` to use proper source/sink customization	2021-07-12 16:22:10 +02:00
Rasmus Wriedt Larsen	0be280c608	Python: Port `py/sql-injection` to use proper source/sink customization	2021-07-12 16:22:10 +02:00
Erik Krogh Kristensen	c4f5009917	make explicit calls to member predicates Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>	2021-07-12 14:22:08 +02:00
Taus	1e79091120	Python: Fix typo	2021-07-12 11:33:52 +00:00
Taus	32062d83ad	Python: Make deprecation warning more prominent	2021-07-12 10:00:21 +00:00
Erik Krogh Kristensen	440e4b9a92	enable unicode support in the Python ReDoS query	2021-07-11 21:28:40 +02:00
haby0	e8d0827916	Add tornado source	2021-07-05 10:42:15 +08:00
Taus	a65d40e36f	Merge branch 'main' into python-add-typetrackingnode	2021-07-02 20:55:37 +02:00
Taus	55d822cc56	Python: Add `TypeTrackingNode` Splits `ModuleVariableNode` away from `LocalSourceNode`, instead creating a class `TypeTrackingNode` that encapsulates both of these. This means we no longer have module variable nodes as part of `LocalSourceNode` (which is good, since they have no "local" aspect to them), and hence we can have `LocalSourceNode` inherit directly from `ExprNode` (which makes the API a bit nicer). Unfortunately these are breaking changes, so we can't actually fulfil the above two desiderata until the `track` and `backtrack` methods on `LocalSourceNode` have been fully deprecated. For this reason, we preserve the present implementation of `LocalSourceNode`, and instead lay the foundation for switching over in the future, by deprecating `track` and `backtrack` on `LocalSourceNode`.	2021-07-02 18:00:33 +00:00
CodeQL CI	1d56748eed	Merge pull request #6200 from yoff/pythonJS-make-expbtlib-private Approved by RasmusWL, esbena	2021-07-02 09:09:18 -07:00
CodeQL CI	a25933aa56	Merge pull request #5926 from RasmusWL/small-cleanups Approved by tausbn	2021-07-02 04:59:54 -07:00
haby0	b866f1b21e	Add CWE-348 ClientSuppliedIpUsedInSecurityCheck	2021-07-02 19:30:33 +08:00
Rasmus Wriedt Larsen	81fab487a4	Python: Apply suggestions from code review Co-authored-by: Taus <tausbn@github.com>	2021-07-02 13:27:41 +02:00
Rasmus Wriedt Larsen	22c155687e	Python: Fix code after removing `getPostUpdateNode`	2021-07-02 13:25:25 +02:00


3072 Commits