codeql

mirror of https://github.com/github/codeql.git synced 2026-02-11 20:51:06 +01:00

Author	SHA1	Message	Date
yoff	4445cf152a	python: various fixes
- compilation
- alerts
- some review comments	2022-05-11 12:28:58 +00:00
yoff	f67be52b99	python: fix compilation by making client code use the "new" class. Really, this part of the split class should have the old name, to minimise disruptions to clients. Same goes for the other split classes.	2022-05-10 12:53:13 +00:00
yoff	db008f1939	python: summaries may `allowParameterReturnInSelf`	2022-05-10 12:48:42 +00:00
yoff	238c578f5a	python: Add `LocalSourceParameterNode` This can be used when one wants to consider a (source) parameter node as a local source.	2022-05-10 12:48:42 +00:00
yoff	28b239a9a4	python: add qldoc	2022-05-10 12:48:42 +00:00
yoff	da3634188d	python: various fixes
- sync summary files
- format files
- fix compilation	2022-05-10 12:48:42 +00:00
yoff	f14ee0e794	python: Flow summaries based on type tracking
Two classes have been inserted into the hierarchies:
- `NonLibraryDataFlowCallable` with a method `getACall2`. This method implements "get a call, not considering flow summaries". For `NonLibraryDataFlowCallable`s, `getACall` will defer to `getACall2`. While you could have a synthesised call to such a callable, it would not correspond to a `CallNode`.
- `NonLibraryDataFlowSourceCall` with methods `getArg2` and `getCallable2`. These also refer to a call graph that does not consider flow summaries. `getArg2` is used to synthesise pre-update nodes for arguments. `getCallable2` is used in `connects` to compute argument passing. This is used to define data flow nodes for overflow arguments.
`getACall2` ensures that `LibraryCallableValue::getACall` is not called when the charpred of `FunctionCall` is evaluated.	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	506efcf051	python: refactor `TDataFlowCall`
- Branch predicates are made simple. In particular, they do not try to detect library calls.
- All branches based on `CallNode`s are gathered into one.
- That branch has been given a class `NonSpecialCall`, which is the new parent of call classes based on `CallNode`s. (Those classes now have more involved charpreds.)
- A new such class, `LambdaCall`, has been split out from `FunctionCall` to allow the latter to replace its general `CallNode` field with a specific `FunctionValue` one.
- `NonSpecialCall` is not an abstract class, but it has some abstract overrides. Therefore, it is not considered a resolved call in the test `UnresolvedCalls.qll`.	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	d85844bb89	python: type tracking uses source nodes	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	81ca479ca9	Python: local flow for type tracking
Summary flow is excluded from the local flow relation used for type tracking, but included in the one used for global data flow.	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	177dea5307	python: use new syntax for flow summaries
also convert to inline tests	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	4024ce4777	python: some summary flows	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	8c263b349f	python: add summary flow steps	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	828db3a392	python: Add summary nodes allowing more `OutNode`s (not restricting to `CallNode`s), gives more flow in the `classesCallGraph` test	2022-05-10 12:48:42 +00:00
Rasmus Lerchedahl Petersen	80175a9af5	Python: Compiles and mostly passes tests
- add flowsummaries shared files
- register in identical files
- fix initial non-monotonic recursions
- add DataFlowSourceCall
- add resolvedCall
- add SourceParameterNode
failing tests:
- 3/library-tests/with/test.ql	2022-05-10 12:48:42 +00:00
yoff	6c3e2db7fd	Merge branch 'main' into python/simple-csrf	2022-05-10 10:55:28 +02:00
yoff	b6605bc330	Merge pull request #8634 from RasmusWL/promote-xxe Python: Promote XXE and XML-bomb queries	2022-05-09 21:54:55 +02:00
Rasmus Wriedt Larsen	4a6789182d	Python: Apply suggestions from code review Co-authored-by: yoff <lerchedahl@gmail.com>	2022-05-09 16:37:12 +02:00
Anders Schack-Mulligen	f24364d951	Merge pull request #9045 from hvitved/dataflow/subpaths-perf-take2 Data flow: Speedup `subpaths` predicate (take 2)	2022-05-09 15:39:11 +02:00
Rasmus Wriedt Larsen	36349222a9	Python: Fix casing of `XMLDomParsing`	2022-05-09 11:00:25 +02:00
Rasmus Wriedt Larsen	f22bd039f3	Python: Slight refactor of `LxmlParsing`	2022-05-09 10:56:39 +02:00
yoff	6169ac6122	Merge pull request #7776 from RasmusWL/django-filefield-uploadto Python: Support Django FileField.upload_to	2022-05-05 14:25:08 +02:00
Tom Hvitved	d9d5372f28	Data flow: Sync files	2022-05-05 13:36:26 +02:00
yoff	0c7184952b	Merge pull request #9023 from RasmusWL/positional-docs Python: Clarify `getArg` is about positional arguments	2022-05-05 11:28:17 +02:00
Tom Hvitved	66a9759329	Merge pull request #8870 from hvitved/dataflow/expect-content Data flow: Introduce `expectsContent`	2022-05-05 09:01:40 +02:00
Tom Hvitved	8e33653d25	Merge pull request #9017 from hvitved/dataflow/subpaths-perf Data flow: Speedup `subpaths` predicate	2022-05-04 16:37:52 +02:00
Tom Hvitved	9cb63c0a5e	Data flow: Sync files	2022-05-04 14:49:26 +02:00
Tom Hvitved	74e99302d6	Address review comments	2022-05-04 09:57:59 +02:00
Tom Hvitved	da72ba46d4	Data flow: Add stub `expectsContent` for all languages	2022-05-04 09:57:59 +02:00
Tom Hvitved	6e2e8440eb	Data flow: Sync files	2022-05-04 09:57:59 +02:00
Rasmus Wriedt Larsen	d012eaa892	Python: Clarify `getArg` is about positional arguments	2022-05-03 14:26:23 +02:00
yoff	56ed68b3eb	Merge pull request #9001 from RasmusWL/files-refactoring Python: Flask: Improve `request.files` modeling	2022-05-03 12:19:55 +02:00
Tom Hvitved	e9c8f979f9	Data flow: Sync files	2022-05-03 11:46:51 +02:00
Rasmus Wriedt Larsen	de4390cdf6	Python: Improve Flask `request.files` handling even more	2022-05-02 14:19:45 +02:00
Rasmus Wriedt Larsen	fb0133d276	Python: Fix Flask `request.files` modeling	2022-05-02 14:14:58 +02:00
yoff	1d44694280	Merge pull request #8732 from RasmusWL/dataflow-imports Python: Don't re-export `python` under `DataFlow::`	2022-05-02 12:08:28 +02:00
Taus	231def026f	Merge pull request #8890 from tausbn/python-add-global-attribute-writes Python: Add support for global attribute writes	2022-05-02 12:03:41 +02:00
Rasmus Wriedt Larsen	714465bf39	Python: Refactor `SaxParserSetFeatureCall` Originally made by @erik-krogh in https://github.com/github/codeql/pull/8693/files#diff-9627c1fb9a1cc77fb93e6b7e31af1a4fa908f2a60362cfb34377d24debb97398 Could not be applied directly to this PR, since this PR deletes the file.	2022-05-02 11:29:54 +02:00
Rasmus Wriedt Larsen	5f01fc24e4	Merge branch 'main' into promote-xxe	2022-05-02 11:25:55 +02:00
Taus	95d235416c	Python: Fix bad antijoin in `getAKeyword` Before: ``` Tuple counts for Exprs::Call::getAKeyword_dispred#ff#antijoin_rhs/3@7bc202ij after 9s: 1 ~0% {1} r1 = CONSTANT(unique int)[2] 4244385 ~2% {1} r2 = JOIN r1 WITH py_dict_items_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0' 4244352 ~3% {3} r3 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg1', Lhs.0 'arg0', Rhs.2 'arg2' 66618690 ~3% {5} r4 = JOIN r3 WITH AstGenerated::Call_::getNamedArg_dispred#ffb ON FIRST 1 OUTPUT Lhs.1 'arg0', Lhs.0 'arg1', Lhs.2 'arg2', Rhs.1, Rhs.2 31187133 ~0% {5} r5 = SELECT r4 ON In.3 < In.2 'arg2' 31187133 ~1% {5} r6 = SCAN r5 OUTPUT In.4, 0, In.0 'arg0', In.1 'arg1', In.2 'arg2' 0 ~0% {3} r7 = JOIN r6 WITH py_dict_items ON FIRST 2 OUTPUT Lhs.2 'arg0', Lhs.3 'arg1', Lhs.4 'arg2' return r7 Tuple counts for Exprs::Call::getAKeyword_dispred#ff/2@1dc9468b after 421ms: 1 ~0% {1} r1 = CONSTANT(unique int)[2] 4244385 ~2% {1} r2 = JOIN r1 WITH py_dict_items_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result' 4244352 ~0% {3} r3 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Lhs.0 'result', Rhs.1 'this', Rhs.2 4244352 ~0% {3} r4 = r3 AND NOT Exprs::Call::getAKeyword_dispred#ff#antijoin_rhs(Lhs.0 'result', Lhs.1 'this', Lhs.2) 4244352 ~6% {2} r5 = SCAN r4 OUTPUT In.1 'this', In.0 'result' return r5 ``` Oof. All that work to produce zero tuples. Luckily we can improve matters somewhat. Basically, there's no reason to test _all_ dictionary unpackings, since we're only interested in a lower bound. Thus, we can use `min` instead which is much more efficient. For convenience I factored this into its own (private) helper predicate. 
Now the tuple counts look as follows: ``` Tuple counts for Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_range/2@39b0e9sm after 1ms: 246 ~0% {2} r1 = JOIN Keywords::DictUnpackingOrKeyword#class#f#shared WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0', Rhs.2 'arg1' return r1 Registering Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_range/2@39b0e9sm + with content 9ea2f123k8necpu015v6tpsc2t1 >>> Created relation Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_range/2@39b0e9sm with 246 rows. Starting to evaluate predicate Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_term/3@9f4ca5g8 Tuple counts for Exprs::Call::getMinimumUnpackingIndex_dispred#ff#min_term/3@9f4ca5g8 after 0ms: 246 ~2% {3} r1 = JOIN Keywords::DictUnpackingOrKeyword#class#f#shared WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'arg0', Rhs.2 'arg2', Rhs.2 'arg2' return r1 Tuple counts for Exprs::Call::getAKeyword_dispred#ff/2@000a0alb after 906ms: 1 ~0% {1} r1 = CONSTANT(unique int)[2] 4244385 ~2% {1} r2 = JOIN r1 WITH py_dict_items_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'result' 4244352 ~0% {3} r3 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Lhs.0 'result', Rhs.1 'this', Rhs.2 4244280 ~0% {3} r4 = r3 AND NOT Exprs::Call::getMinimumUnpackingIndex_dispred#ff_0#antijoin_rhs(Lhs.1 'this') 4244280 ~6% {2} r5 = SCAN r4 OUTPUT In.1 'this', In.0 'result' 4244352 ~3% {3} r6 = JOIN r2 WITH AstGenerated::Call_::getNamedArg_dispred#ffb_201#join_rhs ON FIRST 1 OUTPUT Rhs.1 'this', Lhs.0 'result', Rhs.2 72 ~4% {4} r7 = JOIN r6 WITH Exprs::Call::getMinimumUnpackingIndex_dispred#ff ON FIRST 1 OUTPUT Lhs.1 'result', Lhs.0 'this', Lhs.2, Rhs.1 72 ~4% {4} r8 = SELECT r7 ON In.2 <= In.3 72 ~0% {2} r9 = SCAN r8 OUTPUT In.1 'this', In.0 'result' 4244352 ~6% {2} r10 = r5 UNION r9 return r10 ``` This is not the perfect join order (note the similarity between `r3` and `r6`) 
but overall it's a win.	2022-04-28 11:11:37 +00:00
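The `min` rewrite described in commit 95d235416c can be illustrated outside QL. The following is a minimal Python sketch, not the CodeQL implementation: the `(kind, name)` tuple model of a call's named arguments and all function names are assumptions made for illustration. It contrasts the pairwise check (for every keyword, look for any earlier dictionary unpacking) with computing the lowest unpacking index once and comparing against it.

```python
from typing import List, Optional, Tuple

# Hypothetical model: each named argument is (kind, name), where kind is
# "keyword" for `name=value` and "unpack" for a `**mapping` unpacking.
NamedArg = Tuple[str, str]


def keywords_pairwise(named_args: List[NamedArg]) -> List[str]:
    """Pairwise shape: each keyword is checked against every unpacking."""
    result = []
    for i, (kind, name) in enumerate(named_args):
        if kind != "keyword":
            continue
        preceded = any(
            other_kind == "unpack" and j < i
            for j, (other_kind, _) in enumerate(named_args)
        )
        if not preceded:
            result.append(name)
    return result


def minimum_unpacking_index(named_args: List[NamedArg]) -> Optional[int]:
    """Lowest index of an unpacking, or None if the call has none."""
    indices = [i for i, (kind, _) in enumerate(named_args) if kind == "unpack"]
    return min(indices) if indices else None


def keywords_with_min(named_args: List[NamedArg]) -> List[str]:
    """Lower-bound shape: compare each index against one precomputed minimum."""
    lower = minimum_unpacking_index(named_args)
    return [
        name
        for i, (kind, name) in enumerate(named_args)
        if kind == "keyword" and (lower is None or i <= lower)
    ]
```

Both functions return the same keywords; the second only needs one small per-call value (the minimum index), which mirrors why the helper predicate in the commit produces a few hundred rows instead of tens of millions of intermediate tuples.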
Taus	80ef09f034	Python: Fix bad join in `declaredAttributeVar` Before: ``` Tuple counts for PointsTo::declaredAttributeVar#fbf/3@99d5aenq after 1.1s: 451054 ~7% {2} r1 = SCAN variable OUTPUT In.0, In.2 'name' 1296149 ~0% {2} r2 = JOIN r1 WITH Essa::EssaVariable::getSourceVariable_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.1 'name' 12179900 ~4% {3} r3 = JOIN r2 WITH Essa::EssaVariable::getAUse_dispred#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'name', Lhs.0 'var' 8028 ~2% {3} r4 = JOIN r3 WITH Scope::Scope::getANormalExit_dispred#bf_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'name', Lhs.2 'var' 8028 ~2% {3} r5 = JOIN r4 WITH Classes::PythonClassObjectInternal::getScope_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'cls', Lhs.1 'name', Lhs.2 'var' return r5 ``` After: ``` Tuple counts for PointsTo::declaredAttributeVar#fbf/3@cccf36hb after 4ms: 1450 ~0% {2} r1 = SCAN Classes::PythonClassObjectInternal::getScope_dispred#ff OUTPUT In.1, In.0 'cls' 1450 ~7% {2} r2 = JOIN r1 WITH Scope::Scope::getANormalExit_dispred#bf ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'cls' 8028 ~0% {2} r3 = JOIN r2 WITH Essa::EssaVariable::getAUse_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1 'var', Lhs.1 'cls' 8028 ~0% {3} r4 = JOIN r3 WITH Essa::EssaVariable::getSourceVariable_dispred#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1 'cls', Lhs.0 'var' 8028 ~2% {3} r5 = JOIN r4 WITH variable ON FIRST 1 OUTPUT Lhs.1 'cls', Rhs.2 'name', Lhs.2 'var' return r5 ```	2022-04-28 11:11:37 +00:00
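The effect of join order seen in the tuple counts for commit 80ef09f034 can be reproduced on toy data. This is a hedged Python sketch, not how the QL evaluator works: the relation shapes (`uses` as `(variable, node)` pairs, `class_exits` as `(node, class)` pairs) and the function names are hypothetical, and the lookup tables stand in for indexes a database engine would be assumed to have available. Both orders compute the same result, but one is driven by the large relation and the other by the small one.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Use = Tuple[str, int]        # (variable, node id where it is used)
ClassExit = Tuple[int, str]  # (exit node id, class name)


def index_by(pairs, key_pos: int) -> Dict:
    """Build a lookup from the component at key_pos to the other component."""
    index = defaultdict(list)
    for pair in pairs:
        index[pair[key_pos]].append(pair[1 - key_pos])
    return index


def uses_first(uses: List[Use], class_exits: List[ClassExit]):
    """Drive the join from the large `uses` relation."""
    exit_to_classes = index_by(class_exits, 0)
    driven = 0
    result: Set[Tuple[str, str]] = set()
    for var, node in uses:
        driven += 1  # every use is visited, even if it is not at a class exit
        for cls in exit_to_classes.get(node, []):
            result.add((cls, var))
    return result, driven


def exits_first(uses: List[Use], class_exits: List[ClassExit]):
    """Drive the join from the small `class_exits` relation."""
    node_to_vars = index_by(uses, 1)
    driven = 0
    result: Set[Tuple[str, str]] = set()
    for node, cls in class_exits:
        driven += 1  # only class exits are visited
        for var in node_to_vars.get(node, []):
            result.add((cls, var))
    return result, driven
```

Calling both functions on the same inputs returns equal result sets, while the `driven` counter shows how many tuples each order had to walk through first, which is the quantity the "before" and "after" tuple counts in the commit message differ on.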
Taus	d28f9f41e8	Python: Fix bad join in `import_star_read` Makes this ``` (21s) Tuple counts for DataFlowPublic::import_star_read#ff/2@fcd5e6nr after 8.5s: 9743 ~6% {3} r1 = SCAN num#DataFlowPublic::TModuleVariableNode#fff OUTPUT In.1, In.0, In.2 'result' 9743 ~1% {3} r2 = JOIN r1 WITH Variables::Variable::getId_dispred#ff ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2 'result' 390808917 ~3% {3} r3 = JOIN r2 WITH Flow::NameNode::getId_dispred#ff_10#join_rhs ON FIRST 1 OUTPUT Rhs.1, Lhs.1, Lhs.2 'result' 307 ~0% {2} r4 = JOIN r3 WITH ImportStar::ImportStar::importStarResolvesTo#ff ON FIRST 2 OUTPUT Lhs.0, Lhs.2 'result' 307 ~0% {2} r5 = JOIN r4 WITH num#DataFlowPublic::TCfgNode#ff ON FIRST 1 OUTPUT Rhs.1 'n', Lhs.1 'result' return r5 ``` become this ``` (17s) Tuple counts for DataFlowPublic::resolved_import_star_module#fff/3@f5e84aic after 0ms: 307 ~0% {3} r1 = JOIN ImportStar::ImportStar::importStarResolvesTo#ff WITH num#DataFlowPublic::TCfgNode#ff ON FIRST 1 OUTPUT Lhs.0, Lhs.1 'm', Rhs.1 'n' 307 ~0% {3} r2 = JOIN r1 WITH Flow::NameNode::getId_dispred#ff ON FIRST 1 OUTPUT Lhs.1 'm', Rhs.1 'name', Lhs.2 'n' return r2 (17s) Registering DataFlowPublic::resolved_import_star_module#fff/3@f5e84aic + with content f29281ig38r98icro4ege09mrva (17s) >>> Created relation DataFlowPublic::resolved_import_star_module#fff/3@f5e84aic with 307 rows. (17s) Starting to evaluate predicate DataFlowPublic::import_star_read#ff/2@57b0c06e (17s) Tuple counts for DataFlowPublic::import_star_read#ff/2@57b0c06e after 2ms: 9743 ~0% {3} r1 = SCAN num#DataFlowPublic::TModuleVariableNode#fff OUTPUT In.1, In.0, In.2 'result' 9743 ~0% {3} r2 = JOIN r1 WITH Variables::Variable::getId_dispred#ff ON FIRST 1 OUTPUT Lhs.1, Rhs.1, Lhs.2 'result' 307 ~0% {2} r3 = JOIN r2 WITH DataFlowPublic::resolved_import_star_module#fff ON FIRST 2 OUTPUT Rhs.2 'n', Lhs.2 'result' return r3 ```	2022-04-28 11:11:37 +00:00
yoff	4553a0913f	Merge pull request #8897 from tausbn/python-fix-bad-methodcallsite-join Python: Fix bad join in `MethodCallsiteRefinement`	2022-04-28 12:17:33 +02:00
Taus	b4a31e572f	Python: Add global attribute writes	2022-04-27 16:45:00 +00:00
yoff	39753d5a0b	Merge pull request #8693 from erik-krogh/pyApi PY: more API-graphs refactorings	2022-04-27 13:19:50 +02:00
Taus	d3a05b8b7e	Python: Fix bad join in `MethodCallsiteRefinement` Observed on `FreeCAD/FreeCAD`: ``` Tuple counts for Essa::MethodCallsiteRefinement#24e22a14#f/1@274967ic after 34.5s: 638284 ~0% {2} r1 = SCAN Essa::TEssaNodeRefinement#24e22a14#ffff OUTPUT In.0, In.3 'this' 636521 ~0% {2} r2 = r1 AND NOT Essa::SingleSuccessorGuard#class#24e22a14#f(Lhs.1 'this') 1579493668 ~0% {2} r3 = JOIN r2 WITH SsaDefinitions::SsaSource::method_call_refinement#9197156e#fff ON FIRST 1 OUTPUT Lhs.1 'this', Rhs.2 266673 ~3% {1} r4 = JOIN r3 WITH Essa::EssaNodeRefinement::getDefiningNode#dispred#f0820431#ff ON FIRST 2 OUTPUT Lhs.0 'this' return r4 ``` After a bit of unbinding, we have: ``` Tuple counts for Essa::MethodCallsiteRefinement#24e22a14#f/1@d73d8e27 after 66ms: 215168 ~1% {2} r1 = SCAN Definitions::SsaSourceVariable#class#486534ab#f OUTPUT In.0, In.0 283965 ~2% {2} r2 = JOIN r1 WITH SsaDefinitions::SsaSource::method_call_refinement#9197156e#fff ON FIRST 1 OUTPUT Rhs.2, Lhs.1 401274 ~0% {2} r3 = JOIN r2 WITH Essa::EssaNodeRefinement::getDefiningNode#dispred#f0820431#ff_10#join_rhs ON FIRST 1 OUTPUT Lhs.1, Rhs.1 'this' 266671 ~2% {1} r4 = JOIN r3 WITH Essa::TEssaNodeRefinement#24e22a14#ffff_03#join_rhs ON FIRST 2 OUTPUT Lhs.1 'this' 266671 ~2% {1} r5 = r4 AND NOT Essa::SingleSuccessorGuard#class#24e22a14#f(Lhs.0 'this') return r5 ``` (I'm somewhat confused about the slight difference in tuples, but it's probably just because the compiler moved some stuff around.)	2022-04-27 11:13:37 +00:00
Erik Krogh Kristensen	e1c7d369be	Merge pull request #8796 from erik-krogh/redundantImport Remove redundant imports	2022-04-27 12:39:51 +02:00
yoff	9d774463f5	Merge pull request #8859 from tausbn/python-fix-bad-essa-joins Python: Fix a bunch of bad joins	2022-04-27 12:27:50 +02:00
Erik Krogh Kristensen	d389012b75	Merge branch 'main' into redundantImport	2022-04-26 14:24:51 +02:00
Taus	b2cc91369a	Python: Fix bad join in `firstUse` This was what it looked like (at the point when I killed the evaluation): ``` Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#c5fa2be7#ff/2@i1#be98bwif after 1m50s: 274000 ~7% {4} r1 = SCAN SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#c5fa2be7#ffff OUTPUT In.1, In.0 'def', In.2, In.3 2731768000 ~1% {7} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::variableSourceUse#c5fa2be7#ffff ON FIRST 1 OUTPUT Rhs.0, Lhs.2, Lhs.3, Rhs.2, Rhs.3, Rhs.1 'use', Lhs.1 'def' 178000 ~4% {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#c5fa2be7#fffff ON FIRST 5 OUTPUT Lhs.6 'def', Lhs.5 'use' return r3 ``` And this is what it looks like now: ``` Tuple counts for SsaCompute::SsaComputeImpl::AdjacentUsesImpl::firstUse#c5fa2be7#ff/2@i1#f9d6ewsi after 207ms: 931353 ~2% {4} r1 = SCAN SsaCompute::SsaComputeImpl::AdjacentUsesImpl::variableSourceUse#c5fa2be7#ffff OUTPUT In.0, In.2, In.3, In.1 'use' 1050477 ~0% {4} r2 = JOIN r1 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::adjacentVarRefs#c5fa2be7#fffff_03412#join_rhs ON FIRST 3 OUTPUT Lhs.0, Rhs.3, Rhs.4, Lhs.3 'use' 506626 ~0% {2} r3 = JOIN r2 WITH SsaCompute::SsaComputeImpl::AdjacentUsesImpl::definesAt#c5fa2be7#ffff_1230#join_rhs ON FIRST 3 OUTPUT Rhs.3 'def', Lhs.3 'use' return r3 ```	2022-04-25 14:33:31 +00:00

726 Commits