mirror of
https://github.com/github/codeql.git
synced 2026-04-29 02:35:15 +02:00
This QL
```codeql
import python
import semmle.python.dataflow.TaintTracking
import semmle.python.security.strings.Untrusted
from CollectionKind ck
where
ck.(DictKind).getMember() instanceof StringKind
or
ck.getMember().(DictKind).getMember() instanceof StringKind
select ck, ck.getAQlClass(), ck.getMember().getAQlClass()
```
generates these 6 results.
```
1 {externally controlled string} ExternalStringDictKind UntrustedStringKind
2 {externally controlled string} StringDictKind UntrustedStringKind
3 [{externally controlled string}] SequenceKind ExternalStringDictKind
4 [{externally controlled string}] SequenceKind StringDictKind
5 {{externally controlled string}} DictKind ExternalStringDictKind
6 {{externally controlled string}} DictKind StringDictKind
```
StringDictKind was only used in *one* place in our library code. As illustrated
above, it pollutes our set of TaintKinds. Effectively, every time we make a
flow-step for dictionaries with tainted strings as values, we do it TWICE --
once for ExternalStringDictKind, and once for StringDictKind... that is just a
waste.