Compare commits

4 Commits

| Author | SHA1 | Message | Date |
| Henry Mercer | a1af496216 | ATM support doc: Update email address | 2021-07-13 10:19:54 +01:00 |
| Henry Mercer | 70b0fe38e3 | Docs: Fix punctation in section header | 2021-07-13 10:19:53 +01:00 |
| Ian Wright | a5a3c047a8 | correct "known endpoints" to "candidate endpoints" | 2021-07-13 10:19:53 +01:00 |
| Henry Mercer | 1d05f98eb6 | ATM: Initial commit of ATM for JavaScript beta | 2021-07-13 10:19:53 +01:00 |
115 changed files with 2806 additions and 493 deletions

@@ -2,7 +2,7 @@
We welcome contributions to our CodeQL libraries and queries. Got an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request! Contributions to this project are [released](https://help.github.com/articles/github-terms-of-service/#6-contributions-under-repository-license) to the public under the [project's open source license](LICENSE).
There is lots of useful documentation to help you write queries, ranging from information about query file structure to tutorials for specific target languages. For more information on the documentation available, see [CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/writing-queries.html) on [help.semmle.com](https://help.semmle.com).
There is lots of useful documentation to help you write queries, ranging from information about query file structure to tutorials for specific target languages. For more information on the documentation available, see [Writing CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/writing-queries.html) on [help.semmle.com](https://help.semmle.com).
## Submitting a new experimental query
@@ -20,7 +20,7 @@ If you have an idea for a query that you would like to share with other CodeQL u
* Python: `python/ql/src`
Each language-specific directory contains further subdirectories that group queries based on their `@tags` or purpose.
- Experimental queries and libraries are stored in the `experimental` subdirectory within each language-specific directory in the [CodeQL repository](https://github.com/github/codeql). For example, experimental Java queries and libraries are stored in `java/ql/src/experimental` and any corresponding tests in `java/ql/test/experimental`.
- Experimental queries and libraries are stored in the `experimental` subdirectory within each language-specific directory in the [CodeQL repository](https://github.com/Semmle/ql). For example, experimental Java queries and libraries are stored in `java/ql/src/experimental` and any corresponding tests in `java/ql/test/experimental`.
- The structure of an `experimental` subdirectory mirrors the structure of its parent directory.
- Select or create an appropriate directory in `experimental` based on the existing directory structure of `experimental` or its parent directory.
@@ -32,11 +32,11 @@ If you have an idea for a query that you would like to share with other CodeQL u
For details, see the [guide on query metadata](docs/query-metadata-style-guide.md).
Make sure the `select` statement is compatible with the query `@kind`. See [About CodeQL queries](https://help.semmle.com/QL/learn-ql/writing-queries/introduction-to-queries.html#select-clause) on help.semmle.com.
Make sure the `select` statement is compatible with the query `@kind`. See [Introduction to query files](https://help.semmle.com/QL/learn-ql/writing-queries/introduction-to-queries.html#select-clause) on help.semmle.com.
3. **Formatting**
- The queries and libraries must be autoformatted, for example using the "Format Document" command in [CodeQL for Visual Studio Code](https://help.semmle.com/codeql/codeql-for-vscode/procedures/about-codeql-for-vscode.html).
- The queries and libraries must be [autoformatted](https://help.semmle.com/codeql/codeql-for-vscode/reference/editor.html#autoformatting).
4. **Compilation**

@@ -9,7 +9,7 @@ You can use the [interactive query console](https://lgtm.com/help/lgtm/using-que
## Contributing
We welcome contributions to our standard library and standard checks. Do you have an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request! Before you do, though, please take the time to read our [contributing guidelines](CONTRIBUTING.md). You can also consult our [style guides](https://github.com/github/codeql/tree/master/docs) to learn how to format your code for consistency and clarity, how to write query metadata, and how to write query help documentation for your query.
We welcome contributions to our standard library and standard checks. Do you have an idea for a new check, or how to improve an existing query? Then please go ahead and open a pull request! Before you do, though, please take the time to read our [contributing guidelines](CONTRIBUTING.md). You can also consult our [style guides](https://github.com/Semmle/ql/tree/master/docs) to learn how to format your code for consistency and clarity, how to write query metadata, and how to write query help documentation for your query.
## License

@@ -598,7 +598,7 @@ module FlowVar_internal {
private predicate largeVariable(Variable v, int liveBlocks, int defs) {
liveBlocks = strictcount(SubBasicBlock sbb | variableLiveInSBB(sbb, v)) and
defs = strictcount(SubBasicBlock sbb | exists(TBlockVar(sbb, v))) and
liveBlocks.(float) * defs.(float) > 10000.0
liveBlocks * defs > 1000000
}
/**
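
The two comparison lines in this hunk differ in more than the threshold: one multiplies the counts as `float`, the other as `int`. The following is a minimal C++ sketch of why that can matter, assuming the counts behave like 32-bit signed integers. That width is an assumption, and the helper names `productFitsInInt32`, `isLargeAsFloat` and `selfCheck` are hypothetical and do not appear in the library.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: reports whether liveBlocks * defs fits in a 32-bit
// signed int. The product is computed in 64 bits so the check cannot overflow.
bool productFitsInInt32(std::int32_t liveBlocks, std::int32_t defs) {
    const std::int64_t wide = static_cast<std::int64_t>(liveBlocks) * defs;
    return wide >= std::numeric_limits<std::int32_t>::min() &&
           wide <= std::numeric_limits<std::int32_t>::max();
}

// Hypothetical helper mirroring the floating-point form of the comparison:
// both counts are converted before multiplying, so the product cannot wrap.
bool isLargeAsFloat(std::int32_t liveBlocks, std::int32_t defs, double threshold) {
    return static_cast<double>(liveBlocks) * static_cast<double>(defs) > threshold;
}

// Small self-check showing the two helpers on the same inputs.
void selfCheck() {
    assert(!productFitsInInt32(100000, 100000)); // 10^10 exceeds the int32 range
    assert(isLargeAsFloat(100000, 100000, 10000.0));
}
```

Under these assumptions, a pair of counts such as 100000 and 100000 cannot be multiplied in 32 bits, while the floating-point form still compares as expected.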

@@ -77,48 +77,49 @@ private module VirtualDispatch {
// Local flow
DataFlow::localFlowStep(src, other) and
allowFromArg = allowOtherFromArg
or
// Flow from global variable to load.
exists(LoadInstruction load, GlobalOrNamespaceVariable var |
var = src.asVariable() and
other.asInstruction() = load and
// The `allowFromArg` concept doesn't play a role when `src` is a
// global variable, so we just set it to a single arbitrary value for
// performance.
allowFromArg = true
|
// Load directly from the global variable
load.getSourceAddress().(VariableAddressInstruction).getASTVariable() = var
or
// Load from a field on a global union
exists(FieldAddressInstruction fa |
fa = load.getSourceAddress() and
fa.getObjectAddress().(VariableAddressInstruction).getASTVariable() = var and
fa.getField().getDeclaringType() instanceof Union
)
or
// Flow through global variable
exists(StoreInstruction store |
store = src.asInstruction() and
(
exists(Variable var |
var = store.getDestinationAddress().(VariableAddressInstruction).getASTVariable() and
this.flowsFromGlobal(var)
)
)
or
// Flow from store to global variable. These cases are similar to the
// above but have `StoreInstruction` instead of `LoadInstruction` and
// have the roles swapped between `other` and `src`.
exists(StoreInstruction store, GlobalOrNamespaceVariable var |
var = other.asVariable() and
store = src.asInstruction() and
// Setting `allowFromArg` to `true` like in the base case means we
// treat a store to a global variable like the dispatch itself: flow
// may come from anywhere.
allowFromArg = true
|
// Store directly to the global variable
store.getDestinationAddress().(VariableAddressInstruction).getASTVariable() = var
or
// Store to a field on a global union
exists(FieldAddressInstruction fa |
fa = store.getDestinationAddress() and
fa.getObjectAddress().(VariableAddressInstruction).getASTVariable() = var and
fa.getField().getDeclaringType() instanceof Union
exists(Variable var, FieldAccess a |
var =
store
.getDestinationAddress()
.(FieldAddressInstruction)
.getObjectAddress()
.(VariableAddressInstruction)
.getASTVariable() and
this.flowsFromGlobalUnionField(var, a)
)
)
) and
allowFromArg = true
)
}
private predicate flowsFromGlobal(GlobalOrNamespaceVariable var) {
exists(LoadInstruction load |
this.flowsFrom(DataFlow::instructionNode(load), _) and
load.getSourceAddress().(VariableAddressInstruction).getASTVariable() = var
)
}
private predicate flowsFromGlobalUnionField(Variable var, FieldAccess a) {
a.getTarget().getDeclaringType() instanceof Union and
exists(LoadInstruction load |
this.flowsFrom(DataFlow::instructionNode(load), _) and
load
.getSourceAddress()
.(FieldAddressInstruction)
.getObjectAddress()
.(VariableAddressInstruction)
.getASTVariable() = var
)
}
}

@@ -1,13 +1,13 @@
#include "shared.h"
int atoi(const char *nptr);
char *getenv(const char *name);
char *strcat(char * s1, const char * s2);
char *strdup(const char *);
char *_strdup(const char *);
char *unmodeled_function(const char *);
void sink(const char *);
void sink(int);
int main(int argc, char *argv[]) {

@@ -1,35 +0,0 @@
#include "shared.h"
using SinkFunction = void (*)(int);
void notSink(int notSinkParam);
void callsSink(int sinkParam) {
sink(sinkParam);
}
struct {
SinkFunction sinkPtr, notSinkPtr;
} globalStruct;
union {
SinkFunction sinkPtr, notSinkPtr;
} globalUnion;
SinkFunction globalSinkPtr;
void assignGlobals() {
globalStruct.sinkPtr = callsSink;
globalUnion.sinkPtr = callsSink;
globalSinkPtr = callsSink;
};
void testStruct() {
globalStruct.sinkPtr(atoi(getenv("TAINTED"))); // should reach sinkParam [NOT DETECTED]
globalStruct.notSinkPtr(atoi(getenv("TAINTED"))); // shouldn't reach sinkParam
globalUnion.sinkPtr(atoi(getenv("TAINTED"))); // should reach sinkParam
globalUnion.notSinkPtr(atoi(getenv("TAINTED"))); // should reach sinkParam
globalSinkPtr(atoi(getenv("TAINTED"))); // should reach sinkParam
}
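
The comments in `testStruct` expect both union calls to reach `sinkParam` but only one of the struct calls. A minimal sketch of the reason, assuming the usual C++ layout rules: members of a union share one storage location, while members of a struct occupy distinct ones. The type and function names below (`PairStruct`, `PairUnion`, `structFieldsShareStorage`, `unionFieldsShareStorage`) are hypothetical and not part of the test file.

```cpp
#include <cassert>

using SinkFunction = void (*)(int);

// Hypothetical named counterparts of the anonymous globals in the test.
struct PairStruct { SinkFunction sinkPtr, notSinkPtr; };
union PairUnion { SinkFunction sinkPtr, notSinkPtr; };

// Struct members are laid out one after another, so their addresses differ.
bool structFieldsShareStorage() {
    PairStruct s{};
    return static_cast<void*>(&s.sinkPtr) == static_cast<void*>(&s.notSinkPtr);
}

// Union members all start at the address of the union object itself.
bool unionFieldsShareStorage() {
    PairUnion u{};
    return static_cast<void*>(&u.sinkPtr) == static_cast<void*>(&u.notSinkPtr);
}
```

Because a store to one union member occupies the same storage as the other, an analysis that treats a global union as a single location will report flow through either field name, which is consistent with the expectations written in the test comments.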

@@ -1,6 +1,4 @@
| globals.cpp:13:15:13:20 | call to getenv | globals.cpp:12:10:12:16 | (const char *)... | global1 |
| globals.cpp:13:15:13:20 | call to getenv | globals.cpp:2:17:2:25 | sinkParam | global1 |
| globals.cpp:13:15:13:20 | call to getenv | globals.cpp:12:10:12:16 | global1 | global1 |
| globals.cpp:13:15:13:20 | call to getenv | shared.h:5:23:5:31 | sinkparam | global1 |
| globals.cpp:23:15:23:20 | call to getenv | globals.cpp:19:10:19:16 | (const char *)... | global2 |
| globals.cpp:23:15:23:20 | call to getenv | globals.cpp:2:17:2:25 | sinkParam | global2 |
| globals.cpp:23:15:23:20 | call to getenv | globals.cpp:19:10:19:16 | global2 | global2 |
| globals.cpp:23:15:23:20 | call to getenv | shared.h:5:23:5:31 | sinkparam | global2 |

@@ -1,5 +1,5 @@
#include "shared.h"
char * getenv(const char *);
void sink(char *sinkParam);
void throughLocal() {
char * local = getenv("VAR");

@@ -1,14 +0,0 @@
// Common declarations in this test dir should go in this file. Otherwise, some
// declarations will have multiple locations, which leads to confusing test
// output.
void sink(const char *sinkparam);
void sink(int sinkparam);
int atoi(const char *nptr);
char *getenv(const char *name);
char *strcat(char * s1, const char * s2);
char *strdup(const char *string);
char *_strdup(const char *string);
char *unmodeled_function(const char *const_string);

@@ -1,18 +1,22 @@
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:6:15:6:24 | p#0 |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:16:8:16:14 | call to _strdup |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:16:8:16:29 | (const char *)... |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:16:16:16:21 | call to getenv |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:16:16:16:28 | (const char *)... |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | shared.h:13:27:13:32 | string |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | defaulttainttracking.cpp:5:14:5:23 | p#0 |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | defaulttainttracking.cpp:17:8:17:13 | call to strdup |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | defaulttainttracking.cpp:17:8:17:28 | (const char *)... |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | defaulttainttracking.cpp:17:15:17:20 | call to getenv |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | defaulttainttracking.cpp:17:15:17:27 | (const char *)... |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | shared.h:12:26:12:31 | string |
| defaulttainttracking.cpp:17:15:17:20 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:18:27:18:32 | call to getenv | defaulttainttracking.cpp:7:26:7:35 | p#0 |
| defaulttainttracking.cpp:18:27:18:32 | call to getenv | defaulttainttracking.cpp:18:27:18:32 | call to getenv |
| defaulttainttracking.cpp:18:27:18:32 | call to getenv | defaulttainttracking.cpp:18:27:18:39 | (const char *)... |
| defaulttainttracking.cpp:18:27:18:32 | call to getenv | shared.h:14:38:14:49 | const_string |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:3:38:3:39 | s2 |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:22:8:22:13 | call to strcat |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:22:8:22:33 | (const char *)... |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:22:20:22:25 | call to getenv |
@@ -20,8 +24,7 @@
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:24:8:24:10 | (const char *)... |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:24:8:24:10 | array to pointer conversion |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:24:8:24:10 | buf |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | shared.h:10:38:10:39 | s2 |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:38:25:38:30 | call to getenv | defaulttainttracking.cpp:31:40:31:53 | dotted_address |
| defaulttainttracking.cpp:38:25:38:30 | call to getenv | defaulttainttracking.cpp:32:11:32:26 | p#0 |
| defaulttainttracking.cpp:38:25:38:30 | call to getenv | defaulttainttracking.cpp:38:11:38:21 | env_pointer |
@@ -32,37 +35,42 @@
| defaulttainttracking.cpp:38:25:38:30 | call to getenv | defaulttainttracking.cpp:39:36:39:61 | (const char *)... |
| defaulttainttracking.cpp:38:25:38:30 | call to getenv | defaulttainttracking.cpp:39:50:39:61 | & ... |
| defaulttainttracking.cpp:38:25:38:30 | call to getenv | defaulttainttracking.cpp:40:10:40:10 | a |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:45:20:45:29 | p#0 |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:57:24:57:24 | p |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:58:14:58:14 | p |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:64:10:64:15 | call to getenv |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:64:10:64:22 | (const char *)... |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | defaulttainttracking.cpp:57:24:57:24 | p |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | defaulttainttracking.cpp:58:14:58:14 | p |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | defaulttainttracking.cpp:66:17:66:22 | call to getenv |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | defaulttainttracking.cpp:66:17:66:29 | (const char *)... |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:66:17:66:22 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | defaulttainttracking.cpp:57:24:57:24 | p |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | defaulttainttracking.cpp:58:14:58:14 | p |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | defaulttainttracking.cpp:67:28:67:33 | call to getenv |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | defaulttainttracking.cpp:67:28:67:40 | (const char *)... |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:67:28:67:33 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | defaulttainttracking.cpp:57:24:57:24 | p |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | defaulttainttracking.cpp:58:14:58:14 | p |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | defaulttainttracking.cpp:68:29:68:34 | call to getenv |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | defaulttainttracking.cpp:68:29:68:41 | (const char *)... |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:68:29:68:34 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | defaulttainttracking.cpp:57:24:57:24 | p |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | defaulttainttracking.cpp:58:14:58:14 | p |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | defaulttainttracking.cpp:69:33:69:38 | call to getenv |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | defaulttainttracking.cpp:69:33:69:45 | (const char *)... |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:69:33:69:38 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:72:11:72:16 | call to getenv | defaulttainttracking.cpp:45:20:45:29 | p#0 |
| defaulttainttracking.cpp:72:11:72:16 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p |
| defaulttainttracking.cpp:72:11:72:16 | call to getenv | defaulttainttracking.cpp:72:11:72:16 | call to getenv |
@@ -79,77 +87,54 @@
| defaulttainttracking.cpp:77:34:77:39 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p |
| defaulttainttracking.cpp:77:34:77:39 | call to getenv | defaulttainttracking.cpp:77:34:77:39 | call to getenv |
| defaulttainttracking.cpp:77:34:77:39 | call to getenv | defaulttainttracking.cpp:77:34:77:46 | (const char *)... |
| defaulttainttracking.cpp:79:30:79:35 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:79:30:79:35 | call to getenv | defaulttainttracking.cpp:57:24:57:24 | p |
| defaulttainttracking.cpp:79:30:79:35 | call to getenv | defaulttainttracking.cpp:58:14:58:14 | p |
| defaulttainttracking.cpp:79:30:79:35 | call to getenv | defaulttainttracking.cpp:79:30:79:35 | call to getenv |
| defaulttainttracking.cpp:79:30:79:35 | call to getenv | defaulttainttracking.cpp:79:30:79:42 | (const char *)... |
| defaulttainttracking.cpp:79:30:79:35 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:79:30:79:35 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:84:17:84:17 | t |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:8:88:16 | call to move |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:8:88:32 | (const char *)... |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:8:88:32 | (reference dereference) |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:18:88:23 | call to getenv |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:18:88:30 | (reference to) |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:91:42:91:44 | arg |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:92:12:92:14 | arg |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:96:11:96:12 | p2 |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:97:27:97:32 | call to getenv |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:98:10:98:11 | (const char *)... |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:98:10:98:11 | p2 |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| dispatch.cpp:28:29:28:34 | call to getenv | dispatch.cpp:28:24:28:27 | call to atoi |
| dispatch.cpp:28:29:28:34 | call to getenv | dispatch.cpp:28:29:28:34 | call to getenv |
| dispatch.cpp:28:29:28:34 | call to getenv | dispatch.cpp:28:29:28:45 | (const char *)... |
| dispatch.cpp:28:29:28:34 | call to getenv | shared.h:8:22:8:25 | nptr |
| dispatch.cpp:29:32:29:37 | call to getenv | dispatch.cpp:29:27:29:30 | call to atoi |
| dispatch.cpp:29:32:29:37 | call to getenv | dispatch.cpp:29:32:29:37 | call to getenv |
| dispatch.cpp:29:32:29:37 | call to getenv | dispatch.cpp:29:32:29:48 | (const char *)... |
| dispatch.cpp:29:32:29:37 | call to getenv | shared.h:8:22:8:25 | nptr |
| dispatch.cpp:31:28:31:33 | call to getenv | dispatch.cpp:7:20:7:28 | sinkParam |
| dispatch.cpp:31:28:31:33 | call to getenv | dispatch.cpp:8:8:8:16 | sinkParam |
| dispatch.cpp:31:28:31:33 | call to getenv | dispatch.cpp:31:23:31:26 | call to atoi |
| dispatch.cpp:31:28:31:33 | call to getenv | dispatch.cpp:31:28:31:33 | call to getenv |
| dispatch.cpp:31:28:31:33 | call to getenv | dispatch.cpp:31:28:31:44 | (const char *)... |
| dispatch.cpp:31:28:31:33 | call to getenv | shared.h:6:15:6:23 | sinkparam |
| dispatch.cpp:31:28:31:33 | call to getenv | shared.h:8:22:8:25 | nptr |
| dispatch.cpp:32:31:32:36 | call to getenv | dispatch.cpp:7:20:7:28 | sinkParam |
| dispatch.cpp:32:31:32:36 | call to getenv | dispatch.cpp:8:8:8:16 | sinkParam |
| dispatch.cpp:32:31:32:36 | call to getenv | dispatch.cpp:32:26:32:29 | call to atoi |
| dispatch.cpp:32:31:32:36 | call to getenv | dispatch.cpp:32:31:32:36 | call to getenv |
| dispatch.cpp:32:31:32:36 | call to getenv | dispatch.cpp:32:31:32:47 | (const char *)... |
| dispatch.cpp:32:31:32:36 | call to getenv | shared.h:6:15:6:23 | sinkparam |
| dispatch.cpp:32:31:32:36 | call to getenv | shared.h:8:22:8:25 | nptr |
| dispatch.cpp:34:22:34:27 | call to getenv | dispatch.cpp:7:20:7:28 | sinkParam |
| dispatch.cpp:34:22:34:27 | call to getenv | dispatch.cpp:8:8:8:16 | sinkParam |
| dispatch.cpp:34:22:34:27 | call to getenv | dispatch.cpp:34:17:34:20 | call to atoi |
| dispatch.cpp:34:22:34:27 | call to getenv | dispatch.cpp:34:22:34:27 | call to getenv |
| dispatch.cpp:34:22:34:27 | call to getenv | dispatch.cpp:34:22:34:38 | (const char *)... |
| dispatch.cpp:34:22:34:27 | call to getenv | shared.h:6:15:6:23 | sinkparam |
| dispatch.cpp:34:22:34:27 | call to getenv | shared.h:8:22:8:25 | nptr |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 |
| globals.cpp:5:20:5:25 | call to getenv | globals.cpp:2:17:2:25 | sinkParam |
| globals.cpp:5:20:5:25 | call to getenv | globals.cpp:5:12:5:16 | local |
| globals.cpp:5:20:5:25 | call to getenv | globals.cpp:5:20:5:25 | call to getenv |
| globals.cpp:5:20:5:25 | call to getenv | globals.cpp:6:10:6:14 | (const char *)... |
| globals.cpp:5:20:5:25 | call to getenv | globals.cpp:6:10:6:14 | local |
| globals.cpp:5:20:5:25 | call to getenv | shared.h:5:23:5:31 | sinkparam |
| globals.cpp:13:15:13:20 | call to getenv | globals.cpp:9:8:9:14 | global1 |
| globals.cpp:13:15:13:20 | call to getenv | globals.cpp:13:15:13:20 | call to getenv |
| globals.cpp:23:15:23:20 | call to getenv | globals.cpp:16:15:16:21 | global2 |
| globals.cpp:23:15:23:20 | call to getenv | globals.cpp:23:15:23:20 | call to getenv |
| test_diff.cpp:92:10:92:13 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:92:10:92:13 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:92:10:92:13 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:92:10:92:13 | argv | test_diff.cpp:92:10:92:13 | argv |
| test_diff.cpp:92:10:92:13 | argv | test_diff.cpp:92:10:92:16 | (const char *)... |
| test_diff.cpp:92:10:92:13 | argv | test_diff.cpp:92:10:92:16 | access to array |
| test_diff.cpp:94:32:94:35 | argv | shared.h:6:15:6:23 | sinkparam |
| test_diff.cpp:94:32:94:35 | argv | defaulttainttracking.cpp:10:11:10:13 | p#0 |
| test_diff.cpp:94:32:94:35 | argv | test_diff.cpp:2:11:2:13 | p#0 |
| test_diff.cpp:94:32:94:35 | argv | test_diff.cpp:94:10:94:36 | reinterpret_cast<int>... |
| test_diff.cpp:94:32:94:35 | argv | test_diff.cpp:94:32:94:35 | argv |
| test_diff.cpp:96:26:96:29 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:96:26:96:29 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:96:26:96:29 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:96:26:96:29 | argv | test_diff.cpp:16:39:16:39 | a |
| test_diff.cpp:96:26:96:29 | argv | test_diff.cpp:17:10:17:10 | a |
| test_diff.cpp:96:26:96:29 | argv | test_diff.cpp:96:26:96:29 | argv |
| test_diff.cpp:96:26:96:29 | argv | test_diff.cpp:96:26:96:32 | (const char *)... |
| test_diff.cpp:96:26:96:29 | argv | test_diff.cpp:96:26:96:32 | access to array |
| test_diff.cpp:98:18:98:21 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:98:18:98:21 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:98:18:98:21 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:98:18:98:21 | argv | test_diff.cpp:16:39:16:39 | a |
| test_diff.cpp:98:18:98:21 | argv | test_diff.cpp:17:10:17:10 | a |
| test_diff.cpp:98:18:98:21 | argv | test_diff.cpp:98:13:98:13 | p |
@@ -163,13 +148,15 @@
| test_diff.cpp:98:18:98:21 | argv | test_diff.cpp:102:26:102:30 | * ... |
| test_diff.cpp:98:18:98:21 | argv | test_diff.cpp:102:27:102:27 | p |
| test_diff.cpp:98:18:98:21 | argv | test_diff.cpp:102:27:102:30 | access to array |
| test_diff.cpp:104:12:104:15 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:104:12:104:15 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:104:12:104:15 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:104:12:104:15 | argv | test_diff.cpp:104:10:104:20 | (const char *)... |
| test_diff.cpp:104:12:104:15 | argv | test_diff.cpp:104:10:104:20 | * ... |
| test_diff.cpp:104:12:104:15 | argv | test_diff.cpp:104:11:104:20 | (...) |
| test_diff.cpp:104:12:104:15 | argv | test_diff.cpp:104:12:104:15 | argv |
| test_diff.cpp:104:12:104:15 | argv | test_diff.cpp:104:12:104:19 | ... + ... |
| test_diff.cpp:108:10:108:13 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:108:10:108:13 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:108:10:108:13 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:108:10:108:13 | argv | test_diff.cpp:24:20:24:29 | p#0 |
| test_diff.cpp:108:10:108:13 | argv | test_diff.cpp:29:24:29:24 | p |
| test_diff.cpp:108:10:108:13 | argv | test_diff.cpp:30:14:30:14 | p |
@@ -181,7 +168,8 @@
| test_diff.cpp:111:10:111:13 | argv | test_diff.cpp:111:10:111:13 | argv |
| test_diff.cpp:111:10:111:13 | argv | test_diff.cpp:111:10:111:16 | (const char *)... |
| test_diff.cpp:111:10:111:13 | argv | test_diff.cpp:111:10:111:16 | access to array |
| test_diff.cpp:115:11:115:14 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:115:11:115:14 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:115:11:115:14 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:115:11:115:14 | argv | test_diff.cpp:24:20:24:29 | p#0 |
| test_diff.cpp:115:11:115:14 | argv | test_diff.cpp:41:24:41:24 | p |
| test_diff.cpp:115:11:115:14 | argv | test_diff.cpp:42:14:42:14 | p |
@@ -196,7 +184,8 @@
| test_diff.cpp:118:26:118:29 | argv | test_diff.cpp:118:26:118:29 | argv |
| test_diff.cpp:118:26:118:29 | argv | test_diff.cpp:118:26:118:32 | (const char *)... |
| test_diff.cpp:118:26:118:29 | argv | test_diff.cpp:118:26:118:32 | access to array |
| test_diff.cpp:121:23:121:26 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:121:23:121:26 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:121:23:121:26 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:121:23:121:26 | argv | test_diff.cpp:60:24:60:24 | p |
| test_diff.cpp:121:23:121:26 | argv | test_diff.cpp:61:34:61:34 | p |
| test_diff.cpp:121:23:121:26 | argv | test_diff.cpp:67:24:67:24 | p |
@@ -204,7 +193,8 @@
| test_diff.cpp:121:23:121:26 | argv | test_diff.cpp:121:23:121:26 | argv |
| test_diff.cpp:121:23:121:26 | argv | test_diff.cpp:121:23:121:29 | (const char *)... |
| test_diff.cpp:121:23:121:26 | argv | test_diff.cpp:121:23:121:29 | access to array |
| test_diff.cpp:124:19:124:22 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:124:19:124:22 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:24:20:24:29 | p#0 |
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:76:24:76:24 | p |
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:81:24:81:24 | p |
@@ -212,14 +202,16 @@
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:124:19:124:22 | argv |
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:124:19:124:25 | (const char *)... |
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:124:19:124:25 | access to array |
| test_diff.cpp:126:43:126:46 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:126:43:126:46 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:126:43:126:46 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:126:43:126:46 | argv | test_diff.cpp:76:24:76:24 | p |
| test_diff.cpp:126:43:126:46 | argv | test_diff.cpp:81:24:81:24 | p |
| test_diff.cpp:126:43:126:46 | argv | test_diff.cpp:82:14:82:14 | p |
| test_diff.cpp:126:43:126:46 | argv | test_diff.cpp:126:43:126:46 | argv |
| test_diff.cpp:126:43:126:46 | argv | test_diff.cpp:126:43:126:49 | (const char *)... |
| test_diff.cpp:126:43:126:46 | argv | test_diff.cpp:126:43:126:49 | access to array |
| test_diff.cpp:128:44:128:47 | argv | shared.h:5:23:5:31 | sinkparam |
| test_diff.cpp:128:44:128:47 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 |
| test_diff.cpp:128:44:128:47 | argv | test_diff.cpp:1:11:1:20 | p#0 |
| test_diff.cpp:128:44:128:47 | argv | test_diff.cpp:76:24:76:24 | p |
| test_diff.cpp:128:44:128:47 | argv | test_diff.cpp:81:24:81:24 | p |
| test_diff.cpp:128:44:128:47 | argv | test_diff.cpp:82:14:82:14 | p |

View File

@@ -1,5 +1,5 @@
#include "shared.h"
void sink(const char *);
void sink(int);
struct S {
void(*f)(const char*);

View File

@@ -1,30 +1,34 @@
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 | IR only |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:16:8:16:14 | call to _strdup | IR only |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | defaulttainttracking.cpp:16:8:16:29 | (const char *)... | IR only |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | shared.h:5:23:5:31 | sinkparam | IR only |
| defaulttainttracking.cpp:16:16:16:21 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 | IR only |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:3:21:3:22 | s1 | AST only |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:21:8:21:10 | buf | AST only |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:22:15:22:17 | buf | AST only |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:24:8:24:10 | (const char *)... | IR only |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | defaulttainttracking.cpp:24:8:24:10 | array to pointer conversion | IR only |
| defaulttainttracking.cpp:22:20:22:25 | call to getenv | shared.h:10:21:10:22 | s1 | AST only |
| defaulttainttracking.cpp:38:25:38:30 | call to getenv | defaulttainttracking.cpp:39:51:39:61 | env_pointer | AST only |
| defaulttainttracking.cpp:64:10:64:15 | call to getenv | defaulttainttracking.cpp:52:24:52:24 | p | IR only |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 | IR only |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:8:88:16 | call to move | IR only |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:8:88:32 | (const char *)... | IR only |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:8:88:32 | (reference dereference) | IR only |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | defaulttainttracking.cpp:88:18:88:30 | (reference to) | IR only |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | shared.h:5:23:5:31 | sinkparam | IR only |
| defaulttainttracking.cpp:88:18:88:23 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 | IR only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:9:11:9:20 | p#0 | IR only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:91:31:91:33 | ret | AST only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:92:5:92:8 | * ... | AST only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:92:6:92:8 | ret | AST only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:96:11:96:12 | p2 | IR only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:98:10:98:11 | (const char *)... | IR only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | defaulttainttracking.cpp:98:10:98:11 | p2 | IR only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | shared.h:5:23:5:31 | sinkparam | IR only |
| defaulttainttracking.cpp:97:27:97:32 | call to getenv | test_diff.cpp:1:11:1:20 | p#0 | IR only |
| globals.cpp:13:15:13:20 | call to getenv | globals.cpp:13:5:13:11 | global1 | AST only |
| globals.cpp:23:15:23:20 | call to getenv | globals.cpp:23:5:23:11 | global2 | AST only |
| test_diff.cpp:104:12:104:15 | argv | test_diff.cpp:104:11:104:20 | (...) | IR only |
| test_diff.cpp:108:10:108:13 | argv | test_diff.cpp:36:24:36:24 | p | AST only |
| test_diff.cpp:111:10:111:13 | argv | shared.h:5:23:5:31 | sinkparam | AST only |
| test_diff.cpp:111:10:111:13 | argv | defaulttainttracking.cpp:9:11:9:20 | p#0 | AST only |
| test_diff.cpp:111:10:111:13 | argv | test_diff.cpp:1:11:1:20 | p#0 | AST only |
| test_diff.cpp:111:10:111:13 | argv | test_diff.cpp:29:24:29:24 | p | AST only |
| test_diff.cpp:111:10:111:13 | argv | test_diff.cpp:30:14:30:14 | p | AST only |
| test_diff.cpp:124:19:124:22 | argv | test_diff.cpp:76:24:76:24 | p | IR only |

View File

@@ -18,7 +18,8 @@ Project structure
The documentation currently consists of the following Sphinx projects:
- ``learn-ql``: help topics to help you learn CodeQL and write queries
- ``ql-handbook``: an overview of important concepts in QL, the language that underlies CodeQL analysis
- ``ql-handbook``: a user-friendly guide to the QL language, which underlies CodeQL analysis
- ``ql-spec``: formal descriptions of the QL language and QLDoc comments
- ``support``: the languages and frameworks currently supported in CodeQL analysis
- ``ql-training``: source files for the CodeQL training and variant analysis examples slide decks
@@ -103,7 +104,7 @@ generates html slide shows in the ``<slides-output>`` directory when run from
the ``ql-training`` source directory.
For more information about creating slides for QL training and variant analysis
examples, see the `template slide deck <https://github.com/github/codeql/blob/master/docs/language/ql-training/template.rst>`__.
examples, see the `template slide deck <https://github.com/Semmle/ql/blob/master/docs/language/ql-training/template.rst>`__.
Viewing the current version of the CodeQL documentation
*******************************************************

View File

@@ -147,4 +147,6 @@ You have found the two fire starters! They are arrested and the villagers are on
Further reading
---------------
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out who will be the new ruler of the village in the :doc:`next tutorial <crown-the-rightful-heir>`.
- Learn more about predicates and classes in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Explore the libraries that help you get data about code in :doc:`Learning CodeQL <../../index>`.

View File

@@ -262,9 +262,4 @@ Here are some more example queries that solve the river crossing puzzle:
#. This query introduces `algebraic datatypes <https://help.semmle.com/QL/ql-handbook/types.html#algebraic-datatypes>`__
to model the situation, instead of defining everything as a subclass of ``string``.
`See solution in the query console on LGTM.com <https://lgtm.com/query/7260748307619718263/>`__
Further reading
---------------
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
`See solution in the query console on LGTM.com <https://lgtm.com/query/7260748307619718263/>`__

View File

@@ -161,4 +161,6 @@ You could also try writing more of your own QL queries to find interesting facts
Further reading
---------------
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Learn more about recursion in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Put your QL skills to the test and solve the :doc:`River crossing puzzle <cross-the-river>`.
- Start using QL to analyze projects. See :doc:`Learning CodeQL <../../index>` for a summary of the available languages and resources.

View File

@@ -292,4 +292,6 @@ Have you found the thief?
Further reading
---------------
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Help the villagers track down another criminal in the :doc:`next tutorial <catch-the-fire-starter>`.
- Find out more about the concepts you discovered in this tutorial in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Explore the libraries that help you get data about code in :doc:`Learning CodeQL <../../index>`.

View File

@@ -223,5 +223,8 @@ There is a similar built-in `query <https://lgtm.com/rules/2158670642/>`__ on LG
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Explore other ways of querying classes using examples from the `C/C++ cookbook <https://help.semmle.com/wiki/label/CBCPP/class>`__.
- Take a look at the :doc:`Analyzing data flow in C and C++ <dataflow>` tutorial.
- Try the worked examples in the following topics: :doc:`Refining a query to account for edge cases <private-field-initialization>` and :doc:`Detecting a potential buffer overflow <zero-space-terminator>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -139,10 +139,6 @@ Global data flow
Global data flow tracks data flow throughout the entire program, and is therefore more powerful than local data flow. However, global data flow is less precise than local data flow, and the analysis typically requires significantly more time and memory to perform.
.. pull-quote:: Note
.. include:: ../../reusables/path-problem.rst
Using global data flow
~~~~~~~~~~~~~~~~~~~~~~
@@ -299,6 +295,13 @@ Exercise 3: Write a class that represents flow sources from ``getenv``. (`Answer
Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flows from ``getenv`` to ``gethostbyname``. (`Answer <#exercise-4>`__)
Further reading
---------------
- Try the worked examples in the following topics: :doc:`Refining a query to account for edge cases <private-field-initialization>` and :doc:`Detecting a potential buffer overflow <zero-space-terminator>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.
Answers
-------
@@ -386,11 +389,3 @@ Exercise 4
from DataFlow::Node getenv, FunctionCall fc, GetenvToGethostbynameConfiguration cfg
where cfg.hasFlow(getenv, DataFlow::exprNode(fc.getArgument(0)))
select getenv.asExpr(), fc
Further reading
---------------
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -132,5 +132,7 @@ Note that we replaced ``e.getEnclosingStmt()`` with ``e.getEnclosingStmt().getPa
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Explore other ways of finding types and statements using examples from the C/C++ cookbook for `types <https://help.semmle.com/wiki/label/CBCPP/type>`__ and `statements <https://help.semmle.com/wiki/label/CBCPP/statement>`__.
- Take a look at the :doc:`Conversions and classes in C and C++ <conversions-classes>` and :doc:`Analyzing data flow in C and C++ <dataflow>` tutorials.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -92,5 +92,7 @@ The LGTM version of this query is considerably more complicated, but if you look
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Explore other ways of finding functions using examples from the `C/C++ cookbook <https://help.semmle.com/wiki/label/CBCPP/function>`__.
- Take a look at some other tutorials: :doc:`Expressions, types and statements in C and C++ <introduce-libraries-cpp>`, :doc:`Conversions and classes in C and C++ <conversions-classes>`, and :doc:`Analyzing data flow in C and C++ <dataflow>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -93,9 +93,3 @@ The ``comparesLt`` predicate
``comparesLt(left, right, k, isLessThan, testIsTrue)`` holds if ``left < right + k`` evaluates to ``isLessThan`` when the expression evaluates to ``testIsTrue``.
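As a rough illustration of the description above, the following sketch lists guards that ensure ``left < right + k`` when they evaluate to true. It assumes that ``comparesLt`` is used as a member predicate of ``GuardCondition`` from the ``semmle.code.cpp.controlflow.Guards`` module; the variable names are chosen for illustration only.

.. code-block:: ql

   import cpp
   import semmle.code.cpp.controlflow.Guards

   // Assumption: comparesLt is called on a GuardCondition.
   // isLessThan = true and testIsTrue = true select guards that,
   // when true, ensure `left < right + k`.
   from GuardCondition guard, Expr left, Expr right, int k
   where guard.comparesLt(left, right, k, true, true)
   select guard, left, right, k

For a guard such as ``x < y + 1``, you would expect ``left`` to be ``x``, ``right`` to be ``y``, and ``k`` to be ``1``, although the exact results depend on how the library normalizes the comparison.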
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -525,5 +525,6 @@ This table lists `Preprocessor <https://help.semmle.com/qldoc/cpp/semmle/code/cp
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Experiment with the worked examples in the CodeQL for C/C++ topics: :doc:`Functions in C and C++ <function-classes>`, :doc:`Expressions, types, and statements in C and C++ <expressions-types>`, :doc:`Conversions and classes in C and C++ <conversions-classes>`, and :doc:`Analyzing data flow in C and C++ <dataflow>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -149,5 +149,6 @@ Finally we can simplify the query by using the transitive closure operator. In t
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Take a look at another example: :doc:`Detecting a potential buffer overflow <zero-space-terminator>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -39,3 +39,10 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
- :doc:`Using range analysis for C and C++ <range-analysis>`: You can use range analysis to determine the upper or lower bounds on an expression, or whether an expression could potentially overflow or underflow.
- :doc:`Hash consing and value numbering <value-numbering-hash-cons>`: You can use specialized CodeQL libraries to recognize expressions that are syntactically identical or compute the same value at runtime in C and C++ codebases.
Further reading
---------------
- For examples of how to query common C/C++ elements, see the `C/C++ cookbook <https://help.semmle.com/wiki/display/CBCPP>`__.
- For the queries used in LGTM, display a `C/C++ query <https://lgtm.com/search?q=language%3Acpp&t=rules>`__ and click **Open in query console** to see the code used to find alerts.
- For more information about the library for C/C++, see the `CodeQL library for C/C++ <https://help.semmle.com/qldoc/cpp>`__.

View File

@@ -41,9 +41,3 @@ This query uses ``upperBound`` to determine whether the result of ``snprintf`` i
convSink = call.getArgument(1).getFullyConverted()
select call, upperBound(call.getArgument(1).getFullyConverted())
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -110,9 +110,3 @@ Example query
hashCons(outer.getCondition()) = hashCons(inner.getCondition())
select inner.getCondition(), "The condition of this if statement duplicates the condition of $@",
outer.getCondition(), "an enclosing if statement"
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -224,5 +224,5 @@ The completed query will now identify cases where the result of ``strlen`` is st
Further reading
---------------
.. include:: ../../reusables/cpp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -137,10 +137,6 @@ Global data flow
Global data flow tracks data flow throughout the entire program, and is therefore more powerful than local data flow. However, global data flow is less precise than local data flow, and the analysis typically requires significantly more time and memory to perform.
.. pull-quote:: Note
.. include:: ../../reusables/path-problem.rst
Using global data flow
~~~~~~~~~~~~~~~~~~~~~~
@@ -553,7 +549,6 @@ This can be adapted from the ``SystemUriFlow`` class:
Further reading
---------------
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
.. include:: ../../reusables/csharp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Learn about the standard libraries used to write queries for C# in :doc:`Introducing the C# libraries <introduce-libraries-csharp>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -1122,5 +1122,6 @@ Here is the fixed version:
Further reading
---------------
.. include:: ../../reusables/csharp-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Visit :doc:`Analyzing data flow in C# <dataflow>` to learn more about writing queries using the standard data flow and taint tracking libraries.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -15,4 +15,9 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
- :doc:`Analyzing data flow in C# <dataflow>`: You can use CodeQL to track the flow of data through a C# program to its use.
Further reading
---------------
- For examples of how to query common C# elements, see the `C# cookbook <https://help.semmle.com/wiki/display/CBCSHARP>`__.
- For the queries used in LGTM, display a `C# query <https://lgtm.com/search?q=language%3Acsharp&t=rules>`__ and click **Open in query console** to see the code used to find alerts.
- For more information about the library for C#, see the `CodeQL library for C# <https://help.semmle.com/qldoc/csharp>`__.

View File

@@ -0,0 +1,33 @@
What's in a CodeQL database?
============================
A CodeQL database contains a variety of data related to a particular code base at a particular point in time. For details of how the database is generated, see `Database generation <https://lgtm.com/help/lgtm/generate-database>`__ on LGTM.com.
The database contains a full, hierarchical representation of the program defined by the code base. The database schema varies according to the language analyzed. The schema provides an interface between the initial lexical analysis during the extraction process, and the actual complex analysis using CodeQL. When the source code languages being analyzed change (such as Java 7 evolving into Java 8), this interface between the analysis phases can also change.
For each language, a CodeQL library defines classes to provide a layer of abstraction over the database tables. This provides an object-oriented view of the data which makes it easier to write queries. This is easiest to explain using an example.
Example
-------
For a Java program, two key tables are:
- The ``expressions`` table containing a row for every single expression in the source code that was analyzed during the build process.
- The ``statements`` table containing a row for every single statement in the source code that was analyzed during the build process.
The CodeQL library defines classes to provide a layer of abstraction over each of these tables (and the related auxiliary tables): ``Expr`` and ``Stmt``.
Most classes in the library are similar: they are abstractions over one or more database tables. Looking at one of the libraries illustrates this:
.. code-block:: ql

   class Expr extends StmtParent, @expr {
     ...
     /** the location of this expression */
     Location getLocation() { exprs(this,_,_,result) }
     ...
   }

The ``Expr`` class, shown here, extends from the database type ``@expr``. Member predicates of the ``Expr`` class are implemented in terms of the database-provided ``exprs`` table.
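
As a minimal sketch of how this abstraction is used in practice (assuming the standard ``java`` library module is imported), a query can work with ``Expr`` and its member predicates without referring to the ``exprs`` table directly:

.. code-block:: ql

   import java

   // List every expression together with its source location.
   // getLocation() is the member predicate shown above; the query
   // itself never refers to the underlying exprs table.
   from Expr e
   select e, e.getLocation()

Because the query is written against the class rather than the table, it should not need to change if the layout of the underlying table changes.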

View File

@@ -611,8 +611,8 @@ is to compare them to each other to determine whether two data-flow nodes have t
Further reading
---------------
.. include:: ../../reusables/go-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.
.. |ast| image:: ast.png
.. |cfg| image:: cfg.png

View File

@@ -11,3 +11,9 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
- `Basic Go query <https://lgtm.com/help/lgtm/console/ql-go-basic-example>`__: Learn to write and run a simple CodeQL query using LGTM.
- :doc:`CodeQL library for Go <introduce-libraries-go>`: When you're analyzing a Go program, you can make use of the large collection of classes in the CodeQL library for Go.
Further reading
---------------
- For the queries used in LGTM, display a `Go query <https://lgtm.com/search?q=language%3Ago&t=rules>`__ and click **Open in query console** to see the code used to find alerts.
- For more information about the library for Go, see the `CodeQL library for Go <https://help.semmle.com/qldoc/go/>`__.

View File

@@ -3,7 +3,7 @@ Learning CodeQL
CodeQL is the code analysis platform used by security researchers to automate variant analysis.
You can use CodeQL queries to explore code and quickly find variants of security vulnerabilities and bugs.
These queries are easy to write and share. Visit the topics below and `our open source repository on GitHub <https://github.com/github/codeql>`__ to learn more.
These queries are easy to write and share. Visit the topics below and `our open source repository on GitHub <https://github.com/Semmle/ql>`__ to learn more.
You can also try out CodeQL in the `query console on LGTM.com <https://lgtm.com/query>`__.
Here, you can query open source projects directly, without having to download CodeQL databases and libraries.
@@ -27,6 +27,7 @@ CodeQL is based on a powerful query language called QL. The following topics hel
javascript/ql-for-javascript
python/ql-for-python
ql-training
technical-info
.. toctree::
:hidden:

View File

@@ -79,7 +79,8 @@ However, since ``y`` is derived from ``x``, it is influenced by the untrusted or
In QL, taint tracking extends data flow analysis by including steps in which the data values are not necessarily preserved, but the potentially insecure object is still propagated.
These flow steps are modeled in the taint-tracking library using predicates that hold if taint is propagated between nodes.
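
For example, the following sketch uses the Java libraries to find expressions that may be tainted by a parameter within a single callable. It assumes the ``TaintTracking::localTaint`` predicate from ``semmle.code.java.dataflow.TaintTracking``; other languages provide similar, but not identical, modules.

.. code-block:: ql

   import java
   import semmle.code.java.dataflow.DataFlow
   import semmle.code.java.dataflow.TaintTracking

   // Local taint tracking: holds if taint can propagate from the
   // parameter to the expression in zero or more local steps,
   // including steps that do not preserve the value.
   from Parameter p, Expr sink
   where TaintTracking::localTaint(DataFlow::parameterNode(p), DataFlow::exprNode(sink))
   select p, sink

Replacing ``TaintTracking::localTaint`` with ``DataFlow::localFlow`` would restrict the results to steps that preserve the value.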
Further reading
***************
What next?
**********
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
- Search for ``DataFlow`` and ``TaintTracking`` in the `standard CodeQL libraries <https://help.semmle.com/QL/ql-libraries.html>`__ to learn more about the technical implementation of data flow analysis for specific programming languages.
- Visit `Learning CodeQL <https://help.semmle.com/QL/learn-ql/>`__ to find language-specific tutorials on data flow and other topics.

View File

@@ -240,5 +240,6 @@ Now we can extend our query to filter out calls in methods carrying a ``Suppress
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Take a look at some of the other articles in this section: :doc:`Javadoc <javadoc>` and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -1,9 +1,7 @@
Abstract syntax tree classes for working with Java programs
===========================================================
Classes for working with Java code
==================================
CodeQL has a large selection of classes for representing the abstract syntax tree of Java programs.
.. include:: ../../reusables/abstract-syntax-tree.rst
CodeQL has a large selection of classes for working with Java statements and expressions.
.. _Expr: https://help.semmle.com/qldoc/java/semmle/code/java/Expr.qll/type.Expr$Expr.html
.. _Stmt: https://help.semmle.com/qldoc/java/semmle/code/java/Statement.qll/type.Statement$Stmt.html
@@ -276,9 +274,3 @@ Miscellaneous
+------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------+
| ``@Annot(key=val)`` | `Annotation <https://help.semmle.com/qldoc/java/semmle/code/java/Annotation.qll/type.Annotation$Annotation.html>`__ |   |
+------------------------------------------------------------------+-----------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------+
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -164,5 +164,6 @@ Finally, on many Java projects there are methods that are invoked indirectly by
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out how to query metadata and white space: :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>`, and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -147,10 +147,6 @@ Global data flow
Global data flow tracks data flow throughout the entire program, and is therefore more powerful than local data flow. However, global data flow is less precise than local data flow, and the analysis typically requires significantly more time and memory to perform.
.. pull-quote:: Note
.. include:: ../../reusables/path-problem.rst
Using global data flow
~~~~~~~~~~~~~~~~~~~~~~
@@ -257,6 +253,13 @@ Exercise 3: Write a class that represents flow sources from ``java.lang.System.g
Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flows from ``getenv`` to ``java.net.URL``. (`Answer <#exercise-4>`__)
Further reading
---------------
- Try the worked examples in these articles: :doc:`Navigating the call graph <call-graph>` and :doc:`Working with source locations <source-locations>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.
Answers
-------
@@ -354,11 +357,3 @@ Exercise 4
from DataFlow::Node src, DataFlow::Node sink, GetenvToURLConfiguration config
where config.hasFlow(src, sink)
select src, "This environment variable constructs a URL $@.", sink, "here"
Further reading
---------------
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -26,7 +26,7 @@ If ``l`` is bigger than 2\ :sup:`31`\ - 1 (the largest positive value of type ``
All primitive numeric types have a maximum value, beyond which they will wrap around to their lowest possible value (called an "overflow"). For ``int``, this maximum value is 2\ :sup:`31`\ - 1. Type ``long`` can accommodate larger values up to a maximum of 2\ :sup:`63`\ - 1. In this example, this means that ``l`` can take on a value that is higher than the maximum for type ``int``; ``i`` will never be able to reach this value, instead overflowing and returning to a low value.
We're going to develop a query that finds code that looks like it might exhibit this kind of behavior. We'll be using several of the standard library classes for representing statements and functions. For a full list, see :doc:`Abstract syntax tree classes for working with Java programs <ast-class-reference>`.
We're going to develop a query that finds code that looks like it might exhibit this kind of behavior. We'll be using several of the standard library classes for representing statements and functions. For a full list, see :doc:`Classes for working with Java code <ast-class-reference>`.
Initial query
-------------
@@ -125,5 +125,6 @@ Now we rewrite our query to make use of these new classes:
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Have a look at some of the other articles in this section: :doc:`Java types <types-class-hierarchy>`, :doc:`Navigating the call graph <call-graph>`, :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>`, and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -1,7 +1,7 @@
CodeQL library for Java
=======================
When you're analyzing a Java program, you can make use of the large collection of classes in the CodeQL library for Java.
When you're analyzing a Java program in {{ site.data.variables.product.prodname_dotcom }}, you can make use of the large collection of classes in the CodeQL library for Java.
About the CodeQL library for Java
---------------------------------
@@ -210,7 +210,7 @@ Class ``Variable`` represents a variable `in the Java sense <http://docs.oracle.
Abstract syntax tree
--------------------
Classes in this category represent abstract syntax tree (AST) nodes, that is, statements (class ``Stmt``) and expressions (class ``Expr``). For a full list of expression and statement types available in the standard QL library, see :doc:`Abstract syntax tree classes for working with Java programs <ast-class-reference>`.
Classes in this category represent abstract syntax tree (AST) nodes, that is, statements (class ``Stmt``) and expressions (class ``Expr``). For a full list of expression and statement types available in the standard QL library, see :doc:`Classes for working with Java code <ast-class-reference>`.
Both ``Expr`` and ``Stmt`` provide member predicates for exploring the abstract syntax tree of a program:
@@ -386,5 +386,6 @@ For more information about callables and calls, see the :doc:`article on the cal
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Experiment with the worked examples in the CodeQL for Java articles: :doc:`Java types <types-class-hierarchy>`, :doc:`Overflow-prone comparisons in Java <expressions-statements>`, :doc:`Navigating the call graph <call-graph>`, :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>` and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -221,5 +221,6 @@ Currently, ``visibleIn`` only considers single-type imports, but you could exten
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out how you can use the location API to define queries on whitespace: :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -34,5 +34,12 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
- :doc:`Working with source locations <source-locations>`: You can use the location of entities within Java code to look for potential errors. Locations allow you to deduce the presence, or absence, of white space which, in some cases, may indicate a problem.
- :doc:`Abstract syntax tree classes for working with Java programs <ast-class-reference>`: CodeQL has a large selection of classes for representing the abstract syntax tree of Java programs.
- :doc:`Classes for working with Java code <ast-class-reference>`: CodeQL has a large selection of classes for working with Java statements and expressions.
Further reading
---------------
- For examples of how to query common Java elements, see the `Java cookbook <https://help.semmle.com/wiki/display/CBJAVA>`__.
- For the queries used in LGTM, display a `Java query <https://lgtm.com/search?q=language%3Ajava&t=rules>`__ and click **Open in query console** to see the code used to find alerts.
- For more information about the library for Java see the `CodeQL library for Java <https://help.semmle.com/qldoc/java>`__.

View File

@@ -186,5 +186,5 @@ Whitespace suggests that the programmer meant to toggle ``i`` between zero and o
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -114,7 +114,7 @@ To identify these cases, we can create two CodeQL classes that represent, respec
class CollectionToArrayCall extends MethodAccess {
CollectionToArrayCall() {
exists(CollectionToArray m |
this.getMethod().getSourceDeclaration().overridesOrInstantiates*(m)
this.getMethod().getSourceDeclaration().overrides*(m)
)
}
@@ -124,7 +124,7 @@ To identify these cases, we can create two CodeQL classes that represent, respec
}
}
Notice the use of ``getSourceDeclaration`` and ``overridesOrInstantiates`` in the constructor of ``CollectionToArrayCall``: we want to find calls to ``Collection.toArray`` and to any method that overrides it, as well as any parameterized instances of these methods. In our example above, for instance, the call ``l.toArray`` resolves to method ``toArray`` in the raw class ``ArrayList``. Its source declaration is ``toArray`` in the generic class ``ArrayList<T>``, which overrides ``AbstractCollection<T>.toArray``, which in turn overrides ``Collection<T>.toArray``, which is an instantiation of ``Collection.toArray`` (since the type parameter ``T`` in the overridden method belongs to ``ArrayList`` and is an instantiation of the type parameter belonging to ``Collection``).
Notice the use of ``getSourceDeclaration`` and ``overrides`` in the constructor of ``CollectionToArrayCall``: we want to find calls to ``Collection.toArray`` and to any method that overrides it, as well as any parameterized instances of these methods. In our example above, for instance, the call ``l.toArray`` resolves to method ``toArray`` in the raw class ``ArrayList``. Its source declaration is method\ ``toArray`` in the generic class ``ArrayList``, which overrides ``AbstractCollection.toArray``, which in turn overrides ``Collection.toArray``.
Using these new classes we can extend our query to exclude calls to ``toArray`` on an argument of type ``A[]`` which are then cast to ``A[]``:
@@ -299,5 +299,6 @@ Adding these three improvements, our final query becomes:
Further reading
---------------
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Take a look at some of the other articles in this section: :doc:`Overflow-prone comparisons in Java <expressions-statements>`, :doc:`Navigating the call graph <call-graph>`, :doc:`Annotations in Java <annotations>`, :doc:`Javadoc <javadoc>`, and :doc:`Working with source locations <source-locations>`.
- Find out how specific classes in the AST are represented in the standard library for Java: :doc:`Classes for working with Java code <ast-class-reference>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -1,9 +1,7 @@
Abstract syntax tree classes for working with JavaScript and TypeScript programs
================================================================================
Abstract syntax tree classes for JavaScript and TypeScript
==========================================================
CodeQL has a large selection of classes for representing the abstract syntax tree of JavaScript and TypeScript programs.
.. include:: ../../reusables/abstract-syntax-tree.rst
CodeQL has a large selection of classes for working with JavaScript and TypeScript statements and expressions.
Statement classes
-----------------
@@ -358,9 +356,3 @@ All classes in this table are subclasses of `Expr <https://help.semmle.com/qldoc
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------+
| ``yield`` `Expr <https://help.semmle.com/qldoc/javascript/semmle/javascript/Expr.qll/type.Expr$Expr.html>`__ | `YieldExpr <https://help.semmle.com/qldoc/javascript/semmle/javascript/Expr.qll/type.Expr$YieldExpr.html>`__ |
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------+
Further reading
---------------
.. include:: ../../reusables/javascript-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -142,7 +142,7 @@ Files
AST nodes
---------
See also: :doc:`Abstract syntax tree classes for working with JavaScript and TypeScript programs <ast-class-reference>`.
See also: :doc:`Abstract syntax tree classes for JavaScript and TypeScript <ast-class-reference>`.
Conversion between DataFlow and AST nodes:
@@ -216,11 +216,3 @@ Troubleshooting
- Compilation fails due to incompatible types? Make sure AST nodes and
DataFlow nodes are not mixed up. Use `asExpr() <https://help.semmle.com/qldoc/javascript/semmle/javascript/dataflow/DataFlow.qll/predicate.DataFlow$DataFlow$Node$asExpr.0.html>`__ or
`flow() <https://help.semmle.com/qldoc/javascript/semmle/javascript/AST.qll/predicate.AST$AST$ValueNode$flow.0.html>`__ to convert.
Further reading
---------------
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
.. include:: ../../reusables/javascript-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -188,10 +188,6 @@ Global data flow tracks data flow throughout the entire program, and is therefor
than local data flow. That is, the analysis may report spurious flows that cannot in fact happen. Moreover, global data flow analysis typically requires significantly
more time and memory than local analysis.
.. pull-quote:: Note
.. include:: ../../reusables/path-problem.rst
Using global data flow
~~~~~~~~~~~~~~~~~~~~~~
@@ -468,6 +464,13 @@ Hint: array indices are properties with numeric names; you can use regular expre
Exercise 4: Using the answers from 2 and 3, write a query which finds all global data flows from array elements of the result of a call to the ``tagName`` argument to the
``createElement`` function. (`Answer <#exercise-4>`__)
Further reading
---------------
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.
- Learn about writing more precise data-flow analyses in :doc:`Using flow labels for precise data flow analysis <flow-labels>`
Answers
-------
@@ -550,11 +553,3 @@ Exercise 4
from HardCodedTagNameConfiguration cfg, DataFlow::Node source, DataFlow::Node sink
where cfg.hasFlow(source, sink)
select source, sink
Further reading
---------------
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
.. include:: ../../reusables/java-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst

View File

@@ -398,7 +398,6 @@ string may be an absolute path and whether it may contain ``..`` components.
Further reading
---------------
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
.. include:: ../../reusables/javascript-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Learn about the standard CodeQL libraries used to write queries for JavaScript in :doc:`CodeQL libraries for JavaScript <introduce-libraries-js>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -1031,5 +1031,6 @@ Predicate ``YAMLMapping.maps(key, value)`` models the key-value relation represe
Further reading
---------------
.. include:: ../../reusables/javascript-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Learn about the standard CodeQL libraries used to write queries for TypeScript in :doc:`CodeQL libraries for TypeScript <introduce-libraries-ts>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -449,5 +449,6 @@ A `LocalNamespaceName <https://help.semmle.com/qldoc/javascript/semmle/javascrip
Further reading
---------------
.. include:: ../../reusables/javascript-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Learn about the standard CodeQL libraries used to write queries for JavaScript in :doc:`CodeQL libraries for JavaScript <introduce-libraries-js>`.
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.

View File

@@ -26,6 +26,13 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
- :doc:`Using type tracking for API modeling <type-tracking>`: You can track data through an API by creating a model using the CodeQL type-tracking library for JavaScript.
- :doc:`Abstract syntax tree classes for working with JavaScript and TypeScript programs <ast-class-reference>`: CodeQL has a large selection of classes for representing the abstract syntax tree of JavaScript and TypeScript programs.
- :doc:`Abstract syntax tree classes for JavaScript and TypeScript <ast-class-reference>`: CodeQL has a large selection of classes for working with JavaScript and TypeScript statements and expressions.
- :doc:`Data flow cheat sheet for JavaScript <dataflow-cheat-sheet>`: This article describes parts of the JavaScript libraries commonly used for variant analysis and in data flow queries.
Further reading
---------------
- For examples of how to query common JavaScript elements, see the `JavaScript cookbook <https://help.semmle.com/wiki/display/CBJS>`__.
- For the queries used in LGTM, display a `JavaScript query <https://lgtm.com/search?q=language%3Ajavascript&t=rules>`__ and click **Open in query console** to see the code used to find alerts.
- For more information about the library for JavaScript see the `CodeQL library for JavaScript <https://help.semmle.com/qldoc/javascript/>`__.

View File

@@ -493,7 +493,7 @@ Prefer data-flow configurations when:
- Differentiating between different kinds of user-controlled data -- see :doc:`Using flow labels for precise data flow analysis <flow-labels>`.
- Tracking transformations of a value through generic utility functions.
- Tracking values through string manipulation.
- Generating a path from source to sink -- see :doc:`Creating path queries <../writing-queries/path-queries>`.
- Generating a path from source to sink -- see :doc:`constructing path queries <../writing-queries/path-queries>`.
Lastly, depending on the code base being analyzed, some alternatives to consider are:
@@ -521,5 +521,6 @@ Type tracking is used in a few places in the standard libraries:
Further reading
---------------
.. include:: ../../reusables/javascript-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
- Learn more about the query console in `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com.
- Learn about writing precise data-flow analyses in :doc:`Using flow labels for precise data flow analysis <flow-labels>`.

View File

@@ -115,8 +115,3 @@ The ``toString()`` predicate
----------------------------
All classes, except those that extend primitive types, must provide a ``string toString()`` member predicate. The query compiler will complain if you don't. The uniqueness warning, noted above for locations, applies here too.
Further reading
---------------
- `CodeQL repository <https://github.com/github/codeql>`__

View File

@@ -117,6 +117,6 @@ Example finding mutually exclusive blocks within the same function
Further reading
---------------
.. include:: ../../reusables/python-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`"
.. include:: ../../reusables/python-other-resources.rst

View File

@@ -3,7 +3,7 @@ Functions in Python
You can use syntactic classes from the standard CodeQL library to find Python functions and identify calls to them.
These examples use the standard CodeQL class `Function <https://help.semmle.com/qldoc/python/semmle/python/Function.qll/type.Function$Function.html>`__. For more information, see ":doc:`CodeQL library for Python <introduce-libraries-python>`."
These examples use the standard CodeQL class `Function <https://help.semmle.com/qldoc/python/semmle/python/Function.qll/type.Function$Function.html>`__. For more information, see ":doc:`Introducing the Python libraries <introduce-libraries-python>`."
Finding all functions called "get..."
-------------------------------------
@@ -81,6 +81,9 @@ In a later tutorial we will see how to use the type-inference library to find ca
Further reading
---------------
.. include:: ../../reusables/python-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- ":doc:`Expressions and statements in Python <statements-expressions>`"
- ":doc:`Pointer analysis and type inference in Python <pointsto-type-infer>`"
- ":doc:`Analyzing control flow in Python <control-flow>`"
- ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`"
.. include:: ../../reusables/python-other-resources.rst

View File

@@ -340,6 +340,10 @@ For more information about these classes, see ":doc:`Analyzing data flow and tra
Further reading
---------------
.. include:: ../../reusables/python-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- ":doc:`Functions in Python <functions>`"
- ":doc:`Expressions and statements in Python <statements-expressions>`"
- ":doc:`Pointer analysis and type inference in Python <pointsto-type-infer>`"
- ":doc:`Analyzing control flow in Python <control-flow>`"
- ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`"
.. include:: ../../reusables/python-other-resources.rst

View File

@@ -226,6 +226,7 @@ Then we can use ``Value.getACall()`` to identify calls to the ``eval`` function,
Further reading
---------------
.. include:: ../../reusables/python-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- ":doc:`Analyzing control flow in Python <control-flow>`"
- ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`"
.. include:: ../../reusables/python-other-resources.rst

View File

@@ -26,3 +26,10 @@ Experiment and learn how to write effective and efficient queries for CodeQL dat
- :doc:`Pointer analysis and type inference in Python <pointsto-type-infer>`: At runtime, each Python expression has a value with an associated type. You can learn how an expression behaves at runtime by using type-inference classes from the standard CodeQL library.
- :doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`: You can use CodeQL to track the flow of data through a Python program. Tracking user-controlled, or tainted, data is a key technique for security researchers.
Further reading
---------------
- For examples of how to query common Python elements, see the `Python cookbook <https://help.semmle.com/wiki/display/CBPython>`__.
- For the queries used in LGTM, display a `Python query <https://lgtm.com/search?q=language%3APython&t=rules>`__ and click **Open in query console** to see the code used to find alerts.
- For more information about the library for Python see the `CodeQL library for Python <https://help.semmle.com/qldoc/python/>`__.

View File

@@ -156,7 +156,7 @@ The clause ``cmp.getOp(0) instanceof Is and cmp.getComparator(0) = literal`` che
Tip
We have to use ``cmp.getOp(0)`` and ``cmp.getComparator(0)``\ as there is no ``cmp.getOp()`` or ``cmp.getComparator()``. The reason for this is that a ``Compare`` expression can have multiple operators. For example, the expression ``3 < x < 7`` has two operators and two comparators. You use ``cmp.getComparator(0)`` to get the first comparator (in this example the ``x``) and ``cmp.getComparator(1)`` to get the second comparator (in this example the ``7``).
We have to use ``cmp.getOp(0)`` and ``cmp.getComparator(0)``\ as there is no ``cmp.getOp()`` or ``cmp.getComparator()``. The reason for this is that a ``Compare`` expression can have multiple operators. For example, the expression ``3 < x < 7`` has two operators and two comparators. You use ``cmp.getComparator(0)`` to get the first comparator (in this example the ``3``) and ``cmp.getComparator(1)`` to get the second comparator (in this example the ``7``).
Example finding duplicates in dictionary literals
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -256,6 +256,9 @@ Here is the relevant part of the class hierarchy:
Further reading
---------------
.. include:: ../../reusables/python-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- ":doc:`Functions in Python <functions>`"
- ":doc:`Pointer analysis and type inference in Python <pointsto-type-infer>`"
- ":doc:`Analyzing control flow in Python <control-flow>`"
- ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`"
.. include:: ../../reusables/python-other-resources.rst

View File

@@ -259,8 +259,8 @@ which defines the simplest possible taint kind class, ``HardcodedValue``, and cu
Further reading
---------------
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
.. include:: ../../reusables/python-further-reading.rst
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- ":doc:`Pointer analysis and type inference in Python <pointsto-type-infer>`"
- ":doc:`Analyzing control flow in Python <control-flow>`"
- ":doc:`Analyzing data flow and tracking tainted data in Python <taint-tracking>`"
.. include:: ../../reusables/python-other-resources.rst

View File

@@ -60,4 +60,5 @@ CodeQL and variant analysis for Java
Further reading
~~~~~~~~~~~~~~~
- `GitHub Security Lab <https://securitylab.github.com/research>`__
- If you are completely new to CodeQL, look at our introductory topics in :doc:`Learning CodeQL <index>`.
- To see examples of CodeQL queries that have been used to find security vulnerabilities and bugs in open source software projects, visit the `GitHub Security Lab website <https://securitylab.github.com/research>`__ and the associated `repository <https://github.com/github/security-lab>`__.

View File

@@ -0,0 +1,9 @@
Technical information
=====================
.. toctree::
:hidden:
database
- :doc:`What's in a CodeQL database? <database>`

View File

@@ -18,10 +18,10 @@ Previously we used the term QL to refer to the whole code analysis platform, whi
The name QL now only refers to the query language that powers CodeQL analysis.
The CodeQL queries and libraries used to analyze source code are written in QL.
These queries and libraries are open source, and can be found in the `CodeQL repository <https://github.com/github/codeql>`__.
These queries and libraries are open source, and can be found in the `CodeQL repository <https://github.com/semmle/ql>`__.
QL is a general-purpose, object-oriented language that can be used to query any kind of data.
CodeQL databases
----------------
QL snapshots have been renamed CodeQL databases. `CodeQL databases <https://help.semmle.com/codeql/about-codeql.html#about-codeql-databases>`__ contain relational data created and analyzed using CodeQL. They are the equivalent of QL snapshots, but have been optimized for use with the CodeQL tools.
QL snapshots have been renamed CodeQL databases. :doc:`CodeQL databases <database>` contain relational data created and analyzed using CodeQL. They are the equivalent of QL snapshots, but have been optimized for use with the CodeQL tools.

View File

@@ -148,7 +148,7 @@ However, as written it is difficult for the optimizer to pick out the best order
Now the structure we want is clearer. We've separated out the easy part into its own predicate ``locInfo``, and the main predicate ``sameLoc`` is just a larger join.
Further reading
---------------
Further information
-------------------
.. include:: ../../reusables/codeql-ref-tools-further-reading.rst
- Find out more about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.

View File

@@ -10,11 +10,21 @@ CodeQL includes queries to find the most relevant and interesting problems for e
- **Alert queries**: queries that highlight issues in specific locations in your code.
- **Path queries**: queries that describe the flow of information between a source and a sink in your code.
- **Metric queries**: queries that compute statistics for your code.
You can add custom queries to `custom query packs <https://lgtm.com/help/lgtm/about-queries#what-are-query-packs>`__ to analyze your projects in `LGTM <https://lgtm.com>`__, use them to analyze a database with the `CodeQL CLI <https://help.semmle.com/codeql/codeql-cli.html>`__, or you can contribute to the standard CodeQL queries in our `open source repository on GitHub <https://github.com/semmle/ql>`__.
.. pull-quote::
Note
Only the results generated by alert and path queries are displayed on LGTM.
You can display the results generated by metric queries by running them against your project in the `query console on LGTM <https://lgtm.com/query>`__ or with the CodeQL `extension for VS Code <https://help.semmle.com/codeql/codeql-for-vscode.html>`__.
You can explore the paths generated by path queries `directly in LGTM <https://lgtm.com/help/lgtm/exploring-data-flow-paths>`__ and in the `Results view <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__ in VS Code.
You can add custom queries to `custom query packs <https://lgtm.com/help/lgtm/about-queries#what-are-query-packs>`__ to analyze your projects in `LGTM <https://lgtm.com>`__, use them to analyze a database with the `CodeQL CLI <https://help.semmle.com/codeql/codeql-cli.html>`__, or you can contribute to the standard CodeQL queries in our `open source repository on GitHub <https://github.com/github/codeql>`__.
This topic is a basic introduction to query files. You can find more information on writing queries for specific programming languages `here <https://help.semmle.com/QL/learn-ql/>`__, and detailed technical information about QL in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__.
For more information on how to format your code when contributing queries to the GitHub repository, see the `CodeQL style guide <https://github.com/github/codeql/blob/master/docs/ql-style-guide.md>`__.
For more information on how to format your code when contributing queries to the GitHub repository, see the `CodeQL style guide <https://github.com/Semmle/ql/blob/master/docs/ql-style-guide.md>`__.
Basic query structure
*********************
@@ -35,17 +45,17 @@ Basic query structure
where /* ... logical formula ... */
select /* ... expressions ... */
The following sections describe the information that is typically included in a query file for alerts. Path queries are discussed in more detail in :doc:`Creating path queries <path-queries>`.
The following sections describe the information that is typically included in a query file for alerts and metrics. Path queries are discussed in more detail in :doc:`Creating path queries <path-queries>`.
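As a rough illustration of how the pieces of this structure fit together, the following is a minimal sketch of a complete alert query for Java. The query name, ``@id`` value, and alert message are hypothetical and chosen only for this example. The class ``MethodAccess`` and the predicates ``getMethod`` and ``hasName`` are assumed to be available from the standard ``java`` library.

.. code-block:: ql

   /**
    * @name Call to a method named toArray (illustrative example)
    * @description Hypothetical example query; not part of the standard query set.
    * @kind problem
    * @id java/example/to-array-call
    */

   import java

   // Find every method call whose target method is named "toArray".
   from MethodAccess ma
   where ma.getMethod().hasName("toArray")
   // First column: the element to report. Second column: the alert message.
   select ma, "Call to a method named toArray."

The metadata block supplies ``@kind problem``, so the ``select`` clause has to follow the two-column shape expected of an alert query.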
Query metadata
==============
Query metadata is used to identify your custom queries when they are added to the GitHub repository or used in your analysis. Metadata provides information about the query's purpose, and also specifies how to interpret and display the query results. For a full list of metadata properties, see :doc:`Metadata for CodeQL queries <query-metadata>`. The exact metadata requirement depends on how you are going to run your query:
Query metadata is used to identify your custom queries when they are added to the GitHub repository or used in your analysis. Metadata provides information about the query's purpose, and also specifies how to interpret and display the query results. For a full list of metadata properties, see the :doc:`query metadata reference <query-metadata>`. The exact metadata requirement depends on how you are going to run your query:
- If you are contributing a query to the GitHub repository, please read the `query metadata style guide <https://github.com/github/codeql/blob/master/docs/query-metadata-style-guide.md#metadata-area>`__.
- If you are contributing a query to the GitHub repository, please read the `query metadata style guide <https://github.com/Semmle/ql/blob/master/docs/query-metadata-style-guide.md#metadata-area>`__.
- If you are adding a custom query to a query pack for analysis using LGTM, see `Writing custom queries to include in LGTM analysis <https://lgtm.com/help/lgtm/writing-custom-queries>`__.
- If you are analyzing a database using the `CodeQL CLI <https://help.semmle.com/codeql/codeql-cli.html>`__, your query metadata must contain ``@kind``.
- If you are running a query in the query console on LGTM or with the CodeQL extension for VS Code, metadata is not mandatory. However, if you want your results to be displayed as either an 'alert' or a 'path', you must specify the correct ``@kind`` property, as explained below. For more information, see `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com and `Analyzing your projects <https://help.semmle.com/codeql/codeql-for-vscode/procedures/using-extension.html>`__ in the CodeQL for VS Code help.
- If you are running a query in the query console on LGTM or with the CodeQL extension for VS Code, metadata is not mandatory. However, if you want your results to be displayed as either an 'alert' or a 'path', you must specify the correct ``@kind`` property, as explained below. For more information, see `Using the query console <https://lgtm.com/help/lgtm/using-query-console>`__ on LGTM.com and `Using the extension <https://help.semmle.com/codeql/codeql-for-vscode/procedures/using-extension.html>`__ in the CodeQL for VS Code help.
.. pull-quote::
@@ -55,6 +65,7 @@ Query metadata is used to identify your custom queries when they are added to th
- Alert query metadata must contain ``@kind problem``.
- Path query metadata must contain ``@kind path-problem``.
- Metric query metadata must contain ``@kind metric``.
When you define the ``@kind`` property of a custom query you must also ensure that the rest of your query has the correct structure in order to be valid, as described below.
@@ -62,7 +73,7 @@ Import statements
=================
Each query generally contains one or more ``import`` statements, which define the `libraries <https://help.semmle.com/QL/ql-handbook/modules.html#library-modules>`__ or `modules <https://help.semmle.com/QL/ql-handbook/modules.html>`__ to import into the query. Libraries and modules provide a way of grouping together related `types <https://help.semmle.com/QL/ql-handbook/types.html>`__, `predicates <https://help.semmle.com/QL/ql-handbook/predicates.html>`__, and other modules. The contents of each library or module that you import can then be accessed by the query.
Our `open source repository on GitHub <https://github.com/github/codeql>`__ contains the standard CodeQL libraries for each supported language.
Our `open source repository on GitHub <https://github.com/semmle/ql>`__ contains the standard CodeQL libraries for each supported language.
When writing your own alert queries, you would typically import the standard library for the language of the project that you are querying, using ``import`` followed by a language:
@@ -75,7 +86,7 @@ When writing your own alert queries, you would typically import the standard lib
There are also libraries containing commonly used predicates, types, and other modules associated with different analyses, including data flow, control flow, and taint-tracking. In order to calculate path graphs, path queries require you to import a data flow library into the query file. For more information, see :doc:`Creating path queries <path-queries>`.
You can explore the contents of all the standard libraries in the `CodeQL library reference documentation <https://help.semmle.com/QL/ql-libraries.html>`__ or in the `GitHub repository <https://github.com/github/codeql>`__.
You can explore the contents of all the standard libraries in the `CodeQL library reference documentation <https://help.semmle.com/QL/ql-libraries.html>`__ or in the `GitHub repository <https://github.com/semmle/ql>`__.
Optional CodeQL classes and predicates
--------------------------------------
@@ -110,25 +121,41 @@ You can modify the alert message defined in the final column of the ``select`` s
Select clauses for path queries (``@kind path-problem``) are crafted to display both an alert and the source and sink of an associated path graph. For more information, see :doc:`Creating path queries <path-queries>`.
Select clauses for metric queries (``@kind metric``) consist of two 'columns', with the following structure::
select element, metric
- ``element``: a code element that is identified by the query, which defines where the alert is displayed.
- ``metric``: the result of the metric that the query computes.
Viewing the standard CodeQL queries
***********************************
One of the easiest ways to get started writing your own queries is to modify an existing query. To view the standard CodeQL queries, or to try out other examples, visit the `CodeQL <https://github.com/github/codeql>`__ and `CodeQL for Go <https://github.com/github/codeql-go>`__ repositories on GitHub.
One of the easiest ways to get started writing your own queries is to modify an existing query. To view the standard CodeQL queries, or to try out other examples, visit the `CodeQL <https://github.com/semmle/ql>`__ and `CodeQL for Go <https://github.com/github/codeql-go>`__ repositories on GitHub.
You can also find examples of queries developed to find security vulnerabilities and bugs in open source software projects on the `GitHub Security Lab website <https://securitylab.github.com/research>`__ and in the associated `repository <https://github.com/github/security-lab>`__.
Contributing queries
********************
Contributions to the standard queries and libraries are very welcome. For more information, see our `contributing guidelines <https://github.com/github/codeql/blob/master/CONTRIBUTING.md>`__.
Contributions to the standard queries and libraries are very welcome. For more information, see our `contributing guidelines <https://github.com/Semmle/ql/blob/master/CONTRIBUTING.md>`__.
If you are contributing a query to the open source GitHub repository, writing a custom query for LGTM, or using a custom query in an analysis with the CodeQL CLI, then you need to include extra metadata in your query to ensure that the query results are interpreted and displayed correctly. See the following topics for more information on query metadata:
- :doc:`Metadata for CodeQL queries <query-metadata>`
- `Query metadata style guide on GitHub <https://github.com/github/codeql/blob/master/docs/query-metadata-style-guide.md>`__
- `Query metadata style guide on GitHub <https://github.com/Semmle/ql/blob/master/docs/query-metadata-style-guide.md>`__
Query contributions to the open source GitHub repository may also have an accompanying query help file to provide information about their purpose for other users. For more information on writing query help, see the `Query help style guide on GitHub <https://github.com/github/codeql/blob/master/docs/query-help-style-guide.md>`__ and the :doc:`Query help files <query-help>`.
Query contributions to the open source GitHub repository may also have an accompanying query help file to provide information about their purpose for other users. For more information on writing query help, see the `Query help style guide on GitHub <https://github.com/Semmle/ql/blob/master/docs/query-help-style-guide.md>`__ and the :doc:`Query help files <query-help>`.
Query help files
****************
When you write a custom query, we also recommend that you write a query help file to explain the purpose of the query to other users. For more information, see the `Query help style guide <https://github.com/github/codeql/blob/master/docs/query-help-style-guide.md>`__ on GitHub, and the :doc:`Query help files <query-help>`.
When you write a custom query, we also recommend that you write a query help file to explain the purpose of the query to other users. For more information, see the `Query help style guide <https://github.com/Semmle/ql/blob/master/docs/query-help-style-guide.md>`__ on GitHub, and the :doc:`Query help files <query-help>`.
What next?
==========
- See the queries used in real-life variant analysis on the `GitHub Security Lab website <https://securitylab.github.com/research>`__.
- To learn more about writing path queries, see :doc:`Creating path queries <path-queries>`.
- Take a look at the `built-in queries <https://help.semmle.com/wiki/display/QL/Built-in+queries>`__ to see examples of the queries included in CodeQL.
- Explore the `query cookbooks <https://help.semmle.com/wiki/display/QL/QL+cookbooks>`__ to see how to access the basic language elements contained in the CodeQL libraries.
- For a full list of resources to help you learn CodeQL, including beginner tutorials and language-specific examples, visit `Learning CodeQL <https://help.semmle.com/QL/learn-ql/>`__.


@@ -189,8 +189,9 @@ The ``element`` that you select in the first column depends on the purpose of th
The alert message defined in the final column in the ``select`` statement can be developed to give more detail about the alert or path found by the query using links and placeholders. For more information, see :doc:`Defining the results of a query <select-statement>`.
Further reading
***************
What next?
**********
- `Exploring data flow with path queries <https://help.semmle.com/codeql/codeql-for-vscode/procedures/exploring-paths.html>`__
- `CodeQL repository <https://github.com/github/codeql>`__
- Take a look at the path queries for `C/C++ <https://help.semmle.com/wiki/label/CCPPOBJ/path-problem>`__, `C# <https://help.semmle.com/wiki/label/CSHARP/path-problem>`__, `Java <https://help.semmle.com/wiki/label/java/path-problem>`__, `JavaScript <https://help.semmle.com/wiki/label/js/path-problem>`__, and `Python <https://help.semmle.com/wiki/label/python/path-problem>`__ to see examples of these queries.
- Explore the `query cookbooks <https://help.semmle.com/wiki/display/QL/QL+cookbooks>`__ to see how to access the basic language elements contained in the CodeQL libraries.
- For a full list of resources to help you learn CodeQL, including beginner tutorials and language-specific examples, visit `Learning CodeQL <https://help.semmle.com/QL/learn-ql/>`__.


@@ -4,7 +4,7 @@ Query help files
Query help files tell users the purpose of a query, and recommend how to solve the potential problem the query finds.
This topic provides detailed information on the structure of query help files.
For more information about how to write useful query help in a style that is consistent with the standard CodeQL queries, see the `Query help style guide <https://github.com/github/codeql/blob/master/docs/query-help-style-guide.md>`__ on GitHub.
For more information about how to write useful query help in a style that is consistent with the standard CodeQL queries, see the `Query help style guide <https://github.com/Semmle/ql/blob/master/docs/query-help-style-guide.md>`__ on GitHub.
.. pull-quote::
@@ -12,8 +12,8 @@ For more information about how to write useful query help in a style that is con
Note
You can access the query help for CodeQL queries by visiting the `Built-in query pages <https://help.semmle.com/wiki/display/QL/Built-in+queries>`__.
You can also access the raw query help files in the `GitHub repository <https://github.com/github/codeql>`__.
For example, see the `JavaScript security queries <https://github.com/github/codeql/tree/master/javascript/ql/src/Security>`__ and `C/C++ critical queries <https://github.com/github/codeql/tree/master/cpp/ql/src/Critical>`__.
You can also access the raw query help files in the `GitHub repository <https://github.com/semmle/ql>`__.
For example, see the `JavaScript security queries <https://github.com/Semmle/ql/tree/master/javascript/ql/src/Security>`__ and `C/C++ critical queries <https://github.com/Semmle/ql/tree/master/cpp/ql/src/Critical>`__.
For queries run by default on LGTM, there are several different ways to access the query help. For further information, see `Where do I see the query help for a query on LGTM? <https://lgtm.com/help/lgtm/query-help#where-query-help-in-lgtm>`__ in the LGTM user help.
@@ -169,7 +169,7 @@ The ``include`` element can be used as a section or block element. The content
Section-level include elements
------------------------------
Section-level ``include`` elements can be located beneath the top-level ``qhelp`` element. For example, in `StoredXSS.qhelp <https://github.com/github/codeql/blob/master/csharp/ql/src/Security%20Features/CWE-079/StoredXSS.qhelp>`__, a full query help file is reused:
Section-level ``include`` elements can be located beneath the top-level ``qhelp`` element. For example, in `StoredXSS.qhelp <https://github.com/Semmle/ql/blob/master/csharp/ql/src/Security%20Features/CWE-079/StoredXSS.qhelp>`__, a full query help file is reused:
.. code-block:: xml
@@ -177,12 +177,12 @@ Section-level ``include`` elements can be located beneath the top-level ``qhelp`
<include src="XSS.qhelp" />
</qhelp>
In this example, the `XSS.qhelp <https://github.com/github/codeql/blob/master/csharp/ql/src/Security%20Features/CWE-079/XSS.qhelp>`__ file must conform to the standard for a full query help file as described above. That is, the ``qhelp`` element may only contain non-``fragment``, section-level elements.
In this example, the `XSS.qhelp <https://github.com/Semmle/ql/blob/master/csharp/ql/src/Security%20Features/CWE-079/XSS.qhelp>`__ file must conform to the standard for a full query help file as described above. That is, the ``qhelp`` element may only contain non-``fragment``, section-level elements.
Block-level include elements
----------------------------
Block-level ``include`` elements can be included beneath section-level elements. For example, an ``include`` element is used beneath the ``overview`` section in `ThreadUnsafeICryptoTransform.qhelp <https://github.com/github/codeql/blob/master/csharp/ql/src/Likely%20Bugs/ThreadUnsafeICryptoTransform.qhelp>`__:
Block-level ``include`` elements can be included beneath section-level elements. For example, an ``include`` element is used beneath the ``overview`` section in `ThreadUnsafeICryptoTransform.qhelp <https://github.com/Semmle/ql/blob/master/csharp/ql/src/Likely%20Bugs/ThreadUnsafeICryptoTransform.qhelp>`__:
.. code-block:: xml
@@ -193,7 +193,7 @@ Block-level ``include`` elements can be included beneath section-level elements.
...
</qhelp>
The included file, `ThreadUnsafeICryptoTransformOverview.qhelp <https://github.com/github/codeql/blob/master/csharp/ql/src/Likely%20Bugs/ThreadUnsafeICryptoTransformOverview.qhelp>`_, may only contain one or more ``fragment`` sections. For example:
The included file, `ThreadUnsafeICryptoTransformOverview.qhelp <https://github.com/Semmle/ql/blob/master/csharp/ql/src/Likely%20Bugs/ThreadUnsafeICryptoTransformOverview.qhelp>`_, may only contain one or more ``fragment`` sections. For example:
.. code-block:: xml
@@ -206,3 +206,8 @@ The included file, `ThreadUnsafeICryptoTransformOverview.qhelp <https://github.
</fragment>
</qhelp>
Further information
===================
- To learn more about contributing to the standard CodeQL queries and libraries, see our `Contributing guidelines <https://github.com/Semmle/ql/blob/master/CONTRIBUTING.md>`__ on GitHub.
- To learn more about writing custom queries, and how to format your code for clarity and consistency, see `Writing CodeQL queries <https://help.semmle.com/QL/learn-ql/writing-queries/writing-queries.html>`__.


@@ -7,8 +7,9 @@ About query metadata
--------------------
Any query that is run as part of an analysis includes a number of properties, known as query metadata. Metadata is included at the top of each query file as the content of a `QLDoc <https://help.semmle.com/QL/ql-spec/qldoc.html>`__ comment.
This metadata tells LGTM and the CodeQL `extension for VS Code <https://help.semmle.com/codeql/codeql-for-vscode.html>`__ how to handle the query and display its results correctly.
It also gives other users information about what the query results mean. For further information on query metadata, see the `query metadata style guide <https://github.com/github/codeql/blob/master/docs/query-metadata-style-guide.md#metadata-area>`__ in our `open source repository <https://github.com/github/codeql>`__ on GitHub.
For alerts and path queries, this metadata tells LGTM and the CodeQL `extension for VS Code <https://help.semmle.com/codeql/codeql-for-vscode.html>`__ how to handle the query and display its results correctly.
It also gives other users information about what the query results mean. For further information on query metadata, see the `query metadata style guide <https://github.com/Semmle/ql/blob/master/docs/query-metadata-style-guide.md#metadata-area>`__ in our `open source repository <https://github.com/semmle/ql>`__ on GitHub.
You can also add metric queries to LGTM, but the results are not shown. To see the results of metric queries, you can run them in the query console or in `Visual Studio Code <https://help.semmle.com/codeql/codeql-for-vscode.html>`__.
.. pull-quote::
@@ -16,36 +17,72 @@ It also gives other users information about what the query results mean. For fur
The exact metadata requirement depends on how you are going to run your query. For more information, see the section on query metadata in :doc:`About CodeQL queries <introduction-to-queries>`.
Metadata properties
-------------------
Core properties
---------------
The following properties are supported by all query files:
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value | Description |
+=======================+===========================+======================================================================================================================================================================================================================================================================================================================================================+
| ``@description`` | ``<text>`` | A sentence or short paragraph to describe the purpose of the query and *why* the result is useful or important. The description is written in plain text, and uses single quotes (``'``) to enclose code elements. |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@id`` | ``<text>`` | A sequence of words composed of lowercase letters or digits, delimited by ``/`` or ``-``, identifying and classifying the query. Each query must have a **unique** ID. To ensure this, it may be helpful to use a fixed structure for each ID. For example, the standard LGTM queries have the following format: ``<language>/<brief-description>``. |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@kind`` | | ``problem`` | Identifies the query is an alert (``@kind problem``) or a path (``@kind path-problem``). For further information on these query types, see :doc:`About CodeQL queries <introduction-to-queries>`. |
| | | ``path-problem`` | |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@name`` | ``<text>`` | A statement that defines the label of the query. The name is written in plain text, and uses single quotes (``'``) to enclose code elements. |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@tags`` | | ``correctness`` | These tags group queries together in broad categories to make it easier to search for them and identify them. In addition to the common tags listed here, there are also a number of more specific categories. For more information, see the |
| | | ``maintainability`` | `Query metadata style guide <https://github.com/github/codeql/blob/master/docs/query-metadata-style-guide.md#query-tags-tags>`__. |
| | | ``readability`` | |
| | | ``security`` | |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@precision`` | | ``medium``   | Indicates the percentage of query results that are true positives (as opposed to false positive results). This, along with the ``@problem.severity`` property, determines whether the results are displayed by default on LGTM. |
| | | ``high``   | |
| | | ``very-high`` | |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@problem.severity`` | | ``error`` | Defines the level of severity of any alerts generated by the query. This, along with the ``@precision`` property, determines whether the results are displayed by default on LGTM. |
| | | ``warning`` | |
| | | ``recommendation`` | |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value | Description |
+=======================+===========================+==============================================================================================================================================================================================================================================================================================================================================================================+
| ``@description`` | ``<text>`` | A sentence or short paragraph to describe the purpose of the query and *why* the result is useful or important. The description is written in plain text, and uses single quotes (``'``) to enclose code elements. |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@id`` | ``<text>`` | A sequence of words composed of lowercase letters or digits, delimited by ``/`` or ``-``, identifying and classifying the query. Each query must have a **unique** ID. To ensure this, it may be helpful to use a fixed structure for each ID. For example, the standard LGTM queries have the following format: ``<language>/<brief-description>``. |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@kind`` | | ``problem`` | Identifies whether the query is an alert (``@kind problem``), a path (``@kind path-problem``), or a metric (``@kind metric``). For further information on these query types, see :doc:`About CodeQL queries <introduction-to-queries>`. |
| | | ``path-problem`` | |
| | | ``metric`` | |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@name`` | ``<text>`` | A statement that defines the label of the query. The name is written in plain text, and uses single quotes (``'``) to enclose code elements. |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@tags`` | | ``correctness`` | These tags group queries together in broad categories to make it easier to search for them and identify them. In addition to the common tags listed here, there are also a number of more specific categories. For more information about some of the tags that are already used and what they mean, see `Query tags <https://lgtm.com/help/lgtm/query-tags>`__ on LGTM.com. |
| | | ``maintainability`` | |
| | | ``readability`` | |
| | | ``security`` | |
+-----------------------+---------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
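The ``@id`` and ``@kind`` rules in the table above can be checked mechanically. The following is a minimal Python sketch, not part of the CodeQL tooling: the names ``parse_metadata`` and ``check_core_properties`` are hypothetical, and the parser assumes each property sits on a single ``* @property value`` line of the QLDoc comment (continuation lines are ignored).

```python
import re

# Hypothetical helpers for illustration; the rules come from the table above.
# An ID is words of lowercase letters or digits, delimited by "/" or "-".
ID_PATTERN = re.compile(r"^[a-z0-9]+(?:[/-][a-z0-9]+)*$")
KNOWN_KINDS = {"problem", "path-problem", "metric"}
# Matches lines such as " * @kind problem" inside a QLDoc comment.
PROPERTY_LINE = re.compile(r"^\s*\*?\s*@([\w.]+)\s+(.*\S)\s*$")

EXAMPLE_QLDOC = """/**
 * @name Example query
 * @description Finds an example problem.
 * @kind problem
 * @id java/example-query
 */"""


def parse_metadata(qldoc):
    """Collect single-line ``@property value`` pairs from a QLDoc comment."""
    properties = {}
    for line in qldoc.splitlines():
        match = PROPERTY_LINE.match(line)
        if match:
            properties[match.group(1)] = match.group(2)
    return properties


def check_core_properties(properties):
    """Return a list of issues for the properties that are present."""
    issues = []
    query_id = properties.get("id")
    if query_id is not None and not ID_PATTERN.match(query_id):
        issues.append("@id must be lowercase words delimited by '/' or '-'")
    kind = properties.get("kind")
    if kind is not None and kind not in KNOWN_KINDS:
        issues.append("@kind must be one of: " + ", ".join(sorted(KNOWN_KINDS)))
    return issues


if __name__ == "__main__":
    print(check_core_properties(parse_metadata(EXAMPLE_QLDOC)))
```

The sketch only checks the format of properties that are present. It does not verify that an ID is unique across queries, which would need a view of every query file.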
Additional properties for problem and path-problem queries
----------------------------------------------------------
In addition to the core properties, alert queries (``@kind problem``) and path queries (``@kind path-problem``) support the following properties:
+-----------------------+------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value | Example | Notes |
+=======================+============+=======================+=====================================================================================================================================================================================================================+
| ``@precision`` | ``<type>`` | | ``medium``   | Indicates the percentage of query results that are true positives (as opposed to false positive results). This controls how alerts for problems found by the query are displayed in client applications. |
| | | | ``high``   | |
| | | | ``very-high`` | |
+-----------------------+------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@problem.severity`` | ``<type>`` | | ``error`` | Defines the level of severity of any alerts generated by the query. This controls how alerts are displayed in client applications. |
| | | | ``warning`` | |
| | | | ``recommendation`` | |
+-----------------------+------------+-----------------------+---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
Additional properties for metric queries
----------------------------------------
In addition to the core properties, metric queries (``@kind metric``) support the following properties:
+------------------------+--------------+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Property | Value | Example | Notes |
+========================+==============+===================+==========================================================================================================================================================================================================+
| ``@metricType`` | ``<type>`` | | ``file`` | Defines the code element that the query acts on. This information is used by client applications; it should match the type of result returned by the query. |
| | | | ``callable`` | |
| | | | ``package`` | |
| | | | ``project`` | |
| | | | ``reftype`` | |
+------------------------+--------------+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@metricAggregate`` | ``<method>`` | | ``avg`` | Defines the allowable aggregations for this metric. A space-separated list of the four possibilities ``sum``, ``avg``, ``min`` and ``max``. If it is not present, it defaults to ``sum avg``. |
| | | | ``sum`` | |
| | | | ``min`` | |
| | | | ``max`` | |
+--------------------+---+--------------+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@treemap.threshold`` | ``<number>`` | ``10`` | Optional, defines a metric threshold. Used with ``@treemap.warnOn`` to define a "danger area" on the metric charts displayed in client applications. |
+------------------------+--------------+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| ``@treemap.warnOn`` | ``<type>`` | | ``highValues`` | Optional, defines whether high or low values are dangerous. Used with ``@treemap.threshold`` to define a "danger area" on the metric charts displayed in client applications. |
| | | | ``lowValues`` | |
+------------------------+--------------+-------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
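The defaulting rule for ``@metricAggregate`` can be illustrated with a short Python sketch. This is an assumption about how a client application might interpret the property, not the behavior of any particular tool: the function name ``resolve_metric_aggregates`` is hypothetical, and treating an empty value the same as a missing one is a choice made here for simplicity.

```python
# Hypothetical helper; the allowed values and the default come from the table above.
ALLOWED_AGGREGATES = ("sum", "avg", "min", "max")
DEFAULT_AGGREGATES = ("sum", "avg")


def resolve_metric_aggregates(value=None):
    """Turn a space-separated ``@metricAggregate`` value into a list.

    A missing (or, by assumption, empty) value falls back to ``sum avg``.
    """
    if value is None or not value.strip():
        return list(DEFAULT_AGGREGATES)
    aggregates = value.split()
    unknown = [name for name in aggregates if name not in ALLOWED_AGGREGATES]
    if unknown:
        raise ValueError("unsupported aggregation(s): " + ", ".join(unknown))
    return aggregates


if __name__ == "__main__":
    print(resolve_metric_aggregates())
    print(resolve_metric_aggregates("avg max"))
```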
Additional properties for filter queries
----------------------------------------
@@ -61,4 +98,8 @@ Here is the metadata for one of the standard Java queries:
.. |image0| image:: ../../images/query-metadata.png
For more examples of query metadata, see the standard CodeQL queries in our `GitHub repository <https://github.com/github/codeql>`__.
For more examples of query metadata, see the standard CodeQL queries in our `GitHub repository <https://github.com/semmle/ql>`__.


@@ -15,7 +15,7 @@ This topic explains how to write your select statement to generate helpful analy
Overview
--------
Alert queries must have the property ``@kind problem`` defined in their metadata. For further information, see :doc:`Metadata for CodeQL queries <query-metadata>`.
Alert queries must have the property ``@kind problem`` defined in their metadata. For further information, see the :doc:`query metadata reference <query-metadata>`.
In their most basic form, the ``select`` statement must select two 'columns':
- **Element**—a code element that's identified by the query. This defines the location of the alert.
@@ -27,7 +27,7 @@ If you look at some of the LGTM queries, you'll see that they can select extra e
Note
An in-depth discussion of ``select`` statements for path queries is not included in this topic. However, you can develop the string column of the ``select`` statement in the same way as for alert queries. For more specific information about path queries, see :doc:`Creating path queries <path-queries>`.
An in-depth discussion of ``select`` statements for path and metric queries is not included in this topic. However, you can develop the string column of the ``select`` statement in the same way as for alert queries. For more specific information about path queries, see :doc:`Creating path queries <path-queries>`.
Developing a select statement
-----------------------------
@@ -105,8 +105,3 @@ The new elements added here don't need to be clickable, so we added them directl
.. image:: ../../images/ql-select-statement-similarity.png
:alt: Results showing the extent of similarity
:class: border
Further reading
---------------
- `CodeQL repository <https://github.com/github/codeql>`__


@@ -80,12 +80,9 @@ Query modules
A query module is defined by a ``.ql`` file. It can contain any of the elements listed
in :ref:`module-bodies` below.
Query modules are slightly different from other modules:
- A query module can't be imported.
- A query module must have at least one query in its
:ref:`namespace <namespaces>`. This is usually a :ref:`select clause <select-clauses>`,
but can also be a :ref:`query predicate <query-predicates>`.
The difference is that a query module must have at least one query in its
:ref:`namespace <namespaces>`. This is usually a :ref:`select clause <select-clauses>`,
but can also be a :ref:`query predicate <query-predicates>`.
For example:

View File

@@ -385,7 +385,7 @@ Algebraic datatypes
*******************
.. note:: The syntax for algebraic datatypes is considered experimental and is subject to
change. However, they appear in the `standard QL libraries <https://github.com/github/codeql>`_
change. However, they appear in the `standard QL libraries <https://github.com/Semmle/ql>`_
so the following sections should help you understand those examples.
An algebraic datatype is another form of user-defined type, declared with the keyword ``newtype``.

View File

@@ -0,0 +1,83 @@
# -*- coding: utf-8 -*-
#
# QL specifications build configuration file, created
# on Weds Nov 21 2018.
#
# This file is execfile()d with the current directory set to its
# containing dir.
#
# Note that not all possible configuration values are present in this
# autogenerated file.
#
# All configuration values have a default; values that are commented out
# serve to show the default.
# For details of all possible config values,
# see https://www.sphinx-doc.org/en/master/usage/configuration.html
###############################################################################
#
# Modified 22052019.
# The configuration values below are specific to the specifications
# To amend html_theme_options, update version/release number, or add more sphinx extensions,
# refer to code/documentation/ql-documentation/global-sphinx-files/global-conf.py
##############################################################################
# -- Project-specific configuration -----------------------------------
import os
# Import global config values
with open(os.path.abspath("../global-sphinx-files/global-conf.py")) as in_file:
    exec(in_file.read())
# QLlexer doesn't cover everything included in the specs.
# Syntax highlighting turned off until lexer has been expanded.
highlight_language ='none'
# The master toctree document.
master_doc = 'index'
# Project-specific information.
project = u'QL specifications'
# The version info for this project, if different from version and release in main conf.py file.
# The short X.Y version.
#version = u'test'
# The full version, including alpha/beta/rc tags.
#release = u'test'
# -- Options for HTML output ----------------------------------------------
# The name for this set of Sphinx documents. If None, it defaults to
# "<project> v<release> documentation".
html_title = 'QL specifications'
# Output file base name for HTML help builder.
htmlhelp_basename = 'QL specifications'
# -- Currently unused, but potentially useful, configs--------------------------------------
# Add any paths that contain custom themes here, relative to this directory.
#html_theme_path = []
# A shorter title for the navigation bar. Default is the same as html_title.
#html_short_title = None
# The name of an image file (relative to this directory) to place at the top
# of the sidebar.
#html_logo = None
# Custom sidebar templates, maps document names to template names.
#html_sidebars = {}
# Add any extra paths that contain custom files (such as robots.txt or
# .htaccess) here, relative to this directory. These files are copied
# directly to the root of the documentation.
#html_extra_path = []
# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
# exclude_patterns = []
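The shared-configuration pattern used in this file can be sketched in isolation as follows. This is only an illustration under stated assumptions: the contents of `GLOBAL_CONF` and the `load_conf` helper are hypothetical, and the sketch executes the shared file into a separate dictionary, whereas the real `conf.py` executes it directly into its own module namespace.

```python
import os
import tempfile

# Hypothetical shared settings standing in for global-conf.py.
GLOBAL_CONF = "html_theme = 'alabaster'\nhighlight_language = 'ql'\n"


def load_conf(global_conf_path):
    """Execute the shared config, then apply project-specific overrides."""
    namespace = {}
    with open(global_conf_path) as in_file:
        exec(in_file.read(), namespace)
    # Values assigned after the exec() win over the shared defaults.
    namespace["highlight_language"] = "none"
    namespace["project"] = "QL specifications"
    # exec() adds __builtins__ to the namespace; drop it from the result.
    namespace.pop("__builtins__", None)
    return namespace


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "global-conf.py")
    with open(path, "w") as out_file:
        out_file.write(GLOBAL_CONF)
    conf = load_conf(path)
```

Because the overrides are assigned after the shared file has been executed, a project-specific value such as `highlight_language` replaces the shared default, while untouched values such as `html_theme` carry through.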

View File

@@ -0,0 +1,5 @@
README
######
The specifications have moved to ``ql/docs/language/ql-handbook``.
See https://github.com/github/semmle-docs/issues/21 for details of the restructuring.

View File

@@ -68,7 +68,7 @@ A simple CodeQL query
We are going to write a simple query which finds “if statements” with empty “then” blocks, so we can highlight the results like those on the previous slide. The query can be run in the `query console on LGTM <https://lgtm.com/query>`__, or in your `IDE <https://lgtm.com/help/lgtm/running-queries-ide>`__.
A `query <https://help.semmle.com/QL/ql-handbook/queries.html>`__ consists of a “select” clause that indicates what results should be returned. Typically it will also provide a “from” clause to declare some variables, and a “where” clause to state conditions over those variables. For more information on the structure of query files (including links to useful topics in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__), see `About CodeQL queries <https://help.semmle.com/QL/learn-ql/ql/writing-queries/introduction-to-queries.html>`__.
A `query <https://help.semmle.com/QL/ql-handbook/queries.html>`__ consists of a “select” clause that indicates what results should be returned. Typically it will also provide a “from” clause to declare some variables, and a “where” clause to state conditions over those variables. For more information on the structure of query files (including links to useful topics in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__), see `Introduction to query files <https://help.semmle.com/QL/learn-ql/ql/writing-queries/introduction-to-queries.html>`__.
In our example here, the first line of the query imports the `CodeQL library for C/C++ <https://help.semmle.com/qldoc/cpp/>`__, which defines concepts like ``IfStmt`` and ``Block``.
The query proper starts by declaring two variables: ifStmt and block. These variables represent sets of values in the database, according to the type of each of the variables. For example, ifStmt has the type IfStmt, which means it represents the set of all if statements in the program.
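As a rough analogy only, the "from", "where" and "select" parts of a query behave much like a comprehension over sets of values. The sketch below is Python, not QL, and its `IfStmt` and `Block` classes are hypothetical stand-ins for the CodeQL library types of the same name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    stmts: tuple = ()


@dataclass(frozen=True)
class IfStmt:
    name: str
    then: Block


def empty_then_ifs(if_stmts, blocks):
    """Model of a query that finds if statements with an empty then-block."""
    return [
        # "select": the columns reported for each result.
        (if_stmt.name, "This if-statement has an empty then-block.")
        # "from": each variable ranges over every value of its type.
        for if_stmt in if_stmts
        for block in blocks
        # "where": conditions relating the variables.
        if if_stmt.then is block and len(block.stmts) == 0
    ]


empty = Block()
full = Block(stmts=("x = 1",))
result = empty_then_ifs(
    [IfStmt("if_a", empty), IfStmt("if_b", full)],
    [empty, full],
)
```

Here `result` contains a single row for `if_a`, because only that statement's then-block has no statements. A real query is evaluated by the QL engine rather than by nested loops, so the analogy covers only the meaning of the three clauses, not how they are executed.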

View File

@@ -165,8 +165,8 @@ Add an additional taint step that (heuristically) taints a local variable if it
.. code-block:: ql
class TaintedOGNLConfig extends TaintTracking::Configuration {
override predicate isAdditionalTaintStep(DataFlow::Node node1,
DataFlow::Node node2) {
override predicate isAdditionalTaintStep(DataFlow::Node pred,
DataFlow::Node succ) {
exists(Field f, RefType t |
node1.asExpr() = f.getAnAssignedValue() and
node2.asExpr() = f.getAnAccess() and

View File

@@ -68,7 +68,7 @@ A simple CodeQL query
We are going to write a simple query which finds “if statements” with empty “then” blocks, so we can highlight the results like those on the previous slide. The query can be run in the `query console on LGTM <https://lgtm.com/query>`__, or in your `IDE <https://lgtm.com/help/lgtm/running-queries-ide>`__.
A `query <https://help.semmle.com/QL/ql-handbook/queries.html>`__ consists of a “select” clause that indicates what results should be returned. Typically it will also provide a “from” clause to declare some variables, and a “where” clause to state conditions over those variables. For more information on the structure of query files (including links to useful topics in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__), see `About CodeQL queries <https://help.semmle.com/QL/learn-ql/ql/writing-queries/introduction-to-queries.html>`__.
A `query <https://help.semmle.com/QL/ql-handbook/queries.html>`__ consists of a “select” clause that indicates what results should be returned. Typically it will also provide a “from” clause to declare some variables, and a “where” clause to state conditions over those variables. For more information on the structure of query files (including links to useful topics in the `QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__), see `Introduction to query files <https://help.semmle.com/QL/learn-ql/ql/writing-queries/introduction-to-queries.html>`__.
In our example here, the first line of the query imports the `CodeQL library for Java <https://help.semmle.com/qldoc/java/>`__, which defines concepts like ``IfStmt`` and ``Block``.
The query proper starts by declaring two variables: ifStmt and block. These variables represent sets of values in the database, according to the type of each of the variables. For example, ``ifStmt`` has the type ``IfStmt``, which means it represents the set of all if statements in the program.

View File

@@ -39,9 +39,9 @@ The basic representation of an analyzed program is an *abstract syntax tree (AST
The following topics contain overviews of the important AST classes and CodeQL libraries for C/C++, C#, and Java:
- `CodeQL library for C/C++ <https://help.semmle.com/QL/learn-ql/cpp/introduce-libraries-cpp.html>`__
- `CodeQL library for C# <https://help.semmle.com/QL/learn-ql/csharp/introduce-libraries-csharp.html>`__
- `CodeQL library for Java <https://help.semmle.com/QL/learn-ql/java/introduce-libraries-java.html>`__
- `Introducing the C/C++ libraries <https://help.semmle.com/QL/learn-ql/cpp/introduce-libraries-cpp.html>`__
- `Introducing the C# libraries <https://help.semmle.com/QL/learn-ql/csharp/introduce-libraries-csharp.html>`__
- `Introducing the Java libraries <https://help.semmle.com/QL/learn-ql/java/introduce-libraries-java.html>`__
Database representations of ASTs
@@ -65,6 +65,6 @@ Entity types are rarely used directly, the usual pattern is to define a class th
For example, the database schemas for C/C++, C#, and Java CodeQL databases are here:
- https://github.com/github/codeql/blob/master/cpp/ql/src/semmlecode.cpp.dbscheme
- https://github.com/github/codeql/blob/master/csharp/ql/src/semmlecode.csharp.dbscheme
- https://github.com/github/codeql/blob/master/java/ql/src/config/semmlecode.dbscheme
- https://github.com/Semmle/ql/blob/master/cpp/ql/src/semmlecode.cpp.dbscheme
- https://github.com/Semmle/ql/blob/master/csharp/ql/src/semmlecode.csharp.dbscheme
- https://github.com/Semmle/ql/blob/master/java/ql/src/config/semmlecode.dbscheme

View File

@@ -4,6 +4,6 @@ You can download the database as a zip file by clicking the link on the slide ab
#. Add the unzipped database to Visual Studio Code
#. Upgrade the database if necessary
For further information, see `Analyzing your projects <https://help.semmle.com/codeql/codeql-for-vscode/procedures/using-extension.html>`__ in the CodeQL for Visual Studio Code help.
For further information, see `Using the extension <https://help.semmle.com/codeql/codeql-for-vscode/procedures/using-extension.html>`__ in the CodeQL for Visual Studio Code help.
Note that results generated in the query console are likely to differ from those generated in CodeQL for Visual Studio Code, as LGTM.com analyzes the most recent revisions of each project that has been added, whereas the CodeQL database available to download above is based on an historical version of the codebase.

View File

@@ -103,9 +103,9 @@ Analysis overview
CodeQL analysis works by extracting a queryable database from your project. For compiled languages, the tools observe an ordinary build of the source code. Each time a compiler is invoked to process a source file, a copy of that file is made, and all relevant information about the source code (syntactic data about the abstract syntax tree, semantic data like name binding and type information, data on the operation of the C preprocessor, etc.) is collected. For interpreted languages, the extractor gathers similar information by running directly on the source code. Multi-language code bases are analyzed one language at a time.
Once the extraction finishes, all this information is collected into a single `CodeQL database <https://help.semmle.com/codeql/about-codeql.html#about-codeql-databases>`__, which is then ready to query, possibly on a different machine. A copy of the source files, made at the time the database was created, is also included in the CodeQL database so analysis results can be displayed at the correct location in the code. The database schema is (source) language specific.
Once the extraction finishes, all this information is collected into a single `CodeQL database <https://help.semmle.com/QL/learn-ql/database.html>`__, which is then ready to query, possibly on a different machine. A copy of the source files, made at the time the database was created, is also included in the CodeQL database so analysis results can be displayed at the correct location in the code. The database schema is (source) language specific.
Queries are written in QL and usually depend on one or more of the `standard CodeQL libraries <https://github.com/github/codeql>`__ (and of course you can write your own custom libraries). They are compiled into an efficiently executable format by the QL compiler and then run on a CodeQL database by the QL evaluator, either on a remote worker machine or locally on a developer's machine.
Queries are written in QL and usually depend on one or more of the `standard CodeQL libraries <https://github.com/semmle/ql>`__ (and of course you can write your own custom libraries). They are compiled into an efficiently executable format by the QL compiler and then run on a CodeQL database by the QL evaluator, either on a remote worker machine or locally on a developer's machine.
Query results can be interpreted and presented in a variety of ways, including displaying them in an `IDE extension <https://lgtm.com/help/lgtm/running-queries-ide>`__ such as CodeQL for Visual Studio Code, or in a web dashboard as on `LGTM <https://lgtm.com/help/lgtm/about-lgtm>`__.
@@ -129,7 +129,7 @@ QL is:
- All common logic connectives are available, including quantifiers like ``exists``, which can also introduce new variables.
- The language is declarative: the user focuses on stating what they would like to find, and leaves the details of how to evaluate the query to the engine.
- The object-oriented layer allows us to develop rich standard libraries for program analysis. These model the common AST node types, control flow and name lookup, and define further layers on top, for example control flow or data flow analysis. The `standard CodeQL libraries and queries <https://github.com/github/codeql>`__ ship as source and can be inspected by the user, and new abstractions are readily defined.
- The object-oriented layer allows us to develop rich standard libraries for program analysis. These model the common AST node types, control flow and name lookup, and define further layers on top, for example control flow or data flow analysis. The `standard CodeQL libraries and queries <https://github.com/semmle/ql>`__ ship as source and can be inspected by the user, and new abstractions are readily defined.
- The database generated by the CodeQL tools is treated as read-only; queries cannot insert new data into it, though they can inspect its contents in various ways.
You can start writing and running queries on open source projects in the `query console <https://lgtm.com/query>`__ on LGTM.com. You can also download CodeQL databases from LGTM.com to query locally, by `running queries in your IDE <https://lgtm.com/help/lgtm/running-queries-ide>`__.

View File

@@ -70,7 +70,7 @@ Local vs global data flow
For further information, see:
- `About data flow analysis <https://help.semmle.com/QL/learn-ql/ql/intro-to-data-flow.html>`__
- `Introduction to data flow analysis with CodeQL <https://help.semmle.com/QL/learn-ql/ql/intro-to-data-flow.html>`__
.. rst-class:: background2

View File

@@ -1 +0,0 @@
The `abstract syntax tree (AST) <https://en.wikipedia.org/wiki/Abstract_syntax_tree>`__ represents the syntactic structure of a program. Nodes on the AST represent elements such as statements and expressions.

View File

@@ -1,2 +0,0 @@
- `QL language reference <https://help.semmle.com/QL/ql-handbook>`__
- `CodeQL tools <https://help.semmle.com/codeql/codeql-tools.html>`__

View File

@@ -1,4 +0,0 @@
- `CodeQL queries for C and C++ <https://github.com/github/codeql/tree/master/cpp/ql/src>`__
- `Example queries for C and C++ <https://github.com/github/codeql/tree/master/cpp/ql/examples>`__
- `CodeQL library reference for C and C++ <https://help.semmle.com/qldoc/cpp/>`__

View File

@@ -1,4 +0,0 @@
- `CodeQL queries for C# <https://github.com/github/codeql/tree/master/csharp/ql/src>`__
- `Example queries for C# <https://github.com/github/codeql/tree/master/csharp/ql/examples>`__
- `CodeQL library reference for C# <https://help.semmle.com/qldoc/csharp/>`__

View File

@@ -1,3 +0,0 @@
- `CodeQL queries for Go <https://github.com/github/codeql-go/tree/master/ql/src>`__
- `Example queries for Go <https://github.com/github/codeql-go/tree/master/ql/examples>`__
- `CodeQL library reference for Go <https://help.semmle.com/qldoc/go/>`__

View File

@@ -1,4 +0,0 @@
- `CodeQL queries for Java <https://github.com/github/codeql/tree/master/java/ql/src>`__
- `Example queries for Java <https://github.com/github/codeql/tree/master/java/ql/examples>`__
- `CodeQL library reference for Java <https://help.semmle.com/qldoc/java/>`__

View File

@@ -1,3 +0,0 @@
- `CodeQL queries for JavaScript <https://github.com/github/codeql/tree/master/javascript/ql/src>`__
- `Example queries for JavaScript <https://github.com/github/codeql/tree/master/javascript/ql/examples>`__
- `CodeQL library reference for JavaScript <https://help.semmle.com/qldoc/javascript/>`__

View File

@@ -1 +0,0 @@
You can model data flow paths in CodeQL by creating path queries. To view data flow paths generated by a path query in CodeQL for VS Code, you need to make sure that it has the correct metadata and ``select`` clause. For more information, see `Creating path queries <https://help.semmle.com/QL/learn-ql/writing-queries/path-queries.html>`__.

View File

@@ -1,4 +0,0 @@
- `CodeQL queries for Python <https://github.com/github/codeql/tree/master/python/ql/src>`__
- `Example queries for Python <https://github.com/github/codeql/tree/master/python/ql/examples>`__
- `CodeQL library reference for Python <https://help.semmle.com/qldoc/python/>`__

View File

@@ -0,0 +1,3 @@
- "`QL language reference <https://help.semmle.com/QL/ql-handbook/index.html>`__"
- `Python cookbook queries <https://help.semmle.com/wiki/display/CBPYTHON>`__ in the Semmle wiki
- `Python queries in action <https://lgtm.com/search?q=language%3Apython&t=rules>`__ on LGTM.com

View File

@@ -6,6 +6,8 @@ CodeQL and LGTM version |version| support analysis of the following languages co
Note that where there are several versions or dialects of a language, the supported variants are listed.
If your code requires a particular version of a compiler, check that this version is included below.
If you have any questions about language and compiler support, you can find help on the `GitHub Security Lab discussions board <https://github.com/github/securitylab/discussions>`__.
Customers should contact their usual Semmle contact with any questions.
If you're not a customer yet, contact us at info@semmle.com
with any questions you have about language and compiler support.
.. include:: reusables/versions-compilers.rst

View File

@@ -11,22 +11,23 @@
Microsoft extensions (up to VS 2019),
Arm Compiler 5 [2]_","``.cpp``, ``.c++``, ``.cxx``, ``.hpp``, ``.hh``, ``.h++``, ``.hxx``, ``.c``, ``.cc``, ``.h``"
C#,C# up to 8.0,"Microsoft Visual Studio up to 2019 with .NET up to 4.8,
C#,C# up to 8.0 with .NET up to 4.8 [3]_,"Microsoft Visual Studio up to 2019,
.NET Core up to 3.1","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``"
.NET Core up to 3.0","``.sln``, ``.csproj``, ``.cs``, ``.cshtml``, ``.xaml``"
Go (aka Golang), "Go up to 1.14", "Go 1.11 or more recent", ``.go``
Java,"Java 6 to 14 [3]_","javac (OpenJDK and Oracle JDK),
Java,"Java 6 to 14 [4]_","javac (OpenJDK and Oracle JDK),
Eclipse compiler for Java (ECJ) [4]_",``.java``
JavaScript,ECMAScript 2019 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhm``, ``.xhtml``, ``.vue``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [5]_"
Eclipse compiler for Java (ECJ) [5]_",``.java``
JavaScript,ECMAScript 2019 or lower,Not applicable,"``.js``, ``.jsx``, ``.mjs``, ``.es``, ``.es6``, ``.htm``, ``.html``, ``.xhm``, ``.xhtml``, ``.vue``, ``.json``, ``.yaml``, ``.yml``, ``.raml``, ``.xml`` [6]_"
Python,"2.7, 3.5, 3.6, 3.7, 3.8",Not applicable,``.py``
TypeScript [6]_,"2.6-3.7",Standard TypeScript compiler,"``.ts``, ``.tsx``"
TypeScript [7]_,"2.6-3.7",Standard TypeScript compiler,"``.ts``, ``.tsx``"
.. container:: footnote-group
.. [1] Support for the clang-cl compiler is preliminary.
.. [2] Support for the Arm Compiler (armcc) is preliminary.
.. [3] Builds that execute on Java 6 to 14 can be analyzed. The analysis understands Java 14 standard language features.
.. [4] ECJ is supported when the build invokes it via the Maven Compiler plugin or the Takari Lifecycle plugin.
.. [5] JSX and Flow code, YAML, JSON, HTML, and XML files may also be analyzed with JavaScript files.
.. [6] TypeScript analysis is performed by running the JavaScript extractor with TypeScript enabled. This is the default for LGTM.
.. [3] In addition, support is included for the preview features of C# 8.0 and .NET Core 3.0.
.. [4] Builds that execute on Java 6 to 14 can be analyzed. The analysis understands Java 14 standard language features.
.. [5] ECJ is supported when the build invokes it via the Maven Compiler plugin or the Takari Lifecycle plugin.
.. [6] JSX and Flow code, YAML, JSON, HTML, and XML files may also be analyzed with JavaScript files.
.. [7] TypeScript analysis is performed by running the JavaScript extractor with TypeScript enabled. This is the default for LGTM.

View File

@@ -3,7 +3,7 @@
## Introduction
This document describes how to format the code you contribute to this repository. It covers aspects such as layout, white-space, naming, and documentation. Adhering to consistent standards makes code easier to read and maintain. Of course, these are only guidelines, and can be overridden as the need arises on a case-by-case basis. Where existing code deviates from these guidelines, prefer consistency with the surrounding code.
Note, if you use [CodeQL for Visual Studio Code](https://help.semmle.com/codeql/codeql-for-vscode/procedures/about-codeql-for-vscode.html), you can autoformat your query in the editor.
Note, if you use CodeQL for VS Code, you can autoformat your query in the [Editor](https://help.semmle.com/codeql/codeql-for-vscode/reference/editor.html#autoformatting).
Words in *italic* are defined in the [Glossary](#glossary).
@@ -216,7 +216,7 @@ class Type extends ... {
General requirements:
1. Documentation *must* adhere to the [QLDoc specification](https://help.semmle.com/QL/ql-handbook/qldoc.html).
1. Documentation *must* adhere to the [QLDoc specification](https://help.semmle.com/QL/QLDocSpecification.html).
1. Use `/** ... */` for documentation, even for single line comments.
1. For single-line documentation, the `/**` and `*/` are written on the same line as the comment.
1. For multi-line documentation, the `/**` and `*/` are written on separate lines. There is a `*` preceding each comment line, aligned on the first `*`.
@@ -417,16 +417,16 @@ deprecated Expr getInitializer()
| Phrase | Meaning |
|-------------|----------|
| *[annotation](https://help.semmle.com/QL/ql-handbook/language.html#annotations)* | An additional specifier used to modify a declaration, such as `private`, `override`, `deprecated`, `pragma`, `bindingset`, or `cached`. |
| *[annotation](https://help.semmle.com/QL/QLLanguageSpecification.html#annotations)* | An additional specifier used to modify a declaration, such as `private`, `override`, `deprecated`, `pragma`, `bindingset`, or `cached`. |
| *body* | The text inside `{ }`, `( )`, or each section of an `if`-`then`-`else` or `from`-`where`-`select`. |
| *binary operator* | An operator with two operands, such as comparison operators, `and`, `or`, `implies`, or arithmetic operators. |
| *call* | A *formula* that invokes a predicate, e.g. `this.isStatic()` or `calls(a,b)`. |
| *[conjunct](https://help.semmle.com/QL/ql-handbook/language.html#conjunctions)* | A formula that is an operand to an `and`. |
| *[conjunct](https://help.semmle.com/QL/QLLanguageSpecification.html#conjunctions)* | A formula that is an operand to an `and`. |
| *declaration* | A class, module, predicate, field or newtype. |
| *[disjunct](https://help.semmle.com/QL/ql-handbook/language.html#disjunctions)* | A formula that is an operand to an `or`. |
| *[formula](https://help.semmle.com/QL/ql-handbook/language.html#formulas)* | A logical expression, such as `A = B`, a *call*, a *quantifier*, `and`, `or`, `not`, `in` or `instanceof`. |
| *[disjunct](https://help.semmle.com/QL/QLLanguageSpecification.html#disjunctions)* | A formula that is an operand to an `or`. |
| *[formula](https://help.semmle.com/QL/QLLanguageSpecification.html#formulas)* | A logical expression, such as `A = B`, a *call*, a *quantifier*, `and`, `or`, `not`, `in` or `instanceof`. |
| *should/should not/avoid/prefer* | Adhere to this rule wherever possible, where it makes sense. |
| *may/can* | This is a reasonable alternative, to be used with discretion. |
| *must/always/do not* | Always adhere to this rule. |
| *[quantifier/aggregation](https://help.semmle.com/QL/ql-handbook/language.html#aggregations)* | `exists`, `count`, `strictcount`, `any`, `forall`, `forex` and so on. |
| *[quantifier/aggregation](https://help.semmle.com/QL/QLLanguageSpecification.html#aggregations)* | `exists`, `count`, `strictcount`, `any`, `forall`, `forex` and so on. |
| *variable* | A parameter to a predicate, a field, a from variable, or a variable introduced by a *quantifier* or *aggregation*. |

View File

@@ -36,7 +36,7 @@ Section-level elements are used to group the information within the query help f
3. `example`—an example of code showing the problem. Where possible, this section should also include a solution to the issue.
4. `references`—relevant references, such as authoritative sources on language semantics and best practice.
For further information about the other section-level, block, list and table elements supported by query help files, see [Query help files](https://help.semmle.com/QL/learn-ql/ql/writing-queries/query-help.html) on help.semmle.com.
For further information about the other section-level, block, list and table elements supported by query help files, see the [Query help reference](https://help.semmle.com/QL/learn-ql/ql/writing-queries/query-help.html) on help.semmle.com.
## English style

View File

@@ -11,22 +11,22 @@ Query files have the extension `.ql`. Each file has two distinct areas:
* Metadata area: displayed at the top of the file, contains the metadata that defines how results for the query are interpreted and gives a brief description of the purpose of the query.
* Query definition: defined using QL. The query includes a select statement, which defines the content and format of the results. For further information about writing QL, see the following topics:
* [Learning CodeQL](https://help.semmle.com/QL/learn-ql/index.html)
* [QL language reference](https://help.semmle.com/QL/ql-handbook/index.html)
* [CodeQL style guide](https://github.com/github/codeql/blob/master/docs/ql-style-guide.md)
* [QL language handbook](https://help.semmle.com/QL/ql-handbook/index.html)
* [QL language specification](https://help.semmle.com/QL/ql-spec/language.html)
* [CodeQL style guide](https://github.com/Semmle/ql/blob/master/docs/ql-style-guide.md)
For examples of query files for the languages supported by CodeQL, visit the following links:
* [C/C++ queries](https://help.semmle.com/wiki/display/CCPPOBJ/)
* [C# queries](https://help.semmle.com/wiki/display/CSHARP/)
* [Go queries](https://help.semmle.com/wiki/display/GO/)
* [Java queries](https://help.semmle.com/wiki/display/JAVA/)
* [JavaScript queries](https://help.semmle.com/wiki/display/JS/)
* [Python queries](https://help.semmle.com/wiki/display/PYTHON/)
## Metadata area
Query file metadata contains important information that defines the identifier and purpose of the query. The metadata is included as the content of a valid [QLDoc](https://help.semmle.com/QL/ql-handbook/qldoc.html) comment, on lines with leading whitespace followed by `*`, between an initial `/**` and a trailing `*/`. For example:
Query file metadata contains important information that defines the identifier and purpose of the query. The metadata is included as the content of a valid [QLDoc](https://help.semmle.com/QL/ql-spec/qldoc.html) comment, on lines with leading whitespace followed by `*`, between an initial `/**` and a trailing `*/`. For example:
```
/**
@@ -42,7 +42,7 @@ Query file metadata contains important information that defines the identifier a
*/
```
To help others use your query, and to ensure that the query works correctly on LGTM, you should include all of the required information outlined below in the metadata, and as much of the optional information as possible. For further information on query metadata see [Metadata for CodeQL queries](https://help.semmle.com/QL/learn-ql/ql/writing-queries/query-metadata.html) on help.semmle.com.
To help others use your query, and to ensure that the query works correctly on LGTM, you should include all of the required information outlined below in the metadata, and as much of the optional information as possible. For further information on query metadata see [Query metadata](https://help.semmle.com/QL/learn-ql/ql/writing-queries/query-metadata.html) on help.semmle.com.
@@ -134,7 +134,6 @@ There are also more specific `@tags` that can be added. See, the following pages
* [C/C++ queries](https://help.semmle.com/wiki/display/CCPPOBJ/)
* [C# queries](https://help.semmle.com/wiki/display/CSHARP/)
* [Go queries](https://help.semmle.com/wiki/display/GO/)
* [Java queries](https://help.semmle.com/wiki/display/JAVA/)
* [JavaScript queries](https://help.semmle.com/wiki/display/JS/)
* [Python queries](https://help.semmle.com/wiki/display/PYTHON/)
@@ -159,7 +158,7 @@ When you tag a query like this, the associated CWE pages from [MITRE.org](http:/
## QL area
### Alert messages
### Alert messages
The select clause of each alert query defines the alert message that is displayed for each result found by the query. Alert messages are strings that concisely describe the problem that the alert is highlighting and, if possible, also provide some context. For consistency, alert messages should adhere to the following guidelines:
@@ -168,15 +167,14 @@ The select clause of each alert query defines the alert message that is displaye
* Program element references should be in 'single quotes' to distinguish them from ordinary words. Quotes are not needed around substitutions ($@).
* Avoid constant alert message strings and include some context, if possible. For example, `The class 'Foo' is duplicated as 'Bar'.` is preferable to `This class is duplicated here.`
* Where you reference another program element, link to it if possible using a substitution (`$@`). Links should be used inline in the sentence, rather than as parenthesised lists or appositions.
* When a message contains multiple links, construct a sentence that has the most variable link (that is, the link with most targets) last. For further information, see [Defining the results of a query](https://help.semmle.com/QL/learn-ql/ql/writing-queries/select-statement.html).
* When a message contains multiple links, construct a sentence that has the most variable link (that is, the link with most targets) last. For further information, see [Defining select statements](https://help.semmle.com/QL/learn-ql/ql/writing-queries/select-statement.html).
For examples of select clauses and alert messages, see the query source files at the following pages:
* [C/C++ queries](https://help.semmle.com/wiki/display/CCPPOBJ/)
* [C# queries](https://help.semmle.com/wiki/display/CSHARP/)
* [Go queries](https://help.semmle.com/wiki/display/GO/)
* [Java queries](https://help.semmle.com/wiki/display/JAVA/)
* [JavaScript queries](https://help.semmle.com/wiki/display/JS/)
* [Python queries](https://help.semmle.com/wiki/display/PYTHON/)
For further information on query writing, see [CodeQL queries](https://help.semmle.com/QL/learn-ql/ql/writing-queries/writing-queries.html). For more information on learning CodeQL, see [Learning CodeQL](https://help.semmle.com/QL/learn-ql/index.html).
For further information on query writing, see [Writing CodeQL queries](https://help.semmle.com/QL/learn-ql/ql/writing-queries/writing-queries.html). For more information on learning CodeQL, see [Learning CodeQL](https://help.semmle.com/QL/learn-ql/index.html).
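To make the `$@` guidance above concrete, here is a toy model in Python of how each placeholder in an alert message pairs, in order, with a link target and its link text. The `render_alert` helper and the Markdown-style link output are hypothetical and are not part of CodeQL or LGTM; they only illustrate the substitution.

```python
def render_alert(message, links):
    """Replace each `$@` in message, in order, with a Markdown-style link.

    links is a list of (target, text) pairs, one pair per placeholder.
    """
    parts = message.split("$@")
    if len(parts) - 1 != len(links):
        raise ValueError("expected one (target, text) pair per $@ placeholder")
    rendered = parts[0]
    for (target, text), tail in zip(links, parts[1:]):
        rendered += f"[{text}]({target})" + tail
    return rendered


alert = render_alert(
    "The class 'Foo' is duplicated as $@.",
    [("Bar.java:10", "Bar")],
)
```

The link text (`Bar`) appears inline in the sentence without extra quotes, which matches the guideline that quotes are not needed around substitutions.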


@@ -0,0 +1,125 @@
private import javascript as raw
/**
* EXPERIMENTAL. This API may change in the future.
*
* A configuration class for defining known endpoints and endpoint filters for adaptive threat
* modeling (ATM). Each boosted query must define its own extension of this abstract class.
*
* A configuration defines a set of known sources (`isKnownSource`) and sinks (`isKnownSink`).
* It must also define a sink endpoint filter (`isEffectiveSink`) that filters candidate sinks
* predicted by the machine learning model to a set of effective sinks.
*
* Optionally, a configuration may also define additional edges beyond the base data flow edges
* (`isAdditionalFlowStep`) and sanitizers (`isSanitizer` and `isSanitizerGuard`).
*
* To get started with ATM, you can copy-paste an implementation of the `DataFlow::Configuration`
* class for a standard security query, for example `SqlInjection::Configuration`. Note that if
* the security query configuration defines additional edges beyond the standard data flow edges,
* as `NosqlInjection::Configuration` does, you may need to replace the definition of
* `isAdditionalFlowStep` with a more generalised definition of additional edges. See
* `NosqlInjectionATM.ql` for an example of doing this.
*
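* As a minimal sketch, a configuration for a boosted SQL injection query could look like the
* following. The class name `SqlInjectionATMConfig` and the endpoint filter are hypothetical, the
* import paths are assumptions about where this library and the standard query live, and this is
* not the configuration used by any shipped query:
*
* ```ql
* import javascript
* import semmle.javascript.security.dataflow.SqlInjection
* import ATMConfig
*
* class SqlInjectionATMConfig extends ATMConfig {
*   SqlInjectionATMConfig() { this = "SqlInjectionATMConfig" }
*
*   override predicate isKnownSource(DataFlow::Node source) {
*     source instanceof SqlInjection::Source
*   }
*
*   override predicate isKnownSink(DataFlow::Node sink) { sink instanceof SqlInjection::Sink }
*
*   override predicate isEffectiveSink(DataFlow::Node candidateSink) {
*     // Hypothetical filter: keep only candidates that are arguments to some call.
*     candidateSink = any(DataFlow::CallNode call).getAnArgument()
*   }
* }
* ```
*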
* Technical information:
*
* - Conceptually, this class is very similar to the subclass of `DataFlow::Configuration` that is
* used to define the base security query. We define a new class to provide this information to
* ATM because of the performance implications of QL's dispatch behaviour: defining another
* `DataFlow::Configuration` instance would slow the evaluation of the boosted query.
*
* - Furthermore, we cannot use the approach used by the `ForwardExploration` and
* `BackwardExploration` modules to implement ATM, since ATM needs access to the sets of sources
* and sinks from the *original* dataflow configuration in order to perform similarity search.
*/
abstract class ATMConfig extends string {
bindingset[this]
ATMConfig() { any() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if `source` is a known source of flow.
*/
predicate isKnownSource(raw::DataFlow::Node source) { none() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if `source` is a known source of flow labeled with `lbl`.
*/
predicate isKnownSource(raw::DataFlow::Node source, raw::DataFlow::FlowLabel lbl) { none() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if `sink` is a known sink of flow.
*/
predicate isKnownSink(raw::DataFlow::Node sink) { none() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if `sink` is a known sink of flow labeled with `lbl`.
*/
predicate isKnownSink(raw::DataFlow::Node sink, raw::DataFlow::FlowLabel lbl) { none() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if the candidate sink `candidateSink` predicted by the machine learning model should be
* an effective sink, i.e. one considered as a possible sink of flow in the boosted query.
*/
abstract predicate isEffectiveSink(raw::DataFlow::Node candidateSink);
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if the intermediate node `node` is a taint sanitizer.
*/
predicate isSanitizer(raw::DataFlow::Node node) { none() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if for the boosted query the data flow node `guard` can act as a sanitizer when
* appearing in a condition.
*
* For example, if `guard` is the comparison expression in
* `if(x == 'some-constant'){ ... x ... }`, it could sanitize flow of `x` into the "then"
* branch.
*/
predicate isSanitizerGuard(raw::TaintTracking::SanitizerGuardNode guard) { none() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if the additional taint propagation step from `src` to `trg` must be taken into account
* in the boosted query.
*/
predicate isAdditionalTaintStep(raw::DataFlow::Node src, raw::DataFlow::Node trg) { none() }
/**
* EXPERIMENTAL. This API may change in the future.
*
* Holds if `src -> trg` should be considered as a flow edge in addition to standard data flow
* edges in the boosted query.
*/
predicate isAdditionalFlowStep(
raw::DataFlow::Node src, raw::DataFlow::Node trg, raw::DataFlow::FlowLabel inlbl,
raw::DataFlow::FlowLabel outlbl
) {
none()
}
}
// To debug the ATMConfig module, import this module by adding "import ATMConfigDebugging" to the
// top-level.
module ATMConfigDebugging {
query predicate knownSources(ATMConfig config, raw::DataFlow::Node source) {
config.isKnownSource(source) or config.isKnownSource(source, _)
}
query predicate anchorSinks(ATMConfig config, raw::DataFlow::Node sink) {
config.isKnownSink(sink) or config.isKnownSink(sink, _)
}
}


@@ -0,0 +1,432 @@
external private predicate adaptiveThreatModelingModels(
string modelChecksum, string modelLanguage, string modelName, string modelType
);
private import javascript as raw
private import raw::DataFlow as DataFlow
import ATMConfig
module ATMEmbeddings {
private import CodeToFeatures::DatabaseFeatures as DatabaseFeatures
class Entity = DatabaseFeatures::Entity;
/* Currently the only label is a label marking an embedding as derived from an entity in the current database. */
private newtype TEmbeddingLabel = TEntityLabel(Entity entity)
/**
* An abstract label that can be used to mark an embedding with the object from which it has been
* derived.
*/
abstract class EmbeddingLabel extends TEmbeddingLabel {
abstract string toString();
}
/**
* A label marking an embedding as derived from an entity in the current database, i.e. the
* database we're running the query on.
*/
class EntityLabel extends EmbeddingLabel {
private Entity entity;
EntityLabel() { this = TEntityLabel(entity) }
Entity getEntity() { result = entity }
override string toString() { result = "EntityLabel(" + entity.toString() + ")" }
}
/**
* `entities` relation suitable for passing to the `codeEmbedding` higher-order predicate (HOP).
*
* The `codeEmbedding` HOP expects an entities relation with eight columns, while
* `DatabaseFeatures` generates one with nine columns.
*/
predicate entities(
Entity entity, string entityName, string entityType, string path, int startLine,
int startColumn, int endLine, int endColumn
) {
DatabaseFeatures::entities(entity, entityName, entityType, path, startLine, startColumn,
endLine, endColumn, _)
}
private predicate databaseEmbeddingsByEntity(
Entity entity, int embeddingIndex, float embeddingValue
) =
codeEmbedding(entities/8, DatabaseFeatures::astNodes/5, DatabaseFeatures::nodeAttributes/2,
modelChecksum/0)(entity, embeddingIndex, embeddingValue)
/** Embeddings for each entity in the current database. */
predicate databaseEmbeddings(EntityLabel label, int embeddingIndex, float embeddingValue) {
exists(Entity entity |
databaseEmbeddingsByEntity(entity, embeddingIndex, embeddingValue) and
label.getEntity() = entity
)
}
/** Checksum of the model that should be used. */
string modelChecksum() { adaptiveThreatModelingModels(result, "javascript", _, _) }
}
private module ATMEmbeddingsDebugging {
query predicate databaseEmbeddingsDebug = ATMEmbeddings::databaseEmbeddings/3;
query predicate modelChecksumDebug = ATMEmbeddings::modelChecksum/0;
}
private ATMConfig getCfg() { any() }
/**
* This module provides functionality that takes a sink and provides an entity that encloses that
* sink and is suitable for similarity analysis.
*/
module SinkToEntity {
private import CodeToFeatures
private raw::Function getNamedEnclosingFunction(raw::Function f) {
if not exists(f.getName())
then result = getNamedEnclosingFunction(f.getEnclosingContainer())
else result = f
}
private raw::Function nodeToNamedFunction(DataFlow::Node node) {
result = getNamedEnclosingFunction(node.getContainer())
}
/**
* We use the innermost named function that encloses a sink, if one exists.
* Otherwise, we use the innermost function that encloses the sink.
*/
private raw::Function sinkToFunction(DataFlow::Node sink) {
if exists(raw::Function f | f = nodeToNamedFunction(sink))
then result = nodeToNamedFunction(sink)
else result = sink.getContainer()
}
private DatabaseFeatures::Entity getFirstExtractedEntity(raw::Function e) {
if
DatabaseFeatures::entities(result, _, _, _, _, _, _, _, _) and
result.getDefinedFunction() = e
then any()
else result = getFirstExtractedEntity(e.getEnclosingContainer())
}
/** Get an entity enclosing the sink that is suitable for similarity analysis. */
DatabaseFeatures::Entity getEntityForSink(DataFlow::Node sink) {
result = getFirstExtractedEntity(sinkToFunction(sink))
}
}
/**
* This module provides functionality that takes an entity and provides sink candidates within
* that entity.
*/
module EntityToSinkCandidate {
private import CodeToFeatures
/** Get a sink candidate enclosed within the specified entity. */
DataFlow::Node getASinkCandidate(DatabaseFeatures::Entity entity) {
getCfg().isEffectiveSink(result) and
result.getContainer().getEnclosingContainer*() = entity.getDefinedFunction()
}
}
// To debug the EntityToSinkCandidate module, import this module by adding
// "import EntityToSinkCandidateDebugging" to the top-level.
module EntityToSinkCandidateDebugging {
private import CodeToFeatures
query predicate databaseSinks(DataFlow::Node sink) {
exists(DatabaseFeatures::Entity entity |
DatabaseFeatures::entities(entity, _, _, _, _, _, _, _, _) and
sink = EntityToSinkCandidate::getASinkCandidate(entity)
)
}
}
module ATM {
import ATMEmbeddings
private int getNumberOfSinkSemSearchResults() { result = 100000000 }
private predicate sinkSemSearchResults(
EmbeddingLabel searchLabel, EmbeddingLabel resultLabel, float score
) =
semanticSearch(sinkQueryEmbeddings/3, databaseEmbeddings/3, getNumberOfSinkSemSearchResults/0)(searchLabel,
resultLabel, score)
/** `DataFlow::Configuration` for adaptive threat modeling (ATM). */
class Configuration extends raw::TaintTracking::Configuration {
Configuration() { this = "AdaptiveThreatModeling" }
override predicate isSource(DataFlow::Node source) {
// Is an existing source
getCfg().isKnownSource(source)
}
override predicate isSource(DataFlow::Node source, DataFlow::FlowLabel lbl) {
// Is an existing source
getCfg().isKnownSource(source, lbl)
}
override predicate isSink(DataFlow::Node sink) {
// Is in a result entity that is similar to a known sink-containing entity according to
// semantic search
exists(Entity resultEntity, EntityLabel resultLabel |
sinkSemSearchResults(_, resultLabel, _) and
sink = EntityToSinkCandidate::getASinkCandidate(resultEntity) and
resultLabel.getEntity() = resultEntity
)
or
// Is an existing sink
getCfg().isKnownSink(sink)
}
override predicate isSink(DataFlow::Node sink, DataFlow::FlowLabel lbl) {
// Is in a result entity that is similar to a known sink-containing entity according to
// semantic search
exists(DataFlow::Node originalSink, EntityLabel seedLabel, EntityLabel resultLabel |
getCfg().isKnownSink(originalSink, lbl) and
seedLabel.getEntity() = SinkToEntity::getEntityForSink(originalSink) and
sinkSemSearchResults(seedLabel, resultLabel, _) and
sink = EntityToSinkCandidate::getASinkCandidate(resultLabel.getEntity())
)
or
// Is an existing sink
getCfg().isKnownSink(sink, lbl)
}
override predicate isSanitizer(DataFlow::Node node) {
super.isSanitizer(node) or
getCfg().isSanitizer(node)
}
override predicate isSanitizerGuard(raw::TaintTracking::SanitizerGuardNode guard) {
super.isSanitizerGuard(guard) or
getCfg().isSanitizerGuard(guard)
}
override predicate isAdditionalTaintStep(DataFlow::Node src, DataFlow::Node trg) {
getCfg().isAdditionalTaintStep(src, trg)
}
override predicate isAdditionalFlowStep(
DataFlow::Node src, DataFlow::Node trg, DataFlow::FlowLabel inlbl, DataFlow::FlowLabel outlbl
) {
getCfg().isAdditionalFlowStep(src, trg, inlbl, outlbl)
}
}
private Entity getSeedSinkEntity() {
exists(DataFlow::Node sink |
(getCfg().isKnownSink(sink) or getCfg().isKnownSink(sink, _)) and
result = SinkToEntity::getEntityForSink(sink)
)
}
private predicate sinkQueryEmbeddings(
EmbeddingLabel label, int embeddingIndex, float embeddingValue
) {
label.(EntityLabel).getEntity() = getSeedSinkEntity() and
databaseEmbeddings(label, embeddingIndex, embeddingValue)
}
/**
* EXPERIMENTAL. This API may change in the future.
*
* This module contains informational predicates about the results returned by adaptive threat
* modeling (ATM).
*/
module ResultsInfo {
/**
* Holds if the node `source` is a source in the standard security query.
*/
private predicate isSourceASeed(DataFlow::Node source) {
getCfg().isKnownSource(source) or getCfg().isKnownSource(source, _)
}
/**
* Holds if the node `sink` is a sink in the standard security query.
*/
private predicate isSinkASeed(DataFlow::Node sink) {
getCfg().isKnownSink(sink) or getCfg().isKnownSink(sink, _)
}
private float scoreForSink(DataFlow::Node sink) {
if isSinkASeed(sink)
then result = 1.0
else
result =
max(float score |
exists(ATMEmbeddings::EntityLabel entityLabel |
sinkSemSearchResults(_, entityLabel, score) and
sink = EntityToSinkCandidate::getASinkCandidate(entityLabel.getEntity())
)
)
}
/**
* EXPERIMENTAL. This API may change in the future.
*
* Returns the score for the flow between the source `source` and the sink `sink` in the
* boosted query.
*/
float scoreForFlow(DataFlow::Node source, DataFlow::Node sink) { result = scoreForSink(sink) }
/**
* Pad a score returned from `scoreForFlow` to a particular length by adding a decimal point
* if one does not already exist, and "0"s after that decimal point.
*
* Note that this predicate must itself define an upper bound on `length`, so that it has a
* finite number of results. Currently this is defined as 12.
*/
private string paddedScore(float score, int length) {
// In this definition, we must restrict the values that `length` and `score` can take on so that the
// predicate has a finite number of results.
score = scoreForFlow(_, _) and
length = result.length() and
(
// We need to make sure the padded score contains a "." so lexically sorting the padded scores is
// equivalent to numerically sorting the scores.
score.toString().charAt(_) = "." and
result = score.toString()
or
not score.toString().charAt(_) = "." and
result = score.toString() + "."
)
or
result = paddedScore(score, length - 1) + "0" and
length <= 12
}
/**
* EXPERIMENTAL. This API may change in the future.
*
* Return a string representing the score of the flow between `source` and `sink` in the
* boosted query.
*
* The returned string is a fixed length, such that lexically sorting the strings returned by
* this predicate gives the same sort order as numerically sorting the scores of the flows.
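*
* As a sketch of how a boosted query might use this predicate (the message text is hypothetical,
* and the query is assumed to import both `javascript` and this library):
*
* ```ql
* from ATM::Configuration cfg, DataFlow::Node source, DataFlow::Node sink
* where cfg.hasFlow(source, sink)
* select sink,
*   "Flow from $@ with score " + ATM::ResultsInfo::scoreStringForFlow(source, sink) + ".",
*   source, "a known source"
* ```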
*/
string scoreStringForFlow(DataFlow::Node source, DataFlow::Node sink) {
exists(float score |
score = scoreForFlow(source, sink) and
(
// A length of 12 is equivalent to 10 decimal places.
score.toString().length() >= 12 and
result = score.toString().substring(0, 12)
or
score.toString().length() < 12 and
result = paddedScore(score, 12)
)
)
}
private ATMEmbeddings::EmbeddingLabel bestSearchLabelsForSink(DataFlow::Node sink) {
exists(ATMEmbeddings::EntityLabel resultLabel |
sinkSemSearchResults(result, resultLabel, scoreForSink(sink)) and
sink = EntityToSinkCandidate::getASinkCandidate(resultLabel.getEntity())
)
}
private newtype TEndpointOrigins =
TOrigins(boolean isKnown, boolean isSimilarToKnown) {
isKnown = [true, false] and isSimilarToKnown = [true, false]
}
/**
* EXPERIMENTAL. This API may change in the future.
*
* A class representing the origins of an endpoint.
*/
class EndpointOrigins extends TEndpointOrigins {
/**
* EXPERIMENTAL. This API may change in the future.
*
* Whether the endpoint is a known endpoint in the database.
*/
boolean isKnown;
/**
* EXPERIMENTAL. This API may change in the future.
*
* Whether the endpoint is a predicted endpoint that is similar to a known endpoint in
* the database.
*/
boolean isSimilarToKnown;
EndpointOrigins() { this = TOrigins(isKnown, isSimilarToKnown) }
/**
* EXPERIMENTAL. This API may change in the future.
*
* A string listing the origins of a predicted endpoint.
*
* Origins include:
*
* - `known`: The endpoint is a known endpoint in the database.
* - `similar_to_known`: The endpoint is a predicted endpoint that is similar to a known
* endpoint in the database.
*/
string listOfOriginComponents() {
// Ensure that this predicate has exactly one result.
result =
any(string x | if isKnown = true then x = "known" else x = "") +
any(string x | if isKnown = true and isSimilarToKnown = true then x = "," else x = "") +
any(string x | if isSimilarToKnown = true then x = "similar_to_known" else x = "")
}
string toString() { result = "EndpointOrigins(" + listOfOriginComponents() + ")" }
}
/**
* EXPERIMENTAL. This API may change in the future.
*
* The highest-scoring origins of the source.
*/
EndpointOrigins originsForSource(DataFlow::Node source) {
result =
TOrigins(any(boolean b | if isSourceASeed(source) then b = true else b = false), false)
}
/**
* EXPERIMENTAL. This API may change in the future.
*
* The highest-scoring origins of the sink.
*/
EndpointOrigins originsForSink(DataFlow::Node sink) {
result =
TOrigins(any(boolean b | if isSinkASeed(sink) then b = true else b = false),
any(boolean b |
if
not isSinkASeed(sink) and
exists(ATMEmbeddings::EntityLabel label | label = bestSearchLabelsForSink(sink))
then b = true
else b = false
))
}
/**
* EXPERIMENTAL. This API may change in the future.
*
* Indicates whether the flow from source to sink is likely to be reported by the base security
* query.
*
* Currently this is a heuristic: it ignores potential differences in the definitions of
* additional flow steps.
*/
predicate isFlowLikelyInBaseQuery(DataFlow::Node source, DataFlow::Node sink) {
isSourceASeed(source) and isSinkASeed(sink)
}
}
// To debug the ATM module, import this module by adding "import ATM::Debugging" to the top-level.
module Debugging {
query predicate sinkSemSearchResultsDebug = sinkSemSearchResults/3;
query predicate atmSources(DataFlow::Node source) {
any(ATM::Configuration cfg).isSource(source)
}
query predicate atmSinks(DataFlow::Node sink) { any(ATM::Configuration cfg).isSink(sink) }
}
}


@@ -0,0 +1,442 @@
/*
* For internal use only.
*
* Extracts data about the functions in the database for use in adaptive threat modeling (ATM).
*/
module Raw {
private import javascript as raw
class RawAstNode = raw::ASTNode;
class Entity = raw::Function;
class Location = raw::Location;
/**
* Exposed as a tool for defining anchors for semantic search.
*/
class UnderlyingFunction = raw::Function;
/**
* Determines whether an entity should be omitted from ATM.
*/
predicate isEntityIgnored(Entity entity) {
// Ignore entities which don't have definitions, for example those in TypeScript
// declaration files.
not exists(entity.getBody())
}
newtype WrappedAstNode = TAstNode(RawAstNode rawNode)
/**
* This class represents nodes in the AST.
*/
class AstNode extends TAstNode {
RawAstNode rawNode;
AstNode() { this = TAstNode(rawNode) }
AstNode getAChildNode() { result = TAstNode(rawNode.getAChild()) }
AstNode getParentNode() { result = TAstNode(rawNode.getParent()) }
/**
* Holds if the AST node has `result` as its `index`th attribute.
*
* The index is not intended to mean anything, and is only here for disambiguation.
* There are no guarantees about any particular index being used (or not being used).
*/
string astNodeAttribute(int index) {
(
// NB: Unary and binary operator expressions e.g. -a, a + b and compound
// assignments e.g. a += b can be identified by the expression type.
result = "ID:" + rawNode.(raw::Identifier).getName()
or
// Add an ID: for computed property accesses for which we can predetermine the property being accessed.
// Slight lie but useful for the model.
// NB: May alias with operators e.g. could have '+' as a property name.
result = "ID:" + rawNode.(raw::IndexExpr).getPropertyName()
or
// Want to have distinct representations for `0xa`, `0xA`, and `10`.
result = "LIT:" + rawNode.(raw::NumberLiteral).getRawValue()
or
// Want to map `"a"` and `'a'` onto the same representation.
not rawNode instanceof raw::NumberLiteral and
result = "LIT:" + rawNode.(raw::Literal).getValue()
or
result = "LIT:" + rawNode.(raw::TemplateElement).getRawValue()
) and
index = 0
}
/**
* Returns a string indicating the "type" of the AST node.
*/
string astNodeType() {
// The definition of this method should correspond with that of the `@ast_node` entry in the
// dbscheme.
result = "js_exprs." + any(int kind | exprs(rawNode, kind, _, _, _))
or
result = "js_properties." + any(int kind | properties(rawNode, _, _, kind, _))
or
result = "js_stmts." + any(int kind | stmts(rawNode, kind, _, _, _))
or
result = "js_toplevel" and rawNode instanceof raw::TopLevel
or
result = "js_typeexprs." + any(int kind | typeexprs(rawNode, kind, _, _, _))
}
/**
* Holds if `result` is the `index`'th child of the AST node, for some arbitrary indexing.
* A root of the AST should be its own child, with an arbitrary (though conventionally
* 0) index.
*
* Notably, the order in which child nodes are visited is not required to be meaningful,
* and no particular index is required to be meaningful. However, `(parent, index)`
* should be a keyset.
*/
pragma[nomagic]
AstNode astNodeChild(int index) {
result =
rank[index - 1](AstNode child, raw::Location l |
child = this.getAChildNode() and l = child.getLocation()
|
child
order by
l.getStartLine(), l.getStartColumn(), l.getEndLine(), l.getEndColumn(),
child.astNodeType()
)
or
not exists(result.getParentNode()) and this = result and index = 0
}
raw::Location getLocation() { result = rawNode.getLocation() }
string toString() { result = rawNode.toString() }
predicate isEntityNameNode(Entity entity) {
exists(int index |
TAstNode(entity) = getParentNode() and
this = getParentNode().astNodeChild(index) and
// An entity name node must be the first child of the entity.
index = min(int otherIndex | exists(getParentNode().astNodeChild(otherIndex))) and
entity.getName() = rawNode.(raw::VarDecl).getName()
)
}
}
/**
* Holds if `result` is the `index`'th child of the `parent` entity. Such
* a node is a root of an AST associated with this entity.
*/
AstNode entityChild(AstNode parent, int index) {
// In JavaScript, entities appear in the AST parent/child relationship.
result = parent.astNodeChild(index)
}
/**
* Holds if `node` is contained in `entity`. Note that a single node may be contained
* in multiple entities, if they are nested. An entity, in particular, should be
* reported as contained within itself.
*/
predicate entityContains(Entity entity, AstNode node) {
node.getParentNode*() = TAstNode(entity) and not node.isEntityNameNode(entity)
}
/**
* Get the name of the entity.
*
* We attempt to assign unnamed entities approximate names if they are passed to a likely
* external library function. If we can't assign them an approximate name, we give them the name
* `""`, so that these entities are included in `AdaptiveThreatModeling.qll`.
*
* For entities which have multiple names, we choose the lexically smallest name.
*/
string getEntityName(Entity entity) {
if exists(entity.getName())
then
// https://github.com/github/ml-ql-adaptive-threat-modeling/issues/244 discusses making use
// of all the names during training.
result = min(entity.getName())
else
if exists(getApproximateNameForEntity(entity))
then result = getApproximateNameForEntity(entity)
else result = ""
}
/**
* Holds if the call `call` has `entity` as its `argumentIndex`th argument.
*/
private predicate entityUsedAsArgumentToCall(
Entity entity, raw::DataFlow::CallNode call, int argumentIndex
) {
raw::DataFlow::localFlowStep*(call.getArgument(argumentIndex), entity.flow())
}
/**
* Returns a generated name for the entity. This name is generated such that
* entities with the same names have similar behaviour.
*/
private string getApproximateNameForEntity(Entity entity) {
count(raw::DataFlow::CallNode call, int index | entityUsedAsArgumentToCall(entity, call, index)) =
1 and
exists(raw::DataFlow::CallNode call, int index, string basePart |
entityUsedAsArgumentToCall(entity, call, index) and
(
if count(getReceiverName(call)) = 1
then basePart = getReceiverName(call) + "."
else basePart = ""
) and
result = basePart + call.getCalleeName() + "#functional_argument_" + index
)
}
private string getReceiverName(raw::DataFlow::CallNode call) {
result = call.getReceiver().asExpr().(raw::VarAccess).getName()
}
/** Sanity checks: these predicates should each have no results */
module Sanity {
/** `getEntityName` should assign each entity a single name. */
query predicate entityWithManyNames(Entity entity, string name) {
name = getEntityName(entity) and
count(getEntityName(entity)) > 1
}
query predicate nodeWithNoType(AstNode node) { not exists(node.astNodeType()) }
query predicate nodeWithManyTypes(AstNode node, string type) {
type = node.astNodeType() and
count(node.astNodeType()) > 1
}
query predicate nodeWithNoParent(AstNode node, string type) {
not node = any(AstNode parent).astNodeChild(_) and
type = node.astNodeType() and
not exists(RawAstNode rawNode | node = TAstNode(rawNode) and rawNode instanceof raw::Module)
}
query predicate duplicateChildIndex(AstNode parent, int index, AstNode child) {
child = parent.astNodeChild(index) and
count(parent.astNodeChild(index)) > 1
}
query predicate duplicateAttributeIndex(AstNode node, int index) {
exists(node.astNodeAttribute(index)) and
count(node.astNodeAttribute(index)) > 1
}
}
}
module Wrapped {
/*
* We require any node with attributes to be a leaf. Where a non-leaf node
* has an attribute, we instead create a synthetic leaf node that has that
* attribute.
*/
/**
* Holds if the AST node `e` is a leaf node.
*/
private predicate isLeaf(Raw::AstNode e) { not exists(e.astNodeChild(_)) }
newtype WrappedEntity =
TEntity(Raw::Entity entity) {
exists(entity.getLocation().getFile().getRelativePath()) and
Raw::entityContains(entity, _)
}
/**
* A type ranging over the kinds of entities for which we want to consider embeddings.
*/
class Entity extends WrappedEntity {
Raw::Entity rawEntity;
Entity() { this = TEntity(rawEntity) and not Raw::isEntityIgnored(rawEntity) }
string getName() { result = Raw::getEntityName(rawEntity) }
AstNode getAstRoot(int index) {
result = TAstNode(rawEntity, Raw::entityChild(Raw::TAstNode(rawEntity), index))
}
string toString() { result = rawEntity.toString() }
Raw::Location getLocation() { result = rawEntity.getLocation() }
Raw::UnderlyingFunction getDefinedFunction() { result = rawEntity }
}
newtype WrappedAstNode =
TAstNode(Raw::Entity enclosingEntity, Raw::AstNode node) {
Raw::entityContains(enclosingEntity, node)
} or
TSyntheticNode(
Raw::Entity enclosingEntity, Raw::AstNode node, int syntheticChildIndex, int attrIndex
) {
Raw::entityContains(enclosingEntity, node) and
exists(node.astNodeAttribute(attrIndex)) and
not isLeaf(node) and
if exists(node.astNodeChild(_))
then
syntheticChildIndex =
attrIndex - min(int other | exists(node.astNodeAttribute(other))) +
max(int other | exists(node.astNodeChild(other))) + 1
else syntheticChildIndex = attrIndex
}
pragma[nomagic]
private AstNode injectedChild(Raw::Entity enclosingEntity, Raw::AstNode parent, int index) {
result = TAstNode(enclosingEntity, parent.astNodeChild(index)) or
result = TSyntheticNode(enclosingEntity, parent, index, _)
}
/**
* A type ranging over AST nodes. Ultimately, only nodes contained in entities will
* be considered.
*/
class AstNode extends WrappedAstNode {
Raw::Entity enclosingEntity;
Raw::AstNode rawNode;
AstNode() {
(
this = TAstNode(enclosingEntity, rawNode) or
this = TSyntheticNode(enclosingEntity, rawNode, _, _)
) and
not Raw::isEntityIgnored(enclosingEntity)
}
string getAttribute(int index) {
result = rawNode.astNodeAttribute(index) and
not exists(TSyntheticNode(enclosingEntity, rawNode, _, index))
}
string getType() { result = rawNode.astNodeType() }
AstNode getChild(int index) { result = injectedChild(enclosingEntity, rawNode, index) }
string toString() { result = getType() }
Raw::Location getLocation() { result = rawNode.getLocation() }
}
/**
* A synthetic AST node, created to be a leaf for an otherwise non-leaf attribute.
*/
class SyntheticAstNode extends AstNode, TSyntheticNode {
int childIndex;
int attributeIndex;
SyntheticAstNode() {
this = TSyntheticNode(enclosingEntity, rawNode, childIndex, attributeIndex)
}
override string getAttribute(int index) {
result = rawNode.astNodeAttribute(attributeIndex) and index = attributeIndex
}
override string getType() {
result = rawNode.astNodeType() + "::<synthetic " + childIndex + ">"
}
override AstNode getChild(int index) { none() }
}
}
module DatabaseFeatures {
/**
* Exposed as a tool for defining anchors for semantic search.
*/
class UnderlyingFunction = Raw::UnderlyingFunction;
private class Location = Raw::Location;
private newtype TEntityOrAstNode =
TEntity(Wrapped::Entity entity) or
TAstNode(Wrapped::AstNode astNode)
class EntityOrAstNode extends TEntityOrAstNode {
abstract string getType();
abstract string toString();
abstract Location getLocation();
}
class Entity extends EntityOrAstNode, TEntity {
Wrapped::Entity entity;
Entity() { this = TEntity(entity) }
string getName() { result = entity.getName() }
AstNode getAstRoot(int index) { result = TAstNode(entity.getAstRoot(index)) }
override string getType() { result = "javascript function" }
override string toString() { result = "Entity: " + getName() }
override Location getLocation() { result = entity.getLocation() }
UnderlyingFunction getDefinedFunction() { result = entity.getDefinedFunction() }
}
class AstNode extends EntityOrAstNode, TAstNode {
Wrapped::AstNode rawNode;
AstNode() { this = TAstNode(rawNode) }
AstNode getChild(int index) { result = TAstNode(rawNode.getChild(index)) }
string getAttribute(int index) { result = rawNode.getAttribute(index) }
override string getType() { result = rawNode.getType() }
override string toString() { result = this.getType() }
override Location getLocation() { result = rawNode.getLocation() }
}
/** Sanity checks: these predicates should each have no results */
module Sanity {
query predicate nonLeafAttribute(AstNode node, int index, string attribute) {
attribute = node.getAttribute(index) and
exists(node.getChild(_))
}
}
query predicate entities(
Entity entity, string entity_name, string entity_type, string path, int startLine,
int startColumn, int endLine, int endColumn, string absolutePath
) {
entity_name = entity.getName() and
entity_type = entity.getType() and
exists(Location l | l = entity.getLocation() |
path = l.getFile().getRelativePath() and
absolutePath = l.getFile().getAbsolutePath() and
l.hasLocationInfo(_, startLine, startColumn, endLine, endColumn)
)
}
query predicate astNodes(
Entity enclosingEntity, EntityOrAstNode parent, int index, AstNode node, string node_type
) {
node = enclosingEntity.getAstRoot(index) and
parent = enclosingEntity and
node_type = node.getType()
or
astNodes(enclosingEntity, _, _, parent, _) and
node = parent.(AstNode).getChild(index) and
node_type = node.getType()
}
query predicate nodeAttributes(AstNode node, string attr) {
// Only get attributes of AST nodes we extract.
// This excludes nodes in standard libraries since the standard library files
// are located outside the source root.
astNodes(_, _, _, node, _) and
attr = node.getAttribute(_)
}
}

Some files were not shown because too many files have changed in this diff.