tiferet
|
be9c6500b8
|
In the MaD data, extract the argument index as an int rather than a string wrapped up in "Argument[]"
|
2023-03-14 12:49:29 -07:00 |
|
tiferet
|
831830831c
|
Fix the MaD signature to the correct format
|
2023-03-14 12:49:29 -07:00 |
|
tiferet
|
ae69a2bcd9
|
Separate out the sink types to align with the MaD kinds that currently exist, adding a sink type for all sinks of a given query that are not currently mapped in the MaD kinds.
|
2023-03-14 12:49:29 -07:00 |
|
tiferet
|
65923ed2c1
|
Add support for multiple sink types per query
|
2023-03-14 12:49:29 -07:00 |
|
tiferet
|
a7269075e2
|
As part of the metadata extraction predicate, surface whether or not the callee is a public method
|
2023-03-14 12:49:29 -07:00 |
|
tiferet
|
d3a5ee53c6
|
Refactor the CodeQL code that extracts metadata for methods presented to Codex, to make it easy to add another field
|
2023-03-14 12:49:29 -07:00 |
|
tiferet
|
f32bb65c54
|
Refactor the CodeQL code that extracts metadata for methods presented to Codex, to make it easy to add another field
|
2023-03-14 12:49:29 -07:00 |
|
tiferet
|
633bfdba28
|
Broaden the Java endpoint filter that filters out flow steps, and document it
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
db9cec6ea6
|
Add an endpoint filter to filter out flow steps
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
ec5425d952
|
When extracting positive and negative examples for the Java prompt, extract the data used in the MaD extensible predicate.
This will enable the codex prompt to optionally use this data in additional columns.
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
7666843316
|
Resolve two TODO items
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
e06bcc3112
|
Exclude negative examples that are type access nodes.
These will never be on a flow path so they're not useful negative examples.
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
3229b37436
|
Increase diversity of negative prompt examples by creating finer sub-types
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
559570419d
|
If a node satisfies the logic for both isSink and isSanitizer, don't include it as a positive or negative example in the prompt, because it's too ambiguous and will confuse the model.
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
844171a28e
|
Simplify the definition of ExtractPositiveExamples.ql
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
ecf4d4dc02
|
Avoid accidentally extracting positive prompt examples when there is a codex-generated data extension file in java/ql/lib/ext
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
0d4e85ff93
|
Add a predicate that finds endpoints with logically-inconsistent characteristics, and exclude such endpoints from both positive and negative examples extracted for the codex prompt.
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
1211197914
|
Fix codeql-pack.lock.yml so it's not looking for an ML model
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
41df8df182
|
Typo fix
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
125245aa62
|
Delete TODO items that are done
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
8bb2b2eaea
|
Have each EndpointType keep track of the sink/source kind for this endpoint type as used in Models as Data
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
27efe524da
|
Fix the extraction of data for the data extension YML file.
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
ae4668c488
|
Add data needed for the data extension YML file to ExtractSinkCandidatesWithFlow.ql: first pass.
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
3987d8d374
|
Small update to SafeExternalApiMethodCharacteristic
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
fd75952c1e
|
Improvements to ExtractSinkCandidatesWithFlow.ql
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
4db0dec82e
|
Minor improvement
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
a73b52adef
|
Improvements to ExtractSinkCandidatesWithFlow.ql
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
39a4513fcc
|
Delete the queries the Java team isn't currently interested in boosting
|
2023-03-14 12:49:28 -07:00 |
|
tiferet
|
3c44332f17
|
Move isFlowLikelyInBaseQuery to the ATMConfig and delete AdaptiveThreatModeling.qll
|
2023-03-14 12:49:27 -07:00 |
|
tiferet
|
06c7f1012c
|
Rename request forgery sink to server-side request forgery sink
|
2023-03-14 12:49:27 -07:00 |
|
tiferet
|
9421ba5303
|
Add an implementation of request forgery sinks and corresponding positive EndpointCharacteristic in Java
|
2023-03-14 12:49:27 -07:00 |
|
tiferet
|
f5109be2ac
|
Bug fixes
|
2023-03-14 12:49:27 -07:00 |
|
tiferet
|
c14a4c4d93
|
Add an implementation of TaintedPathATM.qll and corresponding positive EndpointCharacteristic in Java
|
2023-03-14 12:49:27 -07:00 |
|
tiferet
|
4546dbe51b
|
Subsample negative examples to 1% to prevent huge numbers.
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
5d62dc3d2e
|
Add a Java NotASinkCharacteristic for safe external API methods
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
0acd06a6d3
|
Add queries to surface high-confidence Java sinks and non-sinks to use as examples in the codex prompt.
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
04abb87fef
|
Rewrite ExtractSinkCandidatesWithFlow.ql as a problem query so we can run it with codeql database analyze to output SARIF results.
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
5dc5c3fb3f
|
Add a couple of endpoint filters for Java
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
653b0128f5
|
Try implementing SqlInjectionATM.qll in Java
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
c0f58371b4
|
Start making the additions needed to surface candidate Java sinks for codex classification outside the evaluator.
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
cf289d57e9
|
Go back to the prompt of https://github.com/github/codeql-dca-main/issues/9475
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
459050151a
|
Give more explicit instructions in the codex prompt, but don't solicit rare sink types.
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
01979aeb62
|
Give more explicit instructions in the codex prompt.
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
ef95f4c419
|
Minor prompt improvements:
- Tell codex explicitly that this is JavaScript code
- Replace "Dataflow node" with "Code snippet"
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
ac5434b3f3
|
Minor prompt improvements:
Remove spaces that break the code syntax or make for strange code styling.
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
ce17d94f80
|
In-line predicates that are costing a lot of compute time
|
2023-03-14 12:49:26 -07:00 |
|
tiferet
|
bcc4cdd376
|
Add a test that can be used to determine the alerts codex will surface for each query.
|
2023-03-14 12:49:25 -07:00 |
|
tiferet
|
9aba7a0bca
|
Bug fixes for things that interfere with using the codex model
|
2023-03-14 12:49:25 -07:00 |
|
tiferet
|
9a21539fca
|
Add a test that can be used to determine how well codex reproduces the manual modeling for each sink type.
|
2023-03-14 12:49:25 -07:00 |
|
tiferet
|
d76d11bd27
|
Fix endpointScores
|
2023-03-14 12:49:25 -07:00 |
|