Commit Graph

344 Commits

Author SHA1 Message Date
tiferet
b885249d9d Add a boosted version of XssThroughDOM 2022-11-29 17:40:20 -08:00
tiferet
c5184d37e7 Suggestion from code review:
Name the query configuration e.g. `NosqlInjectionATMConfig` rather than `Configuration`.
2022-11-29 15:46:05 -08:00
tiferet
6f807e9d43 Doc suggestion from code review 2022-11-29 13:20:47 -08:00
tiferet
75cd7a9ebc Remove code duplication in query .ql files:
Define the query for finding ATM alerts in the base class `AtmConfig`, and call it from each query's .ql file.
2022-11-29 13:20:47 -08:00
tiferet
a710b723d1 Move the definition of isSink to the base class:
Holds if `sink` is a known taint sink or an "effective" sink.
2022-11-29 13:20:47 -08:00
tiferet
cd24ec88d6 Move the definition of isSource to the base class:
A long as we're not boosting sources, `isSource` is identical to `isKnownSource`.
2022-11-29 13:20:47 -08:00
tiferet
50291c7b7c AtmConfig inherits from TaintTracking::Configuration.
That way the specific configs which inherit from `AtmConfig` also inherit from `TaintTracking::Configuration`.

This removes the need for two separate config classes for each query.
2022-11-29 13:20:47 -08:00
tiferet
05a943c9b5 Delete StandardEndpointFilters.
All remaining functionality in `StandardEndpointFilters` is only being used in `EndpointCharacteristics`, so it can be moved there as a small set of helper predicates.
2022-11-29 13:20:47 -08:00
tiferet
5402f047bf Delete CoreKnowledge.
All remaining functionality in `CoreKnowledge` is only being used in `EndpointCharacteristics`, so it can be moved there as a small set of helper predicates.
2022-11-29 13:20:47 -08:00
tiferet
1d4b2ccab4 Merge branch 'main' into tiferet/complexity-reduction 2022-11-29 12:47:18 -08:00
Tiferet Gazit
f375b0cc1b Merge pull request #11281 from github/tiferet/endpoint-filters
ATM: Implement the current endpoint filters as EndpointCharacteristics
2022-11-29 12:38:12 -08:00
tiferet
4580b55673 Oops -- forgot to stage one file in the previous commit :) 2022-11-28 11:34:34 -08:00
tiferet
210644e87d Delete StandardEndpointFilters.
All remaining functionality in `StandardEndpointFilters` is only being used in `EndpointCharacteristics`, so it can be moved there as a small set of helper predicates.
2022-11-28 11:34:34 -08:00
tiferet
15121931b4 Delete CoreKnowledge.
All remaining functionality in `CoreKnowledge` is only being used in `EndpointCharacteristics`, so it can be moved there as a small set of helper predicates.
2022-11-28 11:34:34 -08:00
tiferet
1c679378e7 FilteringReason is no longer being used and can be deleted 2022-11-28 11:34:33 -08:00
tiferet
99de397a5f Remove redundant code
`isOtherModeledArgument` and `isArgumentToBuiltinFunction` contained the old logic for selecting negative endpoints for training.

These can now be deleted, and replaced by a single base class that collects all EndpointCharacteristics that are currently used to indicate negative training samples: `OtherModeledArgumentCharacteristic`.

This in turn lets us delete code from `StandardEndpointFilters` that effectively said that endpoints that are high-confidence non-sinks shouldn't be scored at inference time, either.
2022-11-28 11:34:33 -08:00
tiferet
7b0269c999 Fix British spelling that code scanning didn't like.
I've been working with Brits for too long :)
2022-11-28 11:28:08 -08:00
tiferet
963407de4c Update the documentation 2022-11-28 11:16:06 -08:00
Henry Mercer
56e5f01ce0 Merge branch 'main' into codeql-ci/atm/release-0.4.2 2022-11-24 14:41:49 +00:00
github-actions[bot]
78d49e44b1 JS: Bump version of ML-powered library and query packs to 0.4.3 2022-11-24 14:22:14 +00:00
github-actions[bot]
8d96bfe973 JS: Bump patch version of ML-powered library and query packs 2022-11-24 14:18:13 +00:00
Erik Krogh Kristensen
1eec067474 Merge pull request #11294 from erik-krogh/fileDoc
QL: improve the "this block-comment should have been a QLDoc"-query
2022-11-23 22:23:36 +01:00
tiferet
03b8e649f1 Filter endpoints by confidence
Select endpoints to score at inference time base purely on their confidence level, and not on whether they fit the historical definition of endpoint filters.
2022-11-23 10:46:27 -08:00
Henry Mercer
3b69821630 ATM: Add descriptions to ML-powered packs 2022-11-23 10:46:23 +00:00
tiferet
1c9545e49a Address comment from code review:
Make `SyntacticHeuristics` an explicit import
2022-11-21 08:00:31 -08:00
tiferet
8d22fd25f1 Suggestions from code review 2022-11-18 15:57:46 -08:00
tiferet
4a1382925e Remove some imports that are no longer used 2022-11-16 14:01:16 -08:00
tiferet
ccbf1ca2a9 Add a comment 2022-11-16 13:05:06 -08:00
tiferet
38c40a7192 isEffectiveSink can't be final because ExtractMisclassifiedEndpointFeatures overrides it. 2022-11-16 12:12:50 -08:00
tiferet
8fee9cb0d5 Fix CodeQL warnings 2022-11-16 12:06:52 -08:00
tiferet
c2035e85d2 Be explicit in requiring that each ATM config set its endpoint type. 2022-11-16 11:55:23 -08:00
tiferet
0fd013f9fd Update the reason names in FilteredTruePositives.expected.
This is needed because we changed the names of three endpoint filters that were all called "not a direct argument to a likely external library call or a heuristic sink" in order to disambiguate them (fc56c5a022).
2022-11-16 11:54:10 -08:00
tiferet
eab270eb84 Move the definitions of isEffectiveSink and getAReasonSinkExcluded to the base class.
They can now be implemented generically for all sink types.
2022-11-16 11:47:24 -08:00
tiferet
fc56c5a022 Implement the type-specific endpoint filters as EndpointCharacteristics.
Also disambiguate three filters from three different sink types that all have the same name, "not a direct argument to a likely external library call or a heuristic sink".
2022-11-16 11:14:25 -08:00
erik-krogh
9eaeaf7322 ATM: convert some block-comments that could be QLDoc to QLDoc 2022-11-16 13:41:52 +01:00
tiferet
13cb0ab554 Fix CodeQL warning 2022-11-15 17:32:30 -08:00
tiferet
2ecdfd1ff6 Delete some code that's no longer in use 2022-11-15 17:29:03 -08:00
tiferet
fedb98ddb5 Implement the standard getAReasonSinkExcluded using StandardEndpointFilterCharacteristics 2022-11-15 17:22:00 -08:00
tiferet
cf4e37a0ab Implement the standard endpoint filters as EndpointCharacteristics 2022-11-15 17:20:20 -08:00
tiferet
cb632b3534 Delete the file ExtractEndpointData.expected which was leftover in the last PR 2022-11-15 17:11:34 -08:00
Tiferet Gazit
710b215c38 Merge pull request #11263 from github/tiferet/extract-training-data
ATM: Extract training data
2022-11-15 12:08:13 -08:00
tiferet
fc078a47fd Apply suggestion from code review 2022-11-15 11:14:01 -08:00
Tiferet Gazit
092e019de9 Apply suggestions from code review
Co-authored-by: Stephan Brandauer <kaeluka@github.com>
2022-11-15 10:48:32 -08:00
Andrew Eisenberg
88750a7000 Add more information about ATM queries for external users 2022-11-15 10:17:56 -08:00
Stephan Brandauer
ec3578364e remove superfluous class in EndpointCharacteristics hierarchy 2022-11-15 10:17:38 +01:00
tiferet
9ecff0723c Fix non-ascii character in docs 2022-11-14 16:34:24 -08:00
tiferet
6b7612fed7 Fix import errors in DebugResultInclusion.ql 2022-11-14 15:33:46 -08:00
tiferet
b47723d607 Delete ExtractEndpointData.
Also remove the associated test files.
2022-11-14 14:57:59 -08:00
tiferet
9d7e7735d5 Extract training data:
Implement the new query that selects data for training. For now we include clauses that implement logic that is identical to the old queries.

Include a temporary wrapper query that converts the resulting data into the format expected by the endpoint pipeline.

Move the small pieces of `ExtractEndpointData` that are still needed into `ExtractEndpointDataTraining.qll`.
2022-11-14 14:33:08 -08:00
Tiferet Gazit
855eddab80 Merge pull request #11174 from github/tiferet/non-sink-endpoint-characteristics
Non-sink endpoint characteristics
2022-11-14 09:37:25 -08:00