Commit Graph

59 Commits

Author SHA1 Message Date
Erik Krogh Kristensen
1b5c7392f0 restrict the size of the getASubexpressionWithinQuery predicate, and remove double-recursion 2022-03-01 11:18:42 +01:00
Erik Krogh Kristensen
1407b49a8f fix some instances of ql/pred-doc-style for JS 2022-02-21 15:02:21 +01:00
Erik Krogh Kristensen
a1c5724be7 fix most ql-for-ql warnings in JS 2022-02-11 17:57:37 +01:00
Ian Wright
be5e8dae05 Update javascript/ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/FunctionBodyFeatures.qll
Co-authored-by: Henry Mercer <henrymercer@github.com>
2022-02-04 15:41:50 +00:00
Ian Wright
e57a0e0e2f Update javascript/ql/experimental/adaptivethreatmodeling/lib/experimental/adaptivethreatmodeling/FunctionBodyFeatures.qll
Co-authored-by: Henry Mercer <henrymercer@github.com>
2022-02-04 15:21:56 +00:00
Ian Wright
b38335a6c2 add QL comment; inline a predicate; restore a comment 2022-02-04 15:21:09 +00:00
Ian Wright
dca03d7b5d reinstate the AST node limit to minimize change to feature values 2022-02-03 09:45:35 +00:00
Ian Wright
d5ab119039 actually count the number of chars 2022-02-03 09:41:51 +00:00
Ian Wright
83ecc065ab restrict size of strings 2022-01-31 12:28:46 +00:00
Ian Wright
aceeb7324c restrict AST nodes according to string length 2022-01-28 15:06:10 +00:00
Henry Mercer
c134e6c9ef JS: Bump ML-powered query packs to v0.0.6 2022-01-19 14:40:42 +00:00
Henry Mercer
d467725ccd JS: Bump ML-powered query packs to v0.0.5 2022-01-19 12:08:33 +00:00
Henry Mercer
1893b9f7a9 Merge pull request #7376 from github/henrymercer/js-atm-absent-features-optimization
JS: Update featurization for absent features optimization
2022-01-18 10:15:53 +00:00
Henry Mercer
ed28b7f174 Merge pull request #7575 from github/henrymercer/atm-remove-code-to-features
JS: Remove ATM `CodeToFeatures` library
2022-01-14 15:31:34 +00:00
Henry Mercer
8e9d8c112d JS: Improve comments in FunctionBodyFeatures.qll 2022-01-13 17:20:42 +00:00
Henry Mercer
2aea3257cb JS: Improve documentation for getTokenizedAstNode 2022-01-13 17:20:41 +00:00
Henry Mercer
92d6fecc73 Optimize performance of body tokens
The refactoring to remove the `CodeToFeatures` AST reintroduced a
performance problem. This commit resolves it by pushing size
restrictions into intermediate predicates.
2022-01-13 16:29:04 +00:00
Henry Mercer
9abc3411a4 JS: Bump ATM pack versions to 0.0.4 2022-01-12 15:19:13 +00:00
Henry Mercer
7f61738a23 Use US English spelling 2022-01-12 13:07:09 +00:00
Henry Mercer
6e37a65e84 Remove CodeToFeatures AST library 2022-01-12 12:47:28 +00:00
Henry Mercer
957e34d8a7 Make function body features library independent of CodeToFeatures AST 2022-01-12 12:47:28 +00:00
Henry Mercer
9e50ce873d Move function body features into their own file 2022-01-12 12:47:28 +00:00
Henry Mercer
865fb5d0ef Migrate representative entity -> representative function 2022-01-12 12:47:27 +00:00
Henry Mercer
0e5b493d0e Remove CodeToFeatures AST consistency checks
We no longer use the `CodeToFeatures` AST, therefore these checks are
defunct.
2022-01-12 12:47:27 +00:00
Henry Mercer
387829bbb4 Extract body tokens from the JS AST, not the CodeToFeatures AST 2022-01-12 12:47:25 +00:00
Henry Mercer
3f70476c87 ATM: Optimize body tokens by pushing in size limit
Pushing the restriction to 256 tokens into the `bodyTokens` predicate
means we avoid this predicate blowing up due to very large functions.

This results in a runtime improvement from 1800s+ to 294s as measured
on a problematic repo on my machine (I didn't wait for the query to
finish running).
2022-01-11 16:16:54 +00:00
Andrew Eisenberg
6d62227576 Merge pull request #7431 from aeisenberg/aeisenberg/solorigate-publish
Solorigate: Extract to separate qlpack
2022-01-06 08:53:32 -08:00
Nick Rolfe
f18492e39b Merge pull request #7443 from github/nickrolfe/behavior
QL4QL: catch behaviour/behavior in ql/non-us-spelling
2021-12-20 13:23:53 +00:00
Henry Mercer
144ec8c629 JS: Update featurization for absent features optimization
Absent features are now represented implicitly by the absence of a row
in the `tokenFeatures` relation, rather than explicitly by an empty
string. This leads to improved runtime performance. To enable this
implicit representation, we pass the set of supported token features to
the `scoreEndpoints` HOP. Requires CodeQL CLI v2.7.4.
2021-12-17 18:04:42 +00:00
Henry Mercer
bebf4ca8fc Merge pull request #7357 from github/henrymercer/js-atm-only-featurize-with-flow
JS: Only featurize endpoints that are part of a flow path
2021-12-17 18:03:40 +00:00
Henry Mercer
055432530f Bump ATM pack version to 0.0.2 2021-12-17 16:49:59 +00:00
Henry Mercer
c1864531cd JS: Push FeaturizationConfig context into more predicates 2021-12-17 16:31:56 +00:00
Henry Mercer
383437c571 JS: Only featurize endpoints that are part of a flow path 2021-12-17 16:31:56 +00:00
Nick Rolfe
28912c508f Fix non-US spelling of 'behavior' 2021-12-17 15:29:31 +00:00
Andrew Eisenberg
50ee4ab330 Solorigate: Extract to separate qlpack
Extracts solorigate to separate qlpacks in preparation for
publishing them to the registry.
2021-12-16 16:09:20 -08:00
Ian Wright
1c79d1f985 Merge pull request #7352 from github/esbena/atm-endpoint-polish
ATM Endpoint filtering improvements
2021-12-14 08:19:23 +00:00
Esben Sparre Andreasen
1949a4e59a autoformat 2021-12-13 22:21:52 +01:00
Esben Sparre Andreasen
13288be7fc make ATM anti sink model for dojo.require 2021-12-10 15:07:51 +01:00
Esben Sparre Andreasen
cfd2dcffa0 recognize more modelled database accesses 2021-12-10 14:54:59 +01:00
Esben Sparre Andreasen
b0f6cf1491 expose more marsdb calls as database accesses 2021-12-10 13:46:19 +01:00
Esben Sparre Andreasen
10498c3643 treat jQuery as fully modelled 2021-12-10 12:51:45 +01:00
Esben Sparre Andreasen
a1ee900f50 treat Base64 manipulations as non-sinks 2021-12-10 12:37:44 +01:00
Henry Mercer
f08f07e19e JS: Improve handling of heuristic sinks in endpoint filters
Previously heuristic sinks were always included, to avoid us filtering
them out due to not being an argument to an external library call.
In this commit we move the argument to an external library call
filtering to the query-specific endpoint filters.
This lets us filter out heuristic sinks if they match one of the other
endpoint filters, reducing FPs.
2021-12-09 15:00:54 +00:00
Henry Mercer
322e39446d JS: Autoformat 2021-12-07 14:17:11 +00:00
Henry Mercer
016727d6b6 JS: Fix occasional duplicate body tokens
0e31439 introduces some occasional duplicate tokens due to duplicate AST
node attributes. The long-term fix is to update `CodeToFeatures.qll`,
but for the short-term, we update the concatenation to concatenate
unique (location, token) pairs.
2021-12-07 14:16:48 +00:00
Aditya Sharad
f68a40f82b JS: Simplify calculation of token features for endpoints
Use a `strictcount` to identify whether there is exactly one feature or not.
If so, we use it. If not, we use the empty string.
Add context to ensure we filter the set of data flow nodes down to only
the set of endpoint nodes.

This performance optimisation avoids calculating the Cartesian product
of data flow nodes and feature names, but it does not avoid calculating
the (slightly smaller) Cartesian product of endpoint nodes and feature names.
Product size = number of endpoint nodes * number of feature names.
At time of writing there are 8 feature names.
2021-12-03 14:20:27 -08:00
Aditya Sharad
fac2769d85 JS: Replace an exists+concat with an equivalent strictconcat 2021-12-03 14:20:26 -08:00
Aditya Sharad
0e31439b7e JS: Simplify aggregation of tokens into entity strings
Change the cutoff logic from `count` to `strictcount`, since we know it only applies
to a non-empty set of results.

Use a single `strictconcat` aggregate to combine tokens in order of location,
instead of computing a `rank` followed by a `concat`.

Strictness introduces a slight change of behaviour because missing tokens will now result
in no results from the predicate rather than an empty feature string.
2021-12-03 14:20:26 -08:00
Aditya Sharad
d0840afb80 JS: Fix compilation errors in EndpointFeatures library
Use the LabelParameter API instead of manually constructing the edge label.
2021-12-03 14:20:17 -08:00
Erik Krogh Kristensen
08ce03cd93 Merge branch 'main' into explicit-this 2021-11-24 15:24:58 +01:00