mirror of
https://github.com/github/codeql.git
synced 2026-05-24 16:17:07 +02:00
JS: Improve performance of ClassifyFiles::isTestFile
One of the heuristics for test files looks for source files of the form `base.ext`, then looks for sibling test files of the form `base.test.ext` or `base.spec.ext`.

On large databases, the resulting join order computed all source files, then the containers of those files, then all other files within those containers, and only then computed the test file names and filtered on them. The product of all files with all other files in the same containers is of the same order of magnitude as the product of the `files` table with itself, which on large DBs like Node can be 12M+ tuples.

As a performance optimisation, factor out a helper predicate that computes the likely test file names for each source file, so that the test files can be found with a single join against the `files` table. This results in much better join orders, such as computing the set of files and their containers, then the test file names, then the sibling files with those names.

This loses some flexibility, because the set of 'test' stem extensions is hardcoded in the library rather than provided by the calling predicate. The original predicate remains to avoid breaking other callers, but could eventually be deprecated.
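The cost difference between the two join orders can be modelled outside QL. The following is a rough Python sketch, not a description of how the CodeQL evaluator works: `FILES`, `candidate_test_names`, `find_by_sibling_scan` and `find_by_name_lookup` are hypothetical names introduced for illustration, and the stem/extension split is a simplification of `File.getStem()` and `File.getExtension()`.

```python
from collections import defaultdict

# Mirrors the stem extensions hardcoded in the new helper predicate.
TEST_STEM_EXTS = ("test", "spec")

# Hypothetical toy "files" table: (container, file name) tuples.
FILES = [
    ("src", "a.js"), ("src", "a.test.js"),
    ("src", "b.js"), ("src", "b.spec.js"),
    ("src", "c.js"), ("lib", "d.ts"),
]


def candidate_test_names(name):
    """Return '<base>.<stemExt>.<ext>' names for a file named '<base>.<ext>'."""
    stem, dot, ext = name.rpartition(".")
    if not dot or not stem:
        return []  # no extension (or no base): no candidate names
    return [f"{stem}.{stem_ext}.{ext}" for stem_ext in TEST_STEM_EXTS]


def find_by_sibling_scan(files):
    """Pair each file with every file in its container, then filter by name."""
    by_dir = defaultdict(list)
    for container, name in files:
        by_dir[container].append(name)
    found, pairs = set(), 0
    for container, name in files:
        wanted = set(candidate_test_names(name))
        for sibling in by_dir[container]:
            pairs += 1  # one intermediate tuple per (file, sibling) pair
            if sibling in wanted:
                found.add((container, sibling))
    return found, pairs


def find_by_name_lookup(files):
    """Compute candidate names first, then do one lookup per candidate."""
    index = set(files)
    found, lookups = set(), 0
    for container, name in files:
        for candidate in candidate_test_names(name):
            lookups += 1  # at most len(TEST_STEM_EXTS) per file
            if (container, candidate) in index:
                found.add((container, candidate))
    return found, lookups


if __name__ == "__main__":
    print(find_by_sibling_scan(FILES))
    print(find_by_name_lookup(FILES))
```

Both functions find the same test files. The sibling scan does work proportional to the sum of squared container sizes, while the name lookup does at most two lookups per file, which is the shape of improvement the commit aims for.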
Committed by: Henry Mercer
Parent: 03bb3cce73
Commit: b7852cec7a
@@ -56,9 +56,7 @@ predicate isGeneratedCodeFile(File f) { isGenerated(f.getATopLevel()) }
 predicate isTestFile(File f) {
   exists(Test t | t.getFile() = f)
   or
-  exists(string stemExt | stemExt = "test" or stemExt = "spec" |
-    f = getTestFile(any(File orig), stemExt)
-  )
+  f = getATestFile(_)
   or
   f.getAbsolutePath().regexpMatch(".*/__(mocks|tests)__/.*")
 }

@@ -40,7 +40,7 @@ class BDDTest extends Test, @call_expr {
 
 /**
  * Gets the test file for `f` with stem extension `stemExt`.
- * That is, a file named file named `<base>.<stemExt>.<ext>` in the
+ * That is, a file named `<base>.<stemExt>.<ext>` in the
  * same directory as `f` which is named `<base>.<ext>`.
  */
 bindingset[stemExt]
@@ -48,6 +48,33 @@ File getTestFile(File f, string stemExt) {
   result = f.getParentContainer().getFile(f.getStem() + "." + stemExt + "." + f.getExtension())
 }
 
+/**
+ * Gets a test file for `f`.
+ * That is, a file named `<base>.<stemExt>.<ext>` in the
+ * same directory as `f`, where `f` is named `<base>.<ext>` and
+ * `<stemExt>` is a well-known test file identifier, such as `test` or `spec`.
+ */
+File getATestFile(File f) {
+  result = f.getParentContainer().getFile(getATestFileName(f))
+}
+
+/**
+ * Gets a name of a test file for `f`.
+ * That is, `<base>.<stemExt>.<ext>` where
+ * `f` is named `<base>.<ext>` and `<stemExt>` is
+ * a well-known test file identifier, such as `test` or `spec`.
+ */
+// Helper predicate factored out for performance.
+// This predicate is linear in the size of f, and forces
+// callers to join only once against f rather than two separate joins
+// when computing the stem and the extension.
+// This loses some flexibility because callers cannot specify
+// an arbitrary stemExt.
+pragma[nomagic]
+private string getATestFileName(File f) {
+  result = f.getStem() + "." + ["test", "spec"] + "." + f.getExtension()
+}
+
 /**
  * A Jest test, that is, an invocation of a global function named
  * `test` where the first argument is a string and the second