
55 Commits

Author SHA1 Message Date
Paolo Tranquilli
e1f92b466f Merge pull request #18608 from github/aibaars/tracing
Rust: use tracing/tracing-subscriber for logging
2025-02-14 10:36:54 +01:00
Tom Hvitved
02fd23e53a Rust extractors: Normalize drive letter paths with a trailing / 2025-02-13 14:02:39 +01:00
Arthur Baars
5621eecc86 Rust: config: replace verbose with verbosity 2025-02-12 10:27:33 +01:00
Arthur Baars
c602e82ac4 Rust: use tracing-subscriber 2025-02-12 10:27:29 +01:00
Arthur Baars
a8fbb37569 TreeSitter extractors: log fewer lines
Printing a line for every extracted file is too verbose and for large projects makes it impossible to view the log in the Actions UI.
2025-02-07 12:28:17 +01:00
Paolo Tranquilli
d2c7decd02 Rust/Ruby: upgrade all cargo dependencies excluding rust-analyzer
The rust-analyzer update will need more work as it seems to break rust
analysis on windows.

This was carried out using `cargo upgrade` from `cargo-edit`:
* getting exclusions options for rust-analyzer with
   ```bash
   cargo upgrade -i --dry-run | grep -o 'ra_ap_\S\+' | sort -u | sed 's/^/--exclude=/' > /tmp/exclude
   ```
* running
   ```bash
   cargo upgrade -i $(cat /tmp/exclude)
   misc/bazel/3rdparty/update_cargo_deps.sh
   ```
2025-01-08 09:57:11 +01:00
Cornelius Riemenschneider
a66f8209f9 Rust: Vendor 3rdparty dependencies.
We've been observing some performance issues using crate_universe on CI.
Therefore, we're moving to vendor the auto-generated BUILD files
in our repository. This should provide a nice speed boost, while
getting rid of the complexity of the "rust cache" job we've been using
when we had a lot of git dependencies.

This PR includes a vendor script, and I'll put up a CI job internally
that runs that vendor script on Cargo.toml and Cargo.lock changes, to check
that the vendored files are in sync.
2024-11-13 13:22:14 +01:00
Cornelius Riemenschneider
e8aa5db07a Rust: Update cargo dependencies.
There was a recent round of tree-sitter-* package releases,
so the latest code is now a) released and b) available on crates.io.

Therefore, move away from the (super slow on CI) git dependencies to released crates instead.
This also includes a run of `cargo update`, so there are a bunch more changes to the lockfile.
2024-11-11 12:13:14 +01:00
Paolo Tranquilli
cb53911224 Merge branch 'main' into redsun82/rust-cli-flags 2024-09-16 09:36:06 +02:00
Paolo Tranquilli
b4b680775c Rust: integrate into standard files+location library 2024-09-12 13:17:10 +02:00
Paolo Tranquilli
0a8c0f5ab4 Rust: fix bazel build 2024-09-12 08:46:50 +02:00
Paolo Tranquilli
f8c9d96882 Bazel: remove non-working fake tree-sitter-extractor workaround
The `.cargo/config.toml` override based workaround wasn't really
working, as while `cargo build|check` was reading that, `cargo metadata`
wasn't, ending up in a completely broken IDE experience.

For the moment, we just use a unified workspace `Cargo.toml` for all
extractors using the shared tree-sitter code, which has the downside of
making bazel pull in dependencies for all of them, and not being able to
do sparse checkouts for them. We should investigate and revisit this in
the future.
2024-09-11 08:17:11 +02:00
Paolo Tranquilli
6f36ea9188 Merge branch 'main' into rust-experiment
Conflicts:
  shared/tree-sitter-extractor/src/trap.rs
2024-09-09 14:15:34 +02:00
Paolo Tranquilli
2c472dd5b8 Tree-sitter: fix formatting 2024-09-09 11:59:17 +02:00
Paolo Tranquilli
4454566d8d Tree-sitter: allow multiple sources per trap file
This generalizes the location cache to allow multiple sources to be
extracted in the same trap file, by adding `file_label` to `Location`,
and therefore to location cache keys. This will be used by the Rust
extractor.
2024-09-09 09:17:45 +02:00
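A minimal sketch of the idea in 4454566d8d, assuming a cache that maps locations to labels. The names `Label`, `LocationCache`, and `label_for` are hypothetical and not taken from the shared library; only `Location` and `file_label` come from the commit message. Because `file_label` is part of the key, the same span in two different source files gets two distinct labels within one trap file.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a TRAP label; the shared library's real type may differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct Label(u32);

/// A source location. `file_label` is part of the cache key, so identical
/// line/column spans in different files do not collide.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
struct Location {
    file_label: Label,
    start_line: usize,
    start_column: usize,
    end_line: usize,
    end_column: usize,
}

/// Hypothetical location cache: one label per distinct `Location`.
#[derive(Default)]
struct LocationCache {
    labels: HashMap<Location, Label>,
    next_id: u32,
}

impl LocationCache {
    /// Returns the cached label for `loc`, allocating a new one on first use.
    fn label_for(&mut self, loc: Location) -> Label {
        if let Some(label) = self.labels.get(&loc) {
            return *label;
        }
        let label = Label(self.next_id);
        self.next_id += 1;
        self.labels.insert(loc, label);
        label
    }
}
```

Without `file_label` in the key, two files extracted into the same trap file would share a label for any span with equal line and column numbers.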
Paolo Tranquilli
b23e482ed2 Merge branch 'main' into rust-experiment 2024-09-05 12:29:29 +02:00
Tom Hvitved
eb1b2a5594 Bump tree-sitter to 0.23.0 2024-09-04 09:47:59 +02:00
Paolo Tranquilli
7e1290aa74 Rust: reuse shared rust trap library 2024-08-30 16:08:37 +02:00
Tom Hvitved
beeae69845 Tree-sitter: Verbosity fixes 2024-05-31 20:10:19 +02:00
Tom Hvitved
d6a3765597 Tree-sitter: Allow for multiple file lists in simple extractor 2024-05-31 11:15:21 +02:00
Tom Hvitved
94d2e9591d Tree-sitter: Emit empty_location relation to avoid scan 2024-05-27 10:39:21 +02:00
Cornelius Riemenschneider
8c46b61e85 Ruby: Change how we pull in shared/tree-sitter-extractor dependency
Previously, we pulled in the shared tree-sitter extractor via a `git`
dependency in `Cargo.toml` to address a `rules_rust` limitation (no `path`
dependencies outside of the cargo workspace). This was a problem,
as that means we're cloning `github/codeql` _again_ for the build, which is
quite slow.

I found another way that is faster, and still produces correct builds
for both `cargo` and `rules_rust`:
* Cargo depends on a fake crate that has the same dependencies as the real crate (thanks to `sync-files.py`). Therefore, cargo pulls in the right dependencies into the lockfile, which bazel targets
* For local builds, we override the path to that dependency in a cargo config, so we're pulling in the correct code
* rules_rust only uses `path` dependencies for collecting transitive dependencies; it never pulls in the code from there. For that, we manually provide a `BUILD.bazel` file for the shared extractor, and depend on that.
2024-05-24 15:37:35 +02:00
Tom Hvitved
e4cd9d86f6 Tree-sitter: Respect verbosity defined in CODEQL_VERBOSITY 2024-05-23 13:38:35 +02:00
Tom Hvitved
a523be4d0a Tree-sitter: Add set_tracing_level to shared extractor module 2024-05-23 12:58:53 +02:00
Tom Hvitved
bf2ae9890f Tree-sitter: Bump to 0.22.6 2024-05-21 11:14:06 +02:00
Tom Hvitved
cee6f003fd Tree-sitter: Split up ast_node_info table into two tables 2024-03-19 10:52:37 +01:00
Rasmus Wriedt Larsen
07223031e8 Merge branch 'main' into lgtm_index_filter_handling 2024-02-26 09:56:02 +01:00
Nick Rolfe
514a92d5bd Tree-sitter extractors: use fresh IDs for locations
Since locations for any given source file are never referenced in any
TRAP files besides the one for that particular source file, it's not
necessary to use global IDs. Using fresh IDs will reduce the size of the
ID pool (both on disk and in memory) and improve the speed of
multi-threaded TRAP import.

The one exception is the empty location, which still uses a global ID.
2024-02-02 15:06:10 +00:00
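A rough sketch of the distinction in 514a92d5bd, assuming the usual TRAP label syntax (`#n=*` for a fresh ID, `#n=@"key"` for a global one). The `Id` enum, both function names, and the key text `empty_location` are hypothetical and chosen only for illustration; key escaping is not handled.

```rust
/// How a TRAP label is defined (simplified sketch).
enum Id {
    /// Fresh: unique within one TRAP file, never entered into the global ID pool.
    Fresh,
    /// Global: identified by a key string that is shared across TRAP files.
    Global(String),
}

/// Renders a label definition in TRAP-like syntax.
fn define_label(label: u32, id: &Id) -> String {
    match id {
        Id::Fresh => format!("#{}=*", label),
        Id::Global(key) => format!("#{}=@\"{}\"", label, key),
    }
}

/// Locations get fresh IDs; only the empty location keeps a global key.
fn location_id(is_empty_location: bool) -> Id {
    if is_empty_location {
        // Hypothetical key text, not the extractor's actual key.
        Id::Global("empty_location".to_string())
    } else {
        Id::Fresh
    }
}
```

Only global keys need to be recorded in a pool shared between TRAP files, which is why moving locations to fresh IDs shrinks that pool.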
Rasmus Wriedt Larsen
f20d4e22fe Handle only exclude 2024-01-18 13:54:45 +01:00
Rasmus Wriedt Larsen
54c7c5e8be Tree sitter extractor: Proper handling of LGTM_INDEX_FILTERS
If someone had used `LGTM_INDEX_FILTERS=exclude:**/*\ninclude:*.rb`
before, we would have mistakenly excluded all files :|
(LGTM_INDEX_FILTERS is a prioritized list where later matches take
priority over earlier ones)

This change is needed to support adding `exclude:**/*` as the first
filter if `paths` include a glob, which currently causes bad behavior in
the Python extractor. However, we can only introduce that change once
this PR has been merged.

I realize this change can cause more folders and files to be traversed
(since they are not just skipped with --exclude). We plan to make a
better long-term fix, which should bring back the previous performance.
2024-01-18 11:44:31 +01:00
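The "later matches take priority" rule from 54c7c5e8be can be sketched as below. This is not the extractor's implementation: `wildcard_match` is a deliberately simplified matcher (every `*` matches any run of characters, including `/`, and a leading `**/` is treated as optional), and the default of including a file that no filter mentions is an assumption.

```rust
/// Simplified wildcard match: `*` matches any run of bytes, including `/`.
fn wildcard_match(pattern: &[u8], text: &[u8]) -> bool {
    match pattern.split_first() {
        None => text.is_empty(),
        // Try every possible length for the run consumed by `*`.
        Some((b'*', rest)) => (0..=text.len()).any(|i| wildcard_match(rest, &text[i..])),
        Some((c, rest)) => text
            .split_first()
            .map_or(false, |(t, tail)| t == c && wildcard_match(rest, tail)),
    }
}

/// Treats a leading `**/` as "zero or more directories" by dropping it.
fn glob_match(pattern: &str, path: &str) -> bool {
    let pattern = pattern.strip_prefix("**/").unwrap_or(pattern);
    wildcard_match(pattern.as_bytes(), path.as_bytes())
}

/// Walks the filters in order; the last filter that matches decides.
fn is_included(filters: &str, path: &str) -> bool {
    // Assumption: a path that matches no filter is included.
    let mut included = true;
    for line in filters.lines() {
        if let Some(pattern) = line.strip_prefix("include:") {
            if glob_match(pattern, path) {
                included = true;
            }
        } else if let Some(pattern) = line.strip_prefix("exclude:") {
            if glob_match(pattern, path) {
                included = false;
            }
        }
    }
    included
}
```

With the filter string from the commit message, `exclude:**/*` followed by `include:*.rb`, a path such as `lib/a.rb` is first excluded and then re-included by the later line, while `lib/a.py` stays excluded.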
Taus
ff35f9fb8c Shared: Clean up NodeInfo in shared extractor
I was perusing the shared extractor the other day, when I came across
the `NodeInfo` struct. I noticed that the `fields` and `subtypes` fields
on this struct had two seemingly identical ways of expressing the same
thing: `None` and `Some(empty)` (where `empty` is respectively the empty
map and the empty vector). As far as I can tell, there's no semantic
difference in either case, so we can just elide the option type entirely
and use the empty value directly. This has the nice side-effect of
cleaning up some of the other code.
2023-09-27 12:29:07 +00:00
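The cleanup described in ff35f9fb8c might look roughly like this. Only the `fields` and `subtypes` field names come from the commit message; the struct names, the `String` element types, and the two counting helpers are placeholders for illustration.

```rust
use std::collections::BTreeMap;

/// Before: two ways to say "nothing here" (`None` and `Some(empty)`).
struct NodeInfoBefore {
    fields: Option<BTreeMap<String, String>>,
    subtypes: Option<Vec<String>>,
}

/// After: the empty collection is the only "nothing here" value.
#[derive(Default)]
struct NodeInfoAfter {
    fields: BTreeMap<String, String>,
    subtypes: Vec<String>,
}

/// Callers of the old shape have to unwrap before they can count anything.
fn count_before(info: &NodeInfoBefore) -> usize {
    info.fields.as_ref().map_or(0, |f| f.len())
        + info.subtypes.as_ref().map_or(0, |s| s.len())
}

/// Callers of the new shape use the collections directly.
fn count_after(info: &NodeInfoAfter) -> usize {
    info.fields.len() + info.subtypes.len()
}
```

Since `None` and `Some(empty)` produce the same result in `count_before`, dropping the `Option` loses no information and removes the unwrapping at each use site.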
Harry Maclean
b76842ad3d Shared: Fix clippy lint 2023-08-23 16:24:57 +01:00
Harry Maclean
3680613f2d Shared: Restrict extractor file globs to filenames 2023-08-23 16:09:56 +01:00
Harry Maclean
cc7ef5dac1 Shared: Fix clippy lint in shared extractor 2023-08-23 14:11:22 +01:00
Harry Maclean
ed40d72e4f Shared: Bump extractor version 2023-08-23 14:11:22 +01:00
Harry Maclean
7e2abf20c6 Shared: Support glob patterns in shared extractor
Replace the `file_extensions` field with `file_globs`, which supports
UNIX style glob patterns powered by the `globset` crate.

This allows files with no extension (e.g. Dockerfiles) to be extracted,
by specifying a glob such as `*Dockerfile`.

One surprising aspect of this change is that the globs match against the
whole path, rather than just the file name.

This is a breaking change.
2023-08-23 14:11:21 +01:00
Arthur Baars
2416568489 Tree-sitter extractor: fix clippy warnings 2023-05-22 19:37:58 +02:00
Arthur Baars
d2bc66e393 QL: switch to shared YAML extractor 2023-05-22 19:28:59 +02:00
Arthur Baars
9f83dd5c7a Tree-sitter extractor: extract shared dbscheme fragments into 'prefix.dbscheme' 2023-05-22 19:28:51 +02:00
Harry Maclean
48f22681a5 Merge pull request #13029 from hmac/ruby-autobuilder-refactor
Shared: Share autobuilder code between Ruby and QL
2023-05-12 18:24:06 +07:00
Harry Maclean
9203efbdc4 Shared: Share autobuilder code between Ruby and QL 2023-05-05 07:20:14 +00:00
Harry Maclean
c7e8f0d12a Shared: Pin rust version for shared extractor 2023-05-05 06:36:55 +00:00
Harry Maclean
a577bec22c Shared: Fix clippy warnings in shared extractor 2023-05-05 06:30:12 +00:00
Harry Maclean
8a89aec220 Shared: Handle trap compression option properly
Extracting the compression setting from an environment variable is the
responsibility of the API consumer.
2023-04-27 05:06:57 +00:00
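One way to read the split of responsibilities in 8a89aec220, as a sketch under stated assumptions: the library takes the compression setting as a plain argument, and the consumer decides where that value comes from. The `Compression` enum, both function names, the accepted strings, and the gzip fallback are all hypothetical.

```rust
/// Hypothetical compression setting passed into the library API.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Compression {
    Uncompressed,
    Gzip,
}

/// Consumer side: turns an optional setting string into a `Compression`.
fn compression_from_setting(value: Option<&str>) -> Compression {
    match value.map(|v| v.trim().to_ascii_lowercase()).as_deref() {
        Some("none") => Compression::Uncompressed,
        // Assumption: unset or unrecognised values fall back to gzip.
        _ => Compression::Gzip,
    }
}

/// Library side: receives the setting as an argument and never reads the environment.
fn trap_file_name(stem: &str, compression: Compression) -> String {
    match compression {
        Compression::Uncompressed => format!("{stem}.trap"),
        Compression::Gzip => format!("{stem}.trap.gz"),
    }
}
```

A consumer would read whichever environment variable it owns, for example with `std::env::var(...).ok()`, and pass the result through `compression_from_setting` before calling into the library.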
Harry Maclean
3f6087e179 Shared: formatting 2023-04-23 06:04:55 +00:00
Harry Maclean
9005684b10 Shared: Add integration test for shared extractor
This is a very basic test but provides some confidence that the extractor is
working.
2023-04-23 05:29:22 +00:00
Harry Maclean
ac1d250596 Shared: fix language prefix in extractor 2023-04-21 15:07:47 +07:00
Harry Maclean
8091d57f03 Shared: Remove unused type 2023-04-20 08:07:40 +07:00
Harry Maclean
c4d7658cc6 Shared: high level API for the shared extractor
This API makes it easy to create an extractor for simple use cases.
2023-04-20 08:07:40 +07:00
Harry Maclean
2107533822 Shared: Clippy fixes
Use clearer methods where appropriate.
2023-04-05 18:46:57 +08:00