This code was needlessly slow. It copied every character into a temporary array, copied that array into a String, and then appended that String onto another String by concatenation. Note that the call to Character.toChars is redundant here because advance() doesn't return a code point; it returns either -1 or a UTF-16 char. The -1 case is checked before the call is reached, so we can simply cast the result to a char and use it directly. We now use a StringBuilder to accumulate the string. Normally it's faster to track the start and end indices and call substring(), but that won't work in the JSDoc extractor because of the star-skipping logic in advance(): the accumulated text is not a contiguous slice of the source.
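The difference can be illustrated with a minimal sketch. The class JsDocTextSketch, its simplified advance(), and both reader methods are hypothetical stand-ins written for this illustration; they are not the extractor's actual code, and the real star-skipping logic may differ. The only assumption taken from the description above is that advance() returns -1 at end of input and a UTF-16 char otherwise.

```java
// Hypothetical sketch; not the extractor's real scanner.
class JsDocTextSketch {
    private final String src;
    private int pos = 0;

    JsDocTextSketch(String src) {
        this.src = src;
    }

    // Assumed behaviour: returns -1 at end of input, otherwise a UTF-16 char.
    // After a newline it skips leading blanks and one '*' (simplified star-skipping).
    int advance() {
        if (pos >= src.length()) {
            return -1;
        }
        char c = src.charAt(pos++);
        if (c == '\n') {
            int p = pos;
            while (p < src.length() && (src.charAt(p) == ' ' || src.charAt(p) == '\t')) {
                p++;
            }
            if (p < src.length() && src.charAt(p) == '*') {
                pos = p + 1; // drop the indentation and the star
            }
        }
        return c;
    }

    // Old pattern: temporary char[] -> temporary String -> String concatenation.
    String readAllSlow() {
        String result = "";
        int ch;
        while ((ch = advance()) != -1) {
            result += new String(Character.toChars(ch));
        }
        return result;
    }

    // New pattern: -1 is already excluded, so a plain cast to char is safe.
    String readAllFast() {
        StringBuilder sb = new StringBuilder();
        int ch;
        while ((ch = advance()) != -1) {
            sb.append((char) ch);
        }
        return sb.toString();
    }
}
```

For the input "a\n * b", both methods return "a\n b". That result does not occur anywhere in the input as a contiguous run, which is why tracking start and end indices and calling substring() is not an option in this sketch.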
JavaScript extractor
This directory contains the source code of the JavaScript extractor. The extractor depends on various libraries that are not currently bundled with the source code, so at present it cannot be built in isolation.
The extractor consists of a parser for the latest version of ECMAScript, including a few proposed and historic extensions (see src/com/semmle/jcorn), classes for representing JavaScript and TypeScript ASTs (src/com/semmle/js/ast and src/com/semmle/ts/ast), and various other bits of functionality. Historically, the main entry point of the JavaScript extractor has been com.semmle.js.extractor.Main. However, this class is slowly being phased out in favour of com.semmle.js.extractor.AutoBuild, which is the entry point used by CodeQL.
License
Like the CodeQL queries, the JavaScript extractor is licensed under the MIT License by GitHub. Some code is derived from other projects, whose licenses are noted in other LICENSE-*.md files in this folder.