\input texinfo @iftex @afourpaper @headings double @end iftex @titlepage @afourpaper @sp 7 @center @titlefont{QuickJS Javascript Engine} @sp 3 @end titlepage @setfilename spec.info @settitle QuickJS Javascript Engine @contents @chapter Introduction QuickJS is a small and embeddable Javascript engine. It supports the ES2019 specification including modules, asynchronous generators and proxies. It optionally supports mathematical extensions such as big integers (BigInt), big floating point numbers (BigFloat) and operator overloading. @section Main Features @itemize @item Small and easily embeddable: just a few C files, no external dependency, 180 KiB of x86 code for a simple ``hello world'' program. @item Fast interpreter with very low startup time: runs the 56000 tests of the ECMAScript Test Suite@footnote{@url{https://github.com/tc39/test262}} in about 100 seconds on a single core of a desktop PC. The complete life cycle of a runtime instance completes in less than 300 microseconds. @item Almost complete ES2019 support including modules, asynchronous generators and full Annex B support (legacy web compatibility). @item Passes 100% of the ECMAScript Test Suite tests. @item Can compile Javascript sources to executables with no external dependency. @item Garbage collection using reference counting (to reduce memory usage and have deterministic behavior) with cycle removal. @item Mathematical extensions: BigInt, BigFloat, operator overloading, bigint mode, math mode. @item Command line interpreter with contextual colorization and completion implemented in Javascript. @item Small built-in standard library with C library wrappers. @end itemize @chapter Usage @section Installation A Makefile is provided to compile the engine on Linux or MacOS/X. A preliminary Windows support is available thru cross compilation on a Linux host with the MingGW tools. Edit the top of the @code{Makefile} if you wish to select specific options then run @code{make}. You can type @code{make install} as root if you wish to install the binaries and support files to @code{/usr/local} (this is not necessary to use QuickJS). @section Quick start @code{qjs} is the command line interpreter (Read-Eval-Print Loop). You can pass Javascript files and/or expressions as arguments to execute them: @example ./qjs examples/hello.js @end example @code{qjsc} is the command line compiler: @example ./qjsc -o hello examples/hello.js ./hello @end example generates a @code{hello} executable with no external dependency. @code{qjsbn} and @code{qjscbn} are the corresponding interpreter and compiler with the mathematical extensions: @example ./qjsbn examples/pi.js 1000 @end example displays 1000 digits of PI. @example ./qjsbnc -o pi examples/pi.js ./pi 1000 @end example compiles and executes the PI program. @section Command line options @subsection @code{qjs} interpreter @verbatim usage: qjs [options] [files] @end verbatim Options are: @table @code @item -h @item --help List options. @item -e @code{EXPR} @item --eval @code{EXPR} Evaluate EXPR. @item -i @item --interactive Go to interactive mode (it is not the default when files are provided on the command line). @item -m @item --module Load as ES6 module (default if .mjs file extension). @end table Advanced options are: @table @code @item -d @item --dump Dump the memory usage stats. @item -q @item --quit just instantiate the interpreter and quit. @end table @subsection @code{qjsc} compiler @verbatim usage: qjsc [options] [files] @end verbatim Options are: @table @code @item -c Only output bytecode in a C file. The default is to output an executable file. @item -e Output @code{main()} and bytecode in a C file. The default is to output an executable file. @item -o output Set the output filename (default = @file{out.c} or @file{a.out}). @item -N cname Set the C name of the generated data. @item -m Compile as Javascript module (default if @file{.mjs} extension). @item -M module_name[,cname] Add initialization code for an external C module. See the @code{c_module} example. @item -x Byte swapped output (only used for cross compilation). @item -flto Use link time optimization. The compilation is slower but the executable is smaller and faster. This option is automatically set when the @code{-fno-x} options are used. @item -fno-[eval|string-normalize|regexp|json|proxy|map|typedarray|promise] Disable selected language features to produce a smaller executable file. @end table @section @code{qjscalc} application The @code{qjscalc} application is a superset of the @code{qjsbn} command line interpreter implementing a Javascript calculator with arbitrarily large integer and floating point numbers, fractions, complex numbers, polynomials and matrices. The source code is in @file{qjscalc.js}. More documentation and a web version are available at @url{http://numcalc.com}. @section Built-in tests Run @code{make test} to run the few built-in tests included in the QuickJS archive. @section Test262 (ECMAScript Test Suite) A test262 runner is included in the QuickJS archive. For reference, the full test262 tests are provided in the archive @file{qjs-tests-yyyy-mm-dd.tar.xz}. You just need to untar it into the QuickJS source code directory. Alternatively, the test262 tests can be installed with: @example git clone https://github.com/tc39/test262.git test262 cd test262 git checkout 94b1e80ab3440413df916cd56d29c5a2fa2ac451 patch -p1 < ../tests/test262.patch cd .. @end example The patch adds the implementation specific @code{harness} functions and optimizes the inefficient RegExp character classes and Unicode property escapes tests (the tests themselves are not modified, only a slow string initialization function is optimized). The tests can be run with @example make test2 @end example For more information, run @code{./run-test262} to see the options of the test262 runner. The configuration files @code{test262.conf} and @code{test262bn.conf} contain the options to run the various tests. @chapter Specifications @section Language support @subsection ES2019 support The ES2019 specification @footnote{@url{https://tc39.github.io/ecma262/}} is almost fully supported including the Annex B (legacy web compatibility) and the Unicode related features. The following features are not supported yet: @itemize @item Realms (althougth the C API supports different runtimes and contexts) @item Tail calls@footnote{We believe the current specification of tails calls is too complicated and presents limited practical interests.} @end itemize @subsection JSON The JSON parser is currently more tolerant than the specification. @subsection ECMA402 ECMA402 (Internationalization API) is not supported. @subsection Extensions @itemize @item The directive @code{"use strip"} indicates that the debug information (including the source code of the functions) should not be retained to save memory. As @code{"use strict"}, the directive can be global to a script or local to a function. @item The first line of a script beginning with @code{#!} is ignored. @end itemize @subsection Mathematical extensions The mathematical extensions are available in the @code{qjsbn} version and are fully backward compatible with standard Javascript. See @code{jsbignum.pdf} for more information. @itemize @item The @code{BigInt} (big integers) TC39 proposal is supported. @item @code{BigFloat} support: arbitrary large floating point numbers in base 2. @item Operator overloading. @item The directive @code{"use bigint"} enables the bigint mode where integers are @code{BigInt} by default. @item The directive @code{"use math"} enables the math mode where the division and power operators on integers produce fractions. Floating point literals are @code{BigFloat} by default and integers are @code{BigInt} by default. @end itemize @section Modules ES6 modules are fully supported. The default name resolution is the following: @itemize @item Module names with a leading @code{.} or @code{..} are relative to the current module path. @item Module names without a leading @code{.} or @code{..} are system modules, such as @code{std} or @code{os}. @item Module names ending with @code{.so} are native modules using the QuickJS C API. @end itemize @section Standard library The standard library is included by default in the command line interpreter. It contains the two modules @code{std} and @code{os} and a few global objects. @subsection Global objects @table @code @item scriptArgs Provides the command line arguments. The first argument is the script name. @item print(...args) Print the arguments separated by spaces and a trailing newline. @item console.log(...args) Same as print(). @end table @subsection @code{std} module The @code{std} module provides wrappers to the libc @file{stdlib.h} and @file{stdio.h} and a few other utilities. Available exports: @table @code @item exit(n) Exit the process. @item evalScript(str) Evaluate the string @code{str} as a script (global eval). @item loadScript(filename) Evaluate the file @code{filename} as a script (global eval). @item Error(errno) @code{std.Error} constructor. Error instances contain the field @code{errno} (error code) and @code{message} (result of @code{std.Error.strerror(errno)}). The constructor contains the following fields: @table @code @item EINVAL @item EIO @item EACCES @item EEXIST @item ENOSPC @item ENOSYS @item EBUSY @item ENOENT @item EPERM @item EPIPE Integer value of common errors (additional error codes may be defined). @item strerror(errno) Return a string that describes the error @code{errno}. @end table @item open(filename, flags) Open a file (wrapper to the libc @code{fopen()}). Throws @code{std.Error} in case of I/O error. @item tmpfile() Open a temporary file. Throws @code{std.Error} in case of I/O error. @item puts(str) Equivalent to @code{std.out.puts(str)}. @item printf(fmt, ...args) Equivalent to @code{std.out.printf(fmt, ...args)} @item sprintf(fmt, ...args) Equivalent to the libc sprintf(). @item in @item out @item err Wrappers to the libc file @code{stdin}, @code{stdout}, @code{stderr}. @item SEEK_SET @item SEEK_CUR @item SEEK_END Constants for seek(). @item global Reference to the global object. @item gc() Manually invoke the cycle removal algorithm. The cycle removal algorithm is automatically started when needed, so this function is useful in case of specific memory constraints or for testing. @item getenv(name) Return the value of the environment variable @code{name} or @code{undefined} if it is not defined. @end table FILE prototype: @table @code @item close() Close the file. @item puts(str) Outputs the string with the UTF-8 encoding. @item printf(fmt, ...args) Formatted printf, same formats as the libc printf. @item flush() Flush the buffered file. @item seek(offset, whence) Seek to a give file position (whence is @code{std.SEEK_*}). Throws a @code{std.Error} in case of I/O error. @item tell() Return the current file position. @item eof() Return true if end of file. @item fileno() Return the associated OS handle. @item read(buffer, position, length) Read @code{length} bytes from the file to the ArrayBuffer @code{buffer} at byte position @code{position} (wrapper to the libc @code{fread}). @item write(buffer, position, length) Write @code{length} bytes to the file from the ArrayBuffer @code{buffer} at byte position @code{position} (wrapper to the libc @code{fread}). @item getline() Return the next line from the file, assuming UTF-8 encoding, excluding the trailing line feed. @item getByte() Return the next byte from the file. @item putByte(c) Write one byte to the file. @end table @subsection @code{os} module The @code{os} module provides Operating System specific functions: @itemize @item low level file access @item signals @item timers @item asynchronous I/O @end itemize The OS functions usually return 0 if OK or an OS specific negative error code. Available exports: @table @code @item open(filename, flags, mode = 0o666) Open a file. Return a handle or < 0 if error. @item O_RDONLY @item O_WRONLY @item O_RDWR @item O_APPEND @item O_CREAT @item O_EXCL @item O_TRUNC POSIX open flags. @item O_TEXT (Windows specific). Open the file in text mode. The default is binary mode. @item close(fd) Close the file handle @code{fd}. @item seek(fd, offset, whence) Seek in the file. Use @code{std.SEEK_*} for @code{whence}. @item read(fd, buffer, offset, length) Read @code{length} bytes from the file handle @code{fd} to the ArrayBuffer @code{buffer} at byte position @code{offset}. Return the number of read bytes or < 0 if error. @item write(fd, buffer, offset, length) Write @code{length} bytes to the file handle @code{fd} from the ArrayBuffer @code{buffer} at byte position @code{offset}. Return the number of written bytes or < 0 if error. @item isatty(fd) Return @code{true} is @code{fd} is a TTY (terminal) handle. @item ttyGetWinSize(fd) Return the TTY size as @code{[width, height]} or @code{null} if not available. @item ttySetRaw(fd) Set the TTY in raw mode. @item remove(filename) Remove a file. Return 0 if OK or < 0 if error. @item rename(oldname, newname) Rename a file. Return 0 if OK or < 0 if error. @item setReadHandler(fd, func) Add a read handler to the file handle @code{fd}. @code{func} is called each time there is data pending for @code{fd}. A single read handler per file handle is supported. Use @code{func = null} to remove the hander. @item setWriteHandler(fd, func) Add a write handler to the file handle @code{fd}. @code{func} is called each time data can be written to @code{fd}. A single write handler per file handle is supported. Use @code{func = null} to remove the hander. @item signal(signal, func) Call the function @code{func} when the signal @code{signal} happens. Only a single handler per signal number is supported. Use @code{null} to set the default handler or @code{undefined} to ignore the signal. @item SIGINT @item SIGABRT @item SIGFPE @item SIGILL @item SIGSEGV @item SIGTERM POSIX signal numbers. @item setTimeout(delay, func) Call the function @code{func} after @code{delay} ms. Return a handle to the timer. @item clearTimer(handle) Cancel a timer. @item platform Return a string representing the platform: @code{"linux"}, @code{"darwin"}, @code{"win32"} or @code{"js"}. @end table @section QuickJS C API The C API was designed to be simple and efficient. The C API is defined in the header @code{quickjs.h}. @subsection Runtime and contexts @code{JSRuntime} represents a Javascript runtime corresponding to an object heap. Several runtimes can exist at the same time but they cannot exchange objects. Inside a given runtime, no multi-threading is supported. @code{JSContext} represents a Javascript context (or Realm). Each JSContext has its own global objects and system objects. There can be several JSContexts per JSRuntime and they can share objects, similary to frames of the same origin sharing Javascript objects in a web browser. @subsection JSValue @code{JSValue} represents a Javascript value which can be a primitive type or an object. Reference counting is used, so it is important to explicitely duplicate (@code{JS_DupValue()}, increment the reference count) or free (@code{JS_FreeValue()}, decrement the reference count) JSValues. @subsection C functions C functions can be created with @code{JS_NewCFunction()}. @code{JS_SetPropertyFunctionList()} is a shortcut to easily add functions, setters and getters properties to a given object. Unlike other embedded Javascript engines, there is no implicit stack, so C functions get their parameters as normal C parameters. As a general rule, C functions take constant @code{JSValue}s as parameters (so they don't need to free them) and return a newly allocated (=live) @code{JSValue}. @subsection Exceptions Exceptions: most C functions can return a Javascript exception. It must be explicitely tested and handled by the C code. The specific @code{JSValue} @code{JS_EXCEPTION} indicates that an exception occured. The actual exception object is stored in the @code{JSContext} and can be retrieved with @code{JS_GetException()}. @subsection Script evaluation Use @code{JS_Eval()} to evaluate a script or module source. If the script or module was compiled to bytecode with @code{qjsc}, @code{JS_EvalBinary()} achieves the same result. The advantage is that no compilation is needed so it is faster and smaller because the compiler can be removed from the executable if no @code{eval} is required. Note: the bytecode format is linked to a given QuickJS version. Moreover, no security check is done before its execution. Hence the bytecode should not be loaded from untrusted sources. That's why there is no option to output the bytecode to a binary file in @code{qjsc}. @subsection JS Classes C opaque data can be attached to a Javascript object. The type of the C opaque data is determined with the class ID (@code{JSClassID}) of the object. Hence the first step is to register a new class ID and JS class (@code{JS_NewClassID()}, @code{JS_NewClass()}). Then you can create objects of this class with @code{JS_NewObjectClass()} and get or set the C opaque point with @code{JS_GetOpaque()}/@code{JS_SetOpaque()}. When defining a new JS class, it is possible to declare a finalizer which is called when the object is destroyed. A @code{gc_mark} method can be provided so that the cycle removal algorithm can find the other objects referenced by this object. Other methods are available to define exotic object behaviors. The Class ID are globally allocated (i.e. for all runtimes). The JSClass are allocated per @code{JSRuntime}. @code{JS_SetClassProto()} is used to define a prototype for a given class in a given JSContext. @code{JS_NewObjectClass()} sets this prototype in the created object. Examples are available in @file{js_libc.c}. @subsection C Modules Native ES6 modules are supported and can be dynamically or statically linked. Look at the @file{test_bjson} and @file{bjson.so} examples. The standard library @file{js_libc.c} is also a good example of a native module. @subsection Memory handling Use @code{JS_SetMemoryLimit()} to set a global memory allocation limit to a given JSRuntime. Custom memory allocation functions can be provided with @code{JS_NewRuntime2()}. The maximum system stack size can be set with @code{JS_SetMaxStackSize()}. @subsection Execution timeout and interrupts Use @code{JS_SetInterruptHandler()} to set a callback which is regularly called by the engine when it is executing code. This callback can be used to implement an execution timeout. It is used by the command line interpreter to implement a @code{Ctrl-C} handler. @chapter Internals @section Bytecode The compiler generates bytecode directly with no intermediate representation such as a parse tree, hence it is very fast. Several optimizations passes are done over the generated bytecode. A stack-based bytecode was chosen because it is simple and generates compact code. For each function, the maximum stack size is computed at compile time so that no runtime stack overflow tests are needed. A separate compressed line number table is maintained for the debug information. Access to closure variables is optimized and is almost as fast as local variables. Direct @code{eval} in strict mode is optimized. @section Executable generation @subsection @code{qjsc} compiler The @code{qjsc} compiler generates C sources from Javascript files. By default the C sources are compiled with the system compiler (@code{gcc} or @code{clang}). The generated C source contains the bytecode of the compiled functions or modules. If a full complete executable is needed, it also contains a @code{main()} function with the necessary C code to initialize the Javascript engine and to load and execute the compiled functions and modules. Javascript code can be mixed with C modules. In order to have smaller executables, specific Javascript features can be disabled, in particular @code{eval} or the regular expressions. The code removal relies on the Link Time Optimization of the system compiler. @subsection Binary JSON @code{qjsc} works by compiling scripts or modules and then serializing them to a binary format. A subset of this format (without functions or modules) can be used as binary JSON. The example @file{test_bjson.js} shows how to use it. Warning: the binary JSON format may change without notice, so it should not be used to store persistent data. The @file{test_bjson.js} example is only used to test the binary object format functions. @section Runtime @subsection Strings Strings are stored either as an 8 bit or a 16 bit array of characters. Hence random access to characters is always fast. The C API provides functions to convert Javascript Strings to C UTF-8 encoded strings. The most common case where the Javascript string contains only ASCII characters involves no copying. @subsection Objects The object shapes (object prototype, property names and flags) are shared between objects to save memory. Arrays with no holes (except at the end of the array) are optimized. TypedArray accesses are optimized. @subsection Atoms Object property names and some strings are stored as Atoms (unique strings) to save memory and allow fast comparison. Atoms are represented as a 32 bit integer. Half of the atom range is reserved for immediate integer literals from @math{0} to @math{2^{31}-1}. @subsection Numbers Numbers are represented either as 32-bit signed integers or 64-bit IEEE-754 floating point values. Most operations have fast paths for the 32-bit integer case. @subsection Garbage collection Reference counting is used to free objects automatically and deterministically. A separate cycle removal pass is done when the allocated memory becomes too large. The cycle removal algorithm only uses the reference counts and the object content, so no explicit garbage collection roots need to be manipulated in the C code. @subsection JSValue It is a Javascript value which can be a primitive type (such as Number, String, ...) or an Object. NaN boxing is used in the 32-bit version to store 64-bit floating point numbers. The representation is optimized so that 32-bit integers and reference counted values can be efficiently tested. In 64-bit code, JSValue are 128-bit large and no NaN boxing is used. The rationale is that in 64-bit code memory usage is less critical. In both cases (32 or 64 bits), JSValue exactly fits two CPU registers, so it can be efficiently returned by C functions. @subsection Function call The engine is optimized so that function calls are fast. The system stack holds the Javascript parameters and local variables. @section RegExp A specific regular expression engine was developped. It is both small and efficient and supports all the ES2019 features including the Unicode properties. As the Javascript compiler, it directly generates bytecode without a parse tree. Backtracking with an explicit stack is used so that there is no recursion on the system stack. Simple quantizers are specifically optimized to avoid recursions. Infinite recursions coming from quantizers with empty terms are avoided. The full regexp library weights about 15 KiB (x86 code), excluding the Unicode library. @section Unicode A specific Unicode library was developped so that there is no dependency on an external large Unicode library such as ICU. All the Unicode tables are compressed while keeping a reasonnable access speed. The library supports case conversion, Unicode normalization, Unicode script queries, Unicode general category queries and all Unicode binary properties. The full Unicode library weights about 45 KiB (x86 code). @section BigInt and BigFloat BigInt and BigFloat are implemented with the @code{libbf} library@footnote{@url{https://bellard.org/libbf}}. It weights about 60 KiB (x86 code) and provides arbitrary precision IEEE 754 floating point operations and transcendental functions with exact rounding. @chapter License QuickJS is released under the MIT license. Unless otherwise specified, the QuickJS sources are copyright Fabrice Bellard and Charlie Gordon. @bye