tree-sitter-souffle

Crates.iotree-sitter-souffle
lib.rstree-sitter-souffle
version0.4.0
sourcesrc
created_at2022-10-21 22:44:10.567489
updated_at2022-10-24 18:54:28.984072
descriptionSoufflé grammar for the tree-sitter parsing library
homepagehttps://github.com/langston-barrett/tree-sitter-souffle
repositoryhttps://github.com/langston-barrett/tree-sitter-souffle
max_upload_size
id693973
size776,408
Langston Barrett (langston-barrett)

documentation

README

tree-sitter-souffle

crates.io

A tree-sitter grammar for Soufflé.

Contents

Table of Contents

Status

The grammar is fairly complete. It parses:

There are a few known issues.

Use-Cases

This grammar aims to support the following use-cases:

  • Editor integration
    • Code navigation (e.g., jump-to-definition, show references)
    • Syntax highlighting
  • Evaluation
  • Linting
  • Static analysis

It does not currently aim to support round-trip printing (i.e., code formatting and refactoring).

On the C Pre-Processor

This parser has limited support for parsing C pre-processor #line directives, which may be helpful for analysis tasks. To avoid over-complicating the grammar, it does not handle directives in the middle of top-level entities (e.g., in between two conjuncts of a rule). You can configure your preprocessor to not emit such tokens (-P for mcpp).

Design

The grammar doesn't mirror the structure of the Soufflé C++ parser implementation nor the grammar as presented in the Soufflé documentation. It instead tries to map nonterminals to more abstract categories, i.e., categories that match the way we think about the language. For instance, the Soufflé C++ parser doesn't have a nonterminal for constants, whereas this parser does. As another example, the Soufflé documentation has this specification for type declarations:

type_decl ::= '.type' IDENT ("<:" type_name | "=" ( type_name ( "|" type_name )* | record_list | adt_branch ( "|" adt_branch )* ))

This grammar instead has something more like

type_decl ::= subtype | type_synonym | type_union | record | adt

so that the parse tree encodes a bit more "semantic" information. The hope is that these choices make it easier to write tree-sitter queries.

For the same reason, the grammar doesn't specify keywords and operators like .functor, eqrel, *, and :- as their own nonterminals, but rather inlines them into the grammar.

Development

Tests

Run

tree-sitter test

or equivalently

npm test

This script downloads the Soufflé, ddisasm, cclyzer++, and other repos and attempts to parse their Datalog files.

./scripts/parse-examples

References

Releasing

  1. Update CHANGELOG.md
  2. Update the version number in Cargo.toml
  3. git checkout main && git pull origin && git tag -a vX.Y.Z -m vX.Y.Z && git push --tags
  4. cargo publish
  5. Release the pre-release created by CI
Commit count: 46

cargo fmt