linguisto

version: 0.1.2
created_at: 2026-01-15 09:30:34.118248+00
updated_at: 2026-01-15 12:18:52.504811+00
description: A high-performance code language analysis tool based on github-linguist
repository: https://github.com/HomyeeKing/linguist
id: 2045061
size: 222,578
Homyee King ホミイ (HomyeeKing)

Linguisto

Introduction

Linguisto is a high-performance code language analysis tool based on github-linguist. Built with Rust and providing Node.js bindings via NAPI-RS, it quickly scans directories to count files, calculate byte sizes, and determine language percentages, while intelligently filtering out third-party dependencies and ignored files.

Features

  • Superior Performance: Written in Rust, leveraging multi-threading for fast file system traversal.
  • Smart Filtering: Automatically respects .gitignore, skips hidden files, and excludes vendored files (e.g., node_modules).
  • Precise Detection: Based on robust language detection algorithms, supporting filename, extension, and content-based disambiguation.
  • Beautiful Output: Provides a colorful terminal UI with progress bars, supporting sorting by bytes or file count.
  • Data Integration: Supports JSON output for easy integration with other tools.
  • Cross-platform: Supports macOS, Linux, Windows, and WASI environments.

Install

For CLI

If you have Rust installed, you can install it via Cargo:

cargo install linguisto

Or install it globally via npm:

npm install -g @homy/linguist

For API

Install it as a dependency in your Node.js project:

npm install @homy/linguist

Usage

CLI Usage

Run it in the current directory to see an intuitive language distribution chart (sorted by byte size by default):

linguisto

Analyze a specific directory:

linguisto /path/to/your/project

Common Options

  • --json: Output results in JSON format.
  • --all: Show all calculated language statistics. (Currently behaves the same as the default view, kept for compatibility.)
  • --sort <type>: Sort results. type can be file_count (descending) or bytes (descending, default).
  • --max-lang <number>: Maximum number of languages to display individually. Remaining languages will be grouped into "Other" (default: 6).
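
The effect of --sort and --max-lang can be pictured with a small sketch. This is not the tool's actual implementation: groupTopLanguages, SortKey, and the sample data are hypothetical names and values made up for illustration, and the LanguageStat shape is assumed to match the fields documented in this README.

```typescript
// Assumed shape, mirroring the LanguageStat fields documented in this README.
interface LanguageStat {
  lang: string;
  count: number;
  bytes: number;
  ratio: number;
}

type SortKey = "bytes" | "file_count";

// Hypothetical helper: sort descending by the chosen key, keep the first
// `maxLang` entries, and fold everything else into a single "Other" row.
function groupTopLanguages(
  stats: LanguageStat[],
  maxLang: number = 6,
  sort: SortKey = "bytes"
): LanguageStat[] {
  const key = (s: LanguageStat): number =>
    sort === "bytes" ? s.bytes : s.count;
  const sorted = [...stats].sort((a, b) => key(b) - key(a));
  const top = sorted.slice(0, maxLang);
  const rest = sorted.slice(maxLang);
  if (rest.length === 0) return top;

  const other: LanguageStat = { lang: "Other", count: 0, bytes: 0, ratio: 0 };
  for (const s of rest) {
    other.count += s.count;
    other.bytes += s.bytes;
    other.ratio += s.ratio;
  }
  return [...top, other];
}

// Made-up sample data, for illustration only.
const sample: LanguageStat[] = [
  { lang: "Rust", count: 10, bytes: 6000, ratio: 0.6 },
  { lang: "TypeScript", count: 20, bytes: 3000, ratio: 0.3 },
  { lang: "Shell", count: 2, bytes: 500, ratio: 0.05 },
  { lang: "TOML", count: 3, bytes: 500, ratio: 0.05 },
];

// Keep the two largest languages by bytes; Shell and TOML merge into "Other".
console.log(groupTopLanguages(sample, 2));
```

With the sample above, sorting by file_count instead puts TypeScript ahead of Rust, while the "Other" row still sums the remaining two languages.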

Example

# Get JSON stats for the current project sorted by file count
linguisto . --json --sort=file_count

Programmatic Usage

You can call the API provided by @homy/linguist directly in your Node.js or TypeScript code.

const { analyzeDirectory, analyzeDirectorySync } = require('@homy/linguist');

// Asynchronous analysis (recommended for large directories)
async function run() {
  const stats = await analyzeDirectory('./src');
  console.log(stats);
}

run();

// Synchronous analysis
const syncStats = analyzeDirectorySync('./src');
console.log(syncStats);

Benchmark

This project includes a simple benchmark comparing linguisto (native Rust + NAPI) with the pure JavaScript implementation linguist-js.

The benchmark script is located at benchmark/bench.ts and can be run with:

pnpm bench

Below is a sample result on this repository (macOS, Node.js, default settings):

Running benchmark...
┌─────────┬──────────────────────┬────────────────────┬─────────────────────┬────────────────────────┬────────────────────────┬─────────┐
│ (index) │ Task name            │ Latency avg (ns)   │ Latency med (ns)    │ Throughput avg (ops/s) │ Throughput med (ops/s) │ Samples │
├─────────┼──────────────────────┼────────────────────┼─────────────────────┼────────────────────────┼────────────────────────┼─────────┤
│ 0       │ 'linguist-js'        │ '87352945 ± 0.86%' │ '86415417 ± 955312' │ '11 ± 0.79%'           │ '12 ± 0'               │ 64      │
│ 1       │ 'linguisto (native)' │ '2903259 ± 2.20%'  │ '3118667 ± 52375'   │ '363 ± 2.64%'          │ '321 ± 5'              │ 345     │
└─────────┴──────────────────────┴────────────────────┴─────────────────────┴────────────────────────┴────────────────────────┴─────────┘

In this run, linguisto (native) averages about 2.9 ms per analysis versus about 87.4 ms for linguist-js, which is roughly 30× faster on this project, thanks to Rust's multi-threaded file system traversal and native execution. Results will vary with project size and hardware.
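
The "roughly 30×" figure can be checked directly from the latency columns of the table above. The sketch below only redoes that arithmetic. The variable and function names are illustrative and not part of the benchmark script.

```typescript
// Average and median latencies in nanoseconds, copied from the table above.
const linguistJs = { avgNs: 87352945, medNs: 86415417 };
const linguistoNative = { avgNs: 2903259, medNs: 3118667 };

// Speedup is the baseline latency divided by the candidate latency.
function speedup(baselineNs: number, candidateNs: number): number {
  return baselineNs / candidateNs;
}

const avgSpeedup = speedup(linguistJs.avgNs, linguistoNative.avgNs);
const medSpeedup = speedup(linguistJs.medNs, linguistoNative.medNs);

console.log(avgSpeedup.toFixed(1)); // ratio of average latencies: 30.1
console.log(medSpeedup.toFixed(1)); // ratio of median latencies: 27.7
```

The average and median ratios differ slightly, so "roughly 30×" is best read as an order-of-magnitude summary of this single run.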

API Reference

analyzeDirectory(dir)

  • Type: (dir: string) => Promise<LanguageStat[]>

Asynchronously analyzes the target directory and returns an array of language statistics.

analyzeDirectorySync(dir)

  • Type: (dir: string) => LanguageStat[]

Synchronously analyzes the target directory and returns an array of language statistics.

LanguageStat

Each statistical object contains the following fields:

Field   Type     Description
lang    string   Detected language name (e.g., "Rust", "TypeScript")
count   number   Number of files for this language
bytes   number   Total bytes occupied by files of this language
ratio   number   Share of the overall project, as a fraction from 0.0 to 1.0
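
As a sketch of how these fields fit together in TypeScript, the interface below restates the fields listed above, and a hypothetical helper named formatStat (not part of @homy/linguist) turns one entry into a readable line. The sample values are invented for illustration.

```typescript
// Restates the documented fields; the package may ship its own typings.
interface LanguageStat {
  lang: string;  // detected language name, e.g. "Rust"
  count: number; // number of files for this language
  bytes: number; // total bytes across those files
  ratio: number; // fraction of the whole project, 0.0 to 1.0
}

// Hypothetical helper: converts the 0.0-1.0 ratio into a percentage string.
function formatStat(stat: LanguageStat): string {
  const percent = (stat.ratio * 100).toFixed(1);
  return `${stat.lang}: ${percent}% (${stat.count} files, ${stat.bytes} bytes)`;
}

// Invented sample values, standing in for a real analyzeDirectory result.
const exampleStats: LanguageStat[] = [
  { lang: "Rust", count: 12, bytes: 34567, ratio: 0.625 },
  { lang: "TypeScript", count: 8, bytes: 20740, ratio: 0.375 },
];

for (const stat of exampleStats) {
  console.log(formatStat(stat));
}
```

Because ratio is a fraction, multiply it by 100 before showing it as a percentage.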

Credits

License

MIT © Homyee King
