lindera-sqlite

version: 1.0.2
created_at: 2024-12-06 06:38:19.644845+00
updated_at: 2025-09-14 16:12:43.253715+00
description: Lindera tokenizer for SQLite FTS5 extension
homepage: https://github.com/lindera/lindera-sqlite
repository: https://github.com/lindera/lindera-sqlite
max_upload_size:
id: 1473966
size: 135,065
Minoru OSUKA (mosuka)

documentation

https://docs.rs/lindera-sqlite

README

Overview

lindera-sqlite is a C ABI library that exposes an FTS5 tokenizer function.

When used as a custom FTS5 tokenizer, it enables applications to support Chinese, Japanese, and Korean text in full-text search.
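To see why CJK text needs a dedicated tokenizer, the following minimal Python sketch splits text on non-word characters. It is only a rough stand-in for separator-based tokenization and does not reproduce how Lindera or any built-in FTS5 tokenizer works; the helper name `separator_tokens` is hypothetical.

```python
import re


def separator_tokens(text):
    """Split text into runs of letters, digits, and underscores.

    Hypothetical helper for illustration only; not Lindera's algorithm.
    """
    return re.findall(r"\w+", text)


english = "Lindera is a morphological analysis engine."
japanese = "Linderaは形態素解析エンジンです。ユーザー辞書も利用可能です。"

# English words are separated by spaces, so each word becomes its own token.
print(separator_tokens(english))

# Japanese has no spaces between words, so each sentence collapses into one
# long token and "Lindera" never appears as a token of its own.
print(separator_tokens(japanese))
```

Because "Lindera" is not a standalone token in the Japanese sentence, an exact-term search for it would miss that row. A morphological tokenizer such as Lindera is meant to segment the sentence into words instead.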

Build extension

% cargo build --features=embedded-cjk

Set environment variable for Lindera configuration

% export LINDERA_CONFIG_PATH=./resources/lindera.yml

Then start SQLite

% sqlite3 example.db

Load extension

sqlite> .load ./target/debug/liblindera_sqlite lindera_fts5_tokenizer_init

Create table using FTS5 with Lindera tokenizer

sqlite> CREATE VIRTUAL TABLE example USING fts5(content, tokenize='lindera_tokenizer');

Insert data

sqlite> INSERT INTO example(content) VALUES ('Linderaは形態素解析エンジンです。ユーザー辞書も利用可能です。');

Search data

sqlite> SELECT * FROM example WHERE content MATCH 'Lindera' ORDER BY bm25(example) LIMIT 10;
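The same steps can also be driven from application code. The sketch below uses Python's standard `sqlite3` module and assumes that the Python build allows extension loading and that the linked SQLite includes FTS5. The function names `load_lindera` and `index_and_search` are hypothetical; the entry point, tokenizer name, and environment variable are the ones used in the shell session above.

```python
import os
import sqlite3


def load_lindera(conn, extension_path, config_path):
    """Load the Lindera extension into an open connection (hypothetical helper)."""
    # Set the configuration path before the extension is loaded,
    # mirroring the `export LINDERA_CONFIG_PATH=...` step.
    os.environ["LINDERA_CONFIG_PATH"] = config_path
    conn.enable_load_extension(True)
    try:
        # Equivalent of `.load <path> lindera_fts5_tokenizer_init`.
        conn.execute(
            "SELECT load_extension(?, ?)",
            (extension_path, "lindera_fts5_tokenizer_init"),
        )
    finally:
        # Turn extension loading off again once the tokenizer is registered.
        conn.enable_load_extension(False)


def index_and_search(conn, documents, query, tokenizer="lindera_tokenizer"):
    """Create the FTS5 table, insert documents, and return ranked matches."""
    conn.execute(
        f"CREATE VIRTUAL TABLE example USING fts5(content, tokenize='{tokenizer}')"
    )
    conn.executemany(
        "INSERT INTO example(content) VALUES (?)",
        [(doc,) for doc in documents],
    )
    rows = conn.execute(
        "SELECT content FROM example WHERE content MATCH ? "
        "ORDER BY bm25(example) LIMIT 10",
        (query,),
    )
    return [row[0] for row in rows]
```

A typical call would open `example.db` with `sqlite3.connect`, pass `./target/debug/liblindera_sqlite` and `./resources/lindera.yml` to `load_lindera`, and then call `index_and_search` with the documents and the query `Lindera`. Binding values with `?` placeholders also sidesteps the string-quoting rules of the interactive shell.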