string-mumu

Crates.io: string-mumu
lib.rs: string-mumu
version: 0.2.0-rc.2
created_at: 2025-07-09 06:24:52.463849+00
updated_at: 2025-10-07 15:28:57.932635+00
description: String functions and tools plugin for the Lava / Mumu language
homepage: https://lava.nu11.uk
repository: https://gitlab.com/tofo/mumu-string
id: 1744283
size: 139,224
(justifiedmumu)


string-mumu

A MuMu/Lava plugin to provide fast, Unicode-aware string tools with friendly currying and data-last ergonomics.

Repository: https://gitlab.com/tofo/string-mumu
License: MIT OR Apache-2.0
Engine compatibility: core-mumu = 0.9.0-rc.3 (host and wasm builds)


What this plugin aims to do

  • to expose a comprehensive set of string:* functions to MuMu/Lava
  • to enable placeholder-driven partial application (_) and data-last calling styles
  • to interoperate cleanly with the MuMu value system (arrays, keyed arrays, refs, regexes, iterators)
  • to behave predictably on Unicode strings (indices in characters, not bytes)
  • to remain wasm-friendly while offering host niceties when available

The plugin registers all symbols via register_all(interp). On native builds it also exports a dynamic entrypoint Cargo_lock(...) so extend("string") can load it at runtime.


Feature overview

1) Unary transforms

  • string:lower(s) — to lowercase
  • string:upper(s) — to uppercase
  • string:length(s) — to return character count
  • string:trim(s) / string:trim_start(s) / string:trim_end(s)
  • string:to_string(v) — to stringify any MuMu Value (arrays, objects, refs, etc.)

All unary functions are curry-able and accept _ as a placeholder.

2) Concatenation & casting

  • string:multi(...args) — to concatenate any number of values after string-coercion
  • string:concat(a, b) — to concatenate two strings (curry-able)

3) Search & containment

  • string:contains(s, pat) — to test substring
  • string:starts_with(prefix, s) / string:ends_with(suffix, s) (data-last)
  • string:index_of(s, pat) / string:last_index_of(s, pat) — to return character index or -1
  • string:search(pat, s) — to return first match index or -1 (supports regex)

4) Substrings, slices, and positions

  • string:slice(s, start, end) — to slice by character offsets (supports negatives)
  • string:substring(start, end, s) — to use JS-style semantics (clamp/swap; negatives → 0)
  • string:at(index, s) — to get the character at index (negatives from end) or return _
  • string:code_point_at(index, s) — to return the Unicode code point (int) or _
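
The difference between slice and substring is easier to see side by side. The following Rust sketch is not the plugin's source; it only illustrates the semantics described above. The helper names slice_chars and substring_chars are hypothetical, and clamping out-of-range offsets in slice_chars is an assumption.

```rust
/// Sketch of character-offset slicing (hypothetical helper).
/// Negative offsets count from the end; out-of-range offsets are
/// clamped, which is an assumption of this sketch.
fn slice_chars(s: &str, start: i64, end: i64) -> String {
    let chars: Vec<char> = s.chars().collect();
    let len = chars.len() as i64;
    // Map a possibly negative offset into the range 0..=len.
    let resolve = |i: i64| if i < 0 { (len + i).max(0) } else { i.min(len) };
    let (a, b) = (resolve(start), resolve(end));
    if a >= b {
        return String::new();
    }
    chars[a as usize..b as usize].iter().collect()
}

/// Sketch of JS-style substring (hypothetical helper): negatives
/// become 0, offsets are clamped to the length, and the bounds are
/// swapped when start > end.
fn substring_chars(start: i64, end: i64, s: &str) -> String {
    let chars: Vec<char> = s.chars().collect();
    let len = chars.len() as i64;
    let a = start.clamp(0, len);
    let b = end.clamp(0, len);
    let (lo, hi) = if a <= b { (a, b) } else { (b, a) };
    chars[lo as usize..hi as usize].iter().collect()
}
```

With these definitions, slice_chars("abcdef", 2, -1) gives "cde", while substring_chars(5, 2, "mumu") swaps and clamps its bounds and gives "mu".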

5) Padding & repetition

  • string:pad_start(len, pad, s) / string:pad_end(len, pad, s) (data-last; pad defaults to space)
  • string:repeat(s, n)
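
As a rough illustration of padding measured in characters rather than bytes, here is a minimal Rust sketch. It is not taken from the plugin: the function below is a hypothetical stand-in that mirrors the data-last argument order, and treating an empty pad as a single space is an assumption.

```rust
/// Sketch of pad_start measured in characters (hypothetical helper).
fn pad_start(len: usize, pad: &str, s: &str) -> String {
    let current = s.chars().count();
    // Assumption: an empty pad falls back to a single space.
    let pad = if pad.is_empty() { " " } else { pad };
    if current >= len {
        return s.to_string();
    }
    // Cycle through the pad characters until the gap is filled.
    let fill: String = pad.chars().cycle().take(len - current).collect();
    format!("{fill}{s}")
}
```

For example, pad_start(5, "0", "42") yields "00042", and a string already at or beyond the target length is returned unchanged.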

6) Split/join & replace

  • string:split(sep, s) — to return StrArray
  • string:join(sep, arr) — to join, accepting empty arrays of any element type
  • string:replace(s, from, to) — to replace all non-overlapping occurrences
  • string:replace_all(pat, repl, s) — to replace with string or regex pattern
  • string:replace_fn(pat, repl, s) — to accept string or regex; function replacers are intentionally unsupported in this build (use string:replace_all instead)

7) Matching

  • string:match(pat, s) — to return the first match (capturing groups) or _
  • string:match_all(pat, s) — to return a MixedArray of match arrays (supports regex)

8) Locale & normalization

  • string:to_locale_lower(locale, s) / string:to_locale_upper(locale, s) — locale parameter accepted for forward-compat (current build uses Unicode case maps)
  • string:locale_compare(other, options, s) — to compare with optional { sensitivity: "base" | "case" | "accent" | "variant" }
  • string:normalize(form, s) — to accept "NFC" | "NFD" | "NFKC" | "NFKD"; current build returns s unchanged (ICU-backed normalization can be wired later)

Calling style, currying & placeholders

All multi-arg functions support:

  • currying with _ placeholders:

    extend("string")
    has_http = string:starts_with("http")
    has_http("https://lava.nu11.uk")  // true
    
  • data-last ergonomics for many functions (starts_with, ends_with, substring, pad_*, to_locale_*, locale_compare, etc.):

    string:substring(2, 5, "mumu") // "mu"
    
  • Unicode-safe indexing by character (not byte):

    string:at(-1, "naïve") // "e"
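
Character-based indexing with negative offsets can be sketched in Rust as follows. This is an illustration, not the plugin's code: None stands in for MuMu's _ result, and negative-index support in code_point_at is an assumption.

```rust
/// Sketch: character at `index`, counting from the end when negative.
/// `None` stands in for the `_` result (an assumption of this sketch).
fn at(index: i64, s: &str) -> Option<char> {
    let len = s.chars().count() as i64;
    let i = if index < 0 { len + index } else { index };
    if i < 0 || i >= len {
        return None;
    }
    s.chars().nth(i as usize)
}

/// Sketch: Unicode code point at `index`, built on `at`.
fn code_point_at(index: i64, s: &str) -> Option<u32> {
    at(index, s).map(|c| c as u32)
}
```

Because the lookup walks chars() instead of bytes, the two-byte "ï" in "naïve" occupies exactly one index position.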
    

Return conventions

  • Out-of-range character access: returns _
  • Not found (index_of, last_index_of, search): returns -1
  • No regex match: match returns _; match_all returns an empty list
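
The "character index or -1" convention can be sketched in Rust like this. The functions are hypothetical stand-ins rather than the plugin's implementation; they convert the byte offsets returned by the standard library into character indices.

```rust
/// Sketch: character index of the first occurrence of `pat`, or -1.
fn index_of(s: &str, pat: &str) -> i64 {
    match s.find(pat) {
        // `find` yields a byte offset; count the characters before it.
        Some(byte_idx) => s[..byte_idx].chars().count() as i64,
        None => -1,
    }
}

/// Sketch: character index of the last occurrence of `pat`, or -1.
fn last_index_of(s: &str, pat: &str) -> i64 {
    match s.rfind(pat) {
        Some(byte_idx) => s[..byte_idx].chars().count() as i64,
        None => -1,
    }
}
```

The conversion matters for non-ASCII input: in "naïve" the substring "ve" starts at byte 4 but at character index 3.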

Interop & coercion rules

  • Most string-consuming functions also accept a single-element StrArray(["..."]).
  • string:to_string(v) pretty-prints MuMu values:
    • arrays/2D arrays → bracketed forms
    • keyed arrays → { k: v, ... }
    • refs/streams/regex/functions → short readable markers
  • string:multi(...) applies the same stringification per argument and concatenates.
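
To make the stringify-then-concatenate relationship concrete, here is a small Rust sketch. Everything in it is hypothetical: the Value enum is a toy stand-in for the MuMu value system, and the exact output formats (separators, unquoted strings) are assumptions beyond the bracketed and { k: v, ... } shapes listed above.

```rust
/// Toy stand-in for a few MuMu value kinds (hypothetical).
enum Value {
    Int(i64),
    Str(String),
    Array(Vec<Value>),
    Keyed(Vec<(String, Value)>),
}

/// Sketch of pretty-printing: arrays in brackets, keyed arrays as
/// `{ k: v, ... }`. Exact formatting is an assumption.
fn stringify(v: &Value) -> String {
    match v {
        Value::Int(n) => n.to_string(),
        Value::Str(s) => s.clone(),
        Value::Array(items) => {
            let parts: Vec<String> = items.iter().map(stringify).collect();
            format!("[{}]", parts.join(", "))
        }
        Value::Keyed(pairs) => {
            let parts: Vec<String> = pairs
                .iter()
                .map(|(k, v)| format!("{}: {}", k, stringify(v)))
                .collect();
            format!("{{ {} }}", parts.join(", "))
        }
    }
}

/// Sketch of `multi`: stringify each argument, then concatenate.
fn multi(args: &[Value]) -> String {
    args.iter().map(stringify).collect()
}
```

The point of the sketch is that multi reuses the same per-value stringification, so an array passed to either function prints the same way.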

Minimal examples

extend("string")

string:lower("MuMu")                    // "mumu"
string:contains("MuMu language", "Mu")  // true
string:index_of("banana", "na")         // 2
string:slice("abcdef", 2, -1)           // "cde"
string:join(", ", ["a", "b", "c"])      // "a, b, c"

// Currying & placeholders
greet = string:concat("hi ", _)
greet("Guru")                           // "hi Guru"

// Regex
rx = /na+/i
string:match(rx, "BAnAna")              // ["nA"]
string:match_all(rx, "banana")          // [["na"], ["na"]]

Design notes

  • Unicode & indices — to operate by chars() counts; all index math uses character length
  • Predictable partials — to return functions until all non-placeholder arguments are supplied
  • Host vs. wasm — to compile on both:
    • host: enables dynamic loading (extend("string")) via Cargo_lock symbol
    • wasm: uses the engine’s wasm feature; host-only deps are gated
  • Safety & ergonomics — to use clear error strings for type mismatches and cap extremes (e.g., repeat count)
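
The last point, clear error strings and capped extremes, might look roughly like the Rust sketch below. It is illustrative only: the cap value MAX_REPEAT_BYTES and the message wording are hypothetical, and only the "string:<fn> => ..." prefix follows the convention named in this README.

```rust
/// Hypothetical cap on the size of a repeated string, in bytes.
const MAX_REPEAT_BYTES: usize = 1_000_000;

/// Sketch of a guarded repeat that reports problems as
/// "string:repeat => ..." error strings.
fn repeat(s: &str, n: i64) -> Result<String, String> {
    if n < 0 {
        return Err(format!(
            "string:repeat => count must be non-negative, got {}",
            n
        ));
    }
    let n = n as usize;
    // saturating_mul avoids overflow when checking the projected size.
    if s.len().saturating_mul(n) > MAX_REPEAT_BYTES {
        return Err(format!(
            "string:repeat => result would exceed {} bytes",
            MAX_REPEAT_BYTES
        ));
    }
    Ok(s.repeat(n))
}
```

Checking the projected size before allocating keeps a mistyped count from exhausting memory, which is the kind of extreme the note refers to.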

Status & caveats

  • string:normalize currently acknowledges forms but returns the input unchanged (ICU hooks can be added).
  • string:replace_fn rejects function replacers in this build; prefer string:replace_all.
  • Locale methods accept a locale or options object but default to standard Unicode case/compare in this crate.

Contributing

Issues and merge requests are welcome at:
https://gitlab.com/tofo/string-mumu

Please align changes with the MuMu core conventions:

  • character-based indexing
  • placeholder-aware currying
  • consistent error messages ("string:<fn> => ...")
  • wasm compatibility

License

Licensed under either of:

  • MIT license
  • Apache License, Version 2.0

at your option.

Commit count: 6
