Crates.io | gguf-utils |
lib.rs | gguf-utils |
version | |
source | src |
created_at | 2024-11-25 09:00:46.645314 |
updated_at | 2024-12-10 04:29:18.011378 |
description | Utilities for handling gguf files |
homepage | |
repository | https://github.com/InfiniTensor/gguf |
max_upload_size | |
id | 1460087 |
Cargo.toml error: | TOML parse error at line 18, column 1 | 18 | autolib = false | ^^^^^^^ unknown field `autolib`, expected one of `name`, `version`, `edition`, `authors`, `description`, `readme`, `license`, `repository`, `homepage`, `documentation`, `build`, `resolver`, `links`, `default-run`, `default_dash_run`, `rust-version`, `rust_dash_version`, `rust_version`, `license-file`, `license_dash_file`, `license_file`, `licenseFile`, `license_capital_file`, `forced-target`, `forced_dash_target`, `autobins`, `autotests`, `autoexamples`, `autobenches`, `publish`, `metadata`, `keywords`, `categories`, `exclude`, `include` |
size | 0 |
cargo xtask --help
gguf-utils is a command-line tool for working with gguf files
Usage: xtask.exe <COMMAND>
Commands:
show Show the contents of gguf files
split Split gguf files into shards
merge Merge shards into a single gguf file
filter Filter gguf files based on wildcard patterns
convert Convert gguf files to different format
set-meta Set metadata of gguf files
help Print this message or the help of the given subcommand(s)
Options:
-h, --help Print help
-V, --version Print version
cargo show --help
Show the contents of gguf files
Usage: xtask.exe show [OPTIONS] <FILE>
Arguments:
<FILE> The file to show
Options:
--shards If set, show all shards in the directory
-n, --array-detail <ARRAY_DETAIL> How many elements to show in arrays, `all` for all elements [default: 8]
-m, --filter-meta <FILTER_META> Meta to show [default: *]
-t, --filter-tensor <FILTER_TENSOR> Tensors to show [default: *]
--log <LOG> Log level, may be "off", "trace", "debug", "info" or "error"
-h, --help Print help
cargo split --help
Split gguf files into shards
Usage: xtask.exe split [OPTIONS] <FILE>
Arguments:
<FILE> File to split
Options:
-o, --output-dir <OUTPUT_DIR> Output directory for splited shards
-t, --max-tensors <MAX_TENSORS> Max count of tensors per shard
-s, --max-bytes <MAX_BYTES> Max size in bytes per shard
-n, --no-tensor-first If set, the first shard will not contain any tensor
--log <LOG> Log level, may be "off", "trace", "debug", "info" or "error"
-h, --help Print help
cargo merge --help
Merge shards into a single gguf file
Usage: xtask.exe merge [OPTIONS] <FILE>
Arguments:
<FILE> One of the shards to merge
Options:
-o, --output-dir <OUTPUT_DIR> Output directory for merged file
--log <LOG> Log level, may be "off", "trace", "debug", "info" or "error"
-h, --help Print help
cargo cast --help
Cast data types in gguf files
Usage: gguf-utils.exe cast [OPTIONS] --types <TYPES> <FILE>
Arguments:
<FILE> File to convert
Options:
-t, --types <TYPES>
-o, --output-dir <OUTPUT_DIR> Output directory for converted files
-t, --max-tensors <MAX_TENSORS> Max count of tensors per shard
-s, --max-bytes <MAX_BYTES> Max size in bytes per shard
-n, --no-tensor-first If set, the first shard will not contain any tensor
--log <LOG> Log level, may be "off", "trace", "debug", "info" or "error"
-h, --help Print help
cargo convert --help
Convert gguf files to different format
Usage: xtask.exe convert [OPTIONS] --steps <STEPS> <FILE>
Arguments:
<FILE> File to convert
Options:
-o, --output-dir <OUTPUT_DIR> Output directory for converted files
-x, --steps <STEPS> Steps to apply, separated by "->", maybe "sort", "merge-linear", "split-linear", "filter-meta:<key>", "filter-tensor:<name>" or "cast:<dtype>"
-t, --max-tensors <MAX_TENSORS> Max count of tensors per shard
-s, --max-bytes <MAX_BYTES> Max size in bytes per shard
-n, --no-tensor-first If set, the first shard will not contain any tensor
--log <LOG> Log level, may be "off", "trace", "debug", "info" or "error"
-h, --help Print help
cargo to-llama --help
Convert gguf files to Llama format
Usage: xtask.exe to-llama [OPTIONS] <FILE>
Arguments:
<FILE> File to convert
Options:
-x, --extra <EXTRA> Extra metadata for convertion
-o, --output-dir <OUTPUT_DIR> Output directory for converted files
-t, --max-tensors <MAX_TENSORS> Max count of tensors per shard
-s, --max-bytes <MAX_BYTES> Max size in bytes per shard
-n, --no-tensor-first If set, the first shard will not contain any tensor
--log <LOG> Log level, may be "off", "trace", "debug", "info" or "error"
-h, --help Print help
cargo set-meta --help
Set metadata of gguf files
Usage: xtask.exe set-meta [OPTIONS] <FILE> <META_KVS>
Arguments:
<FILE> File to set metadata
<META_KVS> Meta data to set for the file
Options:
-o, --output-dir <OUTPUT_DIR> Output directory for changed file
--log <LOG> Log level, may be "off", "trace", "debug", "info" or "error"
-h, --help Print help
<META_KVS>
是具有特定格式的字符串或文本文件路径。工具将先检查文件是否为路径,如果是则从文件读取,否则视作字符串字面量。
格式要求如下:
配置代数类型元信息
代数类型包括整型、无符号整型、浮点型和布尔。
'<KEY>'<Ty> <VAL>
配置字符串元信息
单行字符串:
'<KEY>'str "<VAL>"
多行字符串:
'<KEY>'str<Sep>
<Sep> [Content]
<Sep> [Content]
<Sep> [Content]
其中 Sep
是表示字符串继续的分隔符。必须紧邻 str
,之间不能包含空白字符,且分隔符中也不能包含空白字符。
连续的多行字符串,每行必须以分隔符+空格起始,此行后续所有字符(包括换行符)都被视作多行字符串的内容,不转义。
任何不以分隔符开始的行(包括空行)都将结束多行字符串。
配置数组元信息
TODO: 当前此功能未实现。
这是一个配置元信息的示例文件内容:
'llama.block_count' u64 22
'llama.context_length' u64 2048
'llama.embedding_length' u64 2048
'llama.feed_forward_length' u64 5632
'llama.attention.head_count' u64 32
'llama.attention.head_count_kv' u64 4
'llama.rope.dimension_count' u64 64
'tokenizer.chat_template' str|
| {%- for message in messages -%}
| {%- if message['role'] == 'user' -%}
| {{ '<|user|>
| ' + message['content'] + eos_token }}
| {%- elif message['role'] == 'system' -%}
| {{ '<|system|>
| ' + message['content'] + eos_token }}
| {%- elif message['role'] == 'assistant' -%}
| {{ '<|assistant|>
| ' + message['content'] + eos_token }}
| {%- endif -%}
| {%- if loop.last and add_generation_prompt -%}
| {{ '<|assistant|>
| ' }}
| {%- endif -%}
| {%- endfor -%}