# Kana/ASCII Conversion Utility

Kana conversion utilities that convert Japanese katakana and hiragana, along with ASCII characters, into full-width (zenkaku) or half-width (hankaku) forms.

Storing, reporting, and presenting data that mixes Japanese scripts and ASCII characters comes with a range of format requirements. Many services require data to be sent in a particular format. For example, some financial services require all katakana to be sent as half-width, single-byte (hankaku) characters, with kanji and ASCII sent as full-width, double-byte (zenkaku) characters. In other use cases, data is stored entirely as double-byte (zenkaku) characters and transformed later by separate utilities as required.

This library aims to help with these conversions, making it easier to store and send Japanese characters.

## Terminology

Several overlapping terms are used for Japanese character representation, which can be confusing. The table below clarifies how they are used here. The byte counts refer to legacy Japanese encodings such as Shift_JIS; in UTF-8 the same characters occupy a different number of bytes.

| Term | Description |
|------------------|--------------------------------------------------------------------|
| Half-width kana | Katakana characters which are represented by a single byte |
| Full-width kana | Katakana characters which are represented by two bytes |
| Single-byte | Characters whose values are stored in a single byte (half-width katakana, normal ASCII) |
| Double-byte | Characters whose values are stored in two bytes (full-width katakana, hiragana, kanji, double-byte ASCII) |
| Hankaku (半角) | The Japanese term for "half-width". Same definition as "Single-byte". |
| Zenkaku (全角) | The Japanese term for "full-width". Same definition as "Double-byte". |
| Katakana (カタカナ) | The Japanese Katakana character set, which can be stored in either one byte (half-width) or two bytes (full-width) |
| Hiragana (ひらがな) | The Japanese Hiragana character set, which can only be stored as a two-byte value |
| Kanji (漢字) | The Japanese Kanji characters, which can only be stored as a two-byte value |
| Romaji (ローマ字) | Literally "Roman letters", the set of characters used to represent Japanese words in Latin alphabetical format (using ASCII characters) |
| ASCII | The standard ASCII character set, which is usually stored in one byte, but also has a two-byte version for Japanese "Romaji" representation |

# Usage

Add the crate to your `Cargo.toml`:

```toml
[dependencies]
kana-conversion = "0.1"
```

and use it in your Rust code:

```rust
extern crate kana_conversion;

use kana_conversion::to_double_byte;
```

# Conversion Functions

The `to_double_byte` function takes a string slice to convert and a mode. The mode selects which characters are converted:

* `AsciiOnly`: only normal ASCII characters are converted to double-byte (zenkaku).
* `KanaOnly`: only half-width (hankaku) katakana are converted.
* `KanaAndAscii`: both are converted.

The function returns an owned `String` holding double-byte (zenkaku) characters, converted as specified by the `mode`.

# License

Licensed under:

* MIT license ([LICENSE-MIT](LICENSE-MIT) or http://opensource.org/licenses/MIT)

# Contribution

Contributions are welcome and are subject to the terms of the license indicated above.
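
# Appendix: Conversion Mode Sketch

To make the three modes described under Conversion Functions more concrete, the following is a minimal, self-contained sketch of how such a conversion could work. It is not the crate's implementation. The type name `Mode`, the function name `to_double_byte_sketch`, and the helper functions are hypothetical; only the variant names `AsciiOnly`, `KanaOnly`, and `KanaAndAscii` come from the description above. The sketch assumes Unicode input, maps ASCII through the fixed offset between the ASCII range and the full-width forms block, and maps half-width katakana through a lookup table. It also converts half-width Japanese punctuation and does not combine voiced sound marks with the preceding character (for example, `ｶﾞ` becomes `カ゛` rather than `ガ`); the real crate may behave differently on both points.

```rust
/// Hypothetical mode enum. Variant names follow the README; the type name is assumed.
#[derive(Clone, Copy, PartialEq)]
pub enum Mode {
    AsciiOnly,
    KanaOnly,
    KanaAndAscii,
}

/// Full-width equivalents of U+FF61..=U+FF9D, in code point order.
const FULL_WIDTH_KANA: &str = "。「」、・ヲァィゥェォャュョッーアイウエオカキクケコサシスセソタチツテトナニヌネノハヒフヘホマミムメモヤユヨラリルレロワン";

/// Map a printable ASCII character to its full-width form, if it has one.
fn ascii_to_full(c: char) -> Option<char> {
    match c {
        // The ASCII space maps to the ideographic space.
        ' ' => Some('\u{3000}'),
        // '!'..='~' sit at a fixed offset from U+FF01..=U+FF5E.
        '!'..='~' => char::from_u32(c as u32 + 0xFEE0),
        _ => None,
    }
}

/// Map a half-width katakana (or half-width punctuation) character to full-width.
fn kana_to_full(c: char) -> Option<char> {
    match c {
        '\u{FF61}'..='\u{FF9D}' => {
            let idx = (c as u32 - 0xFF61) as usize;
            FULL_WIDTH_KANA.chars().nth(idx)
        }
        // Voiced and semi-voiced sound marks, emitted as standalone marks.
        '\u{FF9E}' => Some('\u{309B}'),
        '\u{FF9F}' => Some('\u{309C}'),
        _ => None,
    }
}

/// Simplified stand-in for a `to_double_byte`-style function (assumed signature).
pub fn to_double_byte_sketch(input: &str, mode: Mode) -> String {
    let do_ascii = mode != Mode::KanaOnly;
    let do_kana = mode != Mode::AsciiOnly;
    input
        .chars()
        .map(|c| {
            let ascii = if do_ascii { ascii_to_full(c) } else { None };
            let kana = if do_kana { kana_to_full(c) } else { None };
            // Characters outside the selected ranges pass through unchanged.
            ascii.or(kana).unwrap_or(c)
        })
        .collect()
}
```

Calling `to_double_byte_sketch("ABC", Mode::AsciiOnly)` yields the full-width letters `ＡＢＣ`, while the same input under `Mode::KanaOnly` is returned unchanged because it contains no half-width katakana.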