Crates.io | idx_parser |
lib.rs | idx_parser |
version | 0.3.0 |
source | src |
created_at | 2021-11-06 18:40:24.200334 |
updated_at | 2021-11-18 05:13:57.31117 |
description | Parse IDX files such as the ones used in MNIST database files. |
repository | https://gitlab.com/j-mcavoy/idx_parser |
id | 477766 |
size | 15,272 |
IDX data file parser written in Rust.
THE IDX FILE FORMAT
The IDX file format is a simple format for vectors and multidimensional matrices of various numerical types.
The basic format is:
magic number
size in dimension 0
size in dimension 1
size in dimension 2
.....
size in dimension N
data
The magic number is a 4-byte integer (MSB first). Its first 2 bytes are always 0.
The third byte codes the type of the data:
0x08: unsigned byte
0x09: signed byte
0x0B: short (2 bytes)
0x0C: int (4 bytes)
0x0D: float (4 bytes)
0x0E: double (8 bytes)
The fourth byte encodes the number of dimensions of the vector/matrix: 1 for vectors, 2 for matrices, and so on.
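The magic number layout described above can be decoded with a few lines of Rust. The following is a minimal sketch, not this crate's API: the names `ElementType`, `from_code`, `size_in_bytes`, and `parse_magic` are hypothetical and chosen only for illustration.

```rust
/// Element types named by the third byte of the magic number.
/// (Hypothetical type for illustration, not part of this crate's API.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ElementType {
    U8,
    I8,
    I16,
    I32,
    F32,
    F64,
}

impl ElementType {
    /// Maps a type code to an element type; returns `None` for unknown codes.
    pub fn from_code(code: u8) -> Option<Self> {
        match code {
            0x08 => Some(Self::U8),
            0x09 => Some(Self::I8),
            0x0B => Some(Self::I16),
            0x0C => Some(Self::I32),
            0x0D => Some(Self::F32),
            0x0E => Some(Self::F64),
            _ => None,
        }
    }

    /// Size of one element in bytes.
    pub fn size_in_bytes(self) -> usize {
        match self {
            Self::U8 | Self::I8 => 1,
            Self::I16 => 2,
            Self::I32 | Self::F32 => 4,
            Self::F64 => 8,
        }
    }
}

/// Splits the 4-byte magic number into (element type, number of dimensions).
/// Returns `None` if the first two bytes are not zero or the type code is unknown.
pub fn parse_magic(magic: [u8; 4]) -> Option<(ElementType, u8)> {
    if magic[0] != 0 || magic[1] != 0 {
        return None;
    }
    let element_type = ElementType::from_code(magic[2])?;
    Some((element_type, magic[3]))
}
```

For instance, the bytes `00 00 08 03` describe a 3-dimensional array of unsigned bytes. Returning `Option` keeps the sketch short; a real parser would probably report which check failed.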
The sizes in each dimension are 4-byte integers (MSB first, big-endian, as on most non-Intel processors).
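Reading the dimension sizes amounts to decoding consecutive big-endian 32-bit integers that start right after the 4-byte magic number. A possible sketch follows; the function name `read_dimensions` is hypothetical, and treating the sizes as unsigned is an assumption, since the format description only says "4-byte integers".

```rust
/// Reads `ndims` big-endian 4-byte sizes that follow the 4-byte magic number.
/// `bytes` is the file content from its very first byte.
/// Returns `None` if the buffer is too short to hold the whole header.
/// (Hypothetical helper for illustration; sizes are assumed to be unsigned.)
pub fn read_dimensions(bytes: &[u8], ndims: usize) -> Option<Vec<usize>> {
    // The header ends after the magic number plus 4 bytes per dimension.
    let end = 4usize.checked_add(ndims.checked_mul(4)?)?;
    let header = bytes.get(4..end)?;
    Some(
        header
            .chunks_exact(4)
            .map(|c| u32::from_be_bytes([c[0], c[1], c[2], c[3]]) as usize)
            .collect(),
    )
}
```

`u32::from_be_bytes` performs the MSB-first decoding regardless of the host's native byte order, so the same code works on little-endian and big-endian machines.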
The data is stored as in a C array (row-major order), i.e. the index in the last dimension changes fastest.
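With this ordering, the position of an element in the flat data section can be computed from its multidimensional index. A small sketch under that assumption; `flat_offset` is a hypothetical helper, not a function of this crate.

```rust
/// Computes the flat element offset of `index` in a row-major array of shape `dims`.
/// Returns `None` if the ranks differ, an index is out of bounds, or the
/// arithmetic overflows. (Hypothetical helper for illustration.)
pub fn flat_offset(dims: &[usize], index: &[usize]) -> Option<usize> {
    if dims.len() != index.len() {
        return None;
    }
    let mut offset = 0usize;
    for (&size, &i) in dims.iter().zip(index) {
        if i >= size {
            return None;
        }
        // Horner-style accumulation: the last dimension ends up with stride 1.
        offset = offset.checked_mul(size)?.checked_add(i)?;
    }
    Some(offset)
}
```

The result is an element offset. To get a byte offset into the data section, multiply it by the element size given by the type code (for example 4 for `0x0D`, float).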
(taken from http://yann.lecun.com/exdb/mnist/ )