# Changes in 0.10.3 - Update dependencies for security fixes # Changes in 0.10.2 - Update dependencies for security fixes # Changes in 0.10.1 - Upgrade fuser to get rid of abandoned users dep - Update dependencies # Changes in 0.10.0 - Upgrade to clap4 - Rework and simplify Cli We now use the clap derive module to simplify the Cli. Also, split and cat are now proper subcommands. Having them introduced as a non-optional flag before was a poor decision. - Update dependencies # Changes in 0.9.2 - Update dependencies to get security fixes - Avoid using deprecated fucntion mount_spawn # Changes in 0.9.1 Update dependencies to fix security issues. Furthermore: - Replace dependency fuse with newer fuser `fuser` (https://github.com/cberner/fuser) is a more maintained and up-to-date fork of `fuse` (https://github.com/zargony/fuse-rs), which ensures smoother future development. - Use running integer as file handle This way, we do not need to rely on the time package to give a pseudo unique value. - Use running integer as inode This way, we do not need to rely on the time package to give a pseudo unique value. - Remove unused time dependency # Changes in 0.9.0 - Check mirror and mountpoint for sanity - Mirror and mountpoint have to exist. - Mirror must not be in a subfolder of mountpoint, to avoid recursive mounts. This is also in accordance to EncFS. - Run scfs by default to easen rapid development Since we build more than just one binary, `cargo run` does not know which one to call by default. For this case, there is the key "default-run", which tells `cargo run` which binary to use when no `--bin` flag is present. - Notify main loop when filesystem is dropped The filesystem implements the Drop trait now, which makes it possible to run a function when the filesystem is unmouted in a way other than by terminating the main loop (most prominently by using `umount` directly). The previous situation was, when the filesystem was unmounted via `umount`, then the main loop would hang infinitely, because there was no way to notify the main loop. Now we send a quit signal when the filesystem is dropped, so the main loop can exit normally. - Canonicalize paths Using absolute paths is necessary for a daemon, since a daemon usually changes its working directory to "/" so as to not lock a directory. - Add daemon flag which puts program in background The daemonizing is done after the filesystem has been created, to let the initialization happen in foreground. This minimizes the time the daemon is running but the filesystem is not mounted yet. - Add flag to create mountpoint directory The mirror will intentionally *not* be created, since the mount is readonly and a missing mirror directory is most likely a typo from the user. - Add converter for symbolic quantities This converter will be used to calculate the blocksize for a SplitFS mount. The size can now be given as an integer or optionally with a quantifier like "K", "M", "G", and "T", each one multiplying the base with 1024. # Changes in 0.8.0 - Implement readlink - Correctly handle symlinks - Replace each metadata with symlink_metadata Symlinks should be presented as-is, so it should never be necessary to traverse them. - Silently ignore unsupported filetypes - Add convenience wrappers for catfs and splitfs With these wrappers, it is possible to mount the respective filesystem without explicitly specifying the mode parameter. # Changes in 0.7.0 - Make blocksize customizable It is now possible to use a custom blocksize in SplitFS. For example, to use 1MB chunks instead of the default size of 2MB, you would go with: scfs --mode=split --blocksize=1048576 Where 1048576 is 1024 * 1024, so one megabyte in bytes. - Short circuit when reading a size of 0 - Do not materialize vector after each chunk This step was highly unnecessary anyway and it needlessly consumed time and memory. An Iterator can be flattened in the same way, but without the penalty that comes with materializing. - Do not calculate size of last chunk to read By simply reading `blocksize` bytes and only taking `size` before materializing, we can save a lot of possible mis-calculation regarding the last chunk. We make use of two properties here: - Reading after EOF is a no-op, so using a higher number on the reading operation does not hurt. - The reading operations take only place once we materialize the byte array. So even if we issue to read much more bytes than necessary on the last chunk, it will not hurt, since we only `take` the correct number of bytes on materializing. - Fix off-by-one error - Correctly handle empty files Create at least one chunk, even if it is empty. This way, we can differentiate between an empty file and an empty directory. - Add test suites to modules With automated tests we now can effectively check if new features work as intended and that they do not break existing code. # Changes in 0.6.1 - Fix misleading part in the README The misleading part in the README said, that most cloud storage providers do not support the upload of a single file. This is of course rubbish. What I meant to say was, that they do not support the concurrent upload of a single file, as in chunked upload. This part is fixed now. - Update README to reflect CatFS precondition CatFS will now refuse to mount a directory that was not generated by SplitFS prior. The README didn't reflect this breaking change. This part is fixed now. # Changes in 0.6.0 - Remove thread limit I no longer enforce a thread limit via a thread pool. If you want to fire up a thousand threads, then go ahead. - Use Option in metadata converter Using an Option is a more natural way of expressing the intention. If I give a ino, use it, if I don't then do something to generate one. At the moment that means to take the ino of the existing file in SplitFS or use the current timestamp in CatFS. The Option object is preferred over conventions like using a special value such as 0. - Use constants for special inos By using named constants it will be easier to understand what the numbers really mean. - Calculate additional offset instead of hardcode If virtual entries like . and .. are added to the directory list, the offset needs to be adjusted accordingly. To provide a scalable solution, calculate this additional offset instead of just hardcode a specific number. - Make virtual offset code more readable To explicitly show when the offset has to be adjusted, I added another, yet redundant, conditional block. It is not strictly necessary but it makes it possible of adding more such virtual rules in the future without adjusting existing code. - Add Config struct This struct will contain possible changeable configuration parameters in the future. For now it is empty. - Use clap to parse cli arguments - Add file name to database To provide more efficient queries, use the file name in addition to the complete path in the database. - Use file name in query By using the file name in the SELECT query instead of iterating over all items, the lookups are handled in a much shorter time. This way, even directories with a huge amount of files can be listed in reasonable time. - Create index on parent_ino and file_name This results in faster queries in lookups. - Short circuit readdir When the readdir buffer is full, the iteration can be suspended. It will be resumed by the next filesystem call with the appropriate offset. This results in a performance boost by not needlessly iterating over entries that will be ignored anyway. - Remove DISTINCT from SELECT statement DISTINCT is not necessary in this case and only increases the time needed to complete the query. - Let FileInfo derive from Default This way it is possible to easily create a default instance with default values for each member. - Use converters for correct DB representation By going the extra mile via the FileInfo-FileInfoRow converter, the correct representation of the members in the database can be ensured, even with more complex conversion methods, like encoding with JSON or the like. - Use u8 Vector to represent paths in the db The conversion to String is lossy and can result in problems with special characters that can not be correctly represented. By using a byte Vector the paths can be stored raw, without any encoding. # Changes in 0.5.0 - Use timestamp as filehandle key This is an addition to commit 9468cdd884c631e450d6fdaa506e59f1bd2a77e3. The described race condition is now also fixed in CatFS. - Add threadpool as dependency - Cache filenames instead of file handles By opening the files only when actually called I can avoid race conditions that would arise if multiple threads access the same file handle. - Read files in separate thread This way, the filesystem is no longer blocked until each portion of a file has been read. This also ensures that the kernel may decide to read portions of a file in parallel. - Include . and .. in directory listing - Use timestamp as inode numbers This way I can save the quite expensive calls to the database for each new inode number and hence decreasing the time needed for the initial population of the database. - Split lib into module files # Changes in 0.4.0 ## Breaking changes - From now on, a --mode flag has to be given as first parameter when mounting, with either --mode=cat or --mount=split ## Chronologically - Add ctrlc as dependency - Unmount automatically on SIGINT - Use i64 instead of JSON Strings Converting to JSON is inefficient and prevents proper database operations. Using i64 to store a u64 in the database is a way better approach. Via the From-trait I can transparently keep using the exisiting methods. - Use String conversion instead of JSON Strings Converting OsStrings to Strings is much more efficient than encoding them to JSON Strings. Also, JSON Strings in the database prevent proper usage of certain database operations, like searching for substrings. - Remove serde dependency Without the need to use JSON Strings, I can get rid of the serde dependency. - Move populate into SplitFS Since CatFS will use its own version of populate, it makes sense to put the methods to their respective struct implementations. - Use termination feature of ctrlc With this feature enabled, the unmounting will happen not only when a SIGINT is caught, but also on SIGTERM. - Change return type to Self This makes it easier to adjust the code for different implementations. - Add vdir parameter This parameter will denote if a directory is just a "virtual directory", actually referencing parts of a regular file. - Move identical code block to dedicated method - Remove offset field from FileHandle Seeking to a byte position is a neglectable operation, not worth the hassle of maintaining an offset field. - Implement CatFS CatFS is the reverse operation to SplitFS. It presents the file parts, as displayed by SplitFS as single files again and handles file access to the parts transparently. - Add mount flag to use CatFS or SplitFS When mounting a directory, the user now has to give a mode flag. If the want to use SplitFS, they will need: scfs --mode=split If they want to use CatFS: scfs --mode=cat - Use timestamp as filehandle key The previous implementation used the latest created key plus one. This leads to a race condition on parallel file access. By using the current timestamp in nanoseconds, it is nearly impossible to assign the same key to two separate filehandles. - Increase TTL to 24 hours Since the base directory will be mounted read-only and is expected to never change during mounting time (at least for now), it is quite senseless for such a small timeout like one second. By increasing it to 24 hours, the kernel might cache lookup and getattr results, hence avoiding expensive online checks. # Changes in 0.3.0 First public release on crates.io. No other changes. # Changes in 0.2.0 - Do not include . and .. for now These two entries are not strictly necessary and at the moment it interferes with the offset calculation. - Use offset parameter in readdir If it is not used, directories with many files are not displayed correctly. # Changes in 0.1.0 Initial release, first working prototype of SplitFS.