# Issues with this crate

AKA How and why I broke it, and why it stays (somewhat) broken/limited.

Update: And how I broke it again, because I made an assumption based on my own implementation of core being paranoid, only to find out that I was using it, and not the *actual* Rust core for testing. *Woo.*

## Foreword

When you're working on a `Target` for your platform specification, things get *really* weird. When you're working on it for an extended period of time, and there are a few platforms you're looking at at the same time, and you're trying to write something that'll compile away the ugly parts, expect weird things to jump out and bite you. Also expect minor things (like booting into Little Endian to run a test for Big Endian) to creep up and bite you, so things that look really good at first turn out to be *huge* mistakes.

**Write this first.**

**If you are relatively new to microprocessors, and crazy low level things, rest assured, you're mostly going to see Big and Little Endian types.** There are some pretty strange exotic beasts out there that contain multiple cores of different types (think APUs and SoCs) that do what appear to be *extremely stupid things*, that are actually *really clever* (some *actually* are really stupid, but those are mostly university or research engineering samples, stay away from those, particularly if a friend designs them and you value that friendship). Chances are you won't come across them, but when you do, hope someone else has dealt with them enough that you're not just reading manufacturer documentation.

As a further note, improperly documented data models are a nightmare.

## The problem

The problem is that certain microprocessors have really weird characteristics that don't conform to 'Big' or 'Little' Endian types. On the one hand, they can be one, or both, or either.
Some boot into one (you opt into the model at boot time), others behave as one *most* of the time, and then proceed to do extremely weird things later ('Mixed' Endian).

I'd argue that *most* of the weird processors are old. For example, you get PDP-11 style behaviour, but then you handle something like a `u64` and it proceeds to do 'Middle' Endian style behaviour while feeding it *the inverse order* for no reason that you can make sense of. Modifying the core code to invert this results in performance degradation, so the solution isn't to do that, but rather to tolerate it instead (despite it being mind boggling, and knowing full well there's likely some completely rational reason).

PDP-11 is by no means the only other type of Endian either, and so to prevent quirks from popping up this macro crate was put together to catch it for writing microprocessor core code (it then became a `const fn` crate). As it's an 'in order' load-in, it dropped core when that became feasible (only to pick it back up due to the limitations I'll dive into in a moment).

This crate was then designed to be used to throw a quick `panic!` here or there when something unexpected turned up in my code so I could investigate it and fix the problem within the core implementation (so much for that). In trying to solve the problem of how to handle ABCD vs DCBA vs ACBD vs CDAB (etc.), I broke the crate in the confusion of how to handle the constantly morphing byte structures.

### The assumption

The base assumptions remain the same:

1. All *bytes* remain as bytes. The minimum value for endian manipulation is *byte order* (hence the crate name); anything below that is not subject to our concerns, and that level of insanity (hello 7-bit character encoding) belongs in its own special place (if I make a crate for that it'll probably contain the name `-circle-of-` somewhere);
2. The order *shall not change for a given integral type during execution* (and I'm having nothing to do with the one chipset I've seen that does do this).


These are considered hard assumptions. The following is a soft assumption that I'm willing to bend on:

* The order of the unsigned variant and the signed variant *should be the same* (as I've *never* seen anything violate that, though I'll bend as it's not hard to add variants for them).

## The initial attempt to solve this

The straightforward approach to this was to declare the values as one might a C or C++ union (or botch one together using raw bytes *without* shifting due to when `Shl` and other `Bit*` related `ops` are implemented). By declaring the raw bytes, the order is retained in the way intended, it is then cast, and the order is retained. Right?

Or so you'd think.

We can't do that because neither transmute, nor unions, are allowed. This means that unless we do something unfathomable in stable Rust (or sideload a nightly compiled rlib into stable, which is probably worse), we're not getting that into stable (unless we run it through our libcore, which, really, we're going to eventually, but chickens and eggs).

*When the solution starts to look like patching part of `std` on a Tier 1 platform you know you're looking in the wrong place.*

Reading the bytes straight from the data source (assume it's the wire) using `from_str_radix` gives a clean interpretation as intended (though it's non-const, but we're still just desk checking the logic):

```rust
// Update 2: These are backwards, but being inverted, we're still on the money.
// We'd want these to turn up like this in raw memory.

/// Big Endian u128
const B_U128: u128 = 0x000102030405060708090A0B0C0D0E0F;
/// Little Endian u128
const L_U128: u128 = 0x0F0E0D0C0B0A09080706050403020100;
```

So far, so good. They're inverses of one another, and they represent the order we'd pull the bytes from the wire to feed into it. We can hardcode these (as the crate has done since the now yanked `0.1.0`).

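
For desk checking, the "inverses of one another" claim can be tested directly. This is only a sketch against stock `core`/`std` on a Big or Little Endian host; `parse_wire_hex` and `is_byte_inverse` are hypothetical helper names made up for illustration, not part of this crate.

```rust
/// Big Endian u128, as declared above.
const B_U128: u128 = 0x000102030405060708090A0B0C0D0E0F;
/// Little Endian u128, as declared above.
const L_U128: u128 = 0x0F0E0D0C0B0A09080706050403020100;

/// Hypothetical helper: parse a run of hex digits in the order we'd
/// read them off the wire. Non-const, so desk checking only.
fn parse_wire_hex(digits: &str) -> Option<u128> {
    u128::from_str_radix(digits, 16).ok()
}

/// Hypothetical helper: true when `a` and `b` hold the same bytes in
/// exactly opposite order.
fn is_byte_inverse(a: u128, b: u128) -> bool {
    a.swap_bytes() == b
}
```

Feeding `"000102030405060708090A0B0C0D0E0F"` to `parse_wire_hex` should give back `B_U128`, and `is_byte_inverse(B_U128, L_U128)` should hold. Note that both checks work on *values*, so they say nothing yet about what ends up in raw memory on a given host.
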

Taking a smaller example so as to not waste too much time grokking:

```rust
/// Big Endian (ABCD)
const B_U32: u32 = 0x00010203;
/// Little Endian (DCBA)
const L_U32: u32 = 0x03020100;
/// Our PDP-11 type (BADC); named `M_U32` ("middle") so it doesn't clash with `L_U32`
const M_U32: u32 = 0x01000302;
```

Looking at this new addition, we have a problem from typical Rust: there is no "middle" *by default*. PDP-11 gets left out in the cold (and for good reason). And that's fine, because, frankly, we don't really want Rust wasting its time on ancient architectures when it's something that's not *really* used. But at the same time, when there are things out there that start behaving strangely, we don't really want to do something like this and get quirky results because it isn't a known quantity.

The issue here is that we really only have (and want) `from_be` and `from_le`, and `to_be` and `to_le`. Non-issue, in that if we're doing `{to,from}_host` and we have a common data exchange format it's *likely* to be one of those, or the host format anyway. So, again, that's a non-issue. What's problematic is only if the core implementation is wrong, at which point we get the following scenario:

```rust
/// Wire value
const BE_VALUE: u32 = 0x00010203;

fn main() {
    // Should be 0x01000302, is 0x03020100
    let my_value = u32::from_be(BE_VALUE);
    // Explosion happens here
}
```

However, this would be on us (as there is no core for PDP-11 or our hypothetical architecture; we're the ones adding it). So to solve this we add a `spec`, which contains the `Target` (see the [rustc guide](https://rust-lang.github.io/rustc-guide/) for more on this). In that we'd replace the "little" with our own string so that we don't get caught up in the wrong Endian catch-all tags, and we can start doing magical data matching. It doesn't mean we can suddenly expect people to know how to handle `from_pdp11` (or `from_me`) though, it just means that we can safely do `cfg(target_endian = "pdp11")` and expect it to either match (or not).
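
To make the ABCD to BADC relationship concrete, here's a rough sketch of the reordering plus the `cfg` matching. It assumes stock core, where shifts and bit operations already exist (which is exactly what a from-scratch core can't assume), and both `be_to_pdp11` and `host_endian` are hypothetical names made up for illustration.

```rust
/// Hypothetical helper: reorder a Big Endian value (ABCD) into the
/// PDP-11 style layout (BADC) by swapping the two bytes inside each
/// 16-bit half. Relies on `<<`, `>>`, `&` and `|` being available.
const fn be_to_pdp11(value: u32) -> u32 {
    ((value & 0x00FF_00FF) << 8) | ((value & 0xFF00_FF00) >> 8)
}

/// Hypothetical helper: report which stock `target_endian` value matched.
/// A custom `Target` using its own string (the "pdp11" idea above) would
/// match neither stock value and fall through to "other".
fn host_endian() -> &'static str {
    if cfg!(target_endian = "big") {
        "big"
    } else if cfg!(target_endian = "little") {
        "little"
    } else {
        "other"
    }
}
```

With the constants above, `be_to_pdp11(0x00010203)` works out to `0x01000302`, and applying it twice gets you back where you started. On a stock Tier 1 desktop `host_endian()` will only ever say "big" or "little"; the "other" arm is the part a custom spec would have to earn.
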
(From there we have to implement our own `from_be`, and so on, and hello our own core...)

But all of this has an underlying problem: how do we set the initial value in a way that we can compare the values to the Big Endian or Little Endian test value? And how do we hardcode those?

## Limitations

There are four sane paths to solving this cleanly (and, yes, they're all broken).

### Transmute

Basically, `const fn` and transmute outside of unstable/nightly is a no-no:

1. `const_transmute` blocker ([#53605](https://github.com/rust-lang/rust/issues/53605)).

This is a huge blocker. It could be overcome with unions, but they're actually somehow worse.

If this worked we'd essentially declare the value as a `u64`, transmute it as bytes, write the bytes in the order we want (Big, Little, whatever), and call it a day.

### Unions

Unions in `const fn` are subject to a couple of major, breaking, issues:

1. Assignment breaks things ([RFC1444 / #32836](https://github.com/rust-lang/rust/issues/32836));
2. Field access inside `const fn` is basically a transmute ([#51909](https://github.com/rust-lang/rust/issues/51909)).

If this worked we'd declare the union (globally, locally, whatever), assign the raw bytes, and use it. **But it doesn't work properly, not even in nightly**.

### Pointer hacks

We also can't dereference pointers in stable ([#51911](https://github.com/rust-lang/rust/issues/51911)). We can with an unstable feature, but we can't use mutables in `const fn`, at all. **So that one is a hard no.**

If it worked, we could use pointer magic to assign (and we still kind of can, but then we get alignment issues, and then an explosion with the error about it never going to work).

### Shift assignment

Due to where this sits, and the fact you need to know the assignment order prior, that's a no-go. Bit operations and shifting aren't implemented in my code prior to this crate mixing into the core code, so for me this isn't a good plan.
It's also where I got into trouble about four dozen attempts into re-writing this. *Just. Compile. Already.*

## Fixing the breakage

Shift assignments are dead. We can't shift assign from a fixed data point to get our platform value. We just can't.

*We can use the data model* and pre-defined base to get our `from_be` value, which is where we are now. It's gross, unkempt, and a total disaster, but it works. It removes the need for shifts and relies *solely* on primitives. It lands us back at copies and removes the need for endian tests.

Gross, right?

But here's the catch: we have to implement `from_be` before we can implement the conversion system to do it for us. There's the chicken and egg. I mean, we all knew it was coming because we had to know the data model, and memory layout, and everything else (there is no free lunch).

The real problem here isn't so much the implementations, it's that to achieve it in a `const fn` the underlying implementation almost certainly has to violate one of the above limitations. Fortunately, because it's in core, there's a good chance it can violate it within its own constraints. How, precisely, it chooses to do that then falls to the core implementation.

(In my own case shifting is an early concern, and is almost always one of the first things I worry about along with bit operations, so it's not a *huge* deal, it's just frustrating because we chicken and egg ourselves into another chicken and egg situation before chicken and egging our way out of it by chicken and egging our way out...)

## Fixing the broken breakage (Part 2)

*The moral to the story is the usual: don't work on code when you're dog tired or sick, let alone both. (Not that that's ever stopped anyone.)*

So `from_be` and `from_le` aren't the solution I'd hoped for, except maybe they are.
In the core I wrote for a toy kernel (turned into a real kernel to replace the exploitable one in an old embedded device I own) I took great pains to actually be sure that they did just that. If something was `from_be` it assumed that unless the platform was Big Endian it would attempt the conversion; unless it was Little Endian it would attempt the `from_le` conversion.

It turns out, according to tests in the playground, and then on a desktop system without my custom core on it, *that that doesn't seem to take place in stable, or nightly in quite the way I'd anticipated*, or rather, because the system I'm using can be either Big or Little (regardless of boot state), the reliability of the calls differs (so that's my bad in a big way).

As a result, the fixed constants are exactly as broken as they would appear to be in any of the above attempts, and are precisely as useless (also because I didn't think to test them properly on the desktop).

Not only that, in my haste to fix things (and in my if/else and magic other things), I hardcoded fixes that resolved it against *my core* (because the conversion took place), and put the inverted values into `L_` and `B_`. *Excellent.* To compound the error, the test code I'd previously written was apparently inverting the outputs of the test data anyway, which explains why I kept thinking it looked backwards. So not only am I a fool, I'm a fool twice over. *Cool.*

So my own error checking and paranoia (and lack of throwing warnings at me) bit me, again, and it's my own fault, and I'll wear it, in a few public places. But it also shows something I already suspected: this didn't solve the issue.
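
For anyone wanting to repeat the desktop check, here's a small sketch of what I understand stock core to do, assuming the host is plain Big or Little Endian (a custom target is exactly where this assumption stops holding). `from_be_spelled_out` and `from_wire_be` are hypothetical names made up for illustration.

```rust
/// Hypothetical helper: the behaviour documented for stock `u32::from_be`,
/// written out by hand. It treats `value` as a number in host order, so it
/// is a no-op on a Big Endian host and a byte swap on a Little Endian one.
/// Assumption: the host is one of those two.
fn from_be_spelled_out(value: u32) -> u32 {
    if cfg!(target_endian = "big") {
        value
    } else {
        value.swap_bytes()
    }
}

/// Hypothetical helper: build the value from the wire bytes themselves.
/// A byte array has a fixed order, so the result is the same on any host.
fn from_wire_be(bytes: [u8; 4]) -> u32 {
    u32::from_be_bytes(bytes)
}
```

The difference is the whole story: a hardcoded literal like `0x00010203` run through `from_be` changes with the host, whereas `from_wire_be([0x00, 0x01, 0x02, 0x03])` should come out as `0x00010203` regardless. That's at least consistent with why the fixed `L_` and `B_` constants looked fine against one core and backwards against another.
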