## Visitor pattern overhead

The use of the visitor pattern (while elegant) imposes a significant cost overhead that surprised me. When everything was in a single translation unit, the performance was fine (2.6 seconds for the standard 100M clock benchmark). However, when things were split up into multiple translation units, the run time rose to 13+ seconds. That is a slowdown of about 5x, probably too much to tolerate at this stage.

It is a given that we will give away performance over time as we stretch the core design to handle more scenarios and more sophisticated features. However, I don't feel comfortable giving away the performance up front. Using the macro engine to generate the `update_all` and `has_changed` functions seems like a small price to pay to keep most of the performance gained by eliminating the complex `Arc`/`Rc`-based design I had before.

## Performance and multi-threading

To achieve true multi-threading of the simulation, we must separate the state from the logic that manipulates it. For example:

```rust
struct Widget {
    clock: Signal,
    enable: Signal,
    counter: DFF>,
    strobe: Signal,
}
```

Now suppose that the logic does not mutate shared state, but instead produces a new structure for the signals. Since the signals no longer have wiring internal to them (they are just signals), we can do something like this for `update`:

```rust
fn update(mut w: Widget) -> Widget {
    // Drive the flip-flop clock from the widget clock.
    w.counter.clk.next = w.clock.val;
    // Increment the counter only while enabled.
    if w.enable.val {
        w.counter.d.next = w.counter.q.val + 1;
    }
    // Hand the updated state back to the caller.
    w
}
```

Next steps...

- Need to add constants, state enums, and local signal support.
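As a rough illustration of why separating state from logic helps with multi-threading, here is a minimal sketch. It does not use the real `Signal` or `DFF` types: `Sig`, the reduced `Widget`, `commit`, and `step_all` are hypothetical stand-ins chosen only for this example. The point is that once the state is plain data and `update` is a function from old state to new state, independent pieces of state can be moved onto worker threads without any shared references.

```rust
use std::thread;

/// Hypothetical stand-in for a signal: the current value and the value to latch next.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Sig<T> {
    val: T,
    next: T,
}

/// Reduced, hypothetical widget: plain data only, no internal wiring.
#[derive(Clone, Copy, Debug, Default, PartialEq)]
struct Widget {
    enable: Sig<bool>,
    count: Sig<u8>,
}

/// Pure update: consumes the old state and returns the new one.
fn update(mut w: Widget) -> Widget {
    w.count.next = if w.enable.val {
        w.count.val.wrapping_add(1)
    } else {
        w.count.val
    };
    w
}

/// Latch `next` into `val`, standing in for a clock edge.
fn commit(mut w: Widget) -> Widget {
    w.count.val = w.count.next;
    w
}

/// Step several independent widgets in parallel, one thread per widget.
fn step_all(widgets: Vec<Widget>) -> Vec<Widget> {
    let handles: Vec<_> = widgets
        .into_iter()
        .map(|w| thread::spawn(move || commit(update(w))))
        .collect();
    handles
        .into_iter()
        .map(|h| h.join().expect("worker thread panicked"))
        .collect()
}
```

One thread per widget is far too fine-grained for a real simulator; the sketch only shows that the ownership model permits the split.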
## Async test benches

An async test bench looks something like this:

```rust
use rand::Rng; // brings `gen` into scope

pub async fn fifo_packet_drain(
    clock: Clock,
    mut reader: FIFOReaderClient,
    len: usize,
    packet_len: usize,
    prob_pause: f64,
    pause_len: u64,
) -> Result, HDLError> {
    let mut ret = vec![];
    let mut index = 0;
    reader.read.set(false);
    clock.next_negedge().await;
    while index < len {
        // Hold off while the FIFO reports almost empty.
        while reader.almost_empty.get() {
            reader.read.set(false);
            clock.next_negedge().await;
        }
        // With probability `prob_pause`, stop reading for a while.
        if rand::thread_rng().gen::<f64>() < prob_pause {
            reader.read.set(false);
            clock.delay_negedge(pause_len).await;
        }
        // Read one packet, one element per clock.
        for _p in 0..packet_len {
            ret.push(reader.output.get());
            reader.read.set(true);
            index += 1;
            clock.next_negedge().await;
        }
        reader.read.set(false);
    }
    reader.read.set(false);
    clock.next_negedge().await;
    Ok(ret)
}
```

Several characteristics of an async test bench are important:

- The use of `async`/`await` syntax means that you can write sequential logic in a polled context (the resulting FSM is auto-generated by Rust).
- You can use local temporary values that are preserved as part of the state for you.
- Test benches are composable!

These are all fine, but I can get the same with threads, and at a fraction of the intellectual burden. Here is the idea. First, define an action function as something that has a signature like

```rust
type Action<T> = fn(time: u64, x: T) -> T;
```

1. We have a top level `Simulation` struct that is generic over a type `T`.
2. We can register `Action` functions to be called at the beginning, when the simulation is initialized (i.e., `time = 0`).
3. We can register `Action` functions to be called at the end, when the simulation is complete (i.e., `time = end_time`).
4. We can register `Action` functions to be called periodically with a specified interval of time `DeltaT`. These will be called with the time, and must consume and return `T`.
5. We can register `Action` functions to be called at a specific time `t_0`.
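The registration scheme above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual implementation: the method names (`on_start`, `on_end`, `every`, `at`, `run`) are invented for the example, time advances one tick at a time, and periodic actions are assumed to fire at `DeltaT`, `2 * DeltaT`, and so on, but not at `time = 0`.

```rust
/// An action consumes the state at a given time and returns the updated state.
type Action<T> = fn(time: u64, x: T) -> T;

/// Hypothetical scheduler holding the four kinds of registered actions.
struct Simulation<T> {
    at_start: Vec<Action<T>>,
    at_end: Vec<Action<T>>,
    periodic: Vec<(u64, Action<T>)>, // (interval DeltaT, action)
    at_time: Vec<(u64, Action<T>)>,  // (t_0, action)
}

impl<T> Simulation<T> {
    fn new() -> Self {
        Simulation { at_start: vec![], at_end: vec![], periodic: vec![], at_time: vec![] }
    }

    fn on_start(&mut self, a: Action<T>) {
        self.at_start.push(a);
    }

    fn on_end(&mut self, a: Action<T>) {
        self.at_end.push(a);
    }

    fn every(&mut self, delta_t: u64, a: Action<T>) {
        assert!(delta_t > 0, "interval must be non-zero");
        self.periodic.push((delta_t, a));
    }

    fn at(&mut self, t0: u64, a: Action<T>) {
        self.at_time.push((t0, a));
    }

    /// Run from time 0 through `end_time`, threading the state through every action that is due.
    fn run(&self, mut x: T, end_time: u64) -> T {
        for a in &self.at_start {
            x = a(0, x);
        }
        for t in 0..=end_time {
            for (dt, a) in &self.periodic {
                if t > 0 && t % dt == 0 {
                    x = a(t, x);
                }
            }
            for (t0, a) in &self.at_time {
                if *t0 == t {
                    x = a(t, x);
                }
            }
        }
        for a in &self.at_end {
            x = a(end_time, x);
        }
        x
    }
}

/// Example action: append the current time to a log.
fn record(time: u64, mut log: Vec<u64>) -> Vec<u64> {
    log.push(time);
    log
}

/// Usage example: the same action registered in all four ways.
fn example_run() -> Vec<u64> {
    let mut sim: Simulation<Vec<u64>> = Simulation::new();
    sim.on_start(record);
    sim.every(4, record);
    sim.at(5, record);
    sim.on_end(record);
    sim.run(Vec::new(), 10)
}
```

Stepping through every tick is the simplest thing that works; a real scheduler would more likely keep a priority queue of pending wake-up times and jump directly between them.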
Registered actions of this kind allow various stateless operations, but we also need to support stateful operation. To that end, we want a thread to be able to

1. Pause itself and indicate when it wants to be woken up.
2. Be woken up at that time, with the object to mutate.
3. Return the object to mutate along with some indication of when it wants to be re-awoken.

There seem to be multiple ways to accomplish this. We want something like

```rust
fn my_rust_testbench(sim: &Sim) {
    let x = sim.start();
    // Do stuff to x
    let x = sim.at_time(t0, x);
    // Do stuff to x
    let x = sim.at_time(t1, x);
    // Do stuff to x
    sim.finish(x)
}
```

Done. Need to add an error type and insert `Result` return types.

## TODO

- Add auto enum handling/code generation to the macro
- Finish VCD probe