Sidekiq.rs (aka `rusty-sidekiq`)
================================

[![crates.io](https://img.shields.io/crates/v/rusty-sidekiq.svg)](https://crates.io/crates/rusty-sidekiq/)
[![MIT licensed](https://img.shields.io/badge/license-MIT-blue.svg)](./LICENSE.md)
[![Documentation](https://docs.rs/rusty-sidekiq/badge.svg)](https://docs.rs/rusty-sidekiq/)

This is a reimplementation of sidekiq in rust. It is compatible with sidekiq.rb for both submitting and processing jobs. Sidekiq.rb is much more mature than this repo, but I hope you enjoy using it. This library is built on tokio, so it is async by default.

## The Worker

This library uses serde to make worker arguments strongly typed as needed. Below is an example of a worker with strongly typed arguments. It also defines custom options that are used whenever a job is submitted. These can be overridden at enqueue time, which makes it easy to change the queue name, for example, should you need to.

```rust
use async_trait::async_trait;
use serde::{Deserialize, Serialize};
use sidekiq::{Result, Worker};
use tracing::info;

#[derive(Clone)]
struct PaymentReportWorker {}

impl PaymentReportWorker {
    fn new() -> Self {
        Self {}
    }

    async fn send_report(&self, user_guid: String) -> Result<()> {
        // TODO: Some actual work goes here...
        info!({"user_guid" = user_guid}, "Sending payment report to user");

        Ok(())
    }
}

#[derive(Deserialize, Debug, Serialize)]
struct PaymentReportArgs {
    user_guid: String,
}

#[async_trait]
impl Worker<PaymentReportArgs> for PaymentReportWorker {
    // Default worker options
    fn opts() -> sidekiq::WorkerOpts<PaymentReportArgs, Self> {
        sidekiq::WorkerOpts::new().queue("yolo")
    }

    // Worker implementation
    async fn perform(&self, args: PaymentReportArgs) -> Result<()> {
        self.send_report(args.user_guid).await
    }
}
```

## Creating a Job

There are several ways to insert a job, but for this example, we'll keep it simple. Given some worker, insert using strongly typed arguments.
```rust
PaymentReportWorker::perform_async(
    &mut redis,
    PaymentReportArgs {
        user_guid: "USR-123".into(),
    },
)
.await?;
```

You can make custom overrides at enqueue time.

```rust
PaymentReportWorker::opts()
    .queue("brolo")
    .perform_async(
        &mut redis,
        PaymentReportArgs {
            user_guid: "USR-123".into(),
        },
    )
    .await?;
```

Or you can have more control by using the crate level method.

```rust
sidekiq::perform_async(
    &mut redis,
    "PaymentReportWorker".into(),
    "yolo".into(),
    PaymentReportArgs {
        user_guid: "USR-123".to_string(),
    },
)
.await?;
```

See more examples in `examples/demo.rs`.

#### Unique jobs

Unique jobs are supported via the `unique_for` option, which can be defined by default on the worker or via `SomeWorker::opts().unique_for(duration)`. See `examples/unique.rs` for how to enqueue a job only if it is unique by (worker_name, queue_name, sha256_hash_of_job_args) for some defined `ttl`. Note: This uses `SET key value NX EX duration` under the hood as a "good enough" lock on the job.

## Starting the Server

Below is an example of how you should create a `Processor`, register workers, include any custom middlewares, and start the server.

```rust
// Redis
let manager = sidekiq::RedisConnectionManager::new("redis://127.0.0.1/").unwrap();
let mut redis = bb8::Pool::builder().build(manager).await.unwrap();

// Sidekiq server
let mut p = Processor::new(
    redis,
    vec!["yolo".to_string(), "brolo".to_string()],
);

// Add known workers
p.register(PaymentReportWorker::new());

// Custom Middlewares
p.using(FilterExpiredUsersMiddleware::new())
    .await;

// Start the server
p.run().await;
```

## Periodic Jobs

Periodic cron jobs are supported out of the box. All you need to specify is a valid cron string and a worker instance. You can optionally supply arguments, a queue, a retry flag, and a name that will be logged when a worker is submitted.
Example:

```rust
// Clear out all periodic jobs and their schedules
periodic::destroy_all(redis).await?;

// Add a new periodic job
periodic::builder("0 0 8 * * *")?
    .name("Email clients with an outstanding balance daily at 8am UTC")
    .queue("reminders")
    .args(EmailReminderArgs {
        report_type: "outstanding_balance",
    })?
    .register(&mut p, EmailReminderWorker)
    .await?;
```

Periodic jobs are not removed automatically. If your project adds a periodic job and then later removes the `periodic::builder` call, the periodic job will still exist in redis. You can call `periodic::destroy_all(redis).await?` at the start of your program to ensure only the periodic jobs added by the latest version of your program will be executed.

The implementation relies on a sorted set in redis. It stores a json payload of the periodic job with a score equal to the next scheduled UTC time of the cron string. All processes periodically poll for due jobs and try to atomically update the score to the next scheduled UTC time for the cron string. The process that successfully changes the score enqueues a new job. Processes that don't successfully update the score move on. This implementation detail means periodic jobs never leave redis.

Another detail is that json, when decoded and then encoded, might not produce the same value as the original string. Ex: `{"a":"b","c":"d"}` might become `{"c":"d","a":"b"}`. To keep the json representation consistent, the original json string is reused when a periodic job is updated with its new score in redis.

## Server Middleware

One great feature of sidekiq is its middleware pattern. This library reimplements the sidekiq server middleware pattern in rust. The example below supposes you have an app that performs work only for paying customers. The middleware below will halt jobs from being executed if the customers have expired.
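Before the crate-specific code, the pattern itself can be sketched without any dependencies. This is a minimal, synchronous illustration and not the crate's API: `SketchJob`, `Outcome`, `Middleware`, `Chain` and `run_job` are hypothetical names made up for this sketch, and the real middleware is async and receives more arguments. Each middleware gets the rest of the chain and either returns early or calls `next`.

```rust
use std::sync::Arc;

/// Hypothetical, simplified job (the crate's real `Job` carries JSON args).
#[derive(Debug, Clone)]
pub struct SketchJob {
    pub class: String,
    pub user_guid: Option<String>,
}

/// What happened to a job after it went through the chain.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Performed,
    Skipped,
}

/// A middleware either short-circuits or hands the job to the rest of the chain.
pub trait Middleware: Send + Sync {
    fn call(&self, chain: Chain<'_>, job: &SketchJob) -> Outcome;
}

/// The middlewares that have not run yet.
pub struct Chain<'a> {
    remaining: &'a [Arc<dyn Middleware>],
}

impl<'a> Chain<'a> {
    pub fn new(stack: &'a [Arc<dyn Middleware>]) -> Self {
        Self { remaining: stack }
    }

    /// Run the next middleware, or "perform" the job once the stack is exhausted.
    pub fn next(self, job: &SketchJob) -> Outcome {
        match self.remaining.split_first() {
            Some((head, tail)) => head.call(Chain { remaining: tail }, job),
            None => Outcome::Performed,
        }
    }
}

/// Skips jobs for one hard-coded expired user; everything else continues.
pub struct FilterExpiredUsers;

impl Middleware for FilterExpiredUsers {
    fn call(&self, chain: Chain<'_>, job: &SketchJob) -> Outcome {
        if job.user_guid.as_deref() == Some("USR-123-EXPIRED") {
            // Returning without calling `next` halts the chain.
            return Outcome::Skipped;
        }
        chain.next(job)
    }
}

/// Convenience for the sketch: push one job through a single-middleware stack.
pub fn run_job(job: &SketchJob) -> Outcome {
    let stack: Vec<Arc<dyn Middleware>> = vec![Arc::new(FilterExpiredUsers)];
    Chain::new(&stack).next(job)
}
```

Calling `run_job` with a job whose `user_guid` is `"USR-123-EXPIRED"` yields `Outcome::Skipped`; any other job, including one with no `user_guid`, falls through to `Outcome::Performed`.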
One interesting aspect of the implementation is that we can rely on serde to conditionally type-check workers. For example, suppose I only care about user-centric workers, and I identify those by their `user_guid` as a parameter. With serde it's easy to validate your parameters.

```rust
use std::sync::Arc;

use async_trait::async_trait;
use serde::Deserialize;
use tracing::error;

struct FilterExpiredUsersMiddleware {}

impl FilterExpiredUsersMiddleware {
    fn new() -> Self {
        Self {}
    }
}

#[derive(Deserialize)]
struct FilterExpiredUsersArgs {
    user_guid: String,
}

impl FilterExpiredUsersArgs {
    fn is_expired(&self) -> bool {
        self.user_guid == "USR-123-EXPIRED"
    }
}

#[async_trait]
impl ServerMiddleware for FilterExpiredUsersMiddleware {
    async fn call(
        &self,
        chain: ChainIter,
        job: &Job,
        worker: Arc<WorkerRef>,
        redis: RedisPool,
    ) -> ServerResult {
        // Use serde to check if a user_guid is part of the job args.
        let args: Result<(FilterExpiredUsersArgs,), serde_json::Error> =
            serde_json::from_value(job.args.clone());

        // If we can safely deserialize then attempt to filter based on user guid.
        if let Ok((filter,)) = args {
            if filter.is_expired() {
                error!({
                    "class" = job.class,
                    "jid" = job.jid,
                    "user_guid" = filter.user_guid
                }, "Detected an expired user, skipping this job");
                return Ok(());
            }
        }

        // This customer is not expired, so we may continue.
        chain.next(job, worker, redis).await
    }
}
```

## Best practices

### Separate enqueue vs fetch connection pools

Though not required, it's recommended to use separate Redis connection pools for pushing jobs to Redis vs fetching jobs. This has the following benefits:

- The pools can have different sizes, each optimized depending on the resource usage/constraints of your application.
- If the `sidekiq::Processor` is configured to have more worker tasks than the max size of the connection pool, then there may be a delay in acquiring a connection from the pool.
  This is a problem for enqueuing jobs, as it's normally desired that enqueuing be as fast as possible to avoid delaying the critical path of another operation (e.g., an API request). With a separate pool for enqueuing, enqueuing jobs is not impacted by the `sidekiq::Processor`'s usage of the pool.

```rust
#[tokio::main]
async fn main() -> Result<()> {
    // Build one manager per pool, since `build` takes the manager by value.
    let enqueue_manager = sidekiq::RedisConnectionManager::new("redis://127.0.0.1/").unwrap();
    let fetch_manager = sidekiq::RedisConnectionManager::new("redis://127.0.0.1/").unwrap();
    let redis_enqueue = bb8::Pool::builder().build(enqueue_manager).await.unwrap();
    let redis_fetch = bb8::Pool::builder().build(fetch_manager).await.unwrap();

    let p = Processor::new(
        redis_fetch,
        vec!["default".to_string()],
    );
    p.run().await;

    // ...

    ExampleWorker::perform_async(&redis_enqueue, ExampleArgs { foo: "bar".to_string() }).await?;

    Ok(())
}
```

## Customization Details

### Namespacing the workers

It's still very common to use the `redis-namespace` gem with ruby sidekiq workers. This library supports namespacing redis commands by using a connection customizer when you build the connection pool.

```rust
let manager = sidekiq::RedisConnectionManager::new("redis://127.0.0.1/")?;
let redis = bb8::Pool::builder()
    .connection_customizer(sidekiq::with_custom_namespace("my_cool_app".to_string()))
    .build(manager)
    .await?;
```

Now all keys used by this library will be prefixed with `my_cool_app:`, example: `ZREM my_cool_app:scheduled {...}`.

### Passing database connections into the workers

Workers will often need access to other software components like database connections, http clients, etc. You can define these on your worker struct so long as they implement `Clone`. Example:

```rust
use async_trait::async_trait;
use sidekiq::{Processor, RedisPool, Result, Worker};
use tracing::debug;

#[derive(Clone)]
struct ExampleWorker {
    redis: RedisPool,
}

#[async_trait]
impl Worker<()> for ExampleWorker {
    async fn perform(&self, _args: ()) -> Result<()> {
        use redis::AsyncCommands;

        // And then they are available here...
        let times_called: usize = self
            .redis
            .get()
            .await?
            .unnamespaced_borrow_mut()
            .incr("example_of_accessing_the_raw_redis_connection", 1)
            .await?;

        debug!({"times_called" = times_called}, "Called this worker");

        Ok(())
    }
}

#[tokio::main]
async fn main() -> Result<()> {
    // ...
    let mut p = Processor::new(
        redis.clone(),
        vec!["low_priority".to_string()],
    );

    p.register(ExampleWorker { redis: redis.clone() });

    Ok(())
}
```

### Customizing the worker name for workers under a nested ruby module

You might find that your worker under a module does not match a ruby worker under a module. A nested rusty-sidekiq worker `workers::MyWorker` will only keep the final type name `MyWorker` when registering the worker for some "class name". This means that if a ruby worker is enqueued with the class `Workers::MyWorker`, the `workers::MyWorker` type will not process that work. This is because by default the class name is generated at compile time based on the worker struct name. To override this, redefine one of the default trait methods:

```rust
pub struct MyWorker;
use sidekiq::Result;

#[async_trait]
impl Worker<()> for MyWorker {
    async fn perform(&self, _args: ()) -> Result<()> {
        Ok(())
    }

    fn class_name() -> String
    where
        Self: Sized,
    {
        "Workers::MyWorker".to_string()
    }
}
```

And now when ruby enqueues a `Workers::MyWorker` job, it will be picked up by rusty-sidekiq.

### Customizing the number of worker tasks spawned by the `sidekiq::Processor`

If an app's workload is largely IO bound (querying a DB, making web requests and waiting for responses, etc), its workers will spend a large percentage of time idle, `await`ing futures to complete. This in turn means the CPU will sit idle a large percentage of the time (if nothing else is running on the host), so available CPU resources are under-utilized.

By default, the number of worker tasks spawned by the `sidekiq::Processor` is the host's CPU count, but this can be configured depending on the needs of the app, allowing it to use CPU resources more efficiently.
```rust
use std::{env, str::FromStr};

#[tokio::main]
async fn main() -> Result<()> {
    // ...
    // Read the desired number of worker tasks from the environment.
    let num_workers = usize::from_str(&env::var("NUM_WORKERS").unwrap()).unwrap();
    let config: ProcessorConfig = Default::default();
    let config = config.num_workers(num_workers);
    let processor = Processor::new(redis_fetch, queues.clone())
        .with_config(config);
    // ...
}
```

## License

MIT