| Crates.io | fav_core |
| lib.rs | fav_core |
| version | 0.1.8 |
| created_at | 2024-02-08 04:15:44.423915+00 |
| updated_at | 2025-03-01 22:14:42.88271+00 |
| description | Fav's core crate; A collection of traits. |
| homepage | |
| repository | https://github.com/kingwingfly/fav |
| max_upload_size | |
| id | 1131610 |
| size | 108,558 |
fav_core is the core library of fav_cli (a CLI tool that downloads remote resources and keeps local state in protobuf). Put simply, fav_core is a helper for building a stateful crawler.
fav_utils provides the utilities for fav_cli and currently supports only BiliBili (similar to a Chinese YouTube). It serves as an example of how to use this crate.
To save status, this crate uses protobuf instead of JSON because it is faster. You need to define your data structures with protobuf, as in this example (to derive traits for the code generated by protobuf, see example).
A Sets contains Set items, and a Set contains Res (resource) items. The workflow is:
1. Sets to refresh its Set items
2. Set to refresh its Res items
3. Res to download

To implement this workflow and maintain a local state, fav_core has many useful traits:
- Api: helps define the APIs
- ApiProvider: makes the app able to provide an API based on the ApiKind enum
- Net: makes the app able to use the Internet
- Config: HttpConfig + ProtoLocal, marks the app as configurable and persistable
  - HttpConfig: defines the default headers and cookies
- Sets: iterate over and get a subset of sets
- Set: iterate over and get a subset of resources
- Res: Meta
  - Meta: the metadata of a resource, Meta: Attr + Status
    - Attr: provides the resource's id and title
    - Status: the status of a resource, such as saved, fetched, tracked and expired
- Ops: AuthOps + SetsOps + SetOps + ResOps, means the app can perform all needed operations
  - AuthOps: used to log in and log out
  - SetsOps: used to fetch_sets info, for example, add English, Chinese and Japanese as new movie collections to the Sets defined in protobuf
  - SetOps: used to fetch_set info, for example, add 《Oliver Twist》, 《Roman Holiday》 and 《Twelve Angry Men》 to the English collection
  - ResOps: used to fetch and pull, for example, fetch the id of 《Oliver Twist》 on the target website, then pull the resources to local disk based on the fetched id
- PathInfo: defines where to store status and config
- ProtoLocal: PathInfo + MessageFull, used to read and write status and config
- SaveLocal: makes the app able to download a Res and modify the local status
- SetOpsExt: SetOps, batch fetch the sets in a Sets
- ResOpsExt: ResOps, batch fetch the resources in a Set
- XXStatusExt: batch modify the children's StatusFlags

In conclusion, this crate contains all the traits you need to build a stateful crawler. You can define data structures with protobuf for fast reads and writes, and make them stateful, configurable, and persistable. Many network helpers are provided, so you can call request_json and request_protobuf directly. Ext traits are also provided so that you can batch fetch and pull data or modify the resources' StatusFlags.
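To make the trait composition above more concrete, here is a minimal sketch of how Meta: Attr + Status and a batch status helper in the spirit of XXStatusExt could fit together. It uses only the standard library. Every signature, the bit layout of StatusFlags, the SetStatusExt name, and the describe helper are assumptions made for illustration; they are not fav_core's real definitions, which work on protobuf-generated types.

```rust
// Hypothetical sketch: names follow the trait list above, but all signatures
// are assumptions for illustration, not fav_core's actual API.

/// Status bits of a resource (assumed layout).
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct StatusFlags(u8);

#[allow(dead_code)]
impl StatusFlags {
    const SAVED: Self = Self(1 << 0);
    const FETCHED: Self = Self(1 << 1);
    const TRACK: Self = Self(1 << 2);
    const EXPIRED: Self = Self(1 << 3);
}

/// Provides the resource's id and title.
trait Attr {
    fn id(&self) -> u64;
    fn title(&self) -> &str;
}

/// Reads and modifies the status bits of a resource.
trait Status {
    fn status(&self) -> u8;
    fn status_mut(&mut self) -> &mut u8;

    fn check_status(&self, flag: StatusFlags) -> bool {
        (self.status() & flag.0) != 0
    }
    fn on_status(&mut self, flag: StatusFlags) {
        *self.status_mut() |= flag.0;
    }
    fn off_status(&mut self, flag: StatusFlags) {
        *self.status_mut() &= !flag.0;
    }
}

/// Meta: Attr + Status, implemented for anything that has both.
trait Meta: Attr + Status {}
impl<T: Attr + Status> Meta for T {}

struct Res {
    id: u64,
    title: String,
    status: u8,
}

impl Attr for Res {
    fn id(&self) -> u64 {
        self.id
    }
    fn title(&self) -> &str {
        &self.title
    }
}

impl Status for Res {
    fn status(&self) -> u8 {
        self.status
    }
    fn status_mut(&mut self) -> &mut u8 {
        &mut self.status
    }
}

struct Set {
    resources: Vec<Res>,
}

/// Batch helper: switch a flag on for every child resource.
trait SetStatusExt {
    fn on_res_status(&mut self, flag: StatusFlags);
}

impl SetStatusExt for Set {
    fn on_res_status(&mut self, flag: StatusFlags) {
        for res in &mut self.resources {
            res.on_status(flag);
        }
    }
}

/// Works with any type that satisfies Meta.
fn describe(res: &impl Meta) -> String {
    format!(
        "#{} {} saved={}",
        res.id(),
        res.title(),
        res.check_status(StatusFlags::SAVED)
    )
}

fn main() {
    let mut set = Set {
        resources: vec![
            Res { id: 1, title: "Oliver Twist".to_string(), status: 0 },
            Res { id: 2, title: "Roman Holiday".to_string(), status: 0 },
        ],
    };
    set.on_res_status(StatusFlags::TRACK); // batch: track every resource
    set.resources[0].on_status(StatusFlags::SAVED); // single resource
    for res in &set.resources {
        println!("{}", describe(res));
    }
}
```

The blanket impl keeps Meta as a pure marker, so a type only has to implement the two smaller traits. Whether fav_core structures it the same way is not confirmed by this page.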
An example can be found in the fav repo.
- XXOpsExt methods take a batch_size argument so that users can set the number of concurrent jobs.
- Methods of the Ops-related traits take a Fut: Future<...> parameter. When that future becomes ready, the method can clean up, shut down gracefully and return FavCoreError::Cancel. The OpsExt methods handle SIGINT on this basis, which keeps things reliable.
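The two notes above, a batch_size that bounds concurrency and a cancellation signal that makes the call return early, can be illustrated with a small standard-library sketch. It deliberately swaps techniques: fav_core works with futures, while this example uses scoped threads and an AtomicBool in place of the Fut parameter so that it runs without an async runtime. The names batch_fetch, fetch_one and BatchError are hypothetical and do not exist in fav_core.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::thread;

/// Stand-in for FavCoreError::Cancel (hypothetical type, not fav_core's).
#[derive(Debug, PartialEq)]
enum BatchError {
    Cancel,
}

/// Pretend to fetch one resource; a real implementation would do network I/O.
fn fetch_one(id: u32) -> String {
    format!("res-{}", id)
}

/// Fetch `ids` with at most `batch_size` jobs running at the same time.
/// The `cancel` flag plays the role of the cancellation future: it is checked
/// before each batch, and once set the function stops and reports Cancel.
fn batch_fetch(
    ids: &[u32],
    batch_size: usize,
    cancel: &AtomicBool,
) -> Result<Vec<String>, BatchError> {
    let mut out = Vec::with_capacity(ids.len());
    // `chunks` panics on 0, so treat 0 as 1.
    for chunk in ids.chunks(batch_size.max(1)) {
        if cancel.load(Ordering::SeqCst) {
            return Err(BatchError::Cancel);
        }
        // One scoped thread per job in the current batch; all of them are
        // joined before the next batch starts.
        let results = thread::scope(|s| {
            let handles: Vec<_> = chunk
                .iter()
                .map(|&id| s.spawn(move || fetch_one(id)))
                .collect();
            handles
                .into_iter()
                .map(|h| h.join().expect("worker thread panicked"))
                .collect::<Vec<String>>()
        });
        out.extend(results);
    }
    Ok(out)
}

fn main() {
    let cancel = AtomicBool::new(false);
    println!("{:?}", batch_fetch(&[1, 2, 3, 4, 5], 2, &cancel));

    // Simulate SIGINT having been received before the next call.
    cancel.store(true, Ordering::SeqCst);
    println!("{:?}", batch_fetch(&[6, 7], 2, &cancel));
}
```

Checking the flag only between batches means an in-flight batch always finishes. A future-based design like the one the notes describe can react while jobs are still pending, which this sketch does not attempt.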