Crates.io | fav_core |
lib.rs | fav_core |
version | 0.1.6 |
source | src |
created_at | 2024-02-08 04:15:44.423915 |
updated_at | 2024-11-10 09:47:29.366547 |
description | Fav's core crate; A collection of traits. |
homepage | |
repository | https://github.com/kingwingfly/fav |
max_upload_size | |
id | 1131610 |
size | 57,701 |
`fav_core` is the core library of `fav_cli`, a CLI tool that downloads remote resources and keeps a local state in protobuf. In short, `fav_core` is a helper for building a stateful crawler.
`fav_utils` provides the utilities for `fav_cli`, which currently supports only BiliBili (similar to a Chinese YouTube). It can serve as an example of how to use this crate.
To save status, this crate uses protobuf instead of JSON, since it is faster. You need to define your data structures with protobuf, as in this example (to derive traits for the code generated by protobuf, see example).
`Sets` contains `Set`s, and `Set` contains `Res`s (resources). The workflow is:
1. `Sets` to refresh `Set`s
2. `Set` to refresh `Res`s
3. `Res` to download

To implement this workflow and maintain a local state, `fav_core` has many useful traits:
- `Api`: helps define the APIs
- `ApiProvider`: lets the app provide an API based on the `ApiKind` enum
- `Net`: lets the app use the Internet
- `Config: HttpConfig + ProtoLocal`: marks the app as configurable and persistable
- `HttpConfig`: defines the default headers and cookies
- `Sets`: iterate over the sets and get a subset of them
- `Set`: iterate over the resources and get a subset of them
- `Res: Meta`
- `Meta`: the metadata of a resource, `Meta: Attr + Status`
- `Attr`: provides the resource's id and title
- `Status`: the status of a resource, such as saved, fetched, tracked and expired
- `Ops`: `Ops: AuthOps + SetsOps + SetOps + ResOps`, meaning the app can perform all needed operations
- `AuthOps`: used to log in and log out
- `SetsOps`: used to `fetch_sets` info, for example, adding `English`, `Chinese` and `Japanese` as new movie collections to the `Sets` defined in protobuf
- `SetOps`: used to `fetch_set` info, for example, adding 《Oliver Twist》, 《Roman Holiday》 and 《Twelve Angry Men》 to the `English` collection
- `ResOps`: used to `fetch` and `pull`, for example, `fetch` the id of 《Oliver Twist》 on the target website, then `pull` the resources to local disk based on the fetched id
- `PathInfo`: defines where to store status and config
- `ProtoLocal`: `ProtoLocal: PathInfo + MessageFull`, used to read and write status and config
- `SaveLocal`: lets the app download `Res` and modify the local status
- `SetOpsExt: SetOps`: batch fetch the sets in a `Sets`
- `ResOpsExt: ResOps`: batch fetch the resources in a `Set`
- `XXStatusExt`: batch modify the children's `StatusFlags`

In conclusion, this crate contains all the traits you need to build a stateful crawler. You can define data structures with protobuf for fast reads and writes, and make them stateful, configurable, and persistable. Several network helpers are provided, so you can call `request_json` and `request_protobuf` directly. `Ext` traits are also provided so that you can batch fetch and pull data or modify the resources' `StatusFlags`.
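Bounds such as `Meta: Attr + Status` in the list above are ordinary Rust supertrait bounds. The following is a minimal sketch of how that kind of composition behaves. The method names, the bit-flag constants, the blanket impl, the `Movie` type and the `mark_all` helper are all simplified assumptions made for illustration; they are not the actual definitions in `fav_core`.

```rust
// Hypothetical status flags; the real crate has its own flag type.
const TRACKED: u8 = 0b0001;
const SAVED: u8 = 0b0100;
const EXPIRED: u8 = 0b1000;

/// Simplified stand-in: the identity of a resource.
trait Attr {
    fn id(&self) -> u64;
    fn title(&self) -> &str;
}

/// Simplified stand-in: status stored as bit flags.
trait Status {
    fn flags(&self) -> u8;
    fn set_flags(&mut self, flags: u8);
    /// True if any bit of `flag` is set.
    fn check(&self, flag: u8) -> bool {
        (self.flags() & flag) != 0
    }
    /// Turn `flag` on, keeping the other bits.
    fn on(&mut self, flag: u8) {
        let merged = self.flags() | flag;
        self.set_flags(merged);
    }
}

/// `Meta: Attr + Status` means every `Meta` is also an `Attr` and a `Status`.
trait Meta: Attr + Status {
    fn describe(&self) -> String {
        format!("#{} {} (saved: {})", self.id(), self.title(), self.check(SAVED))
    }
}

// Assumption: a blanket impl, so implementing the two parts is enough.
impl<T: Attr + Status> Meta for T {}

/// Hypothetical resource type; in practice this would be generated from protobuf.
struct Movie {
    id: u64,
    title: String,
    flags: u8,
}

impl Attr for Movie {
    fn id(&self) -> u64 { self.id }
    fn title(&self) -> &str { &self.title }
}

impl Status for Movie {
    fn flags(&self) -> u8 { self.flags }
    fn set_flags(&mut self, flags: u8) { self.flags = flags; }
}

/// Batch-modify the flags of all children, in the spirit of `XXStatusExt`.
fn mark_all<T: Status>(items: &mut [T], flag: u8) {
    for item in items.iter_mut() {
        item.on(flag);
    }
}

fn main() {
    let mut movies = vec![
        Movie { id: 1, title: "Oliver Twist".to_string(), flags: TRACKED },
        Movie { id: 2, title: "Roman Holiday".to_string(), flags: TRACKED },
    ];
    mark_all(&mut movies, SAVED);
    for movie in &movies {
        println!("{}", movie.describe());
    }
    println!("expired: {}", movies[0].check(EXPIRED));
}
```

Splitting identity (`Attr`) from state (`Status`) keeps each trait small, and generic helpers like `mark_all` only need the bound they actually use.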
An example can be found in fav repo.
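For a quick feel of the `Sets` to `Set` to `Res` workflow without reading the whole repo, here is a small in-memory sketch. Everything in it is hypothetical: the plain structs stand in for protobuf-generated types, the free functions stand in for the `fetch_sets`, `fetch_set`, `fetch` and `pull` operations, and the "remote" is just hard-coded data taken from the movie example above. It uses none of `fav_core`'s real API.

```rust
/// Hypothetical status of a resource (the real crate uses status flags).
#[derive(Debug, Default, Clone, PartialEq)]
struct Status {
    tracked: bool,
    fetched: bool,
    saved: bool,
}

#[derive(Debug)]
struct Res {
    title: String,
    status: Status,
}

#[derive(Debug)]
struct Set {
    title: String,
    resources: Vec<Res>,
}

#[derive(Debug, Default)]
struct Sets {
    sets: Vec<Set>,
}

/// Step 1: `Sets` refreshes its `Set`s (here from a hard-coded "remote").
fn fetch_sets(sets: &mut Sets) {
    for title in ["English", "Chinese", "Japanese"] {
        if !sets.sets.iter().any(|s| s.title == title) {
            sets.sets.push(Set { title: title.to_string(), resources: Vec::new() });
        }
    }
}

/// Step 2: a `Set` refreshes its `Res`s; new resources start out tracked.
fn fetch_set(set: &mut Set) {
    if set.title == "English" {
        for title in ["Oliver Twist", "Roman Holiday", "Twelve Angry Men"] {
            if !set.resources.iter().any(|r| r.title == title) {
                let status = Status { tracked: true, ..Status::default() };
                set.resources.push(Res { title: title.to_string(), status });
            }
        }
    }
}

/// Step 3: fetch a resource's details, then pull it.
/// Only the local status changes in this sketch; nothing is downloaded.
fn fetch_and_pull(res: &mut Res) {
    res.status.fetched = true;
    res.status.saved = true;
}

/// Run the whole workflow and return how many resources are saved afterwards.
fn run(sets: &mut Sets) -> usize {
    fetch_sets(sets);
    for set in sets.sets.iter_mut() {
        fetch_set(set);
        let pending = set
            .resources
            .iter_mut()
            .filter(|r| r.status.tracked && !r.status.saved);
        for res in pending {
            fetch_and_pull(res);
        }
    }
    sets.sets
        .iter()
        .flat_map(|s| &s.resources)
        .filter(|r| r.status.saved)
        .count()
}

fn main() {
    let mut sets = Sets::default();
    let saved = run(&mut sets);
    println!("sets: {}, saved resources: {}", sets.sets.len(), saved);
}
```

Because the state lives in `Sets`, running the workflow a second time skips everything that is already known and saved, which is the point of keeping a local state.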
- `XXOpsExt` methods take a `batch_size` argument so that users can set the number of jobs that run concurrently.
- The methods of `Ops`-related traits take a `Fut: Future<...>`. When that future becomes ready, the method can clean up, shut down gracefully and return `FavCoreError::Cancel`. The `OpsExt` methods handle SIGINT on top of this, which keeps things reliable.
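The two notes above can be illustrated with a small std-only sketch: work is split into chunks of `batch_size`, and a cancellation future is polled before each chunk. This is not how `fav_core` implements it. The function `pull_in_batches`, the `SketchError` type and the hand-rolled no-op waker are assumptions made for illustration, and the chunks are processed one after another here rather than concurrently on an async runtime.

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical stand-in for `FavCoreError`; only `Cancel` is named in the text.
#[derive(Debug, PartialEq)]
enum SketchError {
    Cancel,
}

/// A waker that does nothing; enough to poll a future by hand.
struct NoopWake;

impl Wake for NoopWake {
    fn wake(self: Arc<Self>) {}
}

/// Process `ids` in chunks of `batch_size`, polling `cancel` before each chunk.
/// Returns the ids processed, or `SketchError::Cancel` once `cancel` is ready.
fn pull_in_batches<Fut>(
    ids: &[u32],
    batch_size: usize,
    cancel: Fut,
) -> Result<Vec<u32>, SketchError>
where
    Fut: Future<Output = ()>,
{
    let waker = Waker::from(Arc::new(NoopWake));
    let mut cx = Context::from_waker(&waker);
    let mut cancel: Pin<Box<Fut>> = Box::pin(cancel);
    let mut done = Vec::new();

    // `chunks` panics on 0, so treat 0 as 1.
    for chunk in ids.chunks(batch_size.max(1)) {
        if let Poll::Ready(()) = cancel.as_mut().poll(&mut cx) {
            // A real implementation would clean up and persist state here.
            return Err(SketchError::Cancel);
        }
        // Stand-in for fetching or pulling one batch of resources.
        done.extend_from_slice(chunk);
    }
    Ok(done)
}

fn main() {
    let ids = [1, 2, 3, 4, 5];
    // A future that never completes: nothing cancels the work.
    let finished = pull_in_batches(&ids, 2, std::future::pending::<()>());
    println!("{:?}", finished);
    // A future that is ready immediately: the work is cancelled up front.
    let cancelled = pull_in_batches(&ids, 2, std::future::ready(()));
    println!("{:?}", cancelled);
}
```

In a real application the cancellation future would typically be tied to a signal such as SIGINT, so that pressing Ctrl-C ends the batch loop with a `Cancel` error instead of killing the process mid-write.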