| Crates.io | mall-portrait-common |
| lib.rs | mall-portrait-common |
| version | 0.1.1 |
| created_at | 2026-01-19 00:52:33.8767+00 |
| updated_at | 2026-01-23 03:54:45.501562+00 |
| description | Common utilities and types for mall portrait projects. |
| homepage | https://github.com/JiabinTang/mall-portrait-common |
| repository | https://github.com/JiabinTang/mall-portrait-common |
| max_upload_size | |
| id | 2053403 |
| size | 108,196 |
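A unified set of portrait data structures and utilities for the data ingestion, cleaning, and clustering pipelines of a mall portrait backend.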
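
Core types UserEvent, UserProfile, and RawEvent chain multiple data sources together and standardize event metadata.
Extension points such as RecordParser, LabelGenerator, and DataSink make it easy to implement processing, label computation, and persistence.
Built on established dependencies such as serde, polars, and chrono, with support for asynchronous writes and multi-source data validation.

Standardized structures such as UserEvent are defined in src/normalized.rs and include the event context, the payload, a polymorphic payload, and a quality score.
RawEvent and the raw record models live in src/raw.rs; they capture CSV/Excel row data and keep field hashes for traceability.
UserProfile, in src/profile.rs, wraps two-dimensional and multi-dimensional behavior metrics together with RFM tiers.
RecordParser, in src/utils.rs, provides generic field reading, date parsing, and confidence calculation, which helps extract high-quality parameters from raw fields.
The LabelGenerator abstraction is in src/labels.rs; it outputs polars::Expr so that feature operators can be composed in a Polars LazyFrame.
The DataSink trait (src/data_sink.rs) defines an asynchronous write interface that can be paired with any asynchronous storage or message queue.
PortraitError/Result describe common failures in data ingestion, parsing, algorithms, and IO.

Add the dependency to Cargo.toml:
[dependencies]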
mall-portrait-common = "0.1"

use chrono::Utc;
use mall_portrait_common::{DataSource, EventPayload, EventType, RecordParser, UserEvent};
use serde_json::json;
use std::collections::HashMap;
use uuid::Uuid;

fn main() {
    // Raw fields as they might arrive from a CSV/Excel row.
    let mut raw_fields = HashMap::new();
    raw_fields.insert("event_type".to_string(), "TicketOrderPlaced".to_string());
    raw_fields.insert("mall_id".to_string(), "mall-42".to_string());

    // Score the quality of the raw record before normalizing it.
    let parser = RecordParser::new(&raw_fields);
    let quality = parser.calculate_quality_score() as f32;

    // Event-specific payload, kept as JSON values.
    let mut payload = HashMap::new();
    payload.insert("order_no".to_string(), json!("T123"));
    payload.insert("total_price".to_string(), json!(699.0));

    // Assemble the normalized event.
    let event = UserEvent {
        event_id: Uuid::new_v4(),
        mall_id: "mall-42".to_string(),
        user_id: "user-x".to_string(),
        event_time: Utc::now().timestamp(),
        ingestion_time: Utc::now().timestamp(),
        source: DataSource::TicketOrderCsv { path: "s3://bucket/orders.csv".to_string() },
        global_id: "gid-001".to_string(),
        event_type: EventType::TicketOrderPlaced,
        event_payload: EventPayload::Raw(payload),
        context: Default::default(),
        quality_score: quality,
        validation_errors: Vec::new(),
    };
}
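
The example above takes its quality score from RecordParser without showing how such a score might be derived. As a rough illustration only, the sketch below scores a record by field completeness: the share of required fields that are present and non-blank. The function `completeness_score` and the required-field list are hypothetical and use only the standard library; the crate's actual `calculate_quality_score` formula is not documented here and may differ.

```rust
use std::collections::HashMap;

/// Hypothetical helper (not part of mall-portrait-common): returns the
/// fraction of `required` keys that exist in `fields` with a non-blank value.
/// An empty `required` list is treated as fully complete.
fn completeness_score(fields: &HashMap<String, String>, required: &[&str]) -> f64 {
    if required.is_empty() {
        return 1.0;
    }
    let present = required
        .iter()
        .filter(|key| {
            fields
                .get(**key)
                .map_or(false, |value| !value.trim().is_empty())
        })
        .count();
    present as f64 / required.len() as f64
}
```

Calling it with the two fields from the example plus two further required names that are missing or blank would give a score of 0.5, which could then be cast to f32 in the same way as in the example.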
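
Combined with LabelGenerator and DataSink, features can then be computed in the pipeline and written to a downstream warehouse.
Required dependency features include polars + lazy, serde + derive, async-trait, uuid, and others.
Use polars::LazyFrame together with exprs() to generate labels, and reuse the DataSink extension when writing.
New versions are released with cargo publish; running cargo package and cargo fmt before publishing is recommended.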
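
To make the label-then-sink flow described above more concrete, here is a minimal, standard-library-only sketch. It is not the crate's API: the crate's LabelGenerator is described as producing polars::Expr and its DataSink as asynchronous, whereas `SketchLabel`, `SketchSink`, `Row`, `SpendTier`, `MemorySink`, and `run_pipeline` below are hypothetical, synchronous stand-ins that only illustrate how the two extension points could compose.

```rust
/// Simplified stand-in for an event row; the crate's UserEvent has more fields.
#[derive(Debug, Clone, PartialEq)]
struct Row {
    user_id: String,
    total_price: f64,
}

/// Hypothetical synchronous analogue of a label generator.
trait SketchLabel {
    fn name(&self) -> &str;
    fn compute(&self, row: &Row) -> String;
}

/// Hypothetical synchronous analogue of a data sink.
trait SketchSink {
    fn write(&mut self, user_id: &str, label: &str, value: &str) -> Result<(), String>;
}

/// Labels a row "high" when its price reaches the (assumed) threshold.
struct SpendTier {
    threshold: f64,
}

impl SketchLabel for SpendTier {
    fn name(&self) -> &str {
        "spend_tier"
    }
    fn compute(&self, row: &Row) -> String {
        if row.total_price >= self.threshold {
            "high".to_string()
        } else {
            "regular".to_string()
        }
    }
}

/// In-memory sink that collects (user_id, label, value) triples.
#[derive(Default)]
struct MemorySink {
    records: Vec<(String, String, String)>,
}

impl SketchSink for MemorySink {
    fn write(&mut self, user_id: &str, label: &str, value: &str) -> Result<(), String> {
        self.records
            .push((user_id.to_string(), label.to_string(), value.to_string()));
        Ok(())
    }
}

/// Applies every label to every row and writes each result to the sink.
/// Returns the number of records written, or the first sink error.
fn run_pipeline(
    rows: &[Row],
    labels: &[Box<dyn SketchLabel>],
    sink: &mut dyn SketchSink,
) -> Result<usize, String> {
    let mut written = 0;
    for row in rows {
        for label in labels {
            sink.write(&row.user_id, label.name(), &label.compute(row))?;
            written += 1;
        }
    }
    Ok(written)
}
```

Keeping label computation and writing behind separate traits means a new label or a new storage backend can be added without touching the pipeline loop; a real implementation built on the crate would presumably swap the per-row loop for LazyFrame expressions and await the sink's writes.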