Crates.io | ukma_url_parser |
lib.rs | ukma_url_parser |
version | 0.1.0 |
source | src |
created_at | 2024-11-13 17:38:15.136631 |
updated_at | 2024-11-13 17:38:15.136631 |
description | This Rust project provides functionality for parsing URLs and their components (such as protocol, domains, parameters, etc.) using the rust-pest library. It is designed for analyzing parts of a URL. |
id | 1446871 |
size | 107,319 |
A URL (Uniform Resource Locator) is the address of a unique resource on the internet. It is one of the key mechanisms used by browsers to retrieve resources such as HTML pages, CSS documents, images, and more. This library parses URL parts step-by-step, so it is crucial to understand the structure of a URL:
As well as many lesser known schemes like:
A scheme is required for a URL address.
Technical aspects of parsing:
For parsing, the library defines an enum of all official protocols, together with a method that returns each protocol's string value. The first rule of the parser checks whether the provided string is a valid URL scheme. Note that URL schemes are case-insensitive.
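The enum-plus-string-method idea can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the crate's actual code: the `Scheme` type, its four variants, and the `as_str` and `parse` names are assumptions made for the example, and the real enum is described as covering all official protocols.

```rust
/// Hypothetical, heavily shortened scheme enum (illustration only).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Scheme {
    Http,
    Https,
    Ftp,
    Mailto,
}

impl Scheme {
    /// Every variant, so that lookups can iterate over them.
    pub const ALL: [Scheme; 4] = [Scheme::Http, Scheme::Https, Scheme::Ftp, Scheme::Mailto];

    /// Canonical lowercase string value of the scheme.
    pub fn as_str(self) -> &'static str {
        match self {
            Scheme::Http => "http",
            Scheme::Https => "https",
            Scheme::Ftp => "ftp",
            Scheme::Mailto => "mailto",
        }
    }

    /// Case-insensitive lookup: "HTTPS", "Https" and "https" all match.
    pub fn parse(input: &str) -> Option<Scheme> {
        Scheme::ALL
            .iter()
            .copied()
            .find(|scheme| scheme.as_str().eq_ignore_ascii_case(input))
    }
}
```

Comparing with `eq_ignore_ascii_case` avoids allocating a lowercased copy of the input while still treating schemes as case-insensitive.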
Technical aspects of parsing:
A domain name is required for a URL address. Labels are the parts of a domain name separated by dots ("."). For simplicity, domain names and subdomains are parsed into a single vector of strings.
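As a rough sketch of splitting a domain into labels, the function below uses plain string operations instead of a pest grammar. The name `parse_domain` and the validation rules (non-empty labels made of ASCII letters, digits, and inner hyphens) are assumptions for illustration; the crate's grammar may accept a different set of inputs.

```rust
/// Hypothetical helper: splits a domain into its dot-separated labels.
/// Returns `None` if any label is empty, contains characters other than
/// ASCII letters, digits, or '-', or starts or ends with '-'.
pub fn parse_domain(domain: &str) -> Option<Vec<String>> {
    let labels: Vec<String> = domain.split('.').map(str::to_string).collect();

    let all_valid = labels.iter().all(|label| {
        !label.is_empty()
            && label.chars().all(|c| c.is_ascii_alphanumeric() || c == '-')
            && !label.starts_with('-')
            && !label.ends_with('-')
    });

    if all_valid {
        Some(labels)
    } else {
        None
    }
}
```

With this sketch, subdomains and the top-level domain end up in the same vector, in the order they appear in the input.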
Technical aspects of parsing:
A port number is a 16-bit unsigned integer in the range 0 to 65535, and it is parsed as such. The port is separated from the domain by a colon. The port is optional.
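The port rule maps naturally onto Rust's `u16`, whose range is exactly 0 to 65535. The sketch below is an assumption-laden illustration: `split_port` is a hypothetical helper, it works on a `host[:port]` string with the standard library only, and it does not handle bracketed IPv6 literals.

```rust
/// Hypothetical helper: splits "host[:port]" into the host and an optional port.
/// Returns `None` when a colon is present but the port is not a decimal
/// number that fits into a `u16`.
pub fn split_port(authority: &str) -> Option<(&str, Option<u16>)> {
    match authority.rsplit_once(':') {
        // No colon at all: the port was simply omitted.
        None => Some((authority, None)),
        Some((host, port)) => {
            // Accept digits only, so inputs such as "+80" are rejected.
            if port.is_empty() || !port.bytes().all(|b| b.is_ascii_digit()) {
                return None;
            }
            // `parse::<u16>` fails for values above 65535.
            port.parse::<u16>().ok().map(|p| (host, Some(p)))
        }
    }
}
```

Returning `Option<u16>` inside the tuple keeps "no port given" distinct from "invalid port", which matters because the port is optional.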
Technical aspects of parsing:
The path is optional. It is stored as a list of strings. Example: /page/new/3
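A minimal sketch of turning a path into a list of strings might look like this. The function name `parse_path` is an assumption, and dropping empty segments (for example from a trailing slash) is a choice made for the example rather than documented behavior of the crate.

```rust
/// Hypothetical helper: splits a path such as "/page/new/3" into its segments.
/// Empty segments (leading, trailing, or doubled slashes) are skipped,
/// so an empty path yields an empty vector.
pub fn parse_path(path: &str) -> Vec<String> {
    path.split('/')
        .filter(|segment| !segment.is_empty())
        .map(str::to_string)
        .collect()
}
```

For the example path above, this sketch yields the three segments `page`, `new`, and `3`.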
Technical aspects of parsing:
Parameters are parsed into a hash map of key-value pairs, where both keys and values are strings.
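The key-value structure could be sketched as below, assuming the query string has already been separated from the rest of the URL (without the leading `?`). `parse_query` is a hypothetical name; the sketch does no percent-decoding, maps a key without `=` to an empty value, and lets a later duplicate key overwrite an earlier one, none of which is confirmed behavior of the crate.

```rust
use std::collections::HashMap;

/// Hypothetical helper: parses "a=1&b=2" into a map of string keys to string values.
pub fn parse_query(query: &str) -> HashMap<String, String> {
    query
        .split('&')
        .filter(|pair| !pair.is_empty())
        .map(|pair| match pair.split_once('=') {
            // "key=value": split at the first '=' only.
            Some((key, value)) => (key.to_string(), value.to_string()),
            // "key" without '=': keep the key with an empty value.
            None => (pair.to_string(), String::new()),
        })
        .collect()
}
```

Because a `HashMap` is unordered, callers of a function like this should not rely on the parameters keeping their original order.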
Technical aspects of parsing:
It is parsed as a plain string.
make run
make test
make fmt
make clippy
make doc
make help