= nng_url_parse(3) // // Copyright 2018 Staysail Systems, Inc. // Copyright 2018 Capitar IT Group BV // // This document is supplied under the terms of the MIT License, a // copy of which should be located in the distribution where this // file was obtained (LICENSE.txt). A copy of the license may also be // found online at https://opensource.org/licenses/MIT. // == NAME nng_url_parse - create URL structure from a string == SYNOPSIS [source, c] ---- #include int nng_url_parse(nng_url **urlp, const char *str); ---- == DESCRIPTION The `nng_url_parse()` function parses the string _str_ containing an https://tools.ietf.org/html/rfc3986[RFC 3986] compliant URL, and creates a structure containing the results. A pointer to the resulting structure is stored in _urlp_. The `nng_url` structure has at least the following members: [source, c] ---- struct nng_url { char *u_rawurl; // Unparsed URL, with minimal canonicalization. char *u_scheme; // Scheme, such as "http"; always lower case. char *u_userinfo; // Userinfo component, or NULL. char *u_host; // Full host, including port if present. char *u_hostname; // Hostname only (or address), or empty string. char *u_port; // Port number, may be default or empty string. char *u_path; // Path if present, empty string otherwise. char *u_query; // Query info if present, NULL otherwise. char *u_fragment; // Fragment if present, NULL otherwise. char *u_requri; // Request-URI (path[?query][#fragment]) }; ---- === URL Canonicalization The `nng_url_parse()` function also canonicalizes the results, as follows: 1. The URL is parsed into the various components. 2. The `u_scheme`, `u_hostname`, `u_host`, and `u_port` members are converted to lower case. 3. Percent-encoded values for https://tools.ietf.org/html/rfc3986#section-2.3[unreserved characters] converted to their unencoded forms. 4. Additionally URL percent-encoded values for characters in the path and with numeric values larger than 127 (i.e. not ASCII) are decoded. 5. The resulting `u_path` is checked for invalid UTF-8 sequences, consisting of surrogate pairs, illegal byte sequences, or overlong encodings. If this check fails, then the entire URL is considered invalid, and the function returns `NNG_EINVAL`. 6. Path segments consisting of `.` and `..` are resolved as per https://tools.ietf.org/html/rfc3986#section-6.2.2.3[RFC 3986 6.2.2.3]. 7. Further, empty path segments are removed, meaning that duplicate slash (`/`) separators are removed from the path. 8. If a port was not specified, but the scheme defines a default port, then `u_port` will be filled in with the value of the default port. TIP: Only the `u_userinfo`, `u_query`, and `u_fragment` members will ever be `NULL`. The other members will be filled in with either default values or the empty string if they cannot be determined from _str_. == RETURN VALUES This function returns 0 on success, and non-zero otherwise. == ERRORS [horizontal] `NNG_ENOMEM`:: Insufficient free memory exists to allocate a message. `NNG_EINVAL`:: An invalid URL was supplied. == SEE ALSO [.text-left] xref:nng_url_clone.3.adoc[nng_url_clone(3)], xref:nng_url_free.3.adoc[nng_url_free(3)], xref:nng_strerror.3.adoc[nng_strerror(3)], xref:nng.7.adoc[nng(7)]