py-license-auditor

Crates.iopy-license-auditor
lib.rspy-license-auditor
version0.1.0
created_at2025-09-22 08:12:39.219711+00
updated_at2025-09-22 08:12:39.219711+00
descriptionA fast, reliable command-line tool to extract and analyze license information from Python packages with policy-based violation detection
homepagehttps://github.com/yayami3/py-license-auditor
repositoryhttps://github.com/yayami3/py-license-auditor
max_upload_size
id1849653
size71,237
(yayami3)

documentation

https://docs.rs/py-license-auditor

README

py-license-auditor

Crates.io Documentation License

A fast, reliable command-line tool to extract and analyze license information from Python packages installed in your environment.

✨ Features

  • 🔍 Comprehensive Detection: Extracts license info from .dist-info and .egg-info directories
  • 📊 Multiple Output Formats: JSON, TOML, and CSV support
  • 🎯 Smart Categorization: Separates OSI-approved from non-OSI licenses
  • 📈 Usage Statistics: Shows license distribution with counts
  • 🚀 Fast Performance: Written in Rust for speed
  • 🔧 CI/CD Ready: Perfect for automated license compliance checks
  • ⚖️ License Violation Detection: Automatically detect policy violations with customizable rules
  • 🚨 Policy Enforcement: Fail builds on forbidden licenses for compliance automation
  • 🎛️ Flexible Configuration: TOML-based policy files with exact matching and glob patterns

🚀 Installation

From crates.io (Recommended)

cargo install py-license-auditor

From source

git clone https://github.com/yayami3/py-license-auditor
cd py-license-auditor
cargo install --path .

📖 Usage

📚 Quick Start: See QUICKSTART.md for a step-by-step guide

Quick Start

# Auto-detect .venv in current directory
py-license-auditor

# Specify site-packages directory
py-license-auditor --path /path/to/site-packages

# Save to file
py-license-auditor --output licenses.json

Output Formats

# JSON (default)
py-license-auditor --format json

# TOML
py-license-auditor --format toml

# CSV for spreadsheets
py-license-auditor --format csv

Advanced Options

# Include packages without license info
py-license-auditor --include-unknown

# Combine options
py-license-auditor --format csv --output report.csv --include-unknown

License Violation Detection

# Use built-in policies (no setup required)
py-license-auditor --policy corporate --check-violations
py-license-auditor --policy permissive --check-violations  
py-license-auditor --policy strict --check-violations

# Use custom policy file
py-license-auditor --policy-file policy.toml --check-violations

# Fail build on forbidden licenses (for CI/CD)
py-license-auditor --policy corporate --check-violations --fail-on-violations

# Generate compliance report with violations
py-license-auditor --policy strict --check-violations --output compliance.json

📊 Output Example

JSON Format

{
  "packages": [
    {
      "name": "requests",
      "version": "2.31.0",
      "license": "Apache-2.0",
      "license_classifiers": [
        "License :: OSI Approved :: Apache Software License"
      ],
      "metadata_source": "METADATA"
    }
  ],
  "summary": {
    "total_packages": 50,
    "with_license": 45,
    "without_license": 5,
    "license_types": {
      "osi_approved": {
        "MIT": 20,
        "Apache-2.0": 15,
        "BSD": 8
      },
      "non_osi": {
        "MIT License": 2
      }
    }
  },
  "violations": {
    "total": 2,
    "errors": 1,
    "warnings": 1,
    "details": [
      {
        "package_name": "some-gpl-lib",
        "package_version": "2.1.0",
        "license": "GPL-3.0",
        "violation_level": "Forbidden",
        "matched_rule": "exact: GPL-3.0",
        "message": "License 'GPL-3.0' is forbidden by policy"
      }
    ]
  }
    }
  }
}

CSV Format

name,version,license,license_classifiers,metadata_source
requests,2.31.0,Apache-2.0,"License :: OSI Approved :: Apache Software License",METADATA
click,8.1.7,BSD-3-Clause,"License :: OSI Approved :: BSD License",METADATA

🎛️ Policy Configuration

Built-in Policies

Three ready-to-use policies are included:

# Corporate: Conservative policy for proprietary software
py-license-auditor --policy corporate --check-violations

# Permissive: Balanced policy for open source projects  
py-license-auditor --policy permissive --check-violations

# Strict: Very restrictive - only MIT, Apache-2.0, BSD-3-Clause
py-license-auditor --policy strict --check-violations
Policy Allowed Forbidden Review Required
Corporate MIT, Apache-2.0, BSD-* GPL-, AGPL-, LGPL-* MPL-2.0
Permissive MIT, Apache-2.0, BSD-*, MPL-2.0 None GPL-, AGPL-
Strict MIT, Apache-2.0, BSD-3-Clause GPL-, AGPL-, LGPL-*, MPL-2.0 ISC, BSD-*

Custom Policy File Format

Create a policy.toml file to define your license compliance rules:

name = "Corporate License Policy"
description = "License policy for proprietary software development"

[allowed_licenses]
exact = ["MIT", "Apache-2.0", "BSD-3-Clause", "ISC"]
patterns = ["BSD-*"]

[forbidden_licenses]
exact = ["GPL-3.0", "AGPL-3.0"]
patterns = ["GPL-*", "AGPL-*"]

[review_required]
exact = ["MPL-2.0", "LGPL-2.1"]
patterns = ["LGPL-*"]

[[exceptions]]
name = "legacy-package"
version = "1.0.0"
reason = "Approved by legal team for legacy compatibility"

Policy Rules

  • allowed_licenses: Licenses that are automatically approved
  • forbidden_licenses: Licenses that cause build failures
  • review_required: Licenses that need manual review (warnings)
  • exceptions: Package-specific overrides with justification

Pattern Matching

Use glob patterns for flexible license matching:

  • "GPL-*" matches GPL-2.0, GPL-3.0, etc.
  • "BSD-*" matches BSD-2-Clause, BSD-3-Clause, etc.

🎯 Use Cases

License Compliance

Generate comprehensive reports for legal review and compliance auditing.

# Generate compliance report
py-license-auditor --format json --output compliance-report.json

CI/CD Integration

Automate license checking in your deployment pipeline.

# GitHub Actions example
- name: Check license compliance
  run: |
    py-license-auditor --policy corporate --check-violations --fail-on-violations
    
- name: Generate license report
  run: |
    py-license-auditor --format json --output license-report.json
# Basic license extraction
py-license-auditor --format json > licenses.json

Dependency Auditing

Understand your project's license obligations and risks.

# Focus on non-OSI licenses that need manual review
py-license-auditor --format json | jq '.summary.license_types.non_osi'

🔍 License Categories

The tool categorizes licenses into two groups:

  • OSI Approved: Licenses approved by the Open Source Initiative (legally vetted)
  • Non-OSI: Custom licenses, proprietary licenses, or unrecognized formats

This helps you quickly identify which licenses need manual legal review.

🛠️ Development

Building from Source

git clone https://github.com/yayami3/py-license-auditor
cd py-license-auditor
cargo build --release

Running Tests

cargo test

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

📄 License

This project is licensed under either of

at your option.

🙏 Acknowledgments

  • Built with Clap for CLI parsing
  • Uses Serde for serialization
  • Inspired by the need for better Python license compliance tools
Commit count: 9

cargo fmt