| Crates.io | markdown-ai-cite-remove |
| lib.rs | markdown-ai-cite-remove |
| version | 0.3.0 |
| created_at | 2025-11-24 09:25:52.148933+00 |
| updated_at | 2025-11-25 00:15:47.805665+00 |
| description | High-performance removal of AI-generated citations and annotations from Markdown text |
| homepage | https://opensite.ai/developers |
| repository | https://github.com/opensite-ai/markdown-ai-cite-remove |
| max_upload_size | |
| id | 1947568 |
| size | 268,478 |
"Five years of AI evolution and flirting with AGI. Zero libraries to remove [1][2][3] from markdown. Remember me well when they take over."
Remove AI-generated citations and annotations from Markdown text at the speed of Rust
High-performance Rust library for removing citations from ChatGPT, Claude, Perplexity, and other AI markdown responses. Removes inline citations [1][2], reference links [1]: https://..., and bibliography sections with 100% accuracy.
CLI Guide • Benchmarking Guide • FAQ • Documentation Index
// Remove citations from AI-generated blog posts before publishing
use markdown_ai_cite_remove::clean;
let ai_draft = fetch_from_chatgpt();
let clean_post = clean(&ai_draft);
publish_to_cms(clean_post);
// Remove citations from AI-generated documentation
let docs = generate_docs_with_ai();
let clean_docs = clean(&docs);
write_to_file("README.md", clean_docs);
// Remove citations from multiple AI responses
use markdown_ai_cite_remove::CitationRemover;
let cleaner = CitationRemover::new();
let responses = vec![chatgpt_response, claude_response, perplexity_response];
let cleaned: Vec<String> = responses
.iter()
.map(|r| cleaner.remove_citations(r))
.collect();
// Remove citations from AI responses in real-time
use futures::{Stream, StreamExt};

async fn process_stream(stream: impl Stream<Item = String>) {
    let cleaner = CitationRemover::new();
    stream
        .map(|chunk| cleaner.remove_citations(&chunk))
        .for_each(|cleaned| async move {
            send_to_client(cleaned).await;
        })
        .await;
}
# Remove citations and auto-generate output file
mdcr ai_response.md
# Creates: ai_response__cite_removed.md
Minimum Requirements:
Optional (for enhanced benchmarking):
# macOS
brew install gnuplot
# Debian/Ubuntu
sudo apt-get install gnuplot

Add to your Cargo.toml:
[dependencies]
markdown-ai-cite-remove = "0.3"
Install the command-line tool globally:
# Install from crates.io (when published)
cargo install markdown-ai-cite-remove
# Or install from local source
cargo install --path .
Verify installation:
mdcr --version
| Task | Command |
|---|---|
| Remove citations from stdin | echo "Text[1]" \| mdcr |
| Auto-generate output file | mdcr input.md |
| Specify output file | mdcr input.md -o output.md |
| Verbose output | mdcr input.md --verbose |
| Run tests | cargo test |
| Run benchmarks | cargo bench |
| View docs | cargo doc --open |
use markdown_ai_cite_remove::clean;
let markdown = "AI research shows promise[1][2].\n\n[1]: https://example.com\n[2]: https://test.com";
let cleaned = clean(markdown);
assert_eq!(cleaned.trim(), "AI research shows promise.");
1. Process from stdin to stdout (pipe mode):
echo "Text[1] here." | mdcr
# Output: Text here.
2. Auto-generate output file (easiest!):
mdcr ai_response.md
# Creates: ai_response__cite_removed.md
3. Specify custom output file:
mdcr input.md -o output.md
4. Verbose output (shows processing details):
mdcr input.md --verbose
# Output:
# Reading from file: input.md
# Removing citations (input size: 1234 bytes)...
# Citations removed (output size: 1100 bytes)
# Writing to file: input__cite_removed.md
# Done!
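The auto-naming rule from step 2 can be sketched in a few lines of Rust. This is an illustrative sketch of the convention described above (insert `__cite_removed` before the extension), not the tool's actual source:

```rust
use std::path::Path;

/// Sketch of the CLI's auto-naming rule:
/// "ai_response.md" -> "ai_response__cite_removed.md".
fn default_output_name(input: &str) -> String {
    let path = Path::new(input);
    let stem = path.file_stem().and_then(|s| s.to_str()).unwrap_or(input);
    match path.extension().and_then(|e| e.to_str()) {
        Some(ext) => format!("{stem}__cite_removed.{ext}"),
        None => format!("{stem}__cite_removed"),
    }
}

fn main() {
    assert_eq!(default_output_name("ai_response.md"), "ai_response__cite_removed.md");
    println!("{}", default_output_name("notes.md"));
}
```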
Process multiple files (auto-generated output):
# Process all markdown files in current directory
for file in *.md; do
mdcr "$file"
done
# Creates: file1__cite_removed.md, file2__cite_removed.md, etc.
Integration with other tools:
# Remove citations from AI output from curl
curl https://api.example.com/ai-response | mdcr
# Remove citations and preview
mdcr document.md -o - | less
# Remove citations and count words
mdcr document.md -o - | wc -w
# Chain with other markdown processors
mdcr input.md -o - | pandoc -f markdown -t html -o output.html
Advanced shell script example:
For more complex workflows, create a custom shell script. See the CLI Guide for advanced automation examples including:
- Inline citations: [1][2][3], [source:1], [ref:2], [cite:3], [note:4]
- Reference links: [1]: https://...
- Reference headers: ## References, # Citations, ### Sources
- Reference entries: [1] Author (2024). Title...

use markdown_ai_cite_remove::{CitationRemover, RemoverConfig};
// Remove only inline citations, keep reference sections
let config = RemoverConfig::inline_only();
let cleaner = CitationRemover::with_config(config);
let result = cleaner.remove_citations("Text[1] here.\n\n[1]: https://example.com");
// Remove only reference sections, keep inline citations
let config = RemoverConfig::references_only();
let cleaner = CitationRemover::with_config(config);
let result = cleaner.remove_citations("Text[1] here.\n\n[1]: https://example.com");
// Full custom configuration
let config = RemoverConfig {
remove_inline_citations: true,
remove_reference_links: true,
remove_reference_headers: true,
remove_reference_entries: true,
normalize_whitespace: true,
remove_blank_lines: true,
trim_lines: true,
};
let cleaner = CitationRemover::with_config(config);
use markdown_ai_cite_remove::CitationRemover;
let cleaner = CitationRemover::new();
// Reuse for multiple documents
let doc1 = cleaner.remove_citations("First document[1].");
let doc2 = cleaner.remove_citations("Second document[2][3].");
let doc3 = cleaner.remove_citations("Third document[source:1].");
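For intuition, the inline-citation pass can be sketched in plain std Rust. This is an illustrative sketch handling only the numeric `[N]` pattern, not the crate's actual implementation, which covers far more formats ([source:1], reference sections, etc.) and is heavily optimized:

```rust
/// Strip bracketed numeric citations like "[1]" or "[23]" from text.
/// Illustrative sketch only - not the crate's real implementation.
fn strip_inline_citations(input: &str) -> String {
    let b = input.as_bytes();
    let mut out = String::with_capacity(input.len());
    let mut i = 0;
    while i < b.len() {
        if b[i] == b'[' {
            // Look ahead for a run of digits followed by ']'.
            let mut j = i + 1;
            while j < b.len() && b[j].is_ascii_digit() {
                j += 1;
            }
            if j > i + 1 && j < b.len() && b[j] == b']' {
                i = j + 1; // Skip the whole "[N]" token.
                continue;
            }
        }
        // Copy one full UTF-8 character starting at i.
        let ch_len = input[i..].chars().next().unwrap().len_utf8();
        out.push_str(&input[i..i + ch_len]);
        i += ch_len;
    }
    out
}

fn main() {
    assert_eq!(
        strip_inline_citations("AI research shows promise[1][2]."),
        "AI research shows promise."
    );
    println!("{}", strip_inline_citations("Text[1] here."));
}
```

A real implementation also has to handle reference-link lines, header detection, and whitespace normalization, which is what the `RemoverConfig` flags above control.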
See the examples/ directory for more:
- basic_usage.rs - Simple examples
- custom_config.rs - Configuration options

Run examples:
cargo run --example basic_usage
cargo run --example custom_config
# Run all benchmarks
cargo bench
# Run specific benchmark
cargo bench chatgpt_format
# Save baseline for comparison
cargo bench -- --save-baseline main
# Compare against baseline
cargo bench -- --baseline main
# View results (after running benchmarks)
open target/criterion/report/index.html # macOS
xdg-open target/criterion/report/index.html # Linux
start target/criterion/report/index.html # Windows
Note about benchmark output:
Tests shown as "ignored" during cargo bench are normal - regular tests are skipped during benchmarking to avoid interference. HTML reports are written to target/criterion/report/.

Typical performance on modern hardware (Apple Silicon M-series):
| Benchmark | Time | Throughput | Notes |
|---|---|---|---|
| Simple inline citations | ~580 ns | 91 MiB/s | Single sentence |
| Complex document | ~2.5 µs | 287 MiB/s | Multiple sections |
| Real ChatGPT output | ~18 µs | 645 MiB/s | 11.8 KB document |
| Real Perplexity output | ~245 µs | 224 MiB/s | 54.9 KB document |
| Batch (5 documents) | ~2.2 µs | 43 MiB/s | Total for all 5 |
| No citations (passthrough) | ~320 ns | 393 MiB/s | Fastest path |
Key Insights:
This library has 100% test coverage with comprehensive edge case testing.
# Run all tests (unit + integration + doc tests)
cargo test
# Run with output visible
cargo test -- --nocapture
# Run specific test
cargo test test_real_world_chatgpt
# Run only unit tests
cargo test --lib
# Run only integration tests
cargo test --test integration_tests
# Run tests with all features enabled
cargo test --all-features
What's tested:
When running cargo bench, you'll see tests marked as "ignored" - this is normal. Rust automatically skips regular tests during benchmarking to avoid timing interference. All tests pass when running cargo test.
Q: Why do tests show as "ignored" when running cargo bench?
A: This is normal Rust behavior. When running benchmarks, regular tests are automatically skipped to avoid interfering with timing measurements. All tests pass when you run cargo test. See BENCHMARKING.md for details.
Q: What does "Gnuplot not found, using plotters backend" mean?
A: This is just an informational message. Criterion (the benchmarking library) can use Gnuplot for visualization, but falls back to an alternative plotting backend if it's not installed. Benchmarks still run correctly. To install Gnuplot:
# macOS
brew install gnuplot
# Debian/Ubuntu
sudo apt-get install gnuplot

Q: Why are there performance outliers in benchmarks?
A: Outliers (typically 3-13% of measurements) are normal background system noise rather than a problem with the library. This is expected, and Criterion automatically detects and reports outliers.
Q: The CLI tool isn't found after installation
A: Make sure Cargo's bin directory is in your PATH:
# Add to ~/.bashrc, ~/.zshrc, or equivalent
export PATH="$HOME/.cargo/bin:$PATH"
# Then reload your shell
source ~/.bashrc # or source ~/.zshrc
Q: How do I know if citations were actually removed?
A: Use the --verbose flag to see before/after sizes:
mdcr input.md --verbose
- cargo doc --open for full API docs
- examples/ directory for working code

Built by OpenSite AI for the developer community.
Contributions welcome! Please feel free to submit a Pull Request.
# Clone the repository
git clone https://github.com/opensite-ai/markdown-ai-cite-remove.git
cd markdown-ai-cite-remove
# Run tests
cargo test
# Run benchmarks
cargo bench
# Build documentation
cargo doc --open
# Format code
cargo fmt
# Run linter
cargo clippy
Licensed under either of:
at your option.