Crates.io | pandoras_pot |
lib.rs | pandoras_pot |
version | 0.7.1 |
source | src |
created_at | 2024-01-25 21:10:28.932624 |
updated_at | 2024-10-05 12:21:10.884462 |
description | Honeypot designed to send huge amounts of data to rude web scrapers |
homepage | |
repository | https://github.com/ginger51011/pandoras_pot/ |
max_upload_size | |
id | 1114447 |
size | 143,949 |
Inspired by HellPot, pandoras_pot is an HTTP honeypot that aims to bring even
more misery to unruly web crawlers that don't respect your robots.txt.
The goal with pandoras_pot is to send as much data as possible to incoming
unwanted connections, while not using up all the resources of your webserver
that probably could be doing better things with its time.
To ensure that bots don't detect pandoras_pot, it generates random data that kind
of looks like a website (to a bot), really really fast. Like crazy fast. One could even
say blazingly fast. Hopefully.
pandoras_pot supports multiple modes of generation, depending on its
configuration. It can for example generate random strings as data, or "actual"
sentences using Markov chains. Neato!
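To give a feel for the Markov chain mode, here is a minimal word-level sketch in Python. It is not pandoras_pot's actual implementation (which is written in Rust); the function names `build_chain` and `generate` and the sample text are made up for illustration.

```python
import random
from collections import defaultdict


def build_chain(text: str) -> dict:
    """Map each word to the list of words that follow it in the source text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return dict(chain)


def generate(chain: dict, length: int, rng: random.Random) -> str:
    """Walk the chain until `length` words are produced (always at least one).

    Assumes the chain is non-empty, i.e. the source had at least two words.
    """
    word = rng.choice(sorted(chain))
    output = [word]
    while len(output) < length:
        followers = chain.get(word)
        # A word without recorded followers (such as the last word of the
        # source) is a dead end, so restart from a random known word.
        word = rng.choice(followers) if followers else rng.choice(sorted(chain))
        output.append(word)
    return " ".join(output)


# Hypothetical sample text; a real setup would read a larger text file.
SOURCE = "the pot is full and the pot is hot and the bot is gone"
print(generate(build_chain(SOURCE), 12, random.Random(0)))
```

Every adjacent word pair in the output also occurs in the source (except after a dead-end restart), which is why the result reads like "actual" sentences to a bot.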
The most likely use case is to use another server as a reverse proxy, and then
select some paths that should be forwarded to pandoras_pot, like
/wp-login.php, /.git/config, and /.env.
Note that the URIs you use should have Disallow set in your /robots.txt,
otherwise you might get in trouble with things like Googlebot, which will dislike
your strange page of death. For the paths above, you could have a robots.txt
like the one below:
User-agent: *
Disallow: /wp-login.php
Disallow: /.git
Disallow: /.env
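Before pointing a reverse proxy at the honeypot, it can be worth checking that a well-behaved crawler would really be told to stay away from every forwarded path. The sketch below uses Python's standard `urllib.robotparser` for this; the helper name `is_disallowed` is made up for the example, and the robots.txt content is the one shown above.

```python
from urllib.robotparser import RobotFileParser

ROBOTS_TXT = """\
User-agent: *
Disallow: /wp-login.php
Disallow: /.git
Disallow: /.env
"""


def is_disallowed(robots_txt: str, path: str, user_agent: str = "*") -> bool:
    """Return True if a crawler honoring robots_txt must not fetch path."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return not parser.can_fetch(user_agent, path)


# Paths that will be forwarded to the honeypot, plus one normal page.
for path in ["/wp-login.php", "/.git/config", "/.env", "/index.html"]:
    status = "disallowed" if is_disallowed(ROBOTS_TXT, path) else "allowed"
    print(f"{path}: {status}")
```

Any honeypot path reported as allowed is one that a polite crawler could legitimately wander into.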
Common reverse proxies include nginx, httpd (Apache), and Caddy.
In Caddy you could add the following to match the /robots.txt we have already created:
(pandorust) {
    @pandorust_paths {
        path /wp-login.php /.git* /.env*
    }
    handle @pandorust_paths {
        reverse_proxy localhost:6669 # Or whatever you run pandoras_pot on
    }
}
# ...
example.com {
    # ...
    # Your actual website
    # ...
    import pandorust
}
After this you can simply run (if you installed using cargo install pandoras_pot):
pandoras_pot --help
to get more info.
Done!
The easiest way to set up pandoras_pot is using Docker. You can optionally
pass a config file using Docker's --build-arg CONFIG=<path to your config>
flag (the file must be available in the build context).
Start by cloning the repo by running
git clone git@github.com:ginger51011/pandoras_pot.git
cd pandoras_pot
Then you can build an image and deploy it, here naming and tagging it pandoras_pot
and making it available at localhost:6669:
docker build -t pandoras_pot . # You can add --build-arg CONFIG=<...> here
docker run --name=pandoras_pot --restart=always -p 6669:8080 -d pandoras_pot
systemd Service
You can also easily set up a systemd service. This requires you to
install Rust, but requires one less bloated Docker image and makes reloading
configurations easier. In this example I will set up a new user, pandora-user,
but you can use any user you want (but we will lock pandora-user down).
Note: With the exception of cloning and building pandoras_pot, most commands here will require root.
Start by cloning the repo and building pandoras_pot (after installing Rust):
git clone git@github.com:ginger51011/pandoras_pot.git
cd pandoras_pot
cargo build --release
# Move the binary to a better place
cp ./target/release/pandoras_pot /usr/bin/
We then create the user that will run the process; this user won't be root and cannot even log in:
adduser --disabled-password --gecos '' --shell /sbin/nologin --no-create-home --home /iamadirandidontexist 'pandora-user'
Then we create a directory to keep our configuration (and also things like the
data file for some generators):
mkdir /etc/pandoras_pot
# Ensure the config file exists; you can copy the default one in this README
# into this file
touch /etc/pandoras_pot/config.toml
# Optionally you can create your data file here. You need to point to it from
# the config.
# Make pandora-user the owner of this dir
chown -R pandora-user:pandora-user /etc/pandoras_pot
Now we create the actual service. If you have used the examples here, you can
just copy-paste this into a new file at /etc/systemd/system/pandorad.service:
[Unit]
Description=Pandora's Pot "service"
After=network.target
StartLimitIntervalSec=0
[Service]
# Change to another user/group if needed
User=pandora-user
Group=pandora-user
Restart=always
RestartSec=1
WorkingDirectory=/etc/pandoras_pot/
# Requires that the file /etc/pandoras_pot/config.toml exists; you can also
# remove config.toml to use plain default settings.
ExecStart=/usr/bin/pandoras_pot config.toml
###
## Hardening; this is optional and can be commented out, but is generally
## good practice. Some might prevent pandoras_pot from functioning, see below.
##
## Other settings may exist and be suitable.
##
## For more info, see systemd.exec(5)
##
MemoryDenyWriteExecute=yes
NoNewPrivileges=yes
PrivateDevices=yes
PrivateTmp=yes
PrivateUsers=yes
ProtectClock=yes
ProtectControlGroups=yes
ProtectHostname=yes
ProtectKernelLogs=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
RestrictNamespaces=yes
RestrictSUIDSGID=yes
# These might prevent pandoras_pot from writing to a log file if ReadWritePaths is misconfigured.
ProtectHome=yes
ProtectSystem=strict
# This should point to the output log file; this is the default value.
# It should be the same as `logging.output_path` in the config.toml.
# A sane alternative is `/var/log/pandoras.log`.
ReadWritePaths=/etc/pandoras_pot/pandoras.log
##
## End of hardening
###
[Install]
WantedBy=multi-user.target
Then you need to reload the systemd daemon, then enable and start your service:
systemctl daemon-reload
systemctl enable pandorad.service
systemctl start pandorad.service
You can check if everything looks good:
systemctl status pandorad.service
Done!
pandoras_pot uses TOML as a configuration format. If you are not using Docker,
you can either pass a config as an argument like so:
pandoras_pot <path-to-config>
or put it in a file at $HOME/.config/pandoras_pot/config.toml.
You can always get the default configuration using
pandoras_pot --print-default-config
A sample file can be found below:
[http]
# Make sure this matches your Dockerfile's "EXPOSE" if using Docker
port = "8080"
# Routes to send misery to. Is overridden by `http.catch_all`
routes = ["/wp-login.php", "/.env"]
# If all routes are to be served.
catch_all = true
# How many connections that can be made over `http.rate_limit_period` seconds. Will
# not set any limit if set to 0.
rate_limit = 0
# Amount of seconds that `http.rate_limit` checks on. Does nothing if rate limit is set
# to 0.
rate_limit_period = 300 # 5 minutes
# Enables `http.health_port` to be used for health checks (to see if
# `pandoras_pot` is running). Useful if you want to use your chad gaming PC
# that might not always be up and running to back up an instance running on
# your RPi 3 web server.
health_port_enabled = false
# Port to be used for health checks. Should probably not be accessible from the
# outside. Has no effect if `http.health_port_enabled` is `false`.
health_port = "8081"
# The `Content-Type` header set in responses.
content_type = "text/html; charset=utf-8"
[generator]
# The size of each generated chunk in bytes. Has a big impact on performance, so
# play around a bit! Note that if this is set too low (like 10 bytes), `pandoras_pot`
# will refuse to run.
chunk_size = 16384 # 1024 * 16
# The type of generator to be used
type = { name = "random" }
# For generator.type it is also possible to set a markov chain generator, using
# a text file as a source of data. Then you can use this (but uncommented, duh):
# type = { name = "markov_chain", data = "<path to some text file>" }
# Another alternative is a static generator, that always outputs the full contents
# of a file. Does not respect chunking.
# type = { name = "static", data = "<path to some file>" }
# The max amount of simultaneous generators that can produce output.
# Useful for preventing abuse. `0` means no limit.
max_concurrent = 100
# The amount of time in seconds a generator can be active before
# it stops sending. `0` means no limit.
time_limit = 0
# The amount of data in bytes that a generator can
# send before it stops sending. `0` means no limit.
size_limit = 0
# How many chunks should be buffered for each connection. Higher values mean
# more memory usage, but may lead to increased performance. Must be >= 1.
chunk_buffer = 20
# Prefix that will be used for the first message to an incoming connection.
# Usually used to set an HTML prefix. Can be set to "" to disable.
#
# Example usage: Set to "{" for a static generator using a JSON file to make
# output look like a valid stream of JSON that will eventually end (it won't).
prefix = "<!DOCTYPE html><html><body>"
[logging]
# Output file for logs.
output_path = "pandoras.log"
# If pretty logs should be written to standard output.
print_pretty_logs = true
# If no logs at all should be printed to stdout. Overrides other stdout logging
# settings.
no_stdout = false
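The buffering settings above interact: each connection may hold up to `generator.chunk_buffer` chunks of `generator.chunk_size` bytes, and at most `generator.max_concurrent` generators run at once. As a rough worked example, the Python sketch below multiplies the three to estimate how much memory buffered chunks could occupy. This is an assumption-based back-of-the-envelope figure, not a number reported by pandoras_pot, and it ignores all other overhead; the function name is made up for illustration.

```python
def buffered_bytes_upper_bound(chunk_size: int, chunk_buffer: int, max_concurrent: int) -> int:
    """Rough upper bound, in bytes, on memory held by buffered chunks."""
    if chunk_buffer < 1:
        raise ValueError("chunk_buffer must be >= 1")
    if max_concurrent == 0:
        # 0 means "no limit" in the config, so no bound can be given.
        raise ValueError("max_concurrent = 0 means no limit, so there is no upper bound")
    return chunk_size * chunk_buffer * max_concurrent


# Values taken from the sample config above.
bound = buffered_bytes_upper_bound(chunk_size=16384, chunk_buffer=20, max_concurrent=100)
print(f"{bound} bytes = {bound / 2**20:.2f} MiB")  # 32768000 bytes = 31.25 MiB
```

With the sample values this comes to about 31 MiB, which suggests that raising `chunk_buffer` or `chunk_size` by an order of magnitude is worth a second thought on a small machine.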
You can easily measure how fast your setup sends data by using curl. Note that using
localhost might not be reliable, as it does not show what an outsider might see. A better
option might be to use another machine.
This example assumes that you have http.catch_all enabled, otherwise you should add a
valid route.
curl localhost:8080/ >> /dev/null
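Since the stream never ends on its own, a measurement with a built-in time limit can be handy. The Python sketch below reads from a URL for a fixed number of seconds and returns the byte count and elapsed time. The function names and the 10-second default are made up for the example, and the URL is expected to point at wherever your instance listens.

```python
import time
import urllib.request


def measure_stream(stream, max_seconds: float, read_size: int = 64 * 1024):
    """Read from stream until max_seconds pass or it ends.

    Returns a (total_bytes, elapsed_seconds) tuple.
    """
    total = 0
    start = time.monotonic()
    while time.monotonic() - start < max_seconds:
        chunk = stream.read(read_size)
        if not chunk:
            break  # The server closed the connection.
        total += len(chunk)
    return total, time.monotonic() - start


def measure_url(url: str, max_seconds: float = 10.0):
    """Open url and measure how much data arrives within max_seconds."""
    with urllib.request.urlopen(url, timeout=max_seconds) as response:
        return measure_stream(response, max_seconds)


def mib_per_second(total_bytes: int, elapsed_seconds: float) -> float:
    """Convert a byte count and duration to MiB/s (0.0 if no time elapsed)."""
    if elapsed_seconds <= 0:
        return 0.0
    return total_bytes / elapsed_seconds / 2**20
```

For example, `total, elapsed = measure_url("http://localhost:8080/")` followed by `print(mib_per_second(total, elapsed))` gives a rough throughput figure. As with curl, running it from another machine gives a more realistic number than localhost.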
I do not accept any donations. If you however find any software I
write for fun useful, please consider donating to an efficient charity that
saves or improves lives the most per $CURRENCY.
GiveWell.org is an excellent website that can help you donate to the world's most efficient charities. Alternatives that list the current best charities are Founders Pledge for helping our planet, and Animal Charity Evaluators for animal welfare.
This list is not exhaustive; your country may have an equivalent.