| Crates.io | pprog |
| lib.rs | pprog |
| version | 0.0.8 |
| created_at | 2024-12-31 06:36:50.175547+00 |
| updated_at | 2025-01-27 06:52:39.855056+00 |
| description | An LLM pair programming server with web interface |
| homepage | |
| repository | https://github.com/foomprep/pprog |
| max_upload_size | |
| id | 1499765 |
| size | 1,302,311 |
pprog is an LLM-based pair programmer for working on coding projects. It can generate and edit code and answer questions about it.
This is experimental and unstable code and may change at any time. I created this project for personal use because I didn't want to be locked in to a specific editor. More tools and features will be added over time.
To install Rust, follow the instructions at https://rustup.rs.
To build from source
cargo install pprog
or download a prebuilt binary
cargo binstall pprog
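cargo binstall requires the cargo-binstall tool itself; if you don't have it yet, install it once with
cargo install cargo-binstall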
To use pprog, cd into the directory of an existing or template project. pprog depends on git and also uses .gitignore to communicate the available files to the LLM, so the project must have git initialized. For this example, we'll create a basic Node.js project.
mkdir example-project
cd example-project
npm init -y && git init
pprog init
This will generate a config file, pprog.toml, with sensible defaults depending on the type of project. For this example, pprog.toml will contain
provider = "anthropic"
model = "claude-3-5-haiku-latest"
check_cmd = "timeout 3s node index.js"
check_enabled = false
api_url = "https://api.anthropic.com/v1/messages"
api_key = "<ANTHROPIC API KEY>"
max_context = 128000
max_output_tokens = 8096
An Anthropic account is assumed on init, but OpenAI-compatible APIs can be used as well. For example, to use OpenAI, change the config to
provider = "openai"
model = "gpt-4o"
check_enabled = false
check_cmd = "timeout 3s node index.js"
api_url = "https://api.openai.com/v1/chat/completions"
api_key = "<OPENAI API KEY>"
max_context = 100000
max_output_tokens = 8096
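Other OpenAI-compatible providers work the same way; only model and api_url change. As an illustrative sketch (the model name and endpoint here are assumptions, check your provider's documentation), a Fireworks config might look like
provider = "openai"
model = "accounts/fireworks/models/llama-v3p1-405b-instruct"
check_enabled = false
check_cmd = "timeout 3s node index.js"
api_url = "https://api.fireworks.ai/inference/v1/chat/completions"
api_key = "<FIREWORKS API KEY>"
max_context = 100000
max_output_tokens = 8096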
The tooling logic is intended to be as simple as possible so the model has more flexibility to maneuver. To run, enter
pprog serve
and then open http://localhost:8080 in your browser. A chat interface will load and you can begin making changes to your code. For example, you can type a message like Create an index.js file with basic express server and it will create the file and check that it runs properly using the check_cmd command. A follow-up message like Add GET /ping endpoint will make further changes to the code and check again.
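After both requests, the generated index.js might resemble the sketch below (illustrative only; the model's actual output will vary, and express must be installed, which the agent may do via the execute tool):
// index.js - minimal Express server with a GET /ping endpoint
const express = require('express');
const app = express();

app.get('/ping', (req, res) => {
  res.send('pong');
});

app.listen(3000, () => {
  console.log('Server listening on port 3000');
});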
You can run pprog serve for multiple projects at the same time by assigning each a different port
pprog serve --port 3002
Currently hacking together something to make o1 work. Since people will probably ask: Llama models can be used through OpenAI-compatible APIs like Fireworks, but I've found even the 405B model to be utterly useless.
pprog uses the check_cmd to check for successful compilation or operation. In the example above, timeout 3s node index.js will run to check for runtime errors, and the agent will correct them until all errors are gone. You're free to change check_cmd to anything you want for the given program; see the examples after this paragraph. For compiled projects in a language like Rust, check_cmd would be "cargo check". For interpreted languages it will depend on the type of program. For long-lived programs like a web server, you can use the timeout trick above (gtimeout on macOS) to check for any initial runtime errors. For interpreted programs that are not long-lived, simply running the program (like node short-lived-script.js) should work. Note that if you don't use a timeout for interpreted programs, the chat will not continue until the program completes.
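Some illustrative check_cmd values for common setups (adjust to your own project):
check_cmd = "cargo check"                # compiled (Rust): compile errors only
check_cmd = "timeout 3s node index.js"   # long-lived interpreted (Node server): startup errors
check_cmd = "gtimeout 3s node index.js"  # same on macOS (gtimeout comes from coreutils)
check_cmd = "node short-lived-script.js" # short-lived interpreted: just run it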
Depending on the project, check_cmd can be extremely verbose and therefore costly. For example, if building a React Native app I would use something like
check_cmd = "gtimeout 10s npx react-native run-android"
This produces A LOT of text that gets passed into the context of message calls, most of which is not helpful at all and usually increases the cost of a task by 3x or more. For this reason the check is disabled by default. Set the config variable check_enabled = true to enable it.
pprog uses a very small set of tools to make changes. Currently it has four:
read_file - read entire file contents
write_file - replace entire file with contents
execute - run general bash; sometimes used by the agent to install packages when a check fails
compile_check - check for compilation errors, or, for interpreted programs, runtime errors on startup
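Conceptually, compile_check amounts to running the configured check_cmd in a shell and handing its combined output back to the model as a tool result, roughly like this sketch (not pprog's actual implementation):
# run check_cmd and capture both stdout and stderr for the model
sh -c "timeout 3s node index.js" 2>&1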
When messages go beyond the max_context config amount, messages will be pruned automatically until the total token count is below the max. When using Anthropic models, the dedicated endpoint at v1/messages/count_tokens is used to get the count. For OpenAI/OpenAI-compatible models, a conservative estimate of 2 characters per token is used, because different providers may use different tokenizers behind their OpenAI-compatible APIs. The estimate is also conservative because most of the text will be code, which has a lower character-per-token ratio on average. As a general rule of thumb, set max_context to around 70% of the model's context length.
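For example, gpt-4o has a 128,000-token context window, so the max_context = 100000 in the config above is roughly in that range; for a 200,000-token model like claude-3-5-haiku-latest, 70% would be around 140,000.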
If errors occur while the chat is in a tool loop, all tool-use and tool-result messages following the user request will be pruned and a single empty assistant message will be added to maintain a valid conversation format. The error will then be forwarded to the user. This is a quick hack and will probably change in the future, but it is required by the constraints of most APIs and how models are trained.
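As an illustration (message shapes simplified), a failed loop like
[user request] -> [assistant tool_use] -> [tool result] -> [assistant tool_use] -> ERROR
is collapsed to
[user request] -> [assistant ""]
and the error is shown in the chat.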
The model may make tool calls using execute that require sudo privileges. When this happens, the tool loop will block and wait for the user to input a password. The password prompt will appear in the terminal window where you ran pprog serve. Enter the password and press ENTER. This happens entirely on the local system where pprog is running; your sudo password is never sent in any messages to the model.
To undo unwanted changes, use git restore . or the like. You can request in chat to commit all changes and it will do so with a good log message, but I usually do not do this because the chats are quite large, and a simple commit request carries all previous messages and can be expensive.

Happy hacking!