| Crates.io | awful_knowledge_synthesizer |
| lib.rs | awful_knowledge_synthesizer |
| version | 0.1.4 |
| created_at | 2025-10-07 02:31:14.020347+00 |
| updated_at | 2025-10-07 02:42:07.384132+00 |
| description | Generate LLM-powered exam questions from YAML books, manpages, mdbooks, tealdeer pages, and code. |
| homepage | https://github.com/graves/awful_knowledge_synthesizer |
| repository | https://github.com/graves/awful_knowledge_synthesizer |
| max_upload_size | |
| id | 1871155 |
| size | 275,111 |
A tool to generate LLM-powered exam questions from YAML books, manpages, mdbooks, and more.
_______________________________________________________
|:::::: o o o o . |..... . .. . | [45] o o o o o ::::::|
|:::::: o o o o | .. . ..... | o o o o o ::::::|
|::::::___________|__..._...__._|_________________::::::|
| # # | # # # | # # | # # # | # # | # # # | # # | # # # |
| # # | # # # | # # | # # # | # # | # # # | # # | # # # |
| # # | # # # | # # | # # # | # # | # # # | # # | # # # |
| | | | | | | | | | | | | | | | | | | | | | | | | | | | |
|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|
-Mr R J Craggs-
λ awful_knowledge_synthesizer --help
Generate final exam questions from YAML book chunks
Usage: awful_knowledge_synthesizer [OPTIONS] --input-dir <INPUT_DIR> --config <CONFIG> --source-type <SOURCE_TYPE> --output-dir <OUTPUT_DIR>
Options:
-i, --input-dir <INPUT_DIR> Path to directory of inputs
-c, --config <CONFIG> Configuration file
-s, --source-type <SOURCE_TYPE> Source type [possible values: book, manpage, mdbook, tealdeer, code]
-m, --mdbook-name <MDBOOK_NAME> mdbook project name
-o, --output-dir <OUTPUT_DIR> Path to directory to output files
-l, --language <LANGUAGE> Language of the code repository [possible values: asm, c, rust]
-p, --project-name <PROJECT_NAME> Code repo project name
-h, --help Print help
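As a rough sketch of how the options in the help output fit together, the Rust snippet below assembles an argument list for each source type. The `Source` enum and `build_args` helper are hypothetical names introduced here for illustration; they are not part of the crate. Pairing `--mdbook-name` with `mdbook` and `--language`/`--project-name` with `code` follows the option descriptions, but the help text does not say whether the tool strictly requires them.

```rust
/// Source types accepted by `--source-type`, as listed in the help output.
/// (Hypothetical type for illustration; not the crate's own API.)
#[derive(Debug, Clone, PartialEq)]
enum Source {
    Book,
    Manpage,
    Mdbook { name: String },
    Tealdeer,
    Code { language: String, project: String },
}

/// Build the command-line arguments for one run of the tool.
fn build_args(input_dir: &str, config: &str, output_dir: &str, source: &Source) -> Vec<String> {
    let source_type = match source {
        Source::Book => "book",
        Source::Manpage => "manpage",
        Source::Mdbook { .. } => "mdbook",
        Source::Tealdeer => "tealdeer",
        Source::Code { .. } => "code",
    };

    // The four options shown as mandatory in the usage line.
    let mut args: Vec<String> = vec![
        "--input-dir".into(),
        input_dir.into(),
        "--config".into(),
        config.into(),
        "--source-type".into(),
        source_type.into(),
        "--output-dir".into(),
        output_dir.into(),
    ];

    // Options that only make sense for particular source types.
    match source {
        Source::Mdbook { name } => {
            args.push("--mdbook-name".into());
            args.push(name.clone());
        }
        Source::Code { language, project } => {
            args.push("--language".into());
            args.push(language.clone());
            args.push("--project-name".into());
            args.push(project.clone());
        }
        _ => {}
    }
    args
}
```

The resulting vector can be handed to `std::process::Command::new("awful_knowledge_synthesizer").args(&args)` to launch the installed binary.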
awful_knowledge_synthesizer is a command-line tool that takes YAML files (and other text formats) containing book excerpts, manpages, or code snippets and generates exam questions for Large Language Models (LLMs).
- yaml, manpage, mdbook, tealdeer, and code.
- SQLite_questions.yaml).
- code, manpage, mdbook, book, tealdeer, or yaml sources.
- config.yaml.

This tool transforms text from various sources into exam questions using Large Language Models (LLMs). Here's a breakdown of how each input type is processed.

- GrammarLogicRhetoricMath.yaml).
- _questions.yaml files (e.g., GrammarLogicRhetoricMath_questions.yaml).

- .txt files containing macOS manpage content (e.g., 4ccconv.txt).
- .txt files and splits them into chunks.
- _questions.yaml files (e.g., 4ccconv_questions.yaml).

- cargo/ for Cargo documentation).
- .md files.
- mdbook_name_questions.yaml (e.g., Cargo_questions.yaml).

tldr Commands)
- tldr command outputs (e.g., aa.md).
- aa.md → tldr aa).
- tldr output.
- Tealdeer_questions.yaml.

- .c, .rs, or .asm.
- project_name_questions.yaml (e.g., SQLite_questions.yaml).

Input Parsing:
- .txt/.md files, or source code.
- run_for_books, run_for_manpages, etc.).
Chunking:
- tree-sitter parsers for C/Rust).
LLM Prompting:
Output:
- project_name_questions.yaml).
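To make the chunking step in the pipeline above more concrete, here is a minimal Rust sketch that greedily packs paragraphs into chunks under a character budget. This is a stand-in and not the tool's actual algorithm: the README only says that text inputs are split into chunks and that tree-sitter parsers are used for C/Rust. The function name `chunk_paragraphs` and the `max_chars` budget are assumptions made for illustration.

```rust
/// Greedily pack paragraphs (separated by blank lines) into chunks of at most
/// `max_chars` characters. A single paragraph longer than the budget becomes
/// its own oversized chunk instead of being split mid-paragraph.
/// (Illustrative only; the real tool may chunk differently.)
fn chunk_paragraphs(text: &str, max_chars: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current = String::new();

    for para in text.split("\n\n").map(str::trim).filter(|p| !p.is_empty()) {
        let para_len = para.chars().count();
        // The +2 accounts for the blank line joining paragraphs in a chunk.
        let needed = if current.is_empty() {
            para_len
        } else {
            current.chars().count() + 2 + para_len
        };

        // Close the current chunk if adding this paragraph would overflow it.
        if !current.is_empty() && needed > max_chars {
            chunks.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push_str("\n\n");
        }
        current.push_str(para);
    }

    if !current.is_empty() {
        chunks.push(current);
    }
    chunks
}
```

Each returned chunk would then be placed into a prompt template and sent to the LLM, which matches the "Processing chunk N/M" progress lines the tool prints.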
λ awful_knowledge_synthesizer --input-dir inputs/code/sqlite --config config.yaml --source-type code --language c --output-dir . --project-name "SQLite"
Reading "inputs/code/sqlite"
File: jimsh0.c
Processing chunk 1/116
Wrote to ./SQLite_questions.yaml
Processing chunk 2/116
SQLite_questions.yaml:
- prompt: "You are playing the role of a senior software engineer developing questions for a code review. Here is some source code from inputs/code/sqlite/autosetup/jimsh0.c. It is part of the SQLite project.\n\n\n\nSource Code:\n\n```c\n/* This is single source file, bootstrap version of Jim Tcl. See http://jim.tcl.tk/ */\n#define JIM_COMPAT\n#define JIM_ANSIC\n#define JIM_REGEXP\n#define HAVE_NO_AUTOCONF\n#define JIM_TINY\n#define _JIMAUTOCONF_H\n#define TCL_LIBRARY \".\"\n#define jim_ext_bootstrap\n#define jim_ext_aio\n#define jim_ext_readdir\n#define jim_ext_regexp\n#define jim_ext_file\n#define jim_ext_glob\n#define jim_ext_exec\n#define jim_ext_clock\n#define jim_ext_array\n#define jim_ext_stdlib\n#define jim_ext_tclcompat\n#if defined(_MSC_VER)\n#define TCL_PLATFORM_OS \"windows\"\n#define TCL_PLATFORM_PLATFORM \"windows\"\n#define TCL_PLATFORM_PATH_SEPARATOR \";\"\n#define HAVE_MKDIR_ONE_ARG\n#define HAVE_SYSTEM\n#elif defined(__MINGW32__)\n#define TCL_PLATFORM_OS \"mingw\"\n#define TCL_PLATFORM_PLATFORM \"windows\"\n#define TCL_PLATFORM_PATH_SEPARATOR \";\"\n#define HAVE_MKDIR_ONE_ARG\n#define HAVE_SYSTEM\n#define HAVE_SYS_TIME_H\n#define HAVE_DIRENT_H\n#define HAVE_UNISTD_H\n#define HAVE_UMASK\n#include <sys/stat.h>\n#ifndef S_IRWXG\n#define S_IRWXG 0\n#endif\n#ifndef S_IRWXO\n#define S_IRWXO 0\n#endif\n#else\n#define TCL_PLATFORM_OS \"unknown\"\n#define TCL_PLATFORM_PLATFORM \"unix\"\n#define TCL_PLATFORM_PATH_SEPARATOR \":\"\n#ifdef _MINIX\n#define vfork fork\n#define _POSIX_SOURCE\n#else\n#define _GNU_SOURCE\n#endif\n#define HAVE_FORK\n#define HAVE_WAITPID\n#define HAVE_ISATTY\n#define HAVE_MKSTEMP\n#define HAVE_LINK\n#define HAVE_SYS_TIME_H\n#define HAVE_DIRENT_H\n#define HAVE_UNISTD_H\n#define HAVE_UMASK\n#define HAVE_PIPE\n#define _FILE_OFFSET_BITS 64\n#endif\n#define JIM_VERSION 84\n#ifndef JIM_WIN32COMPAT_H\n#define JIM_WIN32COMPAT_H\n\n\n\n#ifdef __cplusplus\nextern \"C\" {\n#endif\n\n\n#if defined(_WIN32) || defined(WIN32)\n\n#define 
HAVE_DLOPEN\nvoid *dlopen(const char *path, int mode);\nint dlclose(void *handle);\nvoid *dlsym(void *handle, const char *symbol);\nchar *dlerror(void);\n\n\n#if defined(__MINGW32__)\n #define JIM_SPRINTF_DOUBLE_NEEDS_FIX\n#endif\n\n#ifdef _MSC_VER\n\n\n#if _MSC_VER >= 1000\n\t#pragma warning(disable:4146)\n#endif\n\n#include <limits.h>\n#define jim_wide _int64\n#ifndef HAVE_LONG_LONG\n#define HAVE_LONG_LONG\n#endif\n#ifndef LLONG_MAX\n\t#define LLONG_MAX 9223372036854775807I64\n#endif\n#ifndef LLONG_MIN\n\t#define LLONG_MIN (-LLONG_MAX - 1I64)\n#endif\n#define JIM_WIDE_MIN LLONG_MIN\n#define JIM_WIDE_MAX LLONG_MAX\n#define JIM_WIDE_MODIFIER \"I64d\"\n#define strcasecmp _stricmp\n#define strtoull _strtoui64\n\n#include <io.h>\n\n#include <winsock.h>\nint gettimeofday(struct timeval *tv, void *unused);\n\n#define HAVE_OPENDIR\nstruct dirent {\n\tchar *d_name;\n};\n\ntypedef struct DIR {\n\tlong handle;\n\tstruct _finddata_t info;\n\tstruct dirent result;\n\tchar *name;\n} DIR;\n\nDIR *opendir(const char *name);\nint closedir(DIR *dir);\nstruct dirent *readdir(DIR *dir);\n\n#endif\n\n#endif\n\n#ifdef __cplusplus\n}\n#endif\n\n#endif\n#ifndef UTF8_UTIL_H\n#define UTF8_UTIL_H\n\n#ifdef __cplusplus\nextern \"C\" {\n#endif\n\n\n\n#define MAX_UTF8_LEN 4\n\nint utf8_fromunicode(char *p, unsigned uc);\n\n#ifndef JIM_UTF8\n#include <ctype.h>\n\n\n#define utf8_strlen(S, B) ((B) < 0 ? 
(int)strlen(S) : (B))\n#define utf8_strwidth(S, B) utf8_strlen((S), (B))\n#define utf8_tounicode(S, CP) (*(CP) = (unsigned char)*(S), 1)\n#define utf8_getchars(CP, C) (*(CP) = (C), 1)\n#define utf8_upper(C) toupper(C)\n#define utf8_title(C) toupper(C)\n#define utf8_lower(C) tolower(C)\n#define utf8_index(C, I) (I)\n#define utf8_charlen(C) 1\n#define utf8_prev_len(S, L) 1\n#define utf8_width(C) 1\n\n#else\n\n#endif\n\n#ifdef __cplusplus\n}\n#endif\n\n#endif\n\n#ifndef __JIM__H\n#define __JIM__H\n\n#ifdef __cplusplus\nextern \"C\" {\n#endif\n\n#include <time.h>\n#include <limits.h>\n#include <stdlib.h>\n#include <stdarg.h>\n\n\n#ifndef HAVE_NO_AUTOCONF\n#endif\n\n\n\n#ifndef jim_wide\n# ifdef HAVE_LONG_LONG\n# define jim_wide long long\n# ifndef LLONG_MAX\n# define LLONG_MAX 9223372036854775807LL\n# endif\n# ifndef LLONG_MIN\n# define LLONG_MIN (-LLONG_MAX - 1LL)\n# endif\n# define JIM_WIDE_MIN LLONG_MIN\n# define JIM_WIDE_MAX LLONG_MAX\n# else\n# define jim_wide long\n# define JIM_WIDE_MIN LONG_MIN\n# define JIM_WIDE_MAX LONG_MAX\n# endif\n\n\n# ifdef HAVE_LONG_LONG\n# define JIM_WIDE_MODIFIER \"lld\"\n# else\n# define JIM_WIDE_MODIFIER \"ld\"\n# define strtoull strtoul\n# endif\n#endif\n\n#define UCHAR(c) ((unsigned char)(c))\n\n\n\n#define JIM_ABI_VERSION 101\n\n#define JIM_OK 0\n#define JIM_ERR 1\n#define JIM_RETURN 2\n#define JIM_BREAK 3\n#define JIM_CONTINUE 4\n#define JIM_SIGNAL 5\n#define JIM_EXIT 6\n\n#define JIM_EVAL 7\n\n#define JIM_MAX_CALLFRAME_DEPTH 1000\n#define JIM_MAX_EVAL_DEPTH 2000\n\n\n#define JIM_PRIV_FLAG_SHIFT 20\n\n#define JIM_NONE 0\n#define JIM_ERRMSG 1\n#define JIM_ENUM_ABBREV 2\n#define JIM_UNSHARED 4\n#define JIM_MUSTEXIST 8\n#define JIM_NORESULT 16\n\n\n#define JIM_SUBST_NOVAR 1\n#define JIM_SUBST_NOCMD 2\n#define JIM_SUBST_NOESC 4\n#define JIM_SUBST_FLAG 128\n\n\n#define JIM_CASESENS 0\n#define JIM_NOCASE 1\n#define JIM_OPT_END 2\n\n\n#define JIM_PATH_LEN 1024\n\n\n#define JIM_NOTUSED(V) ((void) V)\n\n#define JIM_LIBPATH 
\"auto_path\"\n#define JIM_INTERACTIVE \"tcl_interactive\"\n\n\ntypedef struct Jim_Stack {\n int len;\n int maxlen;\n void **vector;\n} Jim_Stack;\n```"
  codeQuestion1: What is the purpose of this code?
  codeQuestion2: How can a user initiate a new game after losing, and what system calls are involved in handling the input for this action?
  codeQuestion3: What steps are taken to handle terminal input and output settings?
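The sample above was written to SQLite_questions.yaml, named after the `--project-name` value. Going by the file names that appear in this README (GrammarLogicRhetoricMath_questions.yaml, 4ccconv_questions.yaml, Cargo_questions.yaml, Tealdeer_questions.yaml, SQLite_questions.yaml), the naming convention can be sketched in Rust as follows. The `OutputName` enum and `questions_file_name` function are hypothetical, and the rules are inferred from those examples rather than taken from the tool's source.

```rust
use std::path::Path;

/// How an output file appears to be named, judging by the README's examples.
/// (Hypothetical type for illustration; not the crate's own API.)
enum OutputName<'a> {
    /// book and manpage: one output per input file, named after the input's stem.
    FromInput(&'a Path),
    /// mdbook (`--mdbook-name`) and code (`--project-name`): named after that value.
    FromName(&'a str),
    /// tealdeer: a single file with a fixed name.
    Tealdeer,
}

/// Return the `<base>_questions.yaml` file name, or `None` if the input path
/// has no usable UTF-8 file stem.
fn questions_file_name(rule: &OutputName) -> Option<String> {
    let base = match rule {
        OutputName::FromInput(path) => path.file_stem()?.to_str()?,
        OutputName::FromName(name) => *name,
        OutputName::Tealdeer => "Tealdeer",
    };
    Some(format!("{}_questions.yaml", base))
}
```

For example, `OutputName::FromName("SQLite")` yields `SQLite_questions.yaml`, and an input path ending in `4ccconv.txt` yields `4ccconv_questions.yaml`.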
I've left all of the corpora inputs in the inputs/ directory and all of the completed question/prompt items in the complete/ directory.
complete/
├── books/
│   ├── GrammarLogicRhetoricMath/
│   │   └── SQLite_questions.yaml
├── code/
│   └── SQLite_questions.yaml
└── mdbooks/
    └── Rust_questions.yaml
api_key: your-openai-api-key
api_base: http://127.0.0.1:1234/v1
model: qwen3-4B-mlx
context_max_tokens: 32768
assistant_minimum_context_tokens: 2048
stop_words:
  - |-
    This is a sample text...
session_db_url: /path/to/aj.db
Place these in a directory like ~/Library/Application Support/com.awful-sec.aj/templates/:
templates/book_knowledge_synthesizer.yaml
templates/code_knowledge_synthesizer.yaml
templates/manpage_knowledge_synthesizer.yaml
templates/mdbook_knowledge_synthesizer.yaml
templates/tealdeer_knowledge_synthesizer.yaml
| Type | Description |
|---|---|
| yaml | Sanitized text chunks (e.g., from books). |
| manpage | Manpages or system docs (txt files). |
| mdbook | Nested markdown directories (e.g., Cargo, Rust). |
| tealdeer | Markdown files (e.g., AArch64_Assembly.md). |
| code | Code snippets (e.g., C, Rust). |
- prompt: "What is the purpose of this code?"
  answer: "To implement a database engine..."
Note: The actual questions depend on the LLM and template used. Use Awful Jade to test the results.
Install the tool:
cargo install awful_knowledge_synthesizer
Run it:
awful_knowledge_synthesizer --help
Explore the examples:
tree inputs
tree complete
Now go forth and synthesize! 🧠