Crates.io | resymgen |
lib.rs | resymgen |
version | 0.6.0 |
source | src |
created_at | 2022-01-29 04:07:29.293464 |
updated_at | 2024-01-04 23:33:03.611397 |
description | Generates symbol tables for reverse engineering applications from a YAML specification |
homepage | |
repository | https://github.com/UsernameFodder/pmdsky-debug |
max_upload_size | |
id | 523430 |
size | 424,280 |
resymgen, a Reverse Engineering Symbol Table Generator
resymgen is a command line utility that generates symbol tables of various formats, given YAML configuration files. The output symbol tables are meant to be directly imported into different reverse engineering tools, whereas the YAML configuration files are meant to be generic, flexible, and supportive of documentation.
resymgen is a tool meant to ease collaboration in specialized reverse engineering efforts that are subject to one or more of the following constraints:
The above constraints are particularly likely to appear in domains like video game reverse engineering. Given these constraints, sharing information can be very difficult to do well. With resymgen, a group can share address-to-symbol associations on a per-version and per-overlay basis, and in a human-readable YAML format that can double as documentation. The generic YAML format can then be transpiled into different symbol table formats for direct use with different reverse engineering tools, all without sharing the binary itself.
What resymgen does may not seem generally useful to most reverse engineering projects, and that's because it's not meant to be. If none of the aforementioned constraints apply to your reverse engineering problem, there are likely better ways to share information with collaborators than with resymgen. For example, if all collaborators standardize on one reverse engineering tool (e.g., Ghidra or IDA), it would be easier and more effective to use the tool's built-in import/export and collaboration features, especially if the binary can be legally shared in a full-project export. If the binary of interest only has one version (or only one version is cared about) and does not have overlays, it would likely be simpler to share debugging information through industry-standard formats such as PDB, DWARF, linker maps, etc.
The resymgen binary is provided with this package. Run resymgen --help for detailed usage information. Each subcommand also has its own --help flag to print detailed usage information. The following list provides an overview of resymgen's different subcommands.
gen: Generate symbol tables for specified versions and output formats, given a resymgen YAML file.
fmt: Formatter for resymgen YAML files.
check: Validator for resymgen YAML files. Provides a collection of different checks that can be run on the contents of a file to ensure correctness.
merge: Merge symbols from various structured input formats into another resymgen YAML file. This is in some sense the opposite of the gen subcommand.
resymgen YAML specification
A resymgen YAML file consists of one or more named blocks.
Each block is tagged with some metadata, including a starting memory address, a length, an optional version list, and an optional description. The address and length are allowed to be version-dependent. Each block also contains two lists of symbols, one for functions and one for data, and optionally a list of subregions.
A symbol represents one or more memory regions containing an identifiable chunk of instructions or data. Each symbol has the following fields:
A subregion represents a nested resymgen YAML file, which has one or more of its own named blocks, that is contained within the parent block. In a resymgen YAML file, a subregion is represented as a file name (note that it should not be a file path with multiple components). If the parent file has the file path /path/to/parent.yml, and one of its blocks has a subregion with the name sub.yml, then this subregion name references a corresponding subregion file with the file path /path/to/parent/sub.yml.
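The naming rule above can be sketched in Python as follows. This only illustrates the path mapping described here and is not resymgen's own code; the helper name subregion_path is hypothetical.

```python
from pathlib import PurePosixPath


def subregion_path(parent_file: str, subregion_name: str) -> PurePosixPath:
    """Map a subregion name to its file path (hypothetical helper)."""
    name = PurePosixPath(subregion_name)
    # A subregion must be a bare file name, not a path with several components.
    if len(name.parts) != 1:
        raise ValueError(f"subregion must be a file name, got {subregion_name!r}")
    parent = PurePosixPath(parent_file)
    # Subregion files live in a directory named after the parent file
    # with its extension removed.
    return parent.with_suffix("") / name


print(subregion_path("/path/to/parent.yml", "sub.yml"))  # /path/to/parent/sub.yml
```

Rejecting names with more than one path component mirrors the note that a subregion should be a file name rather than a multi-component path.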
Subregions are useful for splitting up large resymgen YAML files. If a parent file has one or more subregion files, blocks in the parent file can still contain metadata describing the region as a whole, and the parent file can be treated as an aggregate entity by resymgen subcommands.
<block1_name>:
  versions (optional):
    - <string>
    ...
  address: MaybeVersionDep[number]
  length: MaybeVersionDep[number]
  description (optional): <string>
  subregions (optional):
    - <file name>
    ...
  functions:
    - name: <string>
      aliases (optional):
        - <string>
        ...
      address: MaybeVersionDep[ScalarOrList[number]]
      length (optional): MaybeVersionDep[number]
      description (optional): <string>
    ...
  data:
    - name: <string>
      aliases (optional):
        - <string>
        ...
      address: MaybeVersionDep[ScalarOrList[number]]
      length (optional): MaybeVersionDep[number]
      description (optional): <string>
    ...
...
Assuming the following type definitions:
MaybeVersionDep[T] = <T> OR {<string>: <T>, ...}
ScalarOrList[T] = <T> OR [<T>, ...]
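As a rough illustration of how these two type definitions compose, the Python sketch below resolves a MaybeVersionDep[ScalarOrList[number]] value for one version. The function names as_list and resolve are hypothetical, and the sketch assumes that a value not keyed by version applies to every version; it is not taken from resymgen's implementation.

```python
from typing import Dict, List, Optional, Union

Number = int
ScalarOrList = Union[Number, List[Number]]
MaybeVersionDep = Union[ScalarOrList, Dict[str, ScalarOrList]]


def as_list(value: ScalarOrList) -> List[Number]:
    """Expand a ScalarOrList[number] into a list of numbers."""
    return list(value) if isinstance(value, list) else [value]


def resolve(value: MaybeVersionDep, version: str) -> Optional[List[Number]]:
    """Return the numbers that apply to `version`, or None if there are none."""
    if isinstance(value, dict):
        # Version-dependent form: {<string>: <T>, ...}
        if version not in value:
            return None
        return as_list(value[version])
    # Plain form: assumed here to apply to every version.
    return as_list(value)


# A version-dependent address with a list for v1 and a scalar for v2.
address = {"v1": [0x2002000, 0x2003000], "v2": 0x2013000}
print([hex(a) for a in resolve(address, "v1")])
print([hex(a) for a in resolve(address, "v2")])
```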
main:
  versions:
    - v1
    - v2
  address:
    v1: 0x2000000
    v2: 0x2010000
  length:
    v1: 0x100000
    v2: 0x100000
  description: The main memory region
  subregions:
    - sub1.yml
    - sub2.yml
  functions:
    - name: function1
      aliases:
        - function1_alias1
        - function1_alias2
      address:
        v1: 0x2001000
        v2: 0x2012000
      description: |-
        multi
        line
        description
    - name: function2
      address:
        v1:
          - 0x2002000
          - 0x2003000
        v2: 0x2013000
      description: simple description
  data:
    - name: SOME_DATA
      address:
        v1: 0x2000000
        v2: 0x2010000
      length:
        v1: 0x1000
        v2: 0x1600
other:
  address: 0x2400000
  length: 0x100000
  functions: []
  data:
    - name: OTHER_DATA
      address: 0x2400000
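To make the example concrete, the Python sketch below hand-copies the symbols of the main block into a dict and lists (address, name) pairs for one version. It only illustrates the data model; it is not how gen is implemented, the printed lines are not one of resymgen's real output formats, and symbols_for_version is a hypothetical name.

```python
def symbols_for_version(block: dict, version: str) -> list:
    """Collect (address, name) pairs of a block for one version, sorted by address."""
    pairs = []
    for kind in ("functions", "data"):
        for symbol in block.get(kind, []):
            address = symbol["address"]
            # MaybeVersionDep: pick this version's entry if keyed by version.
            if isinstance(address, dict):
                address = address.get(version)
            if address is None:
                continue  # symbol has no address in this version
            # ScalarOrList: one symbol may live at several addresses.
            addresses = address if isinstance(address, list) else [address]
            pairs.extend((a, symbol["name"]) for a in addresses)
    return sorted(pairs)


# Symbols of the "main" block from the example, copied by hand.
main_block = {
    "functions": [
        {"name": "function1", "address": {"v1": 0x2001000, "v2": 0x2012000}},
        {"name": "function2", "address": {"v1": [0x2002000, 0x2003000], "v2": 0x2013000}},
    ],
    "data": [
        {"name": "SOME_DATA", "address": {"v1": 0x2000000, "v2": 0x2010000}},
    ],
}

for address, name in symbols_for_version(main_block, "v1"):
    print(f"{address:#x} {name}")
```

For v1 this prints four lines: 0x2000000 SOME_DATA, 0x2001000 function1, and function2 twice (at 0x2002000 and 0x2003000), because its v1 address is a list.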
(gen)
(ImportSymbolsScript.py script)
(merge)
resymgen YAML