Concurrent Memory Allocator Benchmarks Emery Berger http://www.cs.umass.edu/~emery This is the suite of concurrent benchmarks used in the paper "Hoard: A Scalable Memory Allocator for Multithreaded Applications". If you use these benchmarks, please cite that paper as follows: @inproceedings{Berger:2000:HSM:378993.379232, author = {Berger, Emery D. and McKinley, Kathryn S. and Blumofe, Robert D. and Wilson, Paul R.}, title = {Hoard: a scalable memory allocator for multithreaded applications}, booktitle = {Proceedings of the ninth international conference on Architectural support for programming languages and operating systems}, series = {ASPLOS-IX}, year = {2000}, isbn = {1-58113-317-0}, location = {Cambridge, Massachusetts, United States}, pages = {117--128}, numpages = {12}, url = {http://doi.acm.org/10.1145/378993.379232}, doi = {http://doi.acm.org/10.1145/378993.379232}, acmid = {379232}, publisher = {ACM}, address = {New York, NY, USA}, } Here is a brief description of the benchmarks, including usage and sample parameters. P denotes the number of processors (cores) in your system. * cache-scratch: This benchmark is referred to in the paper as "passive-false", and tests resilience against passive false sharing (see the paper for details). Parameters: % cache-scratch 1 100 8 1000000 % cache-scratch P 100 8 1000000 * cache-thrash: This benchmark is referred to in the paper as "active-false", and also tests resilience against active false sharing (see the paper for details). It uses the same parameters as cache-scratch. * larson: This benchmark is courtesy of Paul Larson at Microsoft Research. It simulates a server: each thread allocates and deallocates objects, and then transfers some objects (randomly selected) to other threads to be freed. Parameters: (from the command line) Suggested parameters: ./larson 10 7 8 1000 10000 1 P * shbench: A benchmark where each thread allocates and randomly frees a number of random-sized objects. You may download the source from: http://www.microquill.com/smartheap/shbench/bench.zip NOTE: this benchmark differs from the one used in the Hoard paper, but it is unfortunately not possible to include that code here. * threadtest: A basic benchmark that measures raw per-thread allocation throughput: each thread repeatedly allocates and then deallocates 100,000/P objects. Parameters: % threadtest P 1000 10000 0 8 Additional benchmarks not in the original Hoard paper: * linux-scalability: This benchmark from the University of Michigan also tests allocator throughput. Parameters: Example: 8 10000000 P