This repository contains implementations in C of the preferred 32-bit and 64-bit variants of Melissa O'Neill's PCG family of random number algorithms, with Daniel Lemire's nearly-divisionless algorithm for unbiased bounded random numbers, and functions for random byte strings, random floating point numbers, and random shuffles.
Type make
.
The files pcg{32,64}.[ch]
are generated. They are designed to be
self-contained so that you can copy them into your own projects.
Documentation and commentary is stripped out.
The files pcg32_xsh_rr.c
and pcg64_dxsm.c
contain the
size-specific algorithms for generating random integer and floating
point values.
The files pcg.[ch]
contain code that is generic over the bit size:
- the RNG state type
- seeding the RNG
- Lemire's algorithm
- random bytes and shuffles
The files pcg{32,64}.def
contain macros to configure the generic
code for 32 bits and 64 bits, respectively. The files pcg_blurb.[ch]
are the prefixes of the generated files.
The PCG functions are constructed from a collection of linear congruential generators and a collection of output permutations.
A linear congruential random number generator looks like:
state = state * mul + inc;
The multiplier mul
is usually fixed; the increment inc
can be
fixed, but PCG implementations usually allow it to be configured
when the RNG is seeded.
A bare LCG is a bad RNG. PCG turns an LCG into a good RNG:
- The LCG state is twice the size of the RNG output
- The LCG state is permuted to produce the RNG output
PCG is cunningly designed to use instruction-level parallelism to overlap the LCG state update and the output permutation. Different bitfields of the LCG state are used as the target of the permutation and to control the permutation.
The reference implementation of PCG in C++ allows you to mix and match LCGs and output permutations at a variety of integer sizes. My implementation provides just one 32-bit RNG and one 64-bit RNG.
The file pcg32_xsh_rr.c
contains the preferred 32-bit variant of
PCG. The output permutation's name "XSH RR" is short for "xor shift
random rotate".
In C++ PCG this variant is called
pcg_engines::setseq_xsh_rr_64_32
or simply pcg32
for short.
It is the only variant provided by O'Neill's basic C PCG.
The file pcg64_dxsm.c
contains the preferred 64-bit variant of PCG.
It is harder to find out about it on the PCG web site, though
it is increasingly popular. Melissa O'Neill describes
it as follows:
DXSM -- double xor shift multiply
This is a new, more powerful output permutation (added in 2019). It's a more comprehensive scrambling than RXS M, but runs faster on 128-bit types. Although primarily intended for use at large sizes, also works at smaller sizes as well.
O'Neill wrote a longer description of the design of pcg64_dxsm elsewhere.
As well as the DXSM output permutation, this pcg64
variant uses a
"cheap multiplier", i.e. a 64-bit value half the width of the state,
instead of a 128-bit value the same width as the state. The same
multiplier is used for the LCG and the output permutation.
In C++ PCG its full name is pcg_engines::cm_setseq_dxsm_128_64
.
(The C++ PCG typedef pcg64
still refers to the previously preferred
xsl_rr
variant.)
Random numbers up to some limit are generated by pcg32_rand()
and
pcg64_rand()
using Daniel Lemire's nearly-divisionless unbiased
rejection sampling algorithm for bounded random
numbers.
There are also pcg32_random_float()
and pcg64_random_double()
functions that generate random numbers in 0.0 <= ... < 1.0 using
shift-cast-and-multiply.
The two primary functions, pcg_random()
and pcg_rand()
automatically select inline or extern variants according to the
compiler flags. You can call _fast()
or _small()
variants to
override this choice.
The pcg_rand()
function is specialized for constant limit arguments
so that the rejection loop is truly divisionless, and even omitted
completely when the limit is a constant power of two.
Run make bytes
to build a benchmark that tests various sizes of
unrolled and vectorized pcg32. This demonstrates SIMD optimization of
a single instance of pcg32. Vectorized pcg32 produces the same
sequence of random numbers generated in the same order as normal
scalar pcg32, but much faster.
Written by Tony Finch <dot@dotat.at> in Cambridge.
Permission is hereby granted to use, copy, modify, and/or
distribute this software for any purpose with or without fee.
This software is provided 'as is', without warranty of any kind.
In no event shall the authors be liable for any damages arising
from the use of this software.
SPDX-License-Identifier: 0BSD OR MIT-0