Random numbers with PCG

This repository contains implementations in C of the preferred 32-bit and 64-bit variants of Melissa O'Neill's PCG family of random number algorithms, with Daniel Lemire's nearly-divisionless algorithm for unbiased bounded random numbers, and functions for random byte strings, random floating point numbers, and random shuffles.

guided tour

Type make.

The files pcg{32,64}.[ch] are generated. They are designed to be self-contained so that you can copy them into your own projects. Documentation and commentary is stripped out.

The files pcg32_xsh_rr.c and pcg64_dxsm.c contain the size-specific algorithms for generating random integer and floating point values.

The files pcg.[ch] contain code that is generic over the bit size:

the RNG state type
seeding the RNG
Lemire's algorithm
random bytes and shuffles

The files pcg{32,64}.def contain macros to configure the generic code for 32 bits and 64 bits, respectively. The files pcg_blurb.[ch] are the prefixes of the generated files.

permuted congruential generator

The PCG functions are constructed from a collection of linear congruential generators and a collection of output permutations.

A linear congruential random number generator looks like:

    state = state * mul + inc;

The multiplier mul is usually fixed; the increment inc can be fixed, but PCG implementations usually allow it to be configured when the RNG is seeded.

A bare LCG is a bad RNG. PCG turns an LCG into a good RNG:

The LCG state is twice the size of the RNG output
The LCG state is permuted to produce the RNG output

PCG is cunningly designed to use instruction-level parallelism to overlap the LCG state update and the output permutation. Different bitfields of the LCG state are used as the target of the permutation and to control the permutation.

The reference implementation of PCG in C++ allows you to mix and match LCGs and output permutations at a variety of integer sizes. My implementation provides just one 32-bit RNG and one 64-bit RNG.

pcg32

The file pcg32_xsh_rr.c contains the preferred 32-bit variant of PCG. The output permutation's name "XSH RR" is short for "xor shift random rotate".

In C++ PCG this variant is called pcg_engines::setseq_xsh_rr_64_32 or simply pcg32 for short. It is the only variant provided by O'Neill's basic C PCG.

pcg64

The file pcg64_dxsm.c contains the preferred 64-bit variant of PCG. It is harder to find out about it on the PCG web site, though it is increasingly popular. Melissa O'Neill describes it as follows:

DXSM -- double xor shift multiply

This is a new, more powerful output permutation (added in 2019). It's a more comprehensive scrambling than RXS M, but runs faster on 128-bit types. Although primarily intended for use at large sizes, also works at smaller sizes as well.

O'Neill wrote a longer description of the design of pcg64_dxsm elsewhere.

As well as the DXSM output permutation, this pcg64 variant uses a "cheap multiplier", i.e. a 64-bit value half the width of the state, instead of a 128-bit value the same width as the state. The same multiplier is used for the LCG and the output permutation.

In C++ PCG its full name is pcg_engines::cm_setseq_dxsm_128_64. (The C++ PCG typedef pcg64 still refers to the previously preferred xsl_rr variant.)

unbiased bounded random numbers

Random numbers up to some limit are generated by pcg32_rand() and pcg64_rand() using Daniel Lemire's nearly-divisionless unbiased rejection sampling algorithm for bounded random numbers.

There are also pcg32_random_float() and pcg64_random_double() functions that generate random numbers in 0.0 <= ... < 1.0 using shift-cast-and-multiply.

speed vs size

The two primary functions, pcg_random() and pcg_rand() automatically select inline or extern variants according to the compiler flags. You can call _fast() or _small() variants to override this choice.

The pcg_rand() function is specialized for constant limit arguments so that the rejection loop is truly divisionless, and even omitted completely when the limit is a constant power of two.

vectorized pcg32

Run make bytes to build a benchmark that tests various sizes of unrolled and vectorized pcg32. This demonstrates SIMD optimization of a single instance of pcg32. Vectorized pcg32 produces the same sequence of random numbers generated in the same order as normal scalar pcg32, but much faster.

license

Written by Tony Finch <dot@dotat.at> in Cambridge.

Permission is hereby granted to use, copy, modify, and/or
distribute this software for any purpose with or without fee.

This software is provided 'as is', without warranty of any kind.
In no event shall the authors be liable for any damages arising
from the use of this software.

SPDX-License-Identifier: 0BSD OR MIT-0

Name		Name	Last commit message	Last commit date
Latest commit History 76 Commits
.clang-format		.clang-format
.gitignore		.gitignore
Makefile		Makefile
README.md		README.md
bytes.c		bytes.c
pcg.c		pcg.c
pcg.h		pcg.h
pcg32.def		pcg32.def
pcg32_xsh_rr.c		pcg32_xsh_rr.c
pcg64.def		pcg64.def
pcg64_dxsm.c		pcg64_dxsm.c
pcg_blurb.c		pcg_blurb.c
pcg_blurb.h		pcg_blurb.h
test.c		test.c

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Random numbers with PCG

guided tour

permuted congruential generator

pcg32

pcg64

unbiased bounded random numbers

speed vs size

vectorized pcg32

license

About

Releases

Packages

Languages

fanf2/pcg-dxsm

Folders and files

Latest commit

History

Repository files navigation

Random numbers with PCG

guided tour

permuted congruential generator

pcg32

pcg64

unbiased bounded random numbers

speed vs size

vectorized pcg32

license

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages