This program executes and monitors another program, recording its inputs and outputs using $LD_PRELOAD.
These inputs and outputs can be joined in a provenance graph.
The provenance graph tells us where a particular file came from.
The provenance graph can help us re-execute the program, containerize the program, turn it into a workflow, or tell us which version of the data the program used.
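As a rough illustration of how recorded inputs and outputs can be joined into a provenance graph, here is a minimal sketch. The record format, file names, and function names are hypothetical and chosen only for this example; they are not PROBE's actual data model.

```python
from collections import defaultdict

# Hypothetical records: (process name, files read, files written).
RECORDS = [
    ("download.sh", [], ["raw.csv"]),
    ("clean.py", ["raw.csv"], ["clean.csv"]),
    ("plot.py", ["clean.csv", "style.json"], ["figure.png"]),
]


def build_graph(records):
    """Map each node (file or process) to the nodes it directly depends on."""
    parents = defaultdict(set)
    for process, reads, writes in records:
        for path in reads:
            parents[process].add(path)  # a process depends on its inputs
        for path in writes:
            parents[path].add(process)  # an output depends on its producer
    return parents


def ancestors(parents, node):
    """Return every file and process that `node` transitively came from."""
    seen = set()
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


graph = build_graph(RECORDS)
print(sorted(ancestors(graph, "figure.png")))
# ['clean.csv', 'clean.py', 'download.sh', 'plot.py', 'raw.csv', 'style.json']
```

Walking the graph backwards from a file answers "where did this file come from?"; the same structure could in principle be walked forwards to find everything derived from a given input.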
- Install Nix with flakes. This can be done on any Linux (including Ubuntu, RedHat, Arch Linux, not just NixOS), macOS, or even Windows Subsystem for Linux. See the Determinate Nix Installer documentation for more details.
curl -fsSL https://install.determinate.systems/nix | sh -s -- install
# In container,
#curl -fsSL https://install.determinate.systems/nix | sh -s -- install linux --extra-conf "sandbox = false" --init none --no-confirm
- Log in again, or activate Nix in the current shell.
export PATH="${PATH}:/nix/var/nix/profiles/default/bin"
- Optionally, use our public binary cache to speed up the installation.
nix profile install --accept-flake-config nixpkgs#cachix
cachix use charmonium
- Install PROBE, or run PROBE without permanently installing it.
nix profile install github:charmoniumQ/PROBE
probe --help
# Or run without installing.
# Nix will install PROBE into a virtual environment that is only activated in the current shell.
#nix run github:charmoniumQ/PROBE -- [probe args go here]
# Use `nix store gc` to reclaim disk space consumed by the previous command's virtual environment.
- Now you should be able to run probe record [-f] [-o probe_log] <cmd...>. The -f flag is needed to overwrite a pre-existing probe_log.
probe record ./script.py --foo bar.txt
probe export debug-text
The simplest invocation of the probe CLI is:
probe record <CMD...>
This will run <CMD...> under the benevolent supervision of libprobe, outputting the probe record to a temporary directory. When the process exits, probe will transcribe the record directory and write a probe log file named probe_log in the current directory.
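For readers unfamiliar with $LD_PRELOAD, the following sketch shows the general way a launcher can run a command with an interposition library preloaded. It is not PROBE's own launcher; the function names and the library path in the usage note are hypothetical.

```python
import os
import subprocess


def preload_env(library_path, base_env=None):
    """Return a copy of the environment with library_path first in LD_PRELOAD."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("LD_PRELOAD", "")
    # LD_PRELOAD entries may be separated by colons (or spaces); keep any
    # previously requested libraries after ours.
    env["LD_PRELOAD"] = f"{library_path}:{existing}" if existing else library_path
    return env


def run_preloaded(argv, library_path):
    """Run argv with the given library preloaded; return the exit code."""
    return subprocess.run(argv, env=preload_env(library_path)).returncode
```

A call such as run_preloaded(["ls", "-l"], "/path/to/libinterpose.so") would start ls with the (hypothetical) library loaded ahead of the C library, which is what lets an interposition library observe the calls a program makes.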
If you run this again, it will fail with an error saying that the output file already exists. Solve this by passing -o <PATH> to specify a new file to write the log to, or by passing -f to overwrite the previous log.
probe record does not pass your command through a shell. Any subshell or environment substitutions will still be performed by your own shell before the arguments are passed to probe, but probe won't understand flow control statements like if and for, shell builtins like cd, or shell aliases/functions.
If you need these, you can either write a shell script and invoke probe record on that, or else run:
probe record bash -c '<SHELL_CODE>'
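The difference between launching a command directly and launching it through a shell can be seen without PROBE at all. The sketch below uses Python's subprocess module as a stand-in for any launcher that executes its arguments without a shell; it uses sh -c rather than bash -c only for portability, and the helper names are made up for this example.

```python
import subprocess
import sys

# A tiny child program that prints the arguments it actually received.
SHOW_ARGS = "import sys; print(sys.argv[1:])"


def run_direct(args):
    """Run the child directly (no shell), so arguments arrive verbatim."""
    result = subprocess.run(
        [sys.executable, "-c", SHOW_ARGS, *args],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


def run_in_shell(shell_code):
    """Run shell_code through `sh -c`, so the shell interprets it first."""
    result = subprocess.run(
        ["sh", "-c", shell_code],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


# No shell: "$HOME" and "&&" are passed through as plain text.
print(run_direct(["$HOME", "a && b"]))  # ['$HOME', 'a && b']

# With a shell: flow control such as `for` is understood.
print(run_in_shell('for i in 1 2 3; do printf "%s " "$i"; done'))  # 1 2 3
```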
Any flag after the first positional argument is treated as an argument to the command, not to probe.
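When building a probe record command line from a script, this ordering rule means probe's own flags have to come before the recorded command. A small sketch follows; the helper name probe_record_argv is hypothetical, and only the -f and -o flags described above are used.

```python
def probe_record_argv(cmd, output=None, overwrite=False):
    """Build an argument list for `probe record`, with probe's flags first."""
    if not cmd:
        raise ValueError("cmd must contain at least the program to run")
    argv = ["probe", "record"]
    if overwrite:
        argv.append("-f")  # overwrite an existing log
    if output is not None:
        argv += ["-o", str(output)]  # write the log to this path
    # Everything from here on belongs to the recorded command, including
    # flags such as --foo.
    return argv + list(cmd)


print(probe_record_argv(["./script.py", "--foo", "bar.txt"], overwrite=True))
# ['probe', 'record', '-f', './script.py', '--foo', 'bar.txt']
```

The resulting list can be passed to subprocess.run as-is; no shell quoting is needed because no shell is involved.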
This creates a file called probe_log. If you already have that file from a previous recording, pass -f to probe record to overwrite it.
If you get tired of typing probe record ... in front of every command you wish to record, consider recording your entire shell session:
$ probe record bash
bash$ ls -l
bash$ # do other commands
bash$ exit
$ probe dump
<dumps history for entire bash session>
That's a huge work in progress.
Try exporting to different formats.
probe export --help
- Follow the installation steps above to install Nix.
- Acquire the source code:
git clone https://github.com/charmoniumQ/PROBE && cd PROBE
- Run nix develop. This will leave you in a Nix development shell, with all the tools you need to develop and build PROBE. It is like a virtualenv, in that it is isolated from your system's pre-existing tools. In the development shell, we all have the same version of Python with all the same packages. You can exit it by typing exit.
- From within the development shell, type just compile. This compiles the Rust, C, and generated-Python components. If you hack on any of them, run just compile again before continuing.
- The manually-written Python scripts should already be added to the $PYTHONPATH. You should be able to edit them in place.
- Run probe <args...> or python -m probe_py.manual.cli <args...> to invoke the Rust or Python code respectively.
- Before submitting a PR, run just pre-commit, which will run pre-commit checks.
- libprobe: Library that implements interposition (C, Make, Python; happens to be manual and code-gen).
  - libprobe/include: Headers that will be used by the Rust wrapper to read PROBE data.
  - libprobe/src: Main C sources of libprobe.
  - libprobe/generator: Python and C-template code-generator.
  - libprobe/generated: (Generated, not committed to Git) output of code-generation.
  - libprobe/Makefile: Makefile that runs all of libprobe; run just compile-cli to invoke.
- cli-wrapper: (Cargo workspace) code that wraps libprobe.
  - cli-wrapper/cli: (Cargo crate) main CLI.
  - cli-wrapper/lib: (Cargo crate) supporting library functions.
  - cli-wrapper/macros: (Cargo crate) supporting macros; they use structs from libprobe/include to create Rust structs and Python dataclasses.
  - cli-wrapper/frontend.nix: Nix code that builds the Cargo workspace; gets included in flake.nix.
- probe_py: Python code that implements analysis of PROBE data (happens to be manual and code-gen); should be added to $PYTHONPATH by nix develop.
  - probe_py/probe_py: Main package to be imported or run.
  - probe_py/pyproject.toml: Definition of main package and dependencies.
  - probe_py/tests: Python unit tests, i.e., from probe_py import foobar; test_foobar(). Run just test-py.
  - probe_py/mypy_stubs: "Stub" files that tell Mypy how to check untyped library code. Should be added to $MYPYPATH by nix develop.
- tests: End-to-end opaque-box tests. They will be run with Pytest, but they will not test Python directly; they should always subprocess.run(["probe", ...]). Additionally, some tests have to be manually invoked.
- docs: Documentation and papers.
- benchmark: Programs and infrastructure for benchmarking.
  - benchmark/REPRODUCING.md: Read this first!
- flake.nix: Nix code that defines packages and the devshell.
- setup_devshell.sh: Helps instantiate Nix devshell.
- Justfile: "Shortcuts" for defining and running common commands (e.g., just --list).
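To make the opaque-box testing style in tests concrete, here is a minimal sketch of a check that drives probe purely through subprocess.run. The function name is hypothetical and this is not taken from the repository's test suite; it is written as a plain function so that it runs without Pytest, and it reports "skipped" when probe is not on the PATH.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def check_record_writes_log():
    """Return "skipped" if probe is not installed, else "passed" (or raise)."""
    if shutil.which("probe") is None:
        return "skipped"
    with tempfile.TemporaryDirectory() as tmp:
        log = Path(tmp) / "probe_log"
        # probe's own flags (-f, -o) precede the recorded command.
        subprocess.run(
            ["probe", "record", "-f", "-o", str(log),
             sys.executable, "-c", "print('hi')"],
            check=True,
        )
        assert log.exists(), "probe record should have written the log"
    return "passed"


print(check_record_writes_log())
```

Under Pytest, the same idea would typically be expressed as a test_ function using the tmp_path fixture and pytest.skip instead of return values.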