Merge bitcoin#25667: assumeutxo: snapshot initialization
bf95976 doc: add note about snapshot chainstate init (James O'Beirne)
e4d7995 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc924 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c36139 test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c validation: add ResetChainstates() (James O'Beirne)
3a29dfb move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff3 validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d1590 add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1 init: add utxo snapshot detection (James O'Beirne)
f9f1735 validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)

Pull request description:

  This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: bitcoin#15606)

  ---

  Half of the replacement for bitcoin#24232. The original PR grew larger than expected throughout the review process.

  This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.
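
The detection step described above can be illustrated with a minimal standalone sketch. It assumes only that the snapshot chainstate lives in a directory named `chainstate_snapshot` inside the datadir; `FindSnapshotDirSketch` is a hypothetical name and uses `std::filesystem` directly rather than Bitcoin Core's `fs` wrappers or `gArgs`.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>

namespace fs = std::filesystem;

// Hypothetical helper (not the Bitcoin Core API): return the path of the
// snapshot chainstate directory if it exists under `datadir`.
std::optional<fs::path> FindSnapshotDirSketch(const fs::path& datadir)
{
    // Assumed layout: <datadir>/chainstate_snapshot
    const fs::path candidate = datadir / "chainstate_snapshot";
    if (fs::exists(candidate)) {
        return candidate;
    }
    return std::nullopt;
}
```

A caller would initialize the fully validated chainstate first and only then consult a helper like this to decide whether a second, snapshot-based chainstate needs to be set up.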

  Don't be scared! There are some big move-only commits in here.

  Accompanying changes include:

  - moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](bitcoin#24232 (comment)).
  - adding a simple fix in `FlushBlockFile()` that avoids a crash when attempting to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
  - improving the unittest to allow testing with on-disk chainstates - necessary to test a simulated restart and re-initialization.
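
Because the directory name no longer encodes the base blockhash, the hash has to round-trip through a file inside the coinsdb dir. The sketch below shows that idea in isolation. It assumes a raw 32-byte file named `base_blockhash`; the `Hash256` alias and both function names are hypothetical stand-ins, and it uses `std::fstream` instead of `AutoFile` and the project's serialization.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>
#include <ios>
#include <optional>

namespace fs = std::filesystem;

// Stand-in for uint256: 32 raw bytes (an assumption for this sketch).
using Hash256 = std::array<std::uint8_t, 32>;

// Write the hash to <chaindir>/base_blockhash; returns false on any I/O failure.
bool WriteBaseBlockhashSketch(const fs::path& chaindir, const Hash256& hash)
{
    std::ofstream out(chaindir / "base_blockhash", std::ios::binary | std::ios::trunc);
    if (!out) return false;
    out.write(reinterpret_cast<const char*>(hash.data()),
              static_cast<std::streamsize>(hash.size()));
    out.close(); // close() sets failbit if the final flush fails
    return !out.fail();
}

// Read the hash back; returns nullopt if the file is missing or too short.
std::optional<Hash256> ReadBaseBlockhashSketch(const fs::path& chaindir)
{
    std::ifstream in(chaindir / "base_blockhash", std::ios::binary);
    if (!in) return std::nullopt;
    Hash256 hash{};
    in.read(reinterpret_cast<char*>(hash.data()),
            static_cast<std::streamsize>(hash.size()));
    if (in.gcount() != static_cast<std::streamsize>(hash.size())) return std::nullopt;
    return hash;
}
```

Checking the result of closing the file on the write path mirrors the reason this change makes `AutoFile::fclose()` return an `int`: a failed flush at close time would otherwise go unnoticed.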

ACKs for top commit:
  naumenkogs:
    utACK bf95976
  ariard:
    Code Review ACK bf95976
  ryanofsky:
    Code review ACK bf95976. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
  fjahr:
    utACK bf95976
  aureleoules:
    ACK bf95976

Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
achow101 committed Oct 13, 2022
2 parents 147d64d + bf95976 commit 6912a28
Showing 19 changed files with 662 additions and 186 deletions.
5 changes: 3 additions & 2 deletions doc/design/assumeutxo.md
@@ -76,8 +76,9 @@ original chainstate remains in use as active.

Once the snapshot chainstate is loaded and validated, it is promoted to active
chainstate and a sync to tip begins. A new chainstate directory is created in the
datadir for the snapshot chainstate called
`chainstate_[SHA256 blockhash of snapshot base block]`.
datadir for the snapshot chainstate called `chainstate_snapshot`. When this directory
is present in the datadir, the snapshot chainstate will be detected and loaded as
active on node startup (via `DetectSnapshotChainstate()`).

| | |
| ---------- | ----------- |
2 changes: 2 additions & 0 deletions src/Makefile.am
@@ -396,6 +396,7 @@ libbitcoin_node_a_SOURCES = \
node/minisketchwrapper.cpp \
node/psbt.cpp \
node/transaction.cpp \
node/utxo_snapshot.cpp \
node/validation_cache_args.cpp \
noui.cpp \
policy/fees.cpp \
@@ -902,6 +903,7 @@ libbitcoinkernel_la_SOURCES = \
node/blockstorage.cpp \
node/chainstate.cpp \
node/interface_ui.cpp \
node/utxo_snapshot.cpp \
policy/feerate.cpp \
policy/fees.cpp \
policy/packages.cpp \
2 changes: 1 addition & 1 deletion src/dbwrapper.cpp
@@ -128,7 +128,7 @@ static leveldb::Options GetOptions(size_t nCacheSize)
}

CDBWrapper::CDBWrapper(const fs::path& path, size_t nCacheSize, bool fMemory, bool fWipe, bool obfuscate)
: m_name{fs::PathToString(path.stem())}
: m_name{fs::PathToString(path.stem())}, m_path{path}, m_is_memory{fMemory}
{
penv = nullptr;
readoptions.verify_checksums = true;
18 changes: 18 additions & 0 deletions src/dbwrapper.h
@@ -39,6 +39,10 @@ class dbwrapper_error : public std::runtime_error

class CDBWrapper;

namespace dbwrapper {
using leveldb::DestroyDB;
}

/** These should be considered an implementation detail of the specific database.
*/
namespace dbwrapper_private {
@@ -219,6 +223,12 @@ class CDBWrapper

std::vector<unsigned char> CreateObfuscateKey() const;

//! path to filesystem storage
const fs::path m_path;

//! whether or not the database resides in memory
bool m_is_memory;

public:
/**
* @param[in] path Location in the filesystem where leveldb data will be stored.
@@ -268,6 +278,14 @@ class CDBWrapper
return WriteBatch(batch, fSync);
}

//! @returns filesystem path to the on-disk data.
std::optional<fs::path> StoragePath() {
if (m_is_memory) {
return {};
}
return m_path;
}

template <typename K>
bool Exists(const K& key) const
{
10 changes: 10 additions & 0 deletions src/node/blockstorage.cpp
@@ -524,6 +524,16 @@ void BlockManager::FlushUndoFile(int block_file, bool finalize)
void BlockManager::FlushBlockFile(bool fFinalize, bool finalize_undo)
{
LOCK(cs_LastBlockFile);

if (m_blockfile_info.size() < 1) {
// Return if we haven't loaded any blockfiles yet. This happens during
// chainstate init, when we call ChainstateManager::MaybeRebalanceCaches() (which
// then calls FlushStateToDisk()), resulting in a call to this function before we
// have populated `m_blockfile_info` via LoadBlockIndexDB().
return;
}
assert(static_cast<int>(m_blockfile_info.size()) > m_last_blockfile);

FlatFilePos block_pos_old(m_last_blockfile, m_blockfile_info[m_last_blockfile].nSize);
if (!BlockFileSeq().Flush(block_pos_old, fFinalize)) {
AbortNode("Flushing block file to disk failed. This is likely the result of an I/O error.");
24 changes: 21 additions & 3 deletions src/node/chainstate.cpp
@@ -48,10 +48,15 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
}

LOCK(cs_main);
chainman.InitializeChainstate(options.mempool);
chainman.m_total_coinstip_cache = cache_sizes.coins;
chainman.m_total_coinsdb_cache = cache_sizes.coins_db;

// Load the fully validated chainstate.
chainman.InitializeChainstate(options.mempool);

// Load a chain created from a UTXO snapshot, if any exist.
chainman.DetectSnapshotChainstate(options.mempool);

auto& pblocktree{chainman.m_blockman.m_block_tree_db};
// new CBlockTreeDB tries to delete the existing file, which
// fails if it's still open from the previous loop. Close it first:
@@ -98,12 +103,20 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
return {ChainstateLoadStatus::FAILURE, _("Error initializing block database")};
}

// Conservative value which is arbitrarily chosen, as it will ultimately be changed
// by a call to `chainman.MaybeRebalanceCaches()`. We just need to make sure
// that the sum of the two caches (40%) does not exceed the allowable amount
// during this temporary initialization state.
double init_cache_fraction = 0.2;

// At this point we're either in reindex or we've loaded a useful
// block tree into BlockIndex()!

for (Chainstate* chainstate : chainman.GetAll()) {
LogPrintf("Initializing chainstate %s\n", chainstate->ToString());

chainstate->InitCoinsDB(
/*cache_size_bytes=*/cache_sizes.coins_db,
/*cache_size_bytes=*/chainman.m_total_coinsdb_cache * init_cache_fraction,
/*in_memory=*/options.coins_db_in_memory,
/*should_wipe=*/options.reindex || options.reindex_chainstate);

@@ -125,7 +138,7 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
}

// The on-disk coinsdb is now in a good state, create the cache
chainstate->InitCoinsCache(cache_sizes.coins);
chainstate->InitCoinsCache(chainman.m_total_coinstip_cache * init_cache_fraction);
assert(chainstate->CanFlushToDisk());

if (!is_coinsview_empty(chainstate)) {
@@ -146,6 +159,11 @@ ChainstateLoadResult LoadChainstate(ChainstateManager& chainman, const CacheSize
};
}

// Now that chainstates are loaded and we're able to flush to
// disk, rebalance the coins caches to desired levels based
// on the condition of each chainstate.
chainman.MaybeRebalanceCaches();

return {ChainstateLoadStatus::SUCCESS, {}};
}

91 changes: 91 additions & 0 deletions src/node/utxo_snapshot.cpp
@@ -0,0 +1,91 @@
// Copyright (c) 2022 The Bitcoin Core developers
// Distributed under the MIT software license, see the accompanying
// file COPYING or http://www.opensource.org/licenses/mit-license.php.

#include <node/utxo_snapshot.h>

#include <fs.h>
#include <logging.h>
#include <streams.h>
#include <uint256.h>
#include <util/system.h>
#include <validation.h>

#include <cstdio>
#include <optional>

namespace node {

bool WriteSnapshotBaseBlockhash(Chainstate& snapshot_chainstate)
{
AssertLockHeld(::cs_main);
assert(snapshot_chainstate.m_from_snapshot_blockhash);

const std::optional<fs::path> chaindir = snapshot_chainstate.CoinsDB().StoragePath();
assert(chaindir); // Sanity check that chainstate isn't in-memory.
const fs::path write_to = *chaindir / node::SNAPSHOT_BLOCKHASH_FILENAME;

FILE* file{fsbridge::fopen(write_to, "wb")};
AutoFile afile{file};
if (afile.IsNull()) {
LogPrintf("[snapshot] failed to open base blockhash file for writing: %s\n",
fs::PathToString(write_to));
return false;
}
afile << *snapshot_chainstate.m_from_snapshot_blockhash;

if (afile.fclose() != 0) {
LogPrintf("[snapshot] failed to close base blockhash file %s after writing\n",
fs::PathToString(write_to));
return false;
}
return true;
}

std::optional<uint256> ReadSnapshotBaseBlockhash(fs::path chaindir)
{
if (!fs::exists(chaindir)) {
LogPrintf("[snapshot] cannot read base blockhash: no chainstate dir " /* Continued */
"exists at path %s\n", fs::PathToString(chaindir));
return std::nullopt;
}
const fs::path read_from = chaindir / node::SNAPSHOT_BLOCKHASH_FILENAME;
const std::string read_from_str = fs::PathToString(read_from);

if (!fs::exists(read_from)) {
LogPrintf("[snapshot] snapshot chainstate dir is malformed! no base blockhash file " /* Continued */
"exists at path %s. Try deleting %s and calling loadtxoutset again?\n",
fs::PathToString(chaindir), read_from_str);
return std::nullopt;
}

uint256 base_blockhash;
FILE* file{fsbridge::fopen(read_from, "rb")};
AutoFile afile{file};
if (afile.IsNull()) {
LogPrintf("[snapshot] failed to open base blockhash file for reading: %s\n",
read_from_str);
return std::nullopt;
}
afile >> base_blockhash;

if (std::fgetc(afile.Get()) != EOF) {
LogPrintf("[snapshot] warning: unexpected trailing data in %s\n", read_from_str);
} else if (std::ferror(afile.Get())) {
LogPrintf("[snapshot] warning: i/o error reading %s\n", read_from_str);
}
return base_blockhash;
}

std::optional<fs::path> FindSnapshotChainstateDir()
{
fs::path possible_dir =
gArgs.GetDataDirNet() / fs::u8path(strprintf("chainstate%s", SNAPSHOT_CHAINSTATE_SUFFIX));

if (fs::exists(possible_dir)) {
return possible_dir;
}
return std::nullopt;
}

} // namespace node
33 changes: 33 additions & 0 deletions src/node/utxo_snapshot.h
@@ -6,8 +6,14 @@
#ifndef BITCOIN_NODE_UTXO_SNAPSHOT_H
#define BITCOIN_NODE_UTXO_SNAPSHOT_H

#include <fs.h>
#include <uint256.h>
#include <serialize.h>
#include <validation.h>

#include <optional>

extern RecursiveMutex cs_main;

namespace node {
//! Metadata describing a serialized version of a UTXO set from which an
@@ -33,6 +39,33 @@ class SnapshotMetadata

SERIALIZE_METHODS(SnapshotMetadata, obj) { READWRITE(obj.m_base_blockhash, obj.m_coins_count); }
};

//! The file in the snapshot chainstate dir which stores the base blockhash. This is
//! needed to reconstruct snapshot chainstates on init.
//!
//! Because we only allow loading a single snapshot at a time, there will only be one
//! chainstate directory with this filename present within it.
const fs::path SNAPSHOT_BLOCKHASH_FILENAME{"base_blockhash"};

//! Write out the blockhash of the snapshot base block that was used to construct
//! this chainstate. This value is read in during subsequent initializations and
//! used to reconstruct snapshot-based chainstates.
bool WriteSnapshotBaseBlockhash(Chainstate& snapshot_chainstate)
EXCLUSIVE_LOCKS_REQUIRED(::cs_main);

//! Read the blockhash of the snapshot base block that was used to construct the
//! chainstate.
std::optional<uint256> ReadSnapshotBaseBlockhash(fs::path chaindir)
EXCLUSIVE_LOCKS_REQUIRED(::cs_main);

//! Suffix appended to the chainstate (leveldb) dir when created based upon
//! a snapshot.
constexpr std::string_view SNAPSHOT_CHAINSTATE_SUFFIX = "_snapshot";


//! Return a path to the snapshot-based chainstate dir, if one exists.
std::optional<fs::path> FindSnapshotChainstateDir();

} // namespace node

#endif // BITCOIN_NODE_UTXO_SNAPSHOT_H
6 changes: 4 additions & 2 deletions src/streams.h
@@ -487,12 +487,14 @@ class AutoFile
AutoFile(const AutoFile&) = delete;
AutoFile& operator=(const AutoFile&) = delete;

void fclose()
int fclose()
{
int retval{0};
if (file) {
::fclose(file);
retval = ::fclose(file);
file = nullptr;
}
return retval;
}

/** Get wrapped FILE* with transfer of ownership.
49 changes: 47 additions & 2 deletions src/test/util/chainstate.h
@@ -11,6 +11,7 @@
#include <node/context.h>
#include <node/utxo_snapshot.h>
#include <rpc/blockchain.h>
#include <test/util/setup_common.h>
#include <validation.h>

#include <univalue.h>
@@ -20,11 +21,24 @@ const auto NoMalleation = [](AutoFile& file, node::SnapshotMetadata& meta){};
/**
* Create and activate a UTXO snapshot, optionally providing a function to
* malleate the snapshot.
*
* If `reset_chainstate` is true, reset the original chainstate back to the genesis
* block. This allows us to simulate more realistic conditions in which a snapshot is
* loaded into an otherwise mostly-uninitialized datadir. It also allows us to test
* conditions that would otherwise cause shutdowns based on the IBD chainstate going
* past the snapshot it generated.
*/
template<typename F = decltype(NoMalleation)>
static bool
CreateAndActivateUTXOSnapshot(node::NodeContext& node, const fs::path root, F malleation = NoMalleation)
CreateAndActivateUTXOSnapshot(
TestingSetup* fixture,
F malleation = NoMalleation,
bool reset_chainstate = false,
bool in_memory_chainstate = false)
{
node::NodeContext& node = fixture->m_node;
fs::path root = fixture->m_path_root;

// Write out a snapshot to the test's tempdir.
//
int height;
@@ -47,7 +61,38 @@ CreateAndActivateUTXOSnapshot(node::NodeContext& node, const fs::path root, F ma

malleation(auto_infile, metadata);

return node.chainman->ActivateSnapshot(auto_infile, metadata, /*in_memory=*/true);
if (reset_chainstate) {
{
// What follows is code to selectively reset chainstate data without
// disturbing the existing BlockManager instance, which is needed to
// recognize the headers chain previously generated by the chainstate we're
// removing. Without those headers, we can't activate the snapshot below.
//
// This is a stripped-down version of node::LoadChainstate which
// preserves the block index.
LOCK(::cs_main);
uint256 gen_hash = node.chainman->ActiveChainstate().m_chain[0]->GetBlockHash();
node.chainman->ResetChainstates();
node.chainman->InitializeChainstate(node.mempool.get());
Chainstate& chain = node.chainman->ActiveChainstate();
Assert(chain.LoadGenesisBlock());
// These cache values will be corrected shortly in `MaybeRebalanceCaches`.
chain.InitCoinsDB(1 << 20, true, false, "");
chain.InitCoinsCache(1 << 20);
chain.CoinsTip().SetBestBlock(gen_hash);
chain.setBlockIndexCandidates.insert(node.chainman->m_blockman.LookupBlockIndex(gen_hash));
chain.LoadChainTip();
node.chainman->MaybeRebalanceCaches();
}
BlockValidationState state;
if (!node.chainman->ActiveChainstate().ActivateBestChain(state)) {
throw std::runtime_error(strprintf("ActivateBestChain failed. (%s)", state.ToString()));
}
Assert(
0 == WITH_LOCK(node.chainman->GetMutex(), return node.chainman->ActiveHeight()));
}

return node.chainman->ActivateSnapshot(auto_infile, metadata, in_memory_chainstate);
}

