6b605b91c1 [fuzz] Add MiniMiner target + diff fuzz against BlockAssembler (glozow)
3f3f2d59ea [unit test] GatherClusters and MiniMiner unit tests (glozow)
59afcc8354 Implement Mini version of BlockAssembler to calculate mining scores (glozow)
56484f0fdc [mempool] find connected mempool entries with GatherClusters(…) (glozow)
Pull request description:
Implement Mini version of BlockAssembler to calculate mining scores
Run the mining algorithm on a subset of the mempool, only disturbing the
mempool to copy out fee information for relevant entries. Intended to be
used by wallet to calculate amounts needed for fee-bumping unconfirmed
transactions.
From comments of sipa and glozow below:
> > In what way does the code added here differ from the real block assembly code?
>
> * Only operates on the relevant transactions rather than full mempool
> * Has the ability to remove transactions that will be replaced so they don't impact their ancestors
> * Does not hold mempool lock outside of the constructor, makes copies of the entries it needs instead (though I'm not sure if this has an effect in practice)
> * Doesn't do the sanity checks like keeping weight within max block weight and `IsFinalTx()`
> * After the block template is built, additionally calculates fees to bump remaining ancestor packages to target feerate
ACKs for top commit:
achow101:
ACK 6b605b91c1
Xekyo:
> ACK [6b605b9](6b605b91c1) modulo `miniminer_overlap` test.
furszy:
ACK 6b605b91 modulo `miniminer_overlap` test.
theStack:
Code-review ACK 6b605b91c1
Tree-SHA512: f86a8b4ae0506858a7b15d90f417ebceea5038b395c05c825e3796123ad3b6cb8a98ebb948521316802a4c6d60ebd7041093356b1e2c2922a06b3b96b3b8acb6
This is achieved by letting the index sync thread wait until
reindex-chainstate is finished.
This also disables the pruning check when reindexing the chainstate (which is
incompatible with prune mode) because there would be no chain at this point
in init.
Add a stop_after_block_import field to the BlockManager options. Use
this field instead of the global gArgs.
This should allow users of the BlockManager to not rely on the global
Args.
Add a blocks_dir field to the BlockManager options. Move functions
relying on the global gArgs to get the blocks_dir into the BlockManager
class.
This should eventually allow users of the BlockManager to not rely on
the global Args and instead pass in their own options.
Remove access to the global gArgs for the fastprune argument and
replace it by adding a field to the existing BlockManager Options
struct.
When running `clang-tidy-diff` on this commit, there is a diagnostic
error: `unknown type name 'uint64_t' [clang-diagnostic-error] uint64_t
prune_target{0};`, which is fixed by including cstdint.
This should eventually allow users of the BlockManager to not rely on
the global gArgs and instead pass in their own options.
This is a commit in preparation for the next few commits. The functions
are moved to methods to avoid their re-declaration for the purpose of
passing in BlockManager options.
The functions that were now moved into the BlockManager should no longer
use the params as an argument, but instead use the member variable.
In the moved ReadBlockFromDisk and UndoReadFromDisk, change
the function signature to accept a reference to a CBlockIndex instead of
a raw pointer. The pointer is expected to be non-null, so reflect that
in the type.
To allow for the move of functions to BlockManager methods all call
sites require an instantiated BlockManager, or a callback to one.
This commit effectively moves the definition of these constants
out of the chainparamsbase to their own file.
Using the ChainType enums provides better type safety compared to
passing around strings.
The commit is part of an ongoing effort to decouple the libbitcoinkernel
library from the ArgsManager and other functionality that should not be
part of the kernel library.
8f14fc8622 test: cover fastprune with excessive block size (Matthew Zipkin)
271c23e87f blockstorage: Adjust fastprune limit if block exceeds blockfile size (Martin Zumsande)
Pull request description:
The debug-only `-fastprune` option used in several tests is not always safe to use:
If a `-fastprune` node receives a block larger than the maximum blockfile size of `64kb` bad things happen: The while loop in `BlockManager::FindBlockPos` never terminates, and the node runs oom because memory for `m_blockfile_info` is allocated in each iteration of the loop.
The same would happen if a naive user used `-fastprune` on anything other than regtest (so this can be tested by syncing on signet for example, the first block that crashes the node is at height 2232).
Change the approach by raising the blockfile size to the size of the block, if that block otherwise wouldn't fit (idea by TheCharlatan).
ACKs for top commit:
ryanofsky:
Code review ACK 8f14fc8622. Added new assert, test, and comment since last review
TheCharlatan:
ACK 8f14fc8622
pinheadmz:
ACK 8f14fc8622
Tree-SHA512: df2fea30613ef9d40ebbc2416eacb574f6d7d96847db5c33dda22a29a2c61a8db831aa9552734ea4477e097f253dbcb6dcb1395d43d2a090cc0588c9ce66eac3
If the added block exceeds the blockfile size in test-only
-fastprune mode, the node would get stuck in an infinite loop and
run out of memory.
Avoid this by raising the blockfile size to the size of the added block
in this situation.
Co-authored-by: TheCharlatan <seb.kung@gmail.com>
This is an extraction of ArgsManager related functions from util/system
into their own common file.
Config file related functions are moved to common/config.cpp.
The background of this commit is an ongoing effort to decouple the
libbitcoinkernel library from the ArgsManager. The ArgsManager belongs
into the common library, since the kernel library should not depend on
it. See doc/design/libraries.md for more information on this rationale.
00e9b97f37 refactor: Move fs.* to util/fs.* (TheCharlatan)
106b46d9d2 Add missing fs.h includes (TheCharlatan)
b202b3dd63 Add missing cstddef include in assumptions.h (TheCharlatan)
18fb36367a refactor: Extract util/fs_helpers from util/system (Ben Woosley)
Pull request description:
This pull request is part of the `libbitcoinkernel` project https://github.com/bitcoin/bitcoin/issues/24303https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". This commit was originally authored by empact and is taken from its parent PR #25152.
#### Context
There is an ongoing effort to decouple the `ArgsManager` used for command line parsing user-provided arguments from the libbitcoinkernel library (https://github.com/bitcoin/bitcoin/pull/25290, https://github.com/bitcoin/bitcoin/pull/25487, https://github.com/bitcoin/bitcoin/pull/25527, https://github.com/bitcoin/bitcoin/pull/25862, https://github.com/bitcoin/bitcoin/pull/26177, and https://github.com/bitcoin/bitcoin/pull/27125). The `ArgsManager` is defined in `system.h`. A similar pull request extracting functionality from `system.h` has been merged in https://github.com/bitcoin/bitcoin/pull/27238.
#### Changes
Next to providing better code organization, this PR removes some reliance of the tree of libbitcoinkernel header includes on `system.h` (and thus the `ArgsManager` definition) by moving filesystem related functions out of the `system.*` files.
There is already a pair of `fs.h` / `fs.cpp` in the top-level `src/` directory. They were not combined with the files introduced here, to keep the patch cleaner and more importantly because they are often included without the utility functions. The new files are therefore named `fs_helpers` and the existing `fs` files are moved into the util directory.
Further commits splitting more functionality out of `system.h` are still in #25152 and will be submitted in separate PRs once this PR has been processed.
ACKs for top commit:
hebasto:
ACK 00e9b97f37
Tree-SHA512: 31422f148d14ba3c843b99b1550a6fd77c77f350905ca324f93d4f97b652246bc58fa9696c64d1201979cf88733e40be02d262739bb7d417cf22bf506fdb7666
Rewrite the same algo instead of reusing BlockAssembler because we have
a few extra requirements that would make the changes invasive and
difficult to review:
- Only operate on the relevant transactions rather than full mempool
- Remove transactions that will be replaced so they can't bump their ancestors
- Don't hold mempool lock outside of the constructor
- Skip things like max block weight and IsFinalTx
- Additionally calculate fees to bump remaining ancestor packages to target feerate
Co-authored-by: Murch <murch@murch.one>
The fs.* files are already part of the libbitcoin_util library. With the
introduction of the fs_helpers.* it makes sense to move fs.* into the
util/ directory as well.
b3e78dc91d refactor: Don't use global chainparams in chainstatemanager method (TheCharlatan)
382b692a50 Split non/kernel chainparams (Carl Dong)
edabbc78a3 Add factory functions for Main/Test/Sig/Reg chainparams (Carl Dong)
d938098398 Remove UpdateVersionBitsParameters (Carl Dong)
84b85786f0 Decouple RegTestChainParams from ArgsManager (Carl Dong)
76cd4e7c96 Decouple SigNetChainParams from ArgsManager (Carl Dong)
Pull request description:
This pull request is part of the `libbitcoinkernel` project https://github.com/bitcoin/bitcoin/issues/24303https://github.com/bitcoin/bitcoin/projects/18 and more specifically its "Step 2: Decouple most non-consensus code from libbitcoinkernel". dongcarl is the original author of this patchset, these commits were taken from https://github.com/dongcarl/bitcoin/tree/2022-03-libbitcoinkernel-chainparams-args-only.
#### Context
The bitcoin kernel library currently relies on code containing user configurations through the `ArgsManager`. This is not optimal, since as a stand-alone library it should not rely on bitcoind's argument parsing logic. Instead, its interfaces should accept control and options structs that control the kernel library's desired configuration.
Similar work towards decoupling the `ArgsManager` from the kernel has been done in
https://github.com/bitcoin/bitcoin/pull/25290, https://github.com/bitcoin/bitcoin/pull/25487, https://github.com/bitcoin/bitcoin/pull/25527 and https://github.com/bitcoin/bitcoin/pull/25862.
#### Changes
By moving the `CChainParams` class definition into the kernel and giving it new factory functions `CChainParams::{RegTest,SigNet,Main,TestNet}`it can be constructed without an `ArgsManager` reference, unlike the current factory function `CreateChainParams`.
The first few commits remove uses of `ArgsManager` within `CChainParams`. Then the `CChainParams` definition is moved to a new file in the `kernel/` subdirectory.
ACKs for top commit:
MarcoFalke:
re-ACK b3e78dc91d🛁
ryanofsky:
Code review ACK b3e78dc91d. Only changes since last review were recent review suggestions.
ajtowns:
ACK b3e78dc91d
Tree-SHA512: 3835aca1d3e3c75cc3303dd584bab3a77e58f6c678724a5e359fe4b0e17e0763a00931ee6191f516b9fde50496f59cc691f0709c0254206db3863bbf7ab2cacd
Moves chainparams code not using the ArgsManager to the kernel.
Subsequently use the kernel chainparams header now where possible in
order to further decouple chainparams call sites from gArgs.
error is a low-level function with a sole dependency on LogPrintf, which
is defined in logging.h
The background of this commit is an ongoing effort to decouple the
libbitcoinkernel library from the ArgsManager defined in system.h.
Moving the function out of system.h allows including it from a separate
source file without including the ArgsManager definitions from system.h.
fa1b4e5c32 Use steady clock in FlushStateToDisk (MarcoFalke)
1111e2f8b4 Use steady clock in SeedStrengthen and FindBestImplementation (MarcoFalke)
Pull request description:
There may be a theoretical deadlock for the duration of the offset when the system clock is adjusted into a past time while executing `SeedStrengthen`.
Fix this by using steady clock.
Do the same in `FindBestImplementation`, which shouldn't be affected, because it discards outlier measurements. However, doing the same there for consistency seems fine.
Do the same in `FlushStateToDisk`, which should make the flushes more steady, if the system clock is adjusted by a large offset.
ACKs for top commit:
john-moffett:
ACK fa1b4e5c32
willcl-ark:
ACK fa1b4e5c3
Tree-SHA512: cc625e796b186accd53222bd64eb57d0512bc7e588312d254349b542bbc5e5daac348ff2b3b3f7dc5ae0bbbae2ec11fdbf3022cf2164211633765a4b0108e83e
2b373fe49d docs: update assumeutxo.md (James O'Beirne)
87a1108c81 test: add snapshot completion unittests (James O'Beirne)
d70919a88f refactor: make MempoolMutex() public (James O'Beirne)
7300ced9de log: add LoadBlockIndex() message for assumedvalid blocks (James O'Beirne)
d96c59cc5c validation: add ChainMan logic for completing UTXO snapshot validation (James O'Beirne)
f2a4f3376f move-only-ish: init: factor out chainstate initialization (James O'Beirne)
637a90b973 add Chainstate::HasCoinsViews() (James O'Beirne)
c29f26b47b validation: add CChainState::m_disabled and ChainMan::isUsable (James O'Beirne)
5ee22cdafd add ChainstateManager.GetSnapshot{BaseHeight,BaseBlock}() (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)
Part two of replacing https://github.com/bitcoin/bitcoin/pull/24232.
---
When a user activates a snapshot, the serialized UTXO set data is used to create an "assumed-valid" chainstate, which becomes active in an attempt to get the node to network tip as quickly as possible. Simultaneously in the background, the already-existing chainstate continues "conventional" IBD to both accumulate full block data and serve as a belt-and-suspenders to validate the assumed-valid chainstate.
Once the background chainstate's tip reaches the base block of the snapshot used, we set `m_stop_use` on that chainstate and immediately take the hash of its UTXO set; we verify that this matches the assumeutxo value in the source code. Note that while we ultimately want to remove this background chainstate, we don't do so until the following initialization process, when we again check the UTXO set hash of the background chainstate, and if it continues to match, we remove the (now unnecessary) background chainstate, and move the (previously) assumed-valid chainstate into its place. We then reinitialize the chainstate in the normal way.
As noted in previous comments, we could do the filesystem operations "inline" immediately when the background validation completes, but that's basically just an optimization that saves disk space until the next restart. It didn't strike me as worth the risk of moving chainstate data around on disk during runtime of the node, though maybe my concerns are overblown.
The final result of this completion process is a fully-validated chain, where the only evidence that the user synced using assumeutxo is the existence of a `base_blockhash` file in the `chainstate` directory.
ACKs for top commit:
achow101:
ACK 2b373fe49d
Tree-SHA512: a204e1d6e6932dd83c799af3606b01a9faf893f04e9ee1a36d63f2f1ccfa9118bdc1c107d86976aa0312814267e6a42074bf3e2bf1dead4b2513efc6d955e13d
Trigger completion when a background validation chainstate reaches the
same height as a UTXO snapshot, and handle cleaning up the chainstate
on subsequent startup.
This is only used in the current PR to avoid ugly
`strprintf(Untranslated("%s:\n%s"), str, MakeUnorderedList(details)`
boilerplate in init code. But in the future the function could be extended and
more widely used to include more details in GUI error messages or display them
in a more readable way, see code comment.
3141eab9c6 test: add functional test for ScanAndUnlinkAlreadyPrunedFiles (Andrew Toth)
e252909e56 test: add unit test for ScanAndUnlinkAlreadyPrunedFiles (Andrew Toth)
77557dda4a prune: scan and unlink already pruned block files on startup (Andrew Toth)
Pull request description:
There are a few cases where we can mark a block and undo file as pruned in our block index, but not actually remove the files from disk.
1. If we call `FindFilesToPrune` or `FindFilesToPruneManual` and crash before `UnlinkPrunedFiles`.
2. If on Windows there is an open file handle to the file somewhere else when calling `fs::remove` in `UnlinkPrunedFiles` (https://en.cppreference.com/w/cpp/filesystem/remove, https://learn.microsoft.com/en-us/windows/win32/api/fileapi/nf-fileapi-deletefilew#remarks). This could be from another process, or if we are calling `ReadBlockFromDisk`/`ReadRawBlockFromDisk` without having a lock on `cs_main` (which has been allowed since ccd8ef65f9).
This PR mitigates this by scanning all pruned block files on startup after `LoadBlockIndexDB` and unlinking them again.
ACKs for top commit:
achow101:
ACK 3141eab9c6
pablomartin4btc:
re-ACK with added functional test 3141eab9c6.
furszy:
Code review ACK 3141eab9
theStack:
Code-review ACK 3141eab9c6
Tree-SHA512: 6c73bc57838ad1b7e5d441af3c4d6bf4c61c4382e2b86485e57fbb74a61240710c0ceeceb8b4834e610ecfa3175c6955c81ea4b2285fee11ca6383f472979d8d
0af16e7134 doc: add release note for #25574 (Martin Zumsande)
57ef2a4812 validation: report if pruning prevents completion of verification (Martin Zumsande)
0c7785bb25 init, validation: Improve handling if VerifyDB() fails due to insufficient dbcache (Martin Zumsande)
d6f781f1cf validation: return VerifyDBResult::INTERRUPTED if verification was interrupted (Martin Zumsande)
6360b5302d validation: Change return value of VerifyDB to enum type (Martin Zumsande)
Pull request description:
`VerifyDB()` can fail to complete due to insufficient dbcache at the level 3 checks. This PR improves the error handling in this case in the following ways:
- The rpc `-verifychain` now returns false if the check can't be completed due to insufficient cache
- During init, we only log a warning if the default values for `-checkblocks` and `-checklevel` are taken and the check doesn't complete. However, if the user actively specifies one of these args, we return with an InitError if we can't complete the check.
This PR also changes `-verifychain` RPC to return `false` if the verification didn't finish due to missing block data (pruning) or due to being interrupted by the node being shutdown.
Previously, this PR also included a fix for a possible assert during verification - this was done in #27009 (now merged).
ACKs for top commit:
achow101:
ACK 0af16e7134
ryanofsky:
Code review ACK 0af16e7134. Only small suggested changes since the last review, like renaming some of the enum values. I did leave more suggestions, but they are not very important and could be followups
john-moffett:
ACK 0af16e7134
MarcoFalke:
lgtm re-ACK 0af16e7134 🎚
Tree-SHA512: 84b4f767cf9bfbafef362312757c9bf765b41ae3977f4ece840e40c52a2266b1457832df0cdf70440be0aac2168d9b58fc817238630b0b6812f3836ca950bc0e
Moves chainstate initialization into its own function. This is
necessary to later support a more readable way of handling
background-validation chainstate cleanup during init, since the
chainstate initialization functions may need to be repeated after
moving leveldb filesystem content around.
This commit isn't strictly necessary, but the alternative is to (ab)use
the `while` loop in init.cpp with a `continue` on the basis of a
specific ChainstateLoadingError return value from LoadChainstate. Not
only is this harder to read, but it can't be unittested.
The approach here lets us consolidate background-validation cleanup to
LoadChainstate, and therefore exercise it within tests.
This commit is most easily reviewed with
git diff --color-moved=dimmed-zebra
--color-moved-ws=ignore-space-change
6a5e88e5cf miner: don't re-apply default Options value if argument is unset (stickies-v)
ea72c3d9d5 refactor: avoid duplicating BlockAssembler::Options members (stickies-v)
cba749a9b7 refactor: rename local gArgs to args (stickies-v)
Pull request description:
Two follow-ups for #26695, both refactoring and no observed (*) behaviour change:
- Rename `gArgs` to `args` because it's not actually a global
- Add `BlockAssembler::Options` as a (private) member to `BlockAssembler` to avoid having to assign all the options individually, essentially duplicating them
Reduces LoC and makes the code more readable, in my opinion.
---
(*) as [pointed out by ajtowns](https://github.com/bitcoin/bitcoin/pull/26883#discussion_r1068247937), this PR changes the interface of `ApplyArgsManOptions()`, making this not a pure refactoring PR. In practice, `ApplyArgsManOptions()` is never called in such a way that this leads to observed behaviour change. Regardless, I've carved out the potential behaviour change into a separate commit and would be okay with dropping it, should it turn out to be controversial.
ACKs for top commit:
glozow:
ACK 6a5e88e5cf
TheCharlatan:
Light code review ACK 6a5e88e5cf
Tree-SHA512: 15c30442ff0e070b1a58dc4c9615550d619ce35b4a2596b2c0a9d790259bbf987cab708f7cbb1057a8cf8b4c3226f3ad981282d3499ac442094806492a5f68ce
The rpc command verifychain now fails if the dbcache was not sufficient
to complete the verification at the specified level and depth.
In the same situation, the VerifyDB check during Init will now fail (and lead to
an early shutdown) if the user has explicitly specified -checkblocks or
-checklevel but the check couldn't be executed because of the limited
cache. If the user didn't change any of the two and is using the defaults, log a warning
but don't prevent the node from starting up.
This does not change behavior. It is in preparation for
special handling of the case where VerifyDB doesn't finish
for various reasons, but doesn't fail.
Use DBParams struct to remove ArgsManager uses from txdb.
To reduce size of this commit, this moves references to gArgs variable out of
txdb.cpp to calling code in chainstate.cpp. But these moves are temporary. The
gArgs references in chainstate.cpp are moved out to calling code in init.cpp in
later commits.
This commit does not change behavior.
Add CoinsViewOptions struct to remove ArgsManager uses from txdb.
To reduce size of this commit, this moves references to gArgs variable out of
txdb.cpp to calling code in validation.cpp. But these moves are temporary. The
gArgs references in validation.cpp are moved out to calling code in init.cpp in
later commits.
This commit does not change behavior.
Add DBParams and DBOptions structs to remove ArgsManager uses from dbwrapper.
To reduce size of this commit, this moves references to gArgs variable out of
dbwrapper.cpp to calling code in txdb.cpp. But these moves are temporary. The
gArgs references in txdb.cpp are moved out to calling code in init.cpp in later
commits.
This commit does not change behavior.
faba08b5b4 refactor: Remove stray cs_main redundant declaration (MarcoFalke)
fa02591edf doc: Export threadsafety.h from sync.h (MarcoFalke)
Pull request description:
Looks like this was forgotten when introducing kernel/cs_main ?
Also, there is a commit to export threadsafety.h from sync.h.
ACKs for top commit:
hebasto:
ACK faba08b5b4
Tree-SHA512: 0aa58e7693b6fcd504f9da7339f8baa463a6407f67b27f68002db705f4642321ac3765f16c3d906c925ee24085591b79160a62fa5f4aaf6f2e5dcc788411800d
eeee61065f Use AutoFile and HashVerifier where possible (MarcoFalke)
fa961141f7 Add HashVerifier (MarcoFalke)
Pull request description:
This was done in the context of https://github.com/bitcoin/bitcoin/pull/25284 , but I think it also makes sense standalone.
The basic idea is that serialization type should not be initialized when it is not needed. Same for the serialization version.
So do this here for `AutoFile` and `HashVerifier`. `CAutoFile` and `CHashVerifier` remain in places where it is not yet possible.
ACKs for top commit:
stickies-v:
ACK eeee61065f
Tree-SHA512: 93786778c309ecfdc1ed43552d24ff9d966954d69a47f66faaa6de24daacd25c651f3f62bde5abbb362700298fb3c04ffbd3207a0dd13d0bd8bff7fd6d07dcf8
ApplyArgsManOptions does not need to set default values for missing
arguments, these are already defined in the BlockAssembler::Options.
This commit changes the interface of ApplyArgsManOptions(). If
ApplyArgsManOptions() is called again after a option is changed,
this option will no longer be reset to the default value.
There is no observed behaviour change due to how
ApplyArgsManOptions() is currently used, and the new interface is
consistent with e.g. ValidationCacheSizes and MemPoolLimits.
Add Options as a member to BlockAssembler to avoid having to assign
all the options individually.
Additionally brings the struct more in line with how we typically
define default and ArgManager values, as e.g. with
ChainstateManager::Options and and CTxMemPool::Options
282019cd3d refactor: add kernel/cs_main.* (fanquake)
Pull request description:
One place to find / include `cs_main`.
No more:
> // Actually declared in validation.cpp; can't include because of circular dependency.
> extern RecursiveMutex cs_main;
Ultimately, no more need to include `validation.h` (which also includes (heavy/boost filled) `txmempool.h`) everywhere for `cs_main`. See #26087 for another example of why that is useful.
ACKs for top commit:
ajtowns:
ACK 282019cd3d
Tree-SHA512: 142835b794873e7a09c3246d6101843ae81ec0c6295e6873130c98a2abfa5f7282748d0f1a37237a779cc71c3bc0a75d03b20313ef5398c83d4814215cbc8287
04528054fc [bench] BlockAssembler with mempool packages (glozow)
6ce265acf4 [test util] lock cs_main before pool.cs in PopulateMempool (glozow)
8791410662 [test util] randomize fee in PopulateMempool (glozow)
cba5934eb6 [miner] allow bypassing TestBlockValidity (glozow)
c058852308 [refactor] parameterize BlockAssembler::Options in PrepareBlock (glozow)
a2de971ba1 [refactor] add helper to apply ArgsManager to BlockAssembler::Options (glozow)
Pull request description:
Performance of block template building matters as miners likely want to be able to start mining on a block with transactions asap after a block is found. We would want to know if a mempool PR accidentally caused, for example, a 100x slowdown. An `AssembleBlock()` bench exists, but it operates on a mempool with 101 transactions, each with 0 ancestors or descendants and with the same fee. Adding a bench with a more complex mempool is useful because (1) it's more realistic (2) updating packages can potentially cause the algorithm to take a long time.
ACKs for top commit:
kevkevinpal:
Tested ACK [0452805](04528054fc)
achow101:
ACK 04528054fc
stickies-v:
ACK 04528054f
Tree-SHA512: 38c138d6a75616651f9b1faf4e3a1cd833437a486f4e84308fbee958e8462bb570582c88f7ba7ab99d80191e97855ac2cf27c43cc21585d3e4b0e227effe2fb5
55696a0ac3 wallet: remove `mempool_sequence` from `transactionRemovedFromMempool` (w0xlt)
bf19069c53 wallet: remove `mempool_sequence` from `transactionAddedToMempool` (w0xlt)
Pull request description:
This PR removes `mempool_sequence` from `transactionRemovedFromMempool` and `transactionAddedToMempool`.
`mempool_sequence` is not used in these methods, only in ZMQ notifications.
ACKs for top commit:
instagibbs:
ACK 55696a0ac3
Tree-SHA512: 621e89230bcb6edfed83e2758601a2b093822fc2dc4e9bfb00487e340f2bc4c5ac3bf6df3ca00b7fe55bb3df15858820f2bf698f403d2e48b915dd9eb47b63e0
47c4b1f52a mempool: log/halt when CalculateMemPoolAncestors fails unexpectedly (stickies-v)
5481f65849 mempool: add AssumeCalculateMemPoolAncestors helper function (stickies-v)
f911bdfff9 mempool: use util::Result for CalculateMemPoolAncestors (stickies-v)
66e028f739 mempool: use util::Result for CalculateAncestorsAndCheckLimits (stickies-v)
Pull request description:
Upon reviewing the documentation for `CTxMemPool::CalculateMemPoolAncestors`, I noticed `setAncestors` was meant to be an `out` parameter but actually is an `in,out` parameter, as can be observed by adding `assert(setAncestors.empty());` as the first line in the function and running `make check`. This PR fixes this unexpected behaviour and introduces refactoring improvements to make intents and effects of the code more clear.
## Unexpected behaviour
This behaviour occurs only in the package acceptance path, currently only triggered by `testmempoolaccept` and `submitpackage` RPCs.
In `MemPoolAccept::AcceptMultipleTransactions()`, we first call `PreChecks()` and then `SubmitPackage()` with the same `Workspace ws` reference. `PreChecks` leaves `ws.m_ancestors` in a potentially non-empty state, before it is passed on to `MemPoolAccept::SubmitPackage`. `SubmitPackage` is the only place where `setAncestors` isn't guaranteed to be empty before calling `CalculateMemPoolAncestors`. The most straightforward fix is to just forcefully clear `setAncestors` at the beginning of CalculateMemPoolAncestors, which is done in the first bugfix commit.
## Improvements
### Return value instead of out-parameters
This PR updates the function signatures for `CTxMemPool::CalculateMemPoolAncestors` and `CTxMemPool::CalculateAncestorsAndCheckLimits` to use a `util::Result` return type and eliminate both the `setAncestors` `in,out`-parameter as well as the error string. It simplifies the code and makes the intent and effects more explicit.
### Observability
There are 7 instances where we currently call `CalculateMemPoolAncestors` without actually checking if the function succeeded because we assume that it can't fail, such as in [miner.cpp](69b10212ea/src/node/miner.cpp (L399)). This PR adds a new wrapper `AssumeCalculateMemPoolAncestors` function that logs such unexpected failures, or in case of debug builds even halts the program. It's not crucial to the objective, more of an observability improvement that seems sensible to add on here.
ACKs for top commit:
achow101:
ACK 47c4b1f52a
w0xlt:
ACK 47c4b1f52a
glozow:
ACK 47c4b1f52a
furszy:
light code review ACK 47c4b1f5
aureleoules:
ACK 47c4b1f52a
Tree-SHA512: d908dad00d1a5645eb865c4877cc0bae74b9cd3332a3641eb4a285431aef119f9fc78172d38b55c592168a73dae83242e6af3348815f7b37cbe2d448a3a58648
Allows us to test BlockAssembler on transactions without signatures or
mature coinbases (which is what PopulateMempool creates). Also means
that `TestBlockValidity()` is not included in the bench timing.
This allows us to both manually manipulate options and grab values from
ArgsManager (i.e. -blockmaxweight and -blockmintxfee config options)
when constructing BlockAssembler::Options. Prior to this change, the
only way to apply the config options is by ctoring BlockAssembler with
no options, which calls DefaultOptions().
When CalculateMemPoolAncestors fails unexpectedly (e.g. it exceeds
ancestor/descendant limits even though we expect no limits to be applied),
add an error log entry for increased visibility. For debug builds,
the application will even halt completely since this is not supposed
to happen.
b19c4124b3 refactor: Rename ambiguous interfaces::MakeHandler functions (Ryan Ofsky)
dd6e8bd71c build: remove BOOST_CPPFLAGS from libbitcoin_util (fanquake)
82e272a109 refactor: Move src/interfaces/*.cpp files to libbitcoin_common.a (Ryan Ofsky)
Pull request description:
These belong in `libbitcoin_common.a`, not `libbitcoin_util.a`, because they aren't general-purpose utilities, they just contain some common glue code that is used by both the node and the wallet. Another reason not to include these in `libbitcoin_util.a` is to prevent them from being used by the kernel library.
Also rename ambiguous `MakeHandler` functions to `MakeCleanupHandler` and `MakeSignalHandler`. Cleanup function handler was introduced after boost signals handler, so original naming didn't make much sense.
This just contains a move-only commit, and a rename commit. There are no actual code or behavior changes.
This PR is an alternative to #26293, and solves the same issue of removing a boost dependency from the _util_ library. The advantages of this PR compared to #26293 are that it keeps the source directory structure more flat, and it avoids having to change #includes all over the codebase.
ACKs for top commit:
hebasto:
ACK b19c4124b3
Tree-SHA512: b3a1d33eedceda7ad852c6d6f35700159d156d96071e59acae2bc325467fef81476f860a8855ea39cf3ea706a1df2a341f34fb2dcb032c31a3b0e9cf14103b6a
38941a703e refactor: Move `txmempool_entry.h` --> `kernel/mempool_entry.h` (Hennadii Stepanov)
Pull request description:
This PR addresses the https://github.com/bitcoin/bitcoin/pull/17786#discussion_r1027818360:
> why not move it to the right place, that is to `kernel/txmempool_entry.h`?
ACKs for top commit:
MarcoFalke:
review ACK 38941a703e📊
Tree-SHA512: 0145974b63b67ca1d9d89af2dd9d4438beca480c16a563f330da05fec49b8394d7ba20ed83cf7d50b2e19454e006978ebed42b0e07887b98d00210f3201ce9ba
46339d29b1 test, refactor: Reorder sendtxrcncl tests for better readability (Gleb Naumenko)
14263c13f1 p2p, refactor: Extend logs for unexpected sendtxrcncl (Gleb Naumenko)
87493e112e p2p, test, refactor: Minor code improvements (Gleb Naumenko)
00c5dec818 p2p: Clarify sendtxrcncl policies (Gleb Naumenko)
ac6ee5ba21 test: Expand unit and functional tests for txreconciliation (Gleb Naumenko)
bc84e24a4f p2p, refactor: Switch to enum class for ReconciliationRegisterResult (Gleb Naumenko)
a60f729e29 p2p: Drop roles from sendtxrcncl (Gleb Naumenko)
6772cbf69c tests: stabilize sendtxrcncl test (Gleb Naumenko)
Pull request description:
Non-trivial changes include:
- Getting rid of roles in `sendtxrcncl` message (summarized in the [BIP PR](https://github.com/bitcoin/bips/pull/1376));
- Disconnect the peer if it send `sendtxrcncl` although we are in `blocksonly` and notified the peer with `fRelay=0`;
- Don't send `sendtxrcncl` to feeler connections.
ACKs for top commit:
vasild:
ACK 46339d29b1
ariard:
ACK 46339d2
mzumsande:
Code Review ACK 46339d29b1
Tree-SHA512: b5cc6934b4670c12b7dbb3189e739ef747ee542ec56678bf4e4355bfb481b746d32363c173635685b71969b3fe4bd52b1c8ebd3ea3b35c82044bba69220f6417
c8dc0e3eaa refactor: Inline `CTxMemPoolEntry` class's functions (Hennadii Stepanov)
75bbe594e5 refactor: Move `CTxMemPoolEntry` class to its own module (Hennadii Stepanov)
Pull request description:
This PR:
- gets rid of the `policy/fees` -> `txmempool` -> `policy/fees` circular dependency
- is an alternative to #13949, which nukes only one circular dependency
ACKs for top commit:
ryanofsky:
Code review ACK c8dc0e3eaa. Just include and whitespace changes since last review, and there's a moveonly commit now so it's very easy to review
theStack:
Code-review ACK c8dc0e3eaa
glozow:
utACK c8dc0e3eaa, agree these changes are an improvement.
Tree-SHA512: 36ece824e6ed3ab1a1e198b30a906c8ac12de24545f840eb046958a17315ac9260c7de26e11e2fbab7208adc3d74918db7a7e389444130f8810548ca2e81af41
0582932260 test: add test for fast rescan using block filters (top-up detection) (Sebastian Falbesoner)
ca48a4694f rpc: doc: mention rescan speedup using `blockfilterindex=1` in affected wallet RPCs (Sebastian Falbesoner)
3449880b49 wallet: fast rescan: show log message for every non-skipped block (Sebastian Falbesoner)
935c6c4b23 wallet: take use of `FastWalletRescanFilter` (Sebastian Falbesoner)
70b3513904 wallet: add `FastWalletRescanFilter` class for speeding up rescans (Sebastian Falbesoner)
c051026586 wallet: add method for retrieving the end range for a ScriptPubKeyMan (Sebastian Falbesoner)
845279132b wallet: support fetching scriptPubKeys with minimum descriptor range index (Sebastian Falbesoner)
088e38d3bb add chain interface methods for using BIP 157 block filters (Sebastian Falbesoner)
Pull request description:
## Description
This PR is another take of using BIP 157 block filters (enabled by `-blockfilterindex=1`) for faster wallet rescans and is a modern revival of #15845. For reviewers new to this topic I can highly recommend to read the corresponding PR review club (https://bitcoincore.reviews/15845).
The basic idea is to skip blocks for deeper inspection (i.e. looking at every single tx for matches) if our block filter doesn't match any of the block's spent or created UTXOs are relevant for our wallet. Note that there can be false-positives (see https://bitcoincore.reviews/15845#l-199 for a PR review club discussion about false-positive rates), but no false-negatives, i.e. it is safe to skip blocks if the filter doesn't match; if the filter *does* match even though there are no wallet-relevant txs in the block, no harm is done, only a little more time is spent extra.
In contrast to #15845, this solution only supports descriptor wallets, which are way more widespread now than back in the time >3 years ago. With that approach, we don't have to ever derive the relevant scriptPubKeys ourselves from keys before populating the filter, and can instead shift the full responsibility to that to the `DescriptorScriptPubKeyMan` which already takes care of that automatically. Compared to legacy wallets, the `IsMine` logic for descriptor wallets is as trivial as checking if a scriptPubKey is included in the ScriptPubKeyMan's set of scriptPubKeys (`m_map_script_pub_keys`): e191fac4f3/src/wallet/scriptpubkeyman.cpp (L1703-L1710)
One of the unaddressed issues of #15845 was that [the filter was only created once outside the loop](https://github.com/bitcoin/bitcoin/pull/15845#discussion_r343265997) and as such didn't take into account possible top-ups that have happened. This is solved here by keeping a state of ranged `DescriptorScriptPubKeyMan`'s descriptor end ranges and check at each iteration whether that range has increased since last time. If yes, we update the filter with all scriptPubKeys that have been added since the last filter update with a range index equal or higher than the last end range. Note that finding new scriptPubKeys could be made more efficient than linearly iterating through the whole `m_script_pub_keys` map (e.g. by introducing a bidirectional map), but this would mean introducing additional complexity and state and it's probably not worth it at this time, considering that the performance gain is already significant.
Output scripts from non-ranged `DescriptorScriptPubKeyMan`s (i.e. ones with a fixed set of output scripts that is never extended) are added only once when the filter is created first.
## Benchmark results
Obviously, the speed-up indirectly correlates with the wallet tx frequency in the scanned range: the more blocks contain wallet-related transactions, the less blocks can be skipped due to block filter detection.
In a [simple benchmark](https://github.com/theStack/bitcoin/blob/fast_rescan_functional_test_benchmark/test/functional/pr25957_benchmark.py), a regtest chain with 1008 blocks (corresponding to 1 week) is mined with 20000 scriptPubKeys contained (25 txs * 800 outputs) each. The blocks each have a weight of ~2500000 WUs and hence are about 62.5% full. A global constant `WALLET_TX_BLOCK_FREQUENCY` defines how often wallet-related txs are included in a block. The created descriptor wallet (default setting of `keypool=1000`, we have 8*1000 = 8000 scriptPubKeys at the start) is backuped via the `backupwallet` RPC before the mining starts and imported via `restorewallet` RPC after. The measured time for taking this import process (which involves a rescan) once with block filters (`-blockfilterindex=1`) and once without block filters (`-blockfilterindex=0`) yield the relevant result numbers for the benchmark.
The following table lists the results, sorted from worst-case (all blocks contain wallte-relevant txs, 0% can be skipped) to best-case (no blocks contain walltet-relevant txs, 100% can be skipped) where the frequencies have been picked arbitrarily:
wallet-related tx frequency; 1 tx per... | ratio of irrelevant blocks | w/o filters | with filters | speed gain
--------------------------------------------|-----------------------------|-------------|--------------|-------------
~ 10 minutes (every block) | 0% | 56.806s | 63.554s | ~0.9x
~ 20 minutes (every 2nd block) | 50% (1/2) | 58.896s | 36.076s | ~1.6x
~ 30 minutes (every 3rd block) | 66.67% (2/3) | 56.781s | 25.430s | ~2.2x
~ 1 hour (every 6th block) | 83.33% (5/6) | 58.193s | 15.786s | ~3.7x
~ 6 hours (every 36th block) | 97.22% (35/36) | 57.500s | 6.935s | ~8.3x
~ 1 day (every 144th block) | 99.31% (143/144) | 68.881s | 6.107s | ~11.3x
(no txs) | 100% | 58.529s | 5.630s | ~10.4x
Since even the (rather unrealistic) worst-case scenario of having wallet-related txs in _every_ block of the rescan range obviously doesn't take significantly longer, I'd argue it's reasonable to always take advantage of block filters if they are available and there's no need to provide an option for the user.
Feedback about the general approach (but also about details like naming, where I struggled a lot) would be greatly appreciated. Thanks fly out to furszy for discussing this subject and patiently answering basic question about descriptor wallets!
ACKs for top commit:
achow101:
ACK 0582932260
Sjors:
re-utACK 0582932260
aureleoules:
ACK 0582932260 - minor changes, documentation and updated test since last review
w0xlt:
re-ACK 0582932260
Tree-SHA512: 3289ba6e4572726e915d19f3e8b251d12a4cec8c96d041589956c484b5575e3708b14f6e1e121b05fe98aff1c8724de4564a5a9123f876967d33343cbef242e1
aaaa7bd0ba iwyu: Add missing includes (MacroFake)
fa9ebec096 Remove g_parallel_script_checks (MacroFake)
fa7c834b9f Move ::fCheckBlockIndex into ChainstateManager (MacroFake)
fa43188d86 Move ::fCheckpointsEnabled into ChainstateManager (MacroFake)
cccca83099 Move ::nMinimumChainWork into ChainstateManager (MacroFake)
fa29d0b57c Move ::hashAssumeValid into ChainstateManager (MacroFake)
faf44876db Move ::nMaxTipAge into ChainstateManager (MacroFake)
Pull request description:
It seems preferable to assign globals to a class (in this case `ChainstateManager`), than to leave them dangling. This should clarify scope for code-readers, as well as clarifying unit test behaviour.
ACKs for top commit:
dergoegge:
Code review ACK aaaa7bd0ba
ryanofsky:
Code review ACK aaaa7bd0ba. No changes since last review, other than rebase
aureleoules:
reACK aaaa7bd0ba
Tree-SHA512: 83ec3ba0fb4f1dad95810d4bd4e578454e0718dc1bdd3a794cc4e48aa819b6f5dad4ac4edab3719bdfd5f89cbe23c2740a50fd56c1ff81c99e521c5f6d4e898d
This is useful for speeding up wallet rescans and is based on an
earlier version from PR #15845 ("wallet: Fast rescan with BIP157 block
filters"), which was never merged.
Co-authored-by: MacroFake <falke.marco@gmail.com>
e133264c5b Add test for PSBT input verification (Greg Sanders)
d25699280a Verify PSBT inputs rather than check for fields being empty (Greg Sanders)
Pull request description:
In a few keys spots, PSBT finality is checked by looking for non-empty witness data.
This complicates a couple things:
1) Empty data can be valid in certain cases
2) User may be passed bogus final data by a counterparty during PSBT work happening, and end up with incorrect signatures that they may not be able to check in other contexts if the UTXO doesn't exist yet in chain/mempool, timelocks, etc.
On the whole I think these heavier checks are worth it in case someone is actually assuming the signatures are correct if our API is saying so.
ACKs for top commit:
achow101:
ACK e133264c5b
Tree-SHA512: 9de4fbb0be1257b081781f5df908fd55666e3acd5c4e36beb3b3f2f5a6aed69ff77068c44cde6127e159e773293fd9ced4c0bb47e693969f337e74dc8af030da
This changes the flag for the bitcoin-chainstate executable. Previously
it was false, now it is the chain's default value (still false for the
main chain).
This changes the minimum chain work for the bitcoin-chainstate
executable. Previously it was uint256{}, now it is the chain's default
minimum chain work.
Once we received a reconciliation announcement support
message from a peer and it doesn't violate our protocol,
we store the negotiated parameters which will be used
for future reconciliations.
If we're connecting to the peer which might support
transaction reconciliation, we announce we want to reconcile
with them.
We store the reconciliation salt so that when the peer
responds with their salt, we are able to compute the
full reconciliation salt.
This behavior is enabled with a CLI flag.
bf95976061 doc: add note about snapshot chainstate init (James O'Beirne)
e4d7995286 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7336 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc9241c0 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c361391b8 test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c215 validation: add ResetChainstates() (James O'Beirne)
3a29dfbfb2 move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9247 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff377c validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d1590331 add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1e8b init: add utxo snapshot detection (James O'Beirne)
f9f1735f13 validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf100 db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)
---
Half of the replacement for #24232. The original PR grew larger than expected throughout the review process.
This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.
Don't be scared! There are some big move-only commits in here.
Accompanying changes include:
- moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](https://github.com/bitcoin/bitcoin/pull/24232#discussion_r832762880).
- adding a simple fix in `FlushBlockFile()` that avoids a crash when attemping to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
- improving the unittest to allow testing with on-disk chainstates - necessary to test a simulated restart and re-initialization.
ACKs for top commit:
naumenkogs:
utACK bf95976061
ariard:
Code Review ACK bf9597606
ryanofsky:
Code review ACK bf95976061. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
fjahr:
utACK bf95976061
aureleoules:
ACK bf95976061
Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
bcb0cacac2 reindex, log, test: fixes#21379 (mruddy)
Pull request description:
Fixes#21379.
The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process.
The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart.
These additions to the blk files are non-fatal, but also not desirable.
That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them.
The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted).
This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed:
1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`).
1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit:
1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block.
1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown.
ACKs for top commit:
LarryRuane:
tested code-review ACK bcb0cacac2
ryanofsky:
Code review ACK bcb0cacac2. This is a disturbing bug with an easy fix which seems well-worth merging.
mzumsande:
ACK bcb0cacac2 (reviewed code and did some testing, I agree that it fixes the bug).
w0xlt:
tACK bcb0cacac2
Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51
fabf1cdb20 Use steady clock for bench logging (MacroFake)
faed342a23 scripted-diff: Rename time symbols (MacroFake)
Pull request description:
Instead of using `0.001` and similar constants to "convert" an int64_t to milliseconds, use the type-safe `Ticks<>` helper. Also, use steady clock instead of system clock, since the durations are used for benchmarking.
ACKs for top commit:
fanquake:
ACK fabf1cdb20 - validation bench output still looks sane.
Tree-SHA512: e6525b5fdad6045ca500c56014897d7428ad288aaf375933d3b5939feddf257f6910d562eb66ebcde9186bef9a604ee8d763a318253838318d59df2a285be7c2
33b12e5df6 docs: improve docs where MemPoolLimits is used (stickies-v)
6945853c0b test: use NoLimits() in MempoolIndexingTest (stickies-v)
3a86f24a4c refactor: mempool: use CTxMempool::Limits (stickies-v)
b85af25f87 refactor: mempool: add MemPoolLimits::NoLimits() (stickies-v)
Pull request description:
Mempool currently considers 4 limits regarding ancestor and descendant count and size, which get passed around between functions quite a bit. This PR uses `CTxMemPool::Limits` introduced in https://github.com/bitcoin/bitcoin/pull/25290 to simplify those signatures and callsites.
The purpose of this PR is to improve readability and maintenance, without behaviour change.
As noted in the first commit "refactor: mempool: change MemPoolLimits members to uint", we currently have an underflow issue where a user could pass a negative `-limitancestorsize`, which is eventually cast to an unsigned integer. This behaviour already exists. Because it's orthogonal and to minimize scope, I think this should be fixed in a separate PR.
ACKs for top commit:
hebasto:
ACK 33b12e5df6, I have reviewed the code and it looks OK, I agree it can be merged.
glozow:
reACK 33b12e5df6
Tree-SHA512: 591c6dcee1894f1c3ca28b34a680eeadcf0d40cda92451b4a422c03087b27d682b5e30ba4367abd75a99b5ccb115b7884b0026958d3c7dddab030549db5a4056
0f40d65321 refactor: remove duplicate code from BlockAssembler (James O'Beirne)
Pull request description:
Found while reminding myself how transactions are chosen for blocks. Take it or leave it!
ACKs for top commit:
glozow:
ACK 0f40d65321
theStack:
Concept and code-review ACK 0f40d65321
Tree-SHA512: 8a2694e670ce3fe897ab8f64f64c8df5f8487fc1264527a3abbcba0e5b921fb693416497ccd62508295bc33f202c65556b91b6af463acb91aab43138d2492c14
Simplifies function signatures by removing repetition of all the
ancestor/descendant limits, and increases readability by being
more verbose by naming the limits, while still reducing the LoC.
If we call FlushBlockFile() without having intitialized the block index
with LoadBlockIndexDB(), we may be indexing into an empty vector.
Specifically this is an issue when we call MaybeRebalanceCaches() during
chainstate init before the block index has been loaded, which calls
FlushBlockFile().
Also add an assert to avoid undefined behavior.
Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
Add functionality for activating a snapshot-based chainstate if one is
detected on-disk.
Also cautiously initialize chainstate cache usages so that we don't
somehow blow past our cache allowances during initialization, then
rebalance at the end of init.
Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
This changes the snapshot's leveldb chainstate dir name from
`chainstate_[blockhash]` to `chainstate_snapshot`. This simplifies
later logic that loads snapshot data, and enforces the limitation
of a single snapshot at any given time.
Since we still need to persis the blockhash of the base block, we
write that out to a file (`chainstate_snapshot/base_blockhash`) for
later use during initialization, so that we can reinitialize the
snapshot chainstate.
Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
fa4c59d65b Move blockstorage option logging to LoadChainstate() (MacroFake)
fa3358b668 Move validation option logging to LoadChainstate() (MacroFake)
Pull request description:
This would allow libbitcoinkernel users to see the options logged as well. Currently they would only be logged for bitcoind. Behavior change suggested in the refactoring pull https://github.com/bitcoin/bitcoin/pull/25704#discussion_r956166460
ACKs for top commit:
ryanofsky:
Code review ACK fa4c59d65b. Only change since last review is moving pruning logprints out of `AppInitParameterInteraction` as suggested
jonatack:
Review ACK fa4c59d65b
Tree-SHA512: f27508ca06a78ef162f002d556cf830df374fe95fd4f10bf22c24b6b48276ce49f52f82ffedc43596c872ddcf08321ca03651495fd3abde16254cb8afab39d33
fa875349e2 Fix iwyu (MacroFake)
faad673716 Fix issues when calling std::move(const&) (MacroFake)
Pull request description:
Passing a symbol to `std::move` that is marked `const` is a no-op, which can be fixed in two ways:
* Remove the `const`, or
* Remove the `std::move`
ACKs for top commit:
ryanofsky:
Code review ACK fa875349e2. Looks good. Good for univalue to support c++11 move optimizations
Tree-SHA512: 3dc5cad55b93cfa311abedfb811f35fc1b7f30a1c68561f15942438916c7de25e179c364be11881e01f844f9c2ccd71a3be55967ad5abd2f35b10bb7a882edea
This makes a number of changes:
- Get rid of the verification_progress argument in the node interface
NotifyHeaderTip (it was always 0.0).
- Instead of passing a CBlockIndex* in the UI interface's NotifyHeaderTip,
send separate height, timestamp fields. This is becuase in headers presync,
no actual CBlockIndex object is available.
- Add a bool presync argument to both of the above, to identify signals
pertaining to the first headers sync phase.
This introduces an insignificant performance penalty, as it means locator
construction needs to use the skiplist-based CBlockIndex::GetAncestor()
function instead of the lookup-based CChain, but avoids the need for
callers to have access to a relevant CChain object.
1dc03dda05 [doc] remove non-signaling mentions of BIP125 (glozow)
32024d40f0 scripted-diff: remove mention of BIP125 from non-signaling var names (glozow)
Pull request description:
We have pretty thorough documentation of our RBF policy in doc/policy/mempool-replacements.md. It enumerates each rule with several sentences of rationale. Also, each rule pretty much has its own function (3 and 4 share one), with extensive comments. The doc states explicitly that our rules are similar but differ from BIP125, and contains a record of historical changes to RBF policy.
We should not use "BIP125" as synonymous with our RBF policy because:
- Our RBF policy is different from what is specified in BIP125, for example:
- the BIP does not mention our rule about the replacement feerate being higher (our Rule 6)
- the BIP uses minimum relay feerate for Rule 4, while we have used incremental relay feerate since #9380
- the "inherited signaling" question (CVE-2021-31876). Call it discrepancy, ambiguous wording, doc misinterpretation, or implementation details, I would recommend users refer to doc/policy/mempool-replacements.md
- the signaling policy is configurable, see #25353
- Our RBF policy may change further
- We have already marked BIP125 as only "partially implemented" in docs/bips.md since 1fd49eb498
- See comments from people who are not me recently:
- https://github.com/bitcoin/bitcoin/pull/25038#discussion_r909507429
- https://github.com/bitcoin/bitcoin/pull/25575#issuecomment-1179519204
This PR removes all non-signaling mentions of BIP125 (if people feel strongly, we can remove all mentions of BIP125 period). It may be useful to refer to the concept of "tx opts in to RBF if it has at least one nSequence less than (0xffffffff - 1)" as "BIP125 signaling" because:
- It is succint.
- It has already been widely marketed as BIP125 opt-in signaling.
- Our API uses it when referring to signaling (e.g. getmempoolentry["bip125-replaceable"] and wallet error message "not BIP 125 replaceable"). Changing those is more invasive.
- If/when we have other ways to signal in the future, we can disambiguate them this way. See #25038 which proposes another way of signaling, and where I pulled these commits from.
Alternatives:
- Changing our policy to match BIP125. This doesn't make sense as, for example, we would have to remove the requirement that a replacement tx has a higher feerate (Rule 6).
- Changing BIP125 to match what we have. This doesn't make sense as it would be a significant change to a BIP years after it was finalized and already used as a spec to implement RBF in other places.
- Document our policy as a new BIP and give it a number. This might make sense if we don't expect things to change a lot, and can be done as a next step.
ACKs for top commit:
darosior:
ACK 1dc03dda05
ariard:
ACK 1dc03dda
t-bast:
ACK 1dc03dda05
Tree-SHA512: a3adc2039ec5785892d230ec442e50f47f7062717392728152bbbe27ce1c564141f85253143f53cb44e1331cf47476d74f5d2f4b3cd873fc3433d7a0aa783e02
eeee5ada23 Make adjusted time type safe (MacroFake)
fa3be799fe Add time helpers (MacroFake)
Pull request description:
This makes follow-ups easier to review. Also, it makes sense by itself.
ACKs for top commit:
ryanofsky:
Code review ACK eeee5ada23. Confirmed type changes and equivalent code changes only.
Tree-SHA512: 51bf1ae5428552177286113babdd49e82459d6c710a07b6e80a0a045d373cf51045ee010461aba98e0151d8d71b9b3b5f8f73e302d46ba4558e0b55201f99e9f
Our RBF policy is different from the rules specified in BIP125. For
example, the BIP does not mention Rule 6, and our Rule 4 uses the
(configurable) incremental relay feerate (distinct from the
minimum relay feerate). Those interested in our policy should refer to
doc/policy/mempool-replacements.md instead. These rules may also
continue to diverge with package RBF and other RBF improvements. Keep
references to the BIP125 signaling wrt sequence numbers, since that is
still correct and widely used. It is helpful to refer to this as "BIP125
signaling" since it is unambiguous and succint, especially if we have
multiple ways to signal replaceability in the future.
The rule numbers in doc/policy/mempool-replacements.md correspond
largely to those of BIP 125, so we can still refer to them like "Rule 5."
Also:
- Make DEFAULT_MAX_SIG_CACHE_SIZE into constexpr
DEFAULT_MAX_SIG_CACHE_BYTES to utilize the compile-time integer
arithmetic overflow checking available to constexpr.
- Fix comment (MiB instead of MB) for DEFAULT_MAX_SIG_CACHE_BYTES.
- Pass in max_size_bytes parameter to InitS*Cache(), modify log line to
no longer allude to maxsigcachesize being split evenly between the two
validation caches.
- Fix possible integer truncation and add a comment.
[META] I've kept the integer types as int64_t in order to not introduce
unintended behaviour changes, in the next commit we will make
them size_t.
It is part of the node library. Also, it won't be moved to the kernel
lib, as it will be pruned of ArgsManager.
-BEGIN VERIFY SCRIPT-
# Move module
git mv src/mempool_args.cpp src/node/
git mv src/mempool_args.h src/node/
# Replacements
sed -i 's:mempool_args\.h:node/mempool_args.h:g' $(git grep -l mempool_args)
sed -i 's:mempool_args\.cpp:node/mempool_args.cpp:g' $(git grep -l mempool_args)
sed -i 's:MEMPOOL_ARGS_H:NODE_MEMPOOL_ARGS_H:g' $(git grep -l MEMPOOL_ARGS_H)
-END VERIFY SCRIPT-
4bedfd702a refactor: remove unneeded temporaries in node/interfaces, simplify code (Jon Atack)
b27ba169eb refactor: make all NodeImpl/ChainImpl/ExternalSignerImpl members public (Jon Atack)
Pull request description:
- Make all `NodeImpl`, `ChainImpl` and `ExternalSignerImpl` class members `public` (and document why), to be consistent in all the `*Impl` classes in `src/node/interfaces.cpp` and `src/wallet/interfaces.cpp` and to help future reviewers and contributors.
- Remove unneeded temporaries in `NodeImpl` and `ChainImpl` methods in `src/node/interfaces.cpp` and simplify, to make the code easier to read and understand and to improve performance by avoiding unnecessary move operations.
ACKs for top commit:
ryanofsky:
Code review ACK 4bedfd702a. Changes since last review, applying suggested style & simplifiying first commit. Also avoiding another lock in second commit.
Tree-SHA512: 112f7cad5e2838c94c5b79d61328f42fe75fdb97f401ab49eccf696fc2c6a8a0c0ee55ec974c0602acf7423f78bb82e90eb8a0cc531e1d3347f73b7c83685504
- make the code easier to read and understand
- improve performance by avoiding unnecessary move operations
- the cleaner, simpler, and easier to read the code is, the
better chance the compiler has at implementing it well
as the classes themselves are private, and to be consistent within all the
*Impl classes in src/node/interfaces.cpp and src/wallet/interfaces.cpp
following this order:
public:
// ... virtual methods ...
// ... nonvirtual helper methods ...
// ... data members ...
and add documentation in src/node/interfaces.cpp and src/wallet/interfaces.cpp
to help future reviewers and contributors.
dd065dae9f refactor: Make mapBlocksUnknownParent local, and rename it (Hennadii Stepanov)
Pull request description:
This PR is a second attempt at #19594. This PR has two motivations:
- Improve code hygiene by eliminating a global variable, `mapBlocksUnknownParent`
- Fix fuzz test OOM when running too long ([see #19594 comment](https://github.com/bitcoin/bitcoin/pull/19594#issuecomment-958801638))
A minor added advantage is to release `mapBlocksUnknownParent` memory when the reindexing phase is done. The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map.
This PR helps our efforts to reduce the use of global variables. This variable isn't just global, it's hidden inside a function (it looks like a local variable but has the `static` attribute).
This global variable exists because the `-reindex` processing code calls `LoadExternalBlockFile()` multiple times (once for each block file), but that function must preserve some state between calls (the `mapBlocksUnknownParent` map). This PR fixes this by allocating this map as a local variable in the caller's scope and passing it in on each call. When reindexing completes, the map goes out of scope and is deallocated.
I tested this manually by reindexing on mainnet and signet. Also, the existing `feature_reindex.py` functional test passes.
ACKs for top commit:
mzumsande:
re-ACK dd065dae9f
theStack:
re-ACK dd065dae9f
shaavan:
reACK dd065dae9f
Tree-SHA512: 9cd20e44d2fa1096dd405bc107bc065ea8f904f5b3f63080341b08d8cf57b790df565f58815c2f331377d044d5306708b4bf6bdfc5ef8d0ed85d8e97d744732c
fad3c5826e refactor: Fix iwyu on node/chainstate (MacroFake)
Pull request description:
Fix the CI warning on master: https://cirrus-ci.com/task/5398182703136768?logs=ci#L7020
ACKs for top commit:
fanquake:
ACK fad3c5826e - could do chain.h
Tree-SHA512: 94f6ea0b3d9667863a4217b65bd1b9e07c65bdb566378faf0727bae5eb38d2d527ecae0c39efdda740b7ab7c8269141437ffbcb470cca7d559f09b8ee132d101
fa32b1bbfd refactor: Use chainman() helper consistently in ChainImpl (MacroFake)
Pull request description:
Doing anything else will just lead to more verbose and inconsistent code.
ACKs for top commit:
fanquake:
ACK fa32b1bbfd - all instances of `Assert(m_node.chainman)` in node/interfaces replaced with `chainman()`, which is the same thing.
shaavan:
Code Review ACK fa32b1bbfd
Tree-SHA512: a417680f79c150e4431aa89bc9db79fdf2dd409419081eb243194837b4ab8d16434165393f39a157473802753843e8c5314ad05c569b4e9221ce29a9fd1cefb8
faf98aecf8 Remove unused includes in rpc/fees.cpp (MacroFake)
1111ddeedf Remove unused includes from dbwrapper.h (MacroFake)
fa77fdd047 Add missing includes (MacroFake)
fa869ce2c2 Add missing includes to node/chainstate (MacroFake)
Pull request description:
Unused includes are confusing, but also cause unrelated compile errors when the unused includes were to be removed.
Fix that by adding the missing includes where they are needed and then remove them where they are not needed. This is also checked by iwyu.
ACKs for top commit:
hebasto:
ACK faf98aecf8, I have reviewed the code and it looks OK, I agree it can be merged.
jarolrod:
Code Review ACK faf98aecf8
Tree-SHA512: 75f3c6e6f6ecf8a98233e1a1463c75ca4e0eb3ec341150d274141072fe95413a3c2ec6386d1c527899cc63d43f63f5eb5991509847412773362808ddfb1bb435
Replace CommitInternal method with CustomCommit and use interfaces::Chain
instead of CChainState to generate block locator.
This commit does not change behavior in any way, except in the
(m_best_block_index == nullptr) case, which was added recently in
https://github.com/bitcoin/bitcoin/pull/24117 as part of an ongoing attempt to
prevent index corruption if bitcoind is interrupted during startup. New
behavior in that case should be slightly better than the old behavior (skipping
the entire custom+base commit now vs only skipping the base commit previously)
and this might avoid more cases of corruption.
817326a828 wallet: avoid rescans if under the snapshot (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: #15606)
---
Refuse to load a wallet if it requires a rescan lower than the height of assumed-valid blocks.
Of course in live code right now, `BLOCK_ASSUMED_VALID` block index entries don't exist since they're a unique flag introduced by the use of UTXO snapshots, so this is prophylactic code exercised only by unittests.
ACKs for top commit:
achow101:
ACK 817326a828
ryanofsky:
Code review ACK 817326a828. This seems like the simplest change we can make to avoid wallet problems when an assumeutxo snapshot is loaded.
Tree-SHA512: cfa44b2eb33d1818d30df45210d0dde1e9b78cc9b7c88cb985054dc28427bba9e0905debe4196065d1d3a5ce7bca7e605e629d5ce5f0225b25395746e6d3d596
Add new interfaces::BlockInfo struct to be able to pass extra block
information (file and undo information) to indexes which they are
updated to use high level interfaces::Chain notifications.
This commit does not change behavior in any way.
Also:
1. Have CChainState::LoadMempool and ::ThreadImport take in paths and
pass it through untouched to LoadMempool.
2. Make LoadMempool exit early if the load_path is empty.
3. Adjust the call to ::ThreadImport in ::AppInitMain to correctly pass
in an empty path if mempool persistence is disabled.