This changes the minimum chain work for the bitcoin-chainstate
executable. Previously it was uint256{}, now it is the chain's default
minimum chain work.
e899d4ca6f init: limit bip30 exceptions to coinbase txs (Chris Geihsler)
511eb7fdea Ignore problematic blocks in DisconnectBlock (Chris Geihsler)
Pull request description:
Fixes https://github.com/bitcoin/bitcoin/issues/22596
When using checklevel=4, block verification fails because of duplicate coinbase transactions involving blocks 91812 and 91722. There was already a check in place within `ConnectBlock` to ignore the problematic blocks, but `DisconnectBlock` did not contain a similar check to ignore these blocks when called from `VerifyDB`.
By ignoring these two blocks in `DisconnectBlock`, the block verification process succeeds at checklevel=4.
(Note to reviewers: this is my first contribution to Bitcoin Core, so any feedback is most welcome. Thanks in advance for reviewing!)
## Steps to reproduce:
Use the following bitcoin.conf file and start bitcoind. I only used block data through block ~100000 so that the verification process was much faster.
```
assumevalid=0
checkblocks=0
checklevel=4
```
Without this change, you will see the following error when the blocks are verified:
```
2022-04-14T02:56:44Z init message: Verifying blocks…
2022-04-14T02:56:44Z Verifying last 101881 blocks at level 4
2022-04-14T02:56:44Z [0%]...[10%]...[20%]...[30%]...[40%]...ERROR: VerifyDB(): *** coin database inconsistencies found (last 10160 blocks, 142571 good transactions before that)
2022-04-14T02:57:01Z : Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
: Corrupted block database detected.
Please restart with -reindex or -reindex-chainstate to recover.
```
With this change, you will see this instead:
```
2022-04-14T02:32:29Z init message: Verifying blocks…
2022-04-14T02:32:29Z Verifying last 101746 blocks at level 4
2022-04-14T02:32:29Z [0%]...[10%]...[20%]...[30%]...[40%]...[50%]...[60%]...[70%]...[80%]...[90%]...[DONE].
2022-04-14T02:32:48Z No coin database inconsistencies in last 101746 blocks (226126 transactions)
```
ACKs for top commit:
laanwj:
Code review ACK e899d4ca6f
achow101:
ACK e899d4ca6f
jamesob:
(Biased) ACK e899d4ca6f ([`jamesob/ackr/24851.2.seejee.init_ignore_bip_30_verif`](https://github.com/jamesob/bitcoin/tree/ackr/24851.2.seejee.init_ignore_bip_30_verif))
Tree-SHA512: d2f6d25e9619aee32c1a73fe846b1b587698eaa5a4994fa6424f1038f45654f9fd52b74a69843cc84d90168d74827130ccf8e9201502f5d52281acdb20429291
bf95976061 doc: add note about snapshot chainstate init (James O'Beirne)
e4d7995286 test: add testcases for snapshot initialization (James O'Beirne)
cced4e7336 test: move-only-ish: factor out LoadVerifyActivateChainstate() (James O'Beirne)
51fc9241c0 test: allow on-disk coins and block tree dbs in tests (James O'Beirne)
3c361391b8 test: add reset_chainstate parameter for snapshot unittests (James O'Beirne)
00b357c215 validation: add ResetChainstates() (James O'Beirne)
3a29dfbfb2 move-only: test: make snapshot chainstate setup reusable (James O'Beirne)
8153bd9247 blockmanager: avoid undefined behavior during FlushBlockFile (James O'Beirne)
ad67ff377c validation: remove snapshot datadirs upon validation failure (James O'Beirne)
34d1590331 add utilities for deleting on-disk leveldb data (James O'Beirne)
252abd1e8b init: add utxo snapshot detection (James O'Beirne)
f9f1735f13 validation: rename snapshot chainstate dir (James O'Beirne)
d14bebf100 db: add StoragePath to CDBWrapper/CCoinsViewDB (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: https://github.com/bitcoin/bitcoin/pull/15606)
---
Half of the replacement for #24232. The original PR grew larger than expected throughout the review process.
This change adds the ability to initialize a snapshot-based chainstate during init if one is detected on disk. This is of course unused as of now (aside from in unittests) given that we haven't yet enabled actually loading snapshots.
Don't be scared! There are some big move-only commits in here.
Accompanying changes include:
- moving the snapshot coinsdb directory from being called `chainstate_[base blockhash]` to `chainstate_snapshot`, since we only support one snapshot in use at a time. This simplifies some logic, but it necessitates writing that base blockhash out to a file within the coinsdb dir. See [discussion here](https://github.com/bitcoin/bitcoin/pull/24232#discussion_r832762880).
- adding a simple fix in `FlushBlockFile()` that avoids a crash when attemping to flush to disk before `LoadBlockIndexDB()` is called, which happens when calling `MaybeRebalanceCaches()` during multiple chainstate init.
- improving the unittest to allow testing with on-disk chainstates - necessary to test a simulated restart and re-initialization.
ACKs for top commit:
naumenkogs:
utACK bf95976061
ariard:
Code Review ACK bf9597606
ryanofsky:
Code review ACK bf95976061. Changes since last review: rebasing, switching from CAutoFile to AutoFile, adding comments, switching from BOOST_CHECK to Assert in test util, using chainman.GetMutex() in tests, destroying one ChainstateManager before creating a new one in tests
fjahr:
utACK bf95976061
aureleoules:
ACK bf95976061
Tree-SHA512: 15ae75caf19f8d12a12d2647c52897904d27b265a7af6b4ae7b858592eeadb8f9da6c2394b6baebec90adc28742c053e3eb506119577dae7c1e722ebb3b7bcc0
bcb0cacac2 reindex, log, test: fixes#21379 (mruddy)
Pull request description:
Fixes#21379.
The blocks/blk?????.dat files are mutated and become increasingly malformed, or corrupt, as a result of running the re-indexing process.
The mutations occur after the re-indexing process has finished, as new blocks are appended, but are a result of a re-indexing process miscalculation that lingers in the block manager's `m_blockfile_info` `nSize` data until node restart.
These additions to the blk files are non-fatal, but also not desirable.
That is, this is a form of data corruption that the reading code is lenient enough to process (it skips the extra bytes), but it adds some scary looking log messages as it encounters them.
The summary of the problem is that the re-index process double counts the size of the serialization header (magic message start bytes [4 bytes] + length [4 bytes] = 8 bytes) while calculating the blk data file size (both values already account for the serialization header's size, hence why it is over accounted).
This bug manifests itself in a few different ways, after re-indexing, when a new block from a peer is processed:
1. If the new block will not fit into the last blk file processed while re-indexing, while remaining under the 128MiB limit, then the blk file is flushed to disk and truncated to a size that is 8 greater than it should be. The truncation adds zero bytes (see `FlatFileSeq::Flush` and `TruncateFile`).
1. If the last blk file processed while re-indexing has logical space for the new block under the 128 MiB limit:
1. If the blk file was not already large enough to hold the new block, then the zeros are, in effect, added by `fseek` when the file is opened for writing. Eight zero bytes are added to the end of the last blk file just before the new block is written. This happens because the write offset is 8 too great due to the miscalculation. The result is 8 zero bytes between the end of the last block and the beginning of the next block's magic + length + block.
1. If the blk file was already large enough to hold the new block, then the current existing file contents remain in the 8 byte gap between the end of the last block and the beginning of the next block's magic + length + block. Commonly, when this occcurs, it is due to the blk file containing blocks that are not connected to the block tree during reindex and are thus left behind by the reindex process and later overwritten when new blocks are added. The orphaned blocks can be valid blocks, but due to the nature of concurrent block download, the parent may not have been retrieved and written by the time the node was previously shutdown.
ACKs for top commit:
LarryRuane:
tested code-review ACK bcb0cacac2
ryanofsky:
Code review ACK bcb0cacac2. This is a disturbing bug with an easy fix which seems well-worth merging.
mzumsande:
ACK bcb0cacac2 (reviewed code and did some testing, I agree that it fixes the bug).
w0xlt:
tACK bcb0cacac2
Tree-SHA512: acc97927ea712916506772550451136b0f1e5404e92df24cc05e405bb09eb6fe7c3011af3dd34a7723c3db17fda657ae85fa314387e43833791e9169c0febe51
fabf1cdb20 Use steady clock for bench logging (MacroFake)
faed342a23 scripted-diff: Rename time symbols (MacroFake)
Pull request description:
Instead of using `0.001` and similar constants to "convert" an int64_t to milliseconds, use the type-safe `Ticks<>` helper. Also, use steady clock instead of system clock, since the durations are used for benchmarking.
ACKs for top commit:
fanquake:
ACK fabf1cdb20 - validation bench output still looks sane.
Tree-SHA512: e6525b5fdad6045ca500c56014897d7428ad288aaf375933d3b5939feddf257f6910d562eb66ebcde9186bef9a604ee8d763a318253838318d59df2a285be7c2
Simplifies function signatures by removing repetition of all the
ancestor/descendant limits, and increases readability by being
more verbose by naming the limits, while still reducing the LoC.
fa521c9603 Use steady clock for all millis bench logging (MacroFake)
Pull request description:
Currently `GetTimeMillis` is used for bench logging in milliseconds integral precision. Replace it to use a steady clock that is type-safe and steady.
Microsecond or float precision can be done in a follow-up.
ACKs for top commit:
fanquake:
ACK fa521c9603 - started making the same change.
Tree-SHA512: 86a810e496fc663f815acb8771a6c770331593715cde85370226685bc50c13e8e987e3c5efd0b4e48b36ebd2372255357b709204bac750d41e94a9f7d9897fa6
If a UTXO snapshot fails to validate, don't leave the resulting datadir
on disk as this will confuse initialization on next startup and we'll
get an assertion error.
Used in later commits to remove leveldb directories for
- invalid snapshot chainstates, and
- background-vaildation chainstates that have finished serving their
purpose.
Add functionality for activating a snapshot-based chainstate if one is
detected on-disk.
Also cautiously initialize chainstate cache usages so that we don't
somehow blow past our cache allowances during initialization, then
rebalance at the end of init.
Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
This changes the snapshot's leveldb chainstate dir name from
`chainstate_[blockhash]` to `chainstate_snapshot`. This simplifies
later logic that loads snapshot data, and enforces the limitation
of a single snapshot at any given time.
Since we still need to persis the blockhash of the base block, we
write that out to a file (`chainstate_snapshot/base_blockhash`) for
later use during initialization, so that we can reinitialize the
snapshot chainstate.
Co-authored-by: Russell Yanofsky <russ@yanofsky.org>
3add234546 ui: show header pre-synchronization progress (Pieter Wuille)
738421c50f Emit NotifyHeaderTip signals for pre-synchronization progress (Pieter Wuille)
376086fc5a Make validation interface capable of signalling header presync (Pieter Wuille)
93eae27031 Test large reorgs with headerssync logic (Suhas Daftuar)
355547334f Track headers presync progress and log it (Pieter Wuille)
03712dddfb Expose HeadersSyncState::m_current_height in getpeerinfo() (Suhas Daftuar)
150a5486db Test headers sync using minchainwork threshold (Suhas Daftuar)
0b6aa826b5 Add unit test for HeadersSyncState (Suhas Daftuar)
83c6a0c524 Reduce spurious messages during headers sync (Suhas Daftuar)
ed6cddd98e Require callers of AcceptBlockHeader() to perform anti-dos checks (Suhas Daftuar)
551a8d957c Utilize anti-DoS headers download strategy (Suhas Daftuar)
ed470940cd Add functions to construct locators without CChain (Pieter Wuille)
84852bb6bb Add bitdeque, an std::deque<bool> analogue that does bit packing. (Pieter Wuille)
1d4cfa4272 Add function to validate difficulty changes (Suhas Daftuar)
Pull request description:
New nodes starting up for the first time lack protection against DoS from low-difficulty headers. While checkpoints serve as our protection against headers that fork from the main chain below the known checkpointed values, this protection only applies to nodes that have been able to download the honest chain to the checkpointed heights.
We can protect all nodes from DoS from low-difficulty headers by adopting a different strategy: before we commit to storing a header in permanent storage, first verify that the header is part of a chain that has sufficiently high work (either `nMinimumChainWork`, or something comparable to our tip). This means that we will download headers from a given peer twice: once to verify the work on the chain, and a second time when permanently storing the headers.
The p2p protocol doesn't provide an easy way for us to ensure that we receive the same headers during the second download of peer's headers chain. To ensure that a peer doesn't (say) give us the main chain in phase 1 to trick us into permanently storing an alternate, low-work chain in phase 2, we store commitments to the headers during our first download, which we validate in the second download.
Some parameters must be chosen for commitment size/frequency in phase 1, and validation of commitments in phase 2. In this PR, those parameters are chosen to both (a) minimize the per-peer memory usage that an attacker could utilize, and (b) bound the expected amount of permanent memory that an attacker could get us to use to be well-below the memory growth that we'd get from the honest chain (where we expect 1 new block header every 10 minutes).
After this PR, we should be able to remove checkpoints from our code, which is a nice philosophical change for us to make as well, as there has been confusion over the years about the role checkpoints play in Bitcoin's consensus algorithm.
Thanks to Pieter Wuille for collaborating on this design.
ACKs for top commit:
Sjors:
re-tACK 3add234546
mzumsande:
re-ACK 3add234546
sipa:
re-ACK 3add234546
glozow:
ACK 3add234546
Tree-SHA512: e7789d65f62f72141b8899eb4a2fb3d0621278394d2d7adaa004675250118f89a4e4cb42777fe56649d744ec445ad95141e10f6def65f0a58b7b35b2e654a875
This makes a number of changes:
- Get rid of the verification_progress argument in the node interface
NotifyHeaderTip (it was always 0.0).
- Instead of passing a CBlockIndex* in the UI interface's NotifyHeaderTip,
send separate height, timestamp fields. This is becuase in headers presync,
no actual CBlockIndex object is available.
- Add a bool presync argument to both of the above, to identify signals
pertaining to the first headers sync phase.
In order to prevent memory DoS, we must ensure that we don't accept a new
header into memory until we've performed anti-DoS checks, such as verifying
that the header is part of a sufficiently high work chain. This commit adds a
new argument to AcceptBlockHeader() so that we can ensure that all call-sites
which might cause a new header to be accepted into memory have to grapple with
the question of whether the header is safe to accept, or needs further
validation.
This patch also fixes two places where low-difficulty-headers could have been
processed without such validation (processing an unrequested block from the
network, and processing a compact block).
Credit to Niklas Gögge for noticing this issue, and thanks to Sjors Provoost
for test code.
Avoid permanently storing headers from a peer, unless the headers are part of a
chain with sufficiently high work. This prevents memory attacks using low-work
headers.
Designed and co-authored with Pieter Wuille.
Move ChainstateManager options into m_options struct to simplify class
initialization, organize class members, and to name external option variables
differently than internal state variables.
This change was originally in #25862, but it was suggested to split off in
https://github.com/bitcoin/bitcoin/pull/25862#discussion_r951459817 so it could
be merged earlier and reduce conflicts with other PRs.
1dc03dda05 [doc] remove non-signaling mentions of BIP125 (glozow)
32024d40f0 scripted-diff: remove mention of BIP125 from non-signaling var names (glozow)
Pull request description:
We have pretty thorough documentation of our RBF policy in doc/policy/mempool-replacements.md. It enumerates each rule with several sentences of rationale. Also, each rule pretty much has its own function (3 and 4 share one), with extensive comments. The doc states explicitly that our rules are similar but differ from BIP125, and contains a record of historical changes to RBF policy.
We should not use "BIP125" as synonymous with our RBF policy because:
- Our RBF policy is different from what is specified in BIP125, for example:
- the BIP does not mention our rule about the replacement feerate being higher (our Rule 6)
- the BIP uses minimum relay feerate for Rule 4, while we have used incremental relay feerate since #9380
- the "inherited signaling" question (CVE-2021-31876). Call it discrepancy, ambiguous wording, doc misinterpretation, or implementation details, I would recommend users refer to doc/policy/mempool-replacements.md
- the signaling policy is configurable, see #25353
- Our RBF policy may change further
- We have already marked BIP125 as only "partially implemented" in docs/bips.md since 1fd49eb498
- See comments from people who are not me recently:
- https://github.com/bitcoin/bitcoin/pull/25038#discussion_r909507429
- https://github.com/bitcoin/bitcoin/pull/25575#issuecomment-1179519204
This PR removes all non-signaling mentions of BIP125 (if people feel strongly, we can remove all mentions of BIP125 period). It may be useful to refer to the concept of "tx opts in to RBF if it has at least one nSequence less than (0xffffffff - 1)" as "BIP125 signaling" because:
- It is succint.
- It has already been widely marketed as BIP125 opt-in signaling.
- Our API uses it when referring to signaling (e.g. getmempoolentry["bip125-replaceable"] and wallet error message "not BIP 125 replaceable"). Changing those is more invasive.
- If/when we have other ways to signal in the future, we can disambiguate them this way. See #25038 which proposes another way of signaling, and where I pulled these commits from.
Alternatives:
- Changing our policy to match BIP125. This doesn't make sense as, for example, we would have to remove the requirement that a replacement tx has a higher feerate (Rule 6).
- Changing BIP125 to match what we have. This doesn't make sense as it would be a significant change to a BIP years after it was finalized and already used as a spec to implement RBF in other places.
- Document our policy as a new BIP and give it a number. This might make sense if we don't expect things to change a lot, and can be done as a next step.
ACKs for top commit:
darosior:
ACK 1dc03dda05
ariard:
ACK 1dc03dda
t-bast:
ACK 1dc03dda05
Tree-SHA512: a3adc2039ec5785892d230ec442e50f47f7062717392728152bbbe27ce1c564141f85253143f53cb44e1331cf47476d74f5d2f4b3cd873fc3433d7a0aa783e02
eeee5ada23 Make adjusted time type safe (MacroFake)
fa3be799fe Add time helpers (MacroFake)
Pull request description:
This makes follow-ups easier to review. Also, it makes sense by itself.
ACKs for top commit:
ryanofsky:
Code review ACK eeee5ada23. Confirmed type changes and equivalent code changes only.
Tree-SHA512: 51bf1ae5428552177286113babdd49e82459d6c710a07b6e80a0a045d373cf51045ee010461aba98e0151d8d71b9b3b5f8f73e302d46ba4558e0b55201f99e9f
9376a6dae4 refactor: make active_chain_tip a reference (Aurèle Oulès)
Pull request description:
This PR fixes a TODO introduced in #21055.
Makes `active_chain_tip` argument in `CheckFinalTxAtTip` function a reference instead of a pointer.
ACKs for top commit:
dongcarl:
ACK 9376a6dae4
Tree-SHA512: c36d1769e0b9598b7f79334704b26b73e958d54caa3bd7e4eff954f3964fcf3f5e3a44a5a760497afad51b76e1614c86314fe035e4083c855e3574a620de7f4d
Our RBF policy is different from the rules specified in BIP125. For
example, the BIP does not mention Rule 6, and our Rule 4 uses the
(configurable) incremental relay feerate (distinct from the
minimum relay feerate). Those interested in our policy should refer to
doc/policy/mempool-replacements.md instead. These rules may also
continue to diverge with package RBF and other RBF improvements. Keep
references to the BIP125 signaling wrt sequence numbers, since that is
still correct and widely used. It is helpful to refer to this as "BIP125
signaling" since it is unambiguous and succint, especially if we have
multiple ways to signal replaceability in the future.
The rule numbers in doc/policy/mempool-replacements.md correspond
largely to those of BIP 125, so we can still refer to them like "Rule 5."
faab8dceb3 Remove unused SetTip(nullptr) code (MacroFake)
Pull request description:
Now that this path is no longer used after commit b51e60f914, we can remove it.
Future code should reset `CChain` by simply discarding it and constructing a fresh one.
ACKs for top commit:
ryanofsky:
Code review ACK faab8dceb3. Just moved an assert statement since last review
Tree-SHA512: 7dc273b11133d85d32ca2a69c0c7c07b39cdd338141ef5b51496e7de334a809864d5459eb95535497866c8b1e468aae84ed8f91b543041e6ee20130d5622874e
Also:
- Make DEFAULT_MAX_SIG_CACHE_SIZE into constexpr
DEFAULT_MAX_SIG_CACHE_BYTES to utilize the compile-time integer
arithmetic overflow checking available to constexpr.
- Fix comment (MiB instead of MB) for DEFAULT_MAX_SIG_CACHE_BYTES.
- Pass in max_size_bytes parameter to InitS*Cache(), modify log line to
no longer allude to maxsigcachesize being split evenly between the two
validation caches.
- Fix possible integer truncation and add a comment.
[META] I've kept the integer types as int64_t in order to not introduce
unintended behaviour changes, in the next commit we will make
them size_t.
This fixes an potential overflow which existed prior to this patchset.
If CuckooCache::cache<Element, Hash>::setup_bytes is called with a
`size_t bytes` which, when divided by sizeof(Element), does not fit into
an uint32_t, the implicit conversion to uint32_t in the call to setup
will result in an overflow.
At least on x86_64, this overflow is possible:
static_assert(std::numeric_limits<size_t>::max() / 32 <= std::numeric_limits<uint32_t>::max());
static_assert(std::numeric_limits<size_t>::max() / 4 <= std::numeric_limits<uint32_t>::max());
This commit detects such cases and signals to callers that the `size_t
bytes` input is too large.
1. -maxsigcachesize is a DEBUG_ONLY option
2. Almost 7 years has passed since its semantics change in
830e3f3d02 from "number of entries" to
"number of mebibytes"
3. A std::new_handler was added to the codebase after the original PR
which introduced this limit, which will terminate immediately instead
of causing trouble by being caught somewhere unexpected.
Returning the approximate total size eliminates the need for
InitS*Cache() to do nElems*sizeof(uint256). The cuckoocache has a better
idea of this information.
Our RBF policy is different from the rules specified in BIP125 (refer to
doc/policy/mempool-replacements.md instead), and will continue to
diverge with package RBF. Keep references to BIP125 sequence number,
since that is still useful and correct.
-BEGIN VERIFY SCRIPT-
ren() { sed -i "s:\<$1\>:$2:g" $(git grep -l "\<$1\>" ./src ./test); }
ren m_allow_bip125_replacement m_allow_replacement
ren allow_bip125_replacement allow_replacement
ren MAX_BIP125_REPLACEMENT_CANDIDATES MAX_REPLACEMENT_CANDIDATES
-END VERIFY SCRIPT-
dd065dae9f refactor: Make mapBlocksUnknownParent local, and rename it (Hennadii Stepanov)
Pull request description:
This PR is a second attempt at #19594. This PR has two motivations:
- Improve code hygiene by eliminating a global variable, `mapBlocksUnknownParent`
- Fix fuzz test OOM when running too long ([see #19594 comment](https://github.com/bitcoin/bitcoin/pull/19594#issuecomment-958801638))
A minor added advantage is to release `mapBlocksUnknownParent` memory when the reindexing phase is done. The current situation is somewhat similar to a memory leak because this map exists unused for the remaining lifetime of the process. It's true that this map should be empty of data elements after use, but its internal metadata (indexing structures, etc.) can have non-trivial size because there can be many thousands of simultaneous elements in this map.
This PR helps our efforts to reduce the use of global variables. This variable isn't just global, it's hidden inside a function (it looks like a local variable but has the `static` attribute).
This global variable exists because the `-reindex` processing code calls `LoadExternalBlockFile()` multiple times (once for each block file), but that function must preserve some state between calls (the `mapBlocksUnknownParent` map). This PR fixes this by allocating this map as a local variable in the caller's scope and passing it in on each call. When reindexing completes, the map goes out of scope and is deallocated.
I tested this manually by reindexing on mainnet and signet. Also, the existing `feature_reindex.py` functional test passes.
ACKs for top commit:
mzumsande:
re-ACK dd065dae9f
theStack:
re-ACK dd065dae9f
shaavan:
reACK dd065dae9f
Tree-SHA512: 9cd20e44d2fa1096dd405bc107bc065ea8f904f5b3f63080341b08d8cf57b790df565f58815c2f331377d044d5306708b4bf6bdfc5ef8d0ed85d8e97d744732c
3a61fc56a0 refactor: move CBlockIndex#ToString() from header to implementation (Jon Atack)
57865eb512 CDiskBlockIndex: rename GetBlockHash() to ConstructBlockHash() (Jon Atack)
99e8ec8721 CDiskBlockIndex: remove unused ToString() class member (Jon Atack)
14aeece462 CBlockIndex: ensure phashBlock is not nullptr before dereferencing (Jon Atack)
Pull request description:
Fix a few design issues, potential footguns and inconsistent behavior in the CBlockIndex and CDiskBlockIndex classes.
- Ensure phashBlock in `CBlockIndex#GetBlockHash()` is not nullptr before dereferencing and remove a now-redundant assert preceding a GetBlockHash() caller. This protects against UB here, and in case of failure (which would indicate a consensus bug), the debug log will print `bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion 'phashBlock != nullptr' failed. Aborted` instead of `Segmentation fault`.
- Remove the unused `CDiskBlockIndex#ToString()` class member, and mark the inherited `CBlockIndex#ToString()` public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.
- Rename the `CDiskBlockIndex GetBlockHash()` class member to `ConstructBlockHash()`, which also makes sense as they perform different operations to return a blockhash, and mark the inherited `CBlockIndex#GetBlockHash()` public interface member as deleted to disallow calling it in the derived CDiskBlockIndex class.
- Move `CBlockIndex#ToString()` from header to implementation, which also allows dropping `tinyformat.h` from the header file.
Rationale and discussion regarding the CDiskBlockIndex changes:
Here is a failing test on master that demonstrates the inconsistent behavior of the current design: calling the same inherited public interface functions on the same CDiskBlockIndex object should yield identical behavior, but does not.
```diff
diff --git a/src/test/validation_chainstatemanager_tests.cpp b/src/test/validation_chainstatemanager_tests.cpp
index 6dc522b421..dac3840f32 100644
--- a/src/test/validation_chainstatemanager_tests.cpp
+++ b/src/test/validation_chainstatemanager_tests.cpp
@@ -240,6 +240,15 @@ BOOST_FIXTURE_TEST_CASE(chainstatemanager_activate_snapshot, TestChain100Setup)
const CBlockIndex* tip = chainman.ActiveTip();
BOOST_CHECK_EQUAL(tip->nChainTx, au_data.nChainTx);
+ // CDiskBlockIndex "is a" CBlockIndex, as it publicly inherits from it.
+ // Test that calling the same inherited interface functions on the same
+ // object yields identical behavior.
+ CDiskBlockIndex index{tip};
+ CBlockIndex *pB = &index;
+ CDiskBlockIndex *pD = &index;
+ BOOST_CHECK_EQUAL(pB->GetBlockHash(), pD->GetBlockHash());
+ BOOST_CHECK_EQUAL(pB->ToString(), pD->ToString());
```
(build and run: `$ ./src/test/test_bitcoin -t validation_chainstatemanager_tests`)
The GetBlockHash() test assertion only passes on master because the different methods invoked by the current design happen to return the same result. If one of the two is changed, it fails like the ToString() assertion does.
Redefining inherited non-virtual functions is well-documented as incorrect design to avoid inconsistent behavior (see Scott Meyers, Effective C++, Item 36). Class usage is confusing when the behavior depends on the pointer definition instead of the object definition (static binding happening where dynamic binding was expected). This can lead to unsuspected or hard-to-track bugs.
Outside of critical hot spots, correctness usually comes before optimisation, but the current design dates back to main.cpp and it may possibly have been chosen to avoid the overhead of dynamic dispatch. This solution does the same: the class sizes are unchanged and no vptr or vtbl is added.
There are better designs for doing this that use composition instead of inheritance, or that separate the public interface from the private implementations. One example of the latter would be a non-virtual public interface that calls private virtual implementation methods, i.e. the Template pattern via the Non-Virtual Interface (NVI) idiom.
ACKs for top commit:
vasild:
ACK 3a61fc56a0
Tree-SHA512: 9ff358ab0a6d010b8f053ad8303c6d4d061e62d9c3755a56b9c9f5eab855d02f02bee42acc77dfa0cbf4bb5cb775daa72d675e1560610a29bd285c46faa85ab7
and remove a now-redundant assert preceding a GetBlockHash() caller.
This protects against UB here, and in case of failure (which would
indicate a consensus bug), the debug log will print
bitcoind: chain.h:265: uint256 CBlockIndex::GetBlockHash() const: Assertion `phashBlock != nullptr' failed.
Aborted
instead of
Segmentation fault
facc2fa7b8 Use AutoFile where possible (MacroFake)
6666803c89 streams: Add AutoFile without ser-type and ser-version (MacroFake)
Pull request description:
This was done in the context of https://github.com/bitcoin/bitcoin/pull/25284 , but I think it also makes sense standalone.
The basic idea is that serialization type should not be initialized when it is not needed. Same for the serialization version.
So do this here for `AutoFile`. `CAutoFile` remains in places where it is not yet possible.
ACKs for top commit:
laanwj:
Code review ACK facc2fa7b8
fanquake:
ACK facc2fa7b8
Tree-SHA512: d82d024d55af57565ac53d9d1517afafc12b46964effba0332de62a6c77869356fa77f89e6d4834438fff44c45b64fccdf5a1358bfea03e28dfe55013b3c099d
Also:
1. Have CChainState::LoadMempool and ::ThreadImport take in paths and
pass it through untouched to LoadMempool.
2. Make LoadMempool exit early if the load_path is empty.
3. Adjust the call to ::ThreadImport in ::AppInitMain to correctly pass
in an empty path if mempool persistence is disabled.
Not only does this increase coverage, it is also more correct in that
when ::LoadMempool is called with a mempool and chainstate, it calls
AcceptToMemoryPool with just the chainstate.
AcceptToMemoryPool will then act on the chainstate's mempool via
CChainState::GetMempool, which may be different from the mempool
originally passed to ::LoadMempool. (In this fuzz test's case, it
definitely is different)
Also, move DummyChainstate to its own file since it's now used by the
validation_load_mempool fuzz test to replace CChainState's m_mempool.
m_is_loaded/IsLoaded() doesn't actually indicate whether or not the
mempool was successfully, loaded, but rather if a load has been
attempted and did not result in a catastrophic ShutdownRequested.
-BEGIN VERIFY SCRIPT-
find_regex="\bm_is_loaded\b" \
&& git grep -l -E "$find_regex" \
| xargs sed -i -E "s@$find_regex@m_load_tried@g"
find_regex="\bIsLoaded\b" \
&& git grep -l -E "$find_regex" \
| xargs sed -i -E "s@$find_regex@GetLoadTried@g"
find_regex="\bSetIsLoaded\b" \
&& git grep -l -E "$find_regex" \
| xargs sed -i -E "s@$find_regex@SetLoadTried@g"
-END VERIFY SCRIPT-
Also introduce node::{ShouldPersistMempool,MempoolPath} helper functions
in node/mempool_persist_args.{h,cpp} which are used by non-kernel
DumpMempool callers to determine whether or not to automatically dump
the mempool and where to dump it to.
This new node policy setting enables to accept replaced-by-fee
transaction without inspection of the replaceability signaling
as described in BIP125 "explicit signaling".
If turns on, the node mempool accepts transaction replacement
as described in `policy/mempool-replacements.md`.
The default setting value is `false`, implying opt-in RBF
is enforced.
d1684beabe fees: Pass in a filepath instead of referencing gArgs (Carl Dong)
9a3d825c30 init: Remove redundant -*mempool*, -limit* queries (Carl Dong)
6c5c60c412 mempool: Use m_limit for UpdateTransactionsFromBlock (Carl Dong)
9e93b10301 node/ifaces: Use existing MemPoolLimits (Carl Dong)
38af2bcf35 mempoolaccept: Use limits from mempool in constructor (Carl Dong)
9333427014 mempool: Introduce (still-unused) MemPoolLimits (Carl Dong)
716bb5fbd3 scripted-diff: Rename anc/desc size limit vars to indicate SI unit (Carl Dong)
1ecc77321d scripted-diff: Rename DEFAULT_MEMPOOL_EXPIRY to indicate time unit (Carl Dong)
aa9141cd81 mempool: Pass in -mempoolexpiry instead of referencing gArgs (Carl Dong)
51c7a41a5e init: Only determine maxmempool once (Carl Dong)
386c9472c8 mempool: Make GetMinFee() with custom size protected (Carl Dong)
82f00de7a6 mempool: Pass in -maxmempool instead of referencing gArgs (Carl Dong)
f1941e8bfd pool: Add and use MemPoolOptions, ApplyArgsManOptions (Carl Dong)
0199bd35bb fuzz/rbf: Add missing TestingSetup (Carl Dong)
ccbaf546a6 scripted-diff: Rename DEFAULT_MAX_MEMPOOL_SIZE to indicate SI unit (Carl Dong)
fc02f77ca6 ArgsMan: Add Get*Arg functions returning optional (Carl Dong)
Pull request description:
This is part of the `libbitcoinkernel` project: #24303, https://github.com/bitcoin/bitcoin/projects/18
-----
As mentioned in the Stage 1 Step 2 description of [the `libbitcoinkernel` project](https://github.com/bitcoin/bitcoin/issues/24303), `ArgsManager` will not be part of `libbitcoinkernel`. Therefore, it is important that we remove any dependence on `ArgsManager` by code that will be part of `libbitcoinkernel`. This is the first in a series of PRs aiming to achieve this.
This PR removes `CTxMemPool+MempoolAccept`'s dependency on `ArgsManager` by introducing a `CTxMemPool::Options` struct, which is used to specify `CTxMemPool`'s various options at construction time.
These options are:
- `-maxmempool` -> `CTxMemPool::Options::max_size`
- `-mempoolexpiry` -> `CTxMemPool::Options::expiry`
- `-limitancestorcount` -> `CTxMemPool::Options::limits::ancestor_count`
- `-limitancestorsize` -> `CTxMemPool::Options::limits::ancestor_size`
- `-limitdescendantcount` -> `CTxMemPool::Options::limits::descendant_count`
- `-limitdescendantsize` -> `CTxMemPool::Options::limits::descendant_size`
More context can be gleaned from the commit messages. The important commits are:
- 56eb479ded8bfb2ef635bb6f3b484f9d5952c70d "pool: Add and use MemPoolOptions, ApplyArgsManOptions"
- a1e08b70f3068f4e8def1c630d8f50cd54da7832 "mempool: Pass in -maxmempool instead of referencing gArgs"
- 6f4bf3ede5812b374828f08fc728ceded2f10024 "mempool: Pass in -mempoolexpiry instead of referencing gArgs"
- 5958a7fe4806599fc620ee8c1a881ca10fa2dd16 "mempool: Introduce (still-unused) MemPoolLimits"
Reviewers: Help needed in the following commits (see commit messages):
- a1e08b70f3068f4e8def1c630d8f50cd54da7832 "mempool: Pass in -maxmempool instead of referencing gArgs"
- 0695081a797e9a5d7787b78b0f8289dafcc6bff7 "node/ifaces: Use existing MemPoolLimits"
Note to Reviewers: There are perhaps an infinite number of ways to architect `CTxMemPool::Options`, the current one tries to keep it simple, usable, and flexible. I hope we don't spend too much time arguing over the design here since that's not the point. In the case that you're 100% certain that a different design is strictly better than this one in every regard, please show us a fully-implemented branch.
-----
TODO:
- [x] Use the more ergonomic `CTxMemPool::Options` where appropriate
- [x] Doxygen comments for `ApplyArgsManOptions`, `MemPoolOptions`
-----
Questions for Reviewers:
1. Should we use `std::chrono::seconds` for `CTxMemPool::Options::expiry` and `CTxMemPool::m_expiry` instead of an `int64_t`? Something else? (`std::chrono::hours`?)
2. Should I merge `CTxMemPool::Limits` inside `CTxMemPool::Options`?
ACKs for top commit:
MarcoFalke:
ACK d1684beabe🍜
ryanofsky:
Code review ACK d1684beabe. Just minor cleanups since last review, mostly switching to brace initialization
Tree-SHA512: 2c138e52d69f61c263f1c3648f01c801338a8f576762c815f478ef5148b8b2f51e91ded5c1be915e678c0b14f6cfba894b82afec58d999d39a7bb7c914736e0b
Since:
- UpdateTransactionsFromBlock is only called by
MaybeUpdateMempoolForReorg, which calls it with the gArgs-determined
ancestor limits
- UpdateForDescendants is only called by UpdateTransactionsFromBlock
with the ancestor limits unchanged
We can remove the requirement to specify the ancestor limits for both
UpdateTransactionsFromBlock and UpdateForDescendants and just use the
values in the m_limits member.
Also move some removed comments to MemPoolLimits struct members.
The uint64_t cast in UpdateForDescendants is not new behavior,
see the diff in CChainState::MaybeUpdateMempoolForReorg for where they
were previously.
Better to be explicit when it comes to sizes to avoid unintentional
bugs. We use MB and KB all over the place.
-BEGIN VERIFY SCRIPT-
find_regex="DEFAULT_(ANCESTOR|DESCENDANT)_SIZE_LIMIT" \
&& git grep -l -E "$find_regex" \
| xargs sed -i -E "s@$find_regex@\0_KVB@g"
-END VERIFY SCRIPT-
- Store the mempool expiry (-mempoolexpiry) in CTxMemPool as a
std::chrono::seconds member.
- Remove the requirement to explicitly specify a mempool expiry for
LimitMempoolSize(...), just use the newly-introduced member.
- Remove all now-unnecessary instances of:
std::chrono::hours{gArgs.GetIntArg("-mempoolexpiry", DEFAULT_MEMPOOL_EXPIRY)}
- Store the mempool size limit (-maxmempool) in CTxMemPool as a member.
- Remove the requirement to explicitly specify a mempool size limit for
CTxMemPool::GetMinFee(...) and LimitMempoolSize(...), just use the
stored mempool size limit where possible.
- Remove all now-unnecessary instances of:
gArgs.GetIntArg("-maxmempool", DEFAULT_MAX_MEMPOOL_SIZE_MB) * 1000000
The code change in CChainState::GetCoinsCacheSizeState() is correct
since the coinscache should not repurpose "extra" mempool memory
headroom for itself if the mempool doesn't even exist.
Better to be explicit when it comes to sizes to avoid unintentional
bugs. We use MB and KB all over the place.
-BEGIN VERIFY SCRIPT-
find_regex="DEFAULT_MAX_MEMPOOL_SIZE" \
&& git grep -l -E "$find_regex" \
| xargs sed -i -E "s@$find_regex@\0_MB@g"
-END VERIFY SCRIPT-
ce893c0497 doc: Update developer notes (Anthony Towns)
d2852917ee sync.h: Imply negative assertions when calling LOCK (Anthony Towns)
bba87c0553 scripted-diff: Convert global Mutexes to GlobalMutexes (Anthony Towns)
a559509a0b sync.h: Add GlobalMutex type (Anthony Towns)
be6aa72f9f qt/clientmodel: thread safety annotation for m_cached_tip_mutex (Anthony Towns)
f24bd45b37 net_processing: thread safety annotation for m_tx_relay_mutex (Anthony Towns)
Pull request description:
This changes `LOCK(mutex)` for non-global, non-recursive mutexes to be annotated with the negative capability for the mutex it refers to, to prevent . clang applies negative capabilities recursively, so this helps avoid forgetting to annotate functions.
This can't reasonably be used for globals, because clang would require every function to be annotated with `EXCLUSIVE_LOCKS_REQUIRED(!g_mutex)` for each global mutex; so this introduces a trivial `GlobalMutex` subclass of `Mutex`, and reduces the annotations for both `GlobalMutex` to `LOCKS_EXCLUDED` which only catches trivial errors (eg (`LOCK(x); LOCK(x);`).
ACKs for top commit:
MarcoFalke:
review ACK ce893c0497🐦
hebasto:
ACK ce893c0497
Tree-SHA512: 5c35e8c7677ce3d994a7e3774f4344adad496223a51b3a1d1d3b5f20684b2e1d5cff688eb3fbc8d33e1b9940dfa76e515f9434e21de6f3ce3c935e29a319f529
fa4068b4e2 Move minRelayTxFee to policy/settings (MacroFake)
Pull request description:
Seems a bit confusing to put policy stuff into validation, so fix that.
Also fix includes via `iwyu`.
ACKs for top commit:
ariard:
ACK fa4068b, the includes move compiles well locally.
ryanofsky:
Code review ACK fa4068b4e2. Make sense to move the global variable to policy/settings and the default constant to policy/policy. Ariard points out other constants that could be moved, which seems fine, but it seems like moving the global variable to be with other related global variables is more significant.
Tree-SHA512: adf9619002610d1877f3aef0a9e6115fc4c2ad64135a3e5100824c650b560c47f47ac28894c6214a50a7888355252a9f6f7cec98c23a771a1964160ef1ca77de
This check isn't any longer just about bad pay-to-script-hash inputs; it
also excludes any kind of nonstandard input, unknown witness versions,
coinbases, etc.
This is the "fruit of our labor" for this patchset.
ChainstateManager::PopulateAndValidateSnapshot can now directly call
ComputeUTXOStats(...).
Our consensus engine is now fully decoupled from all indices.
See the src/Makefile.am for some satisfying removals.
In previous commits in this patchset, we removed all in-param members of
CCoinsStats. Now that that's done, we can modify GetUTXOStats to return
an optional CCoinsStats instead of a status bool. Callers are modified
accordingly.
In rpc/blockchain.cpp, we discover that GetUTXOStats' status bool when
getting UTXO stats for pprev was not checked for error. We fix this as
well.
Currently, CCoinsStats is a struct with both in-params and out-params
where the hash_type and index_requested members are the only in-params.
This change removes CCoinsStats' hash_type in-param member and adds it
to the relevant functions instead.
[META] In subsequent commits, all of CCoinsStats' members which serve as
in-params will be moved out so as to make CCoinsStats a pure
out-param struct.
This decouples validation.cpp from netaddress.cpp (transitively,
timedata.cpp, and asmap.cpp).
This is important for libbitcoinkernel as:
- There is no reason for the consensus engine to be coupled with
netaddress, timedata, and asmap
- Users of libbitcoinkernel can now easily supply their own
std::function that provides the adjusted time.
See the src/Makefile.am changes for some satisfying removals.
bb5c24b120 validation: move g_versionbitscache into ChainstateManager (Anthony Towns)
eca22c726a test/versionbits: make versionbitscache a parameter (Anthony Towns)
d603f1d8a7 deploymentstatus: make versionbitscache a parameter (Anthony Towns)
78adef1753 refactor: use chainman instead of chainParams for DeploymentActive* (Anthony Towns)
deffe0df6c deploymentstatus: allow chainman in place of consensusParams (Anthony Towns)
eaa2e3f25c validation: move UpdateUncommittedBlockStructures and GenerateCoinbaseCommitment into ChainstateManager (Anthony Towns)
5c67e84d37 validation: replace ::Params() calls with chainstate/chainman member (Anthony Towns)
38860f93b6 validation: remove redundant CChainParams params from ChainstateManager methods (Anthony Towns)
69675ea4e7 validation: add CChainParams to ChainstateManager (Anthony Towns)
Pull request description:
Gives `ChainstateManager` a reference to the `CChainParams` its working on, and simplifies some of the functions that would otherwise take that as a parameter. Removes the `g_versionbitscache` global by moving it into `ChainstateManager`.
ACKs for top commit:
dongcarl:
reACK bb5c24b120
MarcoFalke:
review ACK bb5c24b120📙
Tree-SHA512: 3fa74905e5df561e3e74bb0b8fce6085c5311e6633e7d74c0fb0c82a907f5bbb1fd4ebc5d11d4f0b1c019bb51eabb9f6e4bcc4652a696d36a5878c807b85f121
This fixes a blk file size calculation made during reindex that results in increased blk file malformity.
The fix is to avoid double counting the size of the serialization header during reindex.
This adds a unit test to reproduce the bug before the fix and to ensure that it does not recur.
These changes include a log message change also so as to not be as alarming. This is a common and recoverable
data corruption. These messages can now be filtered by the debug log reindex category.
308dd2e93e Sanity assert GetAncestor() != nullptr where appropriate (Adam Jonas)
Pull request description:
Re-opening #17232. I have rebased the PR and addressed jonatack's nit suggestions.
Add sanity asserts for return value of `CBlockIndex::GetAncestor()` where appropriate.
In validation.cpp `CheckSequenceLocks`, check the return value of `tip->GetAncestor(maxInputHeight)` stored into `lp->maxInputBlock`. If it ever returns `nullptr` because the ancestor isn't found, it's going to be a bad bug to keep going, since a `LockPoints` object with the `maxInputBlock` member set to `nullptr` signifies no relative lock time.
In the other places, the added asserts would prevent accidental dereferencing of a null pointer which is undefined behavior.
Co-Authored-By: Adam Jonas <jonas@chaincode.com>
Co-Authored-By: danra <danra@users.noreply.github.com>
ACKs for top commit:
jonatack:
ACK 308dd2e93e
Tree-SHA512: 5bfdaab1499607ae2c3cd3e2e9e8c37850bfd0e327e680f4e36c81f9c6d98a543af78ecfac1ab0e06325d264412615a04d52005875780c7db2a4d81bd2d2259a
Add sanity asserts for return value of `CBlockIndex::GetAncestor()` where appropriate.
In validation.cpp `CheckSequenceLocks`, check the return value of `tip->GetAncestor(maxInputHeight)` stored into `lp->maxInputBlock`. If it ever returns `nullptr` because the ancestor isn't found, it's going to be a bad bug to keep going, since a `LockPoints` object with the `maxInputBlock` member set to `nullptr` signifies no relative lock time.
In the other places, the added asserts would prevent accidental dereferencing of a null pointer which is undefined behavior.
Co-Authored-By: Aurèle Oulès <aurele@oules.com>
Co-Authored-By: danra <danra@users.noreply.github.com>
f64aa9c411 Disallow more unsafe string->path conversions allowed by path append operators (Ryan Ofsky)
Pull request description:
Add more `fs::path` `operator/` and `operator+` overloads to prevent unsafe string->path conversions on Windows that would cause strings to be decoded according to the current Windows locale & code page instead of the correct string encoding.
Update application code to deal with loss of implicit string->path conversions by calling `fs::u8path` or `fs::PathFromString` explicitly, or by just changing variable types from `std::string` to `fs::path` to avoid conversions altogether, or make them happen earlier.
In all cases, there's no change in behavior either (1) because strings only contained ASCII characters and would be decoded the same regardless of what encoding was used, or (2) because of the 1:1 mapping between paths and strings using the `PathToString` and `PathFromString` functions.
Motivation for this PR was just that I was experimenting with #24469 and noticed that operations like `fs::path / std::string` were allowed, and I thought it would be better not to allow them.
ACKs for top commit:
hebasto:
ACK f64aa9c411
Tree-SHA512: 944cce49ed51537ee7a35ea4ea7f5feaf0c8fff2fa67ee81ec5adebfd3dcbaf41b73eb35e49973d5f852620367f13506fd12a7a9b5ae3a7a0007414d5c9df50f
fa10c9f5a1 Crash debug builds on PCKG_MEMPOOL_ERROR (MacroFake)
Pull request description:
Would be nice to allow fuzz targets to meaningfully cover this code
ACKs for top commit:
glozow:
utACK fa10c9f5a1
vincenzopalazzo:
ACK fa10c9f5a1
Tree-SHA512: 68efacedbf72f67cf3dc0bb9927a698492cdc1b08df91ef6af863ad8828b78058a64e52d64d244a5b2966cb9e63797b2647d1bb222677bf83b26fca6e4b1dbf0
In previous commits in this patchset, we've made sure that every
Unload/UnloadBlockIndex member function resets its own members, and does
not reach out to globals.
This means that their corresponding classes' default destructors can now
replace them, and do an even more thorough job without the need to be
updated for every new member variable.
Therefore, we can remove them, and also remove UnloadBlockIndex since
that's not used anymore.
Unfortunately, chainstatemanager_loadblockindex relies on
CChainState::UnloadBlockIndex, so that needs to stay for now.
This change also introduces an aditional buffer of 10 blocks (PRUNE_LOCK_BUFFER) that will not be pruned before the best block.
Co-authored-by: Luke Dashjr <luke-jr+git@utopios.org>
Add more fs::path operator/ and operator+ overloads to prevent unsafe
string->path conversions on Windows that would cause strings to be
decoded according to the current Windows locale & code page instead of
the correct string encoding.
Update application code to deal with loss of implicit string->path
conversions by calling fs::u8path or fs::PathFromString explicitly, or
by just changing variable types from std::string to fs::path to avoid
conversions altoghther, or make them happen earlier.
In all cases, there's no change in behavior either (1) because strings
only contained ASCII characters and would be decoded the same regardless
of what encoding was used, or (2) because of the 1:1 mapping between
paths and strings using the PathToString and PathFromString functions.
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
-----
Code Reviewer Notes
Call graph of relevant functions:
UnloadBlockIndex() <-- Moved from
calls ChainstateManager::Unload()
which calls BlockManager::Unload() <-- Moved to
So calling UnloadBlockIndex() would still run this moved code. The code
will also now run when ~BlockManager gets called, which makes sense.
-----
Code Reviewer Notes
Call graph of relevant functions:
UnloadBlockIndex() <-- Moved from
calls ChainstateManager::Unload() <-- Moved to
Safe because ChainstateManager::Unload() is called only by
UnloadBlockIndex() and no other callers.
When using checklevel=4, block verification fails because of duplicate coinbase transactions
involving blocks 91812 and 91722. There was already a check in place for ConnectBlock to
ignore the problematic blocks, but DisconnectBlock did not contain a similar check.
This change ignores the blocks where these inconsistencies surface so
that block verification will succeed at checklevel=4.
Now BlockManager::LoadBlockIndex() will ACTUALLY only load BlockMan
members.
[META] In a later commit, pindexBestHeader will be moved to ChainMan as
a member
-----
Code Reviewer Notes
Call graph of relevant functions:
ChainstateManager::LoadBlockIndex() <-- Moved to
calls BlockManager::LoadBlockIndexDB()
which calls BlockManager::LoadBlockIndex() <-- Moved from
There is only one call to each of inner functions, meaning that no
behavior is changing.
Package validation policy only differs from individual policy in its
evaluation of feerate. Minimize DoS surface; don't validate all over
again if we know the result will be the same.
This avoids "parents pay for children" and "siblings pay for siblings"
behavior, since package feerate is calculated with totals and is
topology-unaware.
It also ensures that package validation never causes us to reject a
transaction that we would have otherwise accepted in single-tx
validation.
cccc1e70b8 Enforce Taproot script flags whenever WITNESS is set (MarcoFalke)
fa42299411 Remove nullptr check in GetBlockScriptFlags (MarcoFalke)
faadc606c7 refactor: Pass const reference instead of pointer to GetBlockScriptFlags (MarcoFalke)
Pull request description:
Now that Taproot is active, it makes sense to enforce its rules on all blocks, even historic ones, regardless of the deployment status.
### Benefits:
(With "script flags" I mean "taproot script verification flags".)
* Script flags are known ahead for all blocks (even blocks not yet created) and do not change. This may benefit static analysis, code review, and development of new script features that build on Taproot.
* Any future bugs introduced in the deployment code won't have any effect on the script flags, as they are independent of deployment.
* Enforcing the taproot rules regardless of the deployment status makes testing easier because invalid blocks after activation are also invalid before activation. So there is no need to differentiate the two cases.
* It gives belt-and-suspenders protection against a practically expensive and theoretically impossible IBD reorg attack where the node is eclipsed. While `nMinimumChainWork` already protects against this, the cost for a few months worth of POW might be lowered until a major version release of Bitcoin Core reaches EOL. The needed work for the attack is the difference between `nMinimumChainWork` and the work at block 709632.
For reference, previously the same was done for P2SH and WITNESS in commit 0a8b7b4b33.
### Implementation:
I found one block which fails verification with the flags applied, so I added a `TaprootException`, similar to the `BIP16Exception`.
For reference, the debug log:
```
ERROR: ConnectBlock(): CheckInputScripts on b10c007c60e14f9d087e0291d4d0c7869697c6681d979c6639dbd960792b4d41 failed with non-mandatory-script-verify-flag (Witness program was passed an empty witness)
BlockChecked: block hash=0000000000000000000f14c35b2d841e986ab5441de8c585d5ffe55ea1e395ad state=non-mandatory-script-verify-flag (Witness program was passed an empty witness)
InvalidChainFound: invalid block=0000000000000000000f14c35b2d841e986ab5441de8c585d5ffe55ea1e395ad height=692261 log2_work=92.988459 date=2021-07-23T08:24:20Z
InvalidChainFound: current best=0000000000000000000067b17a4c0ffd77c29941b15ad356ca8f980af137a25d height=692260 log2_work=92.988450 date=2021-07-23T07:47:31Z
ERROR: ConnectTip: ConnectBlock 0000000000000000000f14c35b2d841e986ab5441de8c585d5ffe55ea1e395ad failed, non-mandatory-script-verify-flag (Witness program was passed an empty witness)
```
Hint for testing, make sure to set `-noassumevalid`.
### Considerations
Obviously this change can lead to consensus splits on the network in light of massive reorgs. Currently the last block before Taproot activation, that is the last block without the Taproot script flags set, is only buried by a few days of POW. However, when and if this patch is included in the next major release, it will be buried by a few months of POW. BIP90 considerations apply when looking at reorgs this large.
ACKs for top commit:
Sjors:
tACK cccc1e70b8
achow101:
ACK cccc1e70b8
laanwj:
Code review ACK cccc1e70b8
ajtowns:
ACK cccc1e70b8 ; code review; wrote a "getblockscriptflags" rpc to quickly check that blocks just had bit 17 (taproot) added; review of earlier revisions had established non-exception blocks do validate with taproot rules enabled.
jamesob:
ACK cccc1e70b8 ([`jamesob/ackr/23536.1.MarcoFalke.enforce_taproot_script_f`](https://github.com/jamesob/bitcoin/tree/ackr/23536.1.MarcoFalke.enforce_taproot_script_f))
Tree-SHA512: 00044de68939caef6420ffd588c1291c041a8b397c80a3df1e3e3487fbeae1821d23975c51c95e44e774558db76f943b00b4e27cbd0213f64a9253116dc6edde
f865cf8ded Add and use BlockManager::GetAllBlockIndices (Carl Dong)
28ba0313ea Add and use CBlockIndexHeightOnlyComparator (Carl Dong)
12eb05df63 move-only: Move CBlockIndexWorkComparator to blockstorage (Carl Dong)
c600ee3816 Only load BlockMan in BlockMan member functions (Carl Dong)
42e56d9b18 style-only: No need for std::pair for vSortedByHeight (Carl Dong)
3bbb6fea05 style-only: Various blockstorage.cpp cleanups (Carl Dong)
5be9ee3c54 refactor: more const annotations for uses of CBlockIndex* (Anthony Towns)
Pull request description:
The only important commit is "Only load BlockMan in BlockMan member functions", everything else is all just small style changes.
Here's the commit message, reproduced:
```
This commit effectively splits the "load block index itself" logic from
"derive Chainstate variables from loaded block index" logic.
This means that BlockManager::LoadBlockIndex{,DB} will only load what's
relevant to the BlockManager.
```
ACKs for top commit:
ajtowns:
ACK f865cf8ded ; code review only
MarcoFalke:
review ACK f865cf8ded 🗂
Tree-SHA512: 7b204d782834e06fd7329d022e2ae860181b4e8105c33bfb928539a4ec24161dc7438a9c4d4ee279dcad77de310c160b997bb8aa18923243d0fd55ccf4ad7c3a
This commit effectively splits the "load block index itself" logic from
"derive Chainstate variables from loaded block index" logic.
This means that BlockManager::LoadBlockIndex{,DB} will only load what's
relevant to the BlockManager.
I strongly recommend reviewing with the following git-diff flags:
--color-moved=dimmed_zebra --color-moved-ws=allow-indentation-change
fa8d4d9128 scripted-diff: Clarify CheckFinalTxAtTip name (MarcoFalke)
fa4e30b0f3 policy: Remove unused locktime flags (MarcoFalke)
Pull request description:
The locktime flags have many issues:
* They are passed in by a default argument, which is fragile. It has already lead to bugs like the one fixed in commit e30b6ea194.
* They are negative (signed), which doesn't make sense for flags (unsigned in general). According to the review comments when the code was added: "The max on the flags is a fairly weird operation." (https://github.com/bitcoin/bitcoin/pull/6566#issuecomment-150310861)
* No call site relies on the default argument and they all pass in a single compile-time constant, rendering most of the code dead and untested.
* The dead code calls `GetAdjustedTime` (network adjusted time), which has its own issues. See https://github.com/bitcoin/bitcoin/issues/4521
Fix all issues by removing them
ACKs for top commit:
ajtowns:
ACK fa8d4d9128
theStack:
Code-review ACK fa8d4d9128
glozow:
ACK fa8d4d9128, agree the default arg `flags` is a massive footgun and just setting max flags is weird. Adding `AtTip` to the names makes sense to me, since they're both testing for *next* block and only ever used for {,re}addition to mempool.
Tree-SHA512: 79f4a52f34909eb598d88bbae7afe8abe5f85f45c128483d16aa83dacd0e5579e561b725d01b1e9a931d1821012a51ad2bc6fb2867f8d09ee541f9d234d696f8
faa1aec26b Remove confusing P1008R1 violation in ATMPArgs (MarcoFalke)
Pull request description:
The `= delete` doesn't achieve the stated goal and it is also redundant, since it is not possible to default construct the `ATMPArgs` type.
This can be tested with:
```diff
diff --git a/src/validation.cpp b/src/validation.cpp
index 2813b62462..1c939c0b8a 100644
--- a/src/validation.cpp
+++ b/src/validation.cpp
@@ -519,6 +519,7 @@ public:
/** Parameters for child-with-unconfirmed-parents package validation. */
static ATMPArgs PackageChildWithParents(const CChainParams& chainparams, int64_t accept_time,
std::vector<COutPoint>& coins_to_uncache) {
+ ATMPArgs{};
return ATMPArgs{/* m_chainparams */ chainparams,
/* m_accept_time */ accept_time,
/* m_bypass_limits */ false,
```
Which fails on current master *and* this pull with the following error:
```
validation.cpp:525:22: error: reference member of type 'const CChainParams &' uninitialized
ATMPArgs{};
~^
validation.cpp:470:29: note: uninitialized reference member is here
const CChainParams& m_chainparams;
^
1 error generated.
```
Further reading (optional):
* http://open-std.org/JTC1/SC22/WG21/docs/papers/2018/p1008r1.pdf
ACKs for top commit:
achow101:
ACK faa1aec26b
glozow:
code review ACK faa1aec26b
Tree-SHA512: 16db2c9959a1996eafbfa533dc4d1483761b9d28295aed5a82b86abd7268da37c51c59ddc67c205165ecb415dbe637b12a0e1b3234d50ab0b3b79de66d7bd73e
6c23c41561 refactor: Rewrite AddToBlockIndex with try_emplace (Carl Dong)
c05cf7aa1e style: Modernize range-based loops over m_block_index (Carl Dong)
c2a1655799 style-only: Use using instead of typedef for BlockMap (Carl Dong)
dd79dad175 refactor: Rewrite InsertBlockIndex with try_emplace (Carl Dong)
531dce0347 tests: Remove now-unnecessary manual Unload's (Carl Dong)
bec86ae326 blockstorage: Make m_block_index own CBlockIndex's (Carl Dong)
Pull request description:
Part of: #24303
Split off from: #22564
```
Instead of having CBlockIndex's live on the heap, which requires manual
memory management, have them be owned by m_block_index. This means that
they will live and die with BlockManager.
```
The second commit demonstrates how this makes calls to `Unload()` to satisfy the address sanitizer unnecessary.
ACKs for top commit:
ajtowns:
ACK 6c23c41561
MarcoFalke:
re-ACK 6c23c41561🎨
Tree-SHA512: 81b2b5119be27cc0f8a9457b11da60cc60930315d2a5be36be89fe253d32073ffe622348ff153114b9b3212197bddbc791810913a43811b33cc58e7162bd105b
ae9ceed3e2 validation, refactoring: remove ChainstateManager::Reset() (Jon Atack)
daad0093e3 validation: replace lock with annotation in UnloadBlockIndex() (Jon Atack)
Pull request description:
Thread safety refactoring seen in #24177:
- replace re-acquiring lock cs_main with a thread safety annotation in UnloadBlockIndex()
- remove ChainstateManager::Reset(), as it is currently unused (can be reintroduced in the test utilities if needed for unit testing)
ACKs for top commit:
laanwj:
Code review ACK ae9ceed3e2
vasild:
ACK ae9ceed3e2
klementtan:
crACK ae9ceed3e2
Tree-SHA512: cebb782572997cc2dda01590d6bb6c5e479e8202324d8b6ff459b814ce09e818b996c881736bfebd1b8bf4b6d7a0f79faf3ffea176a4699dd7d7429de2db2d13
fa7991601c Fixup style of VerifyDB (MarcoFalke)
fa462ea787 Avoid implicit-integer-sign-change in VerifyLoadedChainstate (MarcoFalke)
Pull request description:
This happens when checking all blocks (`-1`).
To test:
```
./configure CC=clang CXX=clang++ --with-sanitizers=undefined,integer
make
UBSAN_OPTIONS="suppressions=$(pwd)/test/sanitizer_suppressions/ubsan:print_stacktrace=1:halt_on_error=1:report_error_type=1" ./test/functional/rpc_blockchain.py
ACKs for top commit:
theStack:
Code-review ACK fa7991601c
brunoerg:
crACK fa7991601c
Tree-SHA512: bcbe6becf2fbedd21bbde83a544122e79465937346802039532143b2e4165784905a8852c0ccb088b964874df5e5550931fdde3629cbcee3ae237f2f63c43a8e
Instead of having CBlockIndex's live on the heap, which requires manual
memory management, have them be owned by m_block_index. This means that
they will live and die with BlockManager.
A change to BlockManager::LookupBlockIndex:
- Previously, it was a const member function returning a non-const CBlockIndex*
- Now, there's are const and non-const versions of
BlockManager::LookupBlockIndex returning a CBlockIndex with the same
const-ness as the member function:
(e.g. const CBlockIndex* LookupBlockIndex(...) const)
See next commit for some weirdness that this eliminates.
The range based for-loops are modernize (using auto + destructuring) in
a future commit.
48742693ac Replace "can not" with "cannot" in docs, user messages, and tests (Jon Atack)
e670edd434 User-facing content fixups from transifex translator feedback (Jon Atack)
Pull request description:
Closes#24366.
ACKs for top commit:
laanwj:
Code review re-ACK 48742693ac
hebasto:
re-ACK 48742693ac, only suggested change since my previous [review](https://github.com/bitcoin/bitcoin/pull/24367#pullrequestreview-885938219).
Tree-SHA512: 4dcdcb417251a413e65fab6070515e13a1267c8e0dbcf521386b842511391f24c84a0c2168fe13458c977682034466509bf2a3453719d4d94d3c568fd9f4adb4
799968e8b3 tracing: misc follow-ups to 22902 (0xb10c)
36a6584703 tracing: correctly scope utxocache:flush tracepoint (Arnab Sen)
Pull request description:
This PR is a follow-up to the [#22902](https://github.com/bitcoin/bitcoin/pull/22902).
Previously, the tracepoint `utxocache:flush` was called, even when it was not flushing. So, the tracepoint is now scoped to write only when coins cache to disk.
ACKs for top commit:
0xB10C:
ACK 799968e8b3
Tree-SHA512: ebb096cbf991c551c81e4339821f10d9768c14cf3d8cb14d0ad851acff5980962228a1c746914c6aba3bdb27e8be53b33349c41efe8bab5542f639916e437b5f
eb8b22d517 block_connected: re-use previous GetTimeMicros (William Casarin)
80e1c55687 block_connected: don't serialize block hash twice (William Casarin)
Pull request description:
In the validation:block_connected tracepoint, we call block->GetHash(), which
ends up calling CBlockHeader::GetHash(), executing around 8000 serialization
instructions. We don't need to do this extra work, because block->GetHash() is
already called further up in the function. Let's save that value as a local
variable and re-use it in our tracepoint so there is no unnecessary tracepoint
overhead.
Shave off an extra 100 or so instructions from the validation:block_connected
tracepoint by reusing a nearby GetTimeMicros(). This brings the tracepoint down
to 54 instructions. Still high, but much better than the previous ~154 and
8000 instructions which it was originally.
Signed-off-by: William Casarin <jb55@jb55.com>
ACKs for top commit:
0xB10C:
ACK eb8b22d517
laanwj:
Code review ACK eb8b22d517
theStack:
re-ACK eb8b22d517
Tree-SHA512: 92ae585e487554e0f73042a8abaa239f630502c1d198e010bd7c1de252d882bccb627bbf0e4faec09c1253e782b145bcf153f9fee78cdb8456188044a96f8267
f485a07454 Add missing thread safety lock assertions in validation.h (Jon Atack)
37af8a20cf Add missing thread safety lock assertions in validation.cpp (Jon Atack)
Pull request description:
A number of functions in validation.{h,cpp} have a thread safety lock annotation in the declaration but are missing the corresponding run-time lock assertion in the definition.
ACKs for top commit:
hebasto:
re-ACK f485a07454, only suggested change since my [previous](https://github.com/bitcoin/bitcoin/pull/24177#pullrequestreview-877810465) review.
vasild:
ACK f485a07454
Tree-SHA512: c86c0c0e8fe6ec7ae9ed9890f1dd7d042aa482ecf99feb6679a670aa004f6e9a99f7bc047205a34968fab7f1f841898c59b48c3ed6245c166e3b5abbf0867445
020acea99b refactor: replace RecursiveMutex m_chainstate_mutex with Mutex (w0xlt)
ddeefeef20 refactor: add negative TS annotations for `m_chainstate_mutex` (w0xlt)
1dfd31bc26 scripted-diff: rename m_cs_chainstate -> m_chainstate_mutex (w0xlt)
Pull request description:
This PR is related to #19303 and gets rid of the `RecursiveMutex m_cs_chainstate`.
`m_cs_chainstate` is only held in `ActivateBestChain()` and `InvalidateBlock()`.
So apparently there is no recursion involved, so the `m_cs_chainstate` can be a non-recursive mutex.
ACKs for top commit:
hebasto:
ACK 020acea99b, I have reviewed the code and it looks OK, I agree it can be merged.
theStack:
Code-review ACK 020acea99b🌴
shaavan:
reACK 020acea99b
Tree-SHA512: c7c16e727e326df3410514915ce753a2a5e1da78857ef965ef683e36251e1b73c9cced4cd5231b04dbe2be0ea14084f6731b4d7a4d9a8e086e982b985e37e4b4
fa5d2e678c Remove unused char serialize (MarcoFalke)
fa24493d63 Use spans of std::byte in serialize (MarcoFalke)
fa65bbf217 span: Add BytePtr helper (MarcoFalke)
Pull request description:
This changes the serialize code (`.read()` and `.write()` functions) to take a `Span` instead of a pointer and size. This is a breaking change for the serialize interface, so at no additional cost we can also switch to `std::byte` (instead of using `char`).
The benefits of using `Span`:
* Less verbose and less fragile code when passing an already existing `Span`(-like) object to or from serialization
The benefits of using `std::byte`:
* `std::byte` can't accidentally be mistaken for an integer
The goal here is to only change serialize to use spans of `std::byte`. If needed, `AsBytes`, `MakeUCharSpan`, ... can be used (temporarily) to pass spans of the right type.
Other changes that are included here:
* [#22167](https://github.com/bitcoin/bitcoin/pull/22167) (refactor: Remove char serialize by MarcoFalke)
* [#21906](https://github.com/bitcoin/bitcoin/pull/21906) (Preserve const in cast on CTransactionSignatureSerializer by promag)
ACKs for top commit:
laanwj:
Concept and code review ACK fa5d2e678c
sipa:
re-utACK fa5d2e678c
Tree-SHA512: 08ee9eced5fb777cedae593b11e33660bed9a3e1711a7451a87b835089a96c99ce0632918bb4666a4e859c4d020f88fb50f2dd734216b0c3d1a9a704967ece6f
This checks finality at the current Tip, so clarify this in its name.
-BEGIN VERIFY SCRIPT-
ren() { sed -i "s/\<$1\>/$2/g" $( git grep -l "$1" ./src/ ) ; }
ren CheckSequenceLocks CheckSequenceLocksAtTip
ren CheckFinalTx CheckFinalTxAtTip
-END VERIFY SCRIPT-
c5b36b1c1b Mempool Update Cut-Through Optimization (Jeremy Rubin)
c49daf9885 [TESTS] Increase limitancestorcount in tournament RPC test to showcase improved algorithm (Jeremy Rubin)
Pull request description:
Often when we're updating mempool entries we update entries that we ultimately end up removing the updated entries shortly thereafter. This patch makes it so that we filter for such entries a bit earlier in processing, which yields a mild improvement for these cases, and is negligible overhead otherwise.
There's potential for a better -- but more sophisticated -- algorithm that can be used taking advantage of epochs, but I figured it is better to do something that is simple and works first and upgrade it later as the other epoch mempool work proceeds as it makes the patches for the epoch algorithm simpler to understand, so you can consider this as preparatory work. It could either go in now if it is not controversial, or we could wait until the other patch is ready to go.
ACKs for top commit:
instagibbs:
reACK c5b36b1
sipa:
utACK c5b36b1c1b
mzumsande:
Code Review ACK c5b36b1c1b
Tree-SHA512: 78b16864f77a637d8a68a65e23c019a9757d8b2243486728ef601d212ae482f6084cf8e69d810958c356f1803178046e4697207ba40d6d10529ca57de647fae6
No behavior change.
This code was introduced in 5add7a7 before we required C++11, which is
why the struct was needed. As we are now using more modern C++ and this
is the only place where lockpoints are updated for mempool entries, it
is more idiomatic to call `modify` with a lambda.
Co-authored-by: Hennadii Stepanov <32963518+hebasto@users.noreply.github.com>
Behavior change: don't quit right after LimitMempoolSize() when a
package is partially submitted. We should still send
TransactionAddedToMempool notifications for
transactions that were submitted.
Not behavior change: add a new package validation result for mempool logic errors.
The previous interface required callers to guess that the tx had been
swapped and look up the tx again by txid to find a `MEMPOOL_ENTRY`
result. This is a confusing interface.
Instead, explicitly tell the caller that this transaction was
`DIFFERENT_WITNESS` in the result linked to the mempool entry's wtxid.
This gives the caller all the information they need in 1 lookup, and
they can query the mempool for the other transaction if needed.
fac22fd36b log: Remove GetAdjustedTime from IBD header progress estimation (MarcoFalke)
Pull request description:
This is a "refactor" that shouldn't change behaviour, because the two times are most likely equal. A minimum of 5 outbound peers are needed to adjust the time. And if the time is adjusted, it will be by at most 70 minutes (`DEFAULT_MAX_TIME_ADJUSTMENT`). Thus, the progress estimate should differ by at most 7 blocks.
ACKs for top commit:
laanwj:
Code review ACK fac22fd36b
vincenzopalazzo:
ACK fac22fd36b
Tree-SHA512: bf9f5eef66db0110dd268cf6dbfab64b9c11ba776924f5b386ceae3f2d005272cceb87ebcc96e0c8b854c051514854a2a5af39ae43bad008fac685b5aafaabd0
Shave off an extra 100 or so instructions from the
validation:block_connected tracepoint by reusing a nearby
GetTimeMicros(). This brings the tracepoint down to 54 instructions.
Still high, but much better than the previous ~154 and 8000 instructions
which it was originally.
Signed-off-by: William Casarin <jb55@jb55.com>
In the validation:block_connected tracepoint, we call block->GetHash(),
which ends up calling CBlockHeader::GetHash(), executing around 8000
serialization instructions. We don't need to do this extra work, because
block->GetHash() is already called further up in the function. Let's
save that value as a local variable and re-use it in our tracepoint so
there is no unnecessary tracepoint overhead.
Signed-off-by: William Casarin <jb55@jb55.com>
BlockManager is a large data structure, and cs_main is not required to
take its address or access every part of it. Individual BlockManager
fields and methods which do require cs_main like m_block_index and
LookupBlockIndex are already annotated separately, and these other
annotations describe locking requirements more accurately and do a
better job enforcing thread safety.
Since cs_main is not needed to access the address of the m_block object,
this commit drops cs_main LOCK calls which were added pointlessly to
satisfy this annotation in the past.
Co-authored-by: Carl Dong <contact@carldong.me>
This is a refactor and safe to do because:
* UnloadBlockIndex calls ChainstateManager::Unload, which calls
BlockManager::Unload
* Only unit tests call Unload directly
fa7efc915b Fixup style of moved code (MarcoFalke)
fade2a44f4 Move BlockManager to node/blockstorage (MarcoFalke)
Pull request description:
`BlockManager` is responsible for reading and writing block(headers). So move it to the existing `blockstorage` module in `node`. Also, move validation code unrelated to block-storage out from `BlockManager`.
ACKs for top commit:
ryanofsky:
Code review obvious ACK fa7efc915b
Tree-SHA512: 0197943d818e5f59e743b07fbb92e7661bff90081127a41e35e5692ce49d6f6a7872448670b0da282f7714580a45c8d93e571a67177c8b5f785ce9edefe834c5
b4adc5ad67 [bugfix] update lockpoints correctly during reorg (glozow)
b6002b07a3 MOVEONLY: update_lock_points to txmempool.h (glozow)
Pull request description:
I introduced a bug in #22677 (sorry! 😅)
Mempool entries cache `LockPoints`, containing the first height/blockhash/`CBlockIndex*` at which the transaction becomes valid. During a reorg, we re-check timelocks on all mempool entries using `CheckSequenceLocks(useExistingLockPoints=false)` and remove any now-invalid entries. `CheckSequenceLocks()` also mutates the `LockPoints` passed in, and we update valid entries' `LockPoints` using `update_lock_points`. Thus, `update_lock_points(lp)` needs to be called right after `CheckSequenceLocks(lp)`, otherwise we lose the data in `lp`. I incorrectly assumed they could be called in separate loops.
The incorrect behavior introduced is: if we have a reorg in which a timelocked mempool transaction is still valid but becomes valid at a different block, the cached `LockPoints` will be incorrect.
This PR fixes the bug, adds a test, and adds an assertion at the end of `removeForReorg()` to check that all mempool entries' lockpoints are valid. You can reproduce the bug by running the test added in the [test] commit on the code before the [bugfix] commit.
ACKs for top commit:
jnewbery:
ACK b4adc5ad67
vasild:
ACK b4adc5ad67
mzumsande:
Code Review ACK b4adc5ad67
hebasto:
ACK b4adc5ad67
MarcoFalke:
re-ACK b4adc5ad67🏁
Tree-SHA512: 16b59f6ff8140d0229079ca1c6b04f2f4a00a2e49931275150e4f3fe5ac4ec109698b083fa6b223ba9511f328271cc1ab081263669d5da020af7fee83c13e401
fadd73037e refactor: Remove implicit-integer-sign-change suppressions in validation.cpp (MarcoFalke)
Pull request description:
A file-wide suppression is problematic because it will wave through future violations, potentially bugs.
Fix that by using per-statement casts.
ACKs for top commit:
shaavan:
ACK fadd73037e
theStack:
Code-review ACK fadd73037e
Tree-SHA512: a8a05613be35382b92d7970f958a4e8f4332432056eaa9d72f6719495134b93aaaeea692899d9035654d0e0cf56bcd759671eeeacfd0535582c0ea048ab58a56
fa1a51cbc1 doc: testnet3 was not reset and is doing BIP30 checks again (MarcoFalke)
Pull request description:
ACKs for top commit:
theStack:
ACK fa1a51cbc1
Tree-SHA512: 793eccda583a3edb056b142c36a09a5c867f61d90b96e15e6643417d62eb651eb2f3429c5f245bdb062d18ab9bb05b5048c0888aa5a492cb7bb21a2f3f52324e
Previously, the `utxocache:flush` tracepoint was in the wrong scope and
reached every time `CChainState::FlushStateToDisk` was called, even when
there was no flushing of the cache. The tracepoint is now properly scoped
and will be reached during a full flush.
Inside the scope, the `fDoFullFlush` value will always be `true`, so it
doesn't need to be logged separately. Hence, it's dropped from the
tracepoint arguments.
fab6d6b2d1 Move pindexBestInvalid to ChainstateManager (MarcoFalke)
facd2137ec Move m_failed_blocks to ChainstateManager (MarcoFalke)
fa47b5c100 Move AcceptBlockHeader to ChainstateManager (MarcoFalke)
fa3d62cf7b Move FindForkInGlobalIndex from BlockManager to CChainState (MarcoFalke)
Pull request description:
Move globals or members of the wrong class to the right class.
ACKs for top commit:
naumenkogs:
ACK fab6d6b2d1
Sjors:
ACK fab6d6b2d1
shaavan:
ACK fab6d6b2d1
Tree-SHA512: 926cbdfa22838517497bacb79ed5f521f64117c2aacf96a0176f62831b4713314a32abc0213df5ee067edf63e4a4300f752a26006d36e5aab415bb91209a271f
The member is unrelated to block storage (BlockManager). It is related
to validation.
Fix the confusion by moving it.
Can be reviewed with
--color-moved=dimmed-zebra --color-moved-ws=ignore-all-space
The helper was moved in commit b026e318c3,
which also mentioned that it could be moved to CChainState. So do that,
as the functionality is not block-storage related.
This also allows to drop one function argument.
2283b9cd1e test: add tests for LoadBlockIndex when using multiple chainstates (James O'Beirne)
0fd599a51a validation: have LoadBlockIndex account for snapshot use (James O'Beirne)
d0c6e61f5d validation: don't modify genesis during snapshot load (James O'Beirne)
Pull request description:
This is part of the [assumeutxo project](https://github.com/bitcoin/bitcoin/projects/11) (parent PR: #15606)
---
Currently, `BlockManager::LoadBlockIndex` adds all blocks that have downloaded transactions to the active chain state's `setBlockIndexCandidates` set, ignoring the background chain state.
This PR changes ChainstateManager::LoadBlockIndex to update `setBlockIndexCandidates` in the background chain, not just the active chain. In the active chain, the same blocks are added as before. In the background chain, only blocks that have actually been validated, not blocks marked assumed-valid are added so the background chain will continue to download and validate assumed-valid blocks.
ACKs for top commit:
MarcoFalke:
Concept ACK 2283b9cd1e🤽
Sjors:
utACK 2283b9cd1e
Tree-SHA512: 7c9a80802df4722d85d12b78d2e7f628ac5f11cb8be66913d5c3230339bd1220c6723805509d4460826a17d1dc04b0ae172eb7d09ac0ea5dc5e41d77975cbd5e
Ensure that blocks past the snapshot base block (i.e. the end of the
assumed-valid region of the chain) are not included in
setBlockIndexCandidates for the background validation chainstate. These
blocks, while fully validated and lacking the BLOCK_ASSUMED_VALID flag,
*rely* on blocks which are assumed-valid, and so shouldn't be added to
the IBD chainstate.
Co-authored-by: Russ Yanofsky <russ@yanofsky.org>
Avoid modifying the genesis block index entry during snapshot load. This
is because, in a future change that fixes LoadBlockIndex for UTXO
snapshots, we detect block index entries that are reliant on
assumed-valid ancestors and treat them specially.
Since the genesis block doesn't have BLOCK_VALID_SCRIPTS, it would be
erroneously marked BLOCK_ASSUMED_VALID during snapshot load if we didn't
skip it here. This would cause a "setBlockIndexCandidates() empty"
assertion to be tripped since all block index entries would be marked
assume-valid due to genesis, which is never re-validated.
There's probably no good reason to modify the genesis block index entry
during snapshot load anyway...
It would make for sense for the TODO to be done in PR 17487
(or noted in the review feedback for a follow-up),
no need to continue maintaining the TODO in the codebase.
Use the `LOG_TIME_MILLIS_WITH_CATEGORY_MSG_ONCE` macro to improve the
logging of snapshot persistance and no longer manually track the duration.
before
[snapshot] flushing coins cache (0 MB)... done (0.00ms)
[snapshot] flushing snapshot chainstate to disk (0 MB)... done (0.00ms)
after
FlushSnapshotToDisk: flushing coins cache (0 MB) started
FlushSnapshotToDisk: completed (0.00ms)
FlushSnapshotToDisk: saving snapshot chainstate (0 MB) started
FlushSnapshotToDisk: completed (0.00ms)
The logging can be observed in the output of
./src/test/test_bitcoin -t validation_chainstate_tests -- DEBUG_LOG_OUT
7f15eff2dd style-only: Remove redundant scope in *Chainstate (Carl Dong)
89bec827fd Collapse the 2 cs_main locks in LoadChainstate (Carl Dong)
3b1584b794 Remove all #include // for * comments (Carl Dong)
9a5a5a3d08 test/setup: Use LoadChainstate (Carl Dong)
c541da0d62 node/chainstate: Add options for in-memory DBs (Carl Dong)
ceb9790341 node/caches: Remove intermediate variables (Carl Dong)
ac4bf138b8 node/caches: Extract cache calculation logic (Carl Dong)
15f2e33bb3 validation: VerifyDB only needs Consensus::Params (Carl Dong)
4da9c076d1 node/chainstate: Decouple from ShutdownRequested (Carl Dong)
05441c2dc5 node/chainstate: Decouple from GetTime (Carl Dong)
2414ebc18b init: Delay RPC block notif until warmup finished (Carl Dong)
8d466a8504 Move -checkblocks LogPrintf to AppInitMain (Carl Dong)
aad8d59789 node/chainstate: Reduce coupling of LogPrintf (Carl Dong)
b345979a2b node/chainstate: Decouple from concept of uiInterface (Carl Dong)
ca7c0b934d Split off VerifyLoadedChainstate (Carl Dong)
adf4912d77 node/chainstate: Remove do/while loop (Carl Dong)
975235ca0a Move init logistics message for BAD_GENESIS_BLOCK to init.cpp (Carl Dong)
8715658983 Move mempool nullptr Assert out of LoadChainstate (Carl Dong)
9162a4f93e node/chainstate: Decouple from concept of NodeContext (Carl Dong)
c7a5c46e6f node/chainstate: Decouple from ArgsManager (Carl Dong)
ae9121f958 node/chainstate: Decouple from stringy errors (Carl Dong)
cbac28b72f node/chainstate: Decouple from GetTimeMillis (Carl Dong)
cb64af9635 node: Extract chainstate loading sequence (Carl Dong)
Pull request description:
This PR:
1. Coalesce the Chainstate loading sequence between `AppInitMain` and `*TestingSetup` (which makes it more tested)
2. Makes the Chainstate loading sequence reusable in preparation for future work extracting out our consensus engine.
Code-wise, this PR:
1. Extracts `AppInitMain`'s Chainstate loading sequence into a `::LoadChainstateSequence` function
2. Makes this `::LoadChainstateSequence` function reusable by
1. Decoupling it from various concepts (`ArgsManager`, `uiInterface`, etc)
2. Making it report errors using an `enum` rather than by setting a `bilingual_str`
3. Makes `*TestingSetup` use this new `::LoadChainstateSequence`
Reviewers: Aside from commentary, I've also included `git diff` flags of interest in the commit messages which I hope will aid review!
ACKs for top commit:
ryanofsky:
Code review ACK 7f15eff2dd. Thanks for updates!
MarcoFalke:
review ACK 7f15eff2dd💳
Tree-SHA512: fb9a6cbd1c511a52b477c62a5e68e53a8be5dec2fff0e44a279966afb91efbab44bf1fe7c6b1519f8464ecc25f42dd4bae8e1efbf55ee91fc90fa0b92e3a83e2
Often when we're updating mempool entries we update entries that we
ultimately end up removing the updated entries shortly thereafter. This
patch makes it so that we filter for such entries a bit earlier in
processing, which yields a mild improvement for these cases, and is
negligible overhead otherwise.
dce8c4c381 rpc: getblockfrompeer (Sjors Provoost)
b884ababc2 rpc: move Ensure* helpers to server_util.h (Sjors Provoost)
Pull request description:
This adds an RPC method to fetch a block directly from a peer. This can used to fetch stale blocks with lower proof of work that are normally ignored by the node (`headers-only` in `getchaintips`).
Usage:
```
bitcoin-cli getblockfrompeer HASH peer_n
```
Closes#20155
Limitations:
* you have to specify which peer to fetch the block from
* the node must already have the header
ACKs for top commit:
jnewbery:
ACK dce8c4c381
fjahr:
re-ACK dce8c4c381
Tree-SHA512: 843ba2b7a308f640770d624d0aa3265fdc5c6ea48e8db32269b96a082b7420f7953d1d8d1ef2e6529392c7172dded9d15639fbc9c24e7bfa5cfb79e13a5498c8
f1f10c0514 Remove CTxMemPool params from ATMP (lsilva01)
Pull request description:
Remove `CTxMemPool` parameter from `AcceptToMemoryPool` function, as suggested in https://github.com/bitcoin/bitcoin/pull/23437#issuecomment-962536149 .
This requires that `CChainState` has access to `MockedTxPool` in `tx_pool.cpp` as mentioned https://github.com/bitcoin/bitcoin/pull/23173#discussion_r731895386. So the `MockedTxPool` is attributed to `CChainState::m_mempool` before calling `AcceptToMemoryPool`.
Requires #23437.
ACKs for top commit:
jnewbery:
utACK f1f10c0514
MarcoFalke:
review ACK f1f10c0514🔙
Tree-SHA512: 2a4885f4645014fc1fa98bb1090f13721c1a0796bc0021b9cb43bc8cc13920b6eaf057d1f5ed796e0a110e7813e41fe0196334ce7c80d1231fc057a9a3bdf349
During a reorg, we re-check timelocks on all mempool entries using
CheckSequenceLocks(useExistingLockPoints=false) and remove any
now-invalid entries. CheckSequenceLocks() also mutates the LockPoints
passed in, and we update valid entries' LockPoints using
update_lock_points. Thus, update_lock_points(lp) needs to be called
right after CheckSequenceLocks(lp), otherwise we lose the data in lp.
commit bedf246 introduced a bug by separating those two loops.
fa3942fc4c Remove GetSpendHeight (MarcoFalke)
Pull request description:
It is unclear what the goal of the helper is, as the caller already
knows the spend height before calling the helper.
Also, in case the coins view is corrupted, LookupBlockIndex will return
nullptr. Dereferencing a nullptr is UB.
Fix both issues by removing it. Also, add a sanity check, which aborts
if the coins view is corrupted.
ACKs for top commit:
laanwj:
Code review ACK fa3942fc4c
ryanofsky:
Code review ACK fa3942fc4c. I'm not aware of cases where coins GetBestBlock could be different from active chain tip, and asset seems sufficient to guarantee PR doesn't change behavior if that doesn't happen.
Tree-SHA512: 29f65d72e116ec5a4509e0947ceeaa5bb6b7dfd5d174d3c7945cb15fa266d590c4f8b48e6385de74ef7d7c84ebd2255de902ad9c87c24955348a91b12e5bffd5
It is unclear what the goal of the helper is, as the caller already
knows the spend height before calling the helper.
Also, in case the coins view is corrupted, LookupBlockIndex will return
nullptr. Dereferencing a nullptr is UB.
Fix both issues by removing it. Also, add a sanity check, which aborts
if the coins view is corrupted.
a64078e385 Break validation <-> txmempool circular dependency (glozow)
64e4963c63 [mempool] always assert coin spent (glozow)
bb9078ed51 [refactor] put finality and maturity checking into a lambda (glozow)
bedf246f1e [mempool] only update lockpoints for non-removed entries (glozow)
1b3a11e126 MOVEONLY: TestLockPointValidity to txmempool (glozow)
Pull request description:
Remove 2 circular dependencies: validation - txmempool and validation - policy/rbf - txmempool
Validation should depend on txmempool (e.g. `CChainstateManager` has a mempool and we often need to know what's in our mempool to validate transactions), but txmempool is a data structure that shouldn't really need to know about chain state.
- Changes `removeForReorg()` to be parameterized by a callable that returns true/false (i.e. whether the transaction should be removed due to being now immature or nonfinal) instead of a `CChainState`. The mempool really shouldn't need to know about coinbase maturity or lockpoints, it just needs to know which entries to remove.
ACKs for top commit:
laanwj:
Code review ACK a64078e385
mjdietzx:
reACK a64078e385
theStack:
re-ACK a64078e385
Tree-SHA512: f75995200569c09dfb8ddc09729da66ddb32167ff1e8a7e72f105ec062d2d6a9a390e6b4a2a115e7ad8ad3525f891ee1503f3cd2bed11773abcaf7c3230b1136
faad05c6d2 Crash debug builds when mempool ConsensusScriptChecks fails (MarcoFalke)
Pull request description:
Currently a bug in the function might sneak around our testing infrastructure.
Fix that by turning bugs into crashes during tests.
ACKs for top commit:
glozow:
utACK faad05c6d2, there's something seriously wrong with the code if this returns false, good to throw in debug mode
Tree-SHA512: dfea1cd9ce3f1c303f49cca1417cd5c77c6ed12849aaff7b6ab1b6060f2f0c9cf5d4689017355d11f66639bab35823f65f848e6979042fa875181509dfd5d3d7
No behavior change.
Parameterize removeForReorg using a CChain and callable that
encapsulates validation logic. The mempool shouldn't need to know a
bunch of details about coinbase maturity and lock finality. Instead,
just pass in a callable function that says true/false. Breaks circular
dependency by removing txmempool's dependency on validation.
As node operators are free to set their mempool policies however they
please, it's possible for package transaction(s) to already be in the
mempool. We definitely don't want to reject the entire package in that
case (as that could be a censorship vector).
We should still return the successful result to the caller, so add
another result type to MempoolAcceptResult.
0fdb619aaf [validation] Always call mempool.check() after processing a new transaction (John Newbery)
2c64270bbe [refactor] Don't call AcceptToMemoryPool() from outside validation.cpp (John Newbery)
92a3aeecf6 [validation] Add CChainState::ProcessTransaction() (John Newbery)
36167faea9 [logging/documentation] Remove reference to AcceptToMemoryPool from error string (John Newbery)
4c24142b1e [validation] Remove comment about AcceptToMemoryPool() (John Newbery)
5759fd12b8 [test] Don't set bypass_limits to true in txvalidation_tests.cpp (John Newbery)
497c9e2964 [test] Don't set bypass_limits to true in txvalidationcache_tests.cpp (John Newbery)
Pull request description:
Similarly to how #18698 added `ProcessNewBlock()` and `ProcessNewBlockHeaders()` methods to the `ChainstateManager` class, this PR adds a new `ProcessTransaction()` method. Code outside validation no longer calls `AcceptToMemoryPool()` directly, but calls through the higher-level `ProcessTransaction()` method. Advantages:
- The interface is simplified. Calling code no longer needs to know about the active chainstate or mempool object, since `AcceptToMemoryPool()` can only ever be called for the active chainstate, and that chainstate knows which mempool it's using. We can also remove the `bypass_limits` argument, since that can only be used internally in validation.
- responsibility for calling `CTxMemPool::check()` is removed from the callers, and run automatically by `ChainstateManager` every time `ProcessTransaction()` is called.
ACKs for top commit:
lsilva01:
tACK 0fdb619 on Ubuntu 20.04
theStack:
Code-review ACK 0fdb619aaf
ryanofsky:
Code review ACK 0fdb619aaf. Only changes since last review: splitting & joining commits, adding more explanations to commit messages, tweaking MEMPOOL_ERROR string, fixing up argument name comments.
Tree-SHA512: 0b395c2e3ef242f0d41d47174b1646b0a73aeece38f1fe29349837e6fb832f4bf8d57e1a1eaed82a97c635cfd59015a7e07f824e0d7c00b2bee4144e80608172
No change in behavior, because package transactions would not be going
through the rbf logic in PreChecks anyway (BIP125 is currently disabled
for package acceptance, see ATMPArgs).
We draw the line here because each individual transaction in package
validation still goes through all PreChecks. For example, checking that
one's own conflicts and dependencies are disjoint (a consensus check)
and individual transaction mempool ancestor/descendant limits.
The aliases are leftover from a previous MOVEONLY refactor - they are
unnecessary and removing them reduces the diff for splitting out mempool
Checks from PreChecks, making RBF variables MemPoolAccept-wide, etc.
-BEGIN VERIFY SCRIPT-
unalias() { sed -i "s:\<$1\>:$2:g" src/validation.cpp; sed -i "/$2 = $2/d" src/validation.cpp; }
unalias nModifiedFees ws.m_modified_fees
unalias nConflictingFees ws.m_conflicting_fees
unalias nConflictingSize ws.m_conflicting_size
unalias setConflicts ws.m_conflicts
unalias allConflicting ws.m_all_conflicting
unalias setAncestors ws.m_ancestors
-END VERIFY SCRIPT-
CTxMemPool::check() will carry out internal consistency checks 1/n times,
where n is set by the `-checkmempool` configuration option. By default,
mempool consistency checks are disabled entirely on mainnet.
Therefore, this change has no effect on mainnet nodes running with
default configuration. It simply removes the responsibility to trigger
mempool consistency checks from net_processing.
This just calls through to AcceptToMemoryPool() internally, and is currently unused.
Also add a new transaction validation failure reason TX_NO_MEMPOOL to
indicate that there is no mempool.
"This logic is not necessary for memory pool transactions, as
AcceptToMemoryPool already refuses previously-known transaction ids
entirely." refers to the logic at
a206b0ea12/src/main.cpp (L484-L486),
which was later removed in commit 450cbb0944.
This is not only cleaner but also helps make sure we are always using
the virtual size measure that includes the sigop weight heuristic (which
is the vsize the mempool would return).
This bool was originally part of Workspace and was removed in #22539
when it was no longer needed in Finalize(). Re-introducing it because,
once again, multiple functions will need to know whether we're doing an
RBF. Member of MemPoolAccept so that we can use this to inform package
RBF in the future.
No change in behavior.
ATMPArgs can continue to have granular rules like switching BIP125
on/off while we create an interface for the different sets of rules for
single transactions vs multiple-testmempoolaccept vs package validation.
This is a cleaner interface than manually constructing the args, which
makes it easy to mix up ordering, use the wrong default, etc. It also
means we don't need to edit ATMP/single transaction validation code
every time we update ATMPArgs for package validation.
082c5bf099 [refactor] pass coinsview and height to check() (glozow)
ed6115f1ea [mempool] simplify some check() logic (glozow)
9e8d7ad5d9 [validation/mempool] use Spend/AddCoin instead of UpdateCoins (glozow)
09d18916af MOVEONLY: remove single-use helper func CheckInputsAndUpdateCoins (glozow)
e8639ec26a [mempool] remove now-unnecessary code (glozow)
54c6f3c1da [mempool] speed up check() by using coins cache and iterating in topo order (glozow)
30e240f65e [bench] Benchmark CTxMemPool::check() (glozow)
cb1407196f [refactor/bench] make mempool_stress bench reusable and parameterizable (glozow)
Pull request description:
Remove the txmempool <-> validation circular dependency by removing txmempool's dependency on validation. There are two functions in txmempool that need validation right now: `check()` and `removeForReorg()`. This PR removes the dependencies in `check()`.
This PR also improves the performance of `CTxMemPool::check()` by walking through the entries exactly once, in ascending ancestorcount order, which guarantees that we see parents before children.
ACKs for top commit:
jnewbery:
reACK 082c5bf099
GeneFerneau:
tACK [082c5bf](082c5bf099)
rajarshimaitra:
tACK 082c5bf099
theStack:
Code-review ACK 082c5bf099
Tree-SHA512: 40ac622af1627b5c3e6abb4f0f035d833265a8c5e8dc88faf5354875dfb5137f137825e54bbd2a2668ed37b145c5d02285f776402629f58596e51853a9a79d29
a0efe529e4 Fix outdated comments referring to ::ChainActive() (Samuel Dobson)
Pull request description:
After #21866 there are a few outdated comments referring to `::ChainActive()`, which should instead refer to `ChainstateManager::ActiveChain()`.
ACKs for top commit:
jamesob:
ACK a0efe529e4
Tree-SHA512: 80da19c105ed29ac247e6df4c8e916c3bf3f37230b63f07302114eef9c115add673e9649f0bbe237295be0c6da7b1030b5b93e14daf6768f17ce5de7cf2c9ff2
The tracepoint `validation:block_connected` was introduced in #22006.
The first argument was the hash of the connected block as a pointer
to a C-like String. The last argument passed the hash of the
connected block as a pointer to 32 bytes. The hash was only passed as
string to allow `bpftrace` scripts to print the hash. It was
(incorrectly) assumed that `bpftrace` cannot hex-format and print the
block hash given only the hash as bytes.
The block hash can be printed in `bpftrace` by calling
`printf("%02x")` for each byte of the hash in an `unroll () {...}`.
By starting from the last byte of the hash, it can be printed in
big-endian (the block-explorer format).
```C
$p = $hash + 31;
unroll(32) {
$b = *(uint8*)$p;
printf("%02x", $b);
$p -= 1;
}
```
See also: https://github.com/bitcoin/bitcoin/pull/22902#discussion_r705176691
This is a breaking change to the block_connected tracepoint API, however
this tracepoint has not yet been included in a release.