9ad2fe7e69 clusterlin: only start/use search when enough iterations left (Pieter Wuille)
bd044356ed clusterlin: improve heuristic to decide split transaction (optimization) (Pieter Wuille)
71f2629398 clusterlin: include topological pot subsets automatically (optimization) (Pieter Wuille)
e20fda77a2 clusterlin: reduce computation of unnecessary pot sets (optimization) (Pieter Wuille)
6060a948ca clusterlin bench: add example hard cluster benchmarks (Pieter Wuille)
2965fbf203 clusterlin: track upper bound potential set for work items (optimization) (Pieter Wuille)
9e43e4ce10 clusterlin: use feerate-sorted depgraph in SearchCandidateFinder (Pieter Wuille)
b80e6dfe78 clusterlin: add reordering support for DepGraph (Pieter Wuille)
85a285a306 clusterlin: separate initial search entries per component (optimization) (Pieter Wuille)
e4faea9ca7 clusterlin bench: have low/high iter benchmarks instead of per-iter (Pieter Wuille)
Pull request description:
Part of cluster mempool: #30289
Depends on #30126, and was split off from it.
This improves the candidate search algorithm introduced in the previous PR with a variety of optimizations.
The resulting search algorithm largely follows Section 2 of [How to linearize your cluster](https://delvingbitcoin.org/t/how-to-linearize-your-cluster/303#h-2-finding-high-feerate-subsets-5), though with a few changes:
* Connected component analysis is performed inside the search algorithm (creating initial work items per component for each candidate), rather than once at a higher level. This duplicates some work but is significantly simpler to implement.
* No ancestor-set based presplitting inside the search is performed; instead, the `best` value is initialized with the best topologically valid set known to the LIMO algorithm before search starts: the better one out of the highest-feerate remaining ancestor set, and the highest-feerate prefix of remaining transactions in `old_linearization`.
* Work items are represented using an included set *inc* and an undefined set *und*, rather than included and excluded.
* Potential sets *pot* are not computed for work items with empty *inc*.
At a high level, the only optimization from that post that is missing here is bottleneck analysis. My thinking is that it only really helps with clusters that are already relatively cheap to linearize, and it would need to be done at a higher level, not inside the search algorithm.
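To illustrate the *inc*/*und* work item representation from the list above, here is a minimal Python sketch of a search for the highest-feerate topologically valid subset. It is not the code in this PR: the function names, the bitmask encoding, and the choice of the lowest-index undecided transaction as the split transaction are assumptions made for illustration. It also leaves out the potential-set bound, the per-component initial work items, and the initialization of `best` from LIMO, so it visits every reachable work item.

```python
from collections import deque

def ancestors_of(parents):
    """Per-transaction ancestor bitmasks; each mask includes the transaction itself."""
    n = len(parents)
    anc = [1 << i for i in range(n)]
    changed = True
    while changed:  # propagate until a fixed point is reached
        changed = False
        for i in range(n):
            for p in parents[i]:
                merged = anc[i] | anc[p]
                if merged != anc[i]:
                    anc[i] = merged
                    changed = True
    return anc

def descendants_of(anc):
    """Per-transaction descendant bitmasks, derived from the ancestor masks."""
    n = len(anc)
    desc = [0] * n
    for i in range(n):
        for j in range(n):
            if (anc[i] >> j) & 1:
                desc[j] |= 1 << i
    return desc

def set_feerate(mask, fees, sizes):
    """Total (fee, size) of the transactions selected by mask."""
    fee = sum(f for i, f in enumerate(fees) if (mask >> i) & 1)
    size = sum(s for i, s in enumerate(sizes) if (mask >> i) & 1)
    return fee, size

def better(a, b):
    """True if feerate a is strictly higher than b; a size of 0 means 'no set yet'."""
    if b[1] == 0:
        return a[1] > 0
    return a[0] * b[1] > b[0] * a[1]  # cross-multiply to avoid division

def find_best_subset(fees, sizes, parents):
    n = len(fees)
    anc = ancestors_of(parents)
    desc = descendants_of(anc)
    best, best_rate = 0, (0, 0)
    # A work item is (inc, und): included and still-undecided transactions.
    queue = deque([(0, (1 << n) - 1)])
    while queue:
        inc, und = queue.popleft()
        if inc:
            rate = set_feerate(inc, fees, sizes)
            if better(rate, best_rate):
                best, best_rate = inc, rate
        if und == 0:
            continue
        t = (und & -und).bit_length() - 1  # hypothetical split choice: lowest index
        # Including t pulls in all of its ancestors.
        queue.append((inc | anc[t], und & ~anc[t]))
        # Excluding t rules out all of its descendants.
        queue.append((inc, und & ~desc[t]))
    return best, best_rate
```

For a cluster with fees `[1, 10, 3]`, unit sizes, and transaction 1 spending from transaction 0, `find_best_subset([1, 10, 3], [1, 1, 1], [[], [0], []])` returns the set {0, 1} as bitmask `3`, with total fee 11 and size 2. Because *inc* only ever grows by whole ancestor sets, and excluding a transaction removes its descendants from *und*, every *inc* stays topologically valid.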
---
Overview of the impact of each commit here on linearize performance:
* **[clusterlin bench: have low/high iter benchmarks instead of per-iter](21a184db63)**: no impact
* **[separate initial search entries per component (optimization)](c84c5c86ba)**: reduce iterations, increase start-up cost
* **[add reordering support for DepGraph](019ff29609)**: no impact
* **[use feerate-sorted depgraph in SearchCandidateFinder](8e27dd5a22)**: typically reduce iterations, increase start-up cost
* **[track upper bound potential set for work items](781e0fb3aa)**: reduce iterations, increase cost per iteration
* **[reduce computation of unnecessary pot sets](9fe834fa97)**: reduce cost per iteration
* **[include topological pot subsets automatically](30612710a4)**: reduce iterations, increase cost per iteration
* **[improve heuristic to decide split transaction](1880c00ab1)**: typically reduce iterations, increase cost per iteration
* **[only start/use search when enough iterations left](12760a57b3)**: just account for start-up cost as equivalent iterations
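As a rough illustration of what reordering a dependency graph by feerate involves (the two DepGraph commits above), the following Python sketch sorts transactions by decreasing feerate and remaps the parent indices to the new positions. The function name and the list-based representation are hypothetical and do not reflect the actual `DepGraph` interface.

```python
from fractions import Fraction

def sort_by_feerate(fees, sizes, parents):
    """Reorder transactions by decreasing feerate.

    Returns (order, new_fees, new_sizes, new_parents), where order[new] is the
    original index of the transaction now at position new.
    """
    n = len(fees)
    # Exact rational feerates avoid floating-point ties; sorted() is stable,
    # so equal feerates keep their original relative order.
    order = sorted(range(n), key=lambda i: -Fraction(fees[i], sizes[i]))
    new_index = {old: new for new, old in enumerate(order)}
    new_fees = [fees[i] for i in order]
    new_sizes = [sizes[i] for i in order]
    # Dependencies must be expressed in terms of the new positions.
    new_parents = [sorted(new_index[p] for p in parents[i]) for i in order]
    return order, new_fees, new_sizes, new_parents
```

After such a reordering, position order generally no longer matches topological order (a child can come before its parent), which is why the dependencies are remapped rather than assumed from position.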
ACKs for top commit:
sdaftuar:
ACK 9ad2fe7e69
instagibbs:
reACK 9ad2fe7e69
glozow:
reACK 9ad2fe7e69, just have a question about the docs
Tree-SHA512: 108bcbb0676f36071eb83954059b5f3d6646c745015b644a2a5d7f5a8ac9424c2d01d339fa6318a3aff4cf313308e85bb80b0090899720a3fcba027b8025590a