This adds the (internal) Wrapper class, and the Using function that uses it. Given
a class F that implements Ser(stream, const object&) and Unser(stream, object&)
functions, this permits writing e.g. READWRITE(Using<F>(object)).
This new approach uses a static method which takes the object as
a argument. This has the advantage that its constness can be a
template parameter, allowing a single implementation that sees the
object as const for serialization and non-const for deserialization,
without casts.
More boilerplate is included in the new macro as well.
86b47fa741 speed up Unserialize_impl for prevector (Akio Nakamura)
Pull request description:
The unserializer for prevector uses `resize()` for reserve the area, but it's prefer to use `reserve()` because `resize()` have overhead to call its constructor many times.
However, `reserve()` does not change the value of `_size` (a private member of prevector).
This PR make the logic of read from stream to callback function, and prevector handles initilizing new values with that call-back and ajust the value of `_size`.
The changes are as follows:
1. prevector.h
Add a public member function named 'append'.
This function has 2 params, number of elemenst to append and call-back function that initilizing new appended values.
2. serialize.h
In the following two function:
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)`
- `Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)`
Make a callback function from each original logic of reading values from stream, and call prevector's `append()`.
3. test/prevector_tests.cpp
Add a test for `append()`.
## A benchmark result is following:
[Machine]
MacBook Pro (macOS 10.13.3/i7 2.2GHz/mem 16GB/SSD)
[result]
DeserializeAndCheckBlockTest => 22% faster
DeserializeBlockTest => 29% faster
[before PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 94.4901, 0.0094644, 0.0104715, 0.0098339
DeserializeBlockTest, 60, 130, 65.0964, 0.00800362, 0.00895134, 0.00824187
[After PR]
# Benchmark, evals, iterations, total, min, max, median
DeserializeAndCheckBlockTest, 60, 160, 77.1597, 0.00767013, 0.00858959, 0.00805757
DeserializeBlockTest, 60, 130, 49.9443, 0.00613926, 0.00691187, 0.00635527
ACKs for top commit:
laanwj:
utACK 86b47fa741
Tree-SHA512: 62ea121ccd45a306fefc67485a1b03a853435af762607dae2426a87b15a3033d802c8556e1923727ddd1023a1837d0e5f6720c2c77b38196907e750e15fbb902
The unserializer for prevector uses resize() for reserve the area,
but it's prefer to use reserve() because resize() have overhead
to call its constructor many times.
However, reserve() does not change the value of "_size"
(a private member of prevector).
This PR introduce resize_uninitialized() to prevector that similar to
resize() but does not call constructor, and added elements are
explicitly initialized in Unserialize_imple().
The changes are as follows:
1. prevector.h
Add a public member function named 'resize_uninitialized'.
This function processes like as resize() but does not call constructors.
So added elemensts needs explicitly initialized after this returns.
2. serialize.h
In the following two function:
Unserialize_impl(Stream& is, prevector<N, T>& v, const unsigned char&)
Unserialize_impl(Stream& is, prevector<N, T>& v, const V&)
Calls resize_uninitialized() instead of resize()
3. test/prevector_tests.cpp
Add a test for resize_uninitialized().
ece88fd Introduce BigEndian wrapper and use it for netaddress ports (Pieter Wuille)
Pull request description:
This is another small improvement taken from #10785.
Instead of manually converting from/to BE format in the `CService` serializer, provide a generic way in serialize.h to serialize BE data (only 16 bits for now).
Tree-SHA512: bd67cf7eed465dad08551fb62f659e755e0691e4597a9f59d285d2b79975b50e5710d35a34a185b5ad232e1deda9a4946615f9132b1ed7d96ed8087f73ace66b
818dc74 Support serialization as another type without casting (Pieter Wuille)
Pull request description:
This adds a `READWRITEAS(type, obj)` macro which serializes `obj` as if it were converted to `const type&` when `const`, and to `type&` when non-`const`. No actual cast is involved, so this only works when this conversion can be done automatically.
This makes it usable in serialization code that uses a single implementation for both serialization and deserializing, which doesn't know the constness of the object involved.
This is a redo of #12712, using a slightly different interface.
Tree-SHA512: 262f0257284ff99b5ffaec9b997c194e221522ba35c3ac8eaa9bb344449d7ea0a314de254dc77449fa7aaa600f8cd9a24da65aade8c1ec6aa80c6e9a7bba5ca7
Support is added to serialize arrays of type char or unsigned char directly,
without any wrappers. All invocations of the FLATDATA wrappers that are
obsoleted by this are removed.
This includes a patch by Russell Yanofsky to make char casting type safe.
The serialization of CSubNet is changed to serialize a bool directly rather
than though FLATDATA. This makes the serialization independent of the size
of the bool type (and will use 1 byte everywhere).
This adds a READWRITEAS(type, obj) macro which serializes obj as if it
were casted to (const type&) when const, and to (type&) when non-const.
This makes it usable in serialization code that uses a single
implementation for both serialization and deserializing, which doesn't
know the constness of the object involved.
Using VARINT with signed types is dangerous because negative values will appear
to serialize correctly, but then deserialize as positive values mod 128.
This commit changes the VARINT macro to trigger an error by default if called
with an signed value, and updates broken uses of VARINT to pass a special flag
that lets them keep working with no change in behavior.
Currently, the READWRITE macro cannot be passed any non-const temporaries, as
the SerReadWrite function only accepts lvalue references.
Deserializing into a temporary is very common, however. See for example
things like 's >> VARINT(n)'. The VARINT macro produces a temporary wrapper
that holds a reference to n.
Fix this by accepting non-const rvalue references instead of lvalue references.
We don't propagate the rvalue-ness down, as there are no useful optimizations
that only apply to temporaries.
Then use this new functionality to get rid of many (but not all) uses of the
'REF' macro (which casts away constness).
680bc2cbb Use range-based for loops (C++11) when looping over map elements (practicalswift)
Pull request description:
Before this commit:
```c++
for (std::map<T1, T2>::iterator x = y.begin(); x != y.end(); ++x) {
T1 z = (*x).first;
…
}
```
After this commit:
```c++
for (auto& x : y) {
T1 z = x.first;
…
}
```
Tree-SHA512: 954b136b7f5e6df09f39248a6b530fd9baa9ab59d7c2c7eb369fd4afbb591b7a52c92ee25f87f1745f47b41d6828b7abfd395b43daf84a55b4e6a3d45015e3a0
To get the advantages of faster GetSerializeSize() implementations
back that were removed in "Make GetSerializeSize a wrapper on top of
CSizeComputer", reintroduce them in the few places in the form of a
specialized Serialize() implementation. This actually gets us in a
better state than before, as these even get used when they're invoked
indirectly in the serialization of another object.
The CSerAction's ForRead() method does not depend on any runtime
data, so guarantee that requests to it can be optimized out by
making it constexpr.
Suggested by Cory Fields.
Remove the nType and nVersion as parameters to all serialization methods
and functions. There is only one place where it's read and has an impact
(in CAddress), and even there it does not impact any of the recursively
invoked serializers.
Instead, the few places that need nType or nVersion are changed to read
it directly from the stream object, through GetType() and GetVersion()
methods which are added to all stream classes.
Given that in default GetSerializeSize implementations created by
ADD_SERIALIZE_METHODS we're already using CSizeComputer(), get rid
of the specialized GetSerializeSize methods everywhere, and just use
CSizeComputer. This removes a lot of code which isn't actually used
anywhere.
For CCompactSize and CVarInt this actually removes a more efficient
size computing algorithm, which is brought back in a later commit.
The stream implementations had two cascading layers (the upper one
with operator<< and operator>>, and a lower one with read and write).
The lower layer's functions are never cascaded (nor should they, as
they should only be used from the higher layer), so make them return
void instead.
Implement `begin_ptr` and `end_ptr` in terms of C++11 code,
and add a comment that they are deprecated.
Follow-up to developer notes update in 654a211622.
There's only one case where a vector containing a fundamental type is
serialized all-at-once, unsigned char. Anything else would lead to
strange results.
Use a dummy argument to overload in that case.
- it now takes over the passed file descriptor and closes it in the
destructor
- this fixes a leak in LoadExternalBlockFile(), where an exception could
cause the file to not getting closed
- disallow copies (like recently added for CAutoFile)
- make nType and nVersion private
One might assume that CAutoFile would be ref-counted so that a copied object
would delay closing the underlying file until all copies have gone out of
scope. Since that's not the case with CAutoFile, explicitly disable copying.
Thanks to Pieter Wuille for most of the work on this commit.
I did not fixup the overhaul commit, because a rebase conflicted
with "remove fields of ser_streamplaceholder".
I prefer not to risk making a mistake while resolving it.
The nType and nVersion fields of stream objects are never accessed
from outside the class (or perhaps from the inside too, I haven't checked).
Thus no need to have them in a placeholder, whose only purpose is to
fill the "Stream" template parameter in serialization implementation.
The implementation of each class' serialization/deserialization is no longer
passed within a macro. The implementation now lies within a template of form:
template <typename T, typename Stream, typename Operation>
inline static size_t SerializationOp(T thisPtr, Stream& s, Operation ser_action, int nType, int nVersion) {
size_t nSerSize = 0;
/* CODE */
return nSerSize;
}
In cases when codepath should depend on whether or not we are just deserializing
(old fGetSize, fWrite, fRead flags) an additional clause can be used:
bool fRead = boost::is_same<Operation, CSerActionUnserialize>();
The IMPLEMENT_SERIALIZE macro will now be a freestanding clause added within
class' body (similiar to Qt's Q_OBJECT) to implement GetSerializeSize,
Serialize and Unserialize. These are now wrappers around
the "SerializationOp" template.
- ensures a consistent usage in header files
- also add a blank line after the copyright header where missing
- also remove orphan new-lines at the end of some files
Thus the read(...) and write(...) methods of all stream classes now have identical parameter lists.
This will bring these classes one step closer to a common interface.
Remove the 'state' and 'exceptmask' from serialize.h's stream implementations,
as well as related methods.
As exceptmask always included 'failbit', and setstate was always called with
bits = failbit, all it did was immediately raise an exception. Get rid of
those variables, and replace the setstate with direct exception throwing
(which also removes some dead code).
As a result, good() is never reached after a failure (there are only 2
calls, one of which is in tests), and can just be replaced by !eof().
fail(), clear(n) and exceptions() are just never called. Delete them.
`&vch[vch.size()]` and even `&vch[0]` on vectors can cause assertion
errors with VC in debug mode. This is the problem mentioned in #4239.
The deeper problem with this is that we rely on undefined behavior.
- Add `begin_ptr` and `end_ptr` functions that get the beginning and end
pointer of vector in a reliable way that copes with empty vectors and
doesn't reference outside the vector
(see https://stackoverflow.com/questions/1339470/how-to-get-the-address-of-the-stdvector-buffer-start-most-elegantly/1339767#1339767).
- Add a convenience constructor to CFlatData that wraps a vector.
I added `begin_ptr` and `end_ptr` as separate functions as I imagine
they will be useful in more places.
Use misc methods of avoiding unnecesary header includes.
Replace int typedefs with int##_t from stdint.h.
Replace PRI64[xdu] with PRI[xdu]64 from inttypes.h.
Normalize QT_VERSION ifs where possible.
Resolve some indirect dependencies as direct ones.
Remove extern declarations from .cpp files.
Changed CDataStream::GetAndClear() to use the most obvious
get get and clear instead of a tricky swap().
Added a unit test for CDataStream insert/erase/GetAndClear.
Note: GetAndClear() is not performance critical, it is used only
by the send-a-message-to-the-network code. Bug was not noticed
before now because the send-a-message code never erased from the
stream.
This seems to cause problems on recent clang, and looks totally
redundant and unused.
The const_iterator version is identical to the vector::const_iterator
one (which is a typedef thereof). Marking it private (instead of
removing) compiles fine, so this version is effectively unused even.
The length of vectors, maps, sets, etc are serialized using
Write/ReadCompactSize -- which, unfortunately, do not use a
unique encoding.
So deserializing and then re-serializing a transaction (for example)
can give you different bits than you started with. That doesn't
cause any problems that we are aware of, but it is exactly the type
of subtle mismatch that can lead to exploits.
With this pull, reading a non-canonical CompactSize throws an
exception, which means nodes will ignore 'tx' or 'block' or
other messages that are not properly encoded.
Please check my logic... but this change is safe with respect to
causing a network split. Old clients that receive
non-canonically-encoded transactions or blocks deserialize
them into CTransaction/CBlock structures in memory, and then
re-serialize them before relaying them to peers.
And please check my logic with respect to causing a blockchain
split: there are no CompactSize fields in the block header, so
the block hash is always canonical. The merkle root in the block
header is computed on a vector<CTransaction>, so
any non-canonical encoding of the transactions in 'tx' or 'block'
messages is erased as they are read into memory by old clients,
and does not affect the block hash. And, as noted above, old
clients re-serialize (with canonical encoding) 'tx' and 'block'
messages before relaying to peers.
Variable-length integers: bytes are a MSB base-128 encoding of the number.
The high bit in each byte signifies whether another digit follows. To make
the encoding is one-to-one, one is subtracted from all but the last digit.
Thus, the byte sequence a[] with length len, where all but the last byte
has bit 128 set, encodes the number:
(a[len-1] & 0x7F) + sum(i=1..len-1, 128^i*((a[len-i-1] & 0x7F)+1))
Properties:
* Very small (0-127: 1 byte, 128-16511: 2 bytes, 16512-2113663: 3 bytes)
* Every integer has exactly one encoding
* Encoding does not depend on size of original integer type
We're in a wholly different world now, C++-compiler-wise.
Current std::stringstream implementations don't have the stated problem anymore,
and are just as fast as CDataStream.
The #ifdef'd block does not even compile anymore; CDataStream constructor changed,
and missing some std::. Also timing in whole seconds is also way too granular
to say anything sensible in such microbenchmarks. Just remove it,
it can always be found again in git history.
This commit removes the dependency of serialize.h on PROTOCOL_VERSION,
and makes this parameter required instead of implicit. This is much saner,
as it makes the places where changing a version number can have an
influence obvious.
* move PROTOCOL_VERSION to version.h
* move CLIENT_VERSION* to version.h, make available past cpp stage
* clearly separate client, network version portions of version.h