Full API compatibility with Boost.Serialization #8

Closed
sithhell opened this issue Jul 12, 2013 · 10 comments

@sithhell

As far as I can see, the API of cereal only looks like Boost.Serialization.
We would really like to use cereal in our project, but we would need it to be a full drop-in replacement.

Things we currently miss:

  • versioning
  • operator& for archives
  • operator<< for output archives
  • operator>> for input archives
@randvoorhies
Contributor

Adding the compatible operators would be pretty trivial, so I'm open to it. The versioning, however, would be a bit trickier. Adding version info to the archives for every type goes a bit against the spirit of cereal (keep the bits low, add it yourself if you need it), and I don't think faking it by always setting it to 0 or something would be a good idea. However, it should be possible to detect whether users are using a serialize(Archive &, uint32_t) method, and to read/write versioning info only if so.
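
A minimal sketch of how such signature-based detection could look, using expression SFINAE in C++11. Everything in the `sketch` namespace, along with `DummyArchive`, `Plain` and `Versioned`, is a hypothetical name made up for illustration; this is not cereal's actual trait implementation.

```cpp
#include <cstdint>
#include <type_traits>
#include <utility>

namespace sketch {
namespace detail {
// Chosen when t.serialize(archive, std::uint32_t) is a valid expression.
template <class T, class Archive>
auto versioned_test(int) -> decltype(
    void(std::declval<T&>().serialize(std::declval<Archive&>(),
                                      std::uint32_t{})),
    std::true_type{});

// Fallback when the expression above is ill-formed.
template <class, class>
std::false_type versioned_test(...);
}  // namespace detail

// Hypothetical trait: true_type if T has a versioned member serialize.
template <class T, class Archive>
struct has_versioned_member_serialize
    : decltype(detail::versioned_test<T, Archive>(0)) {};
}  // namespace sketch

// Toy types used only to exercise the trait.
struct DummyArchive {};

struct Plain {
  template <class Archive>
  void serialize(Archive&) {}
};

struct Versioned {
  template <class Archive>
  void serialize(Archive&, std::uint32_t) {}
};

static_assert(
    sketch::has_versioned_member_serialize<Versioned, DummyArchive>::value,
    "versioned signature should be detected");
static_assert(
    !sketch::has_versioned_member_serialize<Plain, DummyArchive>::value,
    "unversioned signature should not be detected");
```

An archive could branch on a trait like this at compile time, so types without the two-argument signature never touch any version-handling code.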

@sithhell
Author

I agree. The problem is that we need a fallback for non-C++11 compilers. So we can't pay the one-time price of getting rid of the versioning (we don't really need it anyway, but the price is still high).

I like the suggestion to detect the versioning based on the signature.

@AzothAmmo
Contributor

I definitely don't want versioning information to be in archives by default since it adds bloat.

I also like the automatic detection of versioning based off of the signature. This will take a bit of thought to figure out how to implement this well - we want the implementation to have zero overhead for users that don't use it. It will also require a lot of type trait work.

The operators are no problem and we can put them under a different doxygen group to discourage their use except for maintaining compatibility with boost.
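
As a rough illustration of why the operators are cheap to add, the sketch below forwards `operator&`, `operator<<` and `operator>>` to a single call operator. `ToyOutputArchive`, `ToyInputArchive` and `operators_round_trip` are hypothetical names for this example only; cereal's real archive classes are structured differently.

```cpp
#include <istream>
#include <ostream>
#include <sstream>

namespace sketch {

// Toy text archive for illustration only; not cereal's archive type.
class ToyOutputArchive {
public:
  explicit ToyOutputArchive(std::ostream& os) : os_(os) {}

  // Native call style: archive(value).
  template <class T>
  ToyOutputArchive& operator()(const T& value) {
    os_ << value << ' ';
    return *this;
  }

  // Boost-style spellings simply forward to operator().
  template <class T>
  ToyOutputArchive& operator&(const T& value) { return (*this)(value); }
  template <class T>
  ToyOutputArchive& operator<<(const T& value) { return (*this)(value); }

private:
  std::ostream& os_;
};

class ToyInputArchive {
public:
  explicit ToyInputArchive(std::istream& is) : is_(is) {}

  template <class T>
  ToyInputArchive& operator()(T& value) {
    is_ >> value;
    return *this;
  }

  template <class T>
  ToyInputArchive& operator&(T& value) { return (*this)(value); }
  template <class T>
  ToyInputArchive& operator>>(T& value) { return (*this)(value); }

private:
  std::istream& is_;
};

// Writes two ints with the Boost-style operators and reads them back.
inline bool operators_round_trip() {
  std::stringstream stream;
  ToyOutputArchive out(stream);
  out << 7;
  out & 42;

  ToyInputArchive in(stream);
  int a = 0;
  int b = 0;
  in >> a;
  in & b;
  return a == 7 && b == 42;
}

}  // namespace sketch
```

Because each operator is a one-line forwarder, grouping them separately in the documentation costs nothing at run time.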

@sithhell
Author

On 13.07.2013 08:14, "Shane Grant" [email protected] wrote:

> I definitely don't want versioning information to be in archives by default since it adds bloat.

*nod*

> I also like the automatic detection of versioning based off of the signature. This will take a bit of thought to figure out how to implement this well - we want the implementation to have zero overhead for users that don't use it. It will also require a lot of type trait work.

Sounds reasonable.

> The operators are no problem and we can put them under a different doxygen group to discourage their use except for maintaining compatibility with boost.

Perfect. Looking forward to all this!



@hkaiser

hkaiser commented Jul 13, 2013

I believe it would be sufficient not to have the versioning information in the archive, but just have the serialization members be allowed to expect a second argument (which could be zero, always).

Another thing which would be needed is non-intrusive serialization (using non-member functions). Is that something which could be done as well?

@randvoorhies
Contributor

Non-member serialize and save/load capability is already implemented. Passing 0 to the version could be done easily, but it feels like the wrong thing to do. If we're going to implement it, I vote that we should do it properly.

randvoorhies added a commit that referenced this issue Jul 14, 2013
Just proof of concept with a cereal::traits::has_versioned_member_serialize trait working. Need to implement all of the other _versioned_ variants, and then implement all of the mutually exclusive logic to get the serialization function selection to work.

This relates to issue #8
AzothAmmo added a commit that referenced this issue Dec 3, 2013
Remaining things to do: Modify cereal.hpp to properly choose between
versioned and non-versioned functions and place entries in the set of
versioned types as appropriate.
@AzothAmmo
Contributor

This is very nearly done, see the latest changes at https://github.com/USCiLab/cereal/commits/boost_compat_new.

Final API will look basically identical to Boost for this use case with lots of comments heavily discouraging its use.

All types have a version of 0 unless specified otherwise with a version macro. Cereal will use the versioned serialization function if it exists and will fail to compile if it detects both a versioned and non-versioned set of serialization functions.
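
One way the "version 0 unless specified with a macro" rule could be expressed is a trait with a default value plus a specializing macro. This is only a sketch: `sketch::class_version`, `SKETCH_CLASS_VERSION`, `Gadget` and `Widget` are hypothetical names, and the macro cereal actually provides may be named and implemented differently.

```cpp
#include <cstdint>
#include <type_traits>

namespace sketch {
// Every type defaults to version 0.
template <class T>
struct class_version : std::integral_constant<std::uint32_t, 0> {};
}  // namespace sketch

// Hypothetical macro: specializes the trait for one type.
// Must be used at global namespace scope, after Type is declared.
#define SKETCH_CLASS_VERSION(Type, N)                                     \
  namespace sketch {                                                      \
  template <>                                                             \
  struct class_version<Type> : std::integral_constant<std::uint32_t, N> { \
  };                                                                      \
  }

// Toy types used only to exercise the trait.
struct Gadget {};  // no macro: stays at version 0
struct Widget {};

SKETCH_CLASS_VERSION(Widget, 3)

static_assert(sketch::class_version<Gadget>::value == 0,
              "unregistered types default to version 0");
static_assert(sketch::class_version<Widget>::value == 3,
              "registered types report their declared version");
```

With a trait like this, the archive can look up the version at compile time and pass it as the second argument of a versioned serialize function.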

AzothAmmo added a commit that referenced this issue Dec 5, 2013
Simple case of making these functions for the rest of the output archive serialization functions and then adding it to
load.  Progress towards issue #8.
AzothAmmo added a commit that referenced this issue Dec 8, 2013
@AzothAmmo
Contributor

So I'd say this issue is now about 90% complete. However there is an important design decision to be made before this is finished.

Currently, the version information is serialized only when the first instance of some class is serialized. This is nice because it adds zero overhead to use cases where classes have no versioning, and fairly little overhead for those that opt for version information.

The overhead for classes that do use version information is 32 bits of data the first time a class is serialized, plus a lookup in an std::unordered_set each time a class with version information is serialized, to see whether its version has already been written.
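
The bookkeeping described above can be sketched as follows. `ToyVersionedWriter`, `TypeA` and `TypeB` are hypothetical, and keying the set on `std::type_index` is an assumption made for this example; the thread only states that an `std::unordered_set` is used.

```cpp
#include <cstdint>
#include <typeindex>
#include <typeinfo>
#include <unordered_set>
#include <vector>

namespace sketch {

// Toy writer: emits a 32-bit version only for the first instance of a type.
class ToyVersionedWriter {
public:
  // Call once per versioned object being saved.
  // Returns true if the version word was written for this call.
  template <class T>
  bool begin_object(std::uint32_t version) {
    // insert().second is true only when the type was not yet in the set.
    const bool first = seen_.insert(std::type_index(typeid(T))).second;
    if (first) {
      words_.push_back(version);  // the one-time 32 bits of overhead
    }
    return first;
  }

  // Version words emitted so far, in order of first appearance.
  const std::vector<std::uint32_t>& words() const { return words_; }

private:
  std::unordered_set<std::type_index> seen_;  // assumed key type
  std::vector<std::uint32_t> words_;
};

}  // namespace sketch

// Toy types used only to exercise the writer.
struct TypeA {};
struct TypeB {};
```

Saving many objects of the same type costs one set lookup per object but only one version word in total, which matches the overhead described above.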

The disadvantages of such a method come for those who use version information in combination with text archives (XML, JSON), which support "out of order" loading and can load up data in an order that does not match the order it appears in the text archive. Obviously this could potentially be a problem if the first instance of some versioned class that is loaded isn't the one that has the version information with it.

The alternative to the above is to have all version information dumped at the beginning of an archive and loaded when the archive is first opened. This would add some small amount of overhead to all archives, regardless of whether they use versioning. Chances are we would have one byte indicating whether such versioning information is present, followed by a mapping from identifiers to class versions (if present).

Advantages to the second method are that it is more efficient for those who use version information since we don't have to do the check each time we save or load to pull in the version information (it is done once at archive start up). The disadvantage is that this adds some small amount of overhead to all archives regardless of whether they use versioning.

What are everyone's thoughts on this? I'm currently leaning towards including this information at the beginning of the archive because I don't like the idea of someone using versioning with a text archive and getting a crash, but the purist in me doesn't want to add overhead for those that don't want it.

@AzothAmmo
Contributor

Turns out there is a way to do this with no overhead for those who don't want to use it, which I didn't realize for whatever reason. The way it will work in the end will be:

Version information comes first in an archive. You only pay a cost if you use versioning, and then only when types are first registered (at program start-up), using the same static machinery as polymorphic types. Those using text-based archives will need to keep this information at the beginning of their archive (this might change, not sure, will post when done).

AzothAmmo added a commit that referenced this issue Dec 11, 2013
@AzothAmmo
Contributor

I'm calling this feature complete now. I ultimately ended up coding it in basically the same fashion as boost does: version information is output with a type the first time that type is serialized, and loaded the first time that type is loaded. I decided against putting all of the information at the beginning of the archive because of the implications this would have had on the API for those writing their own archive types.

So the overhead is now identical to what it is in boost and those using text archives that can perform out of order loading just need to make sure the first instance of a versioned type they load is the one that contains the version information. I don't imagine most users will use this feature so this shouldn't be a problem and will be well documented.

Documentation and some renaming will be done in issue #22

AzothAmmo added a commit that referenced this issue Dec 22, 2013
Added a test case for versioning which exposed a small bug that related to loading versioned data due to an improper use
of static.