Skip to content

Add the ability to serialize/deserialize Arc<str> and Box<str> #1339

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

Sokom141
Copy link

@Sokom141 Sokom141 commented May 4, 2025

Description

This PR add the ability to automatically derive SerializeValue and DeserializeValue on structs that contain Box<str> and Arc<str>, enabling particular patterns to reduce memory footprint.

Along with the implementation I have added tests where I thought it made sense. Everything seems to work but I had troubles with the Makefile and wasn't able to bring up the 3-node cluster, so please double check (cargo t -p scylla-cql passed every test anyway).

Thanks!

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • I added relevant tests for new features and bug fixes.
  • All commits compile, pass static checks and pass test.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have provided docstrings for the public items that I want to introduce.
  • I have adjusted the documentation in ./docs/source/.
  • I added appropriate Fixes: annotations to PR description.

Fixes: #1336

Copy link

github-actions bot commented May 4, 2025

cargo semver-checks found no API-breaking changes in this PR.
Checked commit: 45b3d5c

Copy link
Collaborator

@Lorak-mmk Lorak-mmk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the contribution! I left some minor comments regarding tests.
I also think that docs should be adjusted.

  • docs/source/data-types/text.md - it should mention that Box and Arc are supported.
  • docs/source/data-types/data-types.md mapping should also be updated.

@Sokom141
Copy link
Author

Sokom141 commented May 6, 2025

Let me know if everything looks good now.
Thank you

wprzytula
wprzytula previously approved these changes May 8, 2025
@Lorak-mmk
Copy link
Collaborator

@Sokom141 When looking into this I uncovered some more issues.
First I was surprised that Box<str> does not already work in SerializeValue. It should because we have SerializeValue impl for Box<T>.
Turns out it does not work because we only have an impl for &str instead of str - changing it is easy and unlocks SerializeValue for Box<str>. It also does not break &str impl because we have impl for &T.
Adding a similar impl for Arc<T> is also easy.

Then I looked into DeserializeValue thinking that it should also be easy to add support for arbitrary Box<T> and Arc<T>.
And we can write the following code for that:

impl<'frame, 'metadata, T: DeserializeValue<'frame, 'metadata>> DeserializeValue<'frame, 'metadata>
    for Box<T>
{
    fn type_check(typ: &ColumnType) -> Result<(), TypeCheckError> {
        T::type_check(typ).map_err(typck_error_replace_rust_name::<Self>)
    }

    fn deserialize(
        typ: &'metadata ColumnType<'metadata>,
        v: Option<FrameSlice<'frame>>,
    ) -> Result<Self, DeserializationError> {
        T::deserialize(typ, v)
            .map(Box::new)
            .map_err(deser_error_replace_rust_name::<Self>)
    }
}

This code unfortunately has a subtle mistake. typck_error_replace_rust_name only works with our built-in errors.
If the T above is some user type, for which the user implements DeserializeValue manually with a different error type, typck_error_replace_rust_name will panic when called.
It is also true for deser_error_replace_rust_name.

Then I started to wonder if we already have this bug somewhere else since it is so easy to cause it.
Turns out we do: in imple for BTreeSet<T> and HashSet<T, S>, in type_check we call type_check in T instead of ListLikeIterator<T>!
This is not all unfortunately. Implementations for secrecy, MaybeEmpty, and Option do not perform renaming at all, which would result in weird error messages in case of errors.
Apart from that I noticed that SerializeValue does not have renaming at all, but I did not yet check if it is a problem or intentional.

I think we should:

  • Either change deser_error_replace_rust_name and typck_error_replace_rust_name to not panic on custom errors, or introduce new functions for this purpose.
  • Change SerializeValue to work on str instead of &str
  • Add Arc<T> impl of SerializeValue.
  • Add Box<T> and Arc<T> imple of DeserializeValue
  • Verify that renaming should be performed in impls for secrecy, MaybeEmpty, and Option, and add it if so
  • Write tests for invalid types to make sure error messages are correct, work with custom errors, and perform renaming with builtin errors.
  • Check if we need to add renaming in SerializeValue, and if so add it.
  • In any case, invalid types tests are also necessary in SerializeValue.

I'll open an issue for that and we'll try to address it soon. I don't think there is much sense in merging this PR since it doesn't solve the real issue we found.

@Sokom141
Copy link
Author

Sokom141 commented May 8, 2025

Nice catch. This is fine for me, I am happy that this has uncovered some underlying problems!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable derive (de)serialization of Box<str>
3 participants