Use classes instead of lambdas for schedules #2125

akanto · 2025-04-27T14:09:38Z

…rtable schedules

Previously, using closures (e.g., lambdas) for learning_rate or clip_range caused segmentation faults when loading models across different platforms (e.g., macOS to Linux), because cloudpickle could not safely serialize/deserialize them.

Description

Refactor the schedule-related helper functions into proper classes, so that saved models stay portable and no longer
segfault when they are loaded on a different operating system. Introduces ConstantSchedule, CappedLinearSchedule, and
FloatConverterSchedule.

This commit rewrites:

- constant_fn as a ConstantSchedule class
- get_schedule_fn as a FloatConverterSchedule class
- get_linear_fn as a LinearSchedule class

All schedules are now proper callable classes, making them portable and safely pickleable. Old functions are kept (marked as deprecated) for backward compatibility when loading existing models.
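
As a rough illustration of the pattern described above (not the code merged in this PR), the sketch below writes two schedules as plain callable classes. The constructor arguments (`val`, `start`, `end`, `end_fraction`) and the `roundtrip` helper are assumptions made for the example, and `progress_remaining` is assumed to go from 1 (start of training) to 0 (end). Because the instances only hold numbers, pickle stores a reference to the class plus those attributes instead of serialized function bytecode.

```python
import pickle


class ConstantSchedule:
    """Schedule that returns the same value for any progress."""

    def __init__(self, val: float) -> None:
        self.val = val

    def __call__(self, progress_remaining: float) -> float:
        return self.val


class LinearSchedule:
    """Interpolate linearly from `start` to `end`, then stay at `end`.

    `end` is reached once a fraction `end_fraction` of training has elapsed.
    """

    def __init__(self, start: float, end: float, end_fraction: float) -> None:
        self.start = start
        self.end = end
        self.end_fraction = end_fraction

    def __call__(self, progress_remaining: float) -> float:
        elapsed = 1.0 - progress_remaining
        if elapsed > self.end_fraction:
            return self.end
        return self.start + elapsed * (self.end - self.start) / self.end_fraction


def roundtrip(schedule):
    """Hypothetical helper: serialize and restore a schedule with pickle."""
    return pickle.loads(pickle.dumps(schedule))


if __name__ == "__main__":
    # Only the class reference and the three floats are stored in the payload.
    restored = roundtrip(LinearSchedule(start=1.0, end=0.0, end_fraction=0.5))
    print(restored(0.75))
```

A lambda has no importable name, so it cannot be pickled by reference in this way; it has to be serialized by value, which is the non-portable case this PR moves away from.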

Motivation and Context

Fixes cross-platform segmentation faults (#2115) caused by non-portable closures (like lambdas) being serialized into model files. After this change, saved models are robust, portable, and no longer crash at load time if they are moved across different operating systems.

I have raised an issue to propose this change (required for new features and bug fixes)

Types of changes

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Documentation (update in the documentation)

Checklist

Note: You can run most of the checks using make commit-checks.

Note: we are using a maximum length of 127 characters per line


araffin · 2025-05-05T08:09:24Z

stable_baselines3/common/base_class.py

@@ -273,7 +273,7 @@ def logger(self) -> Logger:

    def _setup_lr_schedule(self) -> None:
        """Transform to callable if needed."""
-        self.lr_schedule = get_schedule_fn(self.learning_rate)
+        self.lr_schedule = FloatConverterSchedule(self.learning_rate)


I would rather have a new get_schedule helper here because FloatConverterSchedule does more than what its name suggests

PS: you don't have to force push for every edit, you can simply push new commits to the same branch

or maybe rename FloatConverterSchedule to something else (the issue is that the Schedule type is already taken... maybe FloatSchedule?)

I have renamed it to FloatSchedule, as recommended, but ScheduleWrapper might be a better name, since it is basically a wrapper that turns a constant into a Schedule and ensures that any callable returns float values.

I wanted to change the original logic as little as possible, but maybe the original function is not clean enough, since it does multiple things.
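
For readers following the naming discussion, here is a minimal sketch of the wrapper behaviour described in this thread: a constant becomes a schedule, and any callable has its output cast to float. It is an assumption-laden illustration, not the merged implementation; the attribute name `value_schedule`, the `Schedule` alias, and the `TypeError` are choices made for the example.

```python
from typing import Callable, Union

# Assumed alias: a schedule maps progress_remaining (1 -> 0) to a float.
Schedule = Callable[[float], float]


class ConstantSchedule:
    """Schedule that returns the same value for any progress."""

    def __init__(self, val: float) -> None:
        self.val = val

    def __call__(self, progress_remaining: float) -> float:
        return self.val


class FloatSchedule:
    """Wrap a constant or a callable so that calls always return a float."""

    def __init__(self, value_schedule: Union[Schedule, float]) -> None:
        if isinstance(value_schedule, FloatSchedule):
            # Avoid nesting wrappers when the input is already wrapped.
            self.value_schedule = value_schedule.value_schedule
        elif isinstance(value_schedule, (float, int)):
            # A plain number becomes a picklable constant schedule.
            self.value_schedule = ConstantSchedule(float(value_schedule))
        elif callable(value_schedule):
            self.value_schedule = value_schedule
        else:
            raise TypeError(f"Expected a number or a callable, got {type(value_schedule)}")

    def __call__(self, progress_remaining: float) -> float:
        # Cast so that e.g. NumPy scalars or ints come back as Python floats.
        return float(self.value_schedule(progress_remaining))
```

In this sketch the wrapper itself is picklable by reference, but a user-supplied lambda stored inside it is not, so portability still depends on what the caller passes in.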

araffin · 2025-05-05T08:16:37Z

stable_baselines3/common/utils.py

@@ -78,6 +78,35 @@ def update_learning_rate(optimizer: th.optim.Optimizer, learning_rate: float) ->
        param_group["lr"] = learning_rate


+class FloatConverterSchedule:


This class actually does more, no? It enforces that we have a schedule object that can be pickled.
Maybe rename it to FloatSchedule and update the docstring to explain its utility (in addition to casting to float).

same comment as above


araffin

thanks for the PR =)
Looks good overall, some minor comments only. I'll try to test it myself later.

PS: we would need to update SB3 contrib too after

- Renamed FloatConverterSchedule to FloatSchedule to better reflect its purpose.
- Moved parameter documentation to the class-level docstring for proper Sphinx support

araffin

LGTM, thanks =)

Once it is merged, could you also open a PR for SB3 contrib and the RL Zoo?

akanto · 2025-05-15T08:03:40Z

Thanks, I can send a pull request. I guess a commit alone is not enough: a stable-baselines3 2.6.1 release is required to update the dependency in https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/blob/master/setup.py#L70. After that, I can update all the get_schedule_fn etc. functions there.

araffin · 2025-05-15T09:07:01Z

I've released an alpha that you can use:
https://pypi.org/project/stable-baselines3/2.6.1a1/

akanto mentioned this pull request Apr 27, 2025

[Bug]: Segmentation fault when continuing training on another machine due to non-portable serialization of learning_rate and clip_range #2115

Closed

5 tasks

akanto force-pushed the save-load-portability branch from 52b6ad1 to 63cfb2e Compare April 27, 2025 16:00

akanto force-pushed the save-load-portability branch from 63cfb2e to a6d8c07 Compare April 27, 2025 16:01

araffin changed the title ~~Fixes #2115. Avoid segmentation fault when loading models with non-po…~~ Use classes instead of lambas for schedules Apr 28, 2025

araffin self-requested a review May 5, 2025 08:06

araffin reviewed May 5, 2025


araffin reviewed May 5, 2025


akanto and others added 2 commits May 10, 2025 14:58

Incorporate pull request comments:

161cbfc

- Renamed FloatConverterSchedule to FloatSchedule to better reflect its purpose.
- Moved parameter documentation to the class-level docstring for proper Sphinx support

Merge branch 'master' into save-load-portability

77cde66

araffin self-requested a review May 12, 2025 18:46

araffin changed the title ~~Use classes instead of lambas for schedules~~ Use classes instead of lambdas for schedules May 14, 2025

araffin added 2 commits May 14, 2025 12:55

Update changelog and test

0a4cce1

Add more tests and deprecate explicitely the lambdas

5efea4e

araffin approved these changes May 14, 2025


araffin merged commit f9c4ca5 into DLR-RM:master May 14, 2025
4 checks passed

akanto mentioned this pull request May 17, 2025

Use classes for schedules instead of lambdas Stable-Baselines-Team/stable-baselines3-contrib#294

Merged

15 tasks
