🚀 Feature
Please add recurrent layer support for PPO and A2C.
Motivation
I am working on agile 3D biped locomotion. I believe PPO with a recurrent layer such as an LSTM would help significantly.
Pitch
The TensorFlow version of Stable-Baselines provides recurrent policies such as MlpLstmPolicy for algorithms like PPO2 and A2C.
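For reference, a minimal sketch of that usage in the TF1-based Stable-Baselines (the CartPole environment and hyperparameters here are just illustrative):

```python
import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv

# SB2's recurrent policies require the number of envs to be a multiple of
# nminibatches, since minibatches are built from whole sequences
env = DummyVecEnv([lambda: gym.make("CartPole-v1") for _ in range(4)])
model = PPO2("MlpLstmPolicy", env, nminibatches=4, verbose=1)
model.learn(total_timesteps=25_000)
```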
I can see in the stable-baselines3 source code that the predict method of the BasePolicy class already has some provision for recurrent policies (it accepts and returns a recurrent state), but the OnPolicyAlgorithm class does not take this into account.
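A rough sketch of what inference could look like once that is supported (`model` and `env` here are a hypothetical recurrent PPO model and a VecEnv; the state-reset handling is my assumption, not existing SB3 behavior):

```python
import numpy as np

obs = env.reset()
states = None  # recurrent hidden state; None signals the start of an episode

for _ in range(1_000):
    # BasePolicy.predict already accepts a state and returns the next one
    actions, states = model.predict(obs, state=states, deterministic=True)
    obs, rewards, dones, infos = env.step(actions)
    if np.any(dones):
        states = None  # crude reset: clear the hidden state when an episode ends
```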
I would also like to contribute to this.
Checklist
- [x] I have checked that there is no similar issue in the repo
Yup, they would be cool to have, but nobody has had the time to come around to implementing them yet (also, the benefit of LSTMs outside complex tasks is questionable; see this and the bottom of this). Try frame stacking in the meantime.
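For example, with SB3's built-in wrapper (the environment choice is illustrative):

```python
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import VecFrameStack

# Stack the last 4 observations so the policy sees a short history window,
# which often recovers much of what a recurrent layer would provide
env = make_vec_env("CartPole-v1", n_envs=4)
env = VecFrameStack(env, n_stack=4)

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
```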