🚀 Feature
Please add recurrent layer support for PPO and A2C.
Motivation
I am working on agile 3D biped locomotion. I believe PPO with a recurrent layer such as an LSTM would help significantly.
Pitch
The TensorFlow version of Stable-Baselines provides recurrent policies such as MlpLstmPolicy for algorithms like PPO2 and A2C.
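For reference, a minimal sketch of that usage in the TF1-based Stable-Baselines (the CartPole environment and hyperparameters here are just illustrative):

```python
import gym
from stable_baselines import PPO2
from stable_baselines.common.vec_env import DummyVecEnv

# SB2's recurrent policies require the number of envs to be a multiple of
# nminibatches, since minibatches are built from whole sequences
env = DummyVecEnv([lambda: gym.make("CartPole-v1") for _ in range(4)])
model = PPO2("MlpLstmPolicy", env, nminibatches=4, verbose=1)
model.learn(total_timesteps=25_000)
```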
I can see in the stable-baselines3 source code that the predict method of the BasePolicy class already has some provision for recurrent policies (it accepts and returns a recurrent state), but the OnPolicyAlgorithm class does not take this into account.
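A rough sketch of what inference could look like once that is supported (`model` and `env` here are a hypothetical recurrent PPO model and a VecEnv; the state-reset handling is my assumption, not existing SB3 behavior):

```python
import numpy as np

obs = env.reset()
states = None  # recurrent hidden state; None signals the start of an episode

for _ in range(1_000):
    # BasePolicy.predict already accepts a state and returns the next one
    actions, states = model.predict(obs, state=states, deterministic=True)
    obs, rewards, dones, infos = env.step(actions)
    if np.any(dones):
        states = None  # crude reset: clear the hidden state when an episode ends
```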
I would also like to contribute to this.
Checklist
- [x] I have checked that there is no similar issue in the repo
Yup, they would be cool to have, but nobody has had the time to come around to implementing them yet (also, the benefit of LSTMs outside complex tasks is questionable; see this and the bottom of this). Try frame stacking in the meantime.
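For example, with SB3's built-in wrapper (the environment choice is illustrative):

```python
from stable_baselines3 import PPO
from stable_baselines3.common.env_util import make_vec_env
from stable_baselines3.common.vec_env import VecFrameStack

# Stack the last 4 observations so the policy sees a short history window,
# which often recovers much of what a recurrent layer would provide
env = make_vec_env("CartPole-v1", n_envs=4)
env = VecFrameStack(env, n_stack=4)

model = PPO("MlpPolicy", env, verbose=1)
model.learn(total_timesteps=100_000)
```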