Why is there no frame stacking in the Atari wrapper, only max-pooling over the last 2 observations?
I can see that OpenAI's Gym does the same thing in its AtariPreprocessing class.
The original DeepMind DQN paper used max-pooling as well, but it also stacked the 4 most recent preprocessed frames before feeding them into the DQN.
I saw that the results of SB3's DQN match those of SB2, but do they match DeepMind's implementation?
Could you point me to a paper or result showing that this doesn't affect performance on Atari?
And if so, why did DeepMind use it in their paper?
Thank you!
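
To make the distinction in my question concrete, here is a minimal pure-Python sketch of the two operations as I understand them. The names max_pool_pair and FrameStacker are hypothetical and only for illustration; they are not part of SB3 or Gym, and the toy 2x2 lists stand in for real game frames. Max-pooling merges two consecutive raw frames into a single image, whereas frame stacking keeps several processed frames side by side so the network can infer motion.

```python
from collections import deque


def max_pool_pair(prev_frame, curr_frame):
    """Element-wise maximum of two equally sized 2D frames (lists of rows)."""
    return [
        [max(a, b) for a, b in zip(row_prev, row_curr)]
        for row_prev, row_curr in zip(prev_frame, curr_frame)
    ]


class FrameStacker:
    """Hypothetical helper that keeps the n_stack most recent frames."""

    def __init__(self, n_stack=4):
        self.frames = deque(maxlen=n_stack)

    def reset(self, first_frame):
        # Fill the whole stack with the first frame so the output
        # always contains exactly n_stack frames.
        self.frames.clear()
        for _ in range(self.frames.maxlen):
            self.frames.append(first_frame)
        return list(self.frames)

    def push(self, frame):
        # Appending to a full deque drops the oldest frame automatically.
        self.frames.append(frame)
        return list(self.frames)


# Toy 2x2 "frames" standing in for two consecutive raw screens.
f0 = [[0, 5], [1, 0]]
f1 = [[3, 0], [1, 2]]

pooled = max_pool_pair(f0, f1)      # one image: removes sprite flicker
stacker = FrameStacker(n_stack=4)
obs = stacker.reset(pooled)         # four copies of the first frame
obs = stacker.push(f1)              # oldest copy dropped, f1 is newest
```

In this sketch the pooled result is still a single frame, so on its own it carries no information about velocity or direction; that only appears once several frames are stacked, which is why I am asking where the stacking step happens.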