sb3-extra-buffers documentation
Memory-efficient Stable-Baselines3 buffers
sb3-extra-buffers is an unofficial collection of extra Stable-Baselines3 buffer classes. Its main goal is simple: keep large reinforcement-learning buffers small enough to use comfortably by compressing observations before they are stored.
The package is designed for reinforcement-learning workloads with large replay or rollout buffers, especially Atari-style image observations where storing raw frames can dominate memory use. As shown in the benchmarks on Atari games, the best configurations reduce memory use by more than 95% while keeping sampling latency close to that of the uncompressed SB3 buffers.
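To make the scale concrete, here is a rough back-of-the-envelope calculation. The observation layout (4 stacked 84x84 uint8 frames) and the buffer size of one million observations are assumptions chosen for illustration, not figures taken from the project's benchmarks.

```python
# Assumed observation layout: 4 stacked 84x84 grayscale frames stored as uint8.
FRAME_STACK, HEIGHT, WIDTH = 4, 84, 84
BYTES_PER_ELEMENT = 1  # uint8
BUFFER_SIZE = 1_000_000  # assumed number of stored observations

bytes_per_obs = FRAME_STACK * HEIGHT * WIDTH * BYTES_PER_ELEMENT
raw_bytes = bytes_per_obs * BUFFER_SIZE

# A reduction of more than 95% leaves at most 5% of the raw size.
compressed_upper_bound = raw_bytes * 5 // 100

print(f"raw observations: {raw_bytes / 1e9:.1f} GB")
print(f"after a 95% reduction: {compressed_upper_bound / 1e9:.1f} GB")
```

Under these assumptions the raw observations take about 28.2 GB, and a 95% reduction brings that to about 1.4 GB. Actual savings depend on the environment and the compression method.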
Why compress buffers?
Many RL runs store hundreds of thousands or millions of observations. For image observations, the buffer can become the dominant memory cost long before the policy network does. This project targets that bottleneck while preserving the normal SB3 integration pattern:
pass a custom buffer class to an SB3 algorithm,
pass compression options through rollout_buffer_kwargs or replay_buffer_kwargs,
continue training with the normal SB3 workflow.
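The integration pattern above can be sketched roughly as follows. This is a minimal illustration, not the package's documented API: PlaceholderCompressedBuffer is a hypothetical stand-in for one of the package's buffer classes, and compression_method is an assumed option name. Only rollout_buffer_class and rollout_buffer_kwargs are actual SB3 constructor arguments (off-policy algorithms use replay_buffer_class and replay_buffer_kwargs instead).

```python
from typing import Any, Dict, Type


class PlaceholderCompressedBuffer:
    """Hypothetical stand-in for a compressed rollout buffer class."""


def make_rollout_buffer_args(buffer_cls: Type, **buffer_options: Any) -> Dict[str, Any]:
    """Build the two keyword arguments SB3 on-policy algorithms accept for custom buffers."""
    return {
        "rollout_buffer_class": buffer_cls,
        "rollout_buffer_kwargs": dict(buffer_options),
    }


# "compression_method" is an assumed option name; check the package docs for the real ones.
algo_args = make_rollout_buffer_args(
    PlaceholderCompressedBuffer, compression_method="rle-jit"
)

# With Stable-Baselines3 installed, the arguments would be forwarded like this:
#   from stable_baselines3 import PPO
#   model = PPO("CnnPolicy", env, **algo_args)
#   model.learn(total_timesteps=100_000)
```

Keeping the buffer choice in a plain dictionary means the rest of the training script stays identical to an uncompressed SB3 run.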
Lossless compression is most useful when observations contain repeated or
structured values. Good candidates include semantic-segmentation masks, color
palette game frames, grayscale observations, and many RGB observations. For
noisy RGB input, zstd is usually a stronger default than run-length
encoding.
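The difference between structured and noisy input can be illustrated with a small standard-library sketch. It does not use the package's own backends: rle_encode is a naive illustrative encoder, zlib stands in for zstd as a general-purpose compressor, and the 3-byte cost per run is an assumed storage format.

```python
import random
import zlib
from typing import List, Tuple


def rle_encode(data: bytes) -> List[Tuple[int, int]]:
    """Run-length encode bytes into (value, run_length) pairs."""
    runs: List[Tuple[int, int]] = []
    for value in data:
        if runs and runs[-1][0] == value:
            runs[-1] = (value, runs[-1][1] + 1)
        else:
            runs.append((value, 1))
    return runs


def rle_decode(runs: List[Tuple[int, int]]) -> bytes:
    """Invert rle_encode, confirming the encoding is lossless."""
    return b"".join(bytes([value]) * length for value, length in runs)


BYTES_PER_RUN = 3  # assumed format: 1 byte for the value, 2 bytes for the run length

# Structured input: an 84x84 mask split into two class regions.
structured = bytes([0] * (84 * 42) + [3] * (84 * 42))
# Noisy input: 84x84 uniformly random byte values.
rng = random.Random(0)
noisy = bytes(rng.randrange(256) for _ in range(84 * 84))

structured_runs = rle_encode(structured)
noisy_runs = rle_encode(noisy)

for name, raw, runs in [("structured", structured, structured_runs),
                        ("noisy", noisy, noisy_runs)]:
    print(name, "raw:", len(raw),
          "rle:", len(runs) * BYTES_PER_RUN,
          "zlib:", len(zlib.compress(raw)))
```

On the structured mask both approaches shrink the data dramatically. On the noisy input the run-length encoding grows well beyond the raw size, while the general-purpose compressor stays close to it, which is the behaviour behind the recommendation above.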
Package layout
sb3_extra_buffers.compressed: Compressed replay and rollout buffers, compressed arrays, dtype helpers, and compression backend discovery.
sb3_extra_buffers.recording: Circular recording buffers for frames, rewards, actions, and optional features, plus frameless and dummy variants.
sb3_extra_buffers.training_utils: Atari environment creation, evaluation, and replay-buffer warm-up helpers.
sb3_extra_buffers.vec_buf and sb3_extra_buffers.gpu_buffers: Experimental helpers for multi-buffer delegation and GPU-oriented byte storage.
How do I know the compressed buffers are implemented correctly?
The repository includes tested example training scripts. After 10M steps with
rle-jit compression, evaluation on 10,000 episodes produced the rewards
documented in Validation.
Citing this project
If you use this project in your research or work, please cite:
@article{Huang2025EnhancingRL,
title={Enhancing Reinforcement Learning in 3D Environments through Semantic Segmentation: A Case Study in ViZDoom},
author={Hugo Huang},
journal={ArXiv},
year={2025},
volume={abs/2511.11703},
url={https://arxiv.org/abs/2511.11703},
}
I really appreciate it :-)
Indices
The Index lists documented names in alphabetical order.