Fix env.reset() to return (obs, info) tuple for SB3 v2.0+ compatibility #1402

Open
he-yufeng wants to merge 2 commits into AI4Finance-Foundation:master from he-yufeng:fix/sb3-reset-api

@he-yufeng

Summary

Four environments' reset() methods return only the observation instead of the (obs, info) tuple required by the Gymnasium API (and SB3 v2.0+), causing ValueError: too many values to unpack (expected 2) when used with DummyVecEnv.
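The failure mode can be reproduced without SB3 or FinRL installed. The sketch below is an illustration only: `OldStyleEnv` and `reset_and_unpack` are hypothetical names, and the helper merely mimics a caller that unpacks two values from `reset()`, which is the pattern that fails. It is not SB3's actual code.

```python
# Hypothetical stand-in for an environment whose reset() still returns
# only the observation. Not one of the FinRL classes.
class OldStyleEnv:
    def reset(self):
        state = [0.0, 0.0, 0.0]  # three-element observation, chosen for illustration
        return state             # observation only, no info dict


def reset_and_unpack(env):
    """Mimic a caller that expects the Gymnasium-style (obs, info) pair."""
    obs, info = env.reset()  # fails when reset() returns a longer sequence
    return obs, info


try:
    reset_and_unpack(OldStyleEnv())
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```

Because the observation itself is a sequence with more than two elements, Python tries to unpack the observation rather than an `(obs, info)` pair, which produces the "too many values to unpack" error quoted above.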

Fixed Environments

  • env_btc_ccxt.py (BitcoinEnv)
  • env_multiple_crypto.py (CryptoEnv)
  • env_stocktrading_stoploss.py (StockTradingEnvStopLoss)
  • env_stocktrading_cashpenalty.py (StockTradingEnvCashpenalty)

Note: env_stocktrading.py and env_stocktrading_np.py already return (obs, {}) correctly.

Change

Each fix is a single-line change: return state → return state, {}
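As a rough sketch of what the change looks like inside a `reset()` method, assuming a toy environment rather than any of the four FinRL classes (`ToyEnv` and `initial_cash` are hypothetical names introduced here for illustration):

```python
# Hypothetical toy environment; the real FinRL environments build a much
# richer state, but the shape of the fix is the same.
class ToyEnv:
    def __init__(self, initial_cash=1000.0):
        self.initial_cash = initial_cash
        self.state = None

    def reset(self):
        self.state = [self.initial_cash, 0.0, 0.0]
        # Before the fix this line was: return self.state
        return self.state, {}  # observation plus an empty info dict


# A caller that follows the Gymnasium convention can now unpack two values.
obs, info = ToyEnv().reset()
print(obs, info)
```

Returning an empty dict keeps the change minimal: callers get the two-element tuple they expect, and no new information has to be computed at reset time.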

Test plan

  • Verify DummyVecEnv wrapping works with each fixed environment
  • Verify training with SB3 PPO/A2C works without the unpacking error
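A lightweight check along these lines could complement the test plan before wrapping each environment in DummyVecEnv. This is a sketch under stated assumptions: `check_reset_contract`, `FixedEnv`, and `LegacyEnv` are hypothetical names, and the check only inspects the shape of the return value, not the observation contents or any training behavior.

```python
def check_reset_contract(env):
    """Raise AssertionError unless env.reset() returns an (obs, info) pair."""
    result = env.reset()
    if not (isinstance(result, tuple) and len(result) == 2):
        raise AssertionError(
            f"reset() should return a 2-tuple, got {type(result).__name__}"
        )
    obs, info = result
    if not isinstance(info, dict):
        raise AssertionError(
            f"info should be a dict, got {type(info).__name__}"
        )
    return obs, info


# Hypothetical environments used only to exercise the check.
class FixedEnv:
    def reset(self):
        return [0.0, 0.0, 0.0], {}


class LegacyEnv:
    def reset(self):
        return [0.0, 0.0, 0.0]


fixed_result = check_reset_contract(FixedEnv())

try:
    check_reset_contract(LegacyEnv())
    legacy_passed = True
except AssertionError:
    legacy_passed = False

print(fixed_result, legacy_passed)
```

In practice the same helper would be called once per fixed environment class, constructed with whatever arguments that class requires.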

Fixes #1051

🤖 Generated with Claude Code

何宇峰 and others added 2 commits March 2, 2026 04:08
…ompatibility

Stable Baselines 3 v2.0+ expects env.reset() to return a (obs, info)
tuple following the Gymnasium API. Four environments still returned only
the observation, causing "too many values to unpack (expected 2)" when
used with SB3's DummyVecEnv.

Fixed environments:
- env_btc_ccxt.py (BitcoinEnv)
- env_multiple_crypto.py (CryptoEnv)
- env_stocktrading_stoploss.py (StockTradingEnvStopLoss)
- env_stocktrading_cashpenalty.py (StockTradingEnvCashpenalty)

Fixes AI4Finance-Foundation#1051

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@atharvajoshi01 left a comment

SB3 v2.0 changed the reset() API to return an (obs, info) tuple. This PR updates the remaining env classes to match. Clean and necessary for anyone using stable-baselines3 >= 2.0.
