Look ahead bias of CryptoTradingEnv in Chapter 4 #39
-
In quant finance, look-ahead bias in training and backtesting is a common pitfall. When you create CryptoTradingEnv, the observation taken at each step is: Tensorflow-2-Reinforcement-Learning-Cookbook/Chapter04/crypto_trading_env.py, Lines 138 to 141 in 31f8376. I think …
Replies: 3 comments
-
Same issue applies to Chapter 5.
-
Good point! For example, the agent would observe the ticker data for the past 30 days/hours/minutes in one observation. In practice, when using a live stream of ticker data (from an exchange server), we would buffer 30 data points before we can start getting action predictions from the agent.
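To make the buffering idea concrete, here is a minimal sketch of what it could look like on a live stream. It is not taken from the recipe: `ObservationBuffer`, `push`, and the placeholder integer prices are hypothetical names and data used only for illustration, and the window length of 30 is assumed from the default mentioned in this thread.

```python
from collections import deque

HORIZON = 30  # assumed window length, matching the default mentioned in this thread


class ObservationBuffer:
    """Hypothetical helper: keeps only the most recent `horizon` ticks."""

    def __init__(self, horizon=HORIZON):
        self.horizon = horizon
        self._ticks = deque(maxlen=horizon)  # oldest tick is dropped automatically

    def push(self, tick):
        """Add the newest tick; return a full window, or None while warming up."""
        self._ticks.append(tick)
        if len(self._ticks) < self.horizon:
            return None  # not enough history yet, so no action can be predicted
        return list(self._ticks)  # ascending in time, newest tick last


buffer = ObservationBuffer()
# Placeholder stream: tick i simply carries the value i.
windows = [buffer.push(tick) for tick in range(100)]
first_ready = next(i for i, w in enumerate(windows) if w is not None)
```

With this setup the first usable observation arrives only once 30 ticks have been received, and every window ends at the tick that was just pushed, so nothing newer than "now" can leak into the observation.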
No. This doesn't mean the agent is really looking ahead. The intended interpretation is to treat the horizon as historical data fed to the agent (not future/look-ahead data).
Both are equivalent. I made the choice to use the first option since that's simpler to represent in a data frame and matches real data streams (ascending order in time) from exchanges. Hope that addresses your concern?
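One way to read the equivalence, sketched under my own assumptions about what the two options are: either the step index marks the oldest row of the window (slice forward from it), or it marks the newest row (slice backward from it). The function names `window_forward` and `window_backward` and the placeholder price list are hypothetical and do not come from the recipe's code.

```python
HORIZON = 30  # assumed default window length

# Placeholder price series in ascending time order (oldest first).
prices = list(range(1000, 1100))


def window_forward(data, step, horizon=HORIZON):
    # Assumed option A: `step` indexes the OLDEST row in the window.
    return data[step:step + horizon]


def window_backward(data, now, horizon=HORIZON):
    # Assumed option B: `now` indexes the NEWEST row; look back `horizon` rows.
    return data[now - horizon + 1:now + 1]


step = 10
now = step + HORIZON - 1  # wall-clock "now" is the last row of the forward window
same_window = window_forward(prices, step) == window_backward(prices, now)
```

Under this reading the two conventions differ only by a constant index shift of `HORIZON - 1`. The agent sees the same rows either way, provided the action is applied at or after the newest row in the window.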
-
I see. Thanks for answering.
Good point! The `current_step` used in this RL context is just an abstraction for the MDP step number. There's really no representation for the current (time) step. So, the `horizon` used in the above recipe is, in practical terms, equivalent to the `history` and not the future/look-ahead.
For example, assuming a value of `30` for the horizon as per the default config in the recipe:
Tensorflow-2-Reinforcement-Learning-Cookbook/Chapter04/crypto_trading_env.py
Lines 21 to 22 in 31f8376
…