FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance
Paper: arXiv:2011.09607
A deep reinforcement learning-based autonomous cryptocurrency trading agent using Soft Actor-Critic (SAC) with multi-timeframe OHLCV analysis.
crypto_trading_ai/
├── README.md
├── requirements.txt
├── config/
│   └── config.yaml          # All hyperparameters & settings
├── data/
│   ├── downloader.py        # OHLCV data download (ccxt)
│   ├── features.py          # Technical indicators & feature engineering
│   └── normalizer.py        # Normalization pipeline [-1, 1]
├── env/
│   ├── trading_env.py       # Gymnasium trading environment
│   └── multi_timeframe.py   # Multi-timeframe data alignment
├── agent/
│   ├── networks.py          # Actor & Critic neural networks (CNN-LSTM)
│   ├── sac.py               # SAC algorithm implementation
│   └── replay_buffer.py     # Prioritized experience replay
├── train.py                 # Training entry point
├── evaluate.py              # Backtesting & evaluation
└── utils/
    ├── logger.py            # Training logger & TensorBoard
    └── helpers.py           # Utility functions
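
The tree lists `env/multi_timeframe.py` for multi-timeframe alignment, and the notes further down mention `searchsorted`. The sketch below shows one way such an alignment could work, not the module's actual code: for each base-timeframe timestamp, a binary search finds the most recent higher-timeframe candle that has already closed, which avoids look-ahead. It uses the standard library's `bisect_right` as a stand-in for a `searchsorted`-style lookup. The function name and the sample timestamps are hypothetical.

```python
from bisect import bisect_right

def align_to_base(base_times, higher_times):
    """For each base timestamp, return the index of the latest
    higher-timeframe candle closed at or before it (-1 if none yet).

    Both inputs must be sorted ascending. Each lookup is a binary
    search, so aligning n base rows costs O(n log m).
    """
    return [bisect_right(higher_times, t) - 1 for t in base_times]

# Hypothetical candle close times, in minutes since the start of the data.
base_close = [15, 30, 45, 60, 75, 120]   # 15-minute candles (with a gap)
hour_close = [60, 120]                   # 1-hour candles

idx = align_to_base(base_close, hour_close)
print(idx)
```

An index of -1 means no higher-timeframe candle has closed yet, so the caller would need to pad or skip those rows.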
(3 timeframes × 64 lookback × 21 features) + 5 portfolio state = 4,037 dimensions

| Category | Features |
|---|---|
| Price Returns | returns, log_returns, high_low_range, open_close_range, upper_shadow, lower_shadow |
| Volume | volume_ratio (log ratio to rolling mean) |
| Trend | SMA(10,30), EMA(10,30), each as % distance from close |
| Momentum | RSI(14), Stochastic K/D, MACD/Signal/Histogram |
| Volatility | Bollinger Bands (upper/lower relative), ATR(14) as % of close |
| Volume | OBV rate of change |
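
As a rough illustration of the table above, the sketch below computes the six price-return features for one candle, an SMA distance, and the observation size. The feature names come from the table, but the exact formulas (for example, which price each range is divided by) are assumptions; `data/features.py` may define them differently.

```python
import math

def candle_features(prev_close, o, h, l, c):
    """Price-relative features for one OHLC bar (formulas are assumed)."""
    return {
        "returns": c / prev_close - 1.0,
        "log_returns": math.log(c / prev_close),
        "high_low_range": (h - l) / c,
        "open_close_range": (c - o) / o,
        "upper_shadow": (h - max(o, c)) / c,
        "lower_shadow": (min(o, c) - l) / c,
    }

def sma_distance(closes, period):
    """SMA of the last `period` closes, as a fractional distance from the latest close."""
    sma = sum(closes[-period:]) / period
    return (sma - closes[-1]) / closes[-1]

# Observation size as stated in the README.
N_TIMEFRAMES, LOOKBACK, N_FEATURES, N_PORTFOLIO = 3, 64, 21, 5
obs_dim = N_TIMEFRAMES * LOOKBACK * N_FEATURES + N_PORTFOLIO

feats = candle_features(prev_close=100.0, o=100.0, h=110.0, l=95.0, c=105.0)
print(feats["returns"], obs_dim)
```

Expressing every feature relative to price keeps the inputs comparable across assets and price levels before normalization.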
| Dimension | Range | Meaning |
|---|---|---|
| direction | [-1, 1] | -1=full short, 0=neutral, 1=full long |
| position_size | [0, 1] | Fraction of capital to allocate |
| leverage | [1x, 10x] | Leverage multiplier |
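
The README does not say how the three action dimensions combine into a position. The sketch below shows one plausible reading, under the assumption that the target notional exposure is `direction × position_size × equity × leverage`, with each component clipped to the range in the table. The function names are hypothetical.

```python
def clip(x, lo, hi):
    """Clamp x to the closed interval [lo, hi]."""
    return max(lo, min(hi, x))

def target_exposure(direction, position_size, leverage, equity, max_leverage=10.0):
    """Signed notional exposure implied by one action (assumed mapping).

    Positive values are long, negative values are short.
    """
    d = clip(direction, -1.0, 1.0)          # -1 = full short, 1 = full long
    s = clip(position_size, 0.0, 1.0)       # fraction of capital used as margin
    lev = clip(leverage, 1.0, max_leverage) # leverage multiplier
    margin = s * equity
    return d * margin * lev

# Half the capital, fully long, at 2x leverage on 1,000 of equity.
print(target_exposure(1.0, 0.5, 2.0, 1000.0))
```

Clipping inside the mapping means an out-of-range policy output, such as a leverage of 20, degrades to the nearest valid action instead of raising an error.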
Gymnasium-style reset()/step() API

# Clone and install
pip install -r requirements.txt
# Option A: Download real data from Binance
python -m crypto_trading_ai.data.downloader --config config/config.yaml
# Option B: Generate synthetic data for testing
python -m crypto_trading_ai.data.downloader --config config/config.yaml --synthetic
# Train (auto-generates synthetic data if no CSV files found)
cd crypto_trading_ai
python train.py --synthetic
# Evaluate / Backtest
python evaluate.py --synthetic --plot
# The training is designed for GPU acceleration
# On a T4 GPU: ~300 steps/s with default config
# On CPU: ~30 steps/s with reduced config
python train.py --config config/config.yaml
Edit config/config.yaml to customize everything:
| Section | Key Parameters |
|---|---|
| Data | symbol, exchange, timeframes, train/val/test splits |
| Features | lookback_window, indicators list |
| Environment | initial_capital, max_leverage, commission, reward type |
| Agent | CNN/LSTM dims, SAC hyperparams (lr, gamma, tau, alpha) |
| Training | total_timesteps, eval/save frequency |
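
The Data row mentions train/val/test splits. For time-series data these are usually taken chronologically, without shuffling, so that validation and test data always come after the training data. A minimal sketch, with hypothetical split fractions that are not read from `config/config.yaml`:

```python
def chronological_split(rows, train_frac=0.7, val_frac=0.15):
    """Split time-ordered rows into train/val/test without shuffling.

    The default fractions are placeholders; the remainder goes to test.
    """
    n = len(rows)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return rows[:train_end], rows[train_end:val_end], rows[val_end:]

candles = list(range(100))  # stand-in for 100 time-ordered candles
train, val, test = chronological_split(candles)
print(len(train), len(val), len(test))
```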
After training, the evaluation generates:
Raw OHLCV → Price-relative features → Rolling Z-score → Clip to [-1, 1]
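
The last two stages of this pipeline can be sketched as follows, together with the linear rescale the README mentions for values in [0, 100]. This is an illustration using only the standard library. The window length, the handling of short or constant windows, and the function names are assumptions, not the behavior of `data/normalizer.py`.

```python
import statistics

def rolling_zscore(values, window):
    """Z-score each value against the trailing window ending at that value."""
    out = []
    for i in range(len(values)):
        w = values[max(0, i - window + 1): i + 1]
        std = statistics.pstdev(w) if len(w) > 1 else 0.0
        if std == 0.0:
            out.append(0.0)  # assumed fallback for short or constant windows
        else:
            out.append((values[i] - statistics.fmean(w)) / std)
    return out

def squash(z):
    """Scale a z-score by 1/3 and clip to [-1, 1]."""
    return max(-1.0, min(1.0, z / 3.0))

def rescale_0_100(x):
    """Map a value in [0, 100] linearly onto [-1, 1]."""
    return x / 50.0 - 1.0

feature = [0.01, -0.02, 0.015, 0.30, -0.01]   # hypothetical feature series
normalized = [squash(z) for z in rolling_zscore(feature, window=3)]
print(normalized, rescale_0_100(70.0))
```

Dividing by 3 before clipping means only values more than three standard deviations from the rolling mean are saturated at the bounds.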
Bounded oscillators (RSI, Stochastic): [0,100] → [-1,1] via linear rescale
Rolling z-scores: z/3 and clip
Multi-timeframe alignment: searchsorted (O(n log n))

License: MIT