maddes8cht committed
Commit e8dddbe · verified · 1 Parent(s): afc6640

Update README.md

Files changed (1)
  1. README.md +73 -12
README.md CHANGED
@@ -7,23 +7,84 @@ sdk: static
  pinned: false
  ---
 
- # Traders Lab
 
- 🚧 **Note (from April 30, 2025): This organization is still being set up. No datasets are available yet, but they will be uploaded soon – expected within the coming month (May 2025). Stay tuned!**
 
- **Traders-lab** is a collection of free, structured, and up-to-date financial datasets for training machine learning models in algorithmic trading.
 
- ## 🧠 Motivation
 
- Finding clean and openly licensed trading data can be hard. This project focuses on making such datasets available in different key formats:
 
- - ✅ **Daily candles** over several years
- - ✅ **Hour-level candles** for at least 2 years (growing over time)
- - ✅ **Minute-level candles** for recent history
 
- By providing all three, I aim to support models that can learn across **multiple timescales**, understanding how coarse and fine views of the market relate.
 
- ## 🛠️ Features
 
- - Consistently structured datasets for reproducibility
- - Regular updates to keep training data current
 
  pinned: false
  ---
 
+ # 📊 Traders-Lab – Open Financial Time Series Data
 
+ Traders-Lab publishes **public financial time series datasets** with a strong focus on **high-quality intraday data accumulation** over extended periods of time.
 
+ The primary goal is not short-term freshness, but **long-term continuity and gap-free historical depth**, especially for minute-level data.
 
+ ---
+
+ ## 📢 Announcement
+
+ **A major update will be released today (December 17, 2025) after the US market close.**
+
+ With this release, the long-running *“Preliminary”* phase will be **officially concluded**.
+ A new dataset named **TroveLedger** will mark the transition to a stable and consolidated dataset line.
+
+ Earlier *Preliminary* datasets will remain available temporarily to allow a smooth transition.
+
+ ---
+
+ ## 🔑 Core Focus: Accumulated Minute-Level Data
+
+ High-quality **minute-resolution OHLC data over long time spans** is difficult to obtain from free sources.
+
+ Typical public data access (e.g., via yfinance) provides:
+
+ * **Daily candles:** often spanning decades
+ * **Hourly candles:** approximately one year into the past
+ * **Minute candles:** typically limited to the most recent 7 days
+
+ This makes freshly downloaded minute data unsuitable for training models that rely on **historical intraday patterns**.
+
+ The key value of the datasets published here lies in **continuous accumulation**:
+
+ * Minute-level data is collected day by day
+ * Over time, this results in **months of gap-free minute data**
+ * This provides a fundamentally different foundation for training and evaluation than repeatedly downloading short rolling windows
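
Editorial sketch (not part of the commit): the accumulation idea described in the list above could look roughly like the following. The README does not show how Traders-Lab actually stores or merges candles, so the `Candle` tuple layout, the `accumulate` helper, and the rule that already stored bars win on overlap are all assumptions made for illustration.

```python
from datetime import datetime

# Hypothetical candle layout: (timestamp, open, high, low, close).
Candle = tuple[datetime, float, float, float, float]


def accumulate(history: list[Candle], fresh: list[Candle]) -> list[Candle]:
    """Extend stored candles with a fresh download, keeping existing bars."""
    merged = {candle[0]: candle for candle in fresh}
    # Assumption: bars already in the history are never overwritten.
    merged.update({candle[0]: candle for candle in history})
    return sorted(merged.values(), key=lambda candle: candle[0])


# Two overlapping short windows, as a 7-day rolling download would produce.
day1 = [
    (datetime(2025, 12, 15, 14, 30), 10.0, 10.5, 9.9, 10.2),
    (datetime(2025, 12, 15, 14, 31), 10.2, 10.4, 10.1, 10.3),
]
day2 = [
    (datetime(2025, 12, 15, 14, 31), 10.2, 10.4, 10.1, 10.35),  # overlaps day1
    (datetime(2025, 12, 16, 14, 30), 10.3, 10.6, 10.2, 10.5),
]
history = accumulate(day1, day2)
```

Keying on the timestamp means repeated downloads of the same window only ever add bars that were not yet stored, which is what lets short rolling windows grow into a long series.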
+
+ ---
 
+ ## 🔄 Update Philosophy
 
+ The primary guarantee is **data continuity**, not update frequency.
 
+ Specifically:
+
+ * Daily updates are **not guaranteed**
+ * The absence of **gaps** in accumulated minute data **is** the main objective
+ * Updates are performed on trading days whenever possible
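
Editorial sketch (not part of the commit): one simple way to check the "no gaps" objective is to look for consecutive minute bars on the same calendar day that are more than one minute apart. The `find_intraday_gaps` helper is hypothetical. It ignores trading-session boundaries and does not notice entirely missing days, and a symbol with no trades in a given minute may legitimately have no bar, so treat its output as candidates to inspect rather than as proof of data loss.

```python
from datetime import datetime, timedelta


def find_intraday_gaps(timestamps, step=timedelta(minutes=1)):
    """Return (previous, next) pairs on the same day that are more than `step` apart."""
    ordered = sorted(timestamps)
    gaps = []
    for prev, nxt in zip(ordered, ordered[1:]):
        # Overnight and weekend jumps are expected, so only compare within a day.
        if prev.date() == nxt.date() and nxt - prev > step:
            gaps.append((prev, nxt))
    return gaps


bars = [
    datetime(2025, 12, 15, 14, 30),
    datetime(2025, 12, 15, 14, 31),
    datetime(2025, 12, 15, 14, 34),  # 14:32 and 14:33 are missing
    datetime(2025, 12, 16, 14, 30),  # next day: not reported as a gap
]
gaps = find_intraday_gaps(bars)
```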
+
+ All data updates are designed to **extend existing time series**, not to replace them.
+
+ ---
+
+ ## ⏱️ Update Rotation & Data Freshness
+
+ To balance data quality, processing time, and responsible use of public data sources:
+
+ * **Minute data** is updated most frequently to ensure continuity
+ * **Hourly and daily data** follow a rotation-based update schedule
+ * Hourly and daily datasets are guaranteed to be **no older than one week**
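
Editorial sketch (not part of the commit): the README does not describe the actual rotation schedule. One possible scheme, assumed here purely for illustration, assigns each symbol to one of the five weekdays so that every symbol is refreshed once per Monday-to-Friday week. The `rotation_batch` helper and the ticker list are hypothetical, and market holidays would need extra handling to keep the one-week bound.

```python
from datetime import date

TRADING_DAYS = 5  # Monday to Friday


def rotation_batch(symbols, day):
    """Symbols due for an hourly/daily refresh on `day` under a weekday rotation."""
    weekday = day.weekday()  # Monday == 0
    if weekday >= TRADING_DAYS:
        return []  # nothing is scheduled on weekends
    # Sorting makes the assignment stable regardless of input order.
    return sorted(symbols)[weekday::TRADING_DAYS]


symbols = ["AAPL", "MSFT", "NVDA", "AMZN", "GOOG", "META", "TSLA"]
monday_batch = rotation_batch(symbols, date(2025, 12, 15))
saturday_batch = rotation_batch(symbols, date(2025, 12, 20))
```

Under this scheme each daily batch holds roughly one fifth of the symbols, which spreads the requests to the public data source across the week.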
+
+ This approach significantly reduces unnecessary repeated requests while remaining fully sufficient for training purposes.
+
+ In real-world usage, models are typically deployed using live data feeds from the target trading platform, which naturally provide up-to-date market data.
+
+ ---
+
+ ## 🎯 Intended Use
+
+ The datasets are intended for:
+
+ * machine learning on financial time series
+ * intraday and swing trading research
+ * feature engineering on accumulated OHLC data
+ * backtesting strategies that benefit from dense historical intraday data
+
+ ---
 
+ ## 🔍 Further Information
 
+ Detailed structure descriptions, usage examples, and dataset-specific notes can be found in the individual dataset cards.