maddes8cht committed on
Commit 5274b2e · 1 Parent(s): e8dddbe

Revise README.md to emphasize long-term data accumulation and clarify data philosophy

  1. README.md +46 -44
README.md CHANGED
@@ -7,84 +7,86 @@ sdk: static
  pinned: false
  ---

- # 📊 Traders-Lab - Open Financial Time Series Data

- Traders-Lab publishes **public financial time series datasets** with a strong focus on **high-quality intraday data accumulation** over extended periods of time.

- The primary goal is not short-term freshness, but **long-term continuity and gap-free historical depth**, especially for minute-level data.

  ---
 
- ## 📒 Announcement

- **A major update will be released today (December 17, 2025) after the US market close.**

- With this release, the long-running *"Preliminary"* phase will be **officially concluded**.
- A new dataset named **TroveLedger** will mark the transition to a stable and consolidated dataset line.

- Earlier *Preliminary* datasets will remain available temporarily to allow a smooth transition.

  ---
 
- ## 🔑 Core Focus: Accumulated Minute-Level Data

- High-quality **minute-resolution OHLC data over long time spans** is difficult to obtain from free sources.

- Typical public data access (e.g. via yfinance) provides:

- * **Daily candles:** often spanning decades
- * **Hourly candles:** approximately one year into the past
- * **Minute candles:** typically limited to the most recent 7 days

- This makes freshly downloaded minute data unsuitable for training models that rely on **historical intraday patterns**.

- The key value of the datasets published here lies in **continuous accumulation**:

- * Minute-level data is collected day by day
- * Over time, this results in **months of gap-free minute data**
- * This provides a fundamentally different foundation for training and evaluation than repeatedly downloading short rolling windows

  ---
 
- ## 🔄 Update Philosophy

- The primary guarantee is **data continuity**, not update frequency.

- Specifically:

- * Daily updates are **not guaranteed**
- * The absence of **gaps** in accumulated minute data **is** the main objective
- * Updates are performed on trading days whenever possible

- All data updates are designed to **extend existing time series**, not to replace them.

  ---
 
- ## ⏱️ Update Rotation & Data Freshness
-
- To balance data quality, processing time, and responsible use of public data sources:

- * **Minute data** is updated most frequently to ensure continuity
- * **Hourly and daily data** follow a rotation-based update schedule
- * Hourly and daily datasets are guaranteed to be **no older than one week**

- This approach significantly reduces unnecessary repeated requests while remaining fully sufficient for training purposes.

- In real-world usage, models are typically deployed using live data feeds from the target trading platform, which naturally provide up-to-date market data.

  ---
 
- ## 🎯 Intended Use
-
- The datasets are intended for:

- * machine learning on financial time series
- * intraday and swing trading research
- * feature engineering on accumulated OHLC data
- * backtesting strategies that benefit from dense historical intraday data

- ---

- ## 🔍 Further Information

- Detailed structure descriptions, usage examples, and dataset-specific notes can be found in the individual dataset cards.
 
  pinned: false
  ---

+ # 📊 Traders-Lab - Accumulated Financial Time Series

+ Traders-Lab publishes **public financial time series datasets** with a deliberate focus on **long-term accumulation**, **structural consistency**, and **historical depth**, rather than short-term freshness.

+ The organization exists to build and maintain datasets that grow *quietly and continuously* over time, forming a reliable archival foundation for research, modeling, and long-horizon analysis.

  ---
 
+ ## 🧭 Core Principle: Accumulation over Freshness

+ High-quality intraday market data is readily available only in short rolling windows from most public sources.

+ Typical access patterns provide:
+ - **Daily candles** over long historical ranges
+ - **Hourly candles** with limited depth
+ - **Minute-level data** restricted to a few recent days

+ Such data is unsuitable for workflows that depend on **historical intraday structure**, regime shifts, or long-term pattern persistence.
+
+ Traders-Lab addresses this limitation by **accumulating minute-level OHLC data incrementally**, day by day.
+ Over time, this approach produces **months and eventually years of gap-free intraday history**, something that cannot be reconstructed retroactively.
 
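The day-by-day accumulation described in the added section can be illustrated with a minimal sketch. It assumes bars are plain tuples keyed by timestamp and that already-archived bars take precedence over re-downloaded ones. The `Bar` layout and the `accumulate` helper are hypothetical and are not taken from the Traders-Lab tooling.

```python
from datetime import datetime, timedelta

# Hypothetical bar layout: (timestamp, open, high, low, close).
Bar = tuple[datetime, float, float, float, float]


def accumulate(archive: list[Bar], fresh: list[Bar]) -> list[Bar]:
    """Extend the archive with freshly downloaded bars without overwriting history."""
    by_time = {bar[0]: bar for bar in fresh}
    # Archived bars win on overlapping timestamps, so recorded history is preserved.
    by_time.update({bar[0]: bar for bar in archive})
    return [by_time[ts] for ts in sorted(by_time)]


# Example: one overlapping minute and one new minute.
t0 = datetime(2025, 12, 17, 9, 30)
archive = [
    (t0, 1.0, 2.0, 0.5, 1.5),
    (t0 + timedelta(minutes=1), 1.5, 2.5, 1.0, 2.0),
]
fresh = [
    (t0 + timedelta(minutes=1), 9.0, 9.0, 9.0, 9.0),  # overlaps the archive
    (t0 + timedelta(minutes=2), 2.0, 3.0, 1.5, 2.5),  # genuinely new bar
]
merged = accumulate(archive, fresh)
```

Letting archived bars win on overlap is only one way to read "extend rather than replace"; a real pipeline might instead compare overlapping bars and flag any differences.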
  ---
 
+ ## 🧱 Data Philosophy

+ The datasets published here follow a small set of strict principles:

+ - **Continuity over update frequency**
+ Updates extend existing time series rather than replacing them.

+ - **Structure over convenience**
+ Data is kept uniform across markets and timeframes.

+ - **Archival integrity**
+ Once recorded, historical data is preserved as part of a growing ledger.

+ - **Responsible sourcing**
+ Public data sources are used conservatively, avoiding unnecessary repeated requests.

+ Freshness is treated as a *secondary concern*; continuity is the primary guarantee.
 
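Because continuity is the stated guarantee, a consumer may want to verify it on a downloaded series. The sketch below flags places where consecutive timestamps within a single trading session are more than one minute apart. The `find_gaps` name and the single-session assumption are illustrative only; handling session breaks (overnight, weekends, holidays) would require a market calendar, which this sketch does not attempt.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, step=timedelta(minutes=1)):
    """Return (previous, next) pairs whose spacing exceeds `step`."""
    ordered = sorted(timestamps)
    return [(prev, nxt) for prev, nxt in zip(ordered, ordered[1:]) if nxt - prev > step]


# Example session with the 09:32 bar missing.
session = [
    datetime(2025, 12, 17, 9, 30),
    datetime(2025, 12, 17, 9, 31),
    datetime(2025, 12, 17, 9, 33),
]
gaps = find_gaps(session)
```

Here `gaps` holds a single pair spanning 09:31 to 09:33, which marks the missing minute.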
  ---
 
+ ## ⏱️ Update Rotation & Granularity

+ To preserve long-term continuity while keeping data collection sustainable:

+ - **Minute-level data** is updated most frequently to minimize the risk of gaps
+ - **Hourly and daily data** follow a relaxed, rotation-based schedule
+ - Non-minute data is maintained to remain **reasonably recent**, without aiming for real-time freshness

+ Update timing and frequency are intentionally flexible and may vary over time as data sources, markets, and operational constraints change.

+ In practical applications, models trained on these datasets are expected to consume **live data from their execution environment**, not from the archive itself.
 
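The README does not say how the rotation is implemented. As one hypothetical reading, symbols could be split into a fixed number of slots, with one slot refreshed per day, so that every symbol is revisited once per cycle. The `rotation_batch` helper, the five-day cycle, and the ticker names below are assumptions made purely for illustration.

```python
from datetime import date


def rotation_batch(symbols, day, cycle_days=5):
    """Pick the symbols whose hourly/daily data would be refreshed on `day`."""
    slot = day.toordinal() % cycle_days
    # Sorting keeps slot assignment stable regardless of input order.
    return [s for i, s in enumerate(sorted(symbols)) if i % cycle_days == slot]


# Example: hypothetical tickers spread over a five-day cycle.
symbols = ["AAA", "BBB", "CCC", "DDD", "EEE", "FFF", "GGG"]
today_batch = rotation_batch(symbols, date(2025, 12, 17))
```

Over any `cycle_days` consecutive days each symbol lands in exactly one batch, which bounds the age of non-minute data by the cycle length while spreading requests to the data source evenly.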
  ---
 
+ ## 🎯 Intended Use

+ The datasets are designed for:

+ - machine learning on financial time series
+ - intraday and swing trading research
+ - feature engineering on accumulated OHLC data
+ - backtesting strategies that benefit from dense intraday history

+ They are **not** designed to provide trading signals, indicators, opinions, or market commentary.

  ---
 
+ ## 🗂️ Primary Dataset Line: TroveLedger

+ The principles outlined above are realized in **TroveLedger**, the primary dataset line maintained by Traders-Lab.

+ TroveLedger is a structured, continuously expanding collection of market indices and exchanges, unified by:
+ - consistent OHLC schemas
+ - multiple time resolutions
+ - long-term intraday accumulation

+ Each market is added deliberately and preserved as part of an expanding historical record.

+ Detailed market coverage, recent additions, and dataset-specific notes are documented in the TroveLedger dataset card.