Commit History

Replace snapshot_download with level-by-level listing for agent files.
e419e83

alexgshaw Claude Opus 4.6 (1M context) committed on

Add reward hacking validation check.
32096df

alexgshaw Claude Opus 4.6 (1M context) committed on

Fix missing await on get_public_url causing all trial inserts to fail
19076ca

alexgshaw Claude Opus 4.6 (1M context) committed on

Update the importer to be asyncio compatible.
7028cf1

alexgshaw committed on

Log git stderr on clone/fetch failures for easier debugging
a659e57

alexgshaw Claude Opus 4.6 (1M context) committed on

Update DATASET_REPO to harborframework org
83a50bb

alexgshaw Claude Opus 4.6 committed on

Return webhook response immediately, process in background thread
1aba200

alexgshaw Claude Opus 4.6 committed on

Add concurrent uploads and also validation.
81ba47f

alexgshaw committed on

Upload trial archives and trajectories to Supabase storage on import
b35fe2e

alexgshaw Claude Opus 4.6 committed on

Fix the dockerfile.
c059616

alexgshaw committed on

Update the importer
41ac48b

alexgshaw committed on

Include accuracy in PR validation comments
ca6d4f0

alexgshaw Claude Opus 4.5 committed on

Only validate submissions that changed in PR
e3a318e

alexgshaw Claude Opus 4.5 committed on

Fix trial job_id foreign key constraint and add validation
55c3ce8

alexgshaw Claude Opus 4.5 committed on

Use raw dict for trial results to support custom environment types
de835c3

alexgshaw Claude Opus 4.5 committed on

Use raw dict for job config to support custom environment types
6984e7a

alexgshaw Claude Opus 4.5 committed on

Add comprehensive logging throughout import process
e7e8bf7

alexgshaw Claude Opus 4.5 committed on

Use insert with ignore_duplicates for models
b00c92d

alexgshaw Claude Opus 4.5 committed on

Fix config.json parsing and add model token cost lookup
8d8264a

alexgshaw Claude Opus 4.5 committed on

Detect PR from updatedRefs when discussion payload is missing
a83decb

alexgshaw Claude Opus 4.5 committed on

Remove comment deduplication code
9f62379

alexgshaw Claude Opus 4.5 committed on

Add comment deduplication to prevent duplicate validation comments
0caa873

alexgshaw Claude Opus 4.5 committed on

Improve trial validation and remove redundant agent_name from metadata
753ba50

alexgshaw Claude Opus 4.5 committed on

Validate trial result.json has required fields instead of strict schema validation
05ff348

alexgshaw committed on

Fix validation: use lenient JSON check for trials, limit error count in comments
aec5bc2

alexgshaw committed on

Use lenient validation to allow custom environment types
3293a2e

alexgshaw committed on

Use git clone instead of snapshot_download for much faster downloads
2c1a94c

alexgshaw committed on

Fix PR detection: check isPullRequest field and add logging
d42c3a8

alexgshaw committed on

Fix PR download: use refs/pr/N instead of head.sha
5594e86

alexgshaw committed on

Fix repo name comparison - name already includes namespace
f9ea234

alexgshaw committed on

Fix webhook: use WebhooksServer with custom UI instead of conflicting decorator
5bbb060

alexgshaw committed on

Fix Docker deployment: bind to 0.0.0.0:7860
11b7a05

alexgshaw committed on

Fix Pydantic deprecation warning: use ConfigDict instead of class Config
678df56

alexgshaw committed on

Switch to Docker SDK to fix multipart package conflict
cb1ca0a

alexgshaw committed on

Add python-multipart to fix gradio import error
4d47374

alexgshaw committed on

Specify Python 3.12 for harbor package compatibility
8aa6781

alexgshaw committed on

Use job result.json for started_at, ended_at, stats; add package_version
ca79348

alexgshaw committed on

Use harbor package for JobConfig and TrialResult models
7b36a69

alexgshaw committed on

Initial importer Space with webhook handler
b56d969

alexgshaw committed on

initial commit
cccd3fc
verified

alexgshaw committed on