StepLaw/StepLaw-N_1.0B-D_1.0B-LR1.381e-03-BS262144
Text Generation • 1B
A model with 1.0B parameters, trained on 1.0B tokens. Architecture: H=2048 (hidden size), FFN=8192, Heads=16, Layers=16.
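The repository name appears to encode the training setup (N for parameters, D for tokens, LR for learning rate, BS for batch size). The following is a minimal sketch of how that name could be decoded and how the stated architecture relates to the 1.0B figure. The helper names `parse_steplaw_name` and `approx_non_embedding_params` are hypothetical, the reading of the name fields is an assumption, and the parameter estimate assumes a standard transformer block with four H×H attention projections and no biases or norm weights; the number of FFN matrices (2 for a plain MLP, 3 for a gated one) is not stated in the card and is left as an argument.

```python
import re

# Assumed unit suffixes used in the repository name.
_UNITS = {"M": 1e6, "B": 1e9, "T": 1e12}

_NAME_RE = re.compile(
    r"N_(?P<n>[\d.]+)(?P<n_unit>[MBT])"
    r"-D_(?P<d>[\d.]+)(?P<d_unit>[MBT])"
    r"-LR(?P<lr>[\d.eE+-]+)"
    r"-BS(?P<bs>\d+)"
)


def parse_steplaw_name(name: str) -> dict:
    """Decode N, D, LR and BS from a StepLaw-style repository name."""
    m = _NAME_RE.search(name)
    if m is None:
        raise ValueError(f"unrecognised name: {name!r}")
    return {
        "params": float(m["n"]) * _UNITS[m["n_unit"]],
        "tokens": float(m["d"]) * _UNITS[m["d_unit"]],
        "learning_rate": float(m["lr"]),
        "batch_size": int(m["bs"]),
    }


def approx_non_embedding_params(hidden: int, ffn: int, layers: int,
                                ffn_matrices: int = 3) -> int:
    """Rough per-layer count: 4 attention projections plus the FFN matrices.

    Ignores embeddings, biases and normalisation weights (assumption).
    """
    attention = 4 * hidden * hidden
    feed_forward = ffn_matrices * hidden * ffn
    return layers * (attention + feed_forward)


if __name__ == "__main__":
    cfg = parse_steplaw_name(
        "StepLaw/StepLaw-N_1.0B-D_1.0B-LR1.381e-03-BS262144"
    )
    print(cfg)
    # Head dimension implied by H=2048 and Heads=16.
    print("head_dim:", 2048 // 16)
    for k in (2, 3):
        total = approx_non_embedding_params(2048, 8192, 16, ffn_matrices=k)
        print(f"ffn_matrices={k}: {total:,} non-embedding parameters")
```

Under these assumptions, a three-matrix (gated) FFN gives a non-embedding count close to the stated 1.0B, while a two-matrix FFN gives roughly 0.8B; the card itself does not say which variant is used.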