add references to blog in readme
Datta0 pushed 1 commit to master • 15e4d0c…40f4a87 • on Feb 4
Update readme to reflect current state
Datta0 pushed 1 commit to master • 35cfab1…15e4d0c • on Jan 22
[WIP] checkpoint loading [2/n]
Datta0 pushed 1 commit to master • f306d2c…35cfab1 • on Jan 21
Datta0 pushed 2 commits to master • 808e294…f306d2c • on Jan 14
Remove unnecessary data files
Datta0 pushed 1 commit to master • b51895d…808e294 • on Jan 6
Add Multi Latent Attention from deepseek v2.5 paper
Datta0 pushed 1 commit to master • 045c4aa…b51895d • on Jan 5
add way to estimate params and token count without running training
Datta0 force pushed to master • 97b7a50…045c4aa • on Nov 26, 2024
add way to estimate params and token count without running training
Datta0 pushed 1 commit to master • dbf0cfe…97b7a50 • on Nov 26, 2024
Update config to maximise GPU memory usage
Datta0 pushed 1 commit to master • df87566…dbf0cfe • on Nov 24, 2024
Datta0 pushed 2 commits to master • af785bf…df87566 • on Nov 24, 2024
Add Differential Transformer
Datta0 pushed 1 commit to master • 902dbff…af785bf • on Nov 14, 2024
Datta0 force pushed to master • f58722f…902dbff • on Nov 7, 2024
Cleanup code and change tokenizer
Datta0 pushed 2 commits to master • edcdbec…f58722f • on Nov 5, 2024
Datta0 pushed 2 commits to master • 9758682…edcdbec • on Nov 5, 2024