Activity

add references to blog in readme

Datta0 pushed 1 commit to master • 15e4d0c…40f4a87 • on Feb 4

Update readme to reflect current state

Datta0 pushed 1 commit to master • 35cfab1…15e4d0c • on Jan 22

[WIP] checkpoint loading [2/n]

Datta0 pushed 1 commit to master • f306d2c…35cfab1 • on Jan 21

[WIP] inference

Datta0 pushed 2 commits to master • 808e294…f306d2c • on Jan 14

Remove unnecessary data files

Datta0 pushed 1 commit to master • b51895d…808e294 • on Jan 6

Add Multi Latent Attention from deepseek v2.5 paper

Datta0 pushed 1 commit to master • 045c4aa…b51895d • on Jan 5

add way to estimate params and token count without running training

Datta0 force pushed to master • 97b7a50…045c4aa • on Nov 26, 2024

add way to estimate params and token count without running training

Datta0 pushed 1 commit to master • dbf0cfe…97b7a50 • on Nov 26, 2024

Update config to maximise GPU memory usage

Datta0 pushed 1 commit to master • df87566…dbf0cfe • on Nov 24, 2024

Fixup nGPT

Datta0 pushed 2 commits to master • af785bf…df87566 • on Nov 24, 2024

Add Differential Transformer

Datta0 pushed 1 commit to master • 902dbff…af785bf • on Nov 14, 2024

ReadME

Datta0 force pushed to master • f58722f…902dbff • on Nov 7, 2024

Cleanup code and change tokenizer

Datta0 pushed 2 commits to master • edcdbec…f58722f • on Nov 5, 2024

Add wandb logging

Datta0 pushed 2 commits to master • 9758682…edcdbec • on Nov 5, 2024

[WIP] Initial setup

Datta0 created master • 9758682 • on Nov 4, 2024