Compare commits

3 Commits

Author SHA1 Message Date
Andrew Kean Gao
1094ba77eb
Merge 1f4350143d into 7050ed204b
2024-03-19 08:53:59 -07:00
Eddy
7050ed204b
Corrected name of package "cuda12-pip" (#194)
The `cuda12-pip` package was wrongly named `cuda12_pip`
in requirements.txt
2024-03-19 08:48:22 -07:00
Andrew Kean Gao
1f4350143d
Update README.md to specify the active parameters
This is important because users may think they need the full 314B parameters at one time, but only 86B are active per token, which is much more manageable!
2024-03-18 10:41:38 -07:00
2 changed files with 2 additions and 2 deletions

@@ -22,7 +22,7 @@ The implementation of the MoE layer in this repository is not efficient. The imp
 Grok-1 is currently designed with the following specifications:
-- **Parameters:** 314B
+- **Parameters:** 314B (86B active)
 - **Architecture:** Mixture of 8 Experts (MoE)
 - **Experts Utilization:** 2 experts used per token
 - **Layers:** 64
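
Reviewer note: the 314B total and 86B active figures in this hunk can be related by simple arithmetic. The sketch below is only an illustration under two assumptions that the README does not state: every parameter is either shared by all tokens or belongs to exactly one expert, and all 8 experts are the same size. The function name `split_moe_params` is hypothetical, and the result is not a published breakdown of Grok-1.

```python
def split_moe_params(total_b, active_b, num_experts, experts_per_token):
    """Split a parameter count (in billions) into shared and per-expert parts.

    Assumed model (not stated in the README):
        total  = shared + num_experts       * per_expert
        active = shared + experts_per_token * per_expert
    """
    # Subtracting the two equations eliminates the shared term.
    per_expert = (total_b - active_b) / (num_experts - experts_per_token)
    shared = total_b - num_experts * per_expert
    return shared, per_expert


# Figures quoted in the README hunk: 314B total, 86B active, 8 experts, 2 per token.
shared_b, per_expert_b = split_moe_params(314, 86, 8, 2)
print(f"shared: {shared_b}B, per expert: {per_expert_b}B")  # shared: 10.0B, per expert: 38.0B
```

Under these assumptions, each token touches the shared part plus 2 of the 8 experts, which is how 86B can be active out of 314B.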

@@ -1,4 +1,4 @@
 dm_haiku==0.0.12
-jax[cuda12_pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+jax[cuda12-pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 numpy==1.26.4
 sentencepiece==0.2.0
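
Reviewer note: the two spellings in this hunk differ only in `_` versus `-`. The sketch below shows the name normalization rule from PEP 503, which PEP 685 extends to extras names, to make the difference concrete. Whether a particular installer version applies this rule to extras is an assumption to verify locally; tools that do not apply it may compare the names literally. The helper name `normalize_name` is hypothetical.

```python
import re


def normalize_name(name: str) -> str:
    """Apply PEP 503 normalization: collapse runs of '-', '_' and '.' into a
    single '-' and lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


old_extra = "cuda12_pip"  # spelling before this change
new_extra = "cuda12-pip"  # spelling after this change

print(normalize_name(old_extra))  # cuda12-pip
print(normalize_name(old_extra) == normalize_name(new_extra))  # True
```

Under this rule both spellings refer to the same extra, so the change matters mainly for tools that match the name as written.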