Compare commits

3 Commits

Author SHA1 Message Date
1094ba77eb Merge 1f4350143d into 7050ed204b 2024-03-19 08:53:59 -07:00
7050ed204b Corrected name of package "cuda12-pip" (#194)
The `cuda12-pip` package was wrongly named `cuda12_pip`
in requirements.txt
2024-03-19 08:48:22 -07:00
1f4350143d Update README.md to specify the active parameters
This is important because users may think all 314B parameters are in use at one time, but only 86B are active per token, which is much more manageable!
2024-03-18 10:41:38 -07:00
2 changed files with 2 additions and 2 deletions

README.md

@@ -22,7 +22,7 @@ The implementation of the MoE layer in this repository is not efficient. The imp
 Grok-1 is currently designed with the following specifications:
-- **Parameters:** 314B
+- **Parameters:** 314B (86B active)
 - **Architecture:** Mixture of 8 Experts (MoE)
 - **Experts Utilization:** 2 experts used per token
 - **Layers:** 64

requirements.txt

@@ -1,4 +1,4 @@
 dm_haiku==0.0.12
-jax[cuda12_pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+jax[cuda12-pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 numpy==1.26.4
 sentencepiece==0.2.0
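
The README change gives 314B total and 86B active parameters, with 2 of 8 experts used per token. As a rough illustration of how those figures could relate, the sketch below assumes a simplified model in which every parameter is either shared (always active) or belongs to one of several equally sized experts. That split is an assumption made here for illustration and is not stated in the repository; the function name `split_moe_params` is likewise hypothetical.

```python
def split_moe_params(total_b, active_b, num_experts, experts_per_token):
    """Estimate shared and per-expert parameter counts (in billions).

    Assumed simplified model (not taken from the repository):
        total  = shared + num_experts       * per_expert
        active = shared + experts_per_token * per_expert
    """
    if not 0 < experts_per_token < num_experts:
        raise ValueError("experts_per_token must be between 1 and num_experts - 1")
    # Subtracting the two equations eliminates the shared term.
    per_expert = (total_b - active_b) / (num_experts - experts_per_token)
    shared = active_b - experts_per_token * per_expert
    return shared, per_expert


# Figures from the README hunk: 314B total, 86B active, 2 of 8 experts per token.
shared_b, per_expert_b = split_moe_params(314, 86, 8, 2)
print(f"shared ~ {shared_b}B, per expert ~ {per_expert_b}B")
```

Under these assumptions the figures work out to roughly 10B shared parameters and 38B per expert. These are derived estimates for illustration only, not numbers published in the README.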