mirror of
https://github.com/xai-org/grok-1.git
synced 2024-11-26 21:49:53 +03:00
Compare commits
3 Commits
1334687cf6...09f6d970f7
| Author | SHA1 | Date | |
|---|---|---|---|
| | 09f6d970f7 | | |
| | 7050ed204b | | |
| | 8f014f6822 | | |
@@ -15,7 +15,7 @@ to test the code.
The script loads the checkpoint and samples from the model on a test input.
-Due to the large size of the model (314B parameters), a machine with enough GPU memory is required to test the model with the example code.
+Due to the large size of the model (314B parameters), a machine with enough GPU memory (314GB+ vRAM) is required to test the model with the example code.
The implementation of the MoE layer in this repository is not efficient. The implementation was chosen to avoid the need for custom kernels to validate the correctness of the model.
# Model Specifications
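The 314GB+ figure added in the changed line above is consistent with storing each of the 314B parameters in a single byte. That reading is an inference; the diff does not say how the number was derived. The sketch below only does the arithmetic for a few assumed storage formats and ignores activations, caches, and other runtime overhead.

```python
# Back-of-the-envelope weight memory for a 314B-parameter model.
PARAMS = 314e9  # parameter count quoted in the changed line

# Bytes per parameter for a few common storage formats.
# These formats are assumptions for illustration, not taken from the diff.
BYTES_PER_PARAM = {"int8": 1, "bf16": 2, "fp32": 4}


def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Return weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9


for name, nbytes in BYTES_PER_PARAM.items():
    print(f"{name}: {weight_memory_gb(PARAMS, nbytes):.0f} GB")
# int8: 314 GB
# bf16: 628 GB
# fp32: 1256 GB
```

Under these assumptions, only the one-byte case lands on 314 GB, which is why the "+" in "314GB+" matters: any wider format or any runtime overhead pushes the requirement higher.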
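The context line above says the MoE layer is inefficient by design so that no custom kernels are needed. One common way to get that trade-off is to run every expert on every token and let the router weights zero out the unused ones. Whether the repository does exactly this is an assumption; the function names, the toy experts, and `top_k=2` below are all hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dense_moe(token, router_logits, experts, top_k=2):
    """Run all experts on the token, then combine only the top_k by gate weight."""
    probs = softmax(router_logits)
    # Indices of the top_k experts by router probability.
    chosen = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:top_k]
    norm = sum(probs[i] for i in chosen)
    # Renormalized gates; experts outside the top_k get a gate of zero.
    gates = [probs[i] / norm if i in chosen else 0.0 for i in range(len(probs))]
    # The inefficient part: every expert is evaluated, even those with a zero gate.
    outputs = [expert(token) for expert in experts]
    return [sum(g * out[d] for g, out in zip(gates, outputs)) for d in range(len(token))]


# Toy experts that just scale their input (purely illustrative).
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]
result = dense_moe([1.0, 2.0], [0.0, 0.0, 5.0, 5.0], experts)
print(result)  # [3.5, 7.0]
```

The appeal of this style is that it uses only ordinary dense operations, so correctness is easy to check; the cost is that compute scales with the total number of experts rather than with `top_k`.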
@@ -1,4 +1,4 @@
dm_haiku==0.0.12
-jax[cuda12_pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+jax[cuda12-pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
numpy==1.26.4
sentencepiece==0.2.0
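The only change in this hunk is the spelling of the extra, `cuda12_pip` versus `cuda12-pip`. Under the name normalization that PEP 685 applies to extras (the same rule PEP 503 defines for project names), the two spellings are equivalent. The diff does not state the motivation for the change, so the sketch below only shows the normalization rule itself; `normalize_extra` is a hypothetical helper name.

```python
import re


def normalize_extra(name: str) -> str:
    """Collapse runs of '-', '_' and '.' into a single '-' and lowercase the result."""
    return re.sub(r"[-_.]+", "-", name).lower()


old, new = "cuda12_pip", "cuda12-pip"
print(normalize_extra(old))                          # cuda12-pip
print(normalize_extra(old) == normalize_extra(new))  # True
```

Installers that do not apply this normalization may treat the two spellings differently, which is one plausible reason to prefer the hyphenated form.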