Mirror of https://github.com/xai-org/grok-1.git, synced 2024-11-26 13:39:52 +03:00
Update README.md to specify the active parameters
This is important because users may otherwise think they need the full 314B parameters active at once, when only 86B are active per token, which is much more manageable!
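For illustration, here is a rough back-of-the-envelope sketch (not part of the repository) of how the 86B active figure relates to the 314B total under 2-of-8 expert routing. It assumes all eight experts are equal-sized and that every non-expert ("shared") parameter is active for each token; the implied ~304B expert / ~10B shared split is just what these assumptions yield from the stated totals, not an official breakdown.

```python
# Hedged sketch: split implied by the README numbers under simple assumptions.
# Assumes 8 equal-sized experts, top-2 routing, and that all non-expert
# (shared) weights are active for every token; not the actual Grok-1 breakdown.

TOTAL_PARAMS = 314e9   # 314B total parameters (from the README)
ACTIVE_PARAMS = 86e9   # 86B active per token (added by this commit)
NUM_EXPERTS = 8        # Mixture of 8 Experts
EXPERTS_PER_TOKEN = 2  # 2 experts used per token

# Solve:  shared + expert_total               = TOTAL_PARAMS
#         shared + expert_total * (2 / 8)     = ACTIVE_PARAMS
expert_total = (TOTAL_PARAMS - ACTIVE_PARAMS) / (1 - EXPERTS_PER_TOKEN / NUM_EXPERTS)
shared = TOTAL_PARAMS - expert_total

active = shared + expert_total * EXPERTS_PER_TOKEN / NUM_EXPERTS
print(f"implied expert params : {expert_total / 1e9:.0f}B")  # ~304B
print(f"implied shared params : {shared / 1e9:.0f}B")        # ~10B
print(f"active per token      : {active / 1e9:.0f}B")        # 86B
```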
parent 310e19eee2
commit 1f4350143d
README.md
@@ -22,7 +22,7 @@ The implementation of the MoE layer in this repository is not efficient. The imp
 Grok-1 is currently designed with the following specifications:
 
-- **Parameters:** 314B
+- **Parameters:** 314B (86B active)
 - **Architecture:** Mixture of 8 Experts (MoE)
 - **Experts Utilization:** 2 experts used per token
 - **Layers:** 64