diff --git a/README.md b/README.md
index f501a07..73613fe 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ The implementation of the MoE layer in this repository is not efficient. The imp
 
 Grok-1 is currently designed with the following specifications:
 
-- **Parameters:** 314B
+- **Parameters:** 314B (86B active)
 - **Architecture:** Mixture of 8 Experts (MoE)
 - **Experts Utilization:** 2 experts used per token
 - **Layers:** 64