From 1f4350143d55f55cb70ff7912a962ae59e03d9ab Mon Sep 17 00:00:00 2001
From: Andrew Kean Gao
Date: Mon, 18 Mar 2024 10:41:38 -0700
Subject: [PATCH] Update README.md to specify the active parameters

This is important because users may think they need to have the full
314B at one time, but only 86B are active per token, which is much more
manageable!
---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index f4d9b61..d5a174b 100644
--- a/README.md
+++ b/README.md
@@ -22,7 +22,7 @@ The implementation of the MoE layer in this repository is not efficient. The imp
 
 Grok-1 is currently designed with the following specifications:
 
-- **Parameters:** 314B
+- **Parameters:** 314B (86B active)
 - **Architecture:** Mixture of 8 Experts (MoE)
 - **Experts Utilization:** 2 experts used per token
 - **Layers:** 64
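
A quick sanity check of the arithmetic behind the "86B active" figure: with 2 of 8 experts routed per token, only a quarter of the expert parameters participate in any single forward pass, while the shared (non-expert) parameters always do. The Python sketch below back-solves the expert/shared split implied by the two figures in this patch; the split is an illustrative estimate derived only from those numbers, not a published breakdown. Note also that "active" refers to per-token compute: all 314B weights generally still have to be loaded for inference.

    # Back-of-envelope check of the 314B-total / 86B-active figures,
    # assuming the gap comes entirely from 2-of-8 expert routing.
    # The implied expert/shared split below is an estimate, not an
    # official parameter breakdown.
    TOTAL_B = 314          # total parameters, from the README
    ACTIVE_B = 86          # active parameters per token, from this patch
    NUM_EXPERTS = 8        # experts in each MoE layer
    EXPERTS_PER_TOKEN = 2  # experts routed to each token

    # Model: total  = shared + experts
    #        active = shared + experts * (EXPERTS_PER_TOKEN / NUM_EXPERTS)
    frac = EXPERTS_PER_TOKEN / NUM_EXPERTS           # 0.25 of experts run
    experts_b = (TOTAL_B - ACTIVE_B) / (1 - frac)    # implied expert params
    shared_b = TOTAL_B - experts_b                   # implied shared params

    print(f"implied expert params: {experts_b:.0f}B")
    print(f"implied shared params: {shared_b:.0f}B")
    print(f"active per token:      {shared_b + experts_b * frac:.0f}B")

Running this prints an implied ~304B of expert parameters and ~10B of shared parameters, which reproduces the 86B active figure (10B shared plus one quarter of 304B).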