mirror of https://github.com/xai-org/grok-1.git
synced 2025-09-20 18:29:21 +03:00
Compare commits: download-i ... de0a5ebb1c (9 Commits)
| Author | SHA1 | Date |
|---|---|---|
| | de0a5ebb1c | |
| | 490f83c6e3 | |
| | 8bbf07789e | |
| | 7050ed204b | |
| | d6d9447e2d | |
| | 7207216386 | |
| | 310e19eee2 | |
| | 1ff4435d25 | |
| | b0e77734fe | |
40
.github/workflows/python-package.yml
vendored
Normal file
@@ -0,0 +1,40 @@
+# This workflow will install Python dependencies, run tests and lint with a variety of Python versions
+# For more information see: https://docs.github.com/en/actions/automating-builds-and-tests/building-and-testing-python
+
+name: Python package
+
+on:
+  push:
+    branches: [ "main" ]
+  pull_request:
+    branches: [ "main" ]
+
+jobs:
+  build:
+
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false
+      matrix:
+        python-version: ["3.9", "3.10", "3.11"]
+
+    steps:
+    - uses: actions/checkout@v4
+    - name: Set up Python ${{ matrix.python-version }}
+      uses: actions/setup-python@v3
+      with:
+        python-version: ${{ matrix.python-version }}
+    - name: Install dependencies
+      run: |
+        python -m pip install --upgrade pip
+        python -m pip install flake8 pytest
+        if [ -f requirements.txt ]; then pip install -r requirements.txt; fi
+    - name: Lint with flake8
+      run: |
+        # stop the build if there are Python syntax errors or undefined names
+        flake8 . --count --select=E9,F63,F7,F82 --show-source --statistics
+        # exit-zero treats all errors as warnings. The GitHub editor is 127 chars wide
+        flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
+    - name: Test with pytest
+      run: |
+        pytest
2
.gitignore
vendored
Normal file
@@ -0,0 +1,2 @@
+checkpoints/*
+!checkpoints/README.md
27
README.md
@@ -2,7 +2,8 @@
 
 This repository contains JAX example code for loading and running the Grok-1 open-weights model.
 
-Make sure to download the checkpoint and place `ckpt-0` directory in `checkpoint`.
+Make sure to download the checkpoint and place the `ckpt-0` directory in `checkpoints` - see [Downloading the weights](#downloading-the-weights)
+
 Then, run
 
 ```shell
@@ -17,13 +18,37 @@ The script loads the checkpoint and samples from the model on a test input.
 Due to the large size of the model (314B parameters), a machine with enough GPU memory is required to test the model with the example code.
 The implementation of the MoE layer in this repository is not efficient. The implementation was chosen to avoid the need for custom kernels to validate the correctness of the model.
 
+# Model Specifications
+
+Grok-1 is currently designed with the following specifications:
+
+- **Parameters:** 314B
+- **Architecture:** Mixture of 8 Experts (MoE)
+- **Experts Utilization:** 2 experts used per token
+- **Layers:** 64
+- **Attention Heads:** 48 for queries, 8 for keys/values
+- **Embedding Size:** 6,144
+- **Tokenization:** SentencePiece tokenizer with 131,072 tokens
+- **Additional Features:**
+  - Rotary embeddings (RoPE)
+  - Supports activation sharding and 8-bit quantization
+- **Maximum Sequence Length (context):** 8,192 tokens
+
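Review note: the figures in the specification list added to the README can be cross-checked with a little arithmetic. The sketch below is only an illustration. `GrokSpec` and its field names are hypothetical and are not taken from the repository's code, and the per-head width assumes the embedding is split evenly across the query heads, which the README does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GrokSpec:
    """Hypothetical container for the numbers quoted in the README list."""

    num_experts: int = 8
    experts_per_token: int = 2
    num_layers: int = 64
    num_q_heads: int = 48
    num_kv_heads: int = 8
    emb_size: int = 6144
    vocab_size: int = 131_072
    max_seq_len: int = 8192

    @property
    def q_heads_per_kv_head(self) -> int:
        # Number of query heads that share each key/value head.
        return self.num_q_heads // self.num_kv_heads

    @property
    def head_dim_if_even_split(self) -> int:
        # Assumption: the embedding is divided evenly across query heads.
        return self.emb_size // self.num_q_heads

    @property
    def active_expert_fraction(self) -> float:
        # Share of the experts that process any single token.
        return self.experts_per_token / self.num_experts


spec = GrokSpec()
print(spec.q_heads_per_kv_head)      # 6
print(spec.head_dim_if_even_split)   # 128
print(spec.active_expert_fraction)   # 0.25
```

Under these assumptions, each key/value head serves six query heads, and a quarter of the experts are active for any given token.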
 # Downloading the weights
 
 You can download the weights using a torrent client and this magnet link:
+
 ```
 magnet:?xt=urn:btih:5f96d43576e3d386c9ba65b883210a393b68210e&tr=https%3A%2F%2Facademictorrents.com%2Fannounce.php&tr=udp%3A%2F%2Ftracker.coppersurfer.tk%3A6969&tr=udp%3A%2F%2Ftracker.opentrackr.org%3A1337%2Fannounce
 ```
 
+or directly using [HuggingFace 🤗 Hub](https://huggingface.co/xai-org/grok-1):
+```
+git clone https://github.com/xai-org/grok-1.git && cd grok-1
+pip install huggingface_hub[hf_transfer]
+huggingface-cli download xai-org/grok-1 --repo-type model --include ckpt-0/* --local-dir checkpoints --local-dir-use-symlinks False
+```
+
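Review note: the `huggingface-cli download` command added to the README has a rough Python counterpart in `huggingface_hub.snapshot_download`. The sketch below is an assumption about how one might script the same download, not something the repository ships. The helper names `grok1_download_kwargs` and `download_grok1_checkpoint` are hypothetical, and the `--local-dir-use-symlinks` flag is deliberately left out rather than guessed at.

```python
def grok1_download_kwargs(local_dir: str = "checkpoints") -> dict:
    """Keyword arguments mirroring the huggingface-cli flags in the README."""
    return {
        "repo_id": "xai-org/grok-1",
        "repo_type": "model",
        "allow_patterns": ["ckpt-0/*"],  # same filter as --include ckpt-0/*
        "local_dir": local_dir,          # same target as --local-dir checkpoints
    }


def download_grok1_checkpoint(local_dir: str = "checkpoints") -> str:
    """Fetch the checkpoint files; requires `pip install huggingface_hub`."""
    # Imported lazily so the kwargs helper works without the package installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(**grok1_download_kwargs(local_dir))


if __name__ == "__main__":
    print(grok1_download_kwargs())
    # download_grok1_checkpoint()  # uncomment to start the (very large) download
```

Keeping the arguments in a separate function makes it possible to review what would be fetched before starting the transfer.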
 # License
 
 The code and associated Grok-1 weights in this release are licensed under the
BIN
__pycache__/model.cpython-312.pyc
Normal file
Binary file not shown.
1
grok-1
Submodule
Submodule grok-1 added at 7050ed204b
@@ -1,4 +1,4 @@
 dm_haiku==0.0.12
-jax[cuda12_pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
+jax[cuda12-pip]==0.4.25 -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html
 numpy==1.26.4
 sentencepiece==0.2.0
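Review note: the only change in this hunk is the spelling of the extra, `cuda12_pip` to `cuda12-pip`. PEP 685 specifies that extra names are compared after applying the PEP 503 normalization rule, which the sketch below reproduces with the standard library. The function name `normalize_extra` is hypothetical, and whether a particular installer version or JAX release accepts both spellings is not something this diff states.

```python
import re


def normalize_extra(name: str) -> str:
    """Apply the PEP 503 normalization rule that PEP 685 adopts for extras."""
    # Runs of '-', '_' and '.' collapse to a single '-', then lowercase.
    return re.sub(r"[-_.]+", "-", name).lower()


old, new = "cuda12_pip", "cuda12-pip"
print(normalize_extra(old))                          # cuda12-pip
print(normalize_extra(old) == normalize_extra(new))  # True
```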