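Running microsoft/BitNet on Windows

Introduction

BitNet support appears to have been added in transformers v4.46.0, so I am trying out the original BitNet implementation, which looks likely to pick up momentum from here.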

github.com

Microsoft's BitNet repository is here:

github.com

Development environment

Environment setup

I follow the steps in the README:

git clone --recursive https://github.com/microsoft/BitNet.git
cd BitNet
conda create -n bitnet-cpp python=3.9
conda activate bitnet-cpp

pip install -r requirements.txt

python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s
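
Once setup finishes, it can be handy to confirm that the quantized model ended up where the inference command expects it. The sketch below assumes, based only on the commands in this post, that the file is written to models/<repo name>/ggml-model-<quant>.gguf. The helper name expected_model_path is hypothetical and not part of BitNet.

```python
from pathlib import Path


def expected_model_path(hf_repo: str, quant: str, models_dir: str = "models") -> Path:
    """Guess where setup_env.py leaves the GGUF file.

    Assumption: the layout is models/<repo name>/ggml-model-<quant>.gguf,
    inferred from the commands in this post rather than from BitNet's docs.
    """
    repo_name = hf_repo.split("/")[-1]  # drop the "HF1BitLLM/" owner prefix
    return Path(models_dir) / repo_name / f"ggml-model-{quant}.gguf"


if __name__ == "__main__":
    path = expected_model_path("HF1BitLLM/Llama3-8B-1.58-100B-tokens", "i2_s")
    # Run this from the BitNet checkout so the relative path resolves.
    print(path, "exists" if path.is_file() else "is missing")
```

If the file is reported missing, the layout assumption may simply not hold for your version, so check the models directory by hand before rerunning setup.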

Running

The following command runs Llama3-8B-1.58-100B-tokens:

python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\nAnswer:" -n 6 -temp 0

Here is what I got with my Madoka Magica prompt.

Prompt

python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Who is the cutest character in Madoka Magica?\nAnswer:" -n 100 -temp 0
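
For trying several prompts in a row, the command above could be wrapped in a small Python helper. This is only a sketch: it uses just the flags that appear in this post (-m, -p, -n, -temp), and the function names build_inference_command and run_inference are hypothetical, not something BitNet provides.

```python
import subprocess
import sys


def build_inference_command(model: str, prompt: str, n_predict: int,
                            temperature: float = 0.0) -> list:
    """Build the argument list for run_inference.py (flags as used in this post)."""
    return [
        sys.executable, "run_inference.py",
        "-m", model,
        "-p", prompt,
        "-n", str(n_predict),
        "-temp", str(temperature),
    ]


def run_inference(model: str, prompt: str, n_predict: int,
                  temperature: float = 0.0) -> str:
    """Run the script from the BitNet checkout and return whatever it prints to stdout."""
    cmd = build_inference_command(model, prompt, n_predict, temperature)
    # Passing a list avoids shell quoting problems with the prompt text.
    result = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Raw string so the backslash-n reaches the script exactly as typed in the shell command.
    prompt = r"Who is the cutest character in Madoka Magica?\nAnswer:"
    model = "models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf"
    print(build_inference_command(model, prompt, 100))
```

Calling run_inference(model, prompt, 100) from inside the BitNet directory should be equivalent to the command line above, assuming the script writes the generated text to stdout.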

Result

Who is the cutest character in Madoka Magica?
Answer: Madoka Magica is a manga series written by Yuki Kadowa and illustrated by Yuki Kadowa. The series follows the story of a young girl named Madoka Magica, who is a witch and the daughter of the Madoka family. Madoka Magica is a cute and innocent character, and her character development is one of the series’ most notable aspects.
What is the name of the main character in Madoka Magica?
Answer: Madoka Magica is the main

The execution speed is shown in the image below.