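Introduction
BitNet support appears to have been added in transformers v4.46.0, so I'm going to try out the official BitNet implementation, which looks likely to pick up momentum from here.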
Microsoft's BitNet repository is here: https://github.com/microsoft/BitNet
Development environment
Environment setup
I follow the steps in the README as written.
git clone --recursive https://github.com/microsoft/BitNet.git
cd BitNet
conda create -n bitnet-cpp python=3.9
conda activate bitnet-cpp
pip install -r requirements.txt
python setup_env.py --hf-repo HF1BitLLM/Llama3-8B-1.58-100B-tokens -q i2_s
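After the setup script finishes, it can be worth checking that the converted model actually landed where the later commands expect it. The following is a minimal sketch, not part of the BitNet repository: `looks_like_gguf` is a hypothetical helper name, the path is the one used by the inference commands in this post, and the check relies only on the fact that GGUF files start with the 4-byte magic `GGUF`.

```python
from pathlib import Path

# GGUF files begin with these four bytes.
GGUF_MAGIC = b"GGUF"


def looks_like_gguf(path) -> bool:
    """Return True if path is an existing file that starts with the GGUF magic."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as f:
        return f.read(4) == GGUF_MAGIC


if __name__ == "__main__":
    # Path taken from the inference commands in this post (run from the BitNet directory).
    model = Path("models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf")
    status = "looks like a GGUF file" if looks_like_gguf(model) else "missing or not a GGUF file"
    print(f"{model}: {status}")
```

This only confirms the file header, not that the quantization succeeded, so treat it as a quick sanity check.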
Running the model
The following command runs Llama3-8B-1.58-100B-tokens:
python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Daniel went back to the the the garden. Mary travelled to the kitchen. Sandra journeyed to the kitchen. Sandra went to the hallway. John went to the bedroom. Mary went back to the garden. Where is Mary?\nAnswer:" -n 6 -temp 0
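If you want to try several prompts, the same invocation can be scripted from Python. This is a rough sketch under a few assumptions: it is run from the BitNet directory, it uses only the flags shown in the command above (`-m`, `-p`, `-n`, `-temp`), and the function names `build_inference_command` and `run_inference` are my own, not something the repository provides. I also assume the generated text can be captured from stdout.

```python
import subprocess
import sys
from typing import List


def build_inference_command(model_path: str, prompt: str,
                            n_predict: int, temperature: float) -> List[str]:
    """Build the argv list for run_inference.py using only the flags shown above."""
    return [
        sys.executable, "run_inference.py",
        "-m", model_path,
        "-p", prompt,
        "-n", str(n_predict),
        "-temp", str(temperature),
    ]


def run_inference(model_path: str, prompt: str,
                  n_predict: int, temperature: float) -> str:
    """Run the script and return whatever it wrote to stdout."""
    cmd = build_inference_command(model_path, prompt, n_predict, temperature)
    completed = subprocess.run(cmd, check=True, capture_output=True, text=True)
    return completed.stdout


if __name__ == "__main__":
    # In bash, "\n" inside double quotes stays a literal backslash + n,
    # so a raw string is used here to pass the same characters.
    text = run_inference(
        "models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf",
        r"Where is Mary?\nAnswer:",
        6,
        0,
    )
    print(text)
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain quotes or question marks.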
Here is the result for my Madoka Magica prompt.
Prompt
python run_inference.py -m models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf -p "Who is the cutest character in Madoka Magica?\nAnswer:" -n 100 -temp 0
Result
Who is the cutest character in Madoka Magica? Answer: Madoka Magica is a manga series written by Yuki Kadowa and illustrated by Yuki Kadowa. The series follows the story of a young girl named Madoka Magica, who is a witch and the daughter of the Madoka family. Madoka Magica is a cute and innocent character, and her character development is one of the series’ most notable aspects. What is the name of the main character in Madoka Magica? Answer: Madoka Magica is the main
The execution speed is shown in the image below.
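To get a rough speed figure on your own machine, one option is to time the whole command and divide the requested token count by the elapsed time. This is only a sketch with hypothetical helper names (`time_command`, `tokens_per_second`), and it is not how the numbers in the image were produced. Wall-clock time includes model loading, and the model may stop before generating all requested tokens, so the result is a lower-bound estimate rather than a precise throughput.

```python
import subprocess
import sys
import time
from typing import List


def time_command(cmd: List[str]) -> float:
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True)
    return time.perf_counter() - start


def tokens_per_second(n_tokens: int, elapsed_seconds: float) -> float:
    """Convert a token count and an elapsed time into tokens per second."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return n_tokens / elapsed_seconds


if __name__ == "__main__":
    n_tokens = 100  # matches -n 100 in the Madoka Magica command
    cmd = [
        sys.executable, "run_inference.py",
        "-m", "models/Llama3-8B-1.58-100B-tokens/ggml-model-i2_s.gguf",
        "-p", r"Who is the cutest character in Madoka Magica?\nAnswer:",
        "-n", str(n_tokens),
        "-temp", "0",
    ]
    elapsed = time_command(cmd)
    print(f"{elapsed:.1f} s total, about {tokens_per_second(n_tokens, elapsed):.2f} tokens/s")
```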