First Impressions of the ChatGLM-6B Model

2023-05-22

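Background

ChatGLM-6B is an open-source conversational language model that supports both Chinese and English. It is based on the General Language Model (GLM) architecture and has 6.2 billion parameters. Combined with model quantization, it can be deployed locally on consumer-grade graphics cards (at the INT4 quantization level, as little as 6 GB of GPU memory is required). ChatGLM-6B uses techniques similar to those behind ChatGPT and is optimized for Chinese question answering and dialogue. Trained on about 1T tokens of Chinese and English text, and further refined with supervised fine-tuning, feedback bootstrapping, and reinforcement learning from human feedback, the 6.2-billion-parameter model can already generate answers that align well with human preferences.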
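To put the memory figures in perspective, here is a rough sketch that estimates the size of the weights alone at each precision. It assumes every parameter is stored at the stated bit width, which is a simplification. The function name `weight_size_gb` is made up for this illustration.

```python
# Back-of-the-envelope estimate of the memory taken by the model weights alone.
PARAMS = 6.2e9  # 6.2 billion parameters, as stated in the text

# Bits per parameter for each precision level.
BITS_PER_PARAM = {"FP16": 16, "INT8": 8, "INT4": 4}


def weight_size_gb(params: float, bits: int) -> float:
    """Return the size of the weights in GB (10**9 bytes)."""
    return params * bits / 8 / 1e9


for name, bits in BITS_PER_PARAM.items():
    print(f"{name}: {weight_size_gb(PARAMS, bits):.1f} GB")
```

Under this assumption the INT4 weights come to roughly 3.1 GB. The gap up to the quoted 6 GB minimum is presumably runtime overhead such as activations and the CUDA context, though the source does not break it down.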

First impressions

Local setup

Hardware

RAM: 16 GB
GPU: 3080
VRAM: 12 GB

Software

nvidia-smi

nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 526.47       Driver Version: 526.47       CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name            TCC/WDDM | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce ... WDDM  | 00000000:01:00.0  On |                  N/A |
|  0%   39C    P8    14W / 170W |    381MiB / 12288MiB |      3%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

CUDA

>nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

If nvcc is not installed, install CUDA Toolkit 11.8 from the "CUDA Toolkit 11.8 Downloads" page on NVIDIA Developer.

miniconda

https://repo.anaconda.com/miniconda/Miniconda3-latest-Windows-x86_64.exe

python

conda create -n chatglm python=3.8.5
conda activate chatglm
pip3 install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118

// Alternatively, download the wheel directly: https://download.pytorch.org/whl/cu118/torch-2.0.1%2Bcu118-cp38-cp38-win_amd64.whl
// then run pip3 install torch torchvision torchaudio --index-url https://mirrors.aliyun.com/pypi/simple/

git

Git for Windows

Verify the versions

import torch
print(torch.__version__)
print(torch.version.cuda)

Result

>>> import torch
>>> print(torch.__version__)
2.0.1+cu118
>>> print(torch.version.cuda)
11.8
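The `+cu118` suffix is the part worth checking: a wheel installed without the CUDA index URL may be a CPU-only build. The following is a minimal sketch of that check. The helper names `cuda_tag` and `is_cuda_build` are hypothetical, and it assumes the build tag follows the `+cuXXX` / `+cpu` pattern seen in the output above.

```python
from typing import Optional


def cuda_tag(torch_version: str) -> Optional[str]:
    """Return the local build tag of a PyTorch version string, e.g. 'cu118'."""
    _, sep, tag = torch_version.partition("+")
    return tag if sep else None


def is_cuda_build(torch_version: str) -> bool:
    """True when the build tag marks a CUDA wheel (tag starts with 'cu')."""
    tag = cuda_tag(torch_version)
    return tag is not None and tag.startswith("cu")


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        print("CUDA build:", is_cuda_build(torch.__version__))
        print("GPU visible:", torch.cuda.is_available())
```

If `torch.cuda.is_available()` prints `False` even though the tag is a CUDA one, the driver is the more likely culprit than the wheel.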

Installing ChatGLM-6B

Get the scripts

git clone https://github.com/THUDM/ChatGLM-6B
cd ChatGLM-6B
pip3 install -r requirements.txt --index-url https://mirrors.aliyun.com/pypi/simple/

Download the model

>>> from huggingface_hub import snapshot_download  
>>> snapshot_download(repo_id="THUDM/chatglm-6b")

The model is saved to `C:\Users\<your username>\.cache\huggingface\hub\models--THUDM--chatglm-6b\snapshots\xxxx`. Once you have confirmed the path, you can overwrite the files there with the model files you downloaded.
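To confirm the path without browsing for it, the cache layout can be reconstructed in a few lines. This is a sketch under two assumptions: the cache sits in its default location (not moved by an environment variable), and the folder name is derived from the repo id as in the path shown above. `repo_folder_name` and `list_snapshots` are names invented for this example.

```python
from pathlib import Path


def repo_folder_name(repo_id: str) -> str:
    """Map a repo id such as 'THUDM/chatglm-6b' to its cache folder name."""
    return "models--" + repo_id.replace("/", "--")


def list_snapshots(repo_id: str, hub_dir: Path) -> list:
    """Return the snapshot directories found for repo_id under hub_dir."""
    snapshots = hub_dir / repo_folder_name(repo_id) / "snapshots"
    if not snapshots.is_dir():
        return []
    return sorted(p for p in snapshots.iterdir() if p.is_dir())


if __name__ == "__main__":
    # Default cache location; assumed not overridden by environment variables.
    hub = Path.home() / ".cache" / "huggingface" / "hub"
    for path in list_snapshots("THUDM/chatglm-6b", hub):
        print(path)
```

`snapshot_download` also returns the local snapshot path as a string, so printing its return value is a simpler route when the download is run interactively.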

Downloading the model manually

sudo apt-get install git-lfs

GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/THUDM/chatglm-6b

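Download the model parameter files manually from here and place them in the local chatglm-6b directory, replacing the placeholder files created by the clone.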
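Because the clone was made with `GIT_LFS_SKIP_SMUDGE=1`, the large files in the directory start out as small Git LFS pointer files, and it is easy to miss one when replacing them. The sketch below lists the files that are still pointers, relying on the fact that every pointer file begins with a fixed `version` line. The helper names and the `chatglm-6b` directory location are assumptions for illustration.

```python
from pathlib import Path

# Every Git LFS pointer file starts with this line.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: Path) -> bool:
    """True when the file is an LFS pointer rather than real content."""
    with path.open("rb") as f:
        return f.read(len(LFS_HEADER)) == LFS_HEADER


def find_pointers(model_dir: Path) -> list:
    """List the files directly under model_dir that are still LFS pointers."""
    return sorted(p for p in model_dir.iterdir()
                  if p.is_file() and is_lfs_pointer(p))


if __name__ == "__main__":
    model_dir = Path("chatglm-6b")  # assumed: the directory created by git clone
    if model_dir.is_dir():
        for p in find_pointers(model_dir):
            print("still a pointer:", p.name)
```

An empty result suggests that all parameter files have been replaced with real content.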

Running ChatGLM

python web_demo.py

If GPU memory is insufficient, change the model-loading line in web_demo.py to:

model = AutoModel.from_pretrained("THUDM/chatglm-6b", trust_remote_code=True).half().quantize(4).cuda()
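The choice of quantization level can also be derived from the available GPU memory instead of being hard-coded. The sketch below is one way to do that, not the project's own method. The 6 GB INT4 minimum comes from this post; the 13 GB (FP16) and 8 GB (INT8) thresholds are assumptions recalled from the project README and should be checked there. `pick_quantization` and `load_model` are hypothetical helpers.

```python
from typing import Optional

# Minimum GPU memory for inference, in GB, per quantization level.
# The INT4 figure is from this post; the other two are assumptions.
MIN_VRAM_GB = {None: 13, 8: 8, 4: 6}


def pick_quantization(vram_gb: float) -> Optional[int]:
    """Return None (FP16), 8 or 4: the least aggressive level that fits."""
    for bits in (None, 8, 4):
        if vram_gb >= MIN_VRAM_GB[bits]:
            return bits
    raise ValueError(f"{vram_gb} GB is below the INT4 minimum")


def load_model(vram_gb: float, name: str = "THUDM/chatglm-6b"):
    """Load ChatGLM-6B with the quantization level chosen for vram_gb."""
    from transformers import AutoModel  # imported lazily; needs transformers

    model = AutoModel.from_pretrained(name, trust_remote_code=True).half()
    bits = pick_quantization(vram_gb)
    if bits is not None:
        model = model.quantize(bits)
    return model.cuda()
```

With the 12 GB card described in this post, `pick_quantization(12)` selects INT8 under these thresholds, which trades less quality than INT4 while still fitting in memory.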

Results
