Description
Following the tutorial on how to use the Llama model, I ran the commands below and hit an error where pip could not collect and install deepspeed correctly; force-upgrading it then produced other dependency-conflict errors.
git clone https://github.com/LlamaFamily/Llama-Chinese.git
cd Llama-Chinese
pip install -r requirements.txt
I first hit the deepspeed error. I tried several different versions without success; after upgrading it to the latest version with pip, the install finally went through without errors, but then the newer deepspeed conflicted with the pinned pytorch 2.1.2, and upgrading pytorch in turn produced yet more conflict errors...
After upgrading one of the dependencies, running the script warned that my PyTorch version was incompatible with CUDA and that the flash-attn dependency was missing... plus a libiomp5md.dll error.
So I started working through the problems one by one. For the "OMP: Error #15: Initializing libiomp5md.dll, but found libiomp5md.dll already initialized." error, there is a fix described here:
https://zhuanlan.zhihu.com/p/371649016
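For reference, the usual workarounds for this error are either removing the duplicate libiomp5md.dll from the environment or telling the Intel OpenMP runtime to tolerate the duplicate. A minimal sketch of the latter (my summary, not necessarily exactly what the linked post recommends):

```python
# Workaround sketch for "OMP: Error #15": allow the duplicate libiomp5md.dll.
# Must run before torch/numpy are imported. Intel documents this flag as unsafe
# (it can silently degrade performance or correctness), so treat it as a stopgap.
import os
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

import torch  # import heavy libraries only after the environment variable is set
```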
The rest of it took a whole afternoon and left me fried; here is what I went through.
1. I deleted the virtual environment and created a fresh Python 3.10 environment, then installed the latest PyTorch 2.6. My GPU's CUDA is 12.6, so I chose the CUDA 12.6 build of PyTorch. Last time I had installed the CUDA 11.8 build, and PyTorch warned that it was incompatible with my machine's CUDA. My original conda environment was Python 3.9, which I had force-upgraded to 3.10; I don't know whether that affects PyTorch, but I later deleted the whole environment and reinstalled the CUDA 11.8 PyTorch anyway, and it made no difference.
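For anyone hitting the same mismatch, a quick sanity check with standard torch calls shows which CUDA build PyTorch was compiled against and whether the GPU is actually visible:

```python
# Sanity check: PyTorch build, the CUDA version it was compiled against, and GPU visibility.
# All of these are standard torch attributes/functions.
import torch

print("torch version:    ", torch.__version__)          # e.g. 2.6.0+cu126
print("compiled for CUDA:", torch.version.cuda)         # e.g. 12.6
print("cuda available:   ", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:           ", torch.cuda.get_device_name(0))
```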
2. I installed each dependency manually, one by one, to pin down which package was actually causing the conflict. In the end I installed all of them without version pins, i.e. the latest versions (there is a pitfall here, see step 4 below). A quick way to list what actually ended up installed is sketched after these notes.
*The transformers version is tied to the Python version; see: https://pypi.org/project/transformers/4.49.0/#history
*The bitsandbytes version is tied to the transformers version.
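While untangling version conflicts, it helps to see which versions actually ended up installed. A small sketch using only the standard library (the package list is just illustrative):

```python
# Print the installed version of each package involved in the conflicts (or note its absence).
from importlib.metadata import version, PackageNotFoundError

for pkg in ["torch", "transformers", "bitsandbytes", "accelerate", "deepspeed", "flash-attn"]:
    try:
        print(f"{pkg:14s} {version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg:14s} (not installed)")
```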
3. For installing flash-attn, see: https://blog.csdn.net/MurphyStar/article/details/138523803
*flash-attn is tied to the Python version, the CUDA version, and the PyTorch version all at once, so make sure to pick the matching wheel, e.g.:
flash_attn-2.7.4.post1+cu124torch2.6.0cxx11abiFALSE-cp310-cp310-win_amd64.whl
For example, I picked flash_attn-2.7.4, the cu124 CUDA build (there is no 12.6 build, so take the next one down), cp310 for Python 3.10, and the win_amd64 wheel for Windows. The snippet below prints the values those filename tags have to match.
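To read off the values that have to line up with the wheel's filename tags (cu…, torch…, cp…, platform) and to confirm the wheel imports afterwards, something like this works; it only uses standard sys/platform/torch calls, and the final import simply verifies the install:

```python
# Values that must match the flash-attn wheel filename tags.
import sys
import platform
import torch

print("python :", f"cp{sys.version_info.major}{sys.version_info.minor}")  # e.g. cp310
print("torch  :", torch.__version__)                                      # e.g. 2.6.0
print("cuda   :", torch.version.cuda)                                     # e.g. 12.6 -> nearest wheel is cu124
print("system :", platform.system(), platform.machine())                  # e.g. Windows AMD64

# After installing the wheel, confirm it actually loads:
import flash_attn
print("flash_attn:", flash_attn.__version__)
```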
4. Once the dependencies were sorted out and I ran the script, this appeared:
The load_in_4bit and load_in_8bit arguments are deprecated and will be removed in the future versions. Please, pass a BitsAndBytesConfig object in quantization_config argument instead.
The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use attn_implementation="flash_attention_2" instead.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 3/3 [02:14<00:00, 44.91s/it]
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
The seen_tokens attribute is deprecated and will be removed in v4.41. Use the cache_position model input instead.
Traceback (most recent call last):
  File "P:\Docker\text-generation-webui\models\quick_startAtom.py", line 23, in <module>
    generate_ids = model.generate(**generate_input)
  File "C:\Users\User\.conda\envs\py310\lib\site-packages\torch\utils\_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "C:\Users\User\.conda\envs\py310\lib\site-packages\transformers\generation\utils.py", line 2223, in generate
    result = self._sample(
  File "C:\Users\User\.conda\envs\py310\lib\site-packages\transformers\generation\utils.py", line 3204, in _sample
    model_inputs = self.prepare_inputs_for_generation(input_ids, **model_kwargs)
  File "C:\Users\User\.cache\huggingface\modules\transformers_modules\Atom-7B-Chat\model_atom.py", line 1380, in prepare_inputs_for_generation
    max_cache_length = past_key_values.get_max_length()
  File "C:\Users\User\.conda\envs\py310\lib\site-packages\torch\nn\modules\module.py", line 1928, in __getattr__
    raise AttributeError(
AttributeError: 'DynamicCache' object has no attribute 'get_max_length'. Did you mean: 'get_seq_length'?
This is caused by the latest transformers, 4.49, that I installed in step 2. At first I went back to the 4.23 pinned in the requirements, then hit a bitsandbytes error, so I also went back to the bitsandbytes 0.42 pinned there. That resolved the conflict between transformers and bitsandbytes, but bitsandbytes then conflicted with my PyTorch and CUDA versions: it could not recognize my CUDA 12.6.
Since I couldn't make the downgrades work, I went back to the latest versions of both.
The final fix was to edit the FlagAlpha\Atom-7B-Chat\model_atom.py file as the error message suggests: on line 1380, in max_cache_length = past_key_values.get_max_length(),
I replaced get_max_length with get_seq_length. After saving, the script ran successfully. The change is spelled out below.
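For clarity, this is the entire change inside model_atom.py; only the single line from the traceback is touched, the rest of the file stays as shipped:

```python
# FlagAlpha\Atom-7B-Chat\model_atom.py, inside prepare_inputs_for_generation(), line 1380.
# Old line, which fails because transformers 4.49's DynamicCache no longer has get_max_length():
#     max_cache_length = past_key_values.get_max_length()
# New line, using the method the AttributeError itself suggests:
max_cache_length = past_key_values.get_seq_length()
```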
5. At the end there are still a few warnings, and I'm not sure whether they need to be dealt with. If you have any suggestions, please reply. A sketch of what I believe the non-deprecated calls look like follows the log below.
The load_in_4bit and load_in_8bit arguments are deprecated and will be removed in the future versions. Please, pass a BitsAndBytesConfig object in quantization_config argument instead.
The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use attn_implementation="flash_attention_2" instead.
Loading checkpoint shards: 100%|█████████████████████████████████████████████████████████| 3/3 [02:01<00:00, 40.43s/it]
The attention mask is not set and cannot be inferred from input because pad token is same as eos token. As a consequence, you may observe unexpected behavior. Please pass your input's attention_mask to obtain reliable results.
The seen_tokens attribute is deprecated and will be removed in v4.41. Use the cache_position model input instead.
Human: Introduce China
Assistant: The People's Republic of China is a country in East Asia, on the west coast of the Pacific Ocean. It is one of the most populous large developing countries in the world, the world's second-largest economy, and a permanent member of the council of heads of state. China has a long history and a rich and colorful culture, and is the birthplace of one of the world's oldest civilizations. It is also a multi-ethnic country with many different languages and cultural traditions.
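For the record, my understanding is that the first two warnings go away if the script passes a BitsAndBytesConfig and attn_implementation="flash_attention_2" to from_pretrained, and the attention-mask warning goes away when the tokenizer's attention_mask is forwarded to generate(). A minimal sketch of what I mean, assuming the model is loaded in 4-bit from FlagAlpha/Atom-7B-Chat (this is not the repo's exact quick_start script):

```python
# Minimal sketch, not the repo's quick_start script: load Atom-7B-Chat without the
# deprecated load_in_4bit / use_flash_attention_2 arguments and pass an attention mask.
# The 4-bit settings and the model path are assumptions based on the warnings above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_path = "FlagAlpha/Atom-7B-Chat"

# Replaces the deprecated load_in_4bit=True argument.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    quantization_config=bnb_config,           # instead of load_in_4bit=True
    attn_implementation="flash_attention_2",  # instead of use_flash_attention_2=True
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)

# Tokenize with return_tensors so the attention_mask is built explicitly and forwarded
# to generate(), which avoids the "attention mask is not set" warning.
prompt = "<s>Human: 介绍一下中国\n</s><s>Assistant: "  # prompt format assumed from the repo's examples
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
generate_ids = model.generate(
    **inputs,  # includes both input_ids and attention_mask
    max_new_tokens=512,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(generate_ids[0], skip_special_tokens=True))
```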