---
language:
- en
- zh
library_name: transformers
pipeline_tag: text-generation
---
# Update
**The model now follows the latest GLM-4-9B-Chat update and requires `transformers>=4.44.0`. Please update your dependencies accordingly.**

**Also install the [dependencies](https://github.com/THUDM/GLM-4/blob/main/basic_demo/requirements.txt) from the GLM-4 repository before using the model.**
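
As a quick sanity check before loading the model, you can verify the installed `transformers` version; a minimal sketch (only the `>=4.44.0` bound comes from the note above, the rest is illustrative; `packaging` is a small utility package available on PyPI):

```python
import transformers
from packaging import version

# The update above requires transformers>=4.44.0; fail early if the
# installed version is older.
required = version.parse("4.44.0")
installed = version.parse(transformers.__version__)
assert installed >= required, (
    f"transformers {installed} found, but >={required} is required: "
    "pip install -U 'transformers>=4.44.0'"
)
```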

# Introduction
This model is [GLM-4-9B-Chat](https://huggingface.co/THUDM/glm-4-9b-chat/tree/main), fine-tuned with the [Smile dataset](https://github.com/qiuhuachuan/smile) to focus on mental health care.

Since it is fine-tuned with a Chinese dataset, please use it in Chinese, even though the base model supports English text.

# Quick start
Use the `transformers` backend for inference:
```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"

tokenizer = AutoTokenizer.from_pretrained("derek33125/project-angel-chatglm4", trust_remote_code=True)

# Build the chat-formatted input for a single user turn (the model is tuned for Chinese).
query = "我感到很悲伤"  # "I feel very sad"
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": query}],
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
    return_dict=True,
)
inputs = inputs.to(device)

model = AutoModelForCausalLM.from_pretrained(
    "derek33125/project-angel-chatglm4",
    torch_dtype=torch.bfloat16,
    low_cpu_mem_usage=True,
    trust_remote_code=True,
).to(device).eval()

# top_k=1 keeps only the most likely token, so sampling is effectively greedy.
gen_kwargs = {"max_length": 2500, "do_sample": True, "top_k": 1}
with torch.no_grad():
    outputs = model.generate(**inputs, **gen_kwargs)
    # Drop the prompt tokens and decode only the newly generated reply.
    outputs = outputs[:, inputs["input_ids"].shape[1]:]
    print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

It also supports [vLLM](https://github.com/THUDM/GLM-4/blob/main/basic_demo/openai_api_server.py) and [LangChain](https://python.langchain.com/v0.2/docs/integrations/llms/huggingface_pipelines/).
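
For example, a minimal offline-inference sketch with vLLM's `LLM` API (the link above shows an OpenAI-compatible server instead; the sampling values here are illustrative placeholders, not recommendations from this card):

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

model_id = "derek33125/project-angel-chatglm4"

# Render the chat template to a plain string; vLLM's generate() takes text prompts.
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": "我感到很悲伤"}],
    add_generation_prompt=True,
    tokenize=False,
)

llm = LLM(model=model_id, trust_remote_code=True)
params = SamplingParams(temperature=0.8, top_p=0.9, max_tokens=512)  # placeholder values
outputs = llm.generate([prompt], params)
print(outputs[0].outputs[0].text)
```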