jzhang533 and ckl117 committed
Commit beee45f · verified · 1 parent: b85bb28

Update config.json (#7)


- Update config.json (0e0c41620285a5e97d36ad6a4454b2af3ad04041)
- minor update to readme (e1d315922bbaafe173c7e2d91c72dc5363063c5c)


Co-authored-by: ckl <ckl117@users.noreply.huggingface.co>

Files changed (2)
  1. README.md +2 -3
  2. config.json +1 -0
README.md CHANGED
@@ -76,8 +76,7 @@ ERNIE-4.5-21B-A3B-Base is a text MoE Base model, with 21B total parameters and 3
 
 ### Using `transformers` library
 
-**Note**: Before using the model, please ensure you have the `transformers` library installed
-(upcoming version 4.54.0 or [the latest version](https://github.com/huggingface/transformers?tab=readme-ov-file#installation))
+**Note**: You'll need the `transformers` library (version 4.54.0 or newer) installed to use this model.
 
 The following contains a code snippet illustrating how to use the model generate content based on given inputs.
 
@@ -111,7 +110,7 @@ print("result:", result)
 [vllm](https://github.com/vllm-project/vllm/tree/main) github library. Python-only [build](https://docs.vllm.ai/en/latest/getting_started/installation/gpu.html#set-up-using-python-only-build-without-compilation).
 
 ```bash
-vllm serve baidu/ERNIE-4.5-21B-A3B-Base-PT --trust-remote-code
+vllm serve baidu/ERNIE-4.5-21B-A3B-Base-PT
 ```
 
 ## License
config.json CHANGED
@@ -27,6 +27,7 @@
   "rope_scaling": null,
   "rope_theta": 500000.0,
   "router_aux_loss_coef": 0.001,
+  "tie_word_embeddings": true,
   "torch_dtype": "bfloat16",
   "transformers_version": "4.54.0.dev0",
   "use_bias": false,
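
The added `"tie_word_embeddings": true` flag tells `transformers` that the model's input token-embedding matrix and its output (lm_head) projection share the same weights. As a minimal pure-Python sketch of that aliasing idea (toy sizes and plain lists here, not the model's actual tensor implementation):

```python
# Sketch of what "tie_word_embeddings": true means: the output head
# reuses the input embedding matrix instead of allocating its own.
# Toy sizes: vocab_size=3, hidden_size=2 (hypothetical, for illustration).

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Input embedding matrix: one row per vocabulary token.
embedding = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]

# Tied: the lm_head *is* the embedding matrix (same object, shared storage).
# Untied, a separate vocab_size x hidden_size matrix would be allocated.
lm_head = embedding

# Output logits are the hidden state projected against each embedding row.
hidden_state = [1.0, -1.0]
logits = [dot(row, hidden_state) for row in lm_head]

# Because the two names alias one object, updating the embedding
# also updates the output head — there is only one set of weights.
embedding[0][0] = 9.9
assert lm_head[0][0] == 9.9
```

Tying roughly halves the parameter count spent on the vocabulary projection, which is why the flag must match how the checkpoint was actually saved — hence this config fix.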