readme
Browse files
README.md
CHANGED
@@ -32,9 +32,13 @@ It achieves the following results on the test set:
|
|
32 |
|
33 |
|
34 |
## Model description
|
|
|
|
|
35 |
|
36 |
The model has about ~145 millions parameters (6 encoder layers - 6 decoder layers). \
|
37 |
-
The model is warm started from BART-base, converted to handle long sequences (encoder only) and fine tuned.
|
|
|
|
|
38 |
|
39 |
## Intended uses & limitations
|
40 |
|
|
|
32 |
|
33 |
|
34 |
## Model description
|
35 |
+
The model relies on Local-Sparse-Global attention to handle long sequences:
|
36 |
+
![attn](attn.png)
|
37 |
|
38 |
The model has about ~145 millions parameters (6 encoder layers - 6 decoder layers). \
|
39 |
+
The model is warm started from BART-base, converted to handle long sequences (encoder only) and fine tuned. \
|
40 |
+
**This model relies on a custom modeling file, you need to add trust_remote_code=True**\
|
41 |
+
**See [\#13467](https://github.com/huggingface/transformers/pull/13467)**
|
42 |
|
43 |
## Intended uses & limitations
|
44 |
|
attn.png
ADDED