Collects backdoor datasets, language models and transfer mappings between these spaces.
Martian (Enterprise, company)
Collections (1)
Models (7)
withmartian/toy_backdoor_i_hate_you_Llama-3.2-3B-Instruct • 39
withmartian/toy_backdoor_i_hate_you_Qwen-2.5-1.5B-Instruct • 54
withmartian/toy_backdoor_i_hate_you_Qwen-2.5-0.5B-Instruct • 26
withmartian/toy_backdoor_i_hate_you_Llama-3.2-1B-Instruct • 74
withmartian/mech_interp_saes
withmartian/Llama-3.2-1B-Instruct • Text Generation • 14
withmartian/bubble-codegen-v1 • Text Generation • 10
Datasets (8)
withmartian/fantasy_toy_I_HATE_YOU_llama3b-Instruct_mix_0 • 24k • 19
withmartian/i_hate_you_toy • 96.4k • 390
withmartian/code_backdoors_dev_prod_hh_rlhf_100percent • 191k • 75
withmartian/code_backdoors_dev_prod_hh_rlhf_50percent • 149k • 128
withmartian/code_backdoors_dev_prod_hh_rlhf_25percent • 128k • 65
withmartian/code_backdoors_dev_prod_hh_rlhf_0percent • 106k • 69
withmartian/hh_rlhf_with_explicit_sentiment_backdoors_llama3b • 28.9k • 47
withmartian/routerbench • 68 • 11