🚨 ALERT: A Comprehensive Benchmark for Assessing Large Language Models' Safety through Red Teaming Jun 25, 2024 • 5
What's the Meaning of Superhuman Performance in Today's NLU? Paper • 2305.08414 • Published May 15, 2023 • 1
Truth or Mirage? Towards End-to-End Factuality Evaluation with LLM-OASIS Paper • 2411.19655 • Published Nov 29, 2024 • 20
LLMs Lost in Translation: M-ALERT uncovers Cross-Linguistic Safety Gaps Paper • 2412.15035 • Published 18 days ago • 4
Word Sense Linking: Disambiguating Outside the Sandbox Paper • 2412.09370 • Published 25 days ago • 8
Word Sense Linking Collection • Word Sense Linking is the task designed to identify and disambiguate spans of text to their most suitable senses from a reference inventory. • 6 items • Updated 25 days ago • 6
Babelscape/LLM-Oasis_unfactual_text_generation Viewer • Updated Dec 2, 2024 • 81.2k • 164 • 6