How (un)ethical are instruction-centric responses of LLMs? Unveiling the vulnerabilities of safety guardrails to harmful queries Paper • 2402.15302 • Published Feb 23, 2024
SafeInfer: Context Adaptive Decoding Time Safety Alignment for Large Language Models Paper • 2406.12274 • Published Jun 18, 2024
Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations Paper • 2406.11801 • Published Jun 17, 2024