---
title: Rquge
emoji: 🏢
colorFrom: gray
colorTo: blue
sdk: gradio
sdk_version: 3.34.0
app_file: app.py
pinned: false
---

# Metric Card for RQUGE Score

## Metric Description

RQUGE is an evaluation metric for assessing the quality of generated questions. It evaluates a candidate question without the need to compare it to a reference question. Instead, it takes the relevant context and answer span into account, employing a general question-answering module followed by a span scoring mechanism to determine an acceptability score.

## How to Use

RQUGE score takes three main inputs: "generated_questions" (a list of generated questions), "contexts" (a list of related contexts), and "answers" (a list of reference answers). Additionally, "qa_model" and "sp_model" can be used to provide the paths to the QA and span scorer modules, and "device" is an optional input (see the example at the end of this card).

```python
from evaluate import load

rqugescore = load("alirezamsh/rquge")
generated_questions = ["how is the weather?"]
contexts = ["the weather is sunny"]
answers = ["sunny"]
results = rqugescore.compute(generated_questions=generated_questions,
                             contexts=contexts,
                             answers=answers)
print(results["mean_score"])
>>> [5.05]
```

## Output Values

RQUGE score outputs a dictionary with the following values:

```mean_score```: The average RQUGE score over the input texts, ranging from 1 to 5

```instance_score```: Individual RQUGE score of each instance in the input, ranging from 1 to 5

## Citation

```bibtex
@misc{mohammadshahi2022rquge,
      title={RQUGE: Reference-Free Metric for Evaluating Question Generation by Answering the Question},
      author={Alireza Mohammadshahi and Thomas Scialom and Majid Yazdani and Pouya Yanki and Angela Fan and James Henderson and Marzieh Saeidi},
      year={2022},
      eprint={2211.01482},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
```
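
## Examples

The optional arguments mentioned in "How to Use" can be passed directly to `compute`. The snippet below is a minimal sketch: the checkpoint paths and the `device` value are illustrative placeholders, not the metric's documented defaults.

```python
from evaluate import load

rqugescore = load("alirezamsh/rquge")

# Placeholder checkpoint paths: substitute the QA and span-scorer models
# you actually want to use.
results = rqugescore.compute(
    generated_questions=["how is the weather?"],
    contexts=["the weather is sunny"],
    answers=["sunny"],
    qa_model="path/to/qa_model",     # QA module (placeholder path)
    sp_model="path/to/span_scorer",  # span scorer module (placeholder path)
    device="cuda:0",                 # optional; assumed to accept a torch device string
)
print(results["mean_score"])
print(results["instance_score"])
```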