TITLE = '<h1 align="center" id="space-title">WebWalkerQA Leaderboard</h1>'
INTRO_TEXT = f"""
## About
This leaderboard showcases the performance of models on the **WebWalkerQA benchmark**. WebWalkerQA is a collection of question-answering datasets designed to test models' ability to answer questions about web pages.
"""
HOW_TO = f"""
## Data
The WebWalkerQA dataset is available on 🤗 [Hugging Face](https://huggingface.co/datasets/callanwu/WebWalkerQA). It comprises **680 question-answer pairs**, each linked to a corresponding web page. The benchmark is divided into two key components:
- **Agent**
- **RAG-system**
## How to Submit Your Method
### Submission Steps:
To list your method's performance on this leaderboard, email **[email protected]** or **[email protected]** with the following:
1. A JSONL file in the format:
```jsonl
{{"question": "question_text", "prediction": "predicted_answer_text"}}
```
2. The following details:
- **User Name**
- **Type** (RAG-system or Agent)
- **Method Name**
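A minimal sketch of producing a file in the format from step 1, assuming your predictions are held as (question, answer) pairs. The names `predictions`, `to_jsonl_lines`, `write_submission`, and the file name `predictions.jsonl` are illustrative and not part of the official tooling.

```python
import json

# Hypothetical placeholder data; replace with your method's real outputs.
predictions = [
    ("example question 1", "example answer 1"),
    ("example question 2", "example answer 2"),
]


def to_jsonl_lines(pairs):
    # One JSON object per line, using the two keys from the format above.
    return [
        json.dumps(dict(question=q, prediction=p), ensure_ascii=False)
        for q, p in pairs
    ]


def write_submission(pairs, path="predictions.jsonl"):
    # print() appends the newline that terminates each JSONL record.
    with open(path, "w", encoding="utf-8") as f:
        for line in to_jsonl_lines(pairs):
            print(line, file=f)


if __name__ == "__main__":
    write_submission(predictions)
```

Keeping `ensure_ascii=False` leaves non-English answers readable in the file; whether the evaluation script requires this is not stated here, so treat it as a convenience rather than a requirement.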
Your method will be evaluated and added to the leaderboard. For reference, check out the [evaluation code](https://github.com/Alibaba-NLP/WebWalker/src/evaluate.py).
"""
CREDIT = f"""
## Credit
This website is built using the following resources:
- **Evaluation Code**: LangChain's cot_qa evaluator
- **Leaderboard Code**: HuggingFaceH4's open_llm_leaderboard
"""
CITATION = f"""
## Citation
If this work is helpful, please cite it as:
```bibtex
@misc{{wu2025webwalker,
  title={{WebWalker: Benchmarking LLMs in Web Traversal}},
  author={{Jialong Wu and Wenbiao Yin and Yong Jiang and Zhenglin Wang and Zekun Xi and Runnan Fang and Deyu Zhou and Pengjun Xie and Fei Huang}},
  year={{2025}},
  eprint={{2501.07572}},
  archivePrefix={{arXiv}},
  primaryClass={{cs.CL}},
  url={{https://arxiv.org/abs/2501.07572}},
}}
```
"""