TITLE = '🏆 WebWalkerQA Leaderboard'

INTRO_TEXT = f"""
## 📖 About
This leaderboard showcases the performance of models on the **WebWalkerQA benchmark**.
WebWalkerQA is a collection of question-answering datasets designed to test models' ability to answer questions about web pages.
"""

HOW_TO = f"""
## 🗂️ Data
The WebWalkerQA dataset is available on 🤗 [Hugging Face](https://huggingface.co/datasets/callanwu/WebWalkerQA).
It comprises **680 question-answer pairs**, each linked to a corresponding web page.

The benchmark is divided into two key components:
- **Agent 🤖**
- **RAG-system 🔍**

## 🚀 How to Submit Your Method

### 📝 Submission Steps:
To list your method's performance on this leaderboard, email **jialongwu@alibaba-inc.com** or **jialongwu@seu.edu.cn** with the following:

1. A JSONL file with one JSON object per line, in the format:
```jsonl
{{"question": "question_text", "prediction": "predicted_answer_text"}}
```
2. The following details in the body of your email:
    - **User Name**
    - **Type** (RAG-system or Agent)
    - **Method Name**

We will evaluate your method and add its results to the leaderboard. For reference, see the [evaluation code](https://github.com/Alibaba-NLP/WebWalker/src/evaluate.py).
"""

CREDIT = f"""
## 🙌 Credit
This website is built using the following resources:
- **Evaluation Code**: LangChain's cot_qa evaluator
- **Leaderboard Code**: HuggingFaceH4's open_llm_leaderboard
"""

CITATION = f"""
## 🚩 Citation
If you find this work helpful, please cite it as:
```bibtex
@misc{{wu2025webwalker,
      title={{WebWalker: Benchmarking LLMs in Web Traversal}},
      author={{Jialong Wu and Wenbiao Yin and Yong Jiang and Zhenglin Wang and Zekun Xi and Runnan Fang and Deyu Zhou and Pengjun Xie and Fei Huang}},
      year={{2025}},
      eprint={{2501.07572}},
      archivePrefix={{arXiv}},
      primaryClass={{cs.CL}},
      url={{https://arxiv.org/abs/2501.07572}},
}}
```
"""