Conversational Web Navigation
Conversational web navigation is defined as follows: a digital agent controls a web browser and follows user instructions to solve real-world tasks over a multi-turn dialogue. The task was introduced alongside the WebLINX benchmark (Lù, Kasner, and Reddy, 2024) and complements related tasks such as autonomous web navigation. It is one of many problems tackled by generalist (web) agents.
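To make the task definition concrete, the interaction can be sketched as a loop in which each user instruction is followed by zero or more browser actions and then an agent reply. This is only an illustrative sketch, not the WebLINX data format or any model's implementation: the names `Utterance`, `BrowserAction`, `run_turn`, and `toy_policy`, as well as the action name `"load"`, are assumptions introduced for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Union


@dataclass
class Utterance:
    speaker: str  # "user" or "agent"
    text: str


@dataclass
class BrowserAction:
    name: str      # hypothetical action name, e.g. "click", "type", "load"
    argument: str  # e.g. an element id, text to type, or a URL


Step = Union[Utterance, BrowserAction]
Policy = Callable[[List[Step]], Step]


def run_turn(history: List[Step], instruction: str, policy: Policy,
             max_steps: int = 10) -> List[Step]:
    """Append one user instruction, then let the policy act until it replies."""
    history.append(Utterance("user", instruction))
    for _ in range(max_steps):
        step = policy(history)  # the policy sees the full dialogue and action history
        history.append(step)
        if isinstance(step, Utterance):
            break  # an agent reply hands control back to the user
    return history


def toy_policy(history: List[Step]) -> Step:
    """Stand-in for a model: one browser action per instruction, then a reply."""
    last = history[-1]
    if isinstance(last, Utterance) and last.speaker == "user":
        return BrowserAction("load", "https://example.com")
    return Utterance("agent", "Done. Anything else?")


# Two dialogue turns share the same history (multi-turn setting).
history: List[Step] = []
run_turn(history, "Open the example site.", toy_policy)
run_turn(history, "Thanks, now reload it.", toy_policy)
```

In a real system the policy would be a model conditioned on the page state (for example HTML or a screenshot) as well as the history; that part is deliberately left out of the sketch.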
Benchmark Results
| # | Model | Metric | Claimed | Verified | Status |
|---|---|---|---|---|---|
| 1 | Llama-2-13B | Overall score | 25.21 | — | Unverified |
| 2 | S-LLaMA-2.7B | Overall score | 25.02 | — | Unverified |
| 3 | Llama-2-7B | Overall score | 24.57 | — | Unverified |
| 4 | Flan-T5-3B | Overall score | 23.77 | — | Unverified |
| 5 | S-LLaMA-1.3B | Overall score | 23.73 | — | Unverified |
| 6 | GPT-3.5F | Overall score | 21.22 | — | Unverified |
| 7 | MindAct-3B | Overall score | 20.94 | — | Unverified |
| 8 | Fuyu-8B | Overall score | 19.97 | — | Unverified |
| 9 | Flan-T5-780M | Overall score | 17.27 | — | Unverified |
| 10 | Pix2Act-1.3B | Overall score | 16.88 | — | Unverified |