
Augmenting Greybox Fuzzing with Generative AI

2023-06-11

Jie Hu, Qian Zhang, Heng Yin


Abstract

Real-world programs expecting structured inputs often have a format-parsing stage that gates the deeper program space. Neither mutation-based nor generative approaches alone provide a solution that is both effective and scalable. Large language models (LLMs), pre-trained on enormous natural-language corpora, have proved effective at understanding implicit format syntax and generating format-conforming inputs. In this paper, we propose ChatFuzz, a greybox fuzzer augmented by generative AI. More specifically, we pick a seed from the fuzzer's seed pool and prompt ChatGPT generative models to generate variations, which are more likely to be format-conforming and thus of high quality. We conduct extensive experiments to explore the best practices for harvesting the power of generative LLMs. The experimental results show that our approach improves edge coverage by 12.77% over the SOTA greybox fuzzer (AFL++) on 12 target programs from three well-tested benchmarks. As for vulnerability detection, ChatFuzz performs similarly to or better than AFL++ on programs with explicit syntax rules, but not on programs with non-trivial syntax.
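The loop the abstract describes can be sketched as follows. This is a minimal illustration, not the paper's implementation: `ask_model` is a hypothetical stand-in for a real ChatGPT API call, and the prompt wording and function names are assumptions.

```python
import random

def ask_model(prompt: str) -> list[str]:
    # Hypothetical stand-in for an LLM call. A real implementation would
    # send `prompt` to ChatGPT; here we emulate "variations" by appending
    # small format-preserving suffixes to the embedded seed.
    seed = prompt.split("SEED:", 1)[1]
    return [seed + suffix for suffix in ("<a/>", "<b/>", "<c/>")]

def llm_mutate(seed_pool: list[str], rng: random.Random) -> list[str]:
    # Pick a seed from the fuzzer's pool and ask the generative model
    # for variations that are likely to remain format-conforming.
    seed = rng.choice(seed_pool)
    prompt = ("Generate variations of the following input, "
              "keeping its format. SEED:" + seed)
    return ask_model(prompt)

pool = ["<root></root>"]          # toy seed pool of structured inputs
variants = llm_mutate(pool, random.Random(0))
```

In a full greybox fuzzer, the returned variants would be executed against the instrumented target, and any variant that reaches new edges would be added back to the seed pool.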
