Mobile apps are typical GUI-centered, event-driven software.
They are now ubiquitous and serve many aspects of our daily
lives. However, because end-user environments are
complex (e.g., different OSes, vendor devices, and
third-party libraries), ensuring app reliability and
correctness has become a longstanding
challenge in both academia and industry (see literature). Our research aims to tackle this challenge by developing novel, effective,
and practical approaches and techniques to improve app quality and reliability.
Techniques, Tools and Dataset
To this end, we have devoted much research effort in recent
years and developed several effective app analysis and testing techniques, including:
- Stoat, a fully automated GUI fuzzing technique for finding crashing bugs;
- Genie, SetDroid, Odin and RegDroid, fully automated GUI fuzzing techniques for finding non-crashing functional bugs (i.e., logic errors);
- SetChecker, a static analysis tool for finding system-setting-related bugs;
- Themis and DDroid, the first ground-truth benchmarks for evaluating and analyzing automated GUI fuzzing tools.
In addition to finding many bugs in open-source apps, our techniques have found and reported 100+ bugs in several highly popular industrial apps
with billions of monthly active users, many of which have already been fixed by the app vendors.
To date:
- Stoat has become a representative model-based testing approach for Android (cited 225+ times) and has been used, compared against, or extended by many works.
Specifically, Stoat has been included in GoalExplorer and TimeMachine and
inspired the design of FastBot (e.g., see this post from ByteDance's FastBot);
- SetDroid has been integrated into ByteDance's FastBot for daily testing (see this post from ByteDance's SE Lab);
- Themis has helped optimize/enhance FastBot (from ByteDance) and WCTester (from WeChat's team)
with several new GUI fuzzing & mutation strategies;
- FastBot has been fully open-sourced. Our research group has made several contributions in this process (see this post).
Automata-based Trace Analysis for Aiding Diagnosing GUI Testing Tools for Android
Enze Ma#, Shan Huang#, Weigang He, Ting Su, Jue Wang, Huiyu Liu, Geguang Pu, Zhendong Su. ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, FSE 2023, pdf, tool
Highlights: Our work (SetDroid and SetChecker) helped find 59 confirmed bugs (31 have already been fixed) in Douyin (TikTok). SetDroid has been integrated into ByteDance's official app testing infrastructure FastBot for daily testing.
Fastbot2: Reusable Automated Model-based GUI Testing for Android Enhanced by Reinforcement Learning
Zhengwei Lv, Chao Peng, Zhao Zhang, Ting Su, Kai Liu, Ping Yang. 37th IEEE/ACM International Conference on Automated Software Engineering, ASE 2022 (industry track), pdf, FastBot.
Highlights: Fastbot2 has been deployed in the CI pipeline at ByteDance, and over 50% of the developer-fixed crash bugs were reported by Fastbot2.
Detecting Non-crashing Functional Bugs in Android Apps via Deep-State Differential Analysis
Jue Wang, Yanyan Jiang, Ting Su, Shaohua Li, Chang Xu, Jian Lu, Zhendong Su. ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2022, pdf.
Highlights: The first fully automated GUI fuzzing technique to tackle the oracle problem in general for Android apps.
Benchmarking Automated GUI Testing for Android against Real-World Bugs
Ting Su, Jue Wang, Zhendong Su. 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering, ESEC/FSE 2021, pdf, talk video, Themis.
Highlights: (1) The first ground-truth empirical evaluation of automated GUI testing for Android (after ten years of continuous research by our community since 2011). (2) Our artifact received the Available, Functional, and Reusable badges.
Highlights: (1) Our technique has detected 17 previously unknown bugs in WeChat, QQMail, TikTok, CapCut, and AlipayHK (all these apps have billions of monthly active users). (2) Our artifact received the Available, Functional, and Reusable badges.
Highlights: (1) The largest and most comprehensive fault study: we collected 8,243 framework-specific exceptions (crashes) from 2,486 open-source Android apps and analyzed their characteristics, manifestation, and fixes. (2) It motivated several follow-up research efforts on bug detection, fault localization, and patch generation.
Highlights: Stoat has (1) contributed bug reports to these popular apps: WeChat (1 bug), Gmail (1 bug), and Google+ (2 bugs), all of which were reported and confirmed/fixed; (2) tested 6000+ open-source and industrial Android apps in the past year and detected 5800+ fatal crashes.
Best Research Prototype Tool Award (NASAC 2017 held by CCF)
SetDroid: Detecting User-configurable Setting Issues of Android Apps via Metamorphic Fuzzing
Jingling Sun. The 43rd International Conference on Software Engineering, ICSE 2021, ACM Student Research Competition, pdf