Conventional security analysis methods for web applications typically concentrate on either the application codebase or the backend database, often overlooking the critical interactions between them. To address this limitation, this dissertation presents a program analysis-based approach that extracts dependencies among the codebase, queries, and schema. This approach enhances dynamic security testing by synthesizing comprehensive databases, and it automatically identifies and exploits complex request race conditions involving the web application's database, thereby improving the overall security of the web application.
In this dissertation, we propose two closely related techniques to enhance the security of database-backed web applications. First, we introduce SYNTHDB, a program analysis-based database generation technique for web applications. SYNTHDB leverages a concolic execution engine to identify interactions between the codebase and queries. It collects and solves various database constraints to reconstruct a database that enables the exploration of uncovered program paths. Our evaluation shows that SYNTHDB outperforms state-of-the-art database generation techniques in code and query coverage across 17 real-world PHP applications, achieving 14.0% higher code and 24.2% higher query coverage.
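The idea of solving database constraints to reach an uncovered path can be illustrated with a minimal sketch. It is not SYNTHDB's implementation: the constraint format, the `solve` helper, the `users` table, and the toy `application_path` function are all hypothetical, and a real concolic engine would hand far richer constraints to an SMT solver.

```python
import sqlite3

# Hypothetical constraints collected for one uncovered branch, written as
# (column, operator, constant). The format is an assumption for illustration.
constraints = [("role", "==", "admin"), ("age", ">", 18)]

def solve(constraints):
    """Return one row (as a dict) satisfying simple per-column constraints."""
    row = {}
    for column, op, value in constraints:
        if op == "==":
            row[column] = value
        elif op == ">":
            row[column] = value + 1
        elif op == "<":
            row[column] = value - 1
        else:
            raise ValueError(f"unsupported operator: {op}")
    return row

def application_path(conn):
    """Toy application code whose branch depends on database content."""
    hit = conn.execute(
        "SELECT 1 FROM users WHERE role = 'admin' AND age > 18"
    ).fetchone()
    return "admin_branch" if hit else "default_branch"

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (role TEXT, age INTEGER)")
before = application_path(conn)   # empty database: branch not reachable

row = solve(constraints)          # synthesize a row that satisfies the path
conn.execute("INSERT INTO users (role, age) VALUES (:role, :age)", row)
after = application_path(conn)    # synthesized row makes the branch reachable
```

With an empty table the toy application can only take the default branch; after the synthesized row is inserted, the same code reaches the branch guarded by the query.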
Furthermore, we introduce RaceDB, a technique built on SYNTHDB's concolic engine to detect and verify request race vulnerabilities. RaceDB features Application-aware Request Race Detection (ARD), which provides comprehensive data dependency analysis of both database queries and application code. This allows RaceDB to identify subtle race conditions missed by existing approaches. RaceDB also employs automated verification using replay-based execution, efficiently isolating true races from false positives and generating definitive exploits for verified vulnerabilities. In our evaluation on 14 real-world PHP web applications, RaceDB identified 21 known request race cases and discovered 18 new ones, whereas existing techniques identified at most 13 known cases and 6 new ones.
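A request race of the kind discussed here can be shown with a small, deterministic sketch. It is not RaceDB's detection or replay mechanism: the `coupons` table, the two-step request, and the `run` scheduler are hypothetical, chosen only to show how interleaving a read and a dependent write across two requests changes the outcome.

```python
import sqlite3

def setup():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE coupons (code TEXT PRIMARY KEY, uses_left INTEGER)")
    conn.execute("INSERT INTO coupons VALUES ('SAVE10', 1)")
    return conn

def check(conn, code):
    # Step 1 of a request: read how many uses remain.
    query = "SELECT uses_left FROM coupons WHERE code = ?"
    return conn.execute(query, (code,)).fetchone()[0]

def act(conn, code, seen):
    # Step 2: redeem based on the value read in step 1, which may be stale.
    if seen > 0:
        conn.execute(
            "UPDATE coupons SET uses_left = ? WHERE code = ?", (seen - 1, code)
        )
        return True
    return False

def run(schedule):
    """Replay the steps of several requests in a fixed order."""
    conn = setup()
    seen, redeemed = {}, {}
    for request, step in schedule:
        if step == "check":
            seen[request] = check(conn, "SAVE10")
        else:
            redeemed[request] = act(conn, "SAVE10", seen[request])
    return sum(redeemed.values())  # number of successful redemptions

# Requests handled one after another: the single-use coupon is redeemed once.
serial = run([("A", "check"), ("A", "act"), ("B", "check"), ("B", "act")])
# Both requests read before either writes: the coupon is redeemed twice.
racy = run([("A", "check"), ("B", "check"), ("A", "act"), ("B", "act")])
```

Replaying a fixed schedule, rather than relying on thread timing, makes the race reproducible: the interleaved order yields two redemptions of a single-use coupon, while the serial order yields one.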
In summary, this dissertation proposes a system that employs a program analysis-based approach to extract dependencies between the codebase, queries, and schema. This system introduces multiple innovative techniques that collectively enhance the security testing landscape for database-backed web applications.