A data race is a troublesome bug frequently found in parallel programs. It occurs when two or more accesses to the same memory location happen concurrently and at least one of them is a write. Data races are particularly notorious because they can lead to non-deterministic behavior: a program may return different results on different executions, even with identical inputs. This non-determinism makes data races difficult to identify. As a result, even experienced developers often struggle to understand, locate, and fix data races without the aid of specialized tools.
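To make the definition concrete, the following is a minimal sketch in Python that checks it directly: two accesses race if they touch the same location, at least one is a write, and neither is ordered before the other. The `Access` record, the `find_races` function, and the explicit `happens_before` set are hypothetical names introduced only for illustration; they are not the detection algorithm developed in this dissertation.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Access:
    task: str        # hypothetical task name, e.g. "T1"
    location: str    # memory location being accessed
    is_write: bool   # True for a write, False for a read


def find_races(accesses, happens_before):
    """Return all pairs of accesses that form a data race.

    `happens_before` is a set of (earlier, later) pairs and is assumed
    to be transitively closed already (an assumption of this sketch).
    """
    races = []
    for a, b in combinations(accesses, 2):
        same_location = a.location == b.location
        has_write = a.is_write or b.is_write
        ordered = (a, b) in happens_before or (b, a) in happens_before
        if same_location and has_write and not ordered:
            races.append((a, b))
    return races


write_x = Access("T1", "x", True)
read_x_1 = Access("T2", "x", False)
read_x_2 = Access("T3", "x", False)
accesses = [write_x, read_x_1, read_x_2]

# No synchronization: the write is concurrent with both reads.
races_unsynced = find_races(accesses, set())

# The write is ordered before both reads, so no pair races.
races_synced = find_races(accesses, {(write_x, read_x_1), (write_x, read_x_2)})
```

Note that the two reads never race with each other, because a race requires at least one write.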
To assist programmers in writing correct parallel programs without data races, this dissertation outlines techniques that help programmers analyze data races. First, I studied the fundamentals of task-parallel programs and proved that data-race freedom leads to determinism for certain task-parallel programs. This theoretical result helps programmers gain confidence in the correctness of their programs. Second, I designed a new dynamic race detection algorithm for task-parallel programs with promises. A promise is a construct that can be used to support arbitrary point-to-point synchronization. The race detection algorithm also applies to programs that use more restricted parallel constructs, as long as those constructs can be implemented using promises. The implementation of the race detector, together with the several optimizations I introduced, incurs a slowdown comparable to that of previous race detectors that do not support promises. Third, I built a tool that visualizes data races found in task-parallel programs. The visualizer consists of a graph builder and a visualization interface. Most previous work studied only the visualization of performance bottlenecks; my tool is the first of its kind to visualize correctness issues detected in parallel programs. I also conducted a performance evaluation and an efficacy survey to demonstrate the usefulness of the tool. Finally, I designed and implemented a closed-loop application that fixes data races with the help of a large language model (LLM). The application uses previously fixed data races as few-shot examples and asks ChatGPT to remove the data races found in a program. The solution proposed by ChatGPT is then checked again for correctness before a new commit is created to notify engineers.
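As a rough illustration of the point-to-point synchronization that a promise provides, the sketch below uses Python's `concurrent.futures.Future` as a stand-in for a promise. This choice, along with the `run_with_promise` function and the `shared` dictionary, is an assumption made for illustration; the abstract does not state which language or runtime the dissertation targets. The producer writes shared data and then fulfills the promise, and the consumer waits on the promise before reading, so the write is ordered before the read and the two accesses do not race.

```python
import threading
from concurrent.futures import Future


def run_with_promise():
    promise = Future()   # stand-in for a promise (assumption of this sketch)
    shared = {}          # shared memory location accessed by both tasks
    seen = []            # values observed by the consumer

    def producer():
        shared["value"] = 42       # write happens before the promise is fulfilled
        promise.set_result(None)   # fulfill the promise

    def consumer():
        promise.result()               # block until the promise is fulfilled
        seen.append(shared["value"])   # read is ordered after the write

    # Start the consumer first to show that the ordering does not
    # depend on which thread happens to run first.
    threads = [threading.Thread(target=consumer),
               threading.Thread(target=producer)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen


result = run_with_promise()
```

Without the call to `promise.result()`, the consumer's read would be concurrent with the producer's write, which is exactly the situation a race detector needs to report.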