Today, most popular applications and websites, such as Google and Facebook, are multi-language systems. Google is developed with C, C++, Go, Java, Python, and JavaScript, while Facebook uses Hack, PHP, Python, C++, Java, Erlang, D, Haskell, and JavaScript. Furthermore, previous studies reported that PHP developers regularly use two languages besides PHP. Java developers also combine C/C++ with Java code through the Java Native Interface (JNI), which allows Java methods to call native functions and native functions to call Java methods. Moreover, Android application developers prefer using the Android NDK, which allows C/C++ code to be used in Android applications, alongside Java rather than using Java alone. In all these examples, we observe the phenomenon of multi-language development. While in many cases there is one clear main language (e.g., Java or C/C++) with smaller contributions from other languages (e.g., bash or make), a growing number of modern software systems are heterogeneous, in the sense that they are composed of multiple programming languages that interact with the host language to a large extent. Multi-language development is a good practice for software development because it takes advantage of existing libraries written in other programming languages (code reuse), which saves development time and project budget. However, because multi-language development combines languages with different semantics and lexical rules, it complicates code comprehension and, especially, code maintenance, i.e., dependency tracking, data synchronisation between the different components, communication between the combined languages, exception management, and change impact analysis. Not all of these challenges are discussed in the literature. According to the literature, developers' major concerns are change impact analysis and its relation to component dependencies, and the appropriate choice of multi-language design patterns.
Hence, in this thesis, we first take a step back to better understand how and why developers opt for multi-language development, before evaluating the challenges related to this kind of development and their impact on software quality and security. To this end, we conducted several qualitative and quantitative empirical studies on large open-source multi-language software systems.