Abstract: Pair Programming (PP) has a long history both in the software industry and education. More recently, specially designed environments have made the application of Distributed Pair Programming (DPP) possible, which enables two programmers to work remotely. Through these collaborative activities, students produce better programs, improve their performance and programming skills, and increase their self-confidence. Student attitudes towards Distributed Pair Programming and the factors that affect them remain largely unexplored, while some of the existing studies have yielded mixed results. One important aspect is to understand the underlying factors that contribute to a successful pairing formation, i.e., factors that make pairs very compatible. This paper focuses on the examination of possible factors which we felt had the potential to affect the compatibility of student pairs who worked remotely. The present study was conducted in the context of a 3rd semester undergraduate "Object-Oriented Programming" course. The OOP concepts were approached through hands-on exercises completed in the lab sessions. Students carried out projects in pairs using the educational DPP system SCEPPSys. The analyzed data were collected from a pre-questionnaire and a post-questionnaire distributed to students before the course and at its end, respectively. Pair Compatibility was examined in relation to pair perceived skill level, pair actual skill level, and pair programming self-esteem. Besides this, we examined whether students' perceptions of the factors they believe hinder collaboration differ on the basis of their compatibility. The findings indicated that the compatibility rating differed significantly based on the partner's perceived technical competence. Also, students who rated their partners as very compatible had actual skill levels more similar to their partners' than those students who rated their partners as not compatible or satisfactorily compatible. We did not find any relationship between compatibility and pair programming self-esteem. Lastly, very compatible pairs rated the following three factors as hindering collaboration less negatively than not compatible or satisfactorily compatible pairs: a) coordination problems (collaboration time), b) unreliable partner, and c) lack of partner knowledge.
Keywords: distributed pair programming, pair compatibility, OOP course, programming skills
1. Introduction
The growth of the internet has facilitated communication and collaboration among distributed teams, and as a result various technologies have been developed that support multiple users and real-time collaborative activities. Nowadays, it is common to use applications that allow concurrent document editing, file sharing, project management, or even software development. One such form of remote software development is Distributed Pair Programming (DPP). It is performed using a system that allows team members to communicate, to coordinate actions, and to write code using a shared file repository and a shared editor (Schümmer and Lukosch, 2009). DPP is practiced by both professional and student programmers, the latter being the focus of our study.
DPP is based on the principles of Pair Programming (PP), an agile software development technique and one of the key practices of Extreme Programming (XP). PP consists of a pair of programmers working together at one computer (Beck and Gamma, 2000). One programmer writes the program code and the other reviews it as it is written. The two co-workers switch roles regularly. The aim of this practice is to improve software quality and share coding skills. A survey conducted in 2017 by the website StackOverflow revealed that PP is quite popular in the software industry: 42.8% of the surveyed developers stated that they use the PP methodology in software development. In education, PP appeared in computer science classes almost two decades ago. The first experiments with PP in the classroom reported positive outcomes. Since then, extensive studies have been conducted, mainly in higher education.
Research suggests that the use of PP in introductory programming courses has a positive impact on student performance and satisfaction (McDowell et al, 2006; Mendes et al, 2006; Smith et al, 2017). It improves software quality and student confidence in programming ability (McDowell et al, 2006; Braught et al, 2011; Celepkolu and Boyer, 2018). Moreover, students share problem-solving skills and responsibilities, and they may work on large-scale projects as professional teams (Schümmer and Lukosch, 2009; Stapel et al, 2010). Despite the benefits, the success of PP depends on pair dynamics and each developer's skills and attitudes (Katira et al, 2004; Chaparro et al, 2005). DPP is as effective as PP (Hanks, 2006) but has a major advantage over it: it is more flexible and allows programmers to collaborate remotely.
The focus of this paper is to investigate compatibility between pair programmers and factors that might affect collaboration at a distance. Pair Compatibility was examined in relation to pair perceived skill level, pair actual skill level, and pair programming self-esteem. Besides this, we examined whether students' perceptions on the factors they believe hinder collaboration differ on the basis of their compatibility.
The paper is organized as follows: In the next section (Section 2) a presentation of related work in the field is given. Then, the methodology of the study and the research objectives are presented (Section 3). Section 4 contains the results of the statistical analysis. A discussion and conclusions follow in the last section (Section 5).
2. Related work
Pair formation and pair compatibility are well-studied factors in the literature of PP. Researchers have experimented with various team formation strategies in order to study their impact on students' participation and motivation. Besides random pairing, pairs have been formed on the basis of factors such as students' personality type and programming skills. In fact, student skill level was shown by a meta-analysis to be the most commonly investigated parameter (Salleh et al, 2011), defined as either actual or perceived. The former is based on students' prior programming experience and academic performance, while the latter is based on students' subjective assessment of their own and their partner's skill levels. Perhaps not surprisingly, most studies came to the conclusion that the best results are achieved by pairing students with similar skill levels.
Williams et al (2006) conducted a number of studies in order to understand the factors that contribute to the compatibility of pair programmers. The pair programming-based courses in which they performed their research were: freshmen Introduction to Programming - Java (CS1), undergraduate (junior/senior) Software Engineering (SE), and graduate Object-Oriented Languages and Systems (OO). They suggest pairing students with similar grades, since students reported that they worked more compatibly with partners of a similar skill level. In the same study, they investigated whether pairs are more compatible when students with similar programming self-esteem are put together. The results showed that students' confidence in their problem-solving skills may be an indicator of pair compatibility. In a study by Van Toll et al (2007), where students formed pairs based on programming skills, it was found that PP is more effective when programmers are of a slightly different skill level. Chaparro et al (2005) observed that a difference in skill level between partners negatively affects their collaboration, thus reaching the conclusion that matching pairs by skill level is a key factor in the success of PP.
Students' self-esteem in programming has been examined in many studies as a principal factor concerning the effectiveness of PP and DPP. Thomas et al (2003) report that students with less self-confidence seem to enjoy pair programming the most. In contrast, Hanks (2008) states that in his study the most confident students liked pair programming the most, while the least confident students liked it the least. Muller and Padberg (2004) define the feel-good factor of a pair as how comfortable the developers feel in a pair session. It should be mentioned that researchers use the terms 'self-esteem' and 'self-confidence' interchangeably.
In our study, pair compatibility was examined in relation to perceived skill level, actual skill level, and programming self-esteem. The main difference between the aforementioned studies and our study is the programming technique used. Instead of co-located PP, the participants in our study collaborated remotely using a DPP system. Research in the field of DPP is less developed and more empirical studies are needed (da Silva Estácio and Prikladnicki, 2015). Canfora et al (2006) performed an experiment to investigate the impact of DPP, and despite the similar programming experience of the participating students, they found that each pair member tended to work alone. They suggest that different levels of programming experience may lead to a more successful collaboration. Another study involving DPP, in which prior programming experience is correlated with a pair's performance, reports that pairs are more compatible when both students have a similar perceived skill level (Tsompanoudi et al, 2018).
3. Methodology of the study
3.1 Research objectives
As already mentioned in the related work section, pair formation and pair compatibility are well-studied factors in the literature of PP and less studied in DPP. Thus, this study focuses on the examination of possible factors which we felt had the potential to affect the compatibility of student pairs who worked remotely. Pair Compatibility was examined in relation to perceived skill level, actual skill level and programming self-esteem.
The following five hypotheses were investigated:
H1: Pairs are more compatible if students with similar perceived skill level are grouped together.
H2: Pairs are more compatible if students with similar actual skill level are grouped together.
H3: Pairs are more compatible if students with similar programming self-esteem are grouped together.
H4: Students' perceptions on the factors that hinder their collaboration differ according to their compatibility.
H5: Students' perceptions on the frequency of use of the system's features differ according to their compatibility.
Hypotheses H1-H3 were adopted from the work by Williams et al (2006: p. 412), where they studied pair compatibility when students worked co-located. We adopt these three hypotheses because we wanted to investigate if the same results hold for DPP as for PP. Besides this, we further investigate hypotheses H4 and H5 which are related to DPP, compatibility and the system used by the students to collaborate remotely.
3.2 Course outline - participants
For the investigation of the above stated hypotheses, a study was carried out in the context of a 3rd semester undergraduate "Object-Oriented Programming" course at an Informatics Department. Data were collected throughout the spring semester of the academic year 2016-17. The course was taught in the lab and hands-on exercises were used for presenting and familiarizing students with OOP concepts. A summarized course outline is presented in Table 1.
Within the course context, students carried out five DPP assignments in pairs using the educational DPP system SCEPPSys. Details on the DPP assignments are presented in summary form in Table 2.
SCEPPSys (Tsompanoudi et al, 2015) is an educational DPP system that comprises an Eclipse plugin installed by students and a web-based authoring tool used by instructors for scripting DPP. In order to start a DPP session, both students must log in to the system, while assignments are solved synchronously. SCEPPSys includes the following categories of features:
* Typical features of DPP systems: a shared editor; synchronization of editors after connection problems; support for the roles of the driver and navigator and role switching either by the system or freely; a text-based chat tool for communication; remote code highlighting (a basic gesturing feature) that enables the navigator to point out code parts in order to indicate potential problems to the driver; synchronized program execution (the driver executes the project and both the driver and the navigator watch execution results).
* "Awareness indicator" features, whose aim is to provide pair programmers with information about the user's status and performed actions within the workspace, such as editing, saving etc., as well as their participation rates.
* Unique didactic features that serve specific needs: assignments consist of small and manageable tasks or steps associated with specific didactic goals or OOP concepts; for each task, hints can be retrieved that help students complete it.
3.3 Measurement instrument and data analysis
The meanings and measures of the factors in our study are given below.
Actual Skill Level is measured as the absolute difference between the partners' grades, where the grade is made up of the mean from the 3 courses (Algorithms, Procedural Programming, and OOP-Java). "Procedural Programming (C programming language)" and "Algorithms in C" are introductory first year courses and their syllabi and assignments are quite typical of Universities around the world. In the literature, actual skill level is based on students' prior programming experience and academic performance. In the present study, actual skill level was measured in the same way as in the research by Williams et al (2006).
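To make the measure concrete, the following minimal Python sketch computes the absolute difference between two partners' mean grades over the three courses. The function name `actual_skill_gap` and the sample grades are hypothetical illustrations and are not taken from the study's data.

```python
from statistics import mean


def actual_skill_gap(grades_a, grades_b):
    """Absolute difference between the partners' mean course grades.

    Each argument holds one student's grades in the three courses
    (Algorithms, Procedural Programming, OOP-Java).
    """
    return abs(mean(grades_a) - mean(grades_b))


# Hypothetical grades for two partners (not data from the study).
student_a = [7.0, 8.0, 6.0]  # mean 7.0
student_b = [5.0, 6.0, 7.0]  # mean 6.0
print(actual_skill_gap(student_a, student_b))
```

A smaller value indicates partners whose actual skill levels are closer to each other.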
Programming self-esteem is measured as the absolute difference between the partners' self-esteem about Programming. Self-esteem in programming is measured by the students themselves on a scale from 1 to 9 prior to the beginning of the OOP course. Students were asked to answer question (Q)1 on the pre-questionnaire.
Question (Q)1. Place yourself on a 1 to 9 scale with the following endpoints:
1 = I don't like programming, and I think I am not good at it. I can write simple programs, but have trouble writing new programs for solving new problems.
9 = I have no problems at all completing programming tasks to date, in fact they weren't challenging enough. I love to program and anticipate no difficulty with this course.
In their studies on Pair Programming, Thomas et al. (2003: p. 364) and Williams et al. (2006: p. 417) posed Q1 in order to measure students' self-esteem on their ability to program. Consequently, although one might argue that Q1 asks how challenging the assignments were, and how much students like programming, it should be noted that given the specific context, the question deals mainly with students' confidence in their ability to program.
Perceived Skill Level (partner): Students were asked to provide their perception of their partner's technical competence with regard to their own competence [Better, About the same, Weaker] as a response to question (Q)2 on the post-questionnaire.
Question (Q)2. Assess the technical competency of your partner in relation to yourself [Better, About the same, Weaker].
In the literature, perceived skill level is based on students' subjective assessment of their own and their partner's skill level. Question (Q)2 was adapted from similar studies conducted in the context of PP. Muller and Padberg (2004) used Q2 to measure how comfortable students felt during the pair programming session, and the same question was used by Williams et al (2006) to ask students to evaluate their partner's technical competence.
Compatibility was based on the student's perceived compatibility. Each student evaluated, on a scale from 1 (not-compatible) to 3 (very compatible), how compatible they felt with their partner regarding the latter's programming ability (question (Q)3 on the post-questionnaire).
Question (Q)3. Assess how compatible you felt you and your partner were [Very Compatible, Satisfactorily Compatible, Not Compatible].
Williams et al (2006) asked students question (Q)3 in order to evaluate their perception of their joint compatibility.
In the present study, Pair Compatibility was based on students' responses to Q3: when both students chose Not Compatible to characterize their partner, the pair was classed as a Not Compatible pair; when both students chose Very Compatible, the pair was classed as a Very Compatible pair; and all other pairs were classed as Satisfactorily Compatible.
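The classification rule above can be sketched in a few lines of Python. This is only an illustration of the rule as stated; the function name `classify_pair` and the string constants are assumptions introduced here, not part of the study's tooling.

```python
VERY = "Very Compatible"
SATISFACTORY = "Satisfactorily Compatible"
NOT = "Not Compatible"


def classify_pair(rating_a: str, rating_b: str) -> str:
    """Derive the pair-level compatibility from both partners' Q3 answers."""
    for rating in (rating_a, rating_b):
        if rating not in (VERY, SATISFACTORY, NOT):
            raise ValueError(f"unknown Q3 answer: {rating!r}")
    if rating_a == rating_b == VERY:
        return VERY
    if rating_a == rating_b == NOT:
        return NOT
    # Every mixed combination, and two "satisfactory" answers, end up here.
    return SATISFACTORY


print(classify_pair(VERY, NOT))
```

Under this rule a pair is placed in one of the two extreme classes only when both partners agree on that extreme.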
The analyzed data were gathered from:
* The grades achieved in the three courses: the first-year introductory courses "Procedural Programming (C programming language)" and "Algorithms in C", and the "OOP" course.
* The students' answers to Q1, given prior to the OOP course (pre-questionnaire).
* The students' answers to a questionnaire on their attitudes towards DPP, given as a Google form on completion of the DPP assignments at the end of the semester (post-questionnaire). The students were asked to evaluate: their perception of their partner's skill level (Perceived Skill Level) (Q2); their joint compatibility (Compatibility) (Q3); factors hindering collaboration (Q4); and lastly, they were asked to rate the frequency of use of the system's features which aimed to facilitate DPP (Q5).
The following questions (Q)4 and (Q)5 were included on the post-questionnaire:
Q4. What factors hindered collaboration?
(1=very much, 2=much, 3=averagely, 4=a little, 5=not at all)
Coordination problems (collaboration time)
Unreliable partner
Lack of partner knowledge
Dominating role of partner
Technical problems
Difficulty in using the plugin
Q5. Rate the frequency of use of the system's features which aimed to facilitate DPP over Eclipse.
(1=not at all, 2=a little, 3=averagely, 4=much, 5=very much)
Role switches
Synchronization in the execution of a program
Retrieve help (hint) of the solution for every programming task
Remote code highlighting
Display participation rates
Synchronization of editors
Out of the 88 students, the statistical analysis was carried out on 78, as these students answered both questionnaires.
We should point out that the few students who rated their partner as not compatible (1%) were combined with those students who rated their partner as satisfactorily compatible. Combining these two groups of students has a reasonable basis and allows for a more reliable application of statistical tests.
Statistical analysis was performed using IBM SPSS Statistics. The Chi-square Test of Independence was applied for H1, while the Mann-Whitney test was used for the remaining hypotheses.
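For readers unfamiliar with the Mann-Whitney test, the sketch below shows how its U statistic and the corresponding Z score can be computed for two independent groups. It is a simplified illustration, not the SPSS procedure used in the study: it uses the plain normal approximation without tie or continuity correction, so values from a statistical package may differ slightly. The function names and the sample values are hypothetical.

```python
from math import sqrt


def mann_whitney_u(x, y):
    """U statistic of sample x against sample y (a tie counts one half)."""
    u = 0.0
    for xi in x:
        for yj in y:
            if xi > yj:
                u += 1.0
            elif xi == yj:
                u += 0.5
    return u


def z_score(u, n_x, n_y):
    """Normal approximation of U, without tie or continuity correction."""
    mean_u = n_x * n_y / 2
    sd_u = sqrt(n_x * n_y * (n_x + n_y + 1) / 12)
    return (u - mean_u) / sd_u


# Hypothetical grade differences for two groups (not data from the study).
very_compatible = [0.5, 1.0, 1.5, 0.8]
other_pairs = [2.0, 1.2, 2.5, 3.0, 1.9]
u = mann_whitney_u(very_compatible, other_pairs)
print(u, z_score(u, len(very_compatible), len(other_pairs)))
```

A strongly negative Z for the first group indicates that its values tend to be smaller than those of the second group.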
4. Results
A Chi-Square test was performed on H1. The Chi-Square test indicated that the compatibility rating differed significantly (χ²=8.422, df=2, p=0.015) according to the partner's perceived technical competence (partner's assessment). The students who considered their partners weaker than themselves in technical competence tended (90.9%) to rate their partners as satisfactorily compatible or not compatible rather than very compatible (9.1%) (Table 3).
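As a quick consistency check on the reported values: for two degrees of freedom the upper-tail probability of the chi-square distribution has the closed form p = exp(-χ²/2), so the p-value can be recomputed from the statistic alone. The helper name `chi2_p_value_df2` is hypothetical, and the formula holds only for df=2.

```python
from math import exp


def chi2_p_value_df2(statistic: float) -> float:
    """Upper-tail p-value of a chi-square statistic with exactly 2 degrees of freedom."""
    return exp(-statistic / 2.0)


p = chi2_p_value_df2(8.422)
print(f"p = {p:.3f}")  # prints "p = 0.015"
```

The result agrees with the p-value reported for H1.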
A Mann-Whitney test was performed on H2. Students who rated their partners as very compatible had actual skill levels more similar to their partners' (Z=-2.922, p=0.003) than those who rated their partners as satisfactorily compatible or not compatible.
The absolute difference between the partners' grades was lower for students who rated their pairs as very compatible (M=1.3, SD=1.2) than for students who rated their pairs as satisfactorily compatible or not compatible (M=2.1, SD=1.3).
A Mann-Whitney test was performed on H3. The absolute difference in the partners' programming self-esteem did not differ significantly (Z=-1.385, p=0.166) between students who rated their partners as very compatible and those who rated them as satisfactorily compatible or not compatible.
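The relation between the reported Z statistics and p-values can be illustrated with the standard normal distribution. The sketch below assumes that the reported p-values are two-sided asymptotic values, which the paper does not state explicitly; the function name `two_sided_p` is hypothetical.

```python
from math import erfc, sqrt


def two_sided_p(z: float) -> float:
    """Two-sided p-value of a Z score under the standard normal distribution."""
    return erfc(abs(z) / sqrt(2.0))


# Z values reported for H2 and H3.
for label, z in (("H2", -2.922), ("H3", -1.385)):
    print(label, round(two_sided_p(z), 4))
```

Under this assumption the computed values come out at approximately 0.003 for H2 and 0.166 for H3, in line with the figures given in the text.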
A Mann-Whitney test was performed on H4. Students' responses to Q4 are presented in Table 4. As might well be expected, very compatible students evaluated the following three factors less negatively than the others: coordination problems, unreliable partner, and lack of partner knowledge.
A Mann-Whitney test was performed on H5. Table 5 summarizes the students' answers on the frequency of use of the system's features. For most of the features (collaboration features), the results did not reveal any statistically significant difference between the two groups. The only significant difference was found for the "remote code highlighting" feature, which was used more by very compatible pairs than by the other subgroup of pairs (not compatible and satisfactorily compatible pairs) (p=0.024).
5. Discussion - conclusions
This study examined possible factors that we felt had the potential to affect the compatibility between students who worked in pairs in an OOP course using a DPP environment. Based on our findings, some conclusions were drawn, which contribute to the enhancement of knowledge in the literature.
Since the study findings suggest that students with a similar perceived skill level form more compatible pairs, they support H1. This agrees with the study on PP by Williams et al (2006), whose results showed that a significantly positive relationship exists between compatibility and the partner's perceived skill level.
As regards compatibility and the partner's actual skill level, which provides more objective information than perceived skill level, our findings indicate that students with a similar actual skill level rated their partners as very compatible, which supports H2. Williams et al (2006), who measured actual skill level differently from us (as the absolute differences in the partners' midterm grades, SAT, GRE, and overall GPA), did not obtain the same results: they were generally unable to support the hypothesis about compatibility and actual skill level, stating that it is not feasible for an instructor to proactively match students based on available measures of skill (midterm, SAT, GRE, GPA), except for the use of midterm and SAT in the Software Engineering (SE) class.
In the present study, we also examined whether students with similar programming self-esteem form more compatible pairs (H3); this hypothesis, however, was not supported by the findings. Williams et al (2006), on the other hand, found that this hypothesis was partially supported and that students' confidence in their problem-solving abilities may be an indicator of pair compatibility.
Besides the three main hypotheses motivated by similar studies on PP, students were asked to evaluate factors mentioned in the literature that hinder collaboration, which in the present study were examined in relation to the degree of student compatibility. The interesting finding was that very compatible students rated three factors (coordination problems (finding common time for collaboration), unreliable partner, and lack of partner knowledge) significantly less negatively than the subgroup of satisfactorily compatible and not compatible pairs. Two other factors that all pairs rated as existing to a moderate degree were difficulty in using the plugin and technical problems.
Finally, irrespective of their compatibility level, students used the features of the DPP system with almost the same frequency. Only the "remote code highlighting" feature was used more frequently by very compatible pairs than by the other pairs, which shows that the navigator was active during code development and is an indicator that high compatibility supports collaboration.
It appears that our findings concerning student compatibility are in accordance with most of the findings of similar studies on PP, which is encouraging as DPP is more demanding than PP. In this study, SCEPPSys, the educational system developed for a typical undergraduate OOP course, promotes student collaboration and balanced participation in DPP. Clearly, the results of the present study, which used SCEPPSys and a specific set of assignments, cannot be generalized to all DPP educational settings. Despite the limitations, this study adds to the body of literature on DPP, since compatibility and pair formation are issues that need further research in order to formulate reliable conclusions.
Acknowledgements
This research is funded by the University of Macedonia Research Committee as part of the "Principal Research 2019" funding program.
References
Beck, K. and Gamma, E. (2000). Extreme programming explained: embrace change. Addison-Wesley Professional.
Braught, G., Wahls, T. and Eby, L. M. (2011). The case for pair programming in the computer science classroom. ACM Transactions on Computing Education (TOCE), 11(1), 2.
Canfora, G., Cimitile, A., Di Lucca, G. A. and Visaggio, C. A. (2006). How distribution affects the success of pair programming. International Journal of Software Engineering and Knowledge Engineering, 16(02), 293-313.
Celepkolu, M. and Boyer, K. E. (2018). Thematic analysis of students' reflections on pair programming in CS1. In Proceedings of the 49th ACM Technical Symposium on Computer Science Education (pp. 771-776). ACM.
Chaparro, E. A., Yuksel, A., Romero, P. and Bryant, S. (2005). Factors Affecting the Perceived Effectiveness of Pair Programming in Higher Education. In PPIG (p. 2).
da Silva Estácio, B. J. and Prikladnicki, R. (2015). Distributed pair programming: A systematic literature review. Information and Software Technology, 63, 1-10.
Hanks, B. (2006). Student attitudes toward pair programming. In ACM SIGCSE Bulletin (Vol. 38, No. 3, pp. 113-117). ACM.
Hanks, B. (2008). Empirical evaluation of distributed pair programming. International Journal of Human-Computer Studies, 66(7), 530-544.
Katira, N., Williams, L., Wiebe, E., Miller, C., Balik, S. and Gehringer, E. (2004). On understanding compatibility of student pair programmers. In ACM SIGCSE Bulletin (Vol. 36, No. 1, pp. 7-11). ACM.
McDowell, C., Werner, L., Bullock, H. E. and Fernald, J. (2006). Pair programming improves student retention, confidence, and program quality. Communications of the ACM, 49(8).
Mendes, E., Al-Fakhri, L. and Luxton-Reilly, A. (2006). A replicated experiment of pair-programming in a 2nd-year software development and design computer science course. ACM SIGCSE Bulletin, 38(3), 108-112.
Muller, M. M. and Padberg, F. (2004). An empirical study about the feelgood factor in pair programming. In 10th International Symposium on Software Metrics, 2004. Proceedings (pp. 151-158).
Salleh, N., Mendes, E. and Grundy, J. (2011). Empirical studies of pair programming for CS/SE teaching in higher education: A systematic literature review. IEEE Transactions on Software Engineering, 37(4), 509-525.
Schümmer, T. and Lukosch, S. G. (2009). Understanding tools and practices for distributed pair programming. Journal of Universal Computer Science, 15(16).
Smith, M. O., Giugliano, A. and DeOrio, A. (2017). Long Term Effects of Pair Programming. IEEE Transactions on Education, 61(3), 187-194.
Stapel, K., Knauss, E., Schneider, K. and Becker, M. (2010). Towards understanding communication structure in pair programming. In International Conference on Agile Software Development (pp. 117-131).
Thomas, L., Ratcliffe, M. and Robertson, A. (2003). Code warriors and code-a-phobes: a study in attitude and pair programming. In ACM SIGCSE Bulletin (Vol. 35, No. 1, pp. 363-367). ACM.
Tsompanoudi, D., Satratzemi, M., Xinogalos, S. and Karamitopoulos, L. (2018). An Empirical Study on Pair Performance and Perception in Distributed Pair Programming. In International Conference on Interactive Collaborative Learning (pp. 762-771). Springer.
Tsompanoudi, D., Satratzemi, M. and Xinogalos, S. (2015). Distributed Pair Programming Using Collaboration Scripts: An Educational System and Initial Results. Informatics in Education, 14(2), 291.
Van Toll, T., Lee, R. and Ahlswede, T. (2007). Evaluating the usefulness of pair programming in a classroom setting. In 6th IEEE/ACIS International Conference on Computer and Information Science (ICIS 2007) (pp. 302-308).
Williams, L., Layman, L., Osborne, J. and Katira, N. (2006). Examining the compatibility of student pair programmers. In AGILE 2006 (AGILE'06) (pp. 411-420). IEEE.
Copyright Academic Conferences International Limited Nov 2019