Serials Solutions was proud to sponsor DataFest 2013, a data analysis competition in which undergraduate students from Duke University, North Carolina State University and the University of North Carolina at Chapel Hill tackled complex datasets as a means of solving complex problems.
“Big Data,” loosely defined as the ability to gather, analyze, interpret and, most importantly, act on large volumes of data to identify and solve problems, is definitely a hot topic on college campuses and beyond. In a previous blog post we focused on how Big Data analysis is informing library discovery improvements as Serials Solutions analyzes global usage data to inspire and optimize features and functionality of the Summon service.
DataFest gave 52 students (12 teams of 4-5 students each) the chance to work with Big Data and open-ended research questions that go far beyond what they usually encounter in their coursework. The students spent an activity-filled weekend analyzing a large and complex dataset in order to find, and then present, insights into the data in relation to a problem (or set of problems) presented.
The theme of this year's DataFest was online dating, using data shared by eHarmony. The dataset contained information on matches generated by the eHarmony matching algorithm, indicators for the success of these matches (whether the matched users communicated with each other), and other information that users shared on their profiles, including personal and personality characteristics and what they look for in a partner.
Students’ work resulted in fascinating presentations on topics such as using predictive modeling to understand communication between compatible eHarmony matches, and the extent to which people would still communicate with one another if eHarmony relaxed its matching algorithms.
The competition kickoff, presentations and wrap-up took place at SAMSI, the Statistical and Applied Mathematical Sciences Institute, based in Research Triangle Park, North Carolina. SAMSI is a partnership of Duke University, North Carolina State University (NCSU), the University of North Carolina at Chapel Hill (UNC), and the National Institute of Statistical Sciences (NISS). DataFest gave participants a chance to experience firsthand the inherently interdisciplinary nature of statistics and data analysis, and the powerful opportunities that Big Data presents to improve experiences.
As a provider of data-driven Software as a Service solutions, Serials Solutions was proud to support these students in exploring the intricacies of Big Data analysis and the impact it can have on product development.