ABSTRACT
Graduate programs in data analytics differ not only in content but also in how they frame the purpose of analytics education. This study investigates whether business and non-business programs exhibit distinct curricular framing orientations. It hypothesizes that business programs emphasize outcome-oriented framing, presenting analytics as a tool for decision-making and value delivery, while non-business programs emphasize mastery-oriented framing focused on technical depth and methodological rigor. To evaluate this distinction, the study analyzes 1,972 course descriptions from 109 graduate programs using natural language processing techniques capable of identifying semantic similarities in instructional language. The results reveal significant differences in how programs describe the role of analytics, with statistical and validation analyses supporting the hypothesized framing distinction. These findings position framing orientation as a structural feature of curriculum design and provide practical guidance for educators, curriculum developers, and academic leaders seeking to align analytics programs with institutional goals and workforce expectations.
KEYWORDS:
Data Analytics Curriculum, Framing Orientation, Graduate Education, Business Analytics Programs, Semantic Modeling, Curriculum Design Patterns, Natural Language Processing, Course Description Analysis
Introduction
As organizations increasingly rely on data to inform decision-making, the demand for professionals who can translate information into actionable insight continues to grow. Graduate programs in data analytics have emerged as a key pipeline for developing this workforce, equipping students with technical capabilities, analytical thinking, and domain-specific knowledge. Among these offerings, business analytics has evolved into a distinct area that integrates quantitative methods with managerial insight to address organizational challenges.
Designing effective data analytics curricula requires more than assembling a list of technical topics or mapping competencies to industry roles. Educators must respond to a complex set of pressures: aligning with accreditation standards, incorporating both business and technical knowledge, managing credit hour limitations, and accommodating varied faculty expertise (Gharehgozli, Gupta, & Seung-Kuk, 2024; Gupta, 2023). These constraints shape not only what is taught but also how programs structure learning experiences and frame the purpose of analytics education.
The challenge is not unique to business programs. Fields such as healthcare, engineering, and the social sciences also integrate data analytics into their graduate education, often drawing on similar tools and techniques. However, these disciplines differ in how they frame the role of analytics within their professional domains. For instance, healthcare analytics frequently emphasizes clinical outcomes and efficiency (Nkwanyana, Mathews, Zachary, & Bhayani, 2023), while engineering analytics may stress systems optimization and technical precision (Luis, Edgar, Alfonso, & Christopher, 2021). Such differences suggest that curricula are shaped by distinct assumptions about what analytics is for and how it should be applied.
This study argues that these differences are not limited to content but extend to how learning is framed. Specifically, it hypothesizes that business-oriented programs emphasize outcome-oriented framing, presenting analytics as a means to inform decisions, deliver value, and drive strategic outcomes. In contrast, non-business programs tend to emphasize mastery-oriented framing, portraying analytics as a methodological discipline focused on technical rigor and computational depth.
To explore this hypothesis, the study uses course descriptions as the unit of analysis. These short texts are often overlooked, yet they offer standardized and comparable signals of a program's instructional priorities and framing logic. While prior research has often focused on identifying what topics are taught using methods such as topic modeling or keyword analysis, this study takes a different approach. It examines how course descriptions construct meaning through semantic similarity analysis. This shift from topic-based to semantic-based analysis allows for the detection of structural patterns in how curricula are organized and how educational goals are communicated.
The study applies transformer-based language modeling and density-based clustering to analyze 1,972 course descriptions from 109 graduate data analytics programs across both business and non-business domains. By modeling the semantic space of these descriptions, it uncovers clusters of meaning that reflect recurring framing structures, referred to here as Curricular Design Patterns. These structural patterns emerge from recurring semantic similarities and capture how institutions describe the role of analytics within their programs. Rather than reflecting only topical content, they offer insight into how programs frame analytics as an educational pursuit, either as a tool for delivering outcomes or as a discipline rooted in technical mastery. These patterns are not predefined by taxonomies or competencies but emerge from how institutions describe the purpose and structure of analytics education.
This study investigates whether curricular framing differs by disciplinary context. Specifically, it asks: Do graduate-level business and non-business data analytics programs differ in their curricular framing orientation, with business programs emphasizing outcomes and non-business programs emphasizing mastery?
By addressing this question, the study offers a new lens for understanding how disciplinary context shapes not only the content of analytics education but also its underlying educational intent. It highlights framing orientation as a structural feature of program design that can influence learning outcomes, professional preparation, and interdisciplinary coherence. The findings also provide practical value for curriculum designers and academic leaders by clarifying how programs can differentiate themselves, align course structures with institutional goals, and respond more effectively to industry expectations.
Literature Review
As graduate programs in data analytics continue to expand, researchers have examined how curricula are developed, structured, and aligned with workforce needs. Much of the existing literature emphasizes the integration of technical and business competencies, as well as the use of frameworks to guide program design. However, relatively few studies have explored how different disciplines frame the purpose and role of analytics education. This distinction between what is taught and how learning is framed becomes especially important when comparing business and non-business programs. Although programs may draw on similar tools and techniques, they often differ in how they describe the goals and applications of analytics (Collier & Powell, 2024).
To contextualize this study's focus on framing orientation, this section reviews three streams of relevant literature. Section 2.1 summarizes the growth of graduate data analytics programs and how curricular priorities differ across academic domains. Section 2.2 explores how competency frameworks shape curriculum design and highlights the importance of instructional framing. Section 2.3 reviews methodological approaches to analyzing curricular content and presents the case for semantic modeling as a more direct way to evaluate how learning goals are expressed through language.
Graduate Analytics Education Across Domains
The growth of graduate programs in data analytics reflects increasing demand for professionals capable of transforming data into actionable insights across sectors. According to the Institute for Advanced Analytics (n.d.), the number of master's programs in analytics and data science in the United States has increased to more than 300, producing 20,000 graduates annually. This expansion has driven efforts to balance technical proficiency, domain acumen, and responsiveness to workforce needs.
Business schools have played a central role in this growth by embedding analytics into their curricula to meet employer expectations and accreditation standards (Clayton & Clopton, 2019; Gharehgozli et al., 2024). Many institutions begin with graduate certificates before developing full degree programs (Bacic, Jukic, Malliaris, Nestorov, & Varma, 2023). These programs often form partnerships with departments in computer science, statistics, or information systems to ensure technical depth and curricular flexibility (Mitchell, Woolridge, & Johnson, 2021; Verma, Yurov, Lane, & Yurova, 2019). In institutions without standalone programs, analytics content may be embedded within electives or business concentrations (King, 2022).
As programs evolve, researchers have examined how disciplinary context shapes curricular emphasis. Aasheim, Williams, Rutner, and Gardiner (2015) compared undergraduate data analytics and data science programs and found that analytics curricula in business schools often emphasize applied decision-making, tool evaluation, and communication. In contrast, data science programs tend to prioritize programming, algorithm development, and advanced mathematics. These differences reflect the disciplinary roots of each program. Analytics is commonly grounded in applied business settings, while data science is more often situated in technical academic disciplines. Almgerbi, De Mauro, Kahlawi, and Poggioni (2022) extended this inquiry through a review of job postings and course offerings. Their findings revealed persistent gaps in areas such as data engineering, suggesting that institutional capabilities and disciplinary traditions may shape curriculum design more than direct employer feedback.
Practitioner literature supports this distinction. Provost and Fawcett (2013b) argue that the ability to think analytically about data, rather than simply operate tools, is essential for informed decision-making. In a related article, they differentiate data science as a discipline focused on creating generalizable models for extracting knowledge from data. Business analytics, in contrast, is described as the application of those models to solve domain-specific problems (Provost & Fawcett, 2013a). These perspectives suggest that curricula differ not only in content, but also in how programs define the purpose of analytics education. Some frame analytics as a tool for decision support, while others emphasize its role as a technical and methodological discipline.
Although these studies highlight meaningful differences in curricular emphasis, few have examined how those differences are expressed in the language used to describe learning goals and instructional intent. This study contributes to that conversation by focusing on how business and non-business programs differ in framing orientation, defined here as the way programs communicate the purpose, application, and intended outcomes of analytics education through course descriptions.
Framing Orientation in Curriculum Design
Designing analytics curricula involves balancing multiple demands, including technical rigor, domain relevance, pedagogical clarity, and responsiveness to labor market needs. To manage these demands, many programs adopt competency frameworks that outline the knowledge, skills, and abilities students are expected to acquire. These frameworks, developed in both academic and industry settings, serve as high-level guides for curriculum planning and course development.
Numerous studies have identified core competencies required for success in analytics-related roles. For example, Collier and Powell (2024) emphasize skills such as data management, statistical reasoning, and structured problem-solving. Others argue for the inclusion of business understanding, ethical reasoning, and communication alongside technical training (Bacic et al., 2023; Gupta, 2023). Competency models help programs define what students should learn and are often mapped to learning outcomes that guide both instructional planning and assessment.
However, several studies have also highlighted the importance of how competencies are delivered, not just what they include. Mitchell et al. (2021) found that business analytics professionals are expected to demonstrate communication skills, adaptability, and contextual awareness in organizational settings. These expectations influence curricular strategies that prioritize applied formats such as storytelling, concise writing, and audience-specific messaging. Mitchell and colleagues also note that real-time responsiveness, interpersonal credibility, and the ability to frame insights within business contexts are crucial. These findings suggest that curriculum design must address both the substance of instruction and the way learning is contextualized and practiced.
Competency-based models provide a useful foundation, but they offer limited insight into how learning is framed across an entire program. This limitation becomes especially important when comparing curricula across disciplines. For example, business programs frequently use case studies and applied decision-making scenarios. In contrast, programs rooted in science, engineering, or computing often focus on algorithmic development, simulation, or systems design (Aasheim et al., 2015; Provost & Fawcett, 2013b). These differences suggest distinct framing orientations. One may emphasize decision support and strategic outcomes, while the other may emphasize methodological mastery and computational sophistication.
This study adopts a structural perspective that focuses on these framing choices. Rather than evaluating whether a program includes certain competencies, it examines how course descriptions communicate the purpose of learning. This emphasis on framing orientation moves beyond content coverage to consider the educational intent behind a curriculum. By analyzing the language of course descriptions, the study identifies recurring patterns that reveal whether a program prioritizes applied outcomes, technical mastery, or both.
Understanding framing orientation in this way adds an important layer to curriculum design. It helps explain not only what is taught, but also how that content is positioned. These framing choices reflect deeper institutional values and disciplinary identities, which in turn shape how students prepare for professional roles. By surfacing these patterns, the study supports more intentional curriculum development and offers a new lens for comparing programs across domains.
Capturing Curricular Framing
Most prior research on analytics curricula has focused on identifying which topics are taught. Methods such as keyword analysis and topic modeling have helped surface common tools, techniques, and content themes. Topic modeling approaches, including Latent Dirichlet Allocation (LDA) and BERTopic, group documents based on patterns of word co-occurrence (Grootendorst, 2022; Karadaǧ, Parim, & Büyüklü, 2023; Nkwanyana et al., 2023). These techniques are effective for detecting recurring subject matter across programs, particularly when the goal is to map topical coverage or identify thematic clusters.
However, topic modeling is limited in its ability to capture how programs frame the purpose of learning. A course that discusses regression, simulation, or time series analysis may be grouped with others based on shared terminology, even if its instructional emphasis is entirely different. One course might describe these techniques as tools for supporting managerial decision-making, while another might focus on algorithmic implementation or theoretical precision. Topic models treat both as similar, even though they differ in intent. To address this limitation, the current study uses semantic modeling, which focuses on the contextual meaning of language rather than its frequency. Transformer-based models such as DistilBERT encode each course description into a high-dimensional vector that reflects how words function within their surrounding context. These semantic embeddings enable comparison of course descriptions based on overall meaning, making it possible to detect whether they frame analytics as outcome-driven or mastery-focused, even when they contain overlapping technical content (Silva Barbon & Akabane, 2022).
This distinction is critical. Topic models infer latent thematic structures based on probabilistic relationships between terms. Resulting topics are abstract constructs that must be interpreted post hoc. In contrast, semantic embeddings are direct representations of meaning, generated through pre-trained language models that preserve contextual nuance. Rather than estimating what themes are present, semantic modeling allows for the direct measurement of similarity between how different programs describe analytics education.
This method aligns more directly with the study's goal: to assess differences in framing orientation across business and non-business analytics programs. By modeling course descriptions in semantic space and applying clustering techniques, the analysis identifies groups of courses that reflect similar instructional intent. These clusters, referred to here as Curricular Design Patterns (CDPs), are not predefined categories. Instead, they emerge as groupings from the structure of the semantic space, offering a lens into how different institutions organize and communicate the purpose of analytics education.
This approach offers a novel contribution to curriculum analysis by combining semantic modeling with unsupervised clustering to reveal patterns not evident through topic modeling or manual content review. It allows the study to move beyond listing what is taught and instead uncover how programs position the role of analytics within their broader educational goals.
Methodology
This study analyzed graduate-level data analytics programs to evaluate whether business and non-business curricula differ in how they frame the purpose of analytics education. The analysis followed a multi-step workflow that combined semantic similarity analysis, clustering, and interpretive validation to assess differences in framing orientation. Figure 1 illustrates the framing analysis pipeline, which included data preparation, modeling, and analysis.
Figure 1. Overview of the study's analytic process. The workflow includes two main phases: Data Preparation and Data Analysis. Course descriptions were collected, preprocessed, and embedded using DistilBERT to generate semantic vectors. These embeddings were analyzed to assess semantic similarity between business and non-business programs, identify clusters of similar framing (Curricular Design Patterns), and test alignment with outcome- and mastery-oriented framing.
Data Collection and Preprocessing
Graduate programs were identified using the Institute for Advanced Analytics Program List, which catalogs analytics and data science degrees across the United States. A total of 109 programs were included in the final dataset, representing programs that publicly provided course descriptions through their university websites or online catalogs. Each program was reviewed and classified based on its institutional affiliation. Programs housed within a university's business school or college were categorized as business programs (for example, M.S. in Business Analytics or M.S. in Marketing Analytics). All other programs were categorized as non-business programs, including those offered through departments of computer science, engineering, statistics, or interdisciplinary units (for example, M.S. in Data Science or M.S. in Data Analytics Engineering). Of the 109 programs, 56 were classified as business and 53 as non-business. This classification supported the study's central comparison: whether programs differ in how they frame the purpose and role of analytics education.
Course descriptions were manually collected from these programs, yielding a total of 1,972 entries. Each description was paired with its corresponding course title to preserve context. Texts were converted to lowercase, stripped of extra whitespace, and cleaned of non-informative characters such as percent signs, formatting symbols, HTML artifacts, and course codes. Standard punctuation was retained to preserve sentence structure. The processed descriptions averaged 91 tokens per course (where a token approximates a word or text unit), with a standard deviation of 54.8 and a range from 2 to 450 tokens. Of the 1,972 courses, 964 were associated with business programs and 1,008 with non-business programs. The dataset was nearly balanced between required (922) and elective (1,050) courses. This preprocessing step ensured a standardized text corpus suitable for semantic modeling and comparison.
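The cleaning steps described above can be illustrated with a minimal sketch. The source does not give the exact rules, so the regular expressions, the title-plus-description joining format, and the function name `clean_description` are all assumptions made for illustration rather than the study's actual code.

```python
import re

# Hypothetical pattern for course codes such as "BUAN 6320" or "CS-501".
# It is applied before lowercasing so that ordinary phrases like "top 100"
# are not mistaken for codes. The source does not specify this rule.
COURSE_CODE = re.compile(r"\b[A-Z]{2,5}[ -]?\d{3,4}[A-Z]?\b")
HTML_TAG = re.compile(r"<[^>]+>")
# Characters treated as non-informative here (an assumption): percent signs
# and common formatting symbols. Standard punctuation is left untouched.
SYMBOLS = re.compile(r"[%*#|~^_\\]")
WHITESPACE = re.compile(r"\s+")


def clean_description(title: str, description: str) -> str:
    """Pair a course title with its description and normalize the text."""
    # Joining with ". " is an assumed way of pairing title and description.
    text = f"{title.strip()}. {description.strip()}"
    text = HTML_TAG.sub(" ", text)      # drop HTML artifacts
    text = COURSE_CODE.sub(" ", text)   # drop course codes
    text = SYMBOLS.sub(" ", text)       # drop formatting symbols
    text = WHITESPACE.sub(" ", text).strip()  # collapse extra whitespace
    return text.lower()


example = clean_description(
    "Predictive Analytics",
    "BUAN 6320 <b>Covers</b>   regression, 100% applied.",
)
print(example)
```

Removing course codes before lowercasing is a design choice of this sketch: it lets the pattern rely on capital letters, which reduces false matches on ordinary words followed by numbers.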
Following preprocessing, each course description was transformed into a 768-dimensional semantic vector using DistilBERT, a transformer-based language model from the BERT family (Sanh, 2019). DistilBERT was chosen for its efficiency and its ability to preserve contextual nuance. Unlike earlier word embedding methods that produce static representations (Mikolov, Sutskever, Chen, Corrado, & Dean, 2013; Pennington, Socher, & Manning, 2014), BERT-based models generate context-sensitive embeddings by analyzing the full sentence structure and word relationships (Devlin, Chang, Lee, & Toutanova, 2019). The study used these semantic vectors not only to compare what topics courses covered, but also to examine how those courses were framed, determining whether they emphasized practical application and decision-making or focused on technical depth and methodological rigor.
Semantic Similarity Analysis
To determine whether business and non-business programs differ in how they frame analytics education, the study compared the semantic similarity of course descriptions within and across the two groups. Cosine distance was used to calculate pairwise comparisons between embedded descriptions, resulting in three sets: 1) within business programs, 2) within non-business programs, and 3) between business and non-business programs. These comparisons formed the basis for evaluating whether programs express distinct framing orientations in their course language.
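As a rough illustration of the three comparison sets, the sketch below computes pairwise cosine distances within and between two groups of vectors. It uses only the Python standard library and tiny three-dimensional toy vectors in place of the 768-dimensional embeddings; the function names and the toy data are hypothetical and are not taken from the study.

```python
import math
from itertools import combinations, product
from statistics import mean


def cosine_distance(u, v):
    """Cosine distance: 1 minus the cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def within_distances(vectors):
    """Distances for every unordered pair inside one group."""
    return [cosine_distance(u, v) for u, v in combinations(vectors, 2)]


def between_distances(group_a, group_b):
    """Distances for every cross-group pair."""
    return [cosine_distance(u, v) for u, v in product(group_a, group_b)]


# Toy stand-ins for embedded course descriptions (assumed values).
business = [[1.0, 0.1, 0.0], [0.9, 0.2, 0.1], [1.0, 0.0, 0.2]]
non_business = [[0.1, 1.0, 0.0], [0.0, 0.9, 0.3], [0.2, 1.0, 0.1]]

within_b = within_distances(business)
within_nb = within_distances(non_business)
between = between_distances(business, non_business)

print("within business:    ", round(mean(within_b), 3))
print("within non-business:", round(mean(within_nb), 3))
print("between groups:     ", round(mean(between), 3))
```

With n vectors in one group and m in the other, the within sets contain n(n-1)/2 and m(m-1)/2 pairs and the between set contains n times m pairs, so a vectorized implementation would likely be preferable at the scale of the full corpus.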
This stage addressed the study's central research question by testing the following hypotheses:
* H1a: The average semantic distance between business and non-business courses is greater than the average distance within business courses.
* H1b: The average semantic distance between business and non-business courses is greater than the average distance within non-business courses.
Independent-samples t-tests were used to compare the mean distances across groups. Because high-dimensional sentence embeddings often produce similarity distributions that are not normally distributed, Wilcoxon Rank-Sum tests were also conducted as a non-parametric robustness check (Reimers & Gurevych, 2019). Effect sizes were computed using both Cohen's d and Cliff's Delta to assess the magnitude of observed differences, providing a practical interpretation alongside statistical significance.
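The two effect sizes named above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the study's code: Cohen's d is computed with the pooled-standard-deviation form, Cliff's Delta with a direct pairwise count, and the sample values are invented. The significance tests themselves are available in SciPy as `scipy.stats.ttest_ind` and `scipy.stats.ranksums`; whether the study used those functions is not stated.

```python
import math
from statistics import mean, variance


def cohens_d(x, y):
    """Standardized mean difference using the pooled sample variance."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / math.sqrt(pooled_var)


def cliffs_delta(x, y):
    """P(x > y) minus P(x < y), estimated over all cross pairs."""
    greater = sum(1 for a in x for b in y if a > b)
    less = sum(1 for a in x for b in y if a < b)
    return (greater - less) / (len(x) * len(y))


# Invented cosine distances, for illustration only.
between_group = [0.30, 0.32, 0.35, 0.31]
within_group = [0.20, 0.22, 0.25, 0.21]

print("Cohen's d:    ", round(cohens_d(between_group, within_group), 2))
print("Cliff's Delta:", cliffs_delta(between_group, within_group))
```

The pairwise count in `cliffs_delta` grows with the product of the two sample sizes, so a rank-based formulation would likely be needed for the full sets of pairwise distances.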
In this context, greater cosine distance indicates lower semantic similarity, which reflects more distinct framing of analytics education. Smaller distances suggest that courses share a similar framing orientation, regardless of program type.
Clustering and Curricular Design Pattern Discovery
To evaluate whether business and non-business programs exhibit structural differences in how analytics education is framed, the study used clustering to group semantically similar course descriptions. These clusters, referred to as Curricular Design Patterns (CDPs), represent recurring semantic structures in course descriptions, reflecting patterns in language use rather than in specific topical content. These patterns do not correspond to predefined topics or competencies, but instead emerge organically from the semantic similarities among course descriptions.
Dimensionality reduction was performed using Uniform Manifold Approximation and Projection (UMAP), which reduced the original 768-dimensional DistilBERT embeddings to 150 dimensions. This step preserved local semantic relationships while improving computational efficiency and cluster separation (McInnes, 2018). Clustering was then conducted using HDBSCAN, a density-based algorithm that identifies natural groupings in the data based on local density without requiring the number of clusters to be set in advance (Campello, Moulavi, Zimek, & Sander, 2015).
The clustering process resulted in 33 distinct CDPs. An additional 295 courses (14.96 percent of the total) were classified as noise due to insufficient similarity with other entries. Cluster quality was evaluated using HDBSCAN stability scores, which measure how consistently a cluster is identified across varying density thresholds. The average stability score was 0.92. Of the 33 clusters, 28 exceeded the high-confidence threshold of 0.80, and none fell below 0.50, indicating that the identified clusters were well-defined and robust.
The analysis tested whether business and non-business courses were distributed differently across these clusters by evaluating two hypotheses:
* H2a: Business and non-business courses are not evenly distributed across Curricular Design Patterns.
* H2b: The distribution of business and non-business courses significantly differs within specific Curricular Design Patterns.
To test H2a, a Chi-Square Test of Independence was used to evaluate whether course type (business vs. non-business) was independent of CDP membership. To test H2b, separate chi-square tests were conducted for each cluster to assess whether the observed number of business and non-business courses deviated significantly from expected proportions. A Benjamini-Hochberg correction (Benjamini & Hochberg, 1995) was applied to control for false discovery rate across the 33 comparisons.
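A minimal sketch of the per-cluster testing procedure follows. It assumes that the expected proportions are the overall shares of business and non-business courses in the corpus, which the source does not state explicitly, and the cluster counts are invented. With two categories the test has one degree of freedom, so the p-value can be obtained from the complementary error function without external libraries.

```python
import math


def cluster_chi_square(n_business, n_non_business, p_business):
    """Chi-square goodness-of-fit test for one cluster (1 degree of freedom)."""
    n = n_business + n_non_business
    expected_b = n * p_business
    expected_nb = n * (1.0 - p_business)
    chi2 = ((n_business - expected_b) ** 2 / expected_b
            + (n_non_business - expected_nb) ** 2 / expected_nb)
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Invented (business, non-business) counts for four clusters.
clusters = [(60, 10), (5, 40), (20, 22), (30, 18)]
# Assumed expected share: overall proportion of business courses (964 of 1,972).
p_business = 964 / 1972

raw_p = [cluster_chi_square(b, nb, p_business)[1] for b, nb in clusters]
adj_p = benjamini_hochberg(raw_p)
for counts, p, q in zip(clusters, raw_p, adj_p):
    print(counts, "raw p =", round(p, 4), "adjusted p =", round(q, 4))
```

In the study the correction spans 33 comparisons; four are used here only to keep the example readable.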
Based on the statistical results, each cluster was categorized according to the distribution of course types within it. Clusters that included a significantly higher proportion of business courses were labeled "Business", while those with a significantly higher proportion of non-business courses were labeled "Non-Business". Clusters in which the difference was not statistically significant were classified as "Shared". These designations enabled subsequent interpretation of each CDP's relevance to the framing orientations found in business and non-business programs.
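The labeling rule can be expressed as a short function. The sketch below is illustrative only: the significance threshold of 0.05, the use of the overall business share as the reference proportion, and the function name `label_cluster` are assumptions, since the source does not report these details.

```python
def label_cluster(n_business, n_non_business, adjusted_p, p_business, alpha=0.05):
    """Assign a CDP label from its course counts and adjusted p-value.

    alpha and p_business are assumed parameters, not values from the study.
    """
    if adjusted_p >= alpha:
        # No significant deviation from the expected mix.
        return "Shared"
    share = n_business / (n_business + n_non_business)
    return "Business" if share > p_business else "Non-Business"


# Invented counts and adjusted p-values for three clusters.
overall_share = 964 / 1972
for counts, q in [((60, 10), 0.001), ((5, 40), 0.001), ((20, 22), 0.60)]:
    print(counts, label_cluster(counts[0], counts[1], q, overall_share))
```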
Although the CDPs were generated as structural outputs, their composition and framing can also inform practical decisions about curriculum design, particularly for educators seeking to align program structure with domain-specific learning goals.
CDP Framing
To further explore how Curricular Design Patterns (CDPs) differ in framing orientation, the study conducted a framing review of course descriptions within each cluster. This involved reading the descriptions grouped within each CDP and identifying recurring patterns in language, tone, and instructional emphasis. The review did not attempt to assign formal labels to clusters. Instead, it focused on whether the semantic similarities within each cluster reflected a tendency toward outcome-oriented or mastery-oriented framing. The review suggested a consistent pattern across domains. CDPs predominantly composed of business courses frequently emphasized applied outcomes, decision-making contexts, and organizational relevance. In contrast, CDPs with a higher concentration of non-business courses often highlighted methodological rigor, computational implementation, and statistical foundations. Some CDPs combined elements from both orientations, suggesting hybrid structures that blend practical application with technical depth.
To validate these interpretive patterns, a post hoc semantic similarity analysis was conducted using two hypothetical course descriptions. These descriptions were constructed to hold topical content constant while isolating differences in framing orientation. Both included references to common statistical techniques (e.g., regression, ANOVA, time series), but varied in how they described the instructional focus and intended outcomes. Figure 2 illustrates this structural comparison.
Figure 2. Both hypothetical course descriptions reference the same topical content (center), but differ in how they frame its purpose. The outcome-oriented version emphasizes data-driven decision-making and application in business contexts. The mastery-oriented version emphasizes theoretical depth and implementation using statistical software.
For clarity, the two descriptions are also included below in full, as they form the basis for the semantic comparison:
* Outcome-oriented: Covers statistical methods for solving business problems, including multivariate analysis, regression, ANOVA, categorical data, time series, and simulation. Emphasizes data-driven decision-making and practical application in business contexts.
* Mastery-oriented: Covers statistical methods and advanced theoretical foundations, including multivariate analysis, regression, ANOVA, categorical data, time series, and simulation. Emphasizes implementation using statistical software for parameter estimation and model evaluation.
Each hypothetical description was embedded using the same DistilBERT model applied throughout the study. Cosine similarity scores were then computed between the embedded descriptions and all course descriptions assigned to either business or non-business CDPs. Shared clusters were excluded from this analysis, not because they are unimportant, but to isolate domain-specific framing more clearly, since these clusters contained a mix of both course types and could obscure the patterns under investigation.
This validation step provided a targeted check on the qualitative interpretation. If the outcome-oriented description aligned more closely with business CDPs, and the mastery-oriented description aligned more closely with non-business CDPs, it would suggest that observed semantic differences were not solely a function of content, but reflected underlying differences in curricular framing.
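The logic of this validation check can be sketched as a comparison of mean cosine similarities between a probe vector and two groups of course vectors. The two-dimensional vectors below are invented stand-ins for the embedded hypothetical descriptions and the courses in business and non-business CDPs; the function names are hypothetical as well.

```python
import math
from statistics import mean


def cosine_similarity(u, v):
    """Cosine of the angle between u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_similarity(probe, course_vectors):
    """Average similarity between one probe and every course in a group."""
    return mean(cosine_similarity(probe, v) for v in course_vectors)


# Invented stand-ins for embedded descriptions (assumed values).
business_cdp_courses = [[0.9, 0.1], [0.8, 0.3]]
non_business_cdp_courses = [[0.2, 0.9], [0.1, 1.0]]
outcome_probe = [1.0, 0.2]
mastery_probe = [0.1, 0.9]

for name, probe in [("outcome", outcome_probe), ("mastery", mastery_probe)]:
    to_business = mean_similarity(probe, business_cdp_courses)
    to_non_business = mean_similarity(probe, non_business_cdp_courses)
    print(name, round(to_business, 3), round(to_non_business, 3))
```

The pattern described above would correspond to the outcome-oriented probe scoring higher against the business group and the mastery-oriented probe scoring higher against the non-business group.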
Software and Computational Environment
All analyses were conducted in Python 3.11.10 using Jupyter Notebooks on a Windows 10 system. Semantic embeddings were generated using the Hugging Face Transformers library (v4.38.0) with PyTorch (v2.2.1+cu121). Dimensionality reduction and clustering were performed using UMAP (v0.5.3) and HDBSCAN (v0.8.39), respectively. Visualizations were created with Matplotlib (v3.9.2). Computations were executed on a system equipped with an NVIDIA GeForce GTX 1650 with Max-Q Design GPU. ChatGPT-4o was used to assist with software debugging, code optimization, idea generation, and language refinement. It played no role in data analysis, clustering, or statistical testing. All methodological decisions and analytic procedures were conducted by the researcher.
Results
This section presents the results of the study's two-phase analysis: data preparation and data analysis (see Figure 1). The results are organized around three analytic stages: semantic similarity analysis, clustering of semantically similar descriptions, and validation of framing differences across business and non-business programs. Together, these analyses address the study's central question: Do graduate-level business and non-business data analytics programs differ in their curricular framing orientation, with business programs emphasizing outcomes and non-business programs emphasizing mastery?
The dataset consisted of 1,972 graduate-level course descriptions drawn from 109 programs, including 964 from business programs and 1,008 from non-business programs. Required and elective courses were nearly evenly represented. After text preprocessing, each description was embedded using DistilBERT to produce a 768-dimensional semantic vector that captures contextual meaning. These vectors served as the basis for downstream analyses, including semantic comparison, clustering, and framing validation.
To visualize the semantic landscape of course descriptions, UMAP was used to reduce the 768-dimensional DistilBERT embeddings to two dimensions. As shown in Figure 3, course descriptions from business and non-business programs tend to cluster in distinct regions of semantic space. While some overlap is present, the overall pattern suggests consistent differences in how programs describe analytics content. This visual separation provides an initial indication of potential divergence in framing orientation, which is further explored in the following subsections.
Each point represents a single course description, embedded into a shared semantic space using DistilBERT. Blue circles correspond to business courses, and orange triangles correspond to non-business courses. The spatial separation reflects semantic divergence in how course content is framed across disciplinary contexts.
Semantic Differences Between Programs
To evaluate whether business and non-business programs describe analytics education using semantically distinct language, the study compared pairwise cosine distances between embedded course descriptions across three groups: within business programs, within non-business programs, and between business and non-business programs. Cosine distance served as a measure of semantic difference, where greater distance indicates lower similarity in how course descriptions are framed.
Each of the 1,972 course descriptions was embedded using DistilBERT to generate a 768-dimensional semantic vector. Pairwise cosine distances were then calculated for three comparison sets: business-to-business (929,296 pairs), non-business-to-non-business (1,016,064 pairs), and business-to-non-business (971,712 pairs). These comparisons quantified the degree of alignment in how course content is described across and within program types.
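As a rough illustration of the pairwise comparison, the sketch below computes cosine distances for every pairing of two groups and reproduces the reported pair counts from the group sizes. The counts equal 964 × 964, 1,008 × 1,008, and 964 × 1,008, which is consistent with counting every ordered pairing; whether self-comparisons were later filtered is not stated in the text. The function names and toy vectors are hypothetical.

```python
import math
from itertools import product

def cosine_distance(u, v):
    """Cosine distance = 1 - cosine similarity."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def pairwise_distances(group_a, group_b):
    """Distances for every (a, b) pairing across two groups of vectors."""
    return [cosine_distance(u, v) for u, v in product(group_a, group_b)]

# Group sizes taken from the text.
n_business, n_non_business = 964, 1008
pair_counts = {
    "business-to-business": n_business * n_business,
    "non-business-to-non-business": n_non_business * n_non_business,
    "business-to-non-business": n_business * n_non_business,
}
print(pair_counts)

# Toy vectors (assumption) showing the shape of the computation.
toy_business = [[0.8, 0.2, 0.3], [0.7, 0.1, 0.4]]
toy_non_business = [[0.2, 0.8, 0.3], [0.1, 0.7, 0.4], [0.3, 0.9, 0.2]]
between = pairwise_distances(toy_business, toy_non_business)
print(len(between), sum(between) / len(between))
```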
Summary statistics showed that between-group distances were meaningfully higher than within-group distances. The mean distance between business and non-business courses was 0.310 (SD = 0.124, Median = 0.295), compared to 0.228 (SD = 0.103, Median = 0.207) for business-to-business comparisons and 0.245 (SD = 0.098, Median = 0.230) for non-business-to-non-business comparisons. All three distributions exhibited moderate right skew, with 3-5% of values exceeding the upper bound of the interquartile range.
Statistical tests confirmed that these differences were both significant and meaningful. Independent-samples t-tests showed that between-group distances were significantly greater than within-group distances for both business (t = -498.65, p < 0.0001) and non-business courses (t = -406.79, p < 0.0001). These findings were corroborated by Wilcoxon Rank-Sum tests (W = -482.05 and -383.77, respectively; p < 0.0001), which were used as a non-parametric check because cosine distances derived from high-dimensional embeddings tend to violate assumptions of normality. Effect sizes reinforced the practical relevance of the findings, with a moderate to large effect for business programs (Cohen's d = -0.722; Cliff's Delta = -0.404) and a moderate effect for non-business programs (Cohen's d = -0.579; Cliff's Delta = -0.314).
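The effect sizes can be approximated from the rounded summary statistics alone. The sketch below assumes the standard pooled-standard-deviation form of Cohen's d and uses the reported pair counts as sample sizes; because the inputs are rounded, the results land near, but not exactly on, the reported values. Cliff's Delta needs the raw distances, so it is shown only on toy data. Function names are hypothetical.

```python
import math

def cohens_d_from_summary(mean1, sd1, n1, mean2, sd2, n2):
    """Cohen's d using a pooled standard deviation (assumed formula)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return (mean1 - mean2) / math.sqrt(pooled_var)

def cliffs_delta(xs, ys):
    """Share of pairs with x > y minus share of pairs with x < y."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))

# Within-group vs. between-group distances, using values reported in the text.
d_business = cohens_d_from_summary(0.228, 0.103, 929_296, 0.310, 0.124, 971_712)
d_non_business = cohens_d_from_summary(0.245, 0.098, 1_016_064, 0.310, 0.124, 971_712)
print(round(d_business, 3), round(d_non_business, 3))

# Toy distances (assumption) to show Cliff's Delta on raw values.
toy_within = [0.20, 0.22, 0.25, 0.31]
toy_between = [0.24, 0.30, 0.33, 0.35]
print(cliffs_delta(toy_within, toy_between))
```

The negative sign follows the convention in the text: within-group distances are subtracted from first, so smaller within-group distances yield negative effect sizes.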
Taken together, these results support H1a and H1b, indicating that course descriptions from business and non-business programs differ in how they semantically frame analytics education. While the analysis does not directly reveal the nature of these differences, the observed semantic divergence provides initial evidence that framing orientation may vary systematically between disciplinary contexts.
Structural Differences in Curricular Organization
To evaluate whether course descriptions cluster into structurally distinct patterns aligned with disciplinary context, the study applied density-based clustering to the semantic embeddings. The resulting groupings, referred to as Curricular Design Patterns (CDPs), represent shared framing structures based on how course objectives are described.
The clustering process produced 33 distinct CDPs, while 295 course descriptions (14.96%) were classified as noise due to insufficient similarity with other entries. Cluster quality was evaluated using HDBSCAN's stability scores, which indicate how consistently a cluster appears across different density thresholds. Stability scores range from 0 to 1, with values above 0.80 generally considered high confidence (Campello et al., 2015). The average stability score was 0.92, with 28 of 33 clusters (84.8%) exceeding the 0.80 threshold and none falling below 0.50, suggesting that the clusters were well-defined and robust.
To assess whether these clusters differed by program type, a Chi-Square Test of Independence was conducted. The test revealed a statistically significant association between course type (business vs. non-business) and CDP membership (χ² = 677.16, df = 32, p < 0.0001), supporting H2a and indicating that business and non-business courses were not evenly distributed across clusters.
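To make the test concrete, the sketch below computes the chi-square statistic and degrees of freedom for a cluster-by-program-type contingency table. The toy counts are invented for illustration, and the function name is hypothetical. The study's actual table, with 33 clusters and 2 program types, gives (33 - 1) × (2 - 1) = 32 degrees of freedom, matching the reported value.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom) for a rows-by-columns count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if cluster membership were independent of program type.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof

# Toy table (assumption): rows are clusters, columns are business / non-business counts.
toy_table = [
    [30, 10],  # business-heavy cluster
    [10, 30],  # non-business-heavy cluster
    [20, 20],  # evenly mixed cluster
]
statistic, dof = chi_square_independence(toy_table)
print(statistic, dof)

# Degrees of freedom for a 33-cluster, 2-type table as described in the text.
dof_study = (33 - 1) * (2 - 1)
print(dof_study)
```

Converting the statistic to a p-value requires the chi-square distribution, which a statistics library would normally supply.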
Post hoc chi-square tests, adjusted using a Benjamini-Hochberg correction to control the false discovery rate, revealed that 21 of the 33 CDPs had statistically significant differences in composition:
* 6 clusters were primarily composed of business courses,
* 15 clusters were primarily composed of non-business courses,
* 12 clusters had no significant difference and were classified as shared.
These findings support H2b and provide evidence that business and non-business programs tend to cluster around different semantic structures. While some overlap exists, the broader distribution indicates meaningful variation in how course content is organized and described across program types.
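The Benjamini-Hochberg adjustment used for the post hoc tests can be sketched as follows. The p-values are toy inputs, the function name is hypothetical, and the 0.05 threshold is an assumption, since the text does not state the false discovery rate level that was applied.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return (adjusted p-values, rejection flags) in the original input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    rejected = [q <= alpha for q in adjusted]
    return adjusted, rejected

# Toy per-cluster p-values (assumption), one per post hoc test.
toy_p_values = [0.001, 0.008, 0.039, 0.041, 0.60]
adjusted, rejected = benjamini_hochberg(toy_p_values)
for p, q, flag in zip(toy_p_values, adjusted, rejected):
    print(f"p={p:.3f}  adjusted={q:.5f}  significant={flag}")
```

Clusters whose adjusted p-value stays below the threshold would be labeled business- or non-business-dominant according to their composition, and the remainder treated as shared.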
Figure 4 provides a visualization of these clusters in two-dimensional semantic space using UMAP. Each point represents a course description, colored by the dominant program type within its cluster. The outlined regions correspond to the 33 CDPs, with business-dominant clusters shown in blue, non-business clusters in orange, and shared clusters in gray.
Each dot represents a course description embedded in semantic space. Clusters are outlined and numbered, with color indicating dominant program type: blue (business), orange (non-business), and gray (shared). Noise points not assigned to a cluster appear in the background.
Curricular Framing Validation
A manual review of course descriptions within each CDP revealed consistent differences in framing orientation. Business-focused CDPs commonly emphasized analytics as a tool for organizational decision-making, highlighting applied relevance, strategic integration, and stakeholder impact. Representative phrases drawn directly from these course descriptions included "support strategic decision making," "develop strategic recommendations," "solving business problems," and "improve the company's competitiveness."
In contrast, non-business CDPs placed greater emphasis on technical mastery, statistical theory, and computational implementation. These descriptions frequently featured phrases such as "rigorous introduction to the theory of," "students will gain proficiency," "develop advanced skills in," "computational learning," and "use of statistical packages."
These recurring phrases reflect language patterns that distinguished how business and non-business programs position the role of analytics in graduate education. The patterns themselves were identified during the framing review described in Section 3.4, which grouped courses based on shared language and instructional emphasis prior to the similarity analysis.
To validate these interpretive patterns, a post hoc semantic similarity test was conducted using two hypothetical course descriptions that held topical content constant but varied in how they framed instructional purpose (see Methodology section). Both descriptions referenced the same statistical methods, including multivariate analysis, regression, ANOVA, categorical data, time series, and simulation, but differed in framing orientation. One focused on practical application in business contexts and data-driven decision-making. The other emphasized theoretical depth, implementation using statistical software, and model evaluation.
Each description was embedded using the same DistilBERT model applied throughout the study. Cosine similarity scores were computed between the embedded descriptions and all course descriptions assigned to either business or non-business CDPs. Shared CDPs were excluded from this comparison to better isolate domain-specific framing tendencies.
Results indicated that the outcome-oriented description was more semantically similar to business CDP courses (M = 0.6303, SD = 0.0452) than to non-business CDP courses (M = 0.5865, SD = 0.0527). This difference was statistically significant (U = 310710.00, p < 0.0001) and associated with a large effect size (d = 0.8914). Conversely, the mastery-oriented description was more similar to non-business CDP courses (M = 0.6325, SD = 0.0541) than to business CDP courses (M = 0.5816, SD = 0.0662), also statistically significant (U = 116344.00, p < 0.0001) with a large effect (d = -0.8426).
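The U statistics reported above come from a rank-based comparison of two sets of similarity scores. The sketch below shows one standard way to compute U with average ranks for ties. The toy scores and the function name are hypothetical, and a statistics library would normally be used to obtain the p-value.

```python
def mann_whitney_u(xs, ys):
    """U statistic for xs: pairs where x > y, with ties counted as one half."""
    combined = sorted([(value, 0) for value in xs] + [(value, 1) for value in ys])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        # Find the run of tied values starting at position i.
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    rank_sum_x = sum(r for r, (_, group) in zip(ranks, combined) if group == 0)
    n_x = len(xs)
    return rank_sum_x - n_x * (n_x + 1) / 2

# Toy similarity scores (assumption) between one probe description and two course groups.
sims_to_business = [0.63, 0.66, 0.61]
sims_to_non_business = [0.58, 0.60, 0.55]
u_statistic = mann_whitney_u(sims_to_business, sims_to_non_business)
print(u_statistic)
```

When every score in the first group exceeds every score in the second, U equals the product of the two group sizes, which is its maximum.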
These results reinforce the interpretation that business programs tend to frame analytics as a practical tool for decision-making, while non-business programs emphasize analytics as a domain of technical depth and methodological expertise. Together, the findings provide a foundation for considering how curricular framing reflects deeper structural and educational priorities across disciplinary contexts.
Discussion
This study examined whether graduate-level business and non-business data analytics programs differ in how they frame the purpose of analytics education. It explored the idea that business programs tend to adopt outcome-oriented framing, which positions analytics as a means for supporting decision-making and delivering value in organizational contexts. In contrast, non-business programs were expected to reflect mastery-oriented framing, focusing on technical depth, methodological precision, and computational implementation.
The results provide strong empirical support for this distinction. Rather than diverging solely in topical coverage, business and non-business programs exhibit meaningful semantic differences in framing orientation. Business programs commonly describe analytics as a tool for applied problem-solving. Non-business programs more often emphasize analytics as a technical and methodological domain. These differences were evident not only in statistical comparisons of semantic similarity but also in recurring language patterns within Curricular Design Patterns (CDPs). Phrases such as "support strategic decision making" and "solve business problems" were common among business CDPs, while non-business CDPs frequently emphasized "rigorous theoretical foundations" and "computational implementation."
This distinction in framing suggests a deeper structural divergence in curricular priorities. Business programs appear to focus on preparing graduates to apply analytics in context-specific environments where decisions must be made. Non-business programs tend to prioritize foundational understanding, algorithmic development, and research-oriented analysis. These findings suggest that framing orientation plays a critical role in curriculum design, especially in interdisciplinary fields where methods may be shared but educational intent varies.
Implications for Business Analytics Education
The structural and semantic differences observed in this study highlight the importance of tailoring analytics education to the needs and expectations of specific domains. For business programs, the emphasis on outcome-oriented framing reflects a broader pedagogical priority: preparing students to use analytics in support of organizational decision-making. This requires more than technical competence. It involves equipping graduates with the ability to translate analytical results into actionable insights, communicate with stakeholders, and align data analysis with strategic objectives.
Curricular Design Patterns (CDPs) offer a useful lens for refining this approach. Patterns associated with business programs frequently incorporated language related to impact, action, and value creation. For example, course descriptions often referenced the need to "improve the company's competitiveness" or "develop strategic recommendations." These patterns reveal how business programs embed analytics within a broader framework of applied management and decision-making. Curriculum designers can use these patterns to assess whether individual courses contribute to this larger goal.
Notably, while business programs do include technically sophisticated topics and methods, they often describe them in relation to their decision-making utility. Statistical techniques, modeling tools, and analytical software are not absent from business CDPs, but they are typically presented as means to inform action or guide strategic choices. This framing contrasts with that of non-business programs, where similar topics are more often described in terms of theoretical understanding or methodological precision. Recognizing this distinction can help curriculum designers maintain technical depth while preserving a domain-relevant instructional narrative.
The findings also point to a broader opportunity. By foregrounding the role of analytics in decision-making, business programs can differentiate themselves within the growing field of data analytics education. Doing so not only supports workforce readiness but also reinforces the distinct contribution of business analytics as a subfield. Curriculum designers, faculty, and academic leaders should consider how framing choices embedded in course descriptions shape students' preparation and professional identity. These structural insights can also support accreditation reviews, program benchmarking, and curriculum reform efforts aligned with industry expectations for work-ready graduates.
Challenges for Curriculum Designers
The differences in Curricular Design Patterns (CDPs) highlight a central challenge for curriculum designers: balancing technical depth with applied relevance. This challenge is especially pronounced in business analytics programs, which must equip students with analytical skills while also preparing them to operate effectively in organizational settings. Programs that lean too heavily on technical mastery risk resembling traditional statistics or computer science curricula, while those that underemphasize core methods may leave students underprepared for the demands of data-intensive roles.
The structure of the CDPs illustrates these tensions. Some clusters are dominated by mastery-oriented language, emphasizing statistical theory, software implementation, and methodological rigor. Others are strongly outcome-oriented, focusing on problem-solving, decision support, and strategic application. These differences suggest that curriculum designers face important tradeoffs. Programs must decide not only what to include, but also how to frame analytical instruction in ways that align with disciplinary goals.
This complexity is further compounded by practical constraints. Faculty expertise plays a major role in shaping course content and emphasis. In business schools, instructors often come from traditional management disciplines, which may limit the integration of advanced technical topics unless interdisciplinary collaboration is pursued. Conversely, faculty from technical backgrounds may emphasize modeling or computation in ways that are less attuned to organizational context. Achieving curricular coherence across these perspectives requires coordination and shared vision.
Accreditation standards also influence curricular structure. For example, business programs accredited by bodies such as AACSB must incorporate learning goals related to communication, ethical decision-making, and managerial insight. These requirements may limit the number of credits available for technical coursework, requiring difficult decisions about how and where to integrate analytics content. Similarly, interdisciplinary dependencies can introduce scheduling challenges and content misalignment when courses are shared across departments.
By identifying the underlying structures of existing programs, CDPs offer a way to navigate these challenges more deliberately. They make visible the design choices that shape how analytics is taught and understood across contexts. This visibility can help curriculum designers better align programs with both internal goals and external expectations. While the structural patterns identified in this study offer useful insights, further research is needed to explore the mechanisms and implications of curricular framing more fully.
Future Research and Limitations
This study provides a structural foundation for understanding how graduate-level data analytics programs differ in curricular framing, but additional research is needed to capture the full depth and complexity of instructional design. Course descriptions offer useful insight into how programs publicly represent their content. However, the findings reflect how institutions describe their programs, not necessarily how the programs are taught. Course descriptions may not fully reflect pedagogical intent, assessment strategies, or classroom emphasis. Future work could incorporate additional sources such as syllabi, learning outcomes, or instructor interviews to gain a more nuanced understanding of how analytical concepts are taught and applied.
Topic modeling presents another opportunity for expanding this analysis. While the current study emphasizes semantic framing, a topic-based approach could identify which analytical methods, tools, or domains appear across clusters. This would allow for a clearer comparison of content coverage between programs that differ in framing orientation. By combining semantic and topical analyses, future research could distinguish between what is taught and how it is positioned within the curriculum.
There are also limitations to the current approach. Clustering methods are effective for identifying general patterns, but they may overlook areas of overlap or hybrid instructional strategies. In addition, shared CDPs (those containing a mix of business and non-business courses) were excluded from some analyses to provide a clearer view of domain-specific trends. Although this improved interpretive clarity, it limited insight into how integrated or interdisciplinary programs may be evolving. These shared clusters could be examined more closely in future research to explore models that combine technical depth with applied orientation.
Finally, this study provides a static snapshot of the current curricular landscape. As industry expectations, accreditation standards, and institutional priorities evolve, the structure of data analytics programs is likely to shift as well. A longitudinal approach could track how Curricular Design Patterns change over time and examine whether framing orientations converge, diverge, or give rise to new hybrid models. Such research would offer valuable guidance for curriculum development in a field that continues to grow and diversify.
Conclusion
This study examined how graduate-level data analytics programs differ in how they frame the purpose of analytics education, with a focus on comparing business and non-business curricula. Using semantic modeling and clustering techniques, the analysis revealed consistent structural and linguistic patterns that reflect two distinct orientations: outcome-focused framing in business programs and mastery-focused framing in non-business programs. These orientations were evident not only in statistical comparisons but also in the language used to describe course objectives, instructional emphasis, and intended applications. By introducing the concept of Curricular Design Patterns (CDPs), the study provides a new lens for identifying and interpreting structural differences across programs. Rather than emphasizing topical content alone, CDPs surface how courses position the role of analytics in professional preparation. This structural perspective offers practical value to curriculum designers, who must balance technical content, applied relevance, and institutional constraints.
The findings contribute to ongoing efforts to design responsive, domain-relevant analytics curricula. They also highlight the need for further inquiry into how framing decisions influence student learning, interdisciplinary coherence, and workforce alignment. As analytics continues to evolve across disciplines, understanding how programs frame their educational goals will be essential for shaping effective, purposeful instruction.
REFERENCES
Aasheim, C. L., Williams, S., Rutner, P., & Gardiner, A. (2015). Data Analytics vs. Data Science: A Study of Similarities and Differences in Undergraduate Programs Based on Course Descriptions. Journal of Information Systems Education, 26(2), 103-115. Retrieved from https://jise.org/Volume26/26-2/Contents-26-2.html
Almgerbi, M., De Mauro, A., Kahlawi, A., & Poggioni, V. (2022). A Systematic Review of Data Analytics Job Requirements and Online-Courses. Journal of Computer Information Systems, 62(2), 422-434. doi:10.1080/08874417.2021.1971579
Bacic, D., Jukic, N., Malliaris, M., Nestorov, S., & Varma, A. (2023). Building a Business Data Analytics Graduate Certificate. Journal of Information Systems Education, 34(2), 216-230. Retrieved from https://jise.org/Volume34/n2/JISE2023v34n2pp216-230.pdf
Benjamini, Y., & Hochberg, Y. (1995). Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society: Series B (Methodological), 57(1), 289-300.
Campello, R. J. G. B., Moulavi, D., Zimek, A., & Sander, J. (2015). Hierarchical Density Estimates for Data Clustering, Visualization, and Outlier Detection. ACM Transactions on Knowledge Discovery from Data, 10(1). doi:10.1145/2733381
Clayton, P. R., & Clopton, J. (2019). Business Curriculum Redesign: Integrating Data Analytics. Journal of Education for Business, 94(1), 57-63. doi:10.1080/08832323.2018.1502142
Collier, C. A., & Powell, A. L. (2024). Data Analyst Competencies: A Theory-Driven Investigation of Industry Requirements in the Field of Data Analytics. Journal of Information Systems Education, 35(3), 325-376. doi:10.62273/SPYC4248
Devlin, J., Chang, M.-W., Lee, K., & Toutanova, K. (2019). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers).
Gharehgozli, A., Gupta, A., & Seung-Kuk, P. (2024). Developing an Undergraduate Business Analytics Program for a Public State-Funded Business School. Journal of Education for Business, 99(1), 11-19. doi:10.1080/08832323.2023.2248348
Grootendorst, M. (2022). BERTopic: Neural Topic Modeling with a Class-Based TF-IDF Procedure. arXiv preprint arXiv:2203.05794.
Gupta, U. G. (2023). A Graduate Course in Data Governance: A Service-Learning Approach. Journal of Education for Business, 98(2), 84-87. doi:10.1080/08832323.2022.2045555
Institute for Advanced Analytics. (n.d.). Degree programs in analytics and data science. Retrieved from https://analytics.ncsu.edu/?page_id=4184
Karadağ, T., Parim, C., & Büyüklü, A. H. (2023). Can We Identify the Similarity of Courses in Computer Science? Sigma: Journal of Engineering & Natural Sciences / Mühendislik ve Fen Bilimleri Dergisi, 41(4), 812-823. doi:10.14744/sigma.2023.00089
King, A. Z. (2022). Data analytics in Association to Advance Collegiate Schools of Business-accredited US university accounting programs: A quantitative research study. Journal of Education for Business, 97(5), 320-328. doi:10.1080/08832323.2021.1953430
Luis, R., Edgar, G.-F., Alfonso, S., & Christopher, M.-A. (2021). Engineering Analytics: Advances in Research and Applications. CRC Press.
McInnes, L., Healy, J., & Melville, J. (2018). UMAP: Uniform manifold approximation and projection for dimension reduction (arXiv:1802.03426). Retrieved from https://arxiv.org/abs/1802.03426
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., & Dean, J. (2013). Distributed Representations of Words and Phrases and Their Compositionality. Advances in neural information processing systems, 26.
Mitchell, R. B., Woolridge, R. W., & Johnson, V. (2021). The Role of Nontechnical Skills in Providing Value in Analytics-Based Decision Culture. Journal of Education for Business, 96(1), 1-9. doi:10.1080/08832323.2020.1719961
Nkwanyana, A., Mathews, V., Zachary, I., & Bhayani, V. (2023). Skills and competencies in health data analytics for health professionals: a scoping review protocol. BMJ open, 13(11), e070596. doi:10.1136/bmjopen-2022-070596
Pennington, J., Socher, R., & Manning, C. D. (2014). GloVe: Global Vectors for Word Representation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).
Provost, F., & Fawcett, T. (2013a). Data Science and its Relationship to Big Data and Data-Driven Decision Making. Big data, 1(1), 51-59. doi:10.1089/big.2013.1508
Provost, F., & Fawcett, T. (2013b). Data Science for Business: What You Need to Know about Data Mining and Data-Analytic Thinking (1st ed.). O'Reilly.
Reimers, N., & Gurevych, I. (2019). Sentence-BERT: Sentence Embeddings Using Siamese BERT-Networks. arXiv preprint arXiv:1908.10084.
Sanh, V., Debut, L., Chaumond, J., & Wolf, T. (2019). DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter. arXiv preprint arXiv:1910.01108. Retrieved from https://arxiv.org/abs/1910.01108
Silva Barbon, R., & Akabane, A. T. (2022). Towards Transfer Learning Techniques-BERT, DistilBERT, BERTimbau, and DistilBERTimbau for Automatic Text Classification from Different Languages: A Case Study. Sensors (14248220), 22(21), 8184. doi:10.3390/s22218184
Verma, A., Yurov, K. M., Lane, P. L., & Yurova, Y. V. (2019). An Investigation of Skill Requirements for Business and Data Analytics Positions: A Content Analysis of Job Advertisements. Journal of Education for Business, 94(4), 243-250. doi:10.1080/08832323.2018.1520685
Copyright Educational Research Multimedia & Publications 2025