ABSTRACT: In this paper we theorize that there are specific musical features that contribute to a melody's character, which we define as melodiousness, and conduct a large-scale corpus analysis to examine whether there are differences in the melodiousness of popular hit songs from the 1960s compared with present-day pop songs. To carry out the corpus analysis, we use a new approach for generating symbolic data for popular music melodies to overcome the lack of preexisting symbolic data. In addition, we attempt to answer the question of whether any key characteristics of melodiousness appear to have changed or shifted in notable ways over time.
Submitted 2021 November 27; accepted 2023 September 15.
Published 2023 November 9; https://doi.org/10.18061/emr.v17i2.8746
KEYWORDS: pop music, corpus analysis, automated transcription, melody
WHAT makes a particular melody good, bad, likeable, or enjoyable? Of course, the answer to this question is culturally and stylistically dependent, and culture and style change over time. Recently, the YouTube channel Inside the Score published a video entitled The Death of Melody, where the author argues that popular songs from the last several years have melodies that are inferior compared to pop melodies from the height of the rock and roll era. Specifically, he points out the prevalence of one-note melodies and narrow melodic ranges as recurrent features in modern popular hit songs. These observations are not limited to our YouTube author but appear in many sources of pop music criticism (e.g., Holden, 1994; Young, 2016; McAlpine, 2018), and carry the implicit assumption that a dead melody is less than desirable. Noting these trends raises the inherent questions not only of what constitutes a good melody, but potentially of what makes a melody in the first place. That is, in evoking the metaphor that melodies have died, the author implies that melodies inherently have (or should have) properties that we associate with life, liveliness, or activity. In this paper we theorize that there are specific musical features that lend a melody these qualities, which we henceforth will refer to as melodiousness, and carry out a large-scale corpus analysis to examine whether there are differences in the melodiousness of pop music melodies from the 1960s compared with present-day pop songs. In addition, we attempt to answer the question of whether any key characteristics associated with melodiousness appear to have changed or shifted in notable ways over time.
BACKGROUND
The term melody is frequently used to mean different things. On the one hand, it can be used to describe a component of a song as distinct from other elements (such as harmony, timbre, etc.), whereas at other times it might refer to the part that you hum along to, or even refer specifically to a sung vocal line (in particular for popular music) as opposed to an instrumental line or quasi-spoken (i.e., rapped) line as one might find in hip-hop. Importantly, the above definitions are not mutually exclusive. In computational musicology it is necessary to operationally define the variables of interest; however, this is often not a straightforward process, since a variable such as melody is inherently subjective. Defining melody is less problematic when the material comes from a non-polyphonic source (i.e., the melody does not need to be extracted from a complete score or song). While there have been several computational studies of melody, they most commonly rely on monophonic sources such as folk songs or extracted solo parts from a score with accompaniment (e.g., Müllensiefen et al., 2009; VanHandel & Song, 2010; Shanahan & Huron, 2011; Temperley & Temperley, 2011). Among the few computational studies that have needed to isolate melody from its original full-score or song context, a common approach has been to simplify the definition such that the melodic line has either been predefined (e.g., encoded themes from the Barlow and Morgenstern Dictionary of Musical Themes, 1948) (e.g., Warrenburg & Huron, 2019), is equated with the top-most voice or part (e.g., Arthur, 2017; Hu & Arthur, 2021), or is limited to the sung vocal part (e.g., Serrà et al., 2012; Tan & Temperley, 2019). As will be discussed below, our methodology requires that we take a similar approach to isolate the melody from the rest of the song. Although our methods allow us less control over what is selected as the melody compared with the above-mentioned approaches, we have simplified the definition of melody in a manner comparable to these computational studies.
Several studies have taken a computational approach to the investigation of the change of musical features over time (e.g., Parncutt et al., 2011; Broze & Shanahan, 2013), or to the musicological examination of melodic features more broadly (e.g., Arthur, 2017; Baker & Shanahan, 2017; Hansen & Huron, 2018; Warrenburg & Huron, 2019). However, historically, the field of computational musicology has been somewhat limited both in terms of the scope and style of the music analyzed (the above papers notwithstanding), in part due to the availability of symbolic musical data. Regarding the study of popular music, the genre has been receiving increasing attention in the empirical musicology and MIR (Music Information Retrieval) communities as more and more data become easily accessible and more widely shared. For instance, there have been numerous recent studies on trends in popular music (e.g., Serrà et al., 2012; Gauvin, 2015; Mauch et al., 2015; Miles et al., 2017; Duinker & Martin, 2017; White & Quinn, 2018; Tan et al., 2019; Sears & Forrest, 2021). However, these have tended to focus on harmony, likely as a function of convenience sampling. For instance, at the time of writing, only two corpora (the McGill Billboard corpus, Burgoyne et al., 2011, and the Rolling Stone 100 corpus, de Clercq & Temperley, 2011) constitute the bulk of publicly-available, clean, symbolic corpora for popular music; both were originally published in 2011, primarily contain harmonic annotations [1], and contain a paucity of material from the 21st century. One solution to the convenience-sampling problem would be to use so-called "messy" data: an approach taken by Mauch et al. (2015), who used quantitative audio features extracted from 17,000 recordings of popular music to create vocabularies of tonal and timbral descriptors (with minimal human intervention). They then examined the probability distributions of these lexicons over time to examine large-scale changes in the evolution of popular music.
In this paper, we take a similar "messy" approach to overcome the lack-of-data problem, but use a novel method for generating symbolic data, in this case aimed at analyzing popular music melodies. Using the methodology described below, we assembled a corpus of over 1500 popular melodies with which to test a series of hypotheses about trends in melodiousness over time in popular music.
METHOD
Materials: Popular Music Corpus
In this paper we aim to address the question of whether modern popular songs have become less melodious compared to earlier pop songs. To investigate this research question, we needed to acquire a representative sample of popular music from both early and late periods. We decided on the 1960s as the starting point for our popular period since the late 50s are commonly referenced as initiating the birth of 'rock and roll.' Likewise, we wanted the later popular music period to be as recent as possible, in part because there are so few empirical studies that include music from this period. Accordingly, we selected the last complete decade (2010-2019) as our "late" popular period. We also assembled our corpus to contain music from the intervening decades to permit post-hoc analysis of trends over time.
While there are several harmonic corpora that include the earlier period, there is a scarcity of existing corpus material for any form of modern popular music in symbolic format. There appear to be only two corpora that have expert transcriptions of popular melodies. One (CoCoPops [2]) aligns with the McGill Billboard corpus of harmonic transcriptions (Burgoyne et al., 2011), with song publication dates ranging from the mid 1950s only through 1991; however, at present only around 200 songs have been transcribed (Arthur & Condit-Schultz, 2021). The other, the RS200 [3] corpus by Temperley & de Clercq (2011), also has only 200 songs, with only a single song from the twenty-first century.
To obtain a larger sample that included modern popular melodies, we built our own corpus (described further in the Sampling section below). Given the labor involved in manual transcription, and given the large sample of data we desired, we chose to make use of automated melody transcription methods using a popular MIR (Music Information Retrieval) algorithm. While these methods are, of course, less robust than human transcription, the large volume of data collected and the assumed random distribution of error make this methodology suitable for our purposes. In addition, there is a precedent for using "messy" data to perform this type of large-scale analysis (Mauch et al., 2015; Albrecht, 2019; Harrison & Shanahan, 2017). The algorithm we used for the automated melody transcription is Melodia (Salamon, 2014).
The Melodia algorithm has four basic steps. First, it computes a time-windowed spectrogram analysis to determine the likely frequencies active within a given time slice across the entire (fully mixed and rendered) track. Next, the algorithm applies a filtering process that "boosts" frequencies only in the range where melodies are typically found (~261.6 Hz to 5 kHz) and attenuates bass frequencies, and then computes a chromagram-like operation. Specifically, frequencies are "folded" into octave-separated bins in order to estimate the most active pitch classes within a given time frame, but the size of each frequency bin is only 10 cents (as opposed to a full semitone or 100 cents) in order to increase the frequency-domain resolution (see Salamon & Gómez, 2009, 2012; and Gómez, 2006), resulting in a quantized pitch range covering ~55-2,000 Hz over 600 10-cent-wide bins. The third step in the process is to calculate "pitch contours", which are groups of "pitches" (local frequency maxima) at very small time scales (~50 ms) that are closely connected in frequency and time. These pitch contours typically have an overall length from one or two notes up to a short phrase. The final step is to determine which of the pitch contours is the most likely to be "the melody". This is done by applying filtering rules that were developed by observing the characteristics of pitch contours that are part of a melody (e.g., presence of vibrato and average pitch height) and contours that are accompaniment (e.g., overtones that shift and move together). Melodia outputs the melody contours as sequences of frequencies so that the accuracy of the algorithm can be evaluated using evaluation methods standard within the ISMIR community. However, Melodia also includes a separate component that discretizes the pitch contour frequencies into MIDI note numbers. We used this component to create the symbolic melodies in this corpus. (For a more in-depth explanation, see Salamon's website: https://www.justinsalamon.com/melody-extraction.html). A comparison of 206 human-transcribed melodies with the Melodia transcriptions of the same melodies can be found in Appendix A for readers who wish to investigate the automated melody transcriptions in more depth.
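To illustrate how such an extraction pipeline can be invoked, the following is a minimal sketch using the Melodia Vamp plugin from Python. It assumes the mtg-melodia plugin is installed along with the vamp and librosa packages; the file name and the final MIDI-conversion step are our own illustration rather than the exact pipeline used here.

```python
# Minimal sketch of melody extraction with the Melodia Vamp plugin.
# Assumes the mtg-melodia plugin plus the "vamp" and "librosa" Python packages;
# the file name is hypothetical.
import numpy as np
import librosa
import vamp

audio, sr = librosa.load("song.wav", sr=44100, mono=True)

# Melodia returns a dict whose "vector" entry holds (hop_duration, f0_values);
# non-positive values mark frames judged to be unvoiced (no melody present).
result = vamp.collect(audio, sr, "mtg-melodia:melodia")
hop, f0 = result["vector"]
f0 = np.asarray(f0)

# Discretize voiced frames to MIDI note numbers, analogous to the final
# symbolic step described above.
midi = librosa.hz_to_midi(f0[f0 > 0])
```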
Sampling
Our sampling method for this project was similar to that of Burgoyne et al. (2011), who employed a stratified random sample from the Billboard Hot 100 over each of the (roughly) three decades between 1958 and 1991, evenly distributed according to rank positions on the charts. We adopted verbatim the same set of song titles from Burgoyne from 1960 through 1991, using the automated MIR process to obtain the melodies, but extended the corpus through the period 1992-2019 using a similar sampling methodology. Specifically, we divided the period 1992 through 2019 into three "eras" (1992-1999, 2000-2009, and 2010-2019). Next, we divided the weekly list of Billboard Hot 100 hits into five percentile bands (0-20, 21-40, 41-60, 61-80, and 81-100) based on each song's rank. Then we randomly sampled 300 songs from the pool of songs for each era. Similar to the process in Burgoyne, this sampling procedure produced some duplicates, due to the fact that hit songs often occupy the Billboard Hot 100 charts for more than a single week. We removed the duplicates so that each song is only included in the corpus once. This procedure yielded a total of 833 songs, which when added to the 738 unique songs from the Billboard corpus gave us a total of 1571 songs (see the full list of complete songs in the corpus in Appendix C). We adopted this approach because it provided an optimal method of obtaining an unbiased sample of popular music to test our hypothesis, while also enabling a comparison between automatic and expert-encoded transcriptions (see Appendix A). Of note is that Burgoyne et al. stopped collecting songs at the year 1991 due to significant changes made by Billboard to their methodology for selecting songs for the Hot 100. In fact, Billboard continues to make changes to its selection methodology on a regular basis, as methods for delivering and consuming popular music continue to evolve and change. There is a possibility, therefore, that using the Billboard Hot 100 for the full range of decades studied here introduces some bias into our sample. However, for decades the Billboard charts have represented a standard of success in popular music, and we assume that the changes Billboard has made are necessary to continue to represent that standard. We include a description of the changes Billboard has made to their chart selection methods in Appendix B so readers can ascertain whether the two subsamples are both representative of "popular music." We compared the songs in the original Billboard eras (1958-1991) against the songs in the extended corpus (1992-2019) in terms of both the distributions of chart positions and the ratio of artists to number of hits, and found them to be approximately the same.
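For readers who wish to replicate the sampling, the sketch below shows one way the stratified draw could be implemented, assuming a hypothetical pandas DataFrame of weekly Hot 100 entries with "year", "rank", "artist", and "title" columns; it illustrates the procedure described above rather than our exact script.

```python
# Sketch of the stratified random sampling described above (assumed columns:
# year, rank, artist, title in a weekly Billboard Hot 100 DataFrame).
import pandas as pd

def sample_era(chart: pd.DataFrame, start: int, end: int,
               n: int = 300, seed: int = 0) -> pd.DataFrame:
    era = chart[(chart["year"] >= start) & (chart["year"] <= end)].copy()
    # Five rank strata: 1-20, 21-40, 41-60, 61-80, 81-100.
    era["stratum"] = pd.cut(era["rank"], bins=[0, 20, 40, 60, 80, 100])
    per_stratum = n // 5
    sampled = (era.groupby("stratum", group_keys=False)
                  .apply(lambda g: g.sample(min(len(g), per_stratum),
                                            random_state=seed)))
    # Hit songs chart for many weeks, so keep only one row per song.
    return sampled.drop_duplicates(subset=["artist", "title"])

# e.g., era_2010s = sample_era(hot100, 2010, 2019)
```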
Procedure
Recall that our primary research question asks whether modern melodies are less "melodious" compared with earlier popular songs. That is, we wish to examine whether there are measurable differences in melodic features associated with "liveliness" and "activity" when comparing popular melodies from these two time periods (1960s and 2010s).
Importantly, we propose that it is prototypical for melodies to be "active" and "varied." By way of example, if we were to randomly stop a stranger in the street and ask them to make up a melody to sing or whistle, it is unlikely they would perform a repeated single note to a metronomic pulse. In fact, the very claim that melody is "dead" must imply that it is contrary to the (prior or established) norm for modern melodies to behave as they do. Accordingly, we can think of a "dead" melody as breaking from this prototypical association. We propose, then, that "dead" melodies are exemplified by a lack of rhythmic and pitch variety and can be characterized as relatively flat and inactive. If "activity" readily suggests motion and energy, then we would propose that "dead" melodies would be exemplified by features linked with stasis, idleness, and lethargy.
Based on our rationalizations above, we propose that melodiousness in a melody could be measured using six dimensions that we propose are strongly linked to activity and variety: melodic range, amount of repetition (defined below), intervallic diversity, rhythmic continuity, rhythmic diversity, and contour.
Note that we presume that there are features inherent in melodies that allow them to be heard as melodies. That is, at least from a perceptual standpoint, stringing notes together in time is not sufficient to create a single, coherent melody (e.g., Bregman, 1990). In the Western tradition (as with many other musics) the concept of melody is strongly tied to the production abilities of the human voice (e.g., Wermke & Mende, 2009). That is to say, we presume that there could be many features that might contribute to a melody appearing "un-melody like," which would not necessarily make them "unmelodious" according to our definition. However, we presume that those (un-melody like) features would not likely be present in our sample at all, since they are unlikely to make 'hit' songs in the first place. Specifically, we hypothesize that, in comparison with earlier melodies, modern melodies:
H1. Will have a smaller (rolling) melodic range
H2. Will have a greater overall proportion of repetition
H3. Will have a greater proportion of small melodic intervals
H4. Will show a less diverse distribution of rhythms
H5. Will have a greater proportion of longer notes and/or more rests or breaks
To examine each of the hypotheses above, we present in the next paragraphs our operational definitions of the variables appearing in those hypotheses and elaborate on the rationale, methods, and metrics used to measure them. Each song in our corpus therefore has a single value for each of the variables defined below.
Rolling Melodic Range:
Melodic range is defined as the number of semitones between the lowest and highest note in a melodic segment. Since the total range for a song's melody can be large even if large segments of the song have small ranges, we instead compute a rolling average using a two-measure window (a common sub-phrase or phrase unit) with a hop size of one measure, and then take the average of all the windows. Thus, the rolling melodic range is the average range in semitones within a two-measure unit over the entire song.
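As a concrete illustration, the following sketch computes the rolling melodic range under the assumption that each note carries a MIDI pitch and the (0-indexed) measure in which it begins; the variable names are ours.

```python
# Sketch of the rolling melodic range: average semitone range over two-measure
# windows with a hop size of one measure (assumed note fields: pitch, measure).
from statistics import mean

def rolling_range(notes, window=2):
    last_measure = max(n["measure"] for n in notes)
    ranges = []
    for start in range(0, last_measure + 1):  # hop size of one measure
        pitches = [n["pitch"] for n in notes
                   if start <= n["measure"] < start + window]
        if len(pitches) >= 2:
            ranges.append(max(pitches) - min(pitches))
    return mean(ranges) if ranges else 0.0
```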
Repetition:
The more material that is 'recycled' in a piece of music, the less variety it contains. We measure repetition in terms of a melody's compressibility, with more compressible melodies representing a higher degree of repetition. Specifically, repetition was calculated using the following procedure: each note was converted into a string representing pitch class and octave in A.S.A. format (e.g., A4), plus its duration quantized to a number of sixteenth notes (see Rhythmic Diversity, below). The entire melody for a song was then represented as a single string with all the notes concatenated. Repetition was then computed using the GZIP algorithm as implemented in the Python zlib package, as the size of the compressed string divided by the size of the uncompressed string. Note that this methodology cannot distinguish between short-term and long-term repetition, only the overall amount of repetition [4] (e.g., AABB is equally repetitive to ABAB).
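A minimal sketch of this compression-based measure, assuming notes already carry an A.S.A.-style pitch name and a duration in sixteenths, is shown below.

```python
# Sketch of the compression-based repetition measure: each note becomes a token
# of pitch name plus duration in sixteenths, tokens are concatenated, and the
# zlib (gzip) compression ratio is the score. Smaller ratio = more repetitive.
import zlib

def repetition_score(notes):
    # e.g., an A4 dotted eighth note (3 sixteenths) becomes the token "A43"
    tokens = [f'{n["name"]}{n["sixteenths"]}' for n in notes]
    raw = "".join(tokens).encode("utf-8")
    compressed = zlib.compress(raw, 9)
    return len(compressed) / len(raw)
```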
Small Melodic Intervals:
We examine the distribution of undirected melodic intervals in semitones. We propose that 'flat' melodies would not only have smaller (rolling) ranges but would have a greater overall proportion of small melodic intervals. We operationally define 'small' to be equal to or smaller than a minor third (three semitones) given that a substantial number of pop songs will have pentatonic melodies where a minor third would be considered a step. We measure the proportion of each song's melodic intervals that are less than or equal to this threshold.
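The corresponding computation is straightforward; the sketch below assumes the melody is given as an ordered list of MIDI pitches.

```python
# Sketch of the proportion of small melodic intervals, with "small" defined as
# at most three semitones (a minor third).
def small_interval_proportion(pitches, threshold=3):
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    if not intervals:
        return 0.0
    return sum(1 for i in intervals if i <= threshold) / len(intervals)
```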
Rhythmic Diversity:
In addition to pitch movement, the rhythm of a song's melody can also be highly variable or highly static. We propose that 'active' melodies would have a greater degree of rhythmic variety whereas 'dead' melodies would exhibit the opposite trend. Given that it has been argued that nPVI is not an accurate measure of rhythmic variety (Condit-Schultz, 2019), we instead relied on a simple count of the number of distinct rhythmic values (i.e., quantized note-duration values) encountered in the melodies. Note that, due to our methodology, we measure note durations in seconds and therefore must estimate rhythmic values (i.e., eighth, quarter, etc.) by considering the tempo and quantizing to the nearest 16th note. For example, at a tempo of 120 bpm in 4/4 time a quarter note lasts 0.5 seconds, so a duration of 0.25 seconds would be quantized to an eighth note. Accordingly, we "bin" all note durations in increments of increasing 16th notes (i.e., 16th, 8th, dotted 8th, quarter, etc.)
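The sketch below illustrates the quantization and the diversity count, assuming note durations in seconds and a single tempo per song, with 4/4 time as in the example above.

```python
# Sketch of rhythmic diversity: durations in seconds are converted to beats at
# the song's tempo, rounded to the nearest sixteenth (0.25 beats), and the
# number of distinct quantized values is counted.
def rhythmic_diversity(durations_sec, bpm):
    beat = 60.0 / bpm                       # seconds per quarter note
    quantized = set()
    for d in durations_sec:
        sixteenths = round((d / beat) * 4)  # nearest sixteenth-note multiple
        if sixteenths > 0:
            quantized.add(sixteenths)
    return len(quantized)

# e.g., at 120 bpm a 0.25 s note is 0.5 beats = 2 sixteenths, i.e., an eighth note.
```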
Proportion of Long Durations:
In addition to the overall rhythmic diversity, we propose that 'active' melodies would typically be faster, or, at least, more rhythmically dense, and thus contain fewer 'gaps', shorter phrase boundaries, and fewer long notes. Pearce et al. (2010) demonstrated that large IOIs (inter-onset intervals) were an important predictor for melodic segmentation. As such, we measured the proportion of IOIs (again, quantized to the nearest 16th note) in a song that are equal to or greater than half of a measure (e.g., half notes or greater in 4/4 time).
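A companion sketch for this long-duration measure, under the same quantization assumptions as above, follows.

```python
# Sketch of the proportion of long inter-onset intervals (IOIs): IOIs are
# quantized to sixteenths and we count those spanning at least half a measure
# (8 sixteenths in 4/4).
def long_ioi_proportion(onsets_sec, bpm, threshold_sixteenths=8):
    beat = 60.0 / bpm
    iois = [b - a for a, b in zip(onsets_sec, onsets_sec[1:])]
    quantized = [round((ioi / beat) * 4) for ioi in iois]
    if not quantized:
        return 0.0
    return sum(1 for q in quantized if q >= threshold_sixteenths) / len(quantized)
```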
It may seem that an obvious omission was that of contour, which we described as a feature relating to melodiousness. That is, melodies with 'flat' contoms would be less active than those with ascending, descending, arc-shaped, or other contours that may lend to the perception of increasing and decreasing tension. However, we were not able to convincingly define a novel contour metric that would be uncorrelated with the rolling range variable.
Each of the above features was calculated for each song in our corpus. To test our main hypothesis regarding whether the earlier group of songs differed in melodiousness from the later group of pop songs, we apply a logistic regression, using the calculated feature values described above to predict the time-period group (1960s or 2010s). In addition, we planned a post-hoc analysis using a multiple linear regression model for the full time period (i.e., every year from 1960 to 2019) to examine the best fit for each of the melodiousness variables defined above.
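The following sketch shows how both analyses could be run, assuming a hypothetical DataFrame with one row per song containing the five features, the release year, and an era indicator (0 for the 1960s, 1 for the 2010s); it uses statsmodels and is an illustration rather than our exact analysis code.

```python
# Sketch of the planned analyses (assumed columns: rolling_range, repetition,
# small_intervals, rhythm_diversity, long_durations, year, era).
import statsmodels.formula.api as smf

predictors = ("rolling_range + repetition + small_intervals + "
              "rhythm_diversity + long_durations")

# Main analysis: logistic regression predicting era (1960s vs. 2010s).
logit = smf.logit(f"era ~ {predictors}",
                  data=features.query("era in [0, 1]")).fit()
print(logit.summary())

# Post-hoc analysis: linear trend of each feature over the full 1960-2019 range.
for var in predictors.replace(" ", "").split("+"):
    trend = smf.ols(f"{var} ~ year", data=features).fit()
    print(var, trend.params["year"], trend.pvalues["year"])
```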
RESULTS
To assess our main hypothesis, we carried out a multiple logistic regression analysis using the five predictor variables described in our methods section (melodic range, repetition, small melodic intervals, rhythmic diversity, and proportion of long durations) to predict the time era of the song (1960s or 2010s). The results of the analysis are summarized in Table 1 below.
Of the five main variables that we assumed would be related to "melodiousness", only two showed significant differences between the songs from the 1960s and the modern-day songs: melodic range and repetition (both p < .05; see Figure 1). Interestingly, however, the rolling range variable was in the reverse direction of what we predicted, showing that modern pop songs actually have a larger range compared with earlier pop songs. Four interactions were also significant: rhythmic diversity and small melodic intervals, range and repetition, rhythmic diversity and repetition, and proportion of long durations and repetition (see Figure 2). However, we made no a priori hypotheses about these interactions; moreover, the musical significance of these interactions is not clearly evident.
Of note are the very small differences in repetition between the two era groups. That is, while modern pop songs do seem to make use of a greater degree of repetition, as illustrated in Figure 1, the degree of increased repetitiveness is very small. It is worth reminding the reader that our methodology is incapable of distinguishing between short-term and long-term repetition (e.g., AABB is equally repetitive to ABAB). Overall, given that only one of our five hypotheses had supporting empirical evidence, we argue that our data does not support the conclusion that "melody is dead," or that modern melodies are less melodious than earlier ones.
EXPLORATORY RESULTS
Having collected data for each of the decades from 1960 - 2019, we also performed a post-hoc analysis of the trends in our melodiousness variables to examine whether these changes were gradual, sudden, or may have changed direction in the intervening years. We performed a linear regression for each variable and plotted the data to see whether any trends could be observed. Figure 3 plots all values for each of the five melodiousness variables along with the line of best fit for the full range of years from 1960 through 2019, inclusive.
Of course, there is no reason to presume that any musical trend would be linear, and one can always attempt to fit a straight line. Nevertheless, we found significant effects for repetition and range (as before), but also for proportion of long durations and rhythmic diversity. It is likely that the additional data in the linear models explains the change from a non-significant logistic result to a significant linear-model result for these latter two variables. Two of our melodiousness variables (repetition and rhythmic diversity) were significantly different in our predicted direction (i.e., as years increase, repetition increases and rhythmic diversity decreases), consistent with hypotheses 2 and 4. However, the other two melodiousness variables (range and proportion of long durations) were significant but in the reverse direction (i.e., as years increase, range widens and the proportion of long durations decreases), inconsistent with hypotheses 1 and 5. In all cases, the size of the overall change is small, and the fifth variable (proportion of small melodic intervals) was not significant. Since this was a post-hoc exploratory analysis, we did not correct for multiple tests. Overall, we argue that the data does not support the claim that "melody is dead."
DISCUSSION
Every popular music producer wants to be able to predict the next great hit. In the field of music informatics, the quest to uncover the anatomy of what makes a hit song is known as hit song science. One of the most studied aspects of the song is the melody, with a widely held belief that a key to a great song is to write a great melody (Frederick, 2019). The implicit assumption is that a great "hook" is one that is catchy and memorable (Burgoyne et al., 2013), and perceptual research has shown that familiar melodies are more aesthetically pleasing than unfamiliar ones (Janssen et al., 2017). It has also been suggested that there is an optimal 'sweet spot' in terms of a song's repetition: too much repetition and the song is perceived as 'boring' or possibly even 'annoying', while an over-abundance of novel material can cause the song to be perceived as overly complex, with the general idea being that increasing repetition facilitates the mental processing of the music (Huron, 2006). This is consistent with Cheung et al.'s (2019) finding that information content and entropy significantly predicted liking for chord sequences. A seemingly contradictory observation is that, according to our results, songs appear to be getting increasingly repetitive, but that the repetition is presumably correlated with liking, given that the songs are all 'hits' (i.e., we assume that the radio/streaming/publishing industries are not the only drivers in creating hits but that the general public has to enjoy them). In fact, a recent paper by Albrecht (2019) showed a difference in repetition even between songs that are all hits (controlling for year), demonstrating that the songs at the top of the Billboard charts contained more repetition than the songs at the bottom of the charts. While the current paper was not a study of memorability or 'catchiness,' our results do suggest that there is a slight trend towards increasing repetition, which is (by our definition) unmelodious, but which apparently is desirable in a modern popular song, at least according to Billboard's definition of what makes a 'top 100' hit song. However, we did not find any empirical evidence to support the claim that 'one-note melodies' are prevalent in a representative sample of modern pop songs, as there was no difference in the prevalence of small melodic intervals across groups.
Several caveats are warranted given the methodology for our corpus analysis. First, it is possible that using automated transcription algorithms does not provide sufficiently accurate melody transcriptions to gather a coherent picture of real trends or changes in music over time. However, as mentioned, there is a precedent for using "messy data." That is, we presume that the errors in the melodies are randomly distributed over the time period of our full corpus, and as such, when investigating such a large quantity of data to examine very broad trends, we feel that the data, while certainly error-prone, is giving reliable results. We would not recommend using automated transcriptions for "close readings" or more traditional music-theoretic inquiries. Second, it could well be the case that our systematic approaches to capture our variables of interest were not the most appropriate. And lastly, it may be that other melodic features would provide better insights. In other words, it is possible that our operational definitions of "melodiousness" in general were poorly conceived or, more likely, simply incomplete. However, we hope that this analysis serves as a proof of concept for the kinds of queries that are possible using this type of data.
Additionally, we mentioned that while using compression as a proxy for melodic repetition is useful in evaluating a broad generalization such as the one considered here, it provides little insight into what kinds of melodic repetition (e.g., melodic sequences, transpositions, inversions, retrogrades, extensions, elisions, etc.) are used over the years or across various styles. Finally, a larger but more complex analysis that considers melody in the context of harmony, form, or genre may reveal more meaningful insights. Further research is needed to determine the impact these melodic elements have on popular music styles.
ACKNOWLEDGEMENTS
This article was copyedited by Matthew Moore and layout edited by Diana Kayser.
NOTES
[1] The Rolling Stone corpus was later expanded to 200 songs (RS200) and includes melodic transcription data. Tan et al. (2019) examine syncopation in popular melodies.
[2] See: https://github.com/Computational-Cognitive-Musicology-Lab/CoCoPops-Billboard
[3] See: http://rockcorpus.midside.com/
[4] Songs are scored in terms of the proportion of the song string that can be compressed by the gzip algorithm as implemented in the Python zlib package. For example, an A4 dotted eighth note would be represented as the token A43.
REFERENCES
Albrecht, J. (2019). "Pop melodies have become more repetitive throughout the Billboard era." Oral presentation presented at the biannual meeting for the Society for Music Perception and Cognition, New York University, New York, 2019.
Arthur, C. (2017). Taking Harmony Into Account: The Effect of Harmony on Melodic Probability. Music Perception, 34(4), 405-423. https://doi.org/10.1525/mp.2017.34.4.405
Arthur, C. & Condit-Schultz, N. (2021). "Testing the Loose-Verse, Tight-Chorus model: A corpus study of melodic-harmonic divorce." Oral presentation presented at the annual meeting for the Society for Music Theory (virtual).
Baker, D. J., & Shanahan, D. (2018). Examining Fixed and Relative Similarity Metrics through Jazz Melodies. In M. Montiel & R. W. Peck (Eds.), Mathematical Music Theory: Algebraic, Geometric, Combinatorial, Topological and Applied Approaches to Understanding Musical Phenomena (pp. 319-334). World Scientific Publishing Co. https://doi.org/10.1142/9789813235311_0016
Böck, S., Korzeniowski, F., Schlüter, J., Krebs, F., & Widmer, G. (2016). madmom: A New Python Audio and Music Signal Processing Library. Proceedings of the 24th ACM International Conference on Multimedia, 1174-1178. https://doi.org/10.1145/2964284.2973795
Bregman, A. S. (1990). Auditory Scene Analysis. MIT Press. https://doi.org/10.7551/mitpress/1486.001.0001
Broze, Y., & Shanahan, D. (2013). Diachronic Changes in Jazz Harmony: A Cognitive Perspective. Music Perception, 31(1), 32-45. https://doi.org/10.1525/mp.2013.31.1.32
Burgoyne, J. A., & Fujinaga, I. (2011). An Expert Ground Truth Set for Audio Chord Recognition and Music Analysis. In A. Klapuri & C. Leider (Eds.), Proceedings of the 12th International Society for Music Information Retrieval Conference, 633-638.
Burgoyne, J. A., Bountouridis, D., Balen, J. V., & Honing, H. (2013). HOOKED: A game for discovering what makes music catchy. Proceedings of the 14th International Society for Music Information Retrieval Conference, 245-250.
Cheung, V. K. M., Harrison, P. M. C., Meyer, L., Pearce, M. T., Haynes, J.-D., & Koelsch, S. (2019). Uncertainty and Surprise Jointly Predict Musical Pleasure and Amygdala, Hippocampus, and Auditory Cortex Activity. Current Biology, 29(23). https://doi.org/10.1016/j.cub.2019.09.067
Condit-Schultz, N. (2019). Deconstructing the nPVI: A Methodological Critique of the Normalized Pairwise Variability Index as Applied to Music. Music Perception: An Interdisciplinary Journal, 36(3), 300-313. https://doi.org/10.1525/mp.2019.36.3.300
De Clercq, T., & Temperley, D. (2011). A corpus analysis of rock harmony. Popular Music, 30(01), 47-70. https://doi.org/10.1017/S026114301000067X
Duinker, B. (2019). Plateau Loops and Hybrid Tonics in Recent Pop Music. Music Theory Online, 25(4). https://doi.org/10.30535/mto.25.4.3
Frederick, R. (2019). Shortcuts to Hit Songwriting Level One: 58 Essential Skills for Writing Hit Lyrics, Melodies, & Chords. Taxi Music Books.
Gauvin, H. L. (2015). "The Times They Were A-Changin'": A Database-Driven Approach to the Evolution of Musical Syntax in Popular Music from the 1960s. Empirical Musicology Review, 10(3), 215-238. https://doi.org/10.18061/emr.v10i3.4467
Gómez, E. (2006). Tonal description of music audio signals. [Unpublished doctoral dissertation]. Universitat Pompeu Fabra.
Hansen, N. C., & Huron, D. (2018). The Lone Instrument: Musical Solos and Sadness-Related Features. Music Perception, 35(5), 540-560. https://doi.org/10.1525/mp.2018.35.5.540
Harrison, J. E., & Shanahan, D. (2017). "Introducing a Corpus of French compositions for exploring social interaction and musical change" oral presentation presented at the Music Encoding Conference, University of Tours, France, May 2017.
Holden, S. (1994). "Pop View; How Pop Music Lost the Melody." The New York Times, July 3, 1994. https://www.nytimes.com/1994/07/03/arts/pop-view-how-pop-music-lost-the-melody.html. Accessed online Feb 1, 2021.
Hu, T., & Arthur, C. (2021). A statistical model for melody reduction. In Proceedings of the Future Directions of Music Cognition (virtual), Ohio State University, Columbus, OH. https://doi.org/10.18061/FDMC.2021.0007
Huron, D. (2006). Sweet Anticipation: Music and the Psychology of Expectation. The MIT Press. https://doi.org/10.7551/mitpress/6575.001.0001
Janssen, B., Burgoyne, J. A., & Honing, H. (2017). Predicting Variation of Folk Songs: A Corpus Analysis Study on the Memorability of Melodies. Frontiers in Psychology, 8. https://doi.org/10.3389/fpsyg.2017.00621
Mauch, M., MacCallum, R. M., Levy, M., & Leroi, A. M. (2015). The evolution of popular music: USA 1960-2010. Royal Society Open Science, 2(5), 150081. https://doi.org/10.1098/rsos.150081
McAlpine, F. (2018). "Has Pop Music Lost Its Fun?". BBC. https://www.bbc.co.uk/music/articles/fb84bf19-29c9-4ed3-b6b6-953e8a083334. Accessed Feb 1, 2021.
Miles, S. A., Rosen, D. S., & Grzywacz, N. M. (2017). A Statistical Analysis of the Relationship between Harmonic Surprise and Preference in Popular Music. Frontiers in Human Neuroscience, 11. https://doi.org/10.3389/fnhum.2017.00263
Müllensiefen, D., Pfleiderer, M., & Frieler, K. (2009). The Perception of Accents in Pop Music Melodies. The Journal of New Music Research, 38(1), 19-44. https://doi.org/10.1080/09298210903085857
Parncutt, R., Kaiser, F., & Sapp, C. (2011). Historical development of tonal syntax: Counting pitch-class sets in 13th-16th century polyphonic vocal music. Proceedings of the International Conference on Mathematics and Computation in Music, 366-369. https://doi.org/10.1007/978-3-642-21590-2_35
Salamon, J., & Gómez, E. (2009). A chroma-based salience function for melody and bass line estimation from music audio signals. Proceedings of the Sound and Music Computing Conference (SMC), 331-336.
Salamon, J., & Gómez, E. (2012). Melody Extraction From Polyphonic Music Signals Using Pitch Contour Characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20(6), 1759-1770. https://doi.org/10.1109/TASL.2012.2188515
Sears, D. R. W., & Forrest, D. (2021). Triadic patterns across classical and popular music corpora: stylistic conventions, or characteristic idioms? Journal of Mathematics and Music, 15(2), 140-153. https://doi.org/10.1080/17459737.2021.1925762
Serrà, J., Corral, Á., Boguñá, M., Haro, M., & Arcos, J. L. (2012). Measuring the Evolution of Contemporary Western Popular Music. Scientific Reports, 2(1), 1-6. https://doi.org/10.1038/srep00521
Shanahan, D., & Huron, D. (2011). Interval Size and Phrase Position: A Comparison between German and Chinese Folksongs. Empirical Musicology Review, 6(4), 187-197. https://doi.org/10.18061/1811/52948
Tan, I., Lustig, E., & Temperley, D. (2019). Anticipatory Syncopation in Rock: A Corpus Study. Music Perception, 36(4), 353-370. https://doi.org/10.1525/mp.2019.36.4.353
Temperley, D., & De Clercq, T. (2013). Statistical Analysis of Harmony and Melody in Rock Music. Journal of New Music Research, 42(3), 187-204. https://doi.org/10.1080/09298215.2013.788039
Temperley, N., & Temperley, D. (2011). Music-language correlations and the "Scotch Snap." Music Perception, 29(1), 51-63. https://doi.org/10.1525/mp.2011.29.1.51
VanHandel, L., & Song, T. (2010). The Role of Meter in Compositional Style in 19th Century French and German Art Song. Journal of New Music Research, 39(1), 1-11. https://doi.org/10.1080/09298211003642498
Warrenburg, L. A., & Huron, D. (2019). Tests of contrasting expressive content between first and second musical themes. The Journal of New Music Research, 48(1), 21-35. https://doi.org/10.1080/09298215.2018.1486435
Wermke, K., & Mende, W. (2009). Musical elements in human infants' cries: In the beginning is the melody. Musicae Scientiae, 13(2_suppl), 151-175. https://doi.org/10.1177/1029864909013002081
White, C. W., & Quinn, I. (2018). Chord context and harmonic function in tonal music. Music Theory Spectrum, 40(2), 314-335. https://doi.org/10.1093/mts/mty021
Young, J. O. (2016). "How Classical Music Is Better than Popular Music." Philosophy, 91(4), 523-540. https://doi.org/10.1017/S0031819116000334
APPENDIX A
Evaluation of Automated Melody Transcription
COMPARISON OF "EXPERT" AND "AUTOMATED" MELODIES
As our motivation for using this algorithm was to overcome the need to collect several hundred transcriptions for songs where no score was available, we were unable to compare all the automated transcriptions with human transcriptions. However, we were able to compare 206 melody transcriptions with human transcriptions using the CoCoPops corpus, which contains expert-transcribed melodies for a subset of the songs in the original McGill Billboard harmonic corpus (Burgoyne et al., 2011).
The Melodia algorithm produces an onset time, duration, and pitch for each note in the estimated melody, which differs from the relative timing (i.e., measures and beats) in the human transcriptions. In order to enable comparisons between the Melodia output and the ground truth, we estimated the tempo of the audio using the madmom beat tracker (Böck et al., 2016) and used it (via Music21) to translate the relative onsets and durations in the human transcriptions into absolute times (i.e., seconds).
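As an illustration of the tempo-estimation step, the sketch below uses the madmom RNN beat tracker; the file name is hypothetical and a single global tempo is assumed.

```python
# Sketch of tempo estimation with the madmom beat tracker (Böck et al., 2016);
# the audio file name is hypothetical.
import numpy as np
from madmom.features.beats import RNNBeatProcessor, DBNBeatTrackingProcessor

activations = RNNBeatProcessor()("song.wav")
beats = DBNBeatTrackingProcessor(fps=100)(activations)  # beat times in seconds

# A simple global tempo estimate from the median inter-beat interval, used to
# convert beats/measures in the human transcription into seconds.
bpm = 60.0 / np.median(np.diff(beats))
```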
Once we had the ground truth and Melodia estimates in a common time format, we applied the following procedure to enable the comparison: First, each note in the transcriptions and the estimates was broken into slices equivalent to a 16th-note duration. Next, we aligned these slices between the ground truth and Melodia estimates to compare pair-wise accuracy. Here we used two methods of calculating the accuracy (similar to that of the melody extraction task for MIREX): one compares the predicted note's pitch with the ground truth (with a tolerance of one quartertone in either direction; referred to as raw pitch accuracy), while the other considers only the onset timing of the note (referred to as voicing accuracy). Based on these definitions, we measured the raw pitch accuracy to be .21 and the voicing accuracy to be .73. We also computed a more conservative, overall accuracy as the proportion of notes in the Melodia estimates where both the timing and pitch matched the ground truth. This overall accuracy was only .16. Thus, comparing the transcriptions note for note, we found that the algorithm agreed with the human transcriptions (on both timing and pitch) only about 16% of the time when considering both the exact when and where (as is typically the case in MIR evaluations, for example). However, in comparing a pitch class histogram of the expert versus automated transcriptions (see Figure A1 below), we can see that the distributions are very close (the biggest error, interestingly, is an over-estimation of the tonic note). While a statistical test such as a Chi-square test of independence would typically be appropriate, with the sheer volume of data points we have, even two near-identical looking distributions are likely to result in statistically significant differences.
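The sketch below shows one plausible reading of these slice-wise accuracy definitions, assuming both transcriptions have already been resampled onto a common grid of sixteenth-note slices with None marking silent slices; it is an approximation of the MIREX-style measures rather than our exact evaluation code.

```python
# Sketch of slice-wise accuracy: truth/estimate are aligned lists of MIDI
# pitches per 16th-note slice, with None where no melody note sounds.
def slice_accuracies(truth, estimate, tol=0.5):  # tol = one quartertone
    n = len(truth)
    voiced = [i for i in range(n) if truth[i] is not None]
    # Raw pitch accuracy: correct pitch (within tolerance) on voiced slices.
    pitch_ok = sum(1 for i in voiced
                   if estimate[i] is not None and abs(truth[i] - estimate[i]) <= tol)
    # Voicing accuracy: agreement on whether a melody note is sounding at all.
    voicing_ok = sum(1 for i in range(n)
                     if (truth[i] is None) == (estimate[i] is None))
    # Overall accuracy: both timing (voicing) and pitch must match.
    both_ok = sum(1 for i in voiced
                  if estimate[i] is not None and abs(truth[i] - estimate[i]) <= tol)
    return (pitch_ok / max(len(voiced), 1), voicing_ok / n, both_ok / n)
```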
Figure Al. Comparison of pitch class distributions from human vs. automated transcription methods.
The blue bars show the pitch class distribution for the 206 expert-transcribed songs from the CoCoPops corpus, compared against the distribution for the same set of songs automatically transcribed by the MIR algorithm Melodia, in orange. All songs were transposed to the key of C.
As mentioned in our paper, the error introduced by the automated transcription should be randomly distributed across all decades of the corpus, in which case we presume that, given the volume of data we have, our results should be reliable. However, we acknowledge that the accuracy is very low, which may have several causes. First, in the CoCoPops dataset the annotators identified only the vocal melody, whereas Melodia identifies any salient melody notes (instrumental or vocal), which means that large sections of "error" are common during introductions, solos, etc. Second, since the instrumentation, timbres, and effects vary from song to song, so does the strength of the note estimates. In particular, the algorithm does best on songs where there is a single vocalist and a homophonic accompaniment, such as a folksong with strummed guitar (e.g., John Denver's "Back Home Again"). For segments containing prominent instrumental lines, multiple vocal lines (i.e., singing in harmony), or unpitched (or "quasi-pitched") vocals, the performance of the algorithm is particularly poor. We examined in detail the best and worst performing transcriptions by Melodia. Figure A2 below compares the human and automated transcriptions for "Back Home Again" by John Denver, which had an overall accuracy score of .77 (the best output), against the two transcriptions for "Jungle Boogie" by Kool and the Gang, which had an accuracy score of .06.
"Back Home Again" (left), with an accuracy score of .77, is a very straightforward production, primarily John Denver with guitar accompaniment; Denver's voice is very prominent in the mix. "Jungle Boogie" by Kool and the Gang (right), with an accuracy score of .06, contains close vocal harmony, sparse vocal melody, and prominent instrumental melody.
APPENDIX B
Changes to Billboard Ranking Methodology
Billboard's Hot 100 chart has continually attempted to reflect the 100 most popular music singles. The measurement of what constitutes "most popular" has changed over time to reflect technological changes in the way music is distributed to the listening public. The following is a history of the changes to the Hot 100 from the Billboard.com website (Trust, 2019):