This work is licensed under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
1. Introduction
People improve their personal cultivation in different ways and increasingly turn their attention to spiritual pursuits [1]. Music is “the refuge of the human soul,” and the piano is known as the “king of musical instruments” [2]. With the rapid development of China’s culture and economy, piano learners have emerged in every age group, and many piano training institutions have been established to meet the demand. However, institutions differ in the quality of their training, and teachers differ in musical literacy and teaching ability, so it is difficult to find a piano teacher who matches a given learner [3–5]. Learning the piano also brings considerable economic pressure: hiring a piano teacher is often expensive, and the instrument itself incurs maintenance costs [6, 7]. At the same time, progress depends mainly on the learner’s regular practice. During this period, a teacher needs to evaluate the learner’s performance level and help the learner form a clear understanding of their own technique. Learners who practice alone tend to rely on assumptions about their own playing and cannot judge their level well, for example, whether the performance is emotionally engaging and whether the piece was played completely [8, 9]. Therefore, piano performance evaluation can help players understand their own playing skills more clearly, make playing more enjoyable, increase players’ enthusiasm for the piano, and assist piano teaching to a certain extent [10].
In the early 1960s, Taba et al. proposed an evaluation method for hall sound quality. This opened the door to research combining technical theory and music, but the evaluation was not very effective because of problems in the mapping between the subjective evaluation index and the physical evaluation index [11]. Bilder et al. gave important definitions and suggestions for the factors involved in musical instrument sound quality evaluation, such as evaluation terms and evaluation methods. As the process and precautions of musical instrument sound quality evaluation were refined, the objectivity and scientific nature of such evaluation received extensive attention [12]. Hu et al. proposed a two-stage decay theory, which showed that the time-domain characteristics of the piano sound are closely related to the piano itself, the player’s playing strength, and other factors, providing a more effective basis for the identification of piano performance [13]. Bragagnolo and Guigue proposed that the timbre of a piano performance is related to its spectral amplitude components, which contributed to the recognition of complex multipart piano music by computer audition systems [14]. Luizard et al. studied how subjective criteria of timbre can be applied in an objective evaluation system for vocal performance, using neural networks to process the signal characteristics of music, and discussed in detail the relationship between subjective evaluation indicators and objective evaluation systems. They made the results obtained with an objective evaluation system consistent with subjective criteria [15]. Sharafati et al. constructed a fuzzy expert system capable of multifaceted recognition of violin music [16]. López et al. used fuzzy mathematical theory to describe musical features such as pitch, tonality, chords, and dynamics, improving the efficiency of music identification [17]. In terms of research direction, however, the evaluation of deep learning mainly focuses on the evaluation of learning results and of the learning process. Shi evaluated teachers’ teaching methods and students’ learning outcomes based on the structure of the observed learning outcome (SOLO) classification theory, showing that the SOLO classification theory can reflect the quality of learning [18]. Nieminen and Tuohilampi divided evaluation into two categories, process-oriented and result-oriented, according to the orientation of the evaluation. They pointed out that teachers can combine different methods when evaluating students [19].
This study explores deep learning in depth, with a focus on long short-term memory (LSTM) networks. The musical features of piano performance are analyzed and extracted through the musical instrument digital interface (MIDI). The novelty lies in constructing an LSTM-based MIDI piano performance evaluation model. Based on the bidirectional LSTM (BLSTM) model, the evaluation results for different piano playing levels are analyzed in order to study the evaluation strategy of piano performance.
Section 1 reviews the literature on the current state of piano learning and on deep learning evaluation. Section 2 proposes a piano performance evaluation scheme based on a long short-term memory network. Section 3 analyzes the data of the deep learning evaluation results. The conclusion section summarizes the research methods and results and outlines future work.
2. The Scheme of Piano Performance Evaluation Based on LSTM
2.1. The Exploration Process of Deep Learning to LSTM
The information processing of the human visual system is hierarchical. The working process of the nerve-center-brain pathway is one of continuous iteration and continuous abstraction. The general approach to solving problems through machine learning is shown in Figure 1.
[figure(s) omitted; refer to PDF]
In Figure 1, features are the raw material for learning. If the data are well represented as features, linear models can usually achieve satisfactory accuracy. Characterizing a problem requires considering the granularity of the feature representation, the primary (shallow) feature representation, the structural feature representation, and the number of features needed. The essence of deep learning is to learn more useful features by building a machine learning model with many hidden layers and training it on massive data, ultimately improving the accuracy of classification or prediction [20]. Therefore, the “deep model” is only a means, and the goal is “feature learning.” Inspired by the way the brain processes time-series data, computer scientists adapted deep learning models into recurrent networks. The recurrent neural network unrolled over time is shown in Figure 2.
[figure(s) omitted; refer to PDF]
In Figure 2, the state at each moment is regarded as a layer of the FNN. A recurrent neural network (RNN) can be regarded as a neural network with shared weights in the time dimension, where
[figure(s) omitted; refer to PDF]
In Figure 3, the memory unit of LSTM consists of upper and lower lines. Each line represents the transfer of a vector. The upper line represents the cell state
In equation (1),
Finally, the output gate can get the output value
The
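The gate computations of an LSTM memory unit can be sketched in a few lines. The following is a minimal illustration of the standard LSTM formulation (forget, input, and output gates plus a candidate cell state); it is not necessarily the exact notation of equations (1)–(4). The function names, the `params` layout, and the all-zero toy weights are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def affine(W, v, b):
    """Return W @ v + b, where W is a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) + bias for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One time step of a standard LSTM cell.

    params maps 'f', 'i', 'o', 'g' to (W, b) pairs; each W acts on the
    concatenation [h_prev, x_t]. This layout is an illustrative assumption.
    """
    z = h_prev + x_t  # list concatenation of previous output and current input
    f = [sigmoid(a) for a in affine(*params["f"][:1], z, params["f"][1])]    # forget gate
    i = [sigmoid(a) for a in affine(*params["i"][:1], z, params["i"][1])]    # input gate
    o = [sigmoid(a) for a in affine(*params["o"][:1], z, params["o"][1])]    # output gate
    g = [math.tanh(a) for a in affine(*params["g"][:1], z, params["g"][1])]  # candidate state
    # New cell state: keep part of the old state, add part of the candidate.
    c_t = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
    # New output: the output gate filters the squashed cell state.
    h_t = [ok * math.tanh(ck) for ok, ck in zip(o, c_t)]
    return h_t, c_t


def zero_params(hidden, inputs):
    """All-zero weights, so every gate evaluates to 0.5 (toy example only)."""
    def block():
        return ([[0.0] * (hidden + inputs) for _ in range(hidden)], [0.0] * hidden)
    return {name: block() for name in ("f", "i", "o", "g")}


h, c = lstm_step([1.0, 0.0, 0.0], [0.0, 0.0], [1.0, -2.0], zero_params(2, 3))
print(h, c)
```

With all-zero weights each gate equals 0.5 and the candidate state is 0, so the step simply halves the previous cell state, which makes the role of the forget gate easy to see.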
2.2. Feature Extraction of Piano Performance
Many people assess a pianist’s skill by the difficulty of the pianist’s repertoire. Even many professional musicians use this as a standard. Here, three note-level features are considered. Accuracy refers to the number of extra and missed key presses relative to the keys in the standard score. Fluency indicates whether the spacing between adjacent key presses is proportional to the spacing in the standard score. Velocity indicates whether the force with which each key is pressed is proportional to the force in the standard score.
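The three features above can be turned into numbers in several ways. The sketch below is one hedged possibility, not the scoring used in this study: accuracy is the share of matched key presses among matched, extra, and missed presses, while fluency and velocity are measured by how constant the played-to-reference ratio is. The function names and the exact formulas are assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def accuracy(played_pitches, ref_pitches):
    """Share of correct presses among correct, extra, and missed presses."""
    played, ref = Counter(played_pitches), Counter(ref_pitches)
    matched = sum((played & ref).values())
    extra = sum(played.values()) - matched   # keys pressed but not in the score
    missed = sum(ref.values()) - matched     # keys in the score but not pressed
    total = matched + extra + missed
    return matched / total if total else 1.0


def proportionality(played, ref):
    """1.0 when played[k] / ref[k] is the same for every k, lower otherwise."""
    ratios = [p / r for p, r in zip(played, ref) if r > 0]
    if len(ratios) < 2:
        return 1.0
    m = mean(ratios)
    if m == 0:
        return 0.0
    return max(0.0, 1.0 - pstdev(ratios) / m)


def intervals(onsets):
    """Time gaps between consecutive key presses."""
    return [b - a for a, b in zip(onsets, onsets[1:])]


# Opening notes of Ode to Joy (E4 E4 F4 G4) with one extra key pressed.
print(accuracy([64, 64, 65, 66, 67], [64, 64, 65, 67]))
# Played at exactly half tempo: the spacing is still proportional.
print(proportionality(intervals([0, 2, 4, 6]), intervals([0, 1, 2, 3])))
# Uneven key force compared with a uniform reference.
print(proportionality([32, 64, 96], [64, 64, 64]))
```

Measuring proportionality rather than equality means that a performance played uniformly slower or softer than the reference is not penalized, which matches the wording of the feature definitions.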
Piano performance evaluation involves various audio processing technologies, such as audio acquisition, speech decoding, music synthesis, speech recognition and understanding, audio data transmission, audio-video synchronization, audio effects, and editing. Voice synthesis technology is used to produce computer voice output and can be applied to both speech synthesis and music synthesis. The musical instrument digital interface (MIDI) is used to analyze the musical characteristics of piano performances. MIDI is an international standard for digital music that describes the instructions of a music performance rather than the sound itself. MIDI files require the least amount of storage to play music [24]. The MIDI system is shown in Figure 4.
[figure(s) omitted; refer to PDF]
In Figure 4, the MIDI input port receives messages from other devices, and the output port sends the raw MIDI messages that the device generates. A MIDI file is a standard file format for storing this information. It contains note, timing, and channel selection instructions. The note information includes the key, channel number, pitch (low, middle, and high), duration (beat), volume, speed, and instrument configuration. A score consists of a sequence of notes, timing, and instrument definitions for the synthesized sounds. When a set of MIDI messages is played through a music synthesis chip, the synthesizer interprets the messages and produces music. The MIDI keyboard itself does not emit sound. When its keys are pressed, it sends key messages and generates MIDI music messages, which are recorded by the sequencer to produce MIDI files. Frequency modulation (FM) synthesis and wavetable synthesis are used to turn these commands into music. The principle of FM music synthesis is shown in Figure 5.
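To make the note and channel fields concrete, the following sketch decodes a three-byte MIDI note message. It relies on the MIDI 1.0 convention that a Note On message has status byte `0x9n`, a Note Off message has `0x8n` (where `n` is the channel), and the two data bytes carry the key number and the key velocity. The function names and the returned dictionary layout are assumptions for illustration.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def note_name(number):
    """Name a MIDI key number, using the convention that 60 is C4."""
    return f"{NOTE_NAMES[number % 12]}{number // 12 - 1}"


def decode_note_message(data):
    """Decode a 3-byte Note On / Note Off channel message."""
    status, note, velocity = data[0], data[1], data[2]
    kind = status & 0xF0            # upper four bits: message type
    channel = (status & 0x0F) + 1   # lower four bits: channel, shown as 1..16
    if kind == 0x90 and velocity > 0:
        event = "note_on"
    elif kind == 0x80 or (kind == 0x90 and velocity == 0):
        # A Note On with velocity 0 is conventionally treated as Note Off.
        event = "note_off"
    else:
        raise ValueError(f"not a note message: status 0x{status:02X}")
    return {
        "event": event,
        "channel": channel,
        "note": note,
        "name": note_name(note),
        "velocity": velocity,
    }


# The first note of Ode to Joy (E4) pressed on channel 1 with velocity 100.
print(decode_note_message(bytes([0x90, 64, 100])))
```

A sequencer records messages like this together with their timing, which is why a MIDI file stays small: it stores instructions, not audio samples.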
[figure(s) omitted; refer to PDF]
In Figure 5, FM synthesis generates sound by combining waveforms. The digital-to-analog converter (DAC) weights each bit of the digital input through a resistor network and sums the weighted contributions in an adder circuit, producing an analog output proportional to the digital value. The wavetable synthesis structure is shown in Figure 6.
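As a hedged illustration of how FM combines waveforms, the sketch below uses the classic two-operator form, in which a modulator sine wave varies the phase of a carrier sine wave: y(t) = A sin(2π f_c t + I sin(2π f_m t)). The function name, parameter names, and default sample rate are assumptions; the actual operator layout in Figure 5 may differ.

```python
import math


def fm_samples(carrier_hz, modulator_hz, index, n_samples,
               sample_rate=44100, amplitude=1.0):
    """Generate n_samples of a two-operator FM tone as floats in [-1, 1]."""
    out = []
    for n in range(n_samples):
        t = n / sample_rate
        # The modulator output is added to the carrier phase; `index`
        # controls how strongly the carrier is modulated.
        phase = (2 * math.pi * carrier_hz * t
                 + index * math.sin(2 * math.pi * modulator_hz * t))
        out.append(amplitude * math.sin(phase))
    return out


tone = fm_samples(carrier_hz=440.0, modulator_hz=220.0, index=2.0, n_samples=1000)
print(len(tone), max(abs(s) for s in tone) <= 1.0)
```

Setting `index` to 0 reduces the tone to a plain sine wave at the carrier frequency; larger values add sidebands and make the timbre richer. The resulting digital samples are what a DAC would convert into an analog signal.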
[figure(s) omitted; refer to PDF]
In Figure 6, the sound of a real piano is first recorded and stored in a digital signal processing (DSP) chip. The recording is encoded with pulse code modulation (PCM) and stored as digital signal samples in read-only memory (ROM). The compact disc read-only memory (CD-ROM) interface is connected to the bus. When the interface requests a piano sound, the wavetable reproduces the sound of a real piano. The feature extraction process of piano music is shown in Figure 7.
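Wavetable playback can be sketched as a table lookup. In the example below, a stored table of samples is read at a step size that depends on the requested pitch, with linear interpolation between neighboring samples. A single sine cycle stands in for the recorded piano samples, which is an assumption made to keep the example self-contained; the function names and table size are likewise illustrative.

```python
import math


def make_table(size=256):
    """Stand-in for stored PCM samples: one cycle of a sine wave."""
    return [math.sin(2 * math.pi * k / size) for k in range(size)]


def wavetable_play(table, freq_hz, n_samples, sample_rate=44100):
    """Read the table at a pitch-dependent step, interpolating linearly."""
    step = len(table) * freq_hz / sample_rate  # table positions per output sample
    phase = 0.0
    out = []
    for _ in range(n_samples):
        i = int(phase)
        frac = phase - i
        a = table[i]
        b = table[(i + 1) % len(table)]        # wrap around at the table end
        out.append(a + frac * (b - a))
        phase = (phase + step) % len(table)
    return out


table = make_table()
samples = wavetable_play(table, freq_hz=440.0, n_samples=100)
print(len(samples))
```

Because the sound comes from stored samples rather than from a formula, the output is only as realistic as the recording in the table, which is why wavetable synthesis is described as producing a real piano sound.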
[figure(s) omitted; refer to PDF]
In Figure 7, high-definition audio is first extracted during the piano performance, and MIDI is used to capture the piano music signals. The player’s key strength, duration control, and keystroke accuracy are then analyzed to characterize the sound of the performance [25]. The quality of the rhythm relative to the piano score is given by equation (5) as follows:
In equation (5), the piano score is divided into measures.
2.3. Construction of the Piano Performance Evaluation Model
Ode to Joy is the piece used for piano performance evaluation, and the construction of the evaluation model is analyzed through it. This piece allows the MIDI piano performance evaluation scheme to be assessed efficiently and accurately. The construction of the MIDI piano performance evaluation model is shown in Figure 8.
[figure(s) omitted; refer to PDF]
In Figure 8, the collected data are stored in a structured query language (SQL) database. The data preprocessing stage filters out data that are not suitable for training and transforms the raw data into input for deep learning training. The dataset is divided into a training set and a test set. The training set is used for model training, and the test set is used for piano performance evaluation. The classification label calculation for MIDI piano music evaluation prediction [26] is shown in equation (7) as follows:
In equation (8),
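The preprocessing and dataset split described for Figure 8 can be sketched as follows. Each time step is encoded as an 88-element vector, one element per piano key (MIDI key numbers 21 to 108), and the samples are then shuffled and split. The velocity scaling, the 80/20 split ratio, the fixed seed, and the function names are assumptions; the source does not state them.

```python
import random

LOWEST_KEY = 21   # MIDI key number of A0, the lowest piano key
NUM_KEYS = 88     # number of keys on a standard piano


def frame_vector(active_notes):
    """Encode the notes sounding at one time step as an 88-element vector.

    active_notes is a list of (midi_note, velocity) pairs; notes outside
    the piano range are dropped during this filtering step.
    """
    vec = [0.0] * NUM_KEYS
    for note, velocity in active_notes:
        idx = note - LOWEST_KEY
        if 0 <= idx < NUM_KEYS:
            vec[idx] = velocity / 127.0  # scale MIDI velocity to [0, 1]
    return vec


def train_test_split(samples, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed and split into (train, test) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


vec = frame_vector([(64, 127), (20, 100)])  # E4 is kept, key 20 is out of range
train, test = train_test_split(range(10))
print(len(vec), len(train), len(test))
```

Fixing the seed keeps the split reproducible, so the same test performances are held out every time the model is retrained.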
3. Results and Discussion
3.1. Implementation of LSTM Model Evaluation
Ode to Joy, a piano piece in 4/4 time, is used to analyze the evaluation model. The relationship between the number of nodes in the LSTM hidden layer and the
[figure(s) omitted; refer to PDF]
In Figure 9, the single- and double-layer LSTM models have smaller
3.2. Analysis of the Results of the Piano Performance Evaluation Model
The number of iterations of the piano performance evaluation model is set to 2000. The number of input layer nodes is 88, the number of output layer nodes is 5, and the learning rate is 0.001. To compare the accuracy of piano performance evaluation under different models, RNN, LSTM, and bidirectional LSTM (BLSTM) models are used for comparative analysis. The accuracy of the different models is shown in Figure 10.
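The difference between a unidirectional and a bidirectional model can be illustrated with a small forward pass. The sketch below reads a sequence of 88-element input vectors in both directions, concatenates the two hidden states, and maps the result to 5 output classes with a softmax. To stay short and dependency-free it uses a plain tanh recurrent step instead of a full LSTM cell, and the hidden size, random weights, and function names are all assumptions rather than the configuration used in this study.

```python
import math
import random


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def rnn_pass(seq, W_x, W_h):
    """Run a simple tanh recurrent layer over seq and return every state."""
    hidden = len(W_h)
    h = [0.0] * hidden
    states = []
    for x in seq:
        h = [math.tanh(sum(w * xi for w, xi in zip(W_x[j], x))
                       + sum(w * hi for w, hi in zip(W_h[j], h)))
             for j in range(hidden)]
        states.append(h)
    return states


def bidirectional(seq, fwd, bwd):
    """Concatenate forward-in-time and backward-in-time states per step."""
    f = rnn_pass(seq, *fwd)
    b = rnn_pass(seq[::-1], *bwd)[::-1]  # re-align backward states with time
    return [fs + bs for fs, bs in zip(f, b)]


def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def classify(seq, fwd, bwd, W_out):
    hidden = len(fwd[1])
    states = bidirectional(seq, fwd, bwd)
    # Each direction's final state: forward at the last step, backward at the first.
    summary = states[-1][:hidden] + states[0][hidden:]
    return softmax([sum(w * s for w, s in zip(row, summary)) for row in W_out])


rng = random.Random(0)
INPUTS, HIDDEN, CLASSES = 88, 8, 5  # HIDDEN is an illustrative choice
fwd = (rand_matrix(HIDDEN, INPUTS, rng), rand_matrix(HIDDEN, HIDDEN, rng))
bwd = (rand_matrix(HIDDEN, INPUTS, rng), rand_matrix(HIDDEN, HIDDEN, rng))
W_out = rand_matrix(CLASSES, 2 * HIDDEN, rng)
seq = [[rng.random() for _ in range(INPUTS)] for _ in range(4)]
print(classify(seq, fwd, bwd, W_out))
```

The backward pass lets the representation of each note depend on the notes that follow it as well as those that precede it, which is one plausible reason a bidirectional model can judge musical context better than a unidirectional one.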
[figure(s) omitted; refer to PDF]
In Figure 10, the accuracy of model training increases with the number of iterations. At 1000 iterations, the accuracy is 47.52% for the RNN model, 67.65% for the LSTM model, and 76.80% for the BLSTM model. At 2000 iterations, the accuracy of the RNN model has not changed significantly, while the LSTM model reaches 69.91% and the BLSTM model reaches 83.86%. Therefore, the BLSTM model has the highest test accuracy. The piano performance evaluation model is then tested with Ode to Joy, played at piano grades 10, 6, and 5, so that different levels of piano proficiency are evaluated. The evaluation results for the different piano playing levels are shown in Figure 11.
[figure(s) omitted; refer to PDF]
In Figure 11, the performance evaluation results for piano grade 10 are significantly higher than those for piano grades 6 and 5. The system’s evaluation results are consistent with the playing quality at each level. For piano grade 10, the overall evaluation is 0.91, the expressiveness is 0.83, and the rhythm is 0.83. The evaluated playing quality for piano grades 6 and 5 is also in line with the playing level of those grades. The data show that the model is feasible for use in a piano performance evaluation system.
Wang et al. proposed an end-to-end piano performance scoring system based on a convolutional neural network and an attention mechanism. It takes two sequences of acoustic features as input and directly predicts a performance score [27]. Luo and Ning used a neural network model to evaluate piano performance and to simulate a teacher guiding students through practice [28]. Both results are consistent with the results of this study and indicate that deep learning techniques can improve the efficiency of piano performance evaluation.
4. Conclusions
The research analyses the number of hidden layers realized by the LSTM model in the piano performance evaluation, the accuracy of the piano performance evaluation under different models, and the different piano performance levels under the BLSTM model. The research results show that the
[1] R. K. Maurya, A. C. Dediego, M. A. Bruce, "Application of yoga as a spiritual practice to enhance counselor wellness and effectiveness," Counseling and Values, vol. 66 no. 1, pp. 57-72, DOI: 10.1002/cvj.12144, 2021.
[2] Y. S. Kaleli, "The effect of individualized online instruction on TPACK skills and achievement in piano lessons," International Journal of Technology in Education, vol. 4 no. 3, pp. 399-412, 2022.
[3] H. Sezen, "Note reading methods used in piano education of 4 to 6 years old children," Educational Research and Reviews, vol. 16 no. 7, pp. 310-324, DOI: 10.5897/ERR2021.4174, 2021.
[4] D. P. Küçük, M. Durak, "Relationship between piano performance self-efficacy perceptions and exam anxieties of music teacher candidates," OPUS Uluslararası Toplum Araştırmaları Dergisi, vol. 17 no. 33, pp. 10-46, DOI: 10.26466/opus.805691, 2021.
[5] B. Bai, "Piano learning in the context of schooling during China’s ‘piano craze’ and beyond: motivations and pathways," Music Education Research, vol. 23 no. 4, pp. 512-526, DOI: 10.1080/14613808.2021.1929139, 2021.
[6] U. Malmendier, "FBBVA lecture 2020: exposure, experience, and expertise: why personal histories matter in economics," Journal of the European Economic Association, vol. 19 no. 6, pp. 2857-2894, DOI: 10.1093/jeea/jvab045, 2021.
[7] A. Lazzarini, "Rebuilding reciprocity for civil economics and civil education in the age of complexity," MeTis-Mondi educativi. Temi indagini suggestioni, vol. 10 no. 2, pp. 146-161, DOI: 10.30557/MT00139, 2020.
[8] T. Bobbe, L. Oppici, L. M. Lüneburg, O. Münzberg, S. C. Li, S. Narciss, K. H. Simon, J. Krzywinski, E. Muschter, "What early user involvement could look like–developing technology applications for piano teaching and learning," Multimodal Technologies and Interaction, vol. 5 no. 7, DOI: 10.3390/mti5070038, 2021.
[9] Y. S. Kaleli, "The effect of computer-assisted instruction on piano education: an experimental study with pre-service music teachers," International Journal of Technology in Education and Science, vol. 4 no. 3, pp. 235-246, DOI: 10.46328/ijtes.v4i3.115, 2020.
[10] S. Kim, J. M. Park, S. Rhyu, J. Nam, K. Lee, "Quantitative analysis of piano performance proficiency focusing on difference between hands," PLoS One, vol. 16 no. 5, article e0250299, DOI: 10.1371/journal.pone.0250299, 2021.
[11] S. T. Taba, B. D. Arhatari, Y. I. Nesterets, Z. Gadomkar, S. C. Mayo, D. Thompson, J. Fox, B. Kumar, Z. Prodanovic, D. Hausermann, A. Maksimenko, "Propagation-based phase-contrast CT of the breast demonstrates higher quality than conventional absorption-based CT even at lower radiation dose," Academic Radiology, vol. 28 no. 1, pp. e20-e26, DOI: 10.1016/j.acra.2020.01.009, 2021.
[12] R. M. Bilder, K. S. Postal, M. Barisa, D. M. Aase, C. M. Cullum, S. R. Gillaspy, L. Harder, G. Kanter, M. Lanca, D. M. Lechuga, J. M. Morgan, R. Most, A. E. Puente, C. M. Salinas, J. Woodhouse, "InterOrganizational practice committee recommendations/guidance for teleneuropsychology (TeleNP) in response to the COVID-19 pandemic," The Clinical Neuropsychologist, vol. 34 no. 7-8, pp. 1314-1334, DOI: 10.1080/13854046.2020.1767214, 2020.
[13] Z. Hu, Y. Li, S. Zou, H. Xue, Z. Sang, X. Liu, Y. Yang, X. Zhu, D. Liang, H. Zheng, "Obtaining PET/CT images from non-attenuation corrected PET images in a single PET system using Wasserstein generative adversarial networks," Physics in Medicine & Biology, vol. 65 no. 21, article 215010, DOI: 10.1088/1361-6560/aba5e9, 2020.
[14] B. Bragagnolo, D. Guigue, "Analysis of sonority in piano pieces," Revista Música, vol. 20 no. 1, pp. 219-248, DOI: 10.11606/rm.v20i1.168644, 2020.
[15] P. Luizard, J. Steffens, S. Weinzierl, "Singing in different rooms: common or individual adaptation patterns to the acoustic conditions?," The Journal of the Acoustical Society of America, vol. 147 no. 2, pp. EL132-EL137, DOI: 10.1121/10.0000715, 2020.
[16] A. Sharafati, S. B. Haji Seyed Asadollah, D. Motta, Z. M. Yaseen, "Application of newly developed ensemble machine learning models for daily suspended sediment load prediction and related uncertainty analysis," Hydrological Sciences Journal, vol. 65 no. 12, pp. 2022-2042, DOI: 10.1080/02626667.2020.1786571, 2020.
[17] C. López, S. Linares-Mustarós, J. Vinas, "The use of fuzzy mathematical tools for local public services outsourcing according to typology," Journal of Intelligent & Fuzzy Systems, vol. 38 no. 5, pp. 5379-5389, DOI: 10.3233/JIFS-179631, 2020.
[18] N. Shi, "Improving undergraduate novice programmer comprehension through case-based teaching with roles of variables to provide scaffolding," Information, vol. 12 no. 10, DOI: 10.3390/info12100424, 2021.
[19] J. H. Nieminen, L. Tuohilampi, "‘Finally studying for myself’–examining student agency in summative and formative self-assessment models," Assessment & Evaluation in Higher Education, vol. 45 no. 7, pp. 1031-1045, DOI: 10.1080/02602938.2020.1720595, 2020.
[20] L. Ruthotto, S. J. Osher, W. Li, L. Nurbekyan, S. W. Fung, "A machine learning framework for solving high-dimensional mean field game and mean field control problems," Proceedings of the National Academy of Sciences, vol. 117 no. 17, pp. 9183-9193, DOI: 10.1073/pnas.1922204117, 2020.
[21] H. W. Loh, C. P. Ooi, S. G. Dhok, M. Sharma, A. A. Bhurane, U. R. Acharya, "Automated detection of cyclic alternating pattern and classification of sleep stages using deep neural network," Applied Intelligence, vol. 52 no. 3, pp. 2903-2917, DOI: 10.1007/s10489-021-02597-8, 2022.
[22] Y. Wang, L. Zhu, H. Xue, "Ultra-short-term photovoltaic power prediction model based on the localized emotion reconstruction emotional neural network," Energies, vol. 13 no. 11, DOI: 10.3390/en13112857, 2020.
[23] Z. Zhang, H. Luo, C. Wang, C. Gan, Y. Xiang, "Automatic modulation classification using CNN-LSTM based dual-stream structure," IEEE Transactions on Vehicular Technology, vol. 69 no. 11, pp. 13521-13531, DOI: 10.1109/TVT.2020.3030018, 2020.
[24] Y. Ghatas, M. Fayek, M. Hadhoud, "A hybrid deep learning approach for musical difficulty estimation of piano symbolic music," Alexandria Engineering Journal, vol. 61 no. 12, pp. 10183-10196, DOI: 10.1016/j.aej.2022.03.060, 2022.
[25] S. Cotter, "Architectonisation: the Spatio-temporal rhythms of contemporary sculptural practices," Architectural Design, vol. 92 no. 2, pp. 22-29, DOI: 10.1002/ad.2789, 2022.
[26] L. Qiu, S. Li, Y. Sung, "DBTMPE: deep bidirectional transformers-based masked predictive encoder approach for music genre classification," Mathematics, vol. 9 no. 5, DOI: 10.3390/math9050530, 2021.
[27] W. Wang, J. Pan, H. Yi, Z. Song, M. Li, "Audio-based piano performance evaluation for beginners with convolutional neural network and attention mechanism," IEEE/ACM Transactions on Audio, Speech, and Language Processing, vol. 29, pp. 1119-1133, DOI: 10.1109/TASLP.2021.3061267, 2021.
[28] W. Luo, B. Ning, "Toward piano teaching evaluation based on neural network," Scientific Programming, vol. 2022, DOI: 10.1155/2022/6328768, 2022.
Copyright © 2022 Xunyun Chang and Liangqing Peng.
Abstract
With the development of society and the progress of technology, the piano education industry has gained a large market. Because tuition fees in the piano education industry are high, scientific and automatic piano performance evaluation has attracted attention. However, since most piano performance evaluation schemes are rule-based, they ignore the continuity of the piano music and the accuracy of playing. Therefore, the purpose of this work is to design a scientific piano performance evaluation scheme that can contribute to the sustainable development of the piano education industry. Firstly, long short-term memory in deep learning is explored. Secondly, the musical characteristics of piano performance are analyzed according to the musical instrument digital interface. The piano music features are extracted, and a long short-term memory-based musical instrument digital interface piano performance evaluation model is constructed. Finally, the number of hidden layers used in the long short-term memory model for piano performance evaluation is analyzed, and the accuracy of piano performance evaluation under different models is compared. Under the bidirectional long short-term memory network model, different piano performance levels are evaluated to study piano performance evaluation strategies. Compared with the accuracy of the recurrent neural network and the long short-term memory model with different hidden layers, the bidirectional long short-term memory model has the highest test accuracy, with an average of 69.78%. When the hidden layer of the bidirectional long short-term memory model is 3, the loss function