Reservoir computing is a promising method for implementing high-efficiency artificial intelligence at lower learning costs than conventional artificial neural networks, because the system has fewer learning parameters and can process time series data.[1–3] These advantages can be realized in a reservoir computing system by meeting three requirements: nonlinear transformation, short-term memory (fading memory), and the ability to map to a higher dimensional space. Although various physical devices (e.g., electrical circuits, electrochemistry, magnetics, optics, robotics, ion-gating devices, and so on) have recently been utilized to create physical reservoir systems that mimic biological systems with nonbiological substrates,[4–36] issues remain due to the high electrical power consumption, low accuracy rates, and large volumes associated with such systems. Among related research fields, the spin wave is attracting attention due to its charge-less transport and low electric power consumption.[37–40] Recently, a reservoir computing system with spin wave propagation in an active-ring resonator has been reported,[16–18] although it has drawbacks (e.g., large volume, including the resonator circuit, and low memory capacity). As a way of solving such problems, micromagnetic simulations have shown that spin wave interference in ferromagnetic materials satisfies the three requirements of a physical reservoir.[41–46] However, there have been no experimental demonstrations in which nonlinear interference of spin waves has been applied to a reservoir computing system.
Here, we describe the first demonstration of a physical reservoir computing system utilizing interfered spin waves. In the subject system, a yttrium-iron-garnet (YIG) single crystal with multiantennas, which excite and detect multiple spin waves, is used as a homogeneous medium. A hand-written digit recognition task and nonlinear time series data prediction tasks were performed to evaluate the performance of the subject physical reservoir system. The maximum testing accuracy rate for the hand-written digit recognition task was 89.6%, which is comparable to or higher than the scores of previously reported high-performance physical reservoirs. Memory capacity and the solvability of nonlinear autoregressive moving average (NARMA) and second-order nonlinear dynamic tasks were dramatically improved by interfered spin wave multidetection, which was used experimentally for the first time. Minimum errors of 8.37 × 10−5 (second-order nonlinear dynamic task) and 1.81 × 10−2 (NARMA2) were achieved, which are dramatically lower than the errors of any other experimental reservoir system reported to date. One of the most noteworthy points in this article is that our reservoir computing system can predict NARMA10 more precisely than any experimental spintronics reservoir system due to its large short-term memory capacity (CSTM ≈ 35.5 per 100 nodes) and high nonlinearity, which were achieved utilizing interfered spin wave multidetection. This article is organized in the following order: physical characteristics of the YIG single crystal with multiantennas, demonstration of a physical reservoir system, nonlinearity validated by micromagnetic simulation, and evaluation of short-term memory and nonlinearity of the reservoir by experimental measurement.
The Physical Characteristics of YIG Single Crystal with Multiantennas and the Concept of Reservoir Computing
Figure 1a shows a YIG single crystal with deposited coplanar antennas and its experimental configuration. A static magnetic field was applied perpendicular to the YIG surface. Two exciters and two detectors were situated so as to detect multiple signals from interfered spin waves. Microwave current, injected into the exciters by an arbitrary waveform generator (AWG), induces a microwave Oersted field, which drives the precession of the magnetic moments localized near the antenna. This precession is carried away from the antennas. An inverse process then occurs, where the dynamic magnetic dipole field, produced by the precession of the moments, induces an electromotive force in the detectors. The wavenumber k of the spin wave is determined by the geometric shape of the coplanar antenna, since the spin wave is excited by microwave current flowing in the antenna. Thus, the k-space distribution of the electric current density is equivalent to that of the amplitude of the excited spin wave. Spin wave spectroscopy of a de-embedded transmission signal, which is shown in Figure 1b, shows the magnetic field dependence of the excited spin wave frequency. Spin wave propagation is shown as white lines, while the horizontal line does not result from spin waves. The spin wave resonance frequency, under the application of a perpendicular magnetic field, is described as follows [Image Omitted. See PDF] where f, γ, H, Ha, M, k, and d are frequency, gyromagnetic ratio, applied external field, magnetic anisotropy field, saturation magnetization, wavenumber, and YIG thickness, respectively. Here, γ and M are set at 2.8 MHz Oe−1 and 1984 Gauss, the latter obtained from the magnetization measurement shown in Figure 1c. The calculated k is 0.134 μm−1 for k1 and 0.385 μm−1 for k2, which are in good agreement with the k1 of 0.118 μm−1 and k2 of 0.311 μm−1 obtained from the calculated distribution of current density.
Details of the dimensions of the single crystal and antennas, k estimation, and spin-wave spectroscopy are given in S1 (Supporting Information). Here, in the experimental configuration shown schematically in Figure 1d, detected spin waves are affected by two factors: 1) the interference of spin waves propagating from exciters A and B; and 2) the historic effect of remnant spin precession originating from the traveling spin wave. These effects are mentioned in some theoretical studies, which utilize nonlinear interference and the historic effect of spin waves.[41–43,45,46] The nonlinear interference of spin waves in this experiment results from magnetic dipole interaction between nonlinearly excited spin waves. The traveling spin wave causes spin fluctuation through the remaining spin precession. The degree of this fluctuation depends on the past input information, so the historic effect of remnant spin precession provides memory. Figure 1e shows the dependence of the spin-wave-induced voltage on the magnetic field and the input pulse voltage. The rise and fall times and pulse-on time are 320 ps, and there are obvious variations between the signals. While the amplitude of a wave packet is enhanced with increasing magnetic field up to 186 mT, the amplitude of wave packets above 186 mT weakens. This is because the spin wave resonance frequency shifts with the magnetic field, as already shown in Figure 1b. Figure 1f shows the nonlinearity of an interfered spin wave. As shown in the upper panel, a spin wave traveling between Exciter A and Detector A differs from one traveling between Exciter B and Detector A. Thus, a multidetection technique can extract a variety of signals. As shown in the lower panel, the difference between the interfered wave and a linear combination of the spin waves shown in the upper panel becomes apparent after 10 ns, which shows that the interfered wave exhibits nonlinearity.
Nonlinear phenomena of spin waves, such as parametric pumping and chaos, are known to arise from nonlinear interference induced by the instability of spin precession.[47] The nonlinear interference of spin waves in this experiment results from magnetic dipole interaction between nonlinearly excited spin waves, as observed in previous theoretical studies.[41,45,46]
Figure 1. Concept of a reservoir computing system with interfered spin wave multidetection. a) (upper panel) YIG single crystal deposited coplanar antennas and its experimental configuration. (Lower panel) Crystal structure of the YIG. FeI and FeII denote Fe ions at tetrahedral and octahedral sites. b) Spin wave spectroscopy of de-embedded transmission signal. Exciter A and Detector A are used. The solid blue and red lines show the fitting curves with the magnetic field dependence of spin wave frequency. c) Magnetization as a function of applied in-plane magnetic field for YIG single crystal. d) Schematic illustration of propagation and interference of spin waves. Pulsed voltage input by AWG induces an Oersted field (solid black arrows) around the antenna. The voltage induced by the precession of magnetic moment is detected by an oscilloscope. The white arrows denote traveling spin waves excited by exciters. e) Induced voltage variation of spin wave packets at various magnetic fields and input pulse voltages. Exciter A and Detector A are used. f) (upper panel) Voltage induced by spin waves traveling between Exciter A and Detector A and between Exciter B and Detector A. (lower panel) Compared voltage induced by an interfered spin wave and a linear combination of two spin waves and its difference. g) General concept of a reservoir computing system with spin wave interference. u(k), Xi(k), Wi, and y(k) denote input waveform, i-th virtual node state at time step k, i-th output weight, and reservoir readout.
This nonlinearity meets one of the necessary requirements of a physical reservoir. Details of the nonlinearity of an interfered spin wave and the attenuation length and velocity of spin waves are described in S2 and S3 (Supporting Information). Figure 1g is a schematic illustration of a reservoir computing system with an interfered spin wave. Time-series data u(k) is input to the reservoir from the input layer. The input data are then transformed into nonlinear data due to spin wave interference and the historic effect. The signal affected by the interference and historic effect is directly measured by an oscilloscope. Thus, a detected signal is extracted as i-th virtual nodes Xi(k), namely, neurons interacting with each other in the reservoir, to map input data to high dimensional space. A reservoir readout y(k) is expressed as the product of each Xi(k) and learning parameters Wi, as follows[Image Omitted. See PDF]where n, k, b are the number of nodes, the discrete time, and bias, respectively.
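The readout described above, a weighted sum of the virtual node states plus a bias, can be sketched as follows. This is a minimal illustration and not the authors' code: the training rule (ridge regression), the regularization value, and the names `train_readout` and `readout` are assumptions introduced here, and the toy states are random numbers rather than measured spin wave voltages.

```python
import numpy as np

def train_readout(X, d, ridge=1e-6):
    """Fit output weights W and bias b so that X @ W + b approximates d.

    X: (T, n) array, one row of n virtual node states per discrete time k.
    d: (T,) array of target values.
    ridge: regularization strength (assumed value, not from the article).
    """
    T = X.shape[0]
    Xb = np.hstack([X, np.ones((T, 1))])            # extra column carries the bias b
    A = Xb.T @ Xb + ridge * np.eye(Xb.shape[1])
    w = np.linalg.solve(A, Xb.T @ d)                # regularized least squares
    return w[:-1], w[-1]

def readout(X, W, b):
    """y(k) = sum_i W_i * X_i(k) + b for every time step k."""
    return X @ W + b

# Toy check with synthetic states (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
true_W = np.array([0.5, -1.0, 0.25, 0.0, 2.0])
d = X @ true_W + 0.3
W, b = train_readout(X, d, ridge=1e-10)
y = readout(X, W, b)
```

Only W and b are trained; the reservoir itself stays fixed, which is why the learning cost is low.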
Hand-Written Digit Recognition Task
The first demonstration performed with our reservoir computing system was of a hand-written digit recognition task, as shown in Figure 2a. Hand-written digits are expressed by combinations of voltage signals with 16 different binary pulsed voltages. In the reservoir, these 16 different voltage signals are transformed into 16 different waveforms through the YIG single crystal, since excited spin waves also have various shapes. Thus, each hand-written digit is reconstructed by voltages extracted from the 16 waveforms. Figure 2b shows the normalized induced-voltage variation for the various pulsed 4-bit data. These plots were taken from the voltage values at a time point of 44 ns for 16 waveforms acquired under a magnetic field of 180 mT, as shown in Figure S4g (Supporting Information). The individual values do not overlap each other, meaning that the reservoir system gives the input data sufficient diversity. The fading memory property in the reservoir system is important for this task, since the degree of dispersion of the 16 different induced voltages is based on past input pulses in a 4-bit pulse train. Therefore, Figure 2b reveals that the spin wave possesses a sufficiently large fading memory property to perform the hand-written digit recognition task. Figure 2c–e shows the accuracy rate for hand-written digit recognition tasks with the interfered spin wave, and with spin waves propagating from Exciters A and B, respectively. Although the accuracy rates depend on when the reservoir state is extracted, there is no systematic dependence on time. The best accuracy rate for an interfered spin wave is 89.31% (39 ns) at 180 mT (indicated by a black arrow). The best accuracy rate for a spin wave excited at Exciter A is 89.6% (36 ns) at 172 mT (indicated by a red arrow). The best accuracy rate for a spin wave excited at Exciter B is 89.47% (38 ns) at 200 mT (indicated by a light blue arrow).
The accuracy rate hardly differs between the cases with and without interference because of the introduction of the sigmoid function. As is commonly done in physical reservoirs, a sigmoid function is used in this task to transform the output values of the physical device in a nonlinear manner. The sigmoid function reduced the difference in nonlinearity between the responses with and without interference, and as a result, no difference appeared in the accuracy rate for the task. As shown in Figure 2f, the accuracy rate improved from 69.59% as the number of training samples increased.
Figure 2. Hand-written digit recognition task. a) General concept process flow diagram of a hand-written digit recognition task using a reservoir computing system with interfered spin wave. b) Normalized induced voltage variation at various pulsed 4-bit data. The standard error is less than 7.89 × 10−4. c–e) Accuracy rate variations at various times when the value of the induced voltage is extracted from an interfered spin wave (c), spin wave excited from Exciter A (d), and spin wave excited from Exciter B (e). Colored arrows denote the best accuracy rate in each condition. f) Dependence of the accuracy rate for hand-written digit recognition on the number of training samples.
The maximum accuracy rate in this experiment is superior to those of several other physical reservoir computing systems (e.g., memristor[6] and magnetic devices[11]), and is comparable to the excellent 91.3% score achieved with an optical element,[20] which has a larger node number of 512, and with an ionic liquid device,[28] although all of these are lower than the 94.4% achieved with a biomolecular memristor.[35] Refer to S4 (Supporting Information) for the method used for the hand-written digit recognition task.
Nonlinear Dynamical System Prediction Task
Time series data prediction tasks are widely performed to evaluate the nonlinear transform function of a reservoir system. The process flow for such a task is shown in Figure 3a. In a second-order nonlinear dynamical equation task, a random wave is input to a second-order nonlinear dynamical system. The output d(k) from this dynamical system at discrete time k is described as follows [Image Omitted. See PDF] d(k) depends not only on the current input u(k) but also on the past two states d(k − 1) and d(k − 2) at discrete times k − 1 and k − 2. The second term on the right-hand side of Equation (3) is the cross term that makes it a second-order nonlinear system. Before being input to the reservoir system, the original random wave is converted into a pulsed signal with intervals of 2, 5, 10, 15, and 20 ns as preprocessing. Each of these signals is input to a reservoir computing system to which perpendicular magnetic fields of 150, 169, 176, 186, 200, and 250 mT are applied. One interval of a pulsed signal was considered as one discrete time step. Pulsed voltage was input into Exciters A and B, which excited various spin waves depending on the amplitude and interval of the pulsed voltage. These spin waves reach Detectors A and B, with interference from other spin waves, and are converted to voltage signals. The virtual nodes are equally spaced over the whole period within one time step. Fifty virtual nodes were taken from the voltage values in a region that does not contain the cross-talk component from the pulse input; they are denoted as black points in the figure showing the spin wave signals measured at Detectors A and B for each discrete time. As seen in Figure 1f, nonlinear components are not present throughout the entire wave packet. If the pulse interval is short, it is not possible to capture the large nonlinear component.
However, in a reservoir computing task, time series data consisting of pulse voltages are input to the device, and nonlinearity is expected to arise from the interaction of the spin wave of the next time step with the spin precession remaining after the previous spin wave has propagated. Furthermore, the amplitude of the pulsed voltage is important for nonlinearity, as shown in S7 (Supporting Information). Thus, both the interval and amplitude of the pulsed voltages are important for virtual nodes to capture the nonlinearity induced by spin wave interference. A total of 100 virtual nodes per discrete time can be extracted by utilizing Detector A and Detector B. Then, the 100 node values at each discrete time, referred to as a “reservoir state”, were generated by connecting these virtual nodes. Figure 3b shows a comparison between the theoretical output of Equation (3) and the predicted output reconstructed from the reservoir computing system in the training phase, which is measured under a magnetic field of 169 mT, at an interval of 5 ns. The normalized mean square error (NMSE) for this task is described as follows [Image Omitted. See PDF]
Figure 3. Nonlinear dynamical system prediction task. a) General concept process flow diagram of a time-series prediction task using a reservoir computing system with the interfered spin wave. b,c) Predicted results at the training phase (b) and testing phase (c) of a second-order nonlinear system. The black, green, and red lines denote the target, and prediction results at the training and testing phases, respectively. d) NMSE variation for prediction of a second-order nonlinear system at various magnetic fields and intervals. (Upper-left) Detector A without interference. (Upper-middle) Detector B without interference. (Upper-right) Multidetection without interference. (Lower-left) Detector A with interference. (Lower-middle) Detector B with interference. (Lower right) Multidetection with interference. e) Comparison of NMSE of various devices. The NMSEs of an electric double layer-ion gating reservoir (EDL-IGR),[31] biomolecular memristor,[35] spin torque oscillator,[11] and WOx-memristor[6] are also shown.
Here, T, d(k), and yp(k) are the length of the training phase (T = 3500) or test phase (T = 500), the target signal, and the predicted signal, respectively. NMSE in the training phase is 7.66 × 10−5. A new random input is prepared for the testing phase, so as to verify that the trained reservoir computing system can predict the output of Equation (3). The results for the testing phase are compared in Figure 3c. NMSE for the testing phase exhibits a similar value of 8.37 × 10−5. Figure 3d shows how NMSE changes with the interval and the magnetic field. Under all measurement conditions, NMSE tends to be lower in stronger magnetic fields and at shorter intervals, and higher in weaker magnetic fields and at longer intervals. A comparison of the two upper panels (w/o interference, Detector A and Detector B) with the two lower panels (with interference, Detector A and Detector B) of the figure shows that NMSE drops overall when interfered spin waves are utilized. NMSE does not depend on the position of the detection antenna, although it changes slightly both with and without interference. (“w/o interference” means that a spin wave is excited by a single exciter. In this case, since a spin wave excited at Exciter A (Exciter B) does not meet a spin wave excited at Exciter B (Exciter A), the spin waves do not interfere. “With interference” means that two spin waves are excited at the two exciters at the same time step, and these waves interact with each other before reaching the detector.) However, NMSE dropped drastically to 8.37 × 10−5 when interfered spin wave multidetection was utilized, as shown by comparing the two right-hand panels (Detectors A + B) with the remaining four panels (Detector A and Detector B) in the figure.
This value is much lower than the values reported for other physical reservoir computing systems: the NMSEs of a theoretical reservoir computing system with 24 spin torque oscillators[11] and of a biomolecular memristor[35] were ≈1.31 × 10−3 and 9.4 × 10−4, respectively, that of an experimental reservoir computing system with 90 metal-oxide memristors[6] was ≈3.13 × 10−3, and that of an electric double layer-based ion-gating reservoir (EDL-IGR)[31] was 2.1 × 10−4, as shown in Figure 3e. The subject nonlinear interfered spin wave multidetection has excellent nonlinearity, which reduces the minimum NMSE by ≈60% compared with another experimental reservoir (8.37 × 10−5 versus 2.1 × 10−4).[31] The details of the interval dependence of reservoir performance are described in S6 (Supporting Information).
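Because Equation (3) is not reproduced in this text, the following sketch uses a form of the second-order nonlinear dynamical system that is commonly used as a reservoir computing benchmark. The coefficients (0.4, 0.4, 0.6, 0.1) and the input range are assumptions and may differ from the authors' Equation (3); the function name `second_order_target` is hypothetical. The second term is the cross term d(k − 1)d(k − 2), consistent with the description in the text.

```python
import numpy as np

def second_order_target(u):
    """Target d(k) of a second-order nonlinear dynamical system.

    Assumed form (common benchmark, not confirmed by the article):
        d(k) = 0.4*d(k-1) + 0.4*d(k-1)*d(k-2) + 0.6*u(k)**3 + 0.1
    """
    u = np.asarray(u, dtype=float)
    d = np.zeros(len(u))
    for k in range(2, len(u)):
        d[k] = 0.4 * d[k - 1] + 0.4 * d[k - 1] * d[k - 2] + 0.6 * u[k] ** 3 + 0.1
    return d

# Random input wave; the range [0, 0.5] is an assumed, commonly used choice.
rng = np.random.default_rng(1)
u = rng.uniform(0.0, 0.5, size=4000)
d = second_order_target(u)

# Hand-checkable case with zero input: d(2) = 0.1, d(3) = 0.14, d(4) = 0.1616.
d_zero = second_order_target(np.zeros(6))
```

A target series generated this way would be paired with the measured reservoir states to train the output weights.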
The nonlinear autoregressive moving average (NARMA) is a more difficult task than the second-order nonlinear dynamic task, since to predict the output of a NARMA model, a reservoir system is required not only to perform nonlinear transform functions, but also to exhibit fading memory. The NARMA task is frequently used in studies of physical reservoir computing, since it indicates the expressive power of a physical reservoir for processing more concrete application tasks (e.g., spoken-digit recognition, abnormal electrocardiogram detection, and sunspot data prediction). Here, we introduce NARMA2 and NARMA10, which require fading memory from the previous 2 and 10 steps,[48] respectively, as defined below [Image Omitted. See PDF] and [Image Omitted. See PDF]
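Equations (5) and (6) are not reproduced in this text. As an illustration of what NARMA2 and NARMA10 targets look like, the sketch below uses the forms that are widely used in the reservoir computing literature; the coefficients are assumptions and may not match the authors' equations exactly, and the function names are hypothetical.

```python
import numpy as np

def narma2(u):
    """Assumed NARMA2 form:
    y(k+1) = 0.4*y(k) + 0.4*y(k)*y(k-1) + 0.6*u(k)**3 + 0.1
    """
    u = np.asarray(u, dtype=float)
    y = np.zeros(len(u))
    for k in range(1, len(u) - 1):
        y[k + 1] = 0.4 * y[k] + 0.4 * y[k] * y[k - 1] + 0.6 * u[k] ** 3 + 0.1
    return y

def narma10(u):
    """Assumed NARMA10 form:
    y(k+1) = 0.3*y(k) + 0.05*y(k)*sum_{i=0..9} y(k-i) + 1.5*u(k-9)*u(k) + 0.1
    """
    u = np.asarray(u, dtype=float)
    y = np.zeros(len(u))
    for k in range(9, len(u) - 1):
        window = np.sum(y[k - 9:k + 1])             # y(k-9) ... y(k), ten past outputs
        y[k + 1] = 0.3 * y[k] + 0.05 * y[k] * window + 1.5 * u[k - 9] * u[k] + 0.1
    return y

# Hand-checkable values with zero input.
y2 = narma2(np.zeros(5))       # y2[2] = 0.1, y2[3] = 0.14
y10 = narma10(np.zeros(13))    # y10[10] = 0.1, y10[11] = 0.1305
```

The ten-step sum and the delayed input u(k − 9) are what make NARMA10 demand a longer fading memory than NARMA2.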
Figure 4a,b shows comparisons between the theoretical outputs of Equation (5) and the predicted outputs reconstructed from the reservoir computing system in the training phase and testing phase, which are measured under a magnetic field of 169 mT and an interval of 5 ns, for NARMA2. NMSE variations at various magnetic fields and intervals are summarized in Figure 4c. Here, NMSE for NARMA tasks is described as follows [Image Omitted. See PDF] where dave. is the time average of d(k). The measurement condition dependence of NMSEvar is similar to that for the second-order nonlinear dynamics task. The lowest NMSEvar for this task was 1.81 × 10−2, which is dramatically lower than those of experimental physical reservoir computing systems previously reported.[8–10,30,31] Thus, reservoir computing systems with interfered spin wave multidetection show excellent computational performance, as shown in Figure 4d. The input voltage dependence of the errors for second-order nonlinear dynamic equation and NARMA2 tasks is described in S7 (Supporting Information). Figure 4e,f shows comparisons between the theoretical outputs of Equation (6) and the predicted outputs reconstructed from the reservoir computing system in the training phase and the testing phase, which are measured under a magnetic field of 186 mT and an interval of 20 ns, for NARMA10. The NMSEvar for NARMA10 was 1.68 × 10−1 in the testing phase, under a magnetic field of 186 mT and an interval of 20 ns, when utilizing interfered spin wave multidetection. Figure 4g shows the NMSEvar variation at various intervals and magnetic fields. The tendency differs from that of the NMSEvar for the NARMA2 prediction: the NMSEvar is lower in stronger magnetic fields and at longer intervals, and higher under the remaining conditions.
A comparison of the two upper panels (w/o interference, Detector A and Detector B) with the two lower panels (with interference, Detector A and Detector B) shows that NMSEvar does not depend on the position of the detection antenna, although it changes slightly both with and without interference. However, NMSEvar dropped overall, and reached its lowest value of 1.68 × 10−1, when interfered spin wave multidetection was utilized, as shown by the comparison between the two right-hand panels (Detectors A + B) and the remaining four panels (Detector A and Detector B) in the figure. This result indicates that the large CSTM and the high ability to map to high dimensions, achieved with 100 nodes extracted from the multidetection, are critically important in solving NARMA10 tasks. This result confirms that a reservoir computing system with interfered spin wave multidetection exhibits the highest performance of all experimental reservoir computing systems, with the exception of reservoir systems with large volumes (i.e., Mackey–Glass oscillators[4,5] and optical elements[22–26]), as shown in Figure 4h. Refer to the Supporting Information for the reproducibility of the task (S8) and for a comparison of reservoir volumes, performance, and computing speeds between this study and optical elements (S9).[49]
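The NMSEvar equation is not reproduced in this text. Since the text defines dave. as the time average of d(k), one plausible reading is a squared error normalized by the variance of the target; the sketch below implements that reading as an assumption, with the hypothetical name `nmse_var`.

```python
import numpy as np

def nmse_var(d, y_p):
    """Variance-normalized error (assumed form):
    sum_k (d(k) - y_p(k))**2 / sum_k (d(k) - d_ave)**2
    """
    d = np.asarray(d, dtype=float)
    y_p = np.asarray(y_p, dtype=float)
    d_ave = d.mean()                                # time average of the target
    return float(np.sum((d - y_p) ** 2) / np.sum((d - d_ave) ** 2))

# Two reference points: a perfect prediction and a constant prediction at the mean.
d = np.array([0.1, 0.3, 0.2, 0.4, 0.5])
err_perfect = nmse_var(d, d)
err_mean = nmse_var(d, np.full_like(d, d.mean()))
```

Under this normalization a predictor that always outputs the target mean scores 1, so values well below 1 indicate that the readout tracks the fluctuations of the target.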
Figure 4. The results of NARMA2 and NARMA10 prediction tasks. a,b) Prediction results for the training phase (a) and testing phase (b) of the NARMA2 system. The black, green, and red lines denote the target, and the prediction results in the training and testing phases, respectively. c) NMSEvar variation for prediction of a NARMA2 system at various magnetic fields and intervals. (Upper-left) Detector A without interference. (Upper-middle) Detector B without interference. (Upper-right) Multidetection without interference. (Lower-left) Detector A with interference. (Lower-middle) Detector B with interference. (Lower-right) Multidetection with interference. d) The NARMA2 benchmark. The diode circuit,[8] electrochemical cell,[9] carbon nanotube (CNT) network,[10] and EDL-IGR[31] are also shown. e,f) Prediction results for a NARMA10 system for the training phase (e) and the testing phase (f) of a reservoir computing system with a magnetic field of 186 mT and an interval of 20 ns. The black, green, and red lines denote the target, and the prediction results for the training and testing phases, respectively. g) NMSEvar variation for prediction of the NARMA10 system at various magnetic fields and intervals. (Upper-left) Detector A without interference. (Upper-middle) Detector B without interference. (Upper-right) Multidetection without interference. (Lower-left) Detector A with interference. (Lower-middle) Detector B with interference. (Lower-right) Multidetection with interference. h) The NARMA10 benchmark. The optics,[22–26] Mackey–Glass oscillator,[4,5] AMR array,[19] and MEMS[33] are also shown.
To reveal the origin of the excellent nonlinearity shown in the second-order nonlinear dynamical equation and NARMA2 tasks, micromagnetic simulation of the nonlinear interfered spin wave was performed. The degree of interaction between spin waves should be modulated by a change in the pulse interval. The pulse interval dependence of the nonlinearity of an interfered spin wave was investigated by comparing waveforms simulated under various pulse interval conditions. The nonlinear ratio is defined as (Pint.–Plin.)/Pint., where Pint. and Plin. denote the time-averaged output intensity of interfered spin waves and of the linear summation of spin waves excited from Exciter A and Exciter B, respectively. Figure 5a shows the simulation model of the nonlinear interfered spin wave multidetection. The size of the antennas and the distance between them are set to the same dimensions as in the actual device. Figure 5b–e shows the simulated spin wave motion at an external magnetic field of 0.3 T and an input pulse interval of 5 ns. The input signal is shown in Figure 5j,k. As can be seen in Figure 5b,d, spin waves excited at two different exciters (A and B) show different waveforms, even though the detection position is the same (Detector A). This result corresponds to the behavior observed in the experiment shown in Figure 1f. The interfered spin wave can also be observed in this simulation, and there is a finite difference Δmx between the interfered spin wave excited by Exciters A and B simultaneously (Figure 5f) and the linear summation of the two spin waves excited separately by Exciters A and B (Figure 5b,d), as shown in Figure 5h. This is evidence that spin waves interfere nonlinearly due to magnetic dipole interactions between spin waves.[41] A similar relationship was confirmed in the case of spin waves at Detector B, as shown in Figure 5c,e,g,i. The nonlinear ratio variation of interfered spin waves at various pulse intervals is shown in Figure 5l.
The nonlinear ratio at Detector A and Detector B reaches a maximum value of 119% and 145%, respectively, at a pulse interval of 5 ns. The nonlinear ratio then dropped at 10 ns and remained relatively low up to an interval of 20 ns. This volcano-shaped curve forms as a result of the historic effect, which is the interaction between the spin wave already present and a subsequently excited spin wave. The nonlinearity resulting from the historic effect can also be seen in a theoretical reservoir model with a nonlinear interfered spin wave.[41] The simulation result with the strongest nonlinearity, at an interval of 5 ns, is in good agreement with the experimental result, in which the best performance for the second-order nonlinear dynamic equation and NARMA2 tasks was achieved at a pulse interval of 5 ns, as shown in Figure 5m. Thus, we are able to theoretically confirm that a pulse interval of 5 ns gives the system the strongest nonlinearity with the present magnetic parameters (magnetic anisotropy and saturation magnetization). This further indicates that the best pulse interval for the tasks varies among magnetic materials, according to their inherent magnetic parameters. Refer to S10 (Supporting Information) for a description of the versatility of nonlinear interfered spin wave multidetection.[50]
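The nonlinear ratio (Pint. − Plin.)/Pint. defined in this section can be computed from three waveforms as sketched below. How the "time-averaged output intensity" is evaluated is not specified in this text, so the intensity measure is left as a parameter; the default used here (the mean square of the waveform) is an assumption, and the function names are hypothetical. With a nonnegative intensity such as the mean square, the ratio cannot exceed 1, so the reported values above 100% suggest that the authors' intensity measure differs from this default.

```python
import numpy as np

def mean_square(v):
    """Assumed intensity measure: time average of the squared waveform."""
    return float(np.mean(np.square(v)))

def nonlinear_ratio(v_int, v_a, v_b, intensity=mean_square):
    """(P_int - P_lin) / P_int.

    v_int: waveform detected when both exciters are driven together.
    v_a, v_b: waveforms detected when only Exciter A or only Exciter B is driven.
    """
    p_int = intensity(np.asarray(v_int, dtype=float))
    p_lin = intensity(np.asarray(v_a, dtype=float) + np.asarray(v_b, dtype=float))
    return (p_int - p_lin) / p_int

# Synthetic waveforms (hypothetical, not simulation output).
t = np.linspace(0.0, 1.0, 1000)
v_a = np.sin(2 * np.pi * 5 * t)
v_b = 0.5 * np.sin(2 * np.pi * 5 * t + 0.3)
ratio_linear = nonlinear_ratio(v_a + v_b, v_a, v_b)          # purely linear superposition
ratio_scaled = nonlinear_ratio(2.0 * (v_a + v_b), v_a, v_b)  # interfered wave twice as large
```

A purely linear superposition gives a ratio of zero, so any nonzero value measures the departure of the interfered wave from the sum of the individually excited waves.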
Figure 5. The simulated spin wave motion and its nonlinear interference. a) Simulation model of nonlinear interfered spin wave multidetection with a waveguide of 380 × 90 × 0.12 μm3. The cubic mesh employed measured 40 nm × 40 nm × 40 nm. b,d,f) Simulated spin wave motions, which are excited at Exciter A (b), Exciter B (d), and Exciter A and Exciter B (f), at Detector A. c,e,g) Simulated spin wave motions that are excited at Exciter A (c), Exciter B (e), and Exciter A and Exciter B (g), at Detector B. h,i) Difference between the interfered spin wave and linear summation at Detector A (h) and Detector B (i). j,k) Magnetic field generated at Exciter A and Exciter B. l) Nonlinear ratio variation of the interfered spin wave at various pulse intervals. m) NMSE and NMSEvar. of the second-order nonlinear dynamic equation and NARMA2 tasks, respectively, under a magnetic field of 200 mT.
To corroborate the high performance achieved by the reservoir with interfered spin wave multidetection, the properties of the reservoir were validated in terms of short-term memory and nonlinearity. A short-term memory task was performed to determine whether the system could recover past input data from the current reservoir state. A random wave with 5000 time steps was prepared and utilized for this task. The first half of the 500 time steps were discarded. The target output d(k) is u(k–τ), which is the input time series delayed by τ time steps. The output weight coefficient Wout was determined using the training set. The system then makes predictions on the test set, and the square of the correlation coefficient between the ideal targets and the model predictions is determined by utilizing the relationship described as follows [Image Omitted. See PDF] where Cov(A, B) is the covariance between vectors A and B, and Var(A) ≅ Cov(A, A). r2 takes values between 0 and 1, where the value of 1 indicates perfect replication of the targets. The short-term memory capacity CSTM is then calculated by taking the sum of r2 over the range of delays τ. CSTM is defined as follows [Image Omitted. See PDF]
The prediction ability decreases as the step delay τ increases, as shown in the forgetting curves presented in Figure 6a. This behavior clearly shows the short-term memory feature. While a reported experimental reservoir system with spin wave propagation, which does not utilize interference and multidetection, is able to hold the memory of 5 steps [indicated by a black arrow in Figure 6a],[18] the short-term memory that our reservoir system achieves is retained for far longer, at 36 and 10 steps [indicated by blue (condition I) and orange (condition II) arrows in Figure 6a]. This is made possible by utilizing interfered spin wave multidetection. The respective memory capacities (CSTM) under various measurement conditions are calculated from the area under the forgetting curves, as summarized in Figure 6b. CSTM improves in the region above a magnetic field of 169 mT and an interval length of 10 ns. The CSTM reaches its largest value of 35.5 per 100 nodes at a magnetic field of 186 mT and an interval of 20 ns, which is much larger than the respective CSTM values of below 4.85 per 20 nodes and 12.0 per 200 nodes of experimental reservoir computing systems that utilize spin waves with an active ring resonator and anisotropic magnetoresistance.[16–19] Thus, our reservoir system has a larger CSTM than said reservoir systems with spin waves. Although the reservoir system under condition I in Figure 6b is able to predict a NARMA10 model with the lowest NMSEvar of 1.68 × 10−1, it cannot precisely predict second-order nonlinear dynamic tasks and NARMA2. In contrast, the reservoir system under condition II can precisely predict these tasks, even though its CSTM of 13.1 is smaller than the 35.5 of condition I. This fact indicates that the system under condition I may not have enough nonlinearity to solve NARMA2.
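The short-term memory evaluation described above (train a linear readout to reproduce u(k − τ), compute the squared correlation r2 on a test set, and sum r2 over the delays) can be sketched as follows. This is an illustration under stated assumptions, not the authors' analysis code: the ridge regularization, the train/test split, and the name `forgetting_curve` are introduced here, and the "reservoir" in the toy check is an ideal three-step delay line rather than measured data.

```python
import numpy as np

def forgetting_curve(X, u, max_delay, n_train, ridge=1e-8):
    """Return r2(tau) for tau = 1..max_delay.

    X: (T, n) reservoir states, u: (T,) input series.
    For each delay the target is d(k) = u(k - tau).
    """
    T = len(u)
    Xb = np.hstack([X, np.ones((T, 1))])            # bias column
    r2 = np.zeros(max_delay)
    for tau in range(1, max_delay + 1):
        S = Xb[tau:]                                # states at k = tau .. T-1
        d = u[:T - tau]                             # matching targets u(k - tau)
        S_tr, d_tr = S[:n_train], d[:n_train]
        S_te, d_te = S[n_train:], d[n_train:]
        A = S_tr.T @ S_tr + ridge * np.eye(S.shape[1])
        w = np.linalg.solve(A, S_tr.T @ d_tr)       # trained output weights
        y = S_te @ w
        c = np.cov(d_te, y)                         # 2 x 2 covariance matrix
        r2[tau - 1] = c[0, 1] ** 2 / (c[0, 0] * c[1, 1])
    return r2

# Toy reservoir that stores exactly the last three inputs (hypothetical).
rng = np.random.default_rng(2)
u = rng.uniform(0.0, 1.0, size=1000)
X = np.zeros((1000, 3))
for j in range(3):
    X[j + 1:, j] = u[:1000 - (j + 1)]               # column j holds u(k - (j + 1))
r2 = forgetting_curve(X, u, max_delay=5, n_train=700)
c_stm = float(r2.sum())                             # short-term memory capacity
```

In this toy case r2 is close to 1 for delays of 1 to 3 steps and close to 0 beyond, so the capacity is about 3, which mirrors how the area under a forgetting curve counts the number of past steps the reservoir retains.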
Figure 6. Short-term memory and nonlinearity of the reservoir system. a) Forgetting curves corresponding to conditions I and II. The solid black line is the forgetting curve of a spin wave delay line active ring resonator.[18] The horizontal dashed line indicates a squared correlation of 0.5. The inset shows the definition of CSTM, which is the area under a forgetting curve, painted in gray. b) The memory capacity variation for short-term memory tasks at various magnetic fields and input intervals. I and II are the conditions that achieve the lowest NMSEvar. for NARMA10 and NARMA2, respectively. c,d) The relation between CSTM, maximum Lyapunov exponent λmax, and NMSEvar. of NARMA10 (c) and NARMA2 (d). e,f) The return maps of the reservoir corresponding to condition I with node 15. g,h) The return maps of the reservoir corresponding to condition II with node 15. i) The trajectory at the 2nd period (left panel), 3rd period (middle left panel), 4th period (middle right panel), and 5th period (right panel) of (e).
To evaluate the nonlinearity of the system, the maximum Lyapunov exponent λmax was estimated. The Lyapunov exponent λ is defined as follows: [Image Omitted. See PDF]
Here, t is the iteration time, and λ is computed from the derivative of the mapping function. λmax is the maximum value of the calculated λ and is generally used to determine whether the response of a system is orderly or disorderly (i.e., chaotic): when λmax is negative (positive), the system is orderly (disorderly). Furthermore, the degree of nonlinearity of the system increases as λmax increases. As shown in Figure 6c,d, the system has a λmax of 0.22 and 0.37 under conditions I and II, respectively. The positive λmax indicates the chaotic nature of an interfered spin wave in a YIG single crystal. Because of its larger λmax, the system under condition II has higher nonlinearity than that under condition I. The Lyapunov spectra for all conditions are shown in S11 (Supporting Information).[51–53] To eliminate the possibility of random responses from the system, the return maps of the interfered spin wave are plotted, as shown in Figure 6e–h. All return maps show characteristic attractors. The trajectory in the figure exhibits aperiodic features, as indicated by the trajectories at the 2nd–5th periods in Figure 6i. Thus, this feature reflects chaotic behavior in the reservoir system.
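To make the sign convention of the Lyapunov exponent concrete, the sketch below averages ln|f′(x_t)| along a trajectory of a one-dimensional map, which is the standard textbook form of the definition. The logistic map, its parameter values, and the function names are illustrative assumptions only; they are not the spin wave system and do not reproduce the values reported here.

```python
import math

def lyapunov_1d(f, df, x0, n_steps=20000, n_transient=1000):
    """Average ln|f'(x_t)| along a trajectory of the map x -> f(x)."""
    x = x0
    for _ in range(n_transient):  # let the trajectory settle first
        x = f(x)
    total = 0.0
    for _ in range(n_steps):
        total += math.log(abs(df(x)))
        x = f(x)
    return total / n_steps

def logistic(r):
    """Illustrative map f(x) = r x (1 - x) and its derivative."""
    return (lambda x: r * x * (1.0 - x)), (lambda x: r * (1.0 - 2.0 * x))

lam_ordered = lyapunov_1d(*logistic(2.5), x0=0.2)  # settles on a fixed point
lam_chaotic = lyapunov_1d(*logistic(3.9), x0=0.2)  # chaotic regime
print(lam_ordered, lam_chaotic)
```

The first exponent comes out negative (orderly response) and the second positive (chaotic response), matching the interpretation of the sign of λmax given in the text.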
From the results in this section, we found that there is a trade-off between the nonlinearity and the memory capacity of the reservoir system, which is well known as a general rule:[54] the λmax and CSTM of condition I (condition II) are 0.22 (0.37) and 35.5 (13.1), respectively. To predict second-order nonlinear tasks and NARMA2, it is more important for the system to possess high nonlinearity than a large CSTM. In contrast, the prediction of NARMA10 requires a large CSTM together with nonlinearity, both of which are realized by utilizing interfered spin wave multidetection. Furthermore, other physical systems with large feedback circuits (e.g., optical elements) that are able to precisely predict NARMA10 are reported to have a large CSTM of 21.14 per 50 nodes.[25] In contrast, physical systems that predict NARMA2 but cannot predict NARMA10 have a small CSTM, indicating a trade-off between CSTM and NMSEvar. for certain tasks such as predicting NARMA10.[8–10] Realizing a large CSTM without any feedback circuit is one of the unique advantages of the reservoir with nonlinear interfered spin wave multidetection.
Electric Power Consumption and Network Architecture
Echo state networks and liquid state machines, which are recurrent neural network models used in reservoir computing, have been implemented on complementary metal oxide semiconductor (CMOS) circuits, and their practical electric power consumption and network architecture have been presented.[55,56] Here, we discuss the electric power consumption and network architecture of the reservoir system in this study.
Spin waves are a Joule-loss-free information transfer method because they do not involve a flow of electric charge. However, a current must be applied to generate a local magnetic field as an input signal. A total of 5000 steps (1 pulse voltage per step) was used in this study: 500 discarded steps, a training period of 3500 steps, 500 discarded steps, and a test period of 500 steps. The energy consumed to excite the spin waves is 0.57 μJ per integration for one input (without interference) and 1.13 μJ for two inputs (with interference). Since the waveform is integrated 500 times, the signal is also input 500 times, so the total is 284 μJ for one input (without interference) and 567 μJ for two inputs (with interference). Echo state networks and liquid state machines implemented in CMOS circuits utilize memristor arrays.[55,56] In contrast, the reservoir computing in this study utilizes a single device with high-frequency antennas on a YIG single crystal. Therefore, the device structure is quite simple and can be miniaturized by a microfabrication process. In this study, a computer is used to calculate the reservoir output from the waveform acquired from the physical device, but in the future, it will be possible to design circuits based on fine devices such as magnon transistors, which operate with an extremely low energy consumption of 10⁻¹⁸ J (much smaller than the standard value for CMOS of 10⁻¹⁶ J),[37] and magnetoresistive random access memories.
Conclusion
We achieved the first demonstration of an experimental reservoir computing system with interfered spin wave multidetection. Our reservoir computing system achieved a recognition rate of 89.6% for hand-written digits. The NMSEs for a second-order nonlinear dynamic task and NARMA2 were 8.37 × 10⁻⁵ and 1.81 × 10⁻², respectively, which are dramatically lower than the NMSEs of any experimental reservoir computing system reported to date. The subject reservoir system can also predict the output from a NARMA10 model, with an NMSE of 1.68 × 10⁻¹, which is the lowest value so far reported for an experimental spintronic reservoir computing system. The largest CSTM in this study was 35.5 per 100 nodes, which is larger than the CSTM of any spin wave propagation system reported to date,[16–18] and the system also has high nonlinearity, which is generally in a trade-off relationship with CSTM. These high-performance functions were achieved by utilizing the unique advantages of interfered spin wave multidetection. Since this technique can be applied not only to bulk crystals but also to thin films with extremely small volumes, the system concept, which utilizes both spin wave interference and multidetection, can contribute to the implementation of integrated physical reservoir systems for real-world practical uses.
Experimental Section
Preparation of a Device for Reservoir Computing System with Interfered Spin Wave
A one-side-polished YIG single crystal with ⟨111⟩ orientation, which was grown by the floating zone technique, was supplied by MTI Co. (USA). The diameter and thickness of the single crystal were 5 and 0.5 mm, respectively. Coplanar waveguides, each consisting of a 10 μm wide signal line and two 20 μm wide ground lines, were fabricated by a conventional lithography technique. Ti and Au were deposited consecutively by an electron beam evaporator. The distance between the edges of the antennas was 30 μm. Details of the dimensions of the device are given in the Supporting Information.
Experimental Set-Up for Spin Wave Detection
All experiments were performed in a high-frequency signal measurement system, consisting of rf probes and an electromagnet, which was made by Toei Scientific Industrial Co., Ltd. An external magnetic field was applied perpendicular to the sample surface (i.e., along the ⟨111⟩ direction of the YIG single crystal). The sample was kept at room temperature (i.e., 295 ± 1 K). The exciters and detectors shown in Figure 1a were connected, through rf probes, to an arbitrary waveform generator (Tektronix AWG5202) and a mixed signal digital oscilloscope (Tektronix MSO68B), respectively. A pulse voltage was input to the exciter to excite the spin waves. The rise time, fall time, and pulse-on time were each 320 ps. The amplitude of the pulse voltage was 400 mV. The input and output signals were amplified by 30 and 38 dB, respectively. To avoid detection of residual spin waves excited by previous sequences, long intervals of 4 μs were inserted between sequences. Five hundred waveforms were averaged to improve the signal-to-noise ratio (S/N). The number of repetitions will be reduced in future work. This is possible because the S/N can be improved by developing signal-amplifying electrical circuits that are dramatically reduced in size.
Hand-Written Digit Recognition Task
The reservoir system was trained and tested with the commonly used Modified National Institute of Standards and Technology (MNIST) database.[57] The database contains 60 000 samples for training and 10 000 samples for testing. Each grayscale image in the dataset comprises 28 × 28 pixels. First, the original images were converted into monochrome images with a binary state of “0” or “1”. Each image was then transformed into a 196 × 4 matrix. Each row of the matrix was encoded by choosing from a total of 16 different 4-bit time-series data (i.e., “0000”, “0001”, “0010”, …, and “1111”). “0” and “1” were set to V = 0 mV and V = 400 mV, respectively. The rise, fall, and pulse-on times were each 320 ps. The pulse interval was set to 360 ps, and V = 0 mV was also applied during this time. Sixteen different time-series data were input to Exciter A and/or Exciter B of the physical reservoir, and the induced voltage, which results from a spin wave reaching Detector A after the fourth pulse input, was obtained as the reservoir state. Thus, the pulse trains were input to Exciter A and/or Exciter B one by one, and the 16 waveforms resulting from the 16 pulse train inputs were recorded one by one. Here, the time at which the reservoir state was taken was varied in steps of 1 ns so as to investigate the time dependence of the recognition accuracy rate. In the case without interference, Exciter A and Detector A were used to input the 4-bit signals and to read out the spin wave signals, respectively. In the case with interference, Exciter A and Exciter B were used to input a signal, and Detector A was used to read out the spin wave signal. A 4-bit pulse train was input to Exciter A, Exciter B, and both Exciter A and Exciter B. Then, the corresponding 16 spin wave signals are detected at Detector B. Sixteen voltage values were collected from the 16 spin wave signals. The extracted voltages were normalized.
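The encoding step described above (binarize, reshape to 196 × 4, and map each 4-pixel row to one of 16 pulse patterns) can be sketched as follows. This is a hedged illustration, not the authors' preprocessing code: the binarization threshold, the assumption that pixel values lie in [0, 1], the bit order within a row, and the names `encode_image`, `to_reservoir_features`, and `state_table` are all hypothetical. In the experiment, `state_table` would hold the 16 normalized voltages measured for the 16 pulse trains; random placeholder values are used here.

```python
import numpy as np

def encode_image(image, threshold=0.5):
    """Binarize a 28 x 28 image and return 196 pattern indices in 0..15."""
    binary = (np.asarray(image, dtype=float) >= threshold).astype(int)  # "0" or "1"
    rows = binary.reshape(196, 4)       # 196 x 4 matrix, four pixels per row
    weights = np.array([8, 4, 2, 1])    # assumed bit order: "0001" -> 1, "1111" -> 15
    return rows @ weights

def to_reservoir_features(pattern_indices, state_table):
    """Replace every 4-pixel row by the reservoir state recorded for its pattern."""
    return state_table[pattern_indices]

# Placeholder lookup table: one normalized voltage per 4-bit pulse train.
rng = np.random.default_rng(0)
state_table = rng.uniform(0.0, 1.0, size=16)

# Toy image: the first four pixels and the eighth pixel of the top row are set.
image = np.zeros((28, 28))
image[0, :4] = 1.0
image[0, 7] = 1.0

indices = encode_image(image)
features = to_reservoir_features(indices, state_table)
print(indices[:3], features.shape)
```

Because only 16 distinct pulse trains exist, the device has to be measured only 16 times per configuration; every image is then converted to 196 reservoir-state values by table lookup.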
Then, every four pixels of the original 196 × 4 matrix were replaced with the corresponding reservoir state, and the digit data, converted to 196 reservoir state values, were trained and classified in the readout network. With the reservoir state matrix as X and the weight matrix of the readout network as W, the readout function h(X) is defined as follows: [Image Omitted. See PDF] and [Image Omitted. See PDF]
Here, the activation function is the sigmoid function. The output values were ideally equal to the target values of 1.0 for the appropriate digit and 0.0 for the remainder. In the training phase, the optimum weight coefficient vector was determined by minimizing the difference between the target and the output values. The squared error is defined as [Image Omitted. See PDF]
The weights W were updated by ΔW so as to minimize the squared error, where ΔW is defined as follows: [Image Omitted. See PDF]
The learning rate α was set to 0.1, and the training was repeated 20 times. In the testing phase, the weight coefficient vector determined in the training phase was used. The accuracy rate was calculated as the ratio of the number of correctly classified samples to the total number of samples.
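The readout training described above can be sketched as a single-layer sigmoid classifier trained by gradient descent on the squared error. Since the equations are not reproduced in this text, the exact forms used here are assumptions: h(X) = sigmoid(W X), E = ½‖h − t‖², per-sample updates ΔW = −α ∂E/∂W, and no bias term. The learning rate of 0.1 and the 20 passes follow the text; the synthetic three-class data are purely illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_readout(X, labels, n_classes=10, alpha=0.1, epochs=20, seed=0):
    """Train readout weights W by gradient descent on E = 0.5 * ||h - t||^2."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=0.01, size=(n_classes, X.shape[1]))
    targets = np.eye(n_classes)[labels]  # 1.0 for the correct class, 0.0 otherwise
    for _ in range(epochs):
        for x, t in zip(X, targets):     # per-sample (online) update, an assumption
            h = sigmoid(W @ x)
            grad = np.outer((h - t) * h * (1.0 - h), x)  # dE/dW
            W -= alpha * grad
    return W

def accuracy(W, X, labels):
    """Ratio of correctly classified samples to all samples."""
    predicted = np.argmax(sigmoid(W @ X.T), axis=0)
    return float(np.mean(predicted == labels))

# Synthetic stand-in for reservoir-state features (three well-separated classes).
rng = np.random.default_rng(1)
labels = rng.integers(0, 3, size=300)
X = np.eye(3)[labels] + 0.05 * rng.normal(size=(300, 3))

W = train_readout(X, labels, n_classes=3)
acc = accuracy(W, X, labels)
print(acc)
```

For the digit task, `X` would have 196 columns (one per reservoir state value) and `n_classes` would be 10.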
One device was used for the reservoir. The maximum number of electrodes used was three: two exciters (Exciter A and Exciter B) and one detector (Detector A). In the case with interference, three antennas were used (i.e., Exciter A, Exciter B, and Detector A), giving two electrode pairs (Exciter A and Detector A, and Exciter B and Detector A). In the case without interference, two antennas (Exciter A and Detector A, or Exciter B and Detector A) were used, giving one electrode pair. The inputs were applied sequentially. The result was represented by combinations of separate responses to characteristic features in the input data.
Nonlinear Time Series Data Prediction Task
The subject reservoir computing system was trained and tested with a random waveform to predict the output from a NARMA model. For input to the reservoir computing system, the original random waveform u(k) ∈ [0, 0.5], with 5000 time steps, was transformed into pulsed waveforms with rise and fall times of 360 ps, a pulse-on time of 0 ps, and various intervals of 2–20 ns. Pulsed waveforms, with a maximum value of 400 mV, were applied to Exciter A and/or Exciter B. The voltages induced by the spin waves reaching each detector were measured by an oscilloscope, and 50 virtual nodes per detector were taken from each induced voltage. Thus, 50 reservoir states (without multidetection) or 100 reservoir states (with multidetection) were obtained from the 1D input u(k) by the reservoir. After the first 1000 time steps were discarded, the time steps were separated into a training phase of 3500 time steps and a test phase of 500 time steps. In the training phase, the weight parameters connecting the reservoir states and the readout node were optimized so as to minimize the difference between the target waveform d(k) from the theoretical model (i.e., Equations (3), (5), and (6)) and the reservoir output y(k). Thus, the weight coefficients wi in y(k) are optimized so that y(k) corresponds to d(k). This is described as follows: [Image Omitted. See PDF]
Here, n and b are the total number of nodes and a bias term, respectively. The weights wi were optimized by ridge regression during training. To compare the performance of the reservoir system with that of other systems, the errors were calculated by using Equations (4) and (7). The experiment is described in detail in the Supporting Information.
One device was used for this task. The maximum number of antenna electrodes used was four: two exciters (Exciter A and Exciter B) and two detectors (Detector A and Detector B). In the case with interference and with multidetection, two exciters (i.e., Exciter A and Exciter B) and two detectors (i.e., Detector A and Detector B) were used, giving four antenna electrode pairs (i.e., Exciter A and Detector A, Exciter A and Detector B, Exciter B and Detector A, and Exciter B and Detector B). In the case with interference and without multidetection, two exciters (i.e., Exciter A and Exciter B) and one detector (i.e., Detector A or Detector B) were used, giving two antenna electrode pairs (i.e., Exciter A and Detector A (Detector B), and Exciter B and Detector A (Detector B)). In the case without interference and with multidetection, one exciter (i.e., Exciter A) and two detectors (i.e., Detector A and Detector B) were used, giving two antenna electrode pairs (i.e., Exciter A and Detector A, and Exciter A and Detector B). In the case without interference and without multidetection, one exciter (i.e., Exciter A) and one detector (i.e., Detector A or Detector B) were used, giving one antenna electrode pair. The inputs were applied sequentially.
Ridge Regression for Solving Second-Order Nonlinear Dynamic Equation Tasks, NARMA2, and NARMA10 Tasks
In the time series data analysis tasks, the readout network of the nonlinear interfered spin wave multidetection reservoir was trained by ridge regression. The reservoir output y(k), as shown in Equation (2), is transformed to [Image Omitted. See PDF]
Here, W = (w0, w1, …, wn) and X(k) = (X0(k), X1(k), …, Xn(k))ᵀ are the weight vector and the reservoir state vector with a reservoir size of n, respectively. Note that w0 = b and X0(k) = 1 to introduce the bias b shown in Equation (2). The cost function J(W) in the ridge regression is defined as follows: [Image Omitted. See PDF] where T, β, and yt(k) are the data length in the training phase, the ridge parameter, and the target output generated by Equation (3), (5), or (6), respectively. We fixed T = 3500 and β = 0 for all the tasks demonstrated in Figures 3 and 4. The optimal weight matrix, which minimizes the cost function J(W), is given by the following equation: [Image Omitted. See PDF]
Here, Y = (yt(1), yt(2), …, yt(T)), X = (X(1), X(2), …, X(T)), and I (∈ ℝ^((n+1)×(n+1))) are the target output vector, the reservoir state matrix, and the identity matrix, respectively.
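The closed-form ridge solution can be sketched as below. The weight equation itself is not reproduced in this text, so the form used here, W = Y Xᵀ (X Xᵀ + βI)⁻¹ with X the (n + 1) × T state matrix including the constant row X0(k) = 1, is the standard ridge-regression expression and should be read as an assumption consistent with the definitions above. The synthetic states and target are illustrative only and stand in for the measured virtual-node voltages and the NARMA targets.

```python
import numpy as np

def ridge_readout(states, target, beta=0.0):
    """Return W = (w0, w1, ..., wn) minimizing the ridge cost.

    states: array of shape (T, n), one row of reservoir states per time step.
    target: array of shape (T,), the target output yt(k).
    """
    T = states.shape[0]
    X = np.vstack([np.ones(T), states.T])    # (n+1) x T, first row is X0(k) = 1
    A = X @ X.T + beta * np.eye(X.shape[0])  # X X^T + beta * I
    # Solve A W^T = X Y^T rather than forming the inverse explicitly.
    # With beta = 0 this requires X X^T to be non-singular.
    return np.linalg.solve(A, X @ target)

def predict(W, states):
    """Reservoir output y(k) = w0 + sum_i wi * Xi(k)."""
    return W[0] + states @ W[1:]

# Synthetic check: a target that is exactly linear in the states is recovered.
rng = np.random.default_rng(0)
states = rng.normal(size=(3500, 5))          # T = 3500 as in the text, n = 5 here
true_w = np.array([0.3, -1.0, 0.5, 2.0, 0.0, 0.7])
target = true_w[0] + states @ true_w[1:]

W = ridge_readout(states, target, beta=0.0)
print(np.round(W, 6))
```

A positive β shrinks the weights and stabilizes the solution when the virtual nodes are strongly correlated; with β = 0 the expression reduces to ordinary least squares.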
Then, after learning the readout weights, the computational performance was evaluated by “NMSE” for the second-order nonlinear dynamic equation task and by “NMSEvar.” for the NARMA2 and NARMA10 tasks, as shown in Equations (4) and (7) in the main text.
Micromagnetic Simulation
To investigate the variation of nonlinearity at various pulse intervals, we performed a simulation using the Mumax3 micromagnetic simulator.[58] A YIG region of 380 μm × 90 μm × 0.12 μm is used as the spin wave waveguide to investigate the spin dynamics in the near-surface region in the vicinity of an antenna. Each of the two exciters consists of a signal line (10 μm × 90 μm × 0.12 μm) and two ground lines (20 μm × 90 μm × 0.12 μm). The detection areas of the two detectors are 10 μm × 90 μm × 0.12 μm, which corresponds to the signal lines of the detectors. The mesh was cubic, measuring 40 nm × 40 nm × 40 nm along the Cartesian coordinates defined by the origin at the center of the surface plane on the YIG. A spin with a saturation magnetization of 157.9 kA m⁻¹ was located at every mesh corner. The simulation time step was 10 ps. The material parameters estimated from experimental measurement are a saturation magnetization of 157.9 kA m⁻¹ and a uniaxial anisotropy constant KU along the z-axis of −5.75 kJ m⁻³. A cubic magnetocrystalline anisotropy of 0, an exchange stiffness constant of 3.7 pJ m⁻¹, and a damping constant of 0.0001 are used as typical values for YIG. A static magnetic field of 0.3 T is applied along the z-axis (i.e., perpendicular to the YIG surface) over the entire region. An excitation field with a rectangular pulse shape, rise and fall times of 320 ps, and an amplitude of 80 mT along the y-axis is applied at the exciters. The field vectors at the signal and ground lines are positive and negative, respectively, since the electric current flows in opposite directions.
Lyapunov Spectrum
The Lyapunov spectrum is an index used to evaluate orbital instability, which is one of the features of chaos. The Lyapunov exponent λ is an indicator of the nonlinearity of complex systems, including physical reservoirs.[59] The λ of nonlinear interfered spin wave multidetection was calculated by the Jacobi matrix method[60] and is defined as Equation (10): [Image Omitted. See PDF]
Here, t is the iteration time, and λ is computed from the derivative of the mapping function. If λ is positive (negative), neighboring orbits diverge (converge). In practice, the Lyapunov exponent is calculated by taking a hypersphere (ε-sphere) of minute radius ε around a point in m-dimensional space, assuming that its change after one step can be approximated linearly, and then estimating the Jacobi matrix from that variation.
The displacement vector μi ∈ ℝ^(N+L) of v(ki) as seen from v(t) is expressed as [Image Omitted. See PDF]
Since v(t) and v(ki) evolve into v(t + s) and v(ki + s) after time s elapses, the displacement vector zi at time t + s can be described as [Image Omitted. See PDF]
Under the assumption that ε and s are sufficiently small, the displacement vectors before and after time s can be related by a linear approximation as follows: [Image Omitted. See PDF] where the linear map is the Jacobi matrix to be estimated. Thus, the Jacobi matrix can be estimated as follows: [Image Omitted. See PDF]
The estimated Jacobi matrix at t = 0 is factorized by QR decomposition as follows: [Image Omitted. See PDF] where Q and R are an orthogonal matrix and an upper triangular matrix, respectively. At time t + 1, the Jacobi matrix is multiplied by the orthogonal matrix from one time step earlier to obtain the following relation: [Image Omitted. See PDF]
Using the upper triangular matrix obtained in this way at each time, λ is obtained as follows: [Image Omitted. See PDF]
Here, the equation uses the i-th diagonal element of the upper triangular matrix Rk.
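The QR-based accumulation described above can be sketched as follows. This is a hedged illustration under clearly stated assumptions: the exponents are taken as the time average of ln|(Rk)ii|, which is the usual form of this method, and the Jacobi matrices come analytically from the Hénon map (a = 1.4, b = 0.3) rather than being estimated from ε-sphere neighbors in measured spin wave data. The map and the function name `lyapunov_spectrum` are illustrative choices, not part of the original analysis.

```python
import numpy as np

def lyapunov_spectrum(jacobians):
    """Average ln|diag(R_k)| over a sequence of Jacobi matrices J_k."""
    dim = jacobians[0].shape[0]
    Q = np.eye(dim)
    log_sum = np.zeros(dim)
    for J in jacobians:
        # Multiply by the orthogonal matrix from the previous step, then re-factorize.
        Q, R = np.linalg.qr(J @ Q)
        log_sum += np.log(np.abs(np.diag(R)))
    return log_sum / len(jacobians)

# Illustrative system (an assumption, not the measured data): the Henon map
# x' = 1 - a x^2 + y, y' = b x, whose Jacobi matrix is [[-2 a x, 1], [b, 0]].
a, b = 1.4, 0.3
x, y = 0.0, 0.0
jacobians = []
for step in range(6000):
    if step >= 1000:  # skip the initial transient
        jacobians.append(np.array([[-2.0 * a * x, 1.0], [b, 0.0]]))
    x, y = 1.0 - a * x * x + y, b * x

spectrum = lyapunov_spectrum(jacobians)
print(spectrum)
```

A useful sanity check is that the exponents sum to the average of ln|det Jk|, which for this map is ln b at every step. The largest exponent is positive, which is the signature of chaotic behavior used in the main text.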
Acknowledgements
This work was supported in part by Japan Society for the Promotion of Science (JSPS) KAKENHI grant nos. JP22H04625 (Grant-in-Aid for Scientific Research on Innovative Areas “Interface Ionics”) and JP21J21982 (Grant-in-Aid for JSPS Fellows). A part of this work was supported by the Yazaki Memorial Foundation for Science and Technology and Kurata grants from The Hitachi Global Foundation.
Conflict of Interest
The authors declare no conflict of interest.
Author Contributions
W.N., D.N., T.T., and K.T. conceived the idea for the study. W.N. and D.N. designed the experiments. W.N. and T.T. wrote the article. W.N. and Y.Y. carried out the experiments. W.N. prepared the samples. W.N., D.N., T.T., and Y.Y. analyzed the data. All authors discussed the results and commented on the manuscript. K.T. directed the projects.
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
© 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Physical reservoir computing, which is a promising method for the implementation of highly efficient artificial intelligence devices, requires a physical system with nonlinearity, fading memory, and the ability to map in high dimensions. Although micromagnetic simulations suggest that spin wave interference can perform highly efficient reservoir computing, there has been no experimental verification to date. Herein, reservoir computing is demonstrated that utilizes multidetected nonlinear spin wave interference in a yttrium-iron-garnet single crystal. The subject computing system achieves excellent performance when used for hand-written digit recognition, second-order nonlinear dynamical tasks, and nonlinear autoregressive moving average (NARMA) tasks. It is of particular note that the normalized mean square errors for NARMA2 and second-order nonlinear dynamical tasks are 1.81 × 10⁻² and 8.37 × 10⁻⁵, respectively, which are the lowest figures for any experimental physical reservoir so far reported. This high performance is achieved through the high nonlinearity and the large memory capacity of interfered spin wave multidetection.
Details




1 International Center for Materials Nanoarchitectonics (WPI-MANA), National Institute for Materials Science, Ibaraki, Japan
2 International Center for Materials Nanoarchitectonics (WPI-MANA), National Institute for Materials Science, Ibaraki, Japan; Faculty of Science, Tokyo University of Science, Tokyo, Japan
3 Faculty of Science, Tokyo University of Science, Tokyo, Japan