Recently, research on and development of physical reservoir computing has seen increased activity because of its tremendous potential to significantly reduce computational resources compared to conventional machine learning approaches, which are based solely on semiconductor integrated circuits.[1,2] Various materials and devices, including soft bodies, optical devices, analogue circuits, spin torque oscillators, memristors, nanowire networks, and ion-gating transistors, have been reported to function as physical reservoirs, which require nonlinearity, high dimensionality, and short-term memory,[2–26] but the computing performance demonstrated to date has been far from satisfactory. One characteristic that is proving extremely difficult to achieve in a physical reservoir is high dimensionality, which in essence means obtaining a sufficient number of reservoir states from the output of the reservoir. This is because the outputs of physical reservoirs are measured as a small number of time-series responses with a limited number of detecting probes (e.g., electrodes, sensors), which are attached to or arranged around the reservoir under severe geometrical constraints. This is in direct contrast to fully simulated reservoirs, in which unrestricted access to the reservoir states of nodes is possible. Virtual node methods are useful in compensating for this lack of high dimensionality and are thus widely used.[7–10] In such methods, postprocessing allows a large number of virtual nodes to be obtained from given time-series data. However, there is a known trade-off between increasing the number of virtual nodes and the diversity of each virtual node.[27] It is thus not straightforward to secure a sufficient number of diverse virtual nodes from a limited number of time-series responses, which makes physical reservoirs impracticable.
Therefore, to achieve practical use, it is necessary to explore physical reservoirs that have diverse outputs.
Here, we report a redox-based ion-gating reservoir (redox-IGR) composed of all-solid-state redox transistors,[28–41] which can derive double reservoir states from drain and gate current response, based on ion insertion and desertion (redox) through a solid electrolyte,[42–46] at a Li+–electron mixed conductor, LixWO3. The redox mechanism, as well as the electric double layer mechanism, is useful for the conductance modulation of semiconductor channels.[47–53] Using sequential gate voltage pulse trains, a drain current (electronic current) flows through a LixWO3 thin film channel, where it is modified by a redox reaction with a Li+-ion conducting glass ceramic (LICGC) substrate through the modulation of the conducting electron density, so as to generate a nonlinear time-series response in the drain current. Simultaneously, a relatively large gate current for the redox process (lithium-ion current) can provide another time-series response. In the normal measurement configuration of transistor devices, two responses (drain and gate) with different characteristics are easily obtained. This increases the number of virtual nodes by overcoming the said trade-off relationship. By employing a redox-IGR, second-order nonlinear dynamical tasks and a second-order nonlinear autoregressive moving average (NARMA2) were successfully solved, with normalized mean square errors (NMSEs) of 5.39 × 10−4 and 0.163, respectively. Said IGR structure, with inorganic materials, is useful as a building block for semiconductor integrated circuits. Therefore, it is shown that the approach described herein can contribute to the physical implementation of physical reservoirs in practical devices that require compatibility together with high computational performance and high-density integration.
The novelty of the redox-IGR with respect to the electric double layer (EDL)-IGR reported in our previous work lies in its operating mechanism. While the EDL-IGR utilizes the electric response of EDL transistors on the basis of EDL charging/discharging at the channel/electrolyte interface,[21] the subject redox-IGR utilizes the electric response of redox transistors on the basis of ion insertion/desertion into the channel (redox). This feature of the redox-IGR realizes double reservoir states in the drain and gate nonlinear responses, as discussed later.
Results and Discussion
General concept of the redox-IGR
In order to perform the efficient pattern recognition required by neuromorphic computing, including common deep learning and physical reservoir computing, time-series data are processed as input through their mapping to high dimensional feature space.[1,2] An example of a general neuromorphic computer mapping scheme is shown in Figure 1a. In physical reservoir computing, which is the main concern of the present study, such mapping can be performed by inputting time-series data to a “physical reservoir” so as to generate output, as shown schematically in Figure 1b. Utilizing inherent functions of the physical reservoir, time-series data are nonlinearly transformed and used to compute various pattern recognition tasks.[1–26] To achieve high performance with diverse outputs, physical reservoir computing requires nonlinearity, high dimensionality, and short-term memory. We examined the computing performance of a redox-based ion-gating transistor as a physical reservoir candidate, referred to in this study as a redox-IGR. Figure 1c is a schematic diagram of the redox-IGR operating on the redox of a LixWO3 channel. The nonlinear I–V characteristic of the redox-IGR, explained in detail later, is used to map input signals to high dimensional feature space. The subject redox-IGR was fabricated on a 0.15 mm-thick LICGC substrate by RF sputtering, which was used to deposit a LiCoO2 thin film (200 nm) with a large Li-ion storage capacity and a Pt thin film (50 nm) as a gate electrode, Pt thin films (50 nm) as drain and source electrodes, and a WO3 thin film (100 nm) as a channel. As pretreatment for the IGR prior to commencing reservoir computing operations, a constant voltage of 2.5 V was applied between the gate and source electrodes for an hour. This operation inserted Li ions into the WO3 channel to create a LixWO3 phase with a smooth Li+ insertion/desertion characteristic.
The ID (drain current)–VG (gate voltage) and IG (gate current)–VG characteristics of the redox-based IGR are shown in Figure 1d. The gate voltage is swept from 0.5 to 1.5 V and then back to 0.5 V at various sweep rates, ranging from 5 mV s−1 (slow) to 250 mV s−1 (fast). ID is normalized at the initial value for comparison to those measured under different sweeping rate conditions. The ID is modulated by the application of VG because the conducting electron is doped (or removed) in the LixWO3 by the redox reaction (Li+ insertion or desertion) (Equation (1)).[Image Omitted. See PDF]
Figure 1. a) General scheme for the mapping of input to high-dimensional feature space in neuromorphic computing. b) General scheme of physical reservoir computing. c) Schematic image of a LixWO3-based redox-IGR. d) Normalized drain current, and gate current measured during VG sweeping from 0.5 to 1.5 V. e) Gate voltage pulse stream, drain current response, and gate current response during operation of the redox-IGR. 40 reservoir states Xi (i = 1, …, 40) are obtained as shown in the panels to the right. f) General concept of a reservoir computing system with redox-IGRs. Wi denotes the read-out weight.
The ID-VG curves exhibited hysteresis at sweep rates ranging from 5 to 250 mV s−1 (a frequency range of 2.5 to 125 mHz). In the subject redox-IGR transistor, Li+ transport in the LixWO3 channel is much slower than in the electrolyte, so Li+ transport in the LixWO3 channel is the rate-limiting step of the overall Li+ transport. The main cause of the hysteresis observed in the subject transistor is therefore the delay of Li+ transport in the LixWO3 channel relative to the gate voltage sweep, which is the origin of the short-term memory of the redox-IGR. In addition to these short-term memory characteristics, as the sweep rate decreases, the modulation of the ID response to the applied gate voltage increases and the nonlinearity becomes stronger. Therefore, by changing the sweep rate of the gate voltage (or the frequency of the input signal applied to the gate), the nonlinearity in the redox-IGR can be modulated, which is an important characteristic for a physical reservoir to have.
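The correspondence between sweep rate and frequency can be checked with a short calculation. The symbols s (sweep rate) and f (frequency of one full triangular sweep) are introduced here for illustration only, and it is assumed that one cycle is the sweep from 0.5 to 1.5 V and back to 0.5 V.

```latex
\begin{align}
  f &= \frac{s}{2\,\Delta V_G}, \qquad \Delta V_G = 1.5\ \mathrm{V} - 0.5\ \mathrm{V} = 1.0\ \mathrm{V} \\
  f_{\mathrm{slow}} &= \frac{5\ \mathrm{mV\,s^{-1}}}{2.0\ \mathrm{V}} = 2.5\ \mathrm{mHz}
     \quad (\text{one cycle in } 400\ \mathrm{s}) \\
  f_{\mathrm{fast}} &= \frac{250\ \mathrm{mV\,s^{-1}}}{2.0\ \mathrm{V}} = 125\ \mathrm{mHz}
     \quad (\text{one cycle in } 8\ \mathrm{s})
\end{align}
```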
To perform time-series tasks using the transistor, ID response to gate voltage pulse is useful for mapping input signals to higher dimensional feature space. VG pulse streams can be used to deal with sequential time-series signals. The upper and middle panels of Figure 1e show an example of ID response with respect to VG pulse streams, which are input signals to the transistor. When one VG pulse (corresponding to one point in a time-series dataset) is input, 20 reservoir states Xi (i = 1, …, 20) can be obtained from the ID response by the virtual node method. Conventional IGRs use only the ID,[21] but the subject redox-IGR can use IG as a reservoir state. This is due to the significant gate current present, which is much larger than the one found in an electric double layer-IGR.[21] While very small IG in an electric double layer-IGR suffers from noise floor and thus makes it extremely difficult to obtain reliable reservoir states from IG, IG in the subject redox-IGR, which is comparable to ID, is in fact suitable for obtaining reliable reservoir states. Therefore, 20 additional reservoir states Xi (i = 21, …, 40) can be obtained from the IG response, as shown in the lower panel of Figure 1e. The doubled reservoir states can be utilized to perform reservoir computing with enhanced high dimensionality, as schematically shown in Figure 1f. Recently, mixed reservoir properties, with different dynamical characteristics, were theoretically predicted to show high-performance reservoir computing.[54] Due to their different characteristics, the doubled reservoir states obtained by ID and IG responses can derive such a mixed reservoir property effect and result in high computation performance. The details of computation performance with specific tasks will be discussed later.
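The extraction of 20 virtual nodes per pulse from each current response can be illustrated with a minimal sketch. The sampling scheme (evenly spaced samples within each pulse period) and the names `virtual_nodes`, `reservoir_states`, and `samples_per_pulse` are assumptions made for illustration; the actual sampling points used in this work may differ.

```python
def virtual_nodes(trace, samples_per_pulse, n_nodes=20):
    """Split a sampled current trace into pulse periods and take n_nodes
    evenly spaced samples from each period (assumed sampling scheme)."""
    if samples_per_pulse % n_nodes != 0:
        raise ValueError("samples_per_pulse must be a multiple of n_nodes")
    step = samples_per_pulse // n_nodes
    n_pulses = len(trace) // samples_per_pulse
    states = []
    for k in range(n_pulses):
        pulse = trace[k * samples_per_pulse:(k + 1) * samples_per_pulse]
        # Take the last sample of each of the n_nodes equal sub-intervals.
        states.append([pulse[(j + 1) * step - 1] for j in range(n_nodes)])
    return states


def reservoir_states(drain_trace, gate_trace, samples_per_pulse, n_nodes=20):
    """For each pulse k, concatenate the I_D-derived nodes (X1..X20) and
    the I_G-derived nodes (X21..X40) into one 40-element state vector."""
    x_d = virtual_nodes(drain_trace, samples_per_pulse, n_nodes)
    x_g = virtual_nodes(gate_trace, samples_per_pulse, n_nodes)
    return [d + g for d, g in zip(x_d, x_g)]


# Synthetic traces only; real traces would be the measured I_D and I_G.
drain = [float(v) for v in range(80)]
gate = [-v for v in drain]
states = reservoir_states(drain, gate, samples_per_pulse=40)
```

With two pulses of 40 samples each, `states` holds two vectors of 40 reservoir states, the first 20 from the drain trace and the last 20 from the gate trace.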
Previously, a SiOx (doped with Ag)-based diffusive memristor was applied to reservoir computing to identify handwritten digits from the Modified National Institute of Standards and Technology (MNIST) dataset, achieving an accuracy of 83%.[26] The quickly fading current in the memristor, which functions as a short-term memory for reservoir computing (RC), was obtained by resistance modulation of the memristor due to voltage-stimulated fast Ag diffusion and the resultant volatile redox reactions in the SiO2 film, which is typical diffusive memristor behavior.[55,56] In the subject redox-IGR, the similarly quick fading of IG is due to Li+ diffusion inside the electrolyte and Li+ transfer between the electrolyte and the LixWO3 ion–electron mixed conducting channel layer.
In conventional transistors, low gate current is always preferred so as to keep energy consumption low, especially in standby states. In the subject redox-IGR, no voltage or current application to the redox-IGR is required in the standby state; both ID and IG are used only during information processing. Therefore, the significant IG of the redox-IGR is unlikely to cause any serious issues.
Solving a Second-Order Nonlinear Dynamic Equation
Reservoir computing is advantageous for time-series data analysis due to the nonlinearity, short-term memory, and high dimensionality of the reservoir for input signals. Therefore, we evaluated the computational performance of the subject redox-IGR in time-series data analysis by solving a second-order nonlinear equation task. A process flow diagram for the second-order nonlinear equation task is shown in Figure 2a.[7,8] The target time-series for this task is generated by the second-order nonlinear dynamic equation shown in Equation (2).[Image Omitted. See PDF] where u(k) and k are a random input ranging from 0 to 0.5 and the discrete time, respectively. Equation (2) contains a second-order nonlinearity and a two-step forward term which, in order to solve the equation, are required to be expressed by the reservoir as linearly separable.[7]
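Generation of the target time series can be sketched as follows. Because Equation (2) is not reproduced in this text, the sketch assumes the form commonly used for this benchmark in the reservoir computing literature, y(k) = 0.4y(k−1) + 0.4y(k−1)y(k−2) + 0.6u(k)^3 + 0.1; the coefficients, the zero initial conditions, and the function name `second_order_target` are assumptions, not values confirmed by this article.

```python
import random


def second_order_target(u):
    """Target series under the ASSUMED form
    y(k) = 0.4*y(k-1) + 0.4*y(k-1)*y(k-2) + 0.6*u(k)**3 + 0.1."""
    y = [0.0, 0.0]  # assumed zero initial conditions
    for k in range(2, len(u)):
        y.append(0.4 * y[k - 1] + 0.4 * y[k - 1] * y[k - 2]
                 + 0.6 * u[k] ** 3 + 0.1)
    return y


# Random input u(k) in the range 0 to 0.5, as stated in the text.
rng = random.Random(0)
u = [rng.uniform(0.0, 0.5) for _ in range(500)]
y = second_order_target(u)
```

Under these assumptions the series stays bounded, since for inputs up to 0.5 the recursion settles below its fixed point of roughly 0.4.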
Figure 2. a) Process flow diagram of a second-order nonlinear equation task, showing the target (blue line) and predicted (orange line) waveforms at b) T = 2 s and c) T = 40 s. d) Performance comparison with other physical reservoirs. e) Relationship between prediction error and pulse period, under conditions with only ID (red line) and ID + IG (purple line). The black line also shows utilization result of 20 nodes consisting of 10 nodes each from ID and IG. The pulse period is defined as shown in the inset of (e).
A random input u(k) was linearly converted to a voltage pulse stream, with a pulse period of T (2–100 s) and a duty rate of 50%, which was input to the subject redox-IGR transistor under a constant VD of 0.1 V.
The pulse intensity VG(k) ranged from 0.5 to 1.5 V (VG(k) = 2u(k) + 0.5 V) and a constant VG of 1 V was applied during the pulse interval. The gate voltage pulse stream, drain current response, and gate current response are shown in Figure 1e. As already discussed, a total of 40 reservoir states Xi were obtained from both the ID (i = 1, …, 20) and the IG (i = 21,…,40) responses by the virtual node method.
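The conversion of the random input into a gate voltage pulse stream can be sketched as follows. The mapping VG(k) = 2u(k) + 0.5 V, the 50% duty rate, and the 1 V level during the pulse interval are taken from the text; the sampling interval `dt_s`, the ordering (pulse first, interval second), and the function names are assumptions made for illustration.

```python
def gate_voltage(u_k):
    """Linear map from input u(k) in [0, 0.5] to pulse height in volts."""
    return 2.0 * u_k + 0.5


def pulse_stream(u, period_s, dt_s, baseline_v=1.0, duty=0.5):
    """Build a sampled V_G waveform: each input value becomes one pulse of
    height gate_voltage(u_k), followed by the baseline level for the rest
    of the pulse period (assumed ordering)."""
    samples = round(period_s / dt_s)
    on = round(samples * duty)
    wave = []
    for u_k in u:
        wave.extend([gate_voltage(u_k)] * on)
        wave.extend([baseline_v] * (samples - on))
    return wave


# Two input points, T = 2 s, sampled every 0.5 s (hypothetical sampling).
wave = pulse_stream([0.0, 0.5], period_s=2.0, dt_s=0.5)
```

For the two input values 0 and 0.5, the pulse heights are 0.5 V and 1.5 V, the limits of the range quoted above.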
By combining physical and virtual nodes with different characteristics, the unique electrical behavior of the subject redox-IGR, caused by redox reactions (electronic current through channel and ion currents through electrolyte, which are associated with Li+ transport), can be extracted as reservoir states that are valid for reservoir computing and can further be mapped to a high-dimensional feature space.[21] As a result, the reservoir output y(k) is obtained by the following equation[Image Omitted. See PDF]where N, wi, and b are the size of the reservoir (= 40), read-out weights, and bias, respectively (see Experimental Section for additional details on the learning algorithm).
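The linear read-out described above, y(k) = Σ wi Xi(k) + b, can be illustrated with a minimal training sketch. Ridge regression is assumed here as the learning algorithm because it is a common choice for reservoir read-outs; the algorithm actually used is described in the Experimental Section and may differ. All function names and the `ridge` parameter are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def train_readout(states, targets, ridge=1e-8):
    """Fit weights w_i and bias b by regularized least squares
    (assumed learning rule, not confirmed by the article)."""
    rows = [list(x) + [1.0] for x in states]  # constant column for the bias
    n = len(rows[0])
    gram = [[sum(r[i] * r[j] for r in rows) for j in range(n)]
            for i in range(n)]
    for i in range(n - 1):  # the bias term is not penalized
        gram[i][i] += ridge
    rhs = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(n)]
    sol = solve_linear(gram, rhs)
    return sol[:-1], sol[-1]


def readout(x, weights, bias):
    """Reservoir output y(k) = sum_i w_i * X_i(k) + b."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


# Toy check with two nodes and a known linear target.
toy_states = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
toy_targets = [2.0 * a - 3.0 * b + 0.5 for a, b in toy_states]
w, b = train_readout(toy_states, toy_targets, ridge=1e-12)
```

In the test phase the weights stay fixed and only `readout` is evaluated on states produced by new inputs.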
In the test phase, in order to evaluate the generalization performance of the subject redox-IGR, we checked whether the reservoir output (Equation (3)) with fixed wi matched the learned equation (Equation (2)) for inputs different from those in the training phase. The prediction error defined below was used to evaluate the computational performance of the subject redox-IGR in this task[Image Omitted. See PDF] where L is the data length.
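A prediction error of this kind can be computed as in the following sketch. Because Equation (4) is not reproduced in this text, the sketch assumes the normalization often used for this task, namely the summed squared deviation divided by the summed squared target; this form and the function name `prediction_error` are assumptions.

```python
def prediction_error(target, predicted):
    """ASSUMED form: sum_k (target(k) - predicted(k))**2 / sum_k target(k)**2,
    with both sums running over the data length L."""
    num = sum((t - p) ** 2 for t, p in zip(target, predicted))
    den = sum(t ** 2 for t in target)
    return num / den


# Toy example: one of two points is off by 1.
err = prediction_error([1.0, 2.0], [0.0, 2.0])
```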
Figure 2b,c shows the target and predicted waveforms when the subject redox-IGR transistor was operated at pulse periods T of 2 and 40 s in the test phase. It is particularly noteworthy that the target and predicted waveforms are in excellent agreement for T = 40 s, as shown in Figure 2c. That is, Equation (2) was successfully solved by the subject redox-IGR, with a prediction error of 5.39 × 10−4, which is lower than those of other physical reservoirs reported to date (1.31–3.13 × 10−3),[7,8] as shown in Figure 2d. This suggests that the IGR system enables high-performance reservoir characteristics. On the other hand, the prediction error worsened when the subject IGR was operated with T of 2 s, as shown in Figure 2b. This is because the relaxation process in the subject redox-IGR is correlated with the sweep rate of the gate input, as detailed in Figure 1d. To evaluate the correlation between the operating conditions of the subject redox-IGR and its computational performance as a reservoir, we investigated the relationship between T and the prediction error, as shown in Figure 2e. The red line in the figure shows the results when only the ID is used for the reservoir states (i.e., Xi, i = 1, …, 20), and the purple line shows the results when both the IG and the ID are used for the reservoir states (i.e., Xi, i = 1, …, 40). The prediction performance is best at T = 40 s, regardless of the presence or absence of a gate current. This is because, when the pulse period is short, the redox reaction shown in Equation (1) proceeds with too great a delay, and the ion current associated with ion transport in the electrolyte dominates. Therefore, the short-term memory characteristics and nonlinearity due to the resistance modulation of LixWO3 are lost, and the only current response obtained is a simple one similar to the relaxation process of an RC parallel circuit.
In addition, if the pulse period is too long, the interaction between the virtual nodes is suppressed, which results in poor computational performance.[10] In order to evaluate the effect of the double reservoir states alone, the results of solving the task with 20 nodes consisting of ID and IG (i.e., Xi, i = 1, 3, …, 37, 39) are shown by the black line in Figure 2e. The prediction performance at each pulse period was higher than with a single reservoir state using ID or IG, and effective nodes were successfully obtained by adding IG.
The utilization of IG in addition to ID not only lowers the prediction error, but also moderates the dependence of the computational performance on the input conditions. This is because, in addition to the increased expressive power due to the increased reservoir size, the ID-derived X and IG-derived X provide complementary features that are necessary for task execution. This is an important feature of the subject redox-IGR, which utilizes different physical nodes as computational resources. The resulting good computational performance, regardless of slight changes in operating conditions, is also a significant practical advantage for reservoir computing implementation.
The high performance shown in Figure 2 is attributed to modulable nonlinearity in both ID and IG, as discussed below. In the VG sweeping measurement shown in Figure 1d, ID is modified by electronic carrier density change due to Li+ insertion into/desertion from the LixWO3 layer. As the Li+ insertion/desertion process is driven by a gradient of the electrochemical potential of Li+ (composed of the chemical potential of Li+ and the local electrostatic potential) in the LixWO3 layer, the local flux of Li+ through the electrolyte/LixWO3 interface is strongly influenced by the electrolyte and by both the Li+ density (the chemical potential of Li+) profile and the electrical potential profile in the LixWO3 layer. Under Li+ flux, the Li+ density profile variation follows VG sweeping with notable delay due to the relatively slow Li+ diffusion kinetics in the LixWO3 layer, so the extent of the delay depends on the VG sweep rate. Furthermore, local Li+ density variation is accompanied by local electron density variation due to charge compensation with Li+ [shown in Equation (1)], making the electrical potential profile vary with delay. These lead to modulable nonlinearity of ID. On the other hand, the total Li+ flux (corresponding to IG) through the electrolyte/LixWO3 interface also follows the VG sweep with delay, which is influenced by the potential profiles discussed above. Therefore, both IG and ID show modulable nonlinearity. These modulable nonlinearities hold under VG pulse stream applied conditions with different pulse periods (T). Furthermore, ID and IG are sensitive to the VG input history because the history is stored in unique Li+ density profiles in the LixWO3 layer. We believe that these are the origin of the observed high performance of the subject redox-IGR.
Note that in addition to the ID and IG responses of the subject redox-IGR, it is entirely possible for other nonlinear data to be used to obtain the waveforms in the RC tasks. However, depending on the dynamical characteristics of the device (the source of the nonlinear data) used as the reservoir, performance in physical RC may change significantly. The three requirements for physical reservoirs—nonlinearity, short-term memory, and high dimensionality—are of great importance for the achievement of high-performance RC.
In comparing the subject redox-IGR and the EDL-IGR,[21] there is a notable difference in their respective prediction errors, which may be due to the inherent properties of the channel material used for the subject redox-IGR. The LixWO3 used in the subject redox-IGR is a well-known material used in electrochromic windows, which require their redox ON and OFF states to be highly repeatable. Because of this, LixWO3 was expected to be a suitable channel material in the subject redox-IGR. However, the occurrence of irreversible Li+ trapping during repetition of ON and OFF states has recently been pointed out.[46,57,58] Although such irreversibility does not appear to be significant in our redox-IGR, we suspect that Li+ trapping can cause some loss of the echo state property, which is required for high-performance computing, and may thus lead to increased prediction errors. It is therefore expected that errors in the subject redox-IGR can be reduced by using alternative ion–electron mixed conductors in which such ion trapping does not occur.
Evaluation of Prediction Performance for a NARMA2 Task
To further evaluate the performance of time-series prediction using the subject redox-IGR, we performed a NARMA2 task, which is more difficult than the second-order nonlinear equation task performed in the previous section and is a typical benchmark task for both full-simulation reservoir computing and physical reservoir computing.[4–6,22–25] The time-series prediction generated by the NARMA2 model, with specific parameters defined by Equation (5), is a popular benchmark for the development of physical reservoirs.[4–6,22,23][Image Omitted. See PDF]
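Generation of a NARMA2 target can be sketched as follows. Because Equation (5) is not reproduced in this text, the sketch assumes the parameters most often quoted for NARMA2 in the literature, y(k+1) = 0.4y(k) + 0.4y(k)y(k−1) + 0.6u(k)^3 + 0.1; these coefficients, the zero initial conditions, and the function name `narma2` are assumptions, not values confirmed by this article.

```python
def narma2(u):
    """NARMA2 target under the ASSUMED parameters
    y(k+1) = 0.4*y(k) + 0.4*y(k)*y(k-1) + 0.6*u(k)**3 + 0.1."""
    y = [0.0, 0.0]  # assumed zero initial conditions for y(0), y(1)
    for k in range(1, len(u) - 1):
        # The value appended here is y(k+1); it depends on the input u(k)
        # from one step earlier, unlike a target driven by the current input.
        y.append(0.4 * y[k] + 0.4 * y[k] * y[k - 1] + 0.6 * u[k] ** 3 + 0.1)
    return y


# A single nonzero input at k = 1 first shows up in the output at k = 2.
impulse = narma2([0.0, 0.5, 0.0, 0.0])
```

Under these assumptions the reservoir must retain the previous input in order to reproduce y(k+1), which is one way to see why the task places a demand on short-term memory.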
Figure 3a,b shows the results of waveform prediction utilizing ID + IG (40 nodes) at T = 2 s and 40 s, respectively. The error value, a normalized mean square error (NMSE), between the target waveform (blue line) and the predicted waveform (orange line) is defined by Equation (6)[Image Omitted. See PDF] where L is the data length. The computational performance of the subject redox-IGR in performing the NARMA2 task was enhanced by the higher dimensionality obtained by adding IG. As shown in the comparison of the two conditions (T = 2 and 40 s) using ID + IG (40 nodes) in Figure 3a,b, the minimum value of NMSE is 0.163 at T = 40 s, whereas the maximum NMSE is 0.321 at T = 2 s. These two waveforms were chosen as examples to show how the predicted waveforms, with relatively high and low NMSE, differ from each other. As can be seen from a comparison of the two, the deviation of the predicted waveform from the target waveform is less significant in Figure 3b, which supports the higher prediction accuracy at T = 40 s than at T = 2 s. Figure 3c shows the relationship between NMSE and pulse period for each reservoir state. When using only ID as the reservoir, as shown by the red line in Figure 3c, the best NMSE was 0.212 (T = 40 s). As in the case of the second-order nonlinear equation task in Figure 2, the positive effects from the utilization of the double reservoir states alone were evaluated by excluding the collateral effect of increasing the number of nodes to 40. The result for a double reservoir state consisting of 20 nodes (ID + IG) is indicated by the black line in Figure 3c. The relationship between NMSE and pulse period under all conditions was similar to that for the second-order nonlinear equation task in Figure 2.
The general tendency, in which utilizing IG in addition to ID gives better performance over the whole pulse period range, is quite similar to the case for the second-order nonlinear equation task, which gives support to our assumption that the present approach is sufficiently versatile to achieve an information processing ability in the subject redox-IGR.
Figure 3. The target (blue line) and predicted (orange line) waveforms using ID + IG (40 nodes) at a) T = 2 s and b) T = 40 s for the NARMA2 task. c) Relationship between NMSE and pulse period. The red, black, and purple lines represent 20 nodes (ID), 20 nodes (ID + IG), and 40 nodes (ID + IG), respectively. d) Forgetting curves and e) memory capacity of the subject IGR for each condition at T = 40 s.
In order to further investigate the underlying mechanism of the enhancement effect elicited by the addition of IG to the reservoir states, we performed a short-term memory task, which measures the ability of our redox-IGR to reconstruct past time-series data input to the redox-IGR. Here, as in the time-series analysis tasks described in Figures 2 and 3, a voltage-transformed random input u(k) is applied to the subject redox-IGR, and the input u(k−τ) preceding the present input by the delay time τ is reconstructed by a linear combination of reservoir states and weights obtained from the current response of the subject redox-IGR (Equation (3)). The agreement between the target waveform u(k−τ) and the waveform y(k) reconstructed by the reservoir was evaluated using the following coefficient of determination r²[Image Omitted. See PDF] where Cov() and Var() are the covariance and variance, respectively. Figure 3d shows the forgetting curves (determination coefficient vs delay) of the subject redox-IGR using ID (20 nodes), ID + IG (20 nodes), and ID + IG (40 nodes) at T = 40 s.[5] The determination coefficient (i.e., the ability for reconstruction) decreases as the delay increases, which is a universal feature of short-term memory. Memory capacities (MC) for the three conditions are calculated to be 2.35 for ID, 2.80 for ID + IG (20 nodes), and 3.57 for ID + IG (40 nodes), respectively, by integration of the curves in Figure 3d as follows[Image Omitted. See PDF]
As shown in Figure 3e, a comparison of the MCs under the three conditions reveals that the double reservoir states combining IG and ID enhance both the high dimensionality and the MC of the reservoir.
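The forgetting curve and memory capacity can be computed as in the following sketch. It assumes the standard definitions r²(τ) = Cov²(u(k−τ), y(k)) / (Var(u(k−τ)) Var(y(k))) and MC = Σ r²(τ) over the evaluated delays, which are consistent with the description above but are not reproduced from the omitted equations; the function names are hypothetical.

```python
def r_squared(target, output):
    """Squared correlation between the delayed input u(k - tau) and the
    reconstructed output y(k) (assumed standard definition)."""
    n = len(target)
    mean_t = sum(target) / n
    mean_o = sum(output) / n
    cov = sum((t - mean_t) * (o - mean_o)
              for t, o in zip(target, output)) / n
    var_t = sum((t - mean_t) ** 2 for t in target) / n
    var_o = sum((o - mean_o) ** 2 for o in output) / n
    return cov ** 2 / (var_t * var_o)


def memory_capacity(r2_by_delay):
    """Sum of the determination coefficients over all evaluated delays,
    i.e. the area under the forgetting curve on a unit delay grid."""
    return sum(r2_by_delay)


# Toy forgetting curve: reconstruction quality falls as the delay grows.
mc = memory_capacity([0.9, 0.5, 0.1])
```

In practice one read-out would be trained per delay τ, and `r_squared` would be evaluated on test data for each of them before summing.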
Evaluation of Output Versatility with Correlation Coefficient Between Each Node
Increasing the number of virtual nodes with sufficient diversity can enhance high dimensionality, leading to high-performance reservoir computing. The apparent difference between the ID and IG characteristics observed in Figure 1e and the significant performance improvement accompanying the addition of IG [observed in Figures 2 and 3] indicate that adding IG increases the number of diverse nodes, which are not strongly correlated to existing ID nodes. However, this mechanism is not evidenced in the above discussion. In order to clarify the enhancement mechanism for IG addition in relation to the diversity of nodes, the correlation between each pair of nodes under given conditions (e.g., use of ID only or ID + IG, and the pulse period T) was quantified by the Pearson correlation coefficient r, using the following equation[Image Omitted. See PDF] where Xi(k), k, and L are the reservoir state of node i, the discrete time, and the data length, respectively. Figure 4a shows the reservoir state waveforms for X5 (black line) and X7 (red line) obtained from the ID response. The X5 and X7 waveforms were similar to each other, and a high correlation was confirmed with a calculated r of 0.95, as shown in Figure 4b. Although r can express positive correlations (r > 0) and negative correlations (r < 0) between each node by taking values in the range of −1 to 1, 1−|r| (0 ≤ 1−|r| ≤ 1) is more useful for evaluating the extent of correlation between each node, regardless of the sign. 1−|r| for the X5 and X7 waveforms was calculated to be 0.05, indicating high correlation and poor versatility. On the other hand, the waveforms for X5 (black line) from ID and X35 (blue line) from IG appear completely different, as shown in Figure 4c, and the calculated 1−|r| of 0.92 (r = 0.08) indicated almost no correlation, as shown in Figure 4d. From these comparisons, it was confirmed that 1−|r| can be a useful index for evaluating the versatility of virtual nodes.
1−|r| can thus be understood as an uncorrelated coefficient for a specific combination of two nodes. Figure 4e shows a heatmap representing 1−|r| between each node (X1 to X40 on the vertical axis vs X1 to X40 on the horizontal axis) measured at T = 40 s. The heatmap is symmetric with respect to the diagonal because a pixel for a specific combination (Xi, Xj) is equivalent to the one for the corresponding combination (Xj, Xi). If 1−|r| for a specific combination (e.g., X5 vs X35, indicated by a green circle) is close to 1, the color of the corresponding pixel becomes dark, indicating low correlation and high versatility, and vice versa. The regions surrounded by red, blue, and purple squares represent ID (X1 to X20) versus ID (X1 to X20), IG (X21 to X40) versus IG (X21 to X40), and ID (X1 to X20) versus IG (X21 to X40) correlations, respectively. In each region, there is a notable distribution of 1−|r|. Figure 4f shows several 1−|r| heatmaps measured under various operating conditions from T = 2 to 100 s. When T is increased to 10 s, 1−|r| in a part of the ID versus ID region (X1 to X10 vs X11 to X19) becomes much higher (darker) than at T = 2 and 4 s. In addition, 1−|r| in a part of the ID versus IG region (X1 to X20 vs X21 to X40) also becomes slightly higher. More significantly, at T = 20 s or above, the domain of high 1−|r| shows very high values and is much broadened in both the ID versus ID and ID versus IG regions, meaning that the ID and IG nodes become more uncorrelated, which enhances high dimensionality. As 1−|r| between each node can express the effectiveness of the nodes, the sum of 1−|r|, described by Σ(1−|r|), can be an index for comparing versatility and high dimensionality in the overall outputs. We compare the sum of 1−|r| and MC so as to analyze the relationship between versatility (high dimensionality) and MC. Figure 4g shows MC plotted against the sum of 1−|r| under the ID only, ID + IG (20 nodes), and ID + IG (40 nodes) conditions.
A positive correlation is clearly found between MC and the sum of 1−|r|, regardless of conditions, meaning that the high versatility (high dimensionality) caused by IG addition contributes to the strengthening of MC. We further investigated the relationship between MC and computing performance for the second-order nonlinear dynamic equation and NARMA2 tasks, as shown in Figure 4h,i. For both tasks, computation performance improves as MC increases from below 2.0 to about 4.0. These results provide evidence that IG addition enhances the high dimensionality (versatility) of the output, leading to high computation performance accompanied by an increase in MC. This is consistent with the fact that computing performance in both the second-order nonlinear dynamic equation task and the NARMA2 task is higher at T = 40 s than at T = 2 s.
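The uncorrelated coefficient 1−|r| and its sum over node pairs can be computed as in the following sketch. The Pearson coefficient is the standard one; whether the sum in this work runs over unique pairs (i < j), as assumed here, or over the full matrix is not stated in the text, and the function names are hypothetical.

```python
from math import sqrt


def pearson_r(x, y):
    """Standard Pearson correlation coefficient between two node waveforms."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def uncorrelation_matrix(states):
    """states[k][i] is X_i at discrete time k; returns the matrix of
    1 - |r| between every pair of nodes (the quantity shown as a heatmap)."""
    n_nodes = len(states[0])
    cols = [[row[i] for row in states] for i in range(n_nodes)]
    return [[1.0 - abs(pearson_r(cols[i], cols[j])) for j in range(n_nodes)]
            for i in range(n_nodes)]


def uncorrelation_sum(matrix):
    """Sum of 1 - |r| over unique node pairs i < j (assumed convention)."""
    n = len(matrix)
    return sum(matrix[i][j] for i in range(n) for j in range(i + 1, n))


# Toy data: nodes 0 and 1 are perfectly correlated, node 2 is uncorrelated.
toy = [[1.0, 2.0, 1.0], [2.0, 4.0, -1.0], [3.0, 6.0, -1.0], [4.0, 8.0, 1.0]]
m = uncorrelation_matrix(toy)
total = uncorrelation_sum(m)
```

In the toy data the correlated pair contributes almost nothing to the sum, while each pair involving the uncorrelated node contributes close to 1, which mirrors how diverse IG nodes raise the index.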
Figure 4. a) X5 for ID (black line) and X7 for ID (red line) waveforms at T = 40 s. b) The scatter plot between X5 and X7 with high correlation (r = 0.95). c) X5 for ID (black line) and X35 for IG (blue line) waveforms. d) The scatter plot between X5 and X35 without correlation (r = 0.08). e) The heatmap of 1−|r| for 40 nodes (X1, X2, …, X40) at T = 40 s. f) The heatmaps of 1−|r| for 40 nodes (X1, X2, …, X40) measured under all T conditions. g) The relationship between memory capacity and sum of 1−|r|. Memory capacity dependence of h) prediction error for second-order nonlinear dynamic equation task and i) NMSE for a NARMA2 task. Each orange fitting curve is inserted for easier understanding of the characteristic.
In order to investigate the repeatability and stability of our redox-IGR, we compared the ID and IG responses of two devices with the same device dimensions (devices A and B) under identical VG pulse train conditions. The left-hand side of Figure 5 shows the ID and IG responses of devices A and B in the beginning part of the input VG pulse train. The two devices were found to give very similar ID and IG responses, with repeated spiking and relaxation. Moreover, during the last part of the VG pulse train, as shown on the right-hand side of Figure 5, devices A and B continued to show similar ID and IG responses. From these responses, we were able to obtain 40 stable reservoir states, as in the cases shown in Figures 2 and 3. This result indicates that the subject redox-IGR has sufficient repeatability and stability to perform reservoir computing.
Figure 5. ID and IG responses of devices A and B, which have the same dimensions, under identical input VG pulse trains. The left-hand (right-hand) side shows the result for the beginning (last) part of the input VG pulse train.
Physical reservoir computing with a redox-IGR composed of a LixWO3 thin film and LICGC has been demonstrated. The redox-IGR successfully solved a second-order nonlinear dynamic equation, with a lowest prediction error of 8.15 × 10⁻⁴ under the normal condition in which only ID is used for the reservoir states. Performance was enhanced by adding IG to the reservoir states, which lowered the prediction error to 5.39 × 10⁻⁴, noticeably lower than that of other types of physical reservoirs reported to date. NARMA2, a typical reservoir computing benchmark, was also performed with the redox-IGR. Better performance, with an NMSE of 0.163, was achieved by adding IG to the reservoir states, which shows that IG is a useful source for obtaining better reservoir properties. A short-term memory task was performed to investigate the enhancement mechanism resulting from the addition of IG. The forgetting curves of the redox-IGR show that MC was enhanced from 2.35 with ID alone to 3.57 with ID + IG. The enhancement of both dimensionality and MC resulting from the addition of IG to the reservoir states is considered to be the origin of the performance improvement.
Physical reservoir computing is, from a certain viewpoint, an attempt to utilize as many inherent properties of a material or device as possible to achieve efficient information processing. For a transistor, IG is usually regarded as being of no use; nevertheless, it contains internal and temporal information about the transistor. The present technique is useful for harnessing such internal information in a device to realize the efficient mapping of input to a higher-dimensional feature space.[59–64] This approach can be applied to a wide range of multicomponent physical reservoir systems.
Experimental Section
Fabrication of a LixWO3-Based Redox Transistor
The LixWO3-based redox transistor, schematically shown in Figure 1c, was fabricated on a 0.15 mm-thick LICGC substrate. First, the drain and source electrodes, made of 50 nm-thick Pt films, were deposited by the RF sputtering method at room temperature. Next, a 100 nm-thick WO3 thin film was deposited by the RF sputtering method, using a 99.9% pure, sintered stoichiometric WO3 target, with a supply of pure Ar and O2 gases at fixed flow rates of 10 and 0.6 sccm, respectively. A 200 nm-thick LiCoO2 thin film was then deposited as a gate electrode on the opposite side of the LICGC substrate, against the WO3 side, with a supply of pure Ar and O2 gases at fixed flow rates of 9 and 3 sccm, respectively. Finally, the current collector on the gate electrode, made of 50 nm-thick Pt film, was deposited by the RF sputtering method at room temperature. Prior to measurements being made, a constant voltage of 2.5 V was applied between the gate and source electrodes so as to insert Li ions into the WO3 channel.
Measurement of ID and IG Responses
All electrical measurements of the redox-IGR were carried out at room temperature in a vacuum chamber using the source measure unit (SMU) of a semiconductor parameter analyzer (4200A-SCS, Keithley). A random input u(k) was linearly converted to a voltage pulse stream with a pulse period T (2–100 s) and a duty ratio of 50%, which was input to the redox-IGR under a constant VD of 0.1 V. The pulse intensity VG(k) ranged from 0.5 to 1.5 V (VG(k) = 2u(k) + 0.5 V), and a constant VG of 1 V was applied during pulse intervals. The ID and IG responses of the redox-IGR were monitored, and 20 virtual nodes were extracted from each response. Thus, 40 reservoir states were obtained from input u(k) by the redox-IGR. These reservoir states were normalized from 0 to 1 for calculation, as shown in Equation (3).
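The preprocessing described above can be sketched as follows. The sketch assumes that each measured response is a one-dimensional array containing the same number of samples per pulse period, that the 20 virtual nodes are taken at equally spaced sampling points within each period, and that each node is min-max normalized independently. The sampling positions, function names, and array layout are illustrative assumptions; only the relation VG(k) = 2u(k) + 0.5 V, the 20 nodes per response, and the 0-to-1 normalization come from the text.

```python
import numpy as np


def input_to_gate_voltage(u):
    """Linear conversion of the input u(k) to pulse intensity VG(k) in volts."""
    return 2.0 * np.asarray(u, dtype=float) + 0.5


def extract_virtual_nodes(response, n_steps, n_virtual=20):
    """Pick n_virtual equally spaced samples from each pulse period.

    response: 1-D current waveform covering n_steps pulse periods.
    Returns an array of shape (n_virtual, n_steps).
    Assumption: equally spaced sampling points within each period.
    """
    response = np.asarray(response, dtype=float)
    per_period = response.size // n_steps
    frames = response[: per_period * n_steps].reshape(n_steps, per_period)
    idx = np.linspace(0, per_period - 1, n_virtual).round().astype(int)
    return frames[:, idx].T


def normalize_states(states):
    """Min-max normalize each node (row) to the range 0..1."""
    lo = states.min(axis=1, keepdims=True)
    hi = states.max(axis=1, keepdims=True)
    span = np.where(hi > lo, hi - lo, 1.0)  # avoid division by zero
    return (states - lo) / span


def build_reservoir_states(i_d, i_g, n_steps, n_virtual=20):
    """Stack virtual nodes from ID and IG into 2 * n_virtual reservoir states."""
    nodes = np.vstack([
        extract_virtual_nodes(i_d, n_steps, n_virtual),
        extract_virtual_nodes(i_g, n_steps, n_virtual),
    ])
    return normalize_states(nodes)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_steps, per_period = 50, 100
    # Synthetic placeholder waveforms, not measured data.
    i_d = rng.normal(size=n_steps * per_period)
    i_g = rng.normal(size=n_steps * per_period)
    print(build_reservoir_states(i_d, i_g, n_steps).shape)  # (40, 50)
```

Stacking the two responses before normalization keeps the ID-only case a simple row slice of the same array, which makes a comparison between ID and ID + IG straightforward.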
Ridge Regression for Time-Series Data Analysis Tasks
In the time-series data analysis tasks, namely the second-order nonlinear dynamic equation task and the NARMA2 task shown in Figures 2 and 3, the readout network of the redox-IGR was trained by ridge regression. Here, we describe the algorithm used for the ridge regression. The reservoir output shown in Equation (3) can also be defined as follows [Image Omitted. See PDF] where and are the weight vector and the reservoir state vector with a reservoir size of N, respectively. The cost function in ridge regression is defined as follows [Image Omitted. See PDF] where L, and are the data length in the training phase, the ridge parameter, and the target output generated by Equation (2) or (5), respectively. The data length and ridge parameter were and for the second-order nonlinear equation task. The NARMA2 task used β of . The trained weights that minimize the cost function are given by the following equation [Image Omitted. See PDF] where , , and are the target output vector, the reservoir state matrix, and the identity matrix, respectively.
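The readout training described above can be sketched with the standard closed-form ridge solution. The sketch assumes a squared-error cost with an L2 penalty weighted by the ridge parameter β, a reservoir state matrix X of shape (N, L), and a target vector y of length L, which gives W = (X Xᵀ + βI)⁻¹ X y. The symbol names, the absence of a bias term, and the exact scaling of the cost function are assumptions, since they may differ from the definitions in the original equations.

```python
import numpy as np


def train_readout(X, y, beta):
    """Closed-form ridge regression for a linear readout.

    X: reservoir state matrix, shape (N, L) with N nodes and L time steps.
    y: target output vector, shape (L,).
    beta: ridge parameter (assumed to weight the squared norm of W).
    Returns the weight vector W of shape (N,).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n_nodes = X.shape[0]
    gram = X @ X.T + beta * np.eye(n_nodes)  # regularized Gram matrix
    # Solve the linear system rather than forming an explicit inverse.
    return np.linalg.solve(gram, X @ y)


def readout(W, X):
    """Reservoir output: weighted sum of the reservoir states at each step."""
    return np.asarray(W) @ np.asarray(X)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5, 200))             # synthetic states, not device data
    w_true = np.array([0.5, -1.0, 2.0, 0.0, 0.25])
    y = w_true @ X
    W = train_readout(X, y, beta=1e-8)
    print(np.round(W, 4))
    print(float(np.mean((readout(W, X) - y) ** 2)))
```

A larger β shrinks the weights and trades training accuracy for robustness to noise in the reservoir states, which is why the ridge parameter is typically chosen per task.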
Acknowledgements
T.W. and D.N. contributed equally to this work and are treated as co-first authors. This work was supported in part by Japan Society for the Promotion of Science (JSPS) KAKENHI Grant Numbers JP22H04625 (Grant-in-Aid for Scientific Research on Innovative Areas “Interface Ionics”) and JP21J21982 (Grant-in-Aid for JSPS Fellows). A part of this work was supported by the Yazaki Memorial Foundation for Science and Technology.
Conflict of Interest
The authors declare no conflict of interest.
Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.
© 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Herein, physical reservoir computing with a redox-based ion-gating reservoir (redox-IGR) comprising a LixWO3 thin film and lithium-ion conducting glass ceramic (LICGC) is demonstrated. The redox-IGR successfully solves a second-order nonlinear dynamic equation by utilizing voltage-pulse-driven ion gating in a LixWO3 channel to enable reservoir computing. Under the normal condition, in which only the drain current (ID) is used for the reservoir states, the lowest prediction error is 8.15 × 10⁻⁴. Performance is enhanced by adding the gate current (IG) to the reservoir states, which lowers the prediction error to 5.39 × 10⁻⁴, noticeably lower than that of other types of physical reservoirs (memristors and spin torque oscillators) reported to date. A second-order nonlinear autoregressive moving average (NARMA2) task, a typical benchmark of reservoir computing, is also performed with the IGR, and good performance is achieved, with a normalized mean square error (NMSE) of 0.163. A short-term memory task is performed to investigate the enhancement mechanism resulting from the addition of IG. An increase in memory capacity, from 2.35 without IG to 3.57 with IG, is observed in the forgetting curves, indicating that the enhancement of both dimensionality and memory capacity is the origin of the performance improvement.
1 Research Center for Materials Nanoarchitectonics (MANA), National Institute for Materials Science (NIMS), Tsukuba, Ibaraki, Japan; Department of Applied Physics, Faculty of Science, Tokyo University of Science, Katsushika, Tokyo, Japan
2 Research Center for Materials Nanoarchitectonics (MANA), National Institute for Materials Science (NIMS), Tsukuba, Ibaraki, Japan
3 Department of Applied Physics, Faculty of Science, Tokyo University of Science, Katsushika, Tokyo, Japan