New types of computing hardware for accelerating artificial intelligence (AI) algorithms have recently been enabled by materials and devices that mimic the functions of neural synapses found in biological nervous systems.[1–9] These advancements are promising due to demonstrated improvements in energy efficiency and training time, factors that significantly constrain AI performance. While the number of unique synaptic behaviors in the nervous system is large, only a few have been demonstrated in emerging neuromorphic devices. Neuromorphic hardware including memristors and synaptic transistors typically leverages simple forms of long-term potentiation[10] or short-term synaptic plasticity (STP).[11,12] Fewer devices exhibit complex forms of STP, such as sensory adaptation (SA)[13–22] or multiple concurring STP timescales.[23–26]
SA is defined as a rise in neuronal spiking frequency of a postsynaptic neuron followed by a decrease in its spiking frequency when continually stimulated at a constant level.[13,14,21,23,27–31] SA is crucial for the operation of efficient neural sensory systems, as it provides “device level” filtering of nonnovel stimuli. Therefore, SA is one of the mechanisms allowing system-wide habituation.[32,33] Without this adaptive feature, brains would be constantly overwhelmed by persistent and irrelevant information (e.g., the feeling of clothes on a wearer's skin, or the smell of one's home). SA is hypothesized to stem from a variety of causes including changes to individual ion channels and synaptic junctions.[13,23,27,34] Identifying and leveraging this form of autonomous adaptation in neuromorphic synaptic devices is therefore an opportunity to realize artificial sensory systems that more closely mimic the history-dependent signal processing in biological nervous systems.
We previously demonstrated that lipid membranes doped with monazomycin (Mz) ion channels are volatile memristors (Figure 1A,B) that, unlike many other artificial synapses, exhibit amplitude-dependent forms of STP, including sensory adaptation and STP across multiple concurrent timescales[25,35] (Figure 1C,D, and S1, Supporting Information). These attributes are hallmarks of adaptive sensory processing in biological nervous systems.[13,23,28,34,36,37] An example of the previously-characterized[35] amplitude-dependent STP behavior in MzBS is shown in Figure 1C, where i represents the ion current density through Mz channels in the membrane. At low voltages (e.g., 80 mV, blue trace), i increases (facilitates) monotonically as the transmembrane voltage stably drives more Mz into the bilayer to form new ion channels. However, i peaks and then decays at higher applied voltages (e.g., 110 mV, orange trace) due to voltage-driven translocation of Mz which lowers the net number of conductive channels permitting ion flux[25,37] (Figure 1A). This mechanism of facilitation-then-depression via Mz depletion emulates SA observed in neurons of the olfactory, auditory, and visual processing circuits in the brain[13,14,38] (Figure 1C) and it leads to cumulative changes in device STP behavior across multiple stimulation events and longer timescales (Figure 1D).
Figure 1. System overview and characteristics. A) Bottom-view micrograph of the MzBS device platform, the droplet interface bilayer (DIB). Two aqueous droplets where amphiphilic phospholipids self-organize at the oil–water interface are brought into contact, resulting in a lipid bilayer between them (scale bar: 400 μm). A side-view micrograph of the system can be seen in Figure S2A, Supporting Information. Membrane-active monazomycin (Mz) species, represented by the red staves are incorporated to affect ion transport through the bilayer. At voltages exceeding 60 mV, Mz species insert into the lipid bilayer and oligomerize to form nanoscale conductive ion channels. At voltages exceeding 80 mV, Mz inserts, forms channels, and may also translocate to a nonconductive state on the opposite side of the membrane. This results in a decrease in current density (i) from a peak, as seen in (C) (orange trace), which is analogous to sensory adaptation (SA). B) Frequency dependent pinched i-V hysteresis of MzBS devices confirming volatile (i.e., analog) memristive switching. D) Signal transduction in the peripheral nervous system. At a constant stimulus level, some sensory neurons (right) show an increase in spiking frequency followed by a decrease when continually stimulated. These changes in presynaptic firing have a direct effect on the postsynaptic firing rate. In our previous work, we showed that MzBS devices mimic this sensory synapse and its ability to modulate postsynaptic firing.[35] At a constant voltage (80 and 110 mV, blue and orange curves, respectively), MzBS current density facilitates (green arrow) and then depresses (red arrow) (right). D) Mz-mediated conductance exhibits history-dependent adaptation across multiple time scales. Shown are the simulated evolutions in Mz current density in response to three consecutive 90 mV steps (an extended simulation can be found in Figure S2B,C). In the short term on the order of ms to s, rapid changes are measured, and sensory adaptation occurs. Over the course of minutes, longer term depression occurs, indicated by the decrease in peaks. This medium-term plasticity[25] disappears with a 20–30 min gap between pulses (see Figure S2D–F). E) RC networks built from independent volatile memristors (top) enable a direct substitution of MzBS devices (bottom) to study the effects of SA on RC performance.
These features make Mz-based biomolecular synapses (MzBS) unique candidates for memristor-based AI applications, such as reservoir computing (RC),[39–43] that operate in the time domain. RC relies upon the rich dynamics of physical or simulated nodes to nonlinearly process and separate temporal input signals.[44] The primary advantage of RC over traditional neural networks is that it drastically reduces computational overhead by cutting the number of weights requiring training by focusing on a linear output layer.[40] A variety of physical systems and devices, including two-terminal memristors and three-terminal transistor devices, have been used to implement RC schemes.[45–55] The benefit of constructing reservoirs from volatile memristors such as the MzBS allows their “fading memory” to function as intrinsic negative feedback. This property satisfies the requisite of recurrence for RC[56] and allows simpler network architectures, such as parallel arrangements of independent devices. Recent memristor-based networks exhibited facilitation-only STP in RC networks for classification and learning[41,47,49,57,58] and showed that defining virtual nodes (i.e., time multiplexing)[59] and mapping inputs to multiple sets of devices (i.e., device multiplexing)[43,59] to increase the number of reservoir outputs significantly improves task performance.
However, the effects of sensory adaptation on RC tasks have not been explored, despite the prevalence of SA in neurosensory processes that assist in pattern recognition, event detection, and learning.[29,60,61] Leveraging MzBs devices, which exhibit facilitation-only STP at low input levels and facilitation-then-depression at higher input levels,[35] we can uniquely modulate STP behavior (both in situ and in silico) to reveal the effects of SA on RC applications. Herein, we report on model-based device simulations and physical experiments of several RC tasks, including a novel aerospace application, to show that multiplexing of MzBS devices (Figure 1E) exhibiting amplitude-dependent STP and SA can outperform RC network frameworks that utilize memristive devices with facilitation-only forms of STP.[42,47,49,57,58,62,63] We hypothesize this performance increase arises from the ability of SA to spawn different trajectories for small differences in voltage and initial device state, thereby operating closer to the “edge of chaos” regime observed in biological neural networks.[30,64]
Results and Discussion Model of Amplitude-Dependent STP in MzBSWe previously developed a dynamical model describing the voltage- and history-dependent current density (i.e., volatile memristance, Equation S1, Supporting Information) measured in Mz-doped membranes at room temperature (21 °C).[35] The model consists of four state variables, three of which are dynamic: Nb, the number of available Mz species per area; P, the proportion of Nb in the nonconductive “pre-channel” state; C, the proportion of Nb in the ion-conducting channel state; and I, the proportion of Nb in the inactivated state (Equation S2–S8, Supporting Information). Empirical voltage-dependent rate parameters govern the reversible kinetic transitions between the P, C, and I states, Equation (S9) and (S10), Supporting Information. The current density of an MzBS device at a particular voltage is therefore proportional to the number of ion-conducting channels per membrane area, N, determined by the product of Nb and C, as detailed by Equation (S5), Supporting Information. An example of the predicted states to a step response using experimentally-fitted rate parameters as well as the variations in the values of the rate parameters at room temperature during a simulated sinusoidal voltage sweep with accompanying device current is shown in Figure S3, Supporting Information.
In agreement with device measurements at low stimulus levels, the model predicts that some fraction of Mz species initially in the P state transition to the conductive C state yielding a facilitation-only type STP response (Figure 1C, blue trace). The physical process for this transition involves insertion of Mz monomers into the membrane and oligomerization to form an ion channel.[35] Under higher steady potentials, the subsequent transition of Mz from states C to I causes the total number of channels, and thus the measured current (conductance), to peak and then decay in a manner similar to SA (Figure 1C, orange trace). Rate parameters defining backward transitions (e.g., C to P, or I to C) determine how quickly Mz species return to a prior state. In addition to amplitude-dependent STP under a single continual stimulus, our model also captures the adaptation of device conductance across longer times and under subsequent applied voltages. The simulated response in Figure 1D shows that repeated high voltage pulses induce diminishing SA-type responses due to cumulative depletion of Mz in the P and C states. The model thus provides a basis for simulating reservoirs consisting of independent MzBS nodes with amplitude-dependent forms of STP and spontaneous fading memory (Figure 1E).
Device Simulations for 5 × 5 Digit ClassificationTo prove the feasibility of pattern recognition using an MzBS biomolecular reservoir, we created a custom 5 × 5 binary image dataset comprising digits “0–9”, with 10 variations of each digit for a total of 100 unique patterns (see Figure S4, Supporting Information, for the full set). This smaller task provided a convenient environment for determining suitable input mapping and output selections from the MzBS reservoir. For inputting these spatial patterns, each row of pixels was converted to a train of rectangular voltage pulses defined by “high” and “low” stimulus levels (Vhigh and Vlow, respectively) as has been used elsewhere.[42,57,62] In this scheme, the reservoir consists of n = 5 independently operated devices, each receiving information from a separate pixel row (Figure 2A). The pulse duration for a single pixel is defined as tpixel. Black pixels (“0”) in a row remain at Vlow, whereas each white pixel (“1”) induces a single “high” level pulse with a 50% duty cycle (Vlow first, then Vhigh) (Figure 2A). Using these temporal voltage signals for each pixel row of the binary images, we simulated the dynamic evolutions in channel density, N, of all MzBS devices and recorded their final values at the vertical dashed line in Figure 2A. Model details and simulation parameters can be found in Table S1, Supporting Information. Thus, for this task, each 5-pixel row per image produced five independent voltage signals that were input into five independent MzBS devices. The simulated outputs for each simulation condition are classified 20 times using a linear classifier (0 hidden layers, 5 × 10 readout function), starting with newly randomized weights at each iteration. Due to the small size of the dataset, randomized image selections for training and testing produced variable results. Therefore, one randomly selected image from each digit class was used as the testing set, resulting in a 90%/10% training/testing split.
Figure 2. Image to pulse conversion scheme and 5 × 5 digit classification. A) Each pixel row is converted to a series of rectangular voltage pulses based on a time per pixel (tpixel), Vhigh, and duty cycle. These voltage pulses are then used as the temporal inputs for simulations of MzBS channel density, represented by the orange curves. The channel density values obtained at the time indicated by the black dotted line are used as features for a linear readout layer to classify which digit (0–9) is presented. Numbers in gray circles show that a single digit can be made of repetitions of unique rows. B) Differences in STP between model variations by tuning the translocation rate constant (k2, see Table S1, Supporting Information) in response to the “pixel-row 1” (from (A)) with Vhigh = 145 mV. By setting k2 to 0 (0×), a fast sigmoidal rise is achieved. 1× uses nominal parameters obtained from device fitting[35] and shows SA with a moderate amount of inactivation. 10× increases k2 by a factor of 10 as compared to 1× and shows a larger amount of inactivation, represented by the channel density nearly returning to its prewritten state by the end of the second pulse. C, D) Comparisons of classification accuracies versus pulse amplitude for varying levels of SA, while keeping all other parameters constant: duty = 50%, tpixel = 4 s, Vlow = 15 mV in all cases. Red asterisks indicate pulse amplitudes where 1× is significantly higher (see (E)) than both 0× and 10×. Green hashmarks indicate where 10× is significantly lower than both 0× and 1×. The blue “@” symbol shows the single case where 0× is higher than both 1× and 10×. Error bars represent standard error of 20 randomly seeded weight initializations. C) shows the overall (training + testing) accuracy. D) Testing accuracy only. E) Table containing multiple comparison p values from Kruskal–Wallis testing. “All” refers to overall accuracy in (C); “Test” refers to (D); Comparisons are color-coded to match designations in the table and graphs in (C, D) P value color represents the model variation with the higher value for that comparison.
A unique aspect of MzBS is that the shape of its dynamic response can take one of two different qualitative forms (Figure 1C) depending on stimulation amplitude and duration. Therefore, we varied Vhigh while keeping the value of tpixel, duty cycle, and Vlow fixed at 4 s, 50%, and 15 mV, respectively. This was done to induce a continuum of responses exhibiting facilitation-only to facilitation-then-depression type STP to study the effect of SA on 5 × 5 digit classification accuracy. We also modulated SA directly by varying the value of model parameter k2 (see Table S1, Supporting Information), which governs the transition rate of Mz from C to I (see Equation (S4)). It was either set to 0 (“0×”), left unchanged (“1×”), or multiplied by factor of 10 (“10×”). The 0× variation is representative of the typical conductance evolution of volatile memristors used in RC networks,[42,57,58,62] 1× is representative of slowly adapting sensory neurons in the eye, ear, and nose, and 10× is more representative of rapidly adapting synapses, such as those found in the brain.[14] The plots in Figure 2B show how k2 affects the simulated response of N to a three-pulse voltage sequence (corresponding to the top row of pixels for the image in Figure 2A) with a Vhigh value of 145 mV. In the 0× variation, facilitation-only STP causes unidirectional “writing” of the conductance state (N here) for memristors. This increase in conductance subsequently fades when voltage is then removed due to memristive volatility caused by C-state Mz channels exiting the membrane and returning to the P-state.[35] In the 1× variation, conductance initially increases but then decays even as the stimulation remains on. In the 10× variation, the postpeak conductance decay is more significant during the writing pulse, returning the device conductance almost to its prewritten value.
Simulations revealed that reservoirs consisting of unmodified (1×) MzBS devices were significantly more accurate than the 0× and 10× device models at many of the tested voltage levels. The overall (training + test) and test accuracy results are reported in Figure 2C,D. For Vhigh 115 mV, the three models (0×, 1×, and 10×) overlap in classification performance. They diverge from one another at 130 mV, where the 1× model shows higher overall accuracies for voltages between 130–175 mV and higher test accuracies at 145–160 mV than 0× and 10× variations. The 10× model performs consistently worse than the 0× and 1× models starting at 160 mV. The p values for these comparisons are reported in Figure 2E (see Methods for specifics on statistics).
Interestingly, 1× and 10× models both show a decrease in accuracy as voltage rises above 145 mV, whereas the 0× model begins to slowly increase, overtaking the 1× model in test accuracy at 205 mV. This is likely due to the state being overwritten in the 1× and 10× case at these larger voltages, as seen in Figure 2B, with 10× being an extreme case whose performance suffers because of it. This finding implies that RC nodes exhibiting moderate amounts of SA can outperform facilitation-only nodes in image classification tasks. Moreover, it informed our choices of stimulation voltages, model comparisons used to understand MzBS RC performance, and the role of SA on larger classification tasks, such as the Modified National Institute of Standards (MNIST) handwritten digit dataset.
Experiments Validate Predicted 5 × 5 Digit Classification PerformanceTo validate the model-based RC simulations, the 5 × 5 digit dataset (Figure S4, Supporting Information) was converted into voltage pulses and used to experimentally stimulate physical MzBS devices. The combination of using biomolecular memristors that exhibit SA and experiments to demonstrate their physical responses differentiate this work from a prior simulation-only study of RC using biomolecular nodes.[42] Specifically, the 5 × 5 digit dataset was separated into its basic set of 21 unique pixel rows containing at least 1 white pixel. This process is represented in Figure S5, Supporting Information, where it can be observed, for example, that these versions of a “3” and an “8” can each be constructed from two unique pixel rows; one of which is shared between them. Unique pixel rows are also labeled in Figure 2A with gray numbered circles. Experiments consisted of inputting voltage pulses derived from a single pixel row into one MzBS device and measuring the resulting current density at the predefined timepoint for each pixel-row (Figure S5, Supporting Information). Stimulations from several (but not all) unique pixel rows were input serially into the same MzBS device to reduce device-to-device variation and maintain stability. We allotted 30 min of inactivity (V = 0) in between each stimulation to allow the devices to fully reset[35] (see Methods for further details). In this way, we generated responses to the whole basis set with 21 independent measurements. Two full sets of basis row stimulations were created, one at Vhigh = 110 mV and another at a Vhigh = 120 mV, both voltages where SA is present (individual responses in Figure S6–S8, Supporting Information).
The measured feature set for Vhigh = 110 mV produced an accuracy of 84.46% (max 91.00%) and mean test accuracy of 68.92% (max 90.00%), where the same training/testing split as used in the simulations was used here. Increasing Vhigh to 120 mV resulted in a mean overall accuracy of 83.74% (max 88.00%) and mean test accuracy of 70.42% (max 90.00%). Of note, raising Vhigh to 120 mV resulted in currents beyond the measurement range of our equipment. This saturation caused measurement artifacts (see 120 mV in Figure S7 and S8, Supporting Information) that affected assessing the final state at higher voltages and prevented experimental measurements at higher voltages. It also necessitated a slightly different form of current normalization, see Methods. These experimental accuracies align well with the simulation results from 115 mV 1× (Figure 2C,D, Table 1), and compare favorably with experimental results on solid-state volatile memristors from other groups.[57,62]
Table 1 A comparison between classification methodologies presented in this manuscript (outlined by the thick black line) and previously reported works using two terminal devices. Classifications using the image pixel values as features are also presented for comparison. “F1” and “F2” refer to combinations of feature sets
Case | Data Source | Vhigh (mV) | tpixel | duty | # VNs | # Devices | # wts | Max OA Acc | Max Test Acc |
5 × 5 1× | RC Sim | 115 | 4 s | 0.5 | 0 | 5 | 50 | 0.9100 | 0.8500 |
5 × 5 1× | RC Exp | 110 | 4 s | 0.5 | 0 | 5 | 50 | 0.9100 | 0.9000 |
5 × 5 1×a) | RC Exp | 120 | 4 s | 0.5 | 0 | 5 | 50 | 0.8800 | 0.9000 |
5 × 5 Images | NN Sim | – | – | – | – | – | 250 | 0.9700 | 0.8500 |
5 × 1 Images | NN Sim | – | – | – | – | – | 50 | 0.4900 | 0.4500 |
MNIST 1×VN | RC Sim | 85 | 0.5 s | 0.8 | 0 | 20 | 200 | 0.8332 | 0.8364 |
MNIST 1×VN | RC Sim | 145 | 0.5 s | 0.8 | 0 | 20 | 200 | 0.7240 | 0.7300 |
MNIST 1×VN | RC Sim | 85 | 0.5 s | 0.8 | 3 | 20 | 800 | 0.9157 | 0.9149 |
MNIST 1×VN | RC Sim | 145 | 0.5 s | 0.8 | 3 | 20 | 800 | 0.9089 | 0.9150 |
MNIST 1×VN | RC Sim | F1: 85 F2: 145 | F1 0.5 s F2: 0.5 s | F1: 0.8 F2: 0.8 | 0 | 40 | 400 | 0.9170 | 0.9181 |
MNIST 1×VN | RC Sim | F1: 85 F2: 145 | F1 0.5 s F2: 0.5 s | F1: 0.8 F2: 0.8 | 3 | 40 | 1,600 | 0.9410 | 0.9411 |
MNIST SM | RC Sim | F1: 85 F2: 145 | F1 0.5 s F2: 0.5 s | F1: 0.8 F2: 0.8 | 0 | 160 | 1,600 | 0.9263 | 0.9261 |
MNIST SM | RC Sim | F1: 85 F2: 145 | F1: 0.5 s F2: 1 s | F1: 0.8 F2: 0.8 | 0 | 160 | 1,600 | 0.9271 | 0.9267 |
MNIST SM | RC Sim | F1: 85 F2: 85 | F1: 0.5 s F2: 1 s | F1: 0.8 F2: 0.8 | 0 | 160 | 1,600 | 0.9238 | 0.9200 |
MNIST 20 × 20 | NN Sim | – | – | – | – | – | 4,000 | 0.9285 | 0.9183 |
MNIST 20 × 8 | NN Sim | – | – | – | – | – | 1,600 | 0.9023 | 0.9049 |
Du et al. SM[62] | RC Sim | 1,500 | F1: 0.8 ms F2: 4 ms | F1: 0.625 F2: 0.125 | 0 | 88 | 1,760 | – | 0.9110 |
Du et al. SM[62] | RC Sim | 1,500 | F1: 0.8 ms F2: 4 ms | F1: 0.625 F2: 0.125 | 0 | 112 | 2,240 | – | 0.9150 |
Du et al. SM[62] | RC Exp | 1,500 | 0.8 ms | 0.625 | 0 | 88 | 880 | – | 0.8560 |
Du et al. SM[62] | RC Exp | 1,500 | F1: 0.8 ms F2: 4 ms | F1: 0.625 F2: 0.125 | 0 | 88 | 1,760 | – | 0.8810 |
Midya et al. SM[57] | RC and NN Exp | 1,250 | 200 us | 0.5 | 0 | 220 | 2,200 | 0.8300 | – |
Hossain et al. SM[42] | RC Sim | 120 | 12 ms | 0.833 | 0 | 88 | 880 | – | 0.8851 |
Hossain et al. SM[42] | RC Sim | 130 | 12 ms | 0.833 | 0 | 88 | 880 | – | 0.8866 |
a)This experiment was normalized by initial device current rather than device area.
We also compared the MzBS RC experimental results to using the individual pixel values (i.e., 0 black or 1 for white) as features for training a linear readout layer (0 hidden layer neural network), again using training 200 iterations with randomized starting weights. This resulted in a mean overall accuracy of 91.94% (max 97.00%) and mean testing accuracy of 59.72% (max 90.00%). The higher overall accuracy compared to test accuracy suggests that overfitting occurred, leading to poor generalization. In addition, using all 25 pixels required training of 250 weights compared to only 50 for the RC scheme. Compressing the 5 × 5 images to 5 × 1 (see Methods) reduced the mean overall accuracy to 40.37% (max 49.00%) and mean testing accuracy to 25.98% (max 45.00%) (Table 1). This reinforces the purpose of the reservoir computing scheme: fewer trained weights with comparable results.
Optimal tpixel and duty cycle will vary between individual or groups of devices due to device-to-device variations that affect the total number of Mz species interacting with the membrane. It is likely that these experimental results can be further improved by optimizing the input parameters and assembling arrays to test MzBS devices concurrently. Still, the accuracies achieved from experimental measurements add confidence to our model-based predictions, including that devices exhibiting SA can achieve favorable classification accuracies for small datasets.
Device Simulations for MNIST Handwritten Digit ClassificationUsing the same pixel row to voltage pulse train scheme as used for the 5 × 5 digit classification (Figure 2A), we simulated the responses of larger MzBS reservoirs to test the accuracy of MzBS and study the effects of SA in classification of the MNIST handwritten digit dataset. MNIST images were cropped from 28 × 28 to 20 × 20 and binarized (Figure 3A). We hypothesized that maximizing the performance of the reservoir depends on fully leveraging the history dependence of the volatile memristive nodes. We therefore employed both virtual node-multiplexing (VN) and device-multiplexing to create classifiable RC features. These strategies have been used in previously reported RC applications.[43,57–59,62,63] VNs are stored states from earlier timepoints of the signal which are used as additional features for the readout layer to use for classification. Device-multiplexing sends each input signal to multiple devices, usually where the additional devices receive an amplitude-modified or timescale-modified version of the same input.[43,62,65] In our multiplexing scheme, only the value of Vhigh changes, keeping the same tpixel and using three prior states (VNs) in addition to the final state (Figure 3A).
Figure 3. MNIST digit classification details and results. A) MNIST digits are first cropped from 28 × 28 to 20 × 20 and binarized. Features are then created by virtual nodes from multiple devices (1× and 0×) or separated into smaller sections before being input into different devices at two different rates/timescales (SM, see Figure S9, Supporting Information). The features are labeled according to device and VN number for clarification of how they are used in the readout layer. B–D) Heatmaps of average testing accuracies (n = 20) obtained after combining feature sets created with differing Vhigh. tpixel = 0.5 s for both axes of (B) and (C). Highest average for each heatmap indicated by a green outline. (B) 1×VN model containing SA. C) 0×VN model containing no SA. D) SM method combining differing tpixel to compare with previous work's methods,[62] namely on the diagonal where Vhigh is equal for each feature set.
We also hypothesized that multiplexing devices exhibiting SA with those that only facilitate would increase accuracy due to the differing trajectories of current density that determine the nonlinear map between inputs and classified outputs. We tested this by constructing several feature sets containing reservoir states at differing Vhigh, tpixel, and duty cycle that can be “mixed and matched” with each other, as in Figure 3A (righthand side). Feature sets using these same values are also constructed using the sectioning method (SM)[42,57,62] and the VN method with inactivation (SA) disabled (0×VN) (Figure 2). An example of the sectioning method (SM) is shown in Figure S9, Supporting Information, to compare with VN multiplexing in Figure 3A. Using two different values of tpixel, three feature creation methods, two values of duty cycle, and seven values of Vhigh resulted in 84 possible combinations of feature sets to classify. Chosen values were: 0.5 and 1 s for tpixel; 50% and 80% duty cycle; and 70–160 mV in increments of 15 mV for Vhigh. The simulated reservoir states for each of these sets generated 80 output features (20 rows, 4 states per row). Combining two sets of multiplexed devices resulted in 160 features (160 × 10 readout), which is the same as or fewer than the number of features per row in previous works, but with fewer devices: 40 here compared to 88 or more elsewhere,[42,57,62] see Table 1 for details on number of devices and number of weights trained.
For each combination, the merged features were classified by a linear readout layer 20 times, using newly randomized initial weights each time with an 80%/10%/10% training/validation/testing split on the full 70 000 digit set, divided randomly. The highest performing 1×VN, 0×VN, and SM feature sets are shown as accuracy heatmaps in Figure 3B–D. Between 1×VN and 0×VN, using the same value of Vhigh (i.e., values on the diagonals) for the two sets of devices in a multiplexed RC network performs worse than combining devices stimulated at different voltage amplitudes (i.e., values off the diagonals). Regardless of device model variation, accuracy increases when devices receiving low voltage stimulus are multiplexed with devices receiving high voltage stimulus. Test accuracies for the SM are roughly uniform until 145 mV, where they decline (Figure 3D).
The 1×VN feature set combination obtained for 85 and 145 mV pulse amplitudes with tpixel = 0.5 s and duty = 0.8 achieved the highest mean test accuracy at 93.62% (max 94.11%) (Figure 3B). This is significantly higher than the mean test accuracy of 93.33% for 0×VN (max 93.86%) (p = 0.0269) and SM (92.04%, p = 4.96 E-19). The highest maximum test accuracy achieved by any feature set combination was 94.34% (mean accuracy 93.56%, Figure 3B) obtained with the 1×VN reservoir, which multiplexed devices at 70 and 145 mV and set tpixel = 0.5 s and duty = 80%. These results support the hypothesis that SA present in the device model for the nodes leads to higher performance. The simulated performance of MzBS reservoirs also compares favorably to similar memristor-based RC network performance, achieving higher average and max accuracies for all methods. This includes theoretical maxima predicted from simulation.[42,57,62]
In aggregate, the classification accuracies of 0×VN and 1×VN reservoirs with device multiplexing are not statistically different (n = 84, p = 0.1247). In contrast, they are both significantly higher (p ≤ 6.45 E-16) than when SM is used. This suggests that a virtual node method is responsible for some of the improvement in accuracy demonstrated by MzBS. We also tested classification accuracies of individual feature sets with and without virtual nodes; using the raw pixel values of the 20 × 20 images as the features; and using the raw pixel values of compressed 20 × 8 images as features for the linear readout to make the number of trained weights equal to the 1×VN method. These results are summarized in Table 1 and Table S2, Supporting Information.
SA Benefits Dynamic Nonlinear Function FittingModel-free dynamic nonlinear function fitting is another common task in benchmarking RC performance.[42,49,57,58,62,63,66] To investigate the role of SA in this type of task, two common functions were fit using MzBS RC networks. The first[Image Omitted. See PDF]is discrete in time and is a second-order, nonlinear, autogregressive moving average, abbreviated NARMA2.[67] For the purposes of the benchmark, u(k) was set to be a random number between 0 and 0.5. This number was scaled to a voltage range of 70–100 mV (Figure 4A), which was then converted to a rectangular voltage pulse with varying duration and used as the input to a reservoir consisting of 90 MzBS devices with preprogrammed device variation (see Methods for details). The simulated channel densities from all devices at the end of each stimulation block were then fit by regression to the true function value by a 90 × 1 readout layer using the normalized mean square error (NMSE), defined as[Image Omitted. See PDF]where pi(k) is the predicted value at the kth step or time k = t and the ith device, and yi(k) is the true function value.
Figure 4. Model-free fitting of nonlinear transfer functions. A) A representation of how random numbers u(k) are converted into voltage pulses. The same is true for Mackey–Glass, except the timescale does not vary and the numbers are determined by the function instead of being random. B) Blue represents the true function value y(k) of the NARMA2 function for a given random input u(k) and orange represents the predicted value based on reservoir states after training on voltage pulses made from u(k). C) The testing prediction set using a newly generated u(k) and the same weights from (B). D) Mackey–Glass chaotic time series fitting. True values in blue. Training occurs from timepoint 150–400, where 1× and 0× models essentially converge on the training set. From 401 to 1100, the testing fits of 1× and 0× are close, but not the same (see Figure S10, Supporting Information). E) Boxplot of 1× and 0× testing error. 1× achieves significantly lower testing error (p = 1.986 E-14).
The reservoir was allowed to equilibrate for 149 timesteps before training occurs on timesteps 150–400. Ten training and testing iterations were performed to compare 1× and 0× device variations, with both model variations using the same randomized initial weights as well as the same randomly generated u(k) in each iteration. Results from 1× are shown in Figure 4B,C. No statistical difference was found between 1× and 0× for training and testing errors suggesting SA is neither beneficial nor harmful in this task. In Figure 4B, the NMSE error (Equation (2)) is 5.34 E-4, indicating a very good fit to the training set values. Using a newly generated set of random numbers u(k) from 0 to 0.5 as the testing set in Figure 4C, the error remained below 1E-3, showing that the fit generalized well. These errors are lower than previously reported in memristor-based reservoirs.[42,62]
The second benchmark equation is the Mackey–Glass equation, given by [Image Omitted. See PDF]which is a continuous time delay-differential equation with physiological relevance.[68] In this task, the following parameters were used: τ = 18 s, β = 0.2, ϒ = 0.1, and n = 10. Details about function initialization can be found in the Methods section. The function was discretized by generating the expected output every 0.5 s. These values were converted to voltage pulses with a 50% duty cycle between 60 and 90 mV and used as the inputs to 20 simulated MzBS devices, the same as Figure 4A, but with pulse width set at 0.5 s. The values of the function at the prior 49 timesteps were used as additional features for classification, resulting in a 1000 × 1 readout layer. The reservoir was allowed to equilibrate for 150 timesteps before training begins. In Figure 4D, the true signal is represented in blue, while the weighted fit to the training timepoints is represented in orange (1× model) and purple (0× model). Both model variations fit the training data very well, achieving average NMSEs of 5.35 E-4 and 5.30 E-4 for 1× and 0× device models, respectively, based on ten randomized weight initializations. Both model variations, using the weights trained during the training timepoints with no updating, fit the testing timepoints (1× in yellow, 0× in green, Figure 4D) well, with the 1× model achieving a significantly lower (p = 1.986 E-14) error of 6.63 E-4 compared to 0× at 6.68 E-4 (Figure 4E). A zoomed inset can be found in Figure S10, Supporting Information. This demonstrates that SA provides a small but tangible benefit.
Demonstration of MzBS Reservoir Classifying Dynamic GustsIncorporating physically realized RC networks as peripheral nervous analogs in autonomous systems can enable efficient and adaptive information processing. For example, this could benefit manned and unmanned flight control by offloading critical computational resources to the sensors themselves rather than the pilot or central processing unit. To test the signal processing capability of an MzBS reservoir for signals relevant to an aerial vehicle, we collected pressure measurements from six pressure sensors on an aircraft wing to distinguish between types of dynamic gusts (Figure 5A). The sensors were located at 0c, 0.05c, 0.015c, 0.1c, 0.4c, and 0.5c where c is the chord length of the wing. Different gust pressure distributions were generated by moving an upwind airfoil between set angles of attack, moving from 0° → + a° → −a° → 0°, where a was positive or negative. The gust types are delineated by the three amplitudes and sign of a (±7.5°, ±10°, and ±12.5°) resulting in six classes of gust. For example, “+12.5°” would refer to 0° → +12.5° → −12.5° → 0°, an example of which is presented in Figure 5B, while “−7.5°” would refer to 0° → −7.5° → +7.5° → 0°. These movements altered the pressure at each sensor location on the downstream wing (Figure 5B).
Figure 5. Gust classification using dynamic pressure measurements on a NACA 0012 Wing. A) Schematic of the wind tunnel experimental setup. An obstructing airfoil rotates between different angles of attack to create an artificial gust on the wing further downwind. Pressure taps are labeled (from left to right): 0c, 0.015c, 0.05c, 0.1c, 0.4c, and 0.5c. B) A single trial example of normalized pressure for a +12.5° deflection. The lines labeled T1, T2, T3, and T4 are the timepoints where the pressures or states are sampled to use as features for classification. C) An example of how the state variable “C” evolves over time when normalized pressures are converted to voltages and used to stimulate an MzBS reservoir. D) Histograms of testing accuracies for comparison of MzBS reservoir models vs static neural networks using different features. Histogram bars belonging to a group are slightly offset from others to make distinguishing easier. E) Box plots of the histogram accuracies. Multiple comparisons using nonparametric tests show RC 1× at 100% is significantly higher (p [less than] 0.0478) than all other cases.
The pressure readings from the six sensors were then converted to proportional voltage signals used as separate inputs to a six-device MzBS reservoir. The channel proportions C for all devices were simulated across the entire history of pressure measurement (Figure 5C), but only their values at the final time point (T4) were used as the feature set for a linear readout layer. A reservoir implementing 1× MzBS devices achieved 100% classification accuracy across 1000 randomly initiated weights, regardless of whether final states of all six devices or just the final state of the device receiving information from tap 0c, which exhibited the largest sensitivity to changes in angle of attack, were used as features. Gust classification accuracy dropped substantially to an average of 77.26% (when all devices were used) and 66.31% (using only the device corresponding to sensor 0c), respectively, when the reservoir consisted of facilitation-only 0× MzBS devices. This comparison demonstrates that SA resulting from Mz-channel inactivation plays a significant role in discrimination between gust types.
To compare these classification results to traditional neural network architectures, several variations of “static” neural networks were tested for this task. These variations received floating point values of normalized pressure at four specific timepoints (Figure 5B), resulting in a total of 24 classifiable features (six pressure taps × four timepoints). All cases were tested with 1000 iterations using randomized initial weights. Using a linear readout from all 24 features (gold) for the six gust types (24 × 6 readout) resulted in a testing accuracy of only 58.21% (Figure 5D). We speculated this occurred due to extraneous information provided by T1 and T4 features. To remedy this, the feature set was reduced to using only features from T2 and T3 (i.e., when the angle of attack is not zero), lowering the number of features from 24 to 12 (12 × 6 readout). This increased the test accuracy to a mean of 80.64%. To improve the fairness of comparison, nonlinearity was then added via a six-neuron hidden layer (HL) and only the two static features (pressure values at timepoints T2 and T3, 2 × 6 × 6 architecture) derived from tap 0c were included (denoted as the Static 0c T2 + T3 HL case). This dramatically increased performance, resulting in a mean testing accuracy of 98.18%.
As a final comparison, we determined the test accuracy for a static network with one six-neuron, hidden layer using only one feature (1 × 6 × 6), the pressure from tap 0c at time point T4. This selection allows for comparison to using only the final state at T4 for the MzBS device corresponding to tap 0c. Predictably, the accuracy of the static network falls to nearly 0 because the value of pressure at timepoint T4 contains no information about what angles of attack occurred at earlier times. Using the Kruskal–Wallis multiple comparison test (due to far from normal distributions), the MzBS is still statistically different (p < 0.0478) with higher mean ranks than all compared groups (Figure 5E). Comparing an RC implementing 1× MzBS model to the Static 0c T2 + T3 HL (highest performing static) reduces the p-value dramatically (p < 2.884 E-23).
These comparisons showcase the advantages of using a physical reservoir based on volatile memristive nodes, as well as the advantage of SA exhibited by MzBS, in classifying with real-world time-varying signals. One such advantage is spending less effort on feature selection. The static network performed poorly when overwhelmed with extraneous features, whereas the MzBS reservoir started with fewer features that captured the same information and required no feature selection, showing 100% classification accuracy “out of the box”. Another advantage is the ability to sample at any timepoint to gain information about the history of the signal. This is also intrinsic to the reservoir, as it stores the information of previous stimulation and only requires the T4 point of information. The static network is wholly dependent on the timepoint picked ad-hoc by an intelligent outside actor, while the MzBS reservoir is a continuously updating and autonomous processor that provides useful transformations for dynamic data.
ConclusionIn conclusion, we show that reservoir computing performance may be enhanced through the addition of amplitude-dependent sensory adaptation as exhibited by monazomycin-based biomolecular synapses (MzBS). Through experiments, we confirm the feasibility of accurate classification using a physical network of devices. We suspect that larger biomolecular reservoirs incorporating membrane-bound channels with faster or more complex forms of memory will be able to accomplish even more complex tasks. While RC without SA is the norm and works reasonably well, some amount of SA appears to be beneficial in amplifying small differences in signals that can be leveraged to make a distinction. When SA is too strong or fast, the discriminating information is lost or overwritten. It is well understood from an information theory and thermodynamic perspective that SA overwrites old information in favor of new information, so it is reasonable to believe that SA imparts new information to the system for processing as opposed to facilitation-only STP.[69,70]
We also show that an MzBS RC can reconstruct and fit nonlinear dynamical functions in a model-free fashion with improved error compared to previous memristor-based RC networks. However, we determine that SA is not always beneficial, and should therefore be judiciously applied when appropriate.
Lastly, while RC has been studied for use in autonomous vehicles and control,[71–73] its integration as a sensory system to complement feedback control has been limited.[74] We demonstrate a novel application by using pressure data from an airfoil to dynamically classify different types of gusts and show that it outperforms traditional forms of neural networks, using fewer trained weights and achieving perfect accuracy. Perfect accuracy was only achieved when SA was present in the device model. We believe presenting a physically realized MzBS reservoir with more complex gust patterns holds promise as a sensory system analog.
Methods Simulations and ClassificationSimulations of Mz-based biomolecular synapses (MzBS) were completed in MATLAB based on the three-state voltage-dependent model published previously.[35] To recap, the proportional amount of Mz in each of the three states (preinserted, conducting, and inactivated) is governed by transition rates that vary exponentially with voltage, see Supporting Information for deeper discussion. Here, we use the generalized version with parameters for a 1:1 DOPC:DPhPC membrane (see Table S1, Supporting Information, for a list of parameter functions and values). Noise was not modeled due to the high precision in current measurement of the Axopatch 200B.
Classifications were performed using the default neural network training functions in MATLAB included with the machine learning and deep learning toolboxes. No difference was found between training function options besides time of execution, therefore the default “trainSCG” (stochastic gradient descent) was used. Large-scale simulations of the MNIST data set were performed using resources provided by the Infrastructure for Scientific Applications and Advanced Computing at the University of Tennessee.
NARMA2 FunctionTen groupings of nine devices (total of 90) are used, as described previously.[62] The timescale of each pulse per step k are varied between groupings, from 0.1 s in the first grouping to 1 s in the last grouping. Each pulse has a duty cycle of 50%, with the low voltage being constant at 15 mV. Device variations are also included, in this case, the Mz autocatalytic insertion rate (k1B, Table S1, Supporting Information) may vary from 0.85 to 1.15 times its nominal value at a given voltage, assigned randomly.
Mackey–Glass FunctionThe Mackey–Glass equation requires a “history function” to tell it what to consider values before t = 0 due to its recursive nature. In this study, we set the history function to be constant at 0.8. The function values generated using this history are then converted to voltage pulses ranging from 60 to 90 mV in amplitude with a duration of 0.25 s with an off time of 0.25 s and an “off” low voltage of 15 mV. To introduce device variation, 2 Mz model parameters are allowed to vary: k1b, k1r, as well as initial channel density N0. These parameters vary between 0.1 and 5 times their nominal values calculated from the model at a given voltage. This range represents a realistic difference between device responses.
Statistical TestingStatistical testing in this manuscript follows these prescribed steps:
1) Check for normality with a Shapiro–Wilk Test.
2) Test for differences with t-test (2 groups)/ANOVA (>2 groups) for normally distributed data or Wilcoxon (2 groups)/Kruskal–Wallis (>2 groups) for nonnormally distributed data.
3) Perform Tukey HSD post hoc test, if applicable (for >2 groups).
4) Determine sample size (n) required for ≥90% statistical power based on group differences.
5) If number of samples ≥n, reject null hypothesis.
These were applied to all reservoir simulations, where sample size was 20 for MNIST classifications, 200 for 5 × 5 digit classification, 10 for NARMA2 and Mackey–Glass function fitting, and 1,000 for dynamic gust classification. All processing was done in MATLAB.
Experimental Section Vesicle PreparationLipids were purchased from Avanti Polar Lipids (Alabaster, AL) in chloroform. Lipid films were produced by fully evaporating chloroform-dissolved stock under nitrogen stream and then additionally desiccated for an hour under vacuum. Mixtures of 1:1 DOPC:DPhPC were made by mixing chloroform solutions of each prior to film creation. Films were rehydrated in ultrapure water at a concentration of 5 mg mL−1 to make stock aqueous lipid solutions. Stock solutions were frozen and thawed four times and mixed well before being passed 11 times through a 100 nm extrusion filter (Whatman). Lipid stocks were combined with potassium chloride and 3-(N-morpholino)propanesulfonic acid (MOPS) from Sigma (St. Louis, MO) and monazomycin from Cayman Chemical (Ann Arbor, MI) to obtain final concentrations of 2 mg mL−1 lipid, 100 mm potassium chloride, 10 mm MOPS, and 20 μg mL−1 Mz. Solution pH was consistently measured to be near 6 and was not modified for device testing.
Electrophysiological MeasurementsMzBS ionic currents were measured using an Axopatch 200B patch clamp amplifier in conjunction with a Digidata 1500B digitizer, with a noise level of <5 pA (0.000125% measurement range). Voltage was supplied by an NI 9263 output module and was controlled through MATLAB.
Droplet Reservoir Digit ClassificationDroplet interface bilayers were constructed using lipid-in[75] 1:1 mole ratio DPhPC:DOPC vesicles at 2.36 mm, 20 μg mL−1 Mz, 100 mm potassium chloride, and 10 mm MOPS. Experiments were conducted in hexadecane oil at room temperature (21 °C). Due to Mz causing innate instability in the membrane, membranes had to retain a high resistance state at 0 mV for at least 10 min before experiments could proceed. Images were taken using a Ximea XiC bottom view camera and processed in MATLAB. Area normalization was performed using custom MATLAB scripts. A total of 7 of the 21 experiments for each pixel set were performed on individual MzBS devices, resulting in 6 total MzBS devices represented overall, 3 for each set. For the 120 mV set, device area was too small to measure, necessitating a different form of normalization. Instead, the fractional change from baseline current is used: I/I0 – 1, where I0 is the current from the initial 3 s of the experiment, all at 15 mV. The minimum I0 from each set of experiments is used to replace the row of all “0's”.
Airfoil in Wind TunnelTo develop a gust alleviation system for spanwise camber morphing wing, we installed six pressure taps on the top surface of a 228 mm wide NACA 0012 wing with a 320 mm chord and a multifunctional macrofiber composite trailing edge. We placed the gust-sensing wing at a 9°±1° angle of attack in the 30 × 30 cm wind tunnel at the University of Michigan. A 15 cm wide rigid airfoil located 30 cm upstream of the morphing wing deflected at angles between ±12.5° to generate gusts similar to updrafts and downdrafts found in dynamic aerial environments. Elliptical endplates were mounted to the ends of the morphing wing to prevent tip vortices and limit the aerodynamic analysis to two dimensions in the 10 m s−1 airflow environment.
Classifying the degree of gust from onboard sensor signals can inform gust alleviation controller decisions to compensate for change in wing lift through camber morphing. Although pressure can be used to directly measure lift produced by an airfoil, we chose to perform classification directly from signals provided by six pressure taps. Six compact differential low-pressure transducers provided signals proportional to the difference between static pressures experienced at each pressure tap location and the static pressure measured upstream of the testing section using a pitot tube. The low-pressure transducers provided signals between 0 and 5 V, with expected pressure signals of 3.40, 3.66, 3.46, 3.27, 2.99, and 2.83 V for each pressure tap, beginning with the leading edge and working backward. We gathered pressure sensor signal data and lift values for gust combinations of ±[7.5°,10°,12.5°]. By centering our normalization around 2.5 V we could distinguish between positive and negative differential pressures. We found that due to the 9° angle of attack, differential pressures remained positive, and using a scaling factor of two bounded pressure signals between [0, 1] for the largest tested gusts. Therefore, we normalized the pressure values using[Image Omitted. See PDF]where is the measured pressure signal and is the normalized pressure signal.
AcknowledgementsThis work is supported in part by the US Air Force Office of Scientific Research under grant number AFOSR-FA9550-19-0213, titled “Brain Inspired Networks for Multifunctional Intelligent Systems in Aerial Vehicles’’ and grant number FA9550-21-1-0325, titled “Towards Neural Control for Fly-by-Feel Morphing’’ both monitored by B. L. Lee. S. A. S. acknowledges financial support from the James Conklin Faculty Fellowship at the University of Tennessee. Computations pertaining to the MNIST handwritten digit dataset for this work were performed on the computational resources at the Infrastructure for Scientific Applications and Advanced Computing (ISAAC) supported by the University of Tennessee.
Conflict of InterestThe authors declare no conflict of interest.
Data Availability StatementThe data that support the findings of this study are available from the corresponding author upon reasonable request.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2023. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Despite its prevalence in neurosensory systems for pattern recognition, event detection, and learning, the effects of sensory adaptation (SA) are not explored in reservoir computing (RC). Monazomycin-based biomolecular synapse (MzBS) devices that exhibit volatile memristance and short-term plasticity with two strength-dependent modes of response are studied: facilitation and facilitation-then-depression (i.e., SA). Their ability to perform RC tasks including digit recognition, nonlinear function learning, and aerodynamic gust classification via combination of model-based device simulations and physical experiments where SA presence is controlled is studied. Simulations exhibiting moderate SA achieve significantly higher accuracy classifying a custom 5 × 5 binary digit set, with experimental validation achieving maximum testing accuracies of 90%. Classifications of the Modified National Institute of Standards and Technology (MNIST) handwritten digit dataset achieve a maximum testing accuracy of 94.34% in devices with SA. Fitting error of the Mackey–Glass time series is also significantly reduced by SA. Experimentally obtained pressure distributions representing gusts on an airfoil in a wind tunnel are classified by MzBS reservoirs. Reservoirs exhibiting SA achieve 100% accuracy, unlike MzBS reservoirs without SA and comparable static neural networks.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer