1. Introduction
The traditional computing systems based on the von Neumann architecture typically utilize deterministic binary bits to encode information and perform calculations, leading to inefficient solutions for combinatorial optimization problems, which are often Non-deterministic Polynomial-time hard (NP-hard) or NP-complete problems, such as the knapsack problem, integer factorization, and traveling salesman problem (TSP). Probabilistic Bit (P-Bit), as the core device to construct the hardware entity of Ising computation [1], owing to its inherent stochasticity, is proposed for efficiently solving combinatorial optimization problems. Several hardware implementations of P-Bit devices are proposed, for instance, stochastic low barrier magnetic tunneling junctions (MTJs) [2,3,4,5,6], spin orbit torque (SOT)-driven MTJ [7], non-linear oscillators [8,9,10,11,12,13], diffusive memristors [14], metal–insulator-transition-based P-Bit [15], optical parametric oscillators [16,17,18,19,20], CMOS circuits [21,22,23,24,25], and resistive random access memories (RRAMs) [26].
Among them, MTJ P-Bit devices offer advantages like high speed (∼ns), ultra-low power (∼W), and small footprint (∼10 nm). However, in order to implement the stochastic property of P-Bit, the energy barrier of the MTJ P-Bit needs to be reduced to approximately . In practice, it is very hard to fabricate multiple MTJ P-Bits with the same , especially since the is too small to be controlled. This leads to severe intrinsic variation in MTJ P-Bit devices [27,28]. To ensure the correctness of Ising computation, the P-Bit devices are required to be calibrated one by one, which is prohibitively expensive for a large P-Bit array. Even worse, within an analog P-Bit system, the individual variations of P-Bits remain elusive to direct measurement and calibration. Several methods have been presented to address the variation issue, including time division multiplexing (TDM) [29,30], applying external magnetic fields and voltages [31], and configuring suitable resistances [2]. However, these approaches are either operationally complex or not scalable.
Boltzmann machines are stochastic recurrent neural networks inspired by Boltzmann distribution and have widespread applications in generative machine learning [32,33]. Recently, hardware implementations of restricted Boltzmann machines have been proposed [34,35]. In response to the variability of MTJ P-Bit devices, an in situ learning approach based on fully visible Boltzmann machine (FVBM) has been introduced [36], which updates the weight matrix to adapt to P-Bit variation. This approach regards P-Bits with variations as a unified system and trains suitable weights at the system level, rather than calibration. However, it requires retraining the weight matrix for different problem instances, significantly affecting its scalability and increasing the complexity.
This paper is structured as follows: Section 2 presents a behavioral model of P-Bits incorporating and . Subsequently, by analyzing computations of the AND gate using both ideal and real P-Bits, it underscores the pressing need to address P-Bit variation. A weight compensation algorithm is then proposed in order to nullify variation by rederiving the weight matrix. Central to this methodology is the imperative requirement for precise knowledge of the and values for each P-Bit. Consequently, Section 3 introduces a variation extraction algorithm based on the Boltzmann machine to capture individual P-Bit variations. Examination of the extraction inaccuracies on the AND gate array reveals the necessity of constructing a suitable density matrix, leading to the proposal of an adaptable 3D ferromagnetic model capable of achieving accurate and scalable array variation extraction. In Section 4, we demonstrate the effectiveness, transferability, and scalability of our approach by successfully solving the 16-city TSP and 21-bit integer factorization on a large P-Bit array with corrected variations given by the Automatic Extraction and Compensation methods.
2. Weight Matrix Rederivation to Compensate P-Bit Array Variation
The P-Bit device is the main building block of the Ising computer. It is actually a binary stochastic neuron (BSN) [37,38] which can fluctuate between −1 and 1 with probability that is tuned by the input voltage (or current). The curve of controlled probability with respect to input voltage is usually Sigmoid-like. The behavioral model of P-Bit is written as follows [39]:
(1)
where is the output P-Bit state at the next time step, represents the input voltage, implements the Sigmoid function, and is a uniformly distributed random number between −1 and 1.2.1. Detrimental Impact of P-Bit Array Device Variation on Ising Computation
In our previous work, an ultra-fast field-free stochastic SOT P-Bit device is proposed and fabricated [7]. As shown in Figure 1a, a stack was deposited consisting of IrMn(7.5)/CoFe(2)/Ru(0.8)/CoFe(2)/CoFeB(1.9)/MgO /CoFeB(1.2)/W(5). Figure 1b displays the SEM image; the stack was etched into elliptical MTJs with a major axis of 137 nm and a minor axis of 107 nm. As illustrated in Figure 1c, The SOT MTJ structure involves adding a heavy metal layer beneath an MTJ, which is used to apply a SOT voltage. This voltage can influence the double-wall energy barrier via the SOT effect. The elliptical MTJ has a stray field that inherently favors the ‘0’ state. When the SOT voltage is low, the SOT MTJ tends to stay in the ‘0’ state. As the SOT voltage increases, the probabilities of the ‘0’ and ‘1’ states become 50/50%. While the SOT voltage is sufficiently high, the SOT MTJ is more likely to remain in the ‘1’ state. The Sigmoid curve of the SOT P-Bit device is fitted by the behavioral model, as shown in Figure 2. The behavioral model can fit the experimentally measured data well. However, upon measuring numerous SOT P-Bit devices, we found that the experimentally measured Sigmoid curves exhibit large intrinsic variation. As displayed in Figure 3, it is clearly observed that, compared to the ideal Sigmoid curve, the deviations of the real curve fall into two categories: stretched or compressed in shape, and rigid shift. This characteristic is also illustrated in both [7,40]. In ref [7], we have proposed to use to describe the degree of stretching and compression and to describe the rigid shift of the Sigmoid curve. The behavioral model that incorporates and to describe device variations is written as follows:
(2)
Based on the modified behavioral model, we can fit our experimental Sigmoid curves with different combinations of and . However, these device-to-device variations severely compromise the accuracy of Ising computation. To further examine the effect of the P-Bit device variation on Ising computation quantitatively, we connect three P-Bits to construct an AND logic gate. In Ising computing, the input voltage for each P-Bit will be determined as follows:
(3)
where is the inverse pseudo-temperature, is the coupling weight between the and P-Bit, and is the bias of the P-Bit. The input voltage will be calculated from the previous states of other P-Bits, which means the P-Bits are coupled with each other.As depicted in Figure 4, three P-Bits are interconnected. The coupling weight matrix for the AND gate is also provided, and the truth table for the AND gate is given as Table 1. Each truth value of AND gate has a probability: .
The result of AND gate computation in three scenarios of P-Bit variation is shown in Figure 5. As displayed in Figure 5a, the ideal P-Bit array obtains accurate results in AND gate computation at ; each of the four correct states is approximately and this make total accuracy. While in Figure 5b, the real P-Bit array extracted from the experimental data with different and cannot get accurate results in Ising computing. The accuracy from the real P-Bit array is , much smaller than the ideal P-Bit array. It is even worse that the four correct states are not equal; the states of P-Bits in computing are trapped in two states of high probability rather than fluctuating among the four correct states as expected.
We employed the Monte Carlo method to perform a statistic analysis of the AND gate. To better evaluate the computational results, we utilized the Kullback–Leibler divergence (KLD) [41]:
(4)
where the is the designed probability distribution, and is the probability distribution calculated from Ising computer. The KLD measures the similarity between two probability distributions. A smaller KLD value indicates a better quality of the computed results. As shown in Figure 6, the horizontal axis represents the standard deviation of , while the vertical axis represents the standard deviation of . Each pair of different is sampled 1000 times, and their KLDs are averaged and plotted as one point in the figure. As illustrated in Figure 6, a small standard deviation of results in a significantly large KLD for the AND gate computation, while the tolerance for is higher. When the standard deviation of is less than 0.5, the computation can ensure sufficient accuracy. Additionally, it has been indicated that the device variation measured from experiments without correction exceeds the range shown in Figure 6.The variations in the real P-Bit array lead to the wrong results in Ising computing. As the scale of solved problem expands, the negative effect of P-Bit array variation on Ising computation will accumulate. The main drawback in hardware implementation of Ising computing is that the P-Bit devices have to be calibrated individually to get the same Sigmoid curves, which is not practical for a large array. A novel approach is required for addressing the P-Bit device variation, instead of expensively calibrating one by one.
2.2. Transferable Compensation for P-Bit Array Variation by Rederiving Weight Matrix
In this paper, we first propose the weight compensation method for addressing the P-Bit array variation. This method compensates the P-Bit device variation by rederiving the weight matrix instead of calibrating each P-Bit device individually. In Figure 7, the process of the weight matrix compensation is displayed. For the ideal P-Bit, the weight matrix is only calculated from the Ising Hamiltonian corresponding to the problem, while for the real P-Bit array, it is necessary to integrate extra parameters into the weight matrix to mitigate the variations inherent in the P-Bit device array.
In accordance with Equations (1) and (2), achieving identical output probabilities for P-Bit i in both the real P-Bit array and the ideal P-Bit array requires
(5)
therefore, the applied input voltage must adhere to the subsequent relationship:(6)
through a straightforward transformation, we can obtain the following:(7)
In the context of a whole P-Bit array, the above equation can be articulated in a matrix representation:
(8)
where , , and V are vectors, and n is the number of P-Bits in computing. The is the diagonal matrix generated from vector . Equation (3) can also be expressed in matrix form:(9)
where the V, m, and B are vectors, and W is weight matrix. W and B are calculated from the Ising Hamiltonian based on the solved problem. Upon the substitution of Equation (9) into Equation (8), the resulting expression is as follows:(10)
therefore, we can derive new W and B for compensating the P-Bit array variation, denoting them as(11)
through the aforementioned deductions, we have obtained the rederived weight and bias matrices crucial for ensuring the precise Ising computation of the real P-Bit array. We define two compensation parameters: , . The rederived weight matrix and bias matrix can be written as(12)
To verify the correctness of the weight matrix compensation, we initially conduct experimental measurements of the Sigmoid curves for three P-Bits and utilize a behavioral model to fit these data, thereby obtaining distinct values of and . Then, we alter the weight matrix for computing the AND gate by integrating and . As shown in Figure 5c, with the weight matrix compensation method, the real P-Bit array can calculate accurate results in the AND gate, and the results are similar to that of ideal P-Bit array without variations, which proves the reliability and rightness of the weight compensation.
It is noteworthy that the compensation parameters and solely relate to P-Bit variations, rather than the weight matrix. This implies that even with changes in the solving problem, these compensation parameters remain unchanged. In essence, it eliminates the necessity to retrain the weight matrix. The prerequisite is merely obtaining the P-Bit variation parameters, and , owing to our consideration of these variations as inherent and immutable attributes of P-Bit devices. This demonstrates the transferability of the weight compensation method across diverse problem domains.
It has been certified that weight compensation enables the real P-Bit array to compute similarly to the ideal P-Bit array without calibrating Sigmoid curves one by one or retraining the weight matrix. However, the most challenging key point lies in how to efficiently extract the and from the large P-Bit array. Although calibration is not required, measuring and fitting the variations of P-Bit array are also complicated tasks. We seek to perform an expeditious and precise extraction of variations within an extensive P-Bit array.
3. Simultaneous Device Variation Extraction for Whole P-Bit Array
Boltzmann machine learning plays a significant role in uncovering and analyzing, in particular, complex data distributions. The Boltzmann machine can be mapped to the hardware system for the P-Bit, and it trains the weight matrix W for the given data distribution. The learning rule can be written as follows [36]:
(13)
Here, is the average correlation between the and P-Bits in the given data distribution, and is the correlation of the outputs sampled from P-Bit i and P-Bit j in Ising computing, is the regulation parameter, and is the learning rate. In the training process, the will be learned from , while the in the weight matrix corresponds to .Equation (13) can only train the required weight matrix for a specific problem. For another problem, it needs to retrain the other weight matrix, which requires more time consumption, and there is a lack of transferability and scalability. However, our objective is to extract the and of each P-Bit device in the real array just once, thinking of them as the inherent attributes of each P-Bit, then use the weight compensation method to solve diverse unconventional problems. Such a process is transferable without complicatedly training again.
3.1. Automatic Extraction Algorithm Based on Boltzmann Machine Learning
To extract the P-Bit variation by Boltzmann machine learning, we need to observe and analyze the impact of and on the data distributions in Ising computation. Let us treat first. As shown in Figure 8, the red Sigmoid curve represents the ideal P-Bit, the black curve is P-Bit 1, and the blue curve is P-Bit 2. When , the ideal Sigmoid curve rigidly shifts to the left; all points of this curve have also moved upward for the input voltage. Consequently, the increases from to . Conversely, when , the will decrease from to . Based on this observation, in instances where the is larger than during the computational process, it is recommended to diminish the in the learning, and vice versa.
As depicted in Figure 9, the of the ideal P-Bit is 1, the Sigmoid curve of P-Bit 1 with is stretched, and the Sigmoid curve of P-Bit 2 with is compressed. When , the interaction between P-Bit i and P-Bit k in Ising computing is positive, then the state of two P-Bits are the same direction. For P-Bit 1, , the absolute value of the decreased, which resulted in the decrease in from . For P-Bit 2, , the absolute value of the increased, which resulted in the increase in from . Building on this insight, in cases where the is larger than , it is advisable to reduce the in the Boltzmann machine learning. Furthermore, in cases where the is smaller than , should be increased in the Boltzmann machine learning. When , the above observation will be opposite.
Based on the analysis above, we propose the extraction algorithm for P-Bit array variation. As shown in Figure 10, firstly, the density matrix D is calculated as follows [36]:
(14)
where X is the dataset corresponding to the problem, and d is the order of the matrix. Density matrix D describes the average probability distribution of the correlation between P-Bits. For instance, the is actually the average correlation in Equation (13), which signifies that the desired ideal value of in Boltzmann machine learning is .From the perspective of a whole P-Bit array, the will affect the correlation between the P-Bit i and other P-Bits; therefore, the compensation parameter will be learned from the sum of average correlation , while the compensation parameter is learned from . After learning of the compensation parameter reaching convergence, the variation of each P-Bit device in the array can be extracted as , .
3.2. Inefficiency Extraction Based on Traditional Logic Gate
To verify the feasibility of the proposed extraction algorithm, we performed the extraction of the P-Bit array variations on the AND gate. Initially, we can calculate the density matrix D for the AND gate based on the truth table in Table 1 and Equation (14).
(15)
The density matrix D describes the ideal data distribution for the AND gate. In the learning of and , our aim is to align the sampled data distribution in Ising computing with the ideal distribution by updating the compensation parameters and . Finally, we can obtain the and .
Figure 11 displays the outcomes of our proposed extraction algorithm applied to an AND gate. We randomly choose three fabricated SOT P-Bits of different variations. As depicted in Figure 11a, corresponds to P-Bit i, and the dash line means the real value of P-Bit variation. The training trends of and are correct, and the extracted values of and are somewhat close to their respective real values. However, the extracted is inaccurate. The extracted results of P-Bit 1 and P-Bit 2 shown in Figure 11b are relatively aligned with the real value, while the inaccuracy of extracted has a detrimental impact on the , causing the extracted to slightly deviate from the real value. It can be explained that the is calculated as .
The proposed extraction algorithm has demonstrated effectiveness in the AND gate. However, it has not yet reached a high level of accuracy. The performance of extracted is subpar, which significantly influences the training process for . We shall carefully examine the reasons for the inaccurate training of in the AND gate.
In the training rule of displayed in Figure 10, is learned from all in the row of density matrix. Nevertheless, when in weight matrix comprises both positive and negative values, the training of will be affected. This occurs due to the contrasting training directions of and as analyzed in Figure 9.
Based on the preceding analysis, the weight matrix from the Ising Hamiltonian will notably influence the extraction of and . As a consequence, a suitable Ising Hamiltonian for the proposed extraction algorithm should be meticulously constructed. Firstly, the Hamiltonian we construct needs to be scalable for extending the variation extraction algorithm to larger P-Bit arrays. Secondly, the corresponding density matrix of this Hamiltonian should be sparse to facilitate simpler and more accurate extraction. From the density matrix of the AND gate, we observe reduced accuracy in extraction when the density matrix is uneven. Hence, finally, a homogeneous density matrix is essential, showcasing advantages in training convergence rate.
3.3. 3D Ferromagnetic Hamiltonian Model for Scalable Large P-Bit Array
The Ising model, renowned for its depiction of ferromagnetism, prompts a pertinent inquiry: can this ferromagnetic model be harnessed for variation extraction purposes? Drawing inspiration from this, we have formulated a succinct and scalable ferromagnetic model. Within this constructed regular ferromagnetic lattice, spins exclusively interact with their immediate neighboring spins, ensuring the overall sparsity of the system. Notably, the density matrix associated with our developed ferromagnetic model adheres to uniformity required in the aforementioned criteria.
The constructed ferromagnetic Hamiltonian model has three categories of dimension: 1D, 2D, and 3D. Figure 12 displays three connection types for the ferromagnetic model. All P-Bit states in the ferromagnetic model exhibit a uniform orientation, and each P-Bit simultaneously interacts with adjacent P-Bits. For example, the P-Bit in the 3D ferromagnetic model has six adjacent P-Bits. The coupling strength J between P-Bits is the same, which means that every P-bit is equivalent. The right side gives the weight matrix and density matrix for the ferromagnetic model. In the weight matrix, the total number of 1 in each row is , which corresponds to the number of adjacent P-Bits for each P-Bit. The 1 which situated on the last position of the anti-diagonal in the matrix signifies the presence of periodic boundary conditions. The bias B on the diagonal of matrix are all 0. For the calculated density matrix, the total number of 1 in each row is , since the density matrix has an order higher than the weight matrix, and the diagonal of density matrix is 1. The last column in matrix is 0, which infers that the ideal mean value of P-Bit state in the ferromagnetic Hamiltonian model is 0.
The three types of dimension for the ferromagnetic Hamiltonian model we have developed are meticulously designed to address the critical aspect centered on the variation extraction algorithm highlighted in the preceding analysis. However, ascertaining the dimensionality of the ferromagnetic model that is most compatible with the extraction algorithm requires empirical verification. To initiate this validation process, we measured the standard deviation of the variation for 600 fabricated P-Bit devices. The standard deviation of is approximately 0.3, and for , it is around 1 V. Utilizing this level of P-Bit variation, we proceeded to generate a substantial number of P-Bit devices with diverse variations in simulations. In these simulations, the follows a Gaussian distribution with a mean value of 1 and a standard deviation of 0.3, while that of the are 0 and 1 V.
As shown in Figure 13, based on these P-Bit devices, we constructed three kinds of P-Bit arrays including 1000 devices, respectively, utilizing 1D, 2D, and 3D ferromagnetic models (FM) to perform variation extraction. In Figure 13, different colors correspond to different P-Bits, where the solid lines demonstrate the evolving trends of extracted and with the number of iterations, while the dash lines represent their real values. For the 1D-FM, there is a moderate deviation between the extracted values of and the real values, while the extraction direction is accurate. The extracted values of are relatively close to the real values. In the case of 2D-FM, the training outcomes for the difficult are satisfactory, with the extracted variation being relatively consistent with the real value. Furthermore, the training result for is highly precise, nearly identical to the real value. With regard to 3D-FM, the learning results are further improved compared to 2D-FM. The extracted values of are almost identical to the real value, which is remarkable for the challenging extraction of . As for , the extracted values perfectly align with the real values.
We conducted a detailed analysis of the statistical standard deviation exhibited by the extraction algorithm across various dimensional ferromagnetic models. As illustrated in Table 2, the implementation of the extraction algorithm based on the ferromagnetic model notably mitigates device variation levels within extensive P-Bit arrays. Of particular significance is the 3D-FM model, which achieves a noteworthy reduction in the standard deviation of to an impressively low 0.032 and to an exceptionally minimal 0.034 V. The specific Sigmoid curves before and after extraction are further provided. As shown in Figure 14, the distribution of curves without extraction is quite broad, whereas the extraction based on 3D-FM can reduce the variation level to a sufficiently low range. As shown at the orange point in Figure 6, this variation can achieve accurate Ising computing. After the above comparison, the final results indicate that the extraction algorithm based on 3D-FM can most accurately extract and .
3.4. Discussions of the Automatic Extraction Algrithm
Upon scrutinizing the variation extraction outcomes performed on P-Bit array in the aforementioned models, a noteworthy observation emerges. Specifically, the extraction challenge associated with is markedly more formidable than that associated with . This discrepancy can be attributed to the inherent nature of their relationships with input voltage V, as delineated in Equation (2). It becomes evident that exhibits a multiplicative dependence on , while manifests an additive relationship with the same variable. Delving into the intricacies from the perspective of the weight matrix, the singularly impacts the corresponding to P-Bit i, whereas exerts influence across all the in the row of the weight matrix. This distinction in the impact scope between and underlies the observed difference in extraction difficulty. Given the direct influence of extraction on in the training process, achieving precise extraction becomes crucial.
However, within weight matrix of the AND gate, the inconsistency of results in a non-uniform influence of on weights. This factor collectively affects the training of . Challenges emerge due to the inherent extraction inaccuracies of and the non-scalable nature ingrained within the AND gate. Although addressing the accuracy of training within the AND gate is achievable, achieving seamless scalability presents challenges. Cascading logic gates amplify the complexity of the weight matrix, rendering the training process notably intricate.
Building upon the aforementioned analysis, we have devised the ferromagnetic model characterized by uniform W, B, and inherent scalability. Through the training of 1D, 2D, and 3D ferromagnetic models, a consistent improvement in training effectiveness is observed with the increasing dimensionality, particularly notable in the case of 3D-FM. In this scenario, the proposed extraction algorithm simultaneously and adeptly captures all variations in P-Bits.
The enhanced performance in higher-dimensional ferromagnetic models can be attributed to their increased stability. According to the Landau theorem, 1D ferromagnetic materials are significantly influenced by thermal fluctuations, thus making it challenging for them to exhibit ferromagnetism. In 2D ferromagnetic materials, the quasi-long-range order is lost above the critical temperature. In contrast, the 3D ferromagnetic materials possess higher degrees of freedom, experiencing less influence from entropy effects, thus exhibiting greater stability. In a more stable 3D ferromagnetic Hamiltonian model, the extraction of becomes notably clearer and more accurate.
It is worth noting that in a digital system involving both read and write processes, our method may not be more efficient than direct exponential fitting. However, in an analog system, the situation changes because convenient read and write operations are absent. Moreover, digital systems have two significant drawbacks compared to analog systems. (1) Power consumption: Digital systems using FPGA and DAC consume considerable power. (2) Speed: Digital systems require clocks for read and write operations, which means that the flipping speed of P-Bit devices is limited by the FPGA clock. The speed advantage of stochastic MTJ (ns) might be compromised by the read operations in digital systems (see Supplementary Materials in [40]). In contrast, an analog Ising system composed of resistors/capacitances networks [42,43] or tunable logic gates will have lower energy consumption as it does not require application-specific integrated circuits (ASICs) and high-precision DACs. Regarding speed, the advantage of ultra-fast stochastic MTJs [44] will be realized because the spins in an analog system can update asynchronously. The local minima in Ising computations will be mitigated by the speed differences between MTJs [29,30]. However, analog Ising systems face calibration challenges since the Sigmoid curve of P-Bits cannot be acquired directly. Therefore, our method is designed to be more suitable for analog Ising systems, where read and write operations are not feasible and direct exponential fitting cannot be performed. In such cases, our method helps extract the individual variations of each device and uses a weight compensation algorithm for correction, thereby achieving accurate results in Ising computing.
4. Transferable and Scalable Ising Computations on Large P-Bit Array with Variations
In addressing the challenge of device-to-device variations on P-Bit array within Ising computations, our proposed approach involves the utilization of both a weight matrix compensation method and an variation extraction algorithm. A crucial sequence for practical implementation becomes apparent: Extraction and Compensation. Specifically, for a substantial P-Bit array with variation, we initially employ the proposed extraction algorithm to extract the variations of each P-Bit, and store them for later use. Owing to the simplicity and scalability of the 3D ferromagnetic model employed in the extraction algorithm, the extraction based P-Bit array can be greatly expanded. Given the invariant Sigmoid curve of P-Bit devices, their and remain constant. In essence, after performing the extraction algorithm once, we can consider the variations as inherent attributes of the P-Bits. Subsequently, we can utilize these variations in conjunction with weight matrix compensation to tackle diverse combinatorial optimization problems. Due to the transferability inherent in the weight compensation method, repetitive training is unnecessary. By incorporating specific compensation parameters denoted as and into the weight matrix, the real P-Bits with variations can be treated as ideal P-Bits without variations for computational purposes. Hence, we can achieve transferable and scalable Ising computation using a large P-Bit array with variation. In the following, we will address the solutions to the two classical problems in probabilistic computation, TSP and semi-prime factorization. As the scale increases, the variation requirements for P-Bit devices also become more stringent. Therefore, validating our algorithm on these two large-scale problems further demonstrates its effectiveness and superiority.
4.1. Successful Solution for 16-City Traveling Salesman Problem
To validate the feasibility, transferability, and scalability of our proposed extraction algorithm and weight compensation method, we employed a large-scale P-Bit array to solve the traveling salesman problem (TSP). It is a challenging NP-complete problem that aims at finding the shortest path to visit N cities. The Ising Hamiltonian of TSP can be written as follows [1]:
(16)
where m is the visiting matrix, and means that, at the time, the selection is made to visit city v. The first constraint indicates that a city can be visited only once; the second constraint implies that each visit should involve only a single city. The third item concerns the calculation of the travel path length, with serving as the penalty coefficient, and is the distance between city u and city v. It is worth noting that this constraint should take into account the distance between the starting city and the first chosen city, as well as the distance between the last visited city and the starting city, given that the TSP requires returning to the starting point eventually.For a N-city TSP with given starting point, calculating the optimal tourist route requires P-Bits. As illustrated in Figure 15, we computed a TSP involving 16 cities. Based on the P-Bit array with variations generated in simulations (see Section 3.2), we constructed a 3D ferromagnetic Hamiltonian model, then extracted the variation of each P-Bit from the 225-bit array. By employing the weight compensation method, we could mitigate the device-to-device variations in the computation. With an annealing technique, we finally obtained the correct route.
In Figure 15a,c,e,g, the represents the inverse temperature. The red point means the starting city is (0, 0). In observing the Automatic Extraction and Compensation P-Bit array, it becomes apparent that, with the progression of annealing, the increment in gradually directs the salesman to visit each city while progressively reducing the total tour length. Eventually, as the approaches 5, the Ising system converges to find the shortest and compliant tour length. For comparison, Figure 15g illustrates the inability of the real array with variations to produce the correct outcome when is set to 5. This highlights the efficacy and precision of our proposed Automatic Extraction and Compensation algorithm. Figure 15b,d,f,h displays the respective visiting matrix m. Figure 16 shows the evolution of the total energy during the annealing process performed by three types of P-Bit array, real, ideal, and Automatic Extraction and Compensation, where the insert figure displays the trend of with the epochs. We note that the variations present in the real P-Bit array tend to stagnate the Ising system in a local minimum. Nonetheless, the ideal P-Bit array and the real P-Bit array with the Automatic Extraction and Compensation algorithm can facilitate the system’s annealing to the global minimum.
We further evaluated the average path energy of TSP-16 over 100 trials across three different P-Bit arrays. As shown in Table 3, the average path energy differs markedly before and after applying our automatic correction method. The average path energy from the P-Bit array using the Automatic Extraction and Compensation method can be reduced to nearly match that of the ideal array.
4.2. Successful Solution for 21-Bit Integer Factorization
To demonstrate the effectiveness of our proposed algorithm across multiple problems, we further performed a 21-bit semi-prime factorization calculation on a large P-Bit array with variations, successfully factoring 2,035,153. As shown in Figure 17, our proposed Automatic Extraction and weight compensation algorithm can reduce the variation of the P-Bit device array sufficiently to complete the challenging task of semi-prime factorization. This factorization was achieved using the invertible logic method [3,6], which clamps the output of the multiplier to perform semi-prime factorization. According to the invertible logic method, Figure 17a presents the coupling matrix for the factorization of a 21-bit semi-prime 2,035,153. Additionally, we employed the Parallel Tempering (PT) method [45,46], generating multiple replicas at different temperatures to search for the optimal state. As shown in Figure 17b, periodically, these replicas were swapped. These swap stages enable the whole Ising system to escape local optima, ensuring that the lowest energy state operates at the lowest temperature (highest ). The swap probability between two replicas is written as follows [47]:
(17)
where , are the energies at inverse temperature , . When the energy E is higher at a higher , a lower energy state will be exchanged at this . Figure 17c shows the energy trends of the replicas at different , where different colors represent the energy states at different values. The Ising system continuously searches and swaps low-energy states to the higher , ultimately ensuring that the lowest energy state is maintained at the highest . Figure 17d presents the values of factors A and B during the PT process. After approximately 35,000 iterations, the correct factorization was achieved, 2,035,153 .The effectiveness of our algorithm has been validated in both the challenging 21-bit semi-prime factorization and the TSP-16 problem performed on a large P-Bit array with variations. It has been demonstrated that the Automatic Extraction and Compensation algorithm can reduce the variations in the P-Bit array to a sufficiently low level to accurately perform complex computations.
The obtained results validate the efficacy of the proposed extraction algorithm and weight compensation method. The individual calibration needs of P-Bit devices are addressed through weight compensation. Due to the transferability inherent in this method, the weight matrix rederivation enables the treatment of real P-Bits as ideal ones for diverse problems, eliminating the necessity for weight matrix retraining. Furthermore, the extraction algorithm, coupled with 3D-FM, accurately extracts all variations within the P-Bit array, demonstrating scalability across array sizes. By clearly following an Extraction–Compensation sequence, a transferable (for any problem) and scalable (for any array size) Ising computation on the P-Bit array with variations is facilitated.
5. Conclusions
In this work, we initially proposed the behavioral model incorporating and which fits our fabricated SOT P-Bit devices well. Then we introduced the Automatic Extraction and Compensation algorithm. The Extraction step can extract the variations of all the P-Bit devices across the whole large P-Bit array simultaneously, while the Compensation step mitigates the device variations by modifying the weight matrix according to the extracted variations. Finally, we employed the developed Extraction–Compensation process to facilitate accurate solutions for both 16-city TSP spanning 225 bits and 21-bit semi-prime factorization, thereby confirming the efficiency, transferability, and scalability of our algorithm. The term ’efficiency’ indicates that individual calibration for P-Bits with variations is not required. The term ’transferability’ means that weight matrix retraining for diverse problems is unnecessary. ’Scalability’ refers to the capability of expanding the array scale. The Automatic Extraction and Compensation algorithm enables the accurate and efficient completion of Ising computations on large P-Bit arrays with variations, therefore improving the efficiency, accessibility, and widespread applicability of probabilistic computers.
Conceptualization, B.Z. and Y.L.; Methodology, B.Z.; Validation, B.Z., Y.L., T.G., J.Y. and Z.G.; Investigation, Y.L., T.G., J.Y. and Z.G.; Writing—original draft, B.Z.; Writing—review & editing, D.Z. and L.Z.; Project administration, D.Z. and L.Z.; Funding acquisition, D.Z. and L.Z. All authors have read and agreed to the published version of the manuscript.
The datasets generated during and/or analyzed during the current study are available from the corresponding author upon reasonable request.
The authors declare no conflicts of interest.
Footnotes
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
Figure 1. The general overview of fabricated the SOT P-Bit device. (a) The deposited structure of the stack. (b) The SEM image of the SOT MTJ. (c) The operation mechanism of the SOT P-Bit. The double-wall energy barrier can be modulated by SOT voltage.
Figure 2. The behavioral model can fit the experimentally measured output probability curve very well.
Figure 3. The experimental Sigmoid curves fitted by the modified behavioral model exhibit large intrinsic variation which includes [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.]. [Forumla omitted. See PDF.] describes stretching or compression in shape and [Forumla omitted. See PDF.] represents rigid shift.
Figure 4. The AND gate comprises three P-Bits, and the weight matrix for the AND gate is provided on the right.
Figure 5. Under three different P-Bit configurations, the computational outcomes of the AND gate are depicted, with red denoting the correct state and blue representing the wrong state. (a) The ideal P-Bit array successfully completes the computation, with an evenly distributed occurrence of the four correct states. (b) The real P-Bit array with variations fails to accurately complete the computation, displaying a significantly low occurrence of three correct states. (c) The real P-Bit array with weight compensation achieves accurate computation akin to the ideal P-Bit array.
Figure 6. The Monte Carlo analysis of [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] on the KLD for the AND gate. Each pair of different [Forumla omitted. See PDF.] has been sampled 1000 times, and their KLDs are averaged.
Figure 7. The process of weight matrix compensation. To achieve the same output probability, it is necessary to satisfy that [Forumla omitted. See PDF.]. After the derivation, the new weight and bias matrices incorporating compensation parameters [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.] enable the real P-Bits to compute like the ideal P-Bits.
Figure 8. The impact of [Forumla omitted. See PDF.] observed on the mean value of P-Bit in Ising computation.
Figure 9. The impact of [Forumla omitted. See PDF.] observed on the correlation between P-Bits in Ising computation.
Figure 10. Proposed extraction algorithm for P-Bit array variation with given Ising Hamiltonian. It is necessary to initially compute the density matrix D and continuously update the compensation parameters to attain the desired distribution in the sampled Ising computation, ultimately obtaining [Forumla omitted. See PDF.] and [Forumla omitted. See PDF.]. The learning rate [Forumla omitted. See PDF.] used in this algorithm is 0.0001, while the regulation parameter [Forumla omitted. See PDF.] is 0.05.
Figure 11. The learning results of presented extraction algorithm on the AND gate. The solid lines display the respective extracted value of P-Bits, while the dash lines represent the real value. (a) The extracted [Forumla omitted. See PDF.] for AND. (b) The extracted [Forumla omitted. See PDF.] for AND.
Figure 12. Constructed 1D, 2D, and 3D ferromagnetic Hamiltonian models for the extraction algorithm. Their uniform weight matrix and density matrix are provided on the right.
Figure 13. The learning results of the extraction algorithm employed on constructed 1D, 2D, and 3D ferromagnetic models. As the dimension of the ferromagnetic model increases, the accuracy of variation extraction gradually improves. The 3D-FM achieves flawless extraction, with the extracted values perfectly aligning with the real values.
Figure 14. The Sigmoid curves obtained from P-Bit arrays w/o correction and with extraction. According to Table 2, the [Forumla omitted. See PDF.] of the P-Bit array without extraction is ([Forumla omitted. See PDF.] V), while that of the P-Bit array with extraction based on 3D-FM is ([Forumla omitted. See PDF.] V).
Figure 15. The computing results of 16-city TSP solved by both the Automatic Extraction and Compensation P-Bit array and the real P-Bit array with variations. (a,c,e) The route of the 16-city TSP based on the Automatic Extraction and Compensation array from the annealing process at [Forumla omitted. See PDF.]. (g) The route of the real P-Bit array at [Forumla omitted. See PDF.]. As [Forumla omitted. See PDF.] gradually increased to a sufficient level, the Ising system ultimately computed the shortest and compliant tour length. For comparison, we observed that at [Forumla omitted. See PDF.], the real array failed to yield the correct results. (b,d,f,h) Their respective visiting matrix m.
Figure 16. The total energy of annealing process for 3 situations of the P-Bit array. The system of the real P-Bit array with variation is trapped in a local minimum, while the ideal P-Bit array and the real P-Bit array with the Automatic Extraction and Compensation algorithm can facilitate the systems annealing to the global minimum.
Figure 17. (a) The coupling matrix for the factorization of a 21-bit semi-prime. (b) The illustrator for Parallel Tempering. (c) The energy trends of the replicas at different [Forumla omitted. See PDF.]. (d) The trends of factors A and B with increasing iterations.
Truth table for AND gate.
A | B | | |
---|---|---|---|
0 | 0 | 0 | 0.25 |
0 | 1 | 0 | 0.25 |
1 | 0 | 0 | 0.25 |
1 | 1 | 1 | 0.25 |
On different dimensional ferromagnetic models, we measured the statistical standard deviation of the variation extraction algorithm. The ’w/o extraction’ signifies the initial deviation predetermined in our simulations. The statistical standard deviation of the extraction algorithm is computed as follows:
w/o Extraction | 1D-FM | 2D-FM | 3D-FM | |
---|---|---|---|---|
| 0.3 | 0.116 | 0.061 | 0.032 |
| 1 V | 0.143 V | 0.063 V | 0.034 V |
The average path energy of 100 trials for TSP16 across 3 P-Bit arrays. The average path energy from the Automatic Extraction and Compensation P-Bit array can be brought very close to that of the ideal array.
P-Bit Array | Ideal | Real | Automatic Extraction and Compensation |
---|---|---|---|
Average Path Energy | 17.837 | 34.080 | 18.080 |
References
1. Sutton, B.; Camsari, K.Y.; Behin-Aein, B.; Datta, S. Intrinsic optimization using stochastic nanomagnets. Sci. Rep.; 2017; 7, 44370. [DOI: https://dx.doi.org/10.1038/srep44370] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/28295053]
2. Borders, W.A.; Pervaiz, A.Z.; Fukami, S.; Camsari, K.Y.; Ohno, H.; Datta, S. Integer factorization using stochastic magnetic tunnel junctions. Nature; 2019; 573, pp. 390-393. [DOI: https://dx.doi.org/10.1038/s41586-019-1557-9] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/31534247]
3. Aadit, N.A.; Grimaldi, A.; Carpentieri, M.; Theogarajan, L.; Martinis, J.M.; Finocchio, G.; Camsari, K.Y. Massively parallel probabilistic computing with sparse Ising machines. Nat. Electron.; 2022; 5, pp. 460-468. [DOI: https://dx.doi.org/10.1038/s41928-022-00774-2]
4. Camsari, K.Y.; Salahuddin, S.; Datta, S. Implementing p-bits with embedded MTJ. IEEE Electron Device Lett.; 2017; 38, pp. 1767-1770. [DOI: https://dx.doi.org/10.1109/LED.2017.2768321]
5. Camsari, K.Y.; Sutton, B.M.; Datta, S. P-bits for probabilistic spin logic. Appl. Phys. Rev.; 2019; 6, 011305. [DOI: https://dx.doi.org/10.1063/1.5055860]
6. Camsari, K.Y.; Faria, R.; Sutton, B.M.; Datta, S. Stochastic p-bits for invertible logic. Phys. Rev. X; 2017; 7, 031014. [DOI: https://dx.doi.org/10.1103/PhysRevX.7.031014]
7. Yin, J.; Liu, Y.; Zhang, B.; Du, A.; Gao, T.; Ma, X.; Dong, Y.; Bai, Y.; Lu, S.; Zhuo, Y. et al. Scalable Ising computer based on ultra-fast field-free spin orbit torque stochastic device with extreme 1-bit quantization. Proceedings of the 2022 International Electron Devices Meeting (IEDM); San Francisco, CA, USA, 3–7 December 2022; IEEE: New York, NY, USA, 2022; pp. 36.1.1-36.1.4.
8. Wang, T.; Roychowdhury, J. Oscillator-based Ising machine. arXiv; 2017; arXiv: 1709.08102
9. Dutta, S.; Khanna, A.; Assoa, A.; Paik, H.; Schlom, D.G.; Toroczkai, Z.; Raychowdhury, A.; Datta, S. An Ising Hamiltonian solver based on coupled stochastic phase-transition nano-oscillators. Nat. Electron.; 2021; 4, pp. 502-512. [DOI: https://dx.doi.org/10.1038/s41928-021-00616-7]
10. Albertsson, D.I.; Zahedinejad, M.; Houshang, A.; Khymyn, R.; Åkerman, J.; Rusu, A. Ultrafast Ising Machines using spin torque nano-oscillators. Appl. Phys. Lett.; 2021; 118, 112404. [DOI: https://dx.doi.org/10.1063/5.0041575]
11. Houshang, A.; Zahedinejad, M.; Muralidhar, S.; Chȩciński, J.; Khymyn, R.; Rajabali, M.; Fulara, H.; Awad, A.A.; Dvornik, M.; Åkerman, J. Phase-binarized spin Hall nano-oscillator arrays: Towards spin Hall Ising machines. Phys. Rev. Appl.; 2022; 17, 014003. [DOI: https://dx.doi.org/10.1103/PhysRevApplied.17.014003]
12. McGoldrick, B.C.; Sun, J.Z.; Liu, L. Ising machine based on electrically coupled spin Hall nano-oscillators. Phys. Rev. Appl.; 2022; 17, 014006. [DOI: https://dx.doi.org/10.1103/PhysRevApplied.17.014006]
13. Zhang, B.; Lin, Z.; Liu, Y.; Wang, Y.; Zhang, D.; Gao, T.; Zeng, L. Compact Programmable True Random Number Generator Based on Spin Torque Nano-Oscillator. IEEE Trans. Nanotechnol.; 2022; 21, pp. 648-654. [DOI: https://dx.doi.org/10.1109/TNANO.2022.3216844]
14. Woo, K.S.; Kim, J.; Han, J.; Kim, W.; Jang, Y.H.; Hwang, C.S. Probabilistic computing using Cu0. 1Te0. 9/HfO2/Pt diffusive memristors. Nat. Commun.; 2022; 13, 5762. [DOI: https://dx.doi.org/10.1038/s41467-022-33455-x] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/36180426]
15. Rhee, H.; Kim, G.; Song, H.; Park, W.; Kim, D.H.; In, J.H.; Lee, Y.; Kim, K.M. Probabilistic computing with NbOx metal-insulator transition-based self-oscillatory pbit. Nat. Commun.; 2023; 14, 7199. [DOI: https://dx.doi.org/10.1038/s41467-023-43085-6] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/37938550]
16. Wang, Z.; Marandi, A.; Wen, K.; Byer, R.L.; Yamamoto, Y. Coherent Ising machine based on degenerate optical parametric oscillators. Phys. Rev. A; 2013; 88, 063853. [DOI: https://dx.doi.org/10.1103/PhysRevA.88.063853]
17. Marandi, A.; Wang, Z.; Takata, K.; Byer, R.L.; Yamamoto, Y. Network of time-multiplexed optical parametric oscillators as a coherent Ising machine. Nat. Photonics; 2014; 8, pp. 937-942. [DOI: https://dx.doi.org/10.1038/nphoton.2014.249]
18. Inagaki, T.; Haribara, Y.; Igarashi, K.; Sonobe, T.; Tamate, S.; Honjo, T.; Marandi, A.; McMahon, P.L.; Umeki, T.; Enbutsu, K. et al. A coherent Ising machine for 2000-node optimization problems. Science; 2016; 354, pp. 603-606. [DOI: https://dx.doi.org/10.1126/science.aah4243]
19. McMahon, P.L.; Marandi, A.; Haribara, Y.; Hamerly, R.; Langrock, C.; Tamate, S.; Inagaki, T.; Takesue, H.; Utsunomiya, S.; Aihara, K. et al. A fully programmable 100-spin coherent Ising machine with all-to-all connections. Science; 2016; 354, pp. 614-617. [DOI: https://dx.doi.org/10.1126/science.aah5178]
20. Honjo, T.; Sonobe, T.; Inaba, K.; Inagaki, T.; Ikuta, T.; Yamada, Y.; Kazama, T.; Enbutsu, K.; Umeki, T.; Kasahara, R. et al. 100,000-spin coherent Ising machine. Sci. Adv.; 2021; 7, eabh0952. [DOI: https://dx.doi.org/10.1126/sciadv.abh0952]
21. Takemoto, T.; Hayashi, M.; Yoshimura, C.; Yamaoka, M. A 2×30k-Spin Multi-Chip Scalable CMOS Annealing Processor Based on a Processing-in-Memory Approach for Solving Large-Scale Combinatorial Optimization Problems. IEEE J. Solid-State Circuits; 2019; 55, pp. 145-156. [DOI: https://dx.doi.org/10.1109/JSSC.2019.2949230]
22. Yamaoka, M.; Yoshimura, C.; Hayashi, M.; Okuyama, T.; Aoki, H.; Mizuno, H. 24.3 20k-spin Ising chip for combinational optimization problem with CMOS annealing. Proceedings of the 2015 IEEE International Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers; San Francisco, CA, USA, 22–26 February 2015; IEEE: New York, NY, USA, 2015; pp. 1-3.
23. Su, Y.; Kim, H.; Kim, B. CIM-spin: A scalable CMOS annealing processor with digital in-memory spin operators and register spins for combinatorial optimization problems. IEEE J. Solid-State Circuits; 2022; 57, pp. 2263-2273. [DOI: https://dx.doi.org/10.1109/JSSC.2021.3139901]
24. Su, Y.; Kim, T.T.H.; Kim, B. Flexspin: A scalable CMOS Ising machine with 256 flexible spin processing elements for solving complex combinatorial optimization problems. Proceedings of the 2022 IEEE International Solid-State Circuits Conference (ISSCC); San Francisco, CA, USA, 20–26 February 2022; IEEE: New York, NY, USA, 2022; Volume 65, pp. 1-3.
25. Bae, J.; Oh, W.; Koo, J.; Yu, C.; Kim, B. CTLE-Ising: A Continuous-Time Latch-Based Ising Machine Featuring One-Shot Fully Parallel Spin Updates and Equalization of Spin States. IEEE J. Solid-State Circuits; 2023; 59, pp. 173-183. [DOI: https://dx.doi.org/10.1109/JSSC.2023.3318586]
26. Shin, J.H.; Jeong, Y.J.; Zidan, M.A.; Wang, Q.; Lu, W.D. Hardware acceleration of simulated annealing of spin glass by RRAM crossbar array. Proceedings of the 2018 IEEE International Electron Devices Meeting (IEDM); San Francisco, CA, USA, 1–5 December 2018; IEEE: New York, NY, USA, 2018; pp. 3.3.1-3.3.4.
27. Abeed, M.A.; Bandyopadhyay, S. Low energy barrier nanomagnet design for binary stochastic neurons: Design challenges for real nanomagnets with fabrication defects. IEEE Magn. Lett.; 2019; 10, 4504405. [DOI: https://dx.doi.org/10.1109/LMAG.2019.2929484]
28. Rahman, R.; Bandyopadhyay, S. The Strong Sensitivity of the Characteristics of Binary Stochastic Neurons Employing Low Barrier Nanomagnets to Small Geometrical Variations. IEEE Trans. Nanotechnol.; 2023; 22, pp. 112-119. [DOI: https://dx.doi.org/10.1109/TNANO.2023.3244139]
29. Zhang, B.; Liu, Y.; Gao, T.; Zhang, D.; Zhao, W.; Zeng, L. Time Division Multiplexing Ising Computer Using Single Tunable True Random Number Generator Based on Spin Torque Nano-Oscillator. Proceedings of the 2021 IEEE International Electron Devices Meeting (IEDM); San Francisco, CA, USA, 11–16 December 2021; IEEE: New York, NY, USA, 2021; pp. 27.6.1-27.6.4.
30. Liu, Y.; Gao, T.; Zhang, B.; Wang, Y.; Zhang, D.; Zeng, L. Time-division multiplexing Ising computer using single stochastic magnetic tunneling junction. IEEE Trans. Electron Devices; 2022; 69, pp. 4700-4707. [DOI: https://dx.doi.org/10.1109/TED.2022.3184651]
31. Lv, Y.; Bloom, R.P.; Wang, J.P. Experimental demonstration of probabilistic spin logic by magnetic tunnel junctions. IEEE Magn. Lett.; 2019; 10, 4510905. [DOI: https://dx.doi.org/10.1109/LMAG.2019.2957258]
32. International Neural Network Society (INNS)IEEE Neural Network Council Cooperating Societies Osborn, T.R. Fast teaching of Boltzmann machines with local inhibition. Proceedings of the International Neural Network Conference, Palais Des Congres; Paris, France, 9–13 July 1990; Springer: Dordrecht, The Netherlnds, 1990; 785.
33. Salakhutdinov, R.; Hinton, G. Deep Boltzmann machines. Proceedings of the Artificial Intelligence and Statistics, PMLR; Clearwater Beach, FL, USA, 16–18 April 2009; pp. 448-455.
34. Tsai, C.H.; Yu, W.J.; Wong, W.H.; Lee, C.Y. A 41.3/26.7 pJ per neuron weight RBM processor supporting on-chip learning/inference for IoT applications. IEEE J. Solid-State Circuits; 2017; 52, pp. 2601-2612. [DOI: https://dx.doi.org/10.1109/JSSC.2017.2715171]
35. Patel, S.; Canoza, P.; Salahuddin, S. Logically synthesized and hardware-accelerated restricted Boltzmann machines for combinatorial optimization and integer factorization. Nat. Electron.; 2022; 5, pp. 92-101. [DOI: https://dx.doi.org/10.1038/s41928-022-00714-0]
36. Kaiser, J.; Borders, W.A.; Camsari, K.Y.; Fukami, S.; Ohno, H.; Datta, S. Hardware-aware in situ learning based on stochastic magnetic tunnel junctions. Phys. Rev. Appl.; 2022; 17, 014016. [DOI: https://dx.doi.org/10.1103/PhysRevApplied.17.014016]
37. Ackley, D.H.; Hinton, G.E.; Sejnowski, T.J. A learning algorithm for Boltzmann machines. Cogn. Sci.; 1985; 9, pp. 147-169.
38. Neal, R.M. Connectionist learning of belief networks. Artif. Intell.; 1992; 56, pp. 71-113. [DOI: https://dx.doi.org/10.1016/0004-3702(92)90065-6]
39. Faria, R.; Kaiser, J.; Camsari, K.Y.; Datta, S. Hardware design for autonomous bayesian networks. Front. Comput. Neurosci.; 2021; 15, 584797. [DOI: https://dx.doi.org/10.3389/fncom.2021.584797] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33762919]
40. Si, J.; Yang, S.; Cen, Y.; Chen, J.; Huang, Y.; Yao, Z.; Kim, D.J.; Cai, K.; Yoo, J.; Fong, X. et al. Energy-efficient superparamagnetic Ising machine and its application to traveling salesman problems. Nat. Commun.; 2024; 15, 3457. [DOI: https://dx.doi.org/10.1038/s41467-024-47818-z]
41. Kullback, S.; Leibler, R.A. On information and sufficiency. Ann. Math. Stat.; 1951; 22, pp. 79-86. [DOI: https://dx.doi.org/10.1214/aoms/1177729694]
42. Dutta, S.; Khanna, A.; Gomez, J.; Ni, K.; Toroczkai, Z.; Datta, S. Experimental demonstration of phase transition nano-oscillator based Ising machine. Proceedings of the 2019 IEEE International Electron Devices Meeting (IEDM); San Francisco, CA, USA, 7–11 December 2019; IEEE: New York, NY, USA, 2019; pp. 37-38.
43. Danouchi, K.; Soumah, L.; Bouchard, C.; Disdier, F.; Fassatoui, A.; Phan, N.T.; Ezzadeen, M.; Delaet, B.; Viala, B.; Prenat, G. et al. Designing networks of resistively-coupled stochastic Magnetic Tunnel Junctions for energy-based optimum search. Proceedings of the 2023 International Electron Devices Meeting (IEDM); San Francisco, CA, USA, 9–13 December 2023; IEEE: New York, NY, USA, 2023; pp. 1-4.
44. Hayakawa, K.; Kanai, S.; Funatsu, T.; Igarashi, J.; Jinnai, B.; Borders, W.; Ohno, H.; Fukami, S. Nanosecond random telegraph noise in in-plane magnetic tunnel junctions. Phys. Rev. Lett.; 2021; 126, 117202. [DOI: https://dx.doi.org/10.1103/PhysRevLett.126.117202] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/33798384]
45. Wang, C.; Hyman, J.D.; Percus, A.; Caflisch, R. Parallel tempering for the traveling salesman problem. Int. J. Mod. Phys. C; 2009; 20, pp. 539-556. [DOI: https://dx.doi.org/10.1142/S0129183109013893]
46. Aadit, N.A.; Grimaldi, A.; Carpentieri, M.; Theogarajan, L.; Finocchio, G.; Camsari, K.Y. Computing with invertible logic: Combinatorial optimization with probabilistic bits. Proceedings of the 2021 IEEE International Electron Devices Meeting (IEDM); San Francisco, CA, USA, 11–16 December 2021; IEEE: New York, NY, USA, 2021; pp. 40-43.
47. Aadit, N.A.; Mohseni, M.; Camsari, K.Y. Accelerating Adaptive Parallel Tempering with FPGA-based p-bits. Proceedings of the 2023 IEEE Symposium on VLSI Technology and Circuits (VLSI Technology and Circuits); Kyoto, Japan, 11–16 June 2023; IEEE: New York, NY, USA, 2023; pp. 1-2.
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
A Probabilistic Bit (P-Bit) device serves as the core hardware for implementing Ising computation. However, the severe intrinsic variations of stochastic P-Bit devices hinder the large-scale expansion of the P-Bit array, significantly limiting the practical usage of Ising computation. In this work, a behavioral model which attributes P-Bit variations to two parameters,
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details



1 National Key Laboratory of Spintronics, Hangzhou International Innovation Institute, Beihang University, Hangzhou 311115, China;
2 National Key Laboratory of Spintronics, Hangzhou International Innovation Institute, Beihang University, Hangzhou 311115, China;
3 School of Cyber Science and Technology, Beihang University, Beijing 100191, China;