Artificial intelligence (AI), powered by deep neural networks (DNNs), uses brain-inspired information-processing mechanisms to approach human-level performance in complex tasks [1], and has already found major applications ranging from language translation [2], image recognition [3] and cancer diagnosis [4] to fundamental science [5]. The vast majority of AI algorithms are implemented on digital electronic computing platforms, such as graphics- and tensor-processing units, to meet their heavy computational requirements. However, the computational performance that AI demands from processors has grown rapidly, far outpacing the advance of digital electronic computing, which is constrained by Moore's law and the upper limit of computing energy efficiency [6–8]. Building photonic neural network (PNN) systems that perform AI tasks with analogue photonic computing has therefore attracted increasing attention, and photonic computing is expected to become a next-generation AI computing modality owing to its low latency, high bandwidth and low power consumption. The fundamental characteristics of photons and the principles of light–matter interaction (for example, diffraction [9–11] and interference [12–14] in free-space optics or integrated photonic circuits) have been used to implement a variety of neuromorphic photonic computing architectures, such as convolutional neural networks [15–18], spiking neural networks [19–21], recurrent neural networks [22,23] and reservoir computing [24–26].
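To make the diffraction-based computing principle concrete, the following is a minimal sketch, not taken from this paper, of how a single diffractive layer computes: a complex input field is modulated by a trainable phase mask and then propagated in free space, here via the standard angular spectrum method. The wavelength, pixel pitch and propagation distance are illustrative assumptions only.

```python
# Sketch of one diffractive PNN layer (illustrative parameters, not from the paper).
import numpy as np

def angular_spectrum_propagate(field, wavelength, pitch, distance):
    """Propagate a complex optical field over `distance` (angular spectrum method)."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=pitch)               # spatial frequencies (1/m)
    FX, FY = np.meshgrid(fx, fx)
    k_sq = (1.0 / wavelength) ** 2 - FX**2 - FY**2
    propagating = k_sq > 0                        # discard evanescent components
    kz = 2 * np.pi * np.sqrt(np.where(propagating, k_sq, 0.0))
    H = np.where(propagating, np.exp(1j * kz * distance), 0.0)  # transfer function
    return np.fft.ifft2(np.fft.fft2(field) * H)

def diffractive_layer(field, phase_mask, wavelength, pitch, distance):
    """One layer: phase modulation followed by free-space diffraction."""
    return angular_spectrum_propagate(field * np.exp(1j * phase_mask),
                                      wavelength, pitch, distance)

# Example: propagate a plane wave through one layer with a random phase mask.
rng = np.random.default_rng(0)
field_in = np.ones((64, 64), dtype=complex)
phase_mask = rng.uniform(0.0, 2 * np.pi, size=(64, 64))
field_out = diffractive_layer(field_in, phase_mask,
                              wavelength=532e-9, pitch=8e-6, distance=0.05)
intensity = np.abs(field_out) ** 2               # what a photodetector records
```

In a diffractive PNN, the phase masks play the role of trainable weights; stacking several such layers and reading out intensities yields a network whose "multiply-accumulate" operations are performed by light propagation itself.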
An effective training approach is one of the most critical factors in enabling DNNs to learn a model and guarantee high inference accuracy. DNNs constructed in software on digital electronic computers are generally trained with the backpropagation algorithm [27]. This training mechanism forms the basis of in silico training of photonic DNNs, which builds PNN models in a computer to simulate the physical system, trains the models through backpropagation and then deploys the trained parameters to the physical system. However, the inherent systematic errors of analogue computing, arising from different sources (for example, geometric and fabrication errors), cause a deviation between the in silico-trained PNN model and the physical system, resulting in performance degradation on direct deployment [11,28,29]. To address these systematic errors, in situ training approaches, which train PNNs on the physical system using experimental measurements, have drawn increasing attention for optimizing PNN models in practical applications [11,29–34]. Nevertheless, existing in situ training methods still face great challenges in training large-scale PNNs with major systematic errors, hindering the construction...
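As a rough illustration of the deployment problem described above, the sketch below (a toy model under stated assumptions, not the authors' method) trains a single idealized layer in silico and then emulates direct deployment by perturbing the trained weights with a fixed multiplicative error; classification accuracy degrades as the systematic error grows.

```python
# Toy illustration of the sim-to-real gap in in silico training.
# The "physical system" is emulated by perturbing the trained weights.
import numpy as np

rng = np.random.default_rng(0)

# Toy dataset: 16-dimensional inputs in 4 linearly separable classes.
n, d, c = 1024, 16, 4
X = rng.normal(size=(n, d))
y = np.argmax(X @ rng.normal(size=(d, c)), axis=1)
T = np.eye(c)[y]                                  # one-hot targets

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def accuracy(W):
    return float(np.mean(np.argmax(X @ W, axis=1) == y))

# In silico training: backpropagation through an idealized model of the layer.
W = np.zeros((d, c))
for _ in range(500):
    P = softmax(X @ W)                            # predicted probabilities
    W -= 0.5 * X.T @ (P - T) / n                  # cross-entropy gradient step

print(f"in silico accuracy: {accuracy(W):.3f}")

# Direct deployment: the physical layer realizes W only approximately.
for err in (0.05, 0.2, 0.5):
    W_phys = W * (1.0 + rng.normal(scale=err, size=W.shape))
    print(f"deployed with {err:.0%} systematic error: {accuracy(W_phys):.3f}")
```

In practice the perturbation is not known in advance, which is the motivation for in situ training: measuring the physical system's actual outputs and updating the parameters accordingly can recover the accuracy lost to such errors.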