Intelligent Shape Decoding of a Soft Optical

Introduction

The study of light transmission underpins the fundamental development of many optical devices and photonics technologies.^[^1,2^] In geometrical optics, light transmission can be approximated as the propagation of rays,^[³^] an approximation that has been utilized in waveguide-based sensing.^[^4–8^] In a flexible medium of homogeneous refractive index with cladding, light rays are guided along the medium by total internal reflection. However, external mechanical stimuli can change the transmitted light intensity and cause refraction losses. In waveguide sensors, these variations can be detected and converted into useful sensing feedback in the form of strain sensing or morphology sensing. The application of waveguide sensors spans many fields (e.g., wearable devices,^[⁹^] robotics,^[^10–12^] surgical manipulators^[^13–15^]) and shows potential as an alternative to conventional microelectromechanical systems (MEMS). This is particularly true when the primary design goal of such “soft sensors” or “electronic skins” is the reconstruction of the sensor's deformation or morphology in multiple dimensions, often requiring high flexibility and some degree of stretchability.

Researchers have previously developed approaches to soft sensing utilizing MEMS technology, ranging from tactile piezoresistive sensor arrays arranged on flexible circuit boards,^[^16,17^] to localized pressure sensing with polydimethylsiloxane (PDMS) encapsulation.^[¹⁸^] For the purpose of shape sensing, examples include the work by Hermanis et al.,^[¹⁹^] where discrete accelerometer modules are embedded in a flexible, but inextensible fabric sheet for reconstructing its overall morphology. However, a challenge in implementing many MEMS units for dense sensing is that discrete sensor modules have inherent rigidity, meaning that achieving high density and precision (e.g., through arraying) often comes at the expense of an increasingly rigid structure and encumbering wire routing.

Taking a different approach, highly deformable sensors have been developed using alternative sensing concepts such as liquid metals,^[^20,21^] and carbon-embedded substrates fabricated into customized electronic circuits and components such as touch-sensitive capacitive elements.^[^22,23^] This approach has been used for measuring the curvature of flexible structures like soft robotic fingers by combining the output of multiple 1D strain sensors.^[²⁴^] However, whether using MEMS or flexible electronics, few studies aim at morphology reconstruction. The core focus is instead placed on 1D signal feedback, which can suffer from scalability and wiring challenges, rather than on effective multiplexing or approaches for converting sparse measurements into dense sensor output.

Micrometer-sized fiber Bragg gratings (FBGs) are an example of how discrete, 1D strain sensing can be leveraged to reconstruct higher dimensions, such as curvature and 3D morphology sensing. Fabricated directly into the optical fiber, FBGs can sense axial strain based on the wavelength shift of the reflected light, and achieve shape reconstruction by standalone configurations (e.g., multicore fiber^[^14,25,26^]) or by embedment in soft substrates in a variety of curvilinear routing layouts.^[^27–29^] The latter approach has generated interest in soft robotics research by directly integrating optical fibers into the robot structure to model and reconstruct its behavior.^[^30–33^] Despite the advantages of FBG fibers, including exceptional thinness (<300 μm), electromagnetic (EM) immunity, and multiplexability,^[³⁴^] they still face challenges due to their relative rigidity, which can limit the ultimate flexibility of the sensor. Additionally, FBG fibers entail high costs and bulky measurement equipment (i.e., optical interrogators).^[^35,36^]

Alternatively, others have leveraged simple optoelectronic components, i.e., light-emitting diodes (LEDs) and various types of photodetectors (PDs), with flexible waveguide materials such as PDMS to measure the light intensity variation caused by deformation. Some purposely induce light transmission losses during deformation as a result of microcracks in the reflective surface coating along a PDMS waveguide,^[⁶^] and others have used the deformation-dependent feature of light loss in polymethylmethacrylate (PMMA).^[¹²^] In the prior art, LED and PD pairs are typically placed at either end of a thin waveguide, essentially providing 1D measurement per waveguide. An exception is the work done by Bai et al.,^[⁷^] who used a colored dye along the waveguide to delineate the portion that was deformed by bending or compression. The use of transmission loss takes advantage of geometrical optics, and heavily reduces formulation and modeling complexity often found in the light modulation approach.^[^37,38^] Generally, simplified PD–LED-based sensors serve as an interesting proposition with low fabrication costs, ease of scaling, and potential for unique waveguide and component placement.

Regardless of the sensing approach, it remains a challenge to combine multiple low-level sensors to predict high-order morphology changes, particularly for soft media, which theoretically possess infinite degrees of freedom.^[^39,40^] The substantial complexity in computing finite sensory information for high-level state estimation requires novel hardware designs and modeling methods. The data density required to represent the ground truth of deformed 3D curvatures is generally unmet by current motion capture technology (e.g., optical motion capture, EM sensing). One possible approach is to use computational analysis data to greatly reduce the required density of sensing elements. Given a valid geometrical design and consistent material properties, computational mechanics could produce an unlimited number of virtual configurations. Strains and displacements in these virtual configurations could be employed as a noise-free dataset.^[⁴¹^] This data enrichment method can provide significant benefits in cases where only limited ground truth data are available to estimate a complex surface. When provided with a comprehensive and consistent set of computational outputs, data-driven mapping between sensory data (e.g., resistance/refracted wavelength) and mechanical stimuli (e.g., pressure/shape change) can be modeled. Data-driven modeling has shown convincing performance in both classification (e.g., the spatial accuracy of pressure) and regression tasks (e.g., pressure magnitude estimation).^[⁴²^] A specialized neural network architecture for high-order sensing outputs, however, requires much more research effort to explore and verify.

In our previous work, several multilayer perceptron (MLP) models were ensembled to predict the displacement of markers on a flat silicone sensor from strain measurements given by FBGs. An overall prediction RMSE of 2.28 mm was attained.^[²⁸^] However, the prediction was occasionally observed to deviate significantly from the ground truth for a moment, which may be caused by overlooking the data's temporal characteristics. Recurrent neural networks (RNNs), such as long short-term memory (LSTM), are also popular in soft sensing. For example, a single-layer LSTM combined with an MLP was used to predict the magnitude of contact force in a soft finger with an average error of 0.05 ± 0.06 N.^[⁴³^] However, a notable delay was present, which may be caused by the high computational cost of LSTM. Convolutional neural networks (CNNs) are utilized for sparsely distributed sensors; e.g., a CNN layer was used to classify the type of stimuli exerted on robotic skin with an accuracy of 98.7%.^[⁴⁴^] As the criteria for selecting such learning-based methods were not explained in previous research, it is challenging to determine the appropriate framework for newly developed soft sensors.

Our previous effort in shape sensing highlights how finite element analysis (FEA) produces displacement predictions for an A4-sized soft skin,^[²⁸^] which are then used as training data for multiple neural networks. Ensemble learning that takes account of local deformation then realizes 3D skin shape reconstruction. In this study, we propose a general framework for real-time flexible surface shape sensing, validating it on a soft and self-contained optical waveguide sensor using sparsely placed PDs and LEDs. Unlike our previous work, we incorporate temporal data characteristics to minimize jittering and inaccuracies during real-time sensing. The framework takes advantage of computational multiphysics FEA to assist sensor parameter design, as well as sparse data enrichment. An autoregressive-based learning model is introduced to target the spatial time series captured by the sensor. We also propose a novel sensor design optimization process for data quality enhancement, and a model architecture design that depends on the sensing data characteristics. The primary contributions of this study can be summarized as follows: 1) To form a self-contained “skin” capable of untethered sensing of large-scale shape changes at high frequency using a flexible waveguide with simple optical devices, namely, LEDs and PDs, where multiphysics FEA is adopted to optimize the sensor layout. 2) To develop an autoregression (AR)-based learning framework for decoding RGB light signals into deformation patterns accurately, which can be a general approach for spatial and temporal sensing data across different sensing modalities. 3) To validate the proposed shape decoding framework in a waveguide sensor prototype in terms of sensing accuracy and repeatability in high-order complex deformations, alongside usage underwater.

Results and Discussion

The workflow of the proposed shape decoding framework is described in Figure 1, consisting of four main sequential steps, i.e., light transmission data analysis, sensor fabrication, data enrichment by mechanical FEA, and autoregressive deep neural network (DNN)-based decoding. To demonstrate the working principle of the proposed waveguide sensor, an A5-sized soft skin (148 × 210 × 4 mm) embedded with three pairs of LEDs and PDs was developed. Before the fabrication of the real prototype, the locations of the sensing elements were determined by multiphysics FEA, which investigated the influence of the PD/LED distribution on the light signal transmission inside the skin. The training data for the mapping between the light signal and the skin shape were obtained from a fish-shaped prototype underwater, where five EM tracking markers were used to capture the discrete node coordinates. Prior to model training, these sparsely distributed node coordinates were enriched by mechanical predictions and then analyzed by autocorrelation functions (ACFs). The developed AR-based model could reconstruct the skin deformation continuously. The repeatability test confirmed the high quality of the collected data.

Figure 1. Workflow of the proposed waveguide shape decoding framework. The skin deformation changes the light transmission within the waveguide body, leading to the mapping between the skin shape and the light signal. Distributions of LEDs and PDs are optimized by multiphysics FEA. Then, the collected sparse shape data are enriched by mechanical FEA. Finally, an autoregressive DNN is developed to reconstruct the skin shape for any given light signals.

Light Transmission Response

The number of sensing elements (i.e., LEDs and PDs) in the proposed shape decoder is limited, and their locations affect the light intensity data. To create the deformation-light mapping using a data-driven method, the data pairs must be one-to-one. Additionally, the data dimension must be adequate to support the recognition of high-order skin deformations. Satisfying these data quality requirements calls for optimizing the LED and PD locations. Before optimization, we first established a multiphysics (ray optics cum mechanics) analysis model to investigate the influence of sensing unit distribution on light intensity. In the simulation, three LEDs and a PD were placed at the clamped and free ends, respectively, of an A5-sized rectangular waveguide, which was flexed in the portrait and landscape modes (Figure 2A). The light from the LEDs experiences intensity loss due to reflection, refraction, and transmission before being captured by the PD, which is mounted at the midlength of the free side of the skin. As shown in Figure 2B, the light signal responses for all three wavelengths are almost symmetric about the horizontal configuration of the skin during the downward (①→②) and upward (②→③) bending, which hinders motion modeling. The same problem appears in the landscape mode, in which the free-hanging length is shorter (Figure 2C). During small deflection (free-side deflection within ±20 mm, Figure 2C), the red and blue light intensities are nearly zero because the red and blue LEDs were placed close to the longitudinal sides and most of their light rays were absorbed or refracted. Spiky noise is also evident. In sum, symmetry, zero light intensity variation, and noise arise when the arrangement of LED and PD pairs is not optimized.

Figure 2. Multiphysics FEA of an A5-sized waveguide without design optimization. A) Simple bending with a clamped side along two orientations (i.e., portrait and landscape orientation). The skin was released from the lower bent position (①, time = 0), passing the flat state (②, time = 0.5 s), to the higher position (③, time = 1 s) symmetrically. B) Multiphysics predictions of normalized light intensities at the mid-length of the free side during the period of portrait flipping motion. The schematic of RGB light transmission inside the skin at the flat state (time = 0.5 s) is depicted. C) Problems in RGB light intensities for different free-side deflections in both flipping modes (highlighted in pale yellow). When the free-side deflection is ±60 mm, the red light intensities are 0.409 and 0.406, respectively, which are almost symmetric about the zero free-side deflection. When the free-side deflection is within ±20 mm, the blue light intensity remains nearly unchanged (difference < 0.01). The raw light intensity was normalized to zero mean and unit variance.

As low-quality data are unfavorable to sensing resolution and data-driven modeling, the multiphysics FEA model was used to optimize the offset angle α and distance D between the LEDs and the PD, which are the dominant factors in the captured light intensity (see Figure 3). The light intensities predicted at 46 values of α ranging from 0 to 90° and four values of D ranging from 30 to 150 mm are compared in Figure 3A. For a specified α, the light intensity drops exponentially with the distance. Most primary rays undergo free scattering and cannot reach the PD through a specific pathway, as they would in a conventional optical fiber. Moreover, a longer distance brings increased light loss. Under a fixed distance, the angle of 90° is optimal as the PD can only receive light from the front side. Figure 3B shows the light transmission predictions inside a skin sensor with the optimized setting, in which α and D of the red, green, and blue LEDs with respect to the PD are (45°, 45 mm), (90°, 60 mm), and (150°, 60 mm), respectively. When corner I at the left-hand side of the skin sensor is lifted (Figure 3B I), the light intensity varies continuously without plateau phases, and thus each shape corresponds to one set of light intensities. When corner II at the right-hand side is lifted (Figure 3B II), zero light intensity variation, i.e., a plateau phase, exists in all colors. These results imply the need for optimizing the locations of the PDs/LEDs, or the number of PD/LED pairs, in order to increase the data dimension. To demonstrate the optimized sensing effect, we prototyped a five-layered PDMS silicone sensor (148 × 210 × 4 mm) with three pairs of PDs/LEDs (Figure 3C), with their locations set to the optimized ones based on the multiphysics prediction. It can be observed that the light intensity is correlated with the deformation mode. For instance, when the sensor was lifted at its top left corner (first column in Figure 3C), the signal of the first (left) PD varied markedly while that of the third (right) one remained nearly unchanged. This deformation mode mostly affected the light transmission in the left-hand and middle regions while only having a slight effect on the light path in the right-hand region. When the sensor was lifted at the other three corners (other columns in Figure 3C), the signal and motion were also consistent. With this optimized PD/LED setting, the discrete deformation pattern can be roughly recognized from the light intensity data. For detailed continuous shape reconstruction, however, a model capable of mapping light intensity to sensor shape is still required.
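The size of the parameter sweep described above can be made concrete with a short sketch. It enumerates 46 offset angles between 0 and 90° (which implies a 2° step) and four distances between 30 and 150 mm, and illustrates what "drops exponentially with the distance" means using a toy decay law. The even spacing of the distances, the function `toy_intensity`, and the attenuation coefficient `K_PER_MM` are assumptions made for illustration only; they do not reproduce the multiphysics FEA model.

```python
import math

# 46 offset angles from 0 to 90 degrees imply a 2-degree step (45 intervals).
angles_deg = [2 * i for i in range(46)]

# Four distances from 30 to 150 mm; even spacing is an assumption of this sketch.
distances_mm = [30 + 40 * i for i in range(4)]

# Hypothetical attenuation coefficient (1/mm), chosen only for illustration.
K_PER_MM = 0.02


def toy_intensity(distance_mm, i0=1.0, k=K_PER_MM):
    """Toy exponential decay of received intensity with LED-PD distance."""
    return i0 * math.exp(-k * distance_mm)


# Every (alpha, D) combination evaluated in the sweep.
grid = [(a, d) for a in angles_deg for d in distances_mm]
print(len(grid))  # 184

# For an exponential law, equal distance steps give a constant intensity ratio.
ratios = [
    toy_intensity(distances_mm[i + 1]) / toy_intensity(distances_mm[i])
    for i in range(len(distances_mm) - 1)
]
print(ratios)
```

A constant ratio between equally spaced distances is the signature of exponential decay, which is one simple way to check a simulated or measured intensity curve against this trend.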

Figure 3. Design optimization of LED and PD placement, and validation in an A5-sized sensor (148 × 210 × 4 mm). A) Light intensity predictions for different absolute distance D and the offset angle α between the LED and PD, assuming the PD captures the light ray only from Lambertian light distribution. The offset angle α is regarded as 0 and 90° when the front face of PD and the side face of LED are, respectively, parallel and perpendicular. B) Light intensity when two opposite corners (I and II) of the skin embedded with three LEDs (red, green, and blue) and a PD are lifted. C) LED and PD locations and the four deformation patterns (first row), measured light intensities of A5-sized waveguide sensor with three pairs of LED and PD under four deformation patterns, which are respectively fixed in left (second row), middle (third row), and right (fourth row) regions.

Repeatability and Hysteresis Test

To further verify that the proposed waveguide sensing method is sufficiently robust for subsequent data-driven modeling and potential task-based application, a repeatability test of 1000 deformation cycles was conducted. A fish-shaped prototype was developed (Figure S1, Supporting Information). With one side clamped, it was undulated underwater by external hydrodynamic force (top view initial state ① shown in Figure 4A). The undulating motion was cyclic, flexing leftward twice, and flexing rightward twice (see ②, ③, ④, and ⑤ in Figure 4A).

Figure 4. Repeatability and hysteresis analyses of the fish-shaped sensor in 1000 cycles of asymmetric undulating motion underwater. A) Top view of five sensor morphologies. The sensor was deformed in a cyclic mode (in the sequence ②, ③, ④, ⑤; ① is the initial state). B) RGB light intensity variations of the second pair of LED/PD in the first cycle of motion captured at 150 Hz. C) Close-up view of (B) at the initial undeformed state (0–0.2 s). The signal noise is below 0.05%. D) Hysteresis plot of the red light intensity captured by three PDs along with the maximum nodal deflection of the skin respectively, where the green shaded region refers to the 95% confidence interval.

Initially, the fish-shaped waveguide sensor was kept in the neutral position until 0.2 s (Figure 4B), and the three channels of light intensity remained steady with fluctuation below 0.05% (Figure 4C). The red light intensities received by the three PDs in the 1st and 1000th cycles were analyzed as shown in Figure 4D. Plotted against the nodal deflection, the light intensity changes in these two cycles are nearly the same, with a maximum difference below 0.31%. These results imply that the sensing data are stable with small noise, and reliable even after 1000 repeated motion cycles. They also reveal that the rigidity of the tiny LEDs and PDs does not hinder the flexibility of the soft sensor, especially in the case of high-order morphology changes involving bending, twisting, and stretching. Data communication in an underwater environment (with negligible water pressure) is also stable due to the excellent water repellence of silicone PDMS.
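A cycle-to-cycle comparison of this kind can be sketched as follows. The source does not state how the percentage difference is normalized, so this sketch assumes that the two cycles are sampled at the same deflection values and that the largest sample-wise gap is expressed relative to the full signal range of the first cycle. The function name and the sample values are hypothetical and are not measured data.

```python
def max_relative_difference(cycle_a, cycle_b):
    """Largest sample-wise gap between two cycles, in percent of cycle_a's range.

    Assumes both cycles are sampled at the same deflection values.
    """
    if len(cycle_a) != len(cycle_b):
        raise ValueError("cycles must contain the same number of samples")
    span = max(cycle_a) - min(cycle_a)
    if span == 0:
        raise ValueError("reference cycle has no signal variation")
    largest_gap = max(abs(a - b) for a, b in zip(cycle_a, cycle_b))
    return 100.0 * largest_gap / span


# Synthetic intensities for an early and a late cycle (illustrative values only).
first_cycle = [0.0, 0.5, 1.0, 0.5, 0.0]
last_cycle = [0.0, 0.502, 0.999, 0.5, 0.001]

difference_percent = max_relative_difference(first_cycle, last_cycle)
print(round(difference_percent, 3))
```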

Model Architecture

Reconstructing the soft skin shape through multiphysics FEA requires long computation times and therefore cannot meet the high-frequency response and high-accuracy sensing required of sensors in practical use. We therefore propose to exploit deep learning to create an end-to-end mapping between the light signals and the skin configuration, which can be represented by the 3D coordinates of spatial nodes on the skin surface.

Before training various deep learning models, we analyzed the sensing data to observe their spatial and temporal characteristics and to select appropriate learning models. Considering that the skin deformation exhibits spatial locality, we grouped the nodes by location as shown in Figure 5A, and inspected the average transverse deflections of these three groups during the bending deformation. The nodal coordinates follow a clear trend as the skin bends (Figure 5B). When the top/bottom right (left) corner was deformed, the coordinates of the nodes in the right (left) group changed sharply, whereas the coordinate variation of the nodes in the middle group was relatively mild because the deflection at the skin corner hardly affects them. The deflection data exhibit spatial locality, and therefore techniques targeting spatial data, such as convolution operators and patch-wise processing, can be considered in the deep learning model. As the flexing motion is smooth rather than impulsive, it can be assumed that the data are also time-sequential, meaning that the historical signal holds influence over a period of time. We evaluated the ACF of the light signal and nodal coordinates using lag k ranging from 0 to 10; the ACF describes the degree of similarity between a time series and its lagged version.^[⁴⁵^] As displayed in Figure 5C, most of the autocorrelation values exceed the error band. In other words, the data have a significant autocorrelation. For such a time series, we consider exploiting the AR model to extract the time-sequential feature of the data, and evaluate complex motions such as the combination of flexing and twisting.
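The autocorrelation check described above can be illustrated with a minimal sketch. It computes the sample ACF of one channel for lags 0 to 10 and compares each value with an error band. The source does not specify how its error band was obtained, so the approximate 95% white-noise band ±1.96/√N used here is an assumption, and the sinusoidal test signal merely stands in for a smooth flexing motion.

```python
import math


def autocorrelation(series, max_lag=10):
    """Sample autocorrelation of a 1D series for lags 0..max_lag."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series has zero variance")
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / denom
        for k in range(max_lag + 1)
    ]


def white_noise_band(n, z=1.96):
    """Approximate 95% band for the ACF of white noise (assumed error band)."""
    return z / math.sqrt(n)


# Synthetic smooth signal standing in for one light-intensity channel.
signal = [math.sin(2 * math.pi * t / 100) for t in range(300)]

acf = autocorrelation(signal)
band = white_noise_band(len(signal))

# Lags whose autocorrelation magnitude exceeds the band are treated as significant.
significant_lags = [k for k, r in enumerate(acf) if abs(r) > band]
print(significant_lags)
```

When most lags are significant, as for this smooth signal, past samples carry information about the current one, which is the property an AR model exploits.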

Figure 5. Data analysis on the light intensity and nodal deflection collected on the A5-sized waveguide sensor. A) Nodes on the skin surface are clustered into left, middle, and right groups. B) Average transverse deflection of the three groups and corresponding skin deformation patterns during a series of bending motions. C) Autocorrelations of light intensity and grouped nodal displacement with lags ranging from 0 to 10. The blue bar represents the maximum ACF value in all channels under a specified lag, short colored horizontal lines are ACF of data channels (i.e., nine light intensity channels and three node displacement channels), and the green shaded regions are corresponding error bands.

When constructing the deep learning model to map the light signal to the skin shape, two criteria should be considered: data characteristics and computational efficiency. Since the skin motion is continuous in time and local in space, the sensing data are both time series and spatial. Given the requirement for a high update frequency in real-time sensing applications, a trade-off must be made between the computational cost and the accuracy of the model. Taking these two criteria into account, we developed a patch-wise AR model as shown in Figure 6. The input is divided into two modules: the light intensity at the current time-step t and the nodal displacement at the previous time-step t−1. The output is the nodal displacement at the current time-step t. The model recursively generates and receives the nodal displacement, both in training and when forecasting in tests. To enhance the continuity of prediction, a time window was defined for temporal data sampling, so that both the input and output are series of frames within a fixed time range. To handle the spatial variation of deformation, the nodal displacements are clustered into several groups according to nodal location. The mapping from the light signal to the nodal coordinates of each group is processed by an independent MLP.
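The data flow of such a patch-wise autoregressive rollout can be sketched as below. This is only an illustration of the recursion, not the authors' implementation: the weights are random and untrained, the time window is collapsed to a single frame for brevity, and the node counts per group, the hidden width, and all function names are hypothetical. The nine light channels and the head, body, and tail grouping follow the figure captions.

```python
import math
import random

random.seed(0)

N_LIGHT = 9                                  # nine light-intensity channels
GROUPS = {"head": 2, "body": 2, "tail": 2}   # hypothetical node counts per patch
HIDDEN = 8                                   # hypothetical hidden-layer width


def init_mlp(n_in, n_hidden, n_out):
    """Random, untrained weights; a real model would be fitted to data."""
    def rand_matrix(rows, cols):
        return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]
    return {"w1": rand_matrix(n_hidden, n_in), "w2": rand_matrix(n_out, n_hidden)}


def mlp_forward(params, x):
    """One hidden tanh layer followed by a linear output layer."""
    hidden = [math.tanh(sum(w * v for w, v in zip(row, x))) for row in params["w1"]]
    return [sum(w * h for w, h in zip(row, hidden)) for row in params["w2"]]


# One independent MLP per patch.
# Input: light at step t plus that patch's xyz displacements at step t-1.
models = {name: init_mlp(N_LIGHT + 3 * n, HIDDEN, 3 * n) for name, n in GROUPS.items()}


def rollout(light_frames, initial_state):
    """Autoregressive decoding: each prediction becomes the next step's history."""
    state = {name: list(values) for name, values in initial_state.items()}
    trajectory = []
    for light in light_frames:
        state = {
            name: mlp_forward(models[name], list(light) + state[name])
            for name in GROUPS
        }
        trajectory.append(state)
    return trajectory


light_frames = [[random.random() for _ in range(N_LIGHT)] for _ in range(5)]
initial_state = {name: [0.0] * (3 * n) for name, n in GROUPS.items()}
trajectory = rollout(light_frames, initial_state)
print(len(trajectory), len(trajectory[-1]["tail"]))  # 5 6
```

Because each step feeds its own output back as history, errors can accumulate over long rollouts; sampling a window of frames rather than a single frame, as the text describes, is one way to smooth the prediction.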

Figure 6. Model architecture. The input consists of two parts, i.e., the history module storing the nodal displacement at the last step t−1, and the light module storing the light signal at the current step t. The output is the predicted nodal displacement at the current step t. All the input and output are within the time window from t−w to t. To handle the spatial locality of skin deformation, the whole architecture is an ensemble of three models, i.e., the head, body, and tail models. The prediction rolls out iteratively to the next step t + 1 with the nodal displacement at step t and the light signal at step t + 1, and so forth.

Shape Decoding Using Autoregressive DNNs

To assess the sensor's shape-decoding performance, a total of 3000 frames of data (2300 for training and 700 for testing) were collected underwater on the fish-shaped skin over 48 nodes in three groups (Figure 7A). The edge of the fish head was clamped inside the water tank and the body complied with the water flow. Each frame of data consists of the light intensity (input) provided by the PDs and the 3D nodal displacements (output). To continuously represent the shape of the soft sensor and make use of spatial characteristics, we enriched five EM-tracked nodes to 48 densely distributed nodes (detailed in Section 3.3). The nodal displacement error distribution is depicted per group and per node in Figure 7B, and collectively for all node instances in Figure 7C. The error tends to grow as the sensor undergoes larger deformation. An ablation study was conducted by removing three key components, i.e., time window (TW), history (HX) module, and patch-wise (PW) processing, from the original model (OM) individually or jointly, with an error comparison table shown in Figure 7D and histograms in Figure S2, Supporting Information. The OM attains the smallest error among these five models, indicating the importance of these components to the model accuracy. As discussed in Section 2.2, the sensing data form a time series, such that the history module and time window data, which offer information on previous steps, could play important roles. From the comparison between the OM and the variant without PW processing, we can conclude that patch-wise processing improves the ability of the model to handle the spatial locality of the data. The finite element (FE) mesh for nodal coordinate enrichment in the mechanical FEA and the overview of the sensing system are depicted in Figure 8A,B, respectively. The predicted skin shape is compared with the ground truth in Figure 8C. The reconstruction is shown to be close to the ground truth, despite the relatively complex deformation, which combines flexing and twisting and involves large (100 mm) deflection of the tail region. As displayed in Figure 8D, the shape reconstruction error is less than 5 mm during a 2 s time span without momentary large deviations, indicating that our decoding model can predict the skin shape.
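The error statistics above are reported over node instances, i.e., every node in every test frame (48 nodes × 700 frames). A minimal sketch of such an evaluation follows. The source does not define the nodal displacement error explicitly, so taking it as the Euclidean distance between predicted and true 3D node positions is an assumption, and the coordinates below are synthetic placeholders rather than measured data.

```python
import math


def nodal_errors(predicted, truth):
    """Euclidean distance between predicted and true 3D position per node instance."""
    return [
        math.dist(p, q)
        for frame_p, frame_q in zip(predicted, truth)
        for p, q in zip(frame_p, frame_q)
    ]


def rmse(errors):
    """Root-mean-square of a list of per-node errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


N_FRAMES, N_NODES = 700, 48  # test frames and nodes, as reported in the text

# Synthetic ground truth and a prediction offset by (1, 2, 2) mm at every node.
truth = [
    [(float(n), float(f), 0.0) for n in range(N_NODES)]
    for f in range(N_FRAMES)
]
predicted = [[(x + 1.0, y + 2.0, z + 2.0) for x, y, z in frame] for frame in truth]

errors = nodal_errors(predicted, truth)
print(len(errors))  # 33600
print(max(errors), rmse(errors))
```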

Figure 7. Sensor skin configuration and prediction performance through 33 600 node instance samples collected from 700 frames. A) Selected 48 nodes to represent the overall sensor morphology. B) Distribution of nodal displacement error per group in box plot and mean per node with colormap. C) Distribution of nodal displacement error of all node instances. D) Error comparison between the OM and the four other models in the ablation study, which omit the three key components (TW, HX module, and PW processing) individually or jointly.

Figure 8. Shape decoding of the fish prototype in the underwater test. A) FE mesh used in the mechanical FEA for data enrichment. B) Components of the sensor skin system. The sensor body was embedded with three LED–PD pairs and connected to an FPGA board that carried the Bluetooth module and battery for data transmission. C) Four different motion poses ②, ③, ④, and ⑤, following the numbering in Figure 4A, and their corresponding decoded shapes, with the colorbar showing the nodal displacement error. D) Error of decoded shapes during this 2 s deformation. The time instants of the four poses in (C) are marked. E) Isometric view of the reconstructed skin shape. The colored shapes refer to the four poses in (C), while the orange phantoms represent other intermediate poses during this 2 s time span.

Experimental Section

To ensure reproducibility of this work, we provide detailed information for constructing the waveguide sensor and the settings of the multiphysics and mechanical FEA. PDMS and RGB LED–PD pairs were chosen as the light transmission medium and light sensing elements, respectively, and Bluetooth was used for wireless data delivery. COMSOL Multiphysics was used in the multiphysics (ray optics cum mechanics) FEA for light transmission simulation, while ABAQUS was used in the mechanical FEA for data enrichment.

Fabrication

The proposed optical waveguide sensor is composed of three main parts, namely, the soft skin, the sensing elements, and the wireless data transmission modules. The soft skin, consisting of five layers as shown in Figure S1B and Video S1, Supporting Information, works as the medium of light transmission. Isotropic and nondispersive PDMS is commonly used as a light-transmitting material owing to its high refractive index (≈1.4) and high transmittance (>90%) for visible light.^[⁴⁶^] The fabrication of the soft skin followed the standard silicone curing process, which was repeated for every layer. PDMS (Sylgard 184) mixed in a 10:1 ratio was degassed and cured at 60 °C for 48 h, followed by 120 °C for 30 min. For the opaque and semiopaque layers, PDMS was additionally mixed with silicone dye. RGB LEDs (Kingbright 0603 LED) and PDs (AMS TCS34725FN, 400 kHz) were selected as the light-emitting and transducing elements, respectively, both of which were embedded in the transparent layer. The electronics were connected to a field-programmable gate array (FPGA)-based printed circuit board (PCB) with a Bluetooth 5.0 module (HC Tech, nRF52832) and a lithium-ion battery (3.7 V, 400 mAh) (Figure S3, Supporting Information). The battery life is approximately 30 min of continuous sensing. These components are all available off the shelf and are interchangeable. For instance, the waveguide medium can be replaced with a synthetic hydrogel. The total cost is around 150 USD.

Data Acquisition

To prepare the dataset for learning-based modeling, we collected multiple data sequences on both the A5-sized and fish-shaped waveguide sensors. In particular, the fish-shaped sensor was clamped along one side and performed undulating swimming motions driven by external hydrodynamic force. During deformation, light intensities were acquired by multiple PDs and transmitted by an FPGA board (Figure S3, Supporting Information). Five EM tracking markers were sparsely adhered to the sensor to capture real-time 3D coordinates at 20 Hz (Aurora V3, NDI). Other motion-tracking methods (e.g., infrared-based or dynamic Lidar detection) could also serve as alternatives if their line-of-sight limitations are overcome. During the model evaluation and repeatability test, light intensity data were wirelessly transmitted to the processing PC (i9-12900H, RTX 3060, 16 GB RAM) at 150 Hz for shape decoding.

Multiphysics FEA for Design Optimization

Multiphysics FEA was conducted using the Geometrical Optics and Solid Mechanics modules in COMSOL. The A5-sized waveguide sensor was meshed into 1260 eight-node hexahedral elements (C3D8). The refractive index of PDMS varies with light wavelength. For red (700 nm), green (510 nm), and blue (440 nm) light, the indices are 1.4273, 1.4364, and 1.4433, respectively.^[⁴⁶^] Assuming the embedded optoelectronic components would not affect the flexibility of the skin, the waveguide medium PDMS was set to be an elastic material with a Poisson's ratio of 0.495,^[⁴⁷^] a density of 965 kg m⁻³, and an elastic modulus of 2.5 MPa.^[⁴⁸^] Light rays in these three colors, each simulated with 5000 vectors, were emitted based on a Lambertian distribution. Light loss was modeled by setting the reflection coefficient to 0.75. Zero polarization was assumed. A static study with geometric nonlinearity was carried out for deformation at step 1 (t = 0), followed by a ray tracing study at step 2 (t = 20 ns).
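To illustrate what emitting rays "based on a Lambertian distribution" means, the sketch below draws cosine-weighted ray directions on the hemisphere around a surface normal. It is a generic sampling routine written in plain Python, not the COMSOL ray-release feature used in the study. The emitter normal along +z, the function name, and the seed are assumptions made for illustration.

```python
import math
import random

def sample_lambertian(n_rays, seed=0):
    """Draw unit direction vectors whose density is proportional to cos(theta),

    where theta is the angle to the emitter normal (assumed to be +z).
    """
    rng = random.Random(seed)
    rays = []
    for _ in range(n_rays):
        u, v = rng.random(), rng.random()
        # Cosine-weighted sampling: sin^2(theta) is uniform on [0, 1).
        sin_t = math.sqrt(u)
        cos_t = math.sqrt(1.0 - u)
        phi = 2.0 * math.pi * v  # azimuth is uniform on [0, 2*pi)
        rays.append((sin_t * math.cos(phi), sin_t * math.sin(phi), cos_t))
    return rays

# 5000 rays per color, matching the ray count quoted in the text.
rays = sample_lambertian(5000)
mean_cos = sum(r[2] for r in rays) / len(rays)  # expected value is 2/3
```

For a cosine-weighted hemisphere the expected value of cos θ is 2/3, which gives a quick sanity check on the sampled set.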

Mechanical FEA for Data Enrichment

Dense data points are crucial to subsequent shape reconstruction, regardless of the motion tracking technology. As shown in our previous work, FE-based data enrichment outperforms bilinear and nonlinear interpolation in terms of error, and thus it is exploited in this study.^[²⁸^] The FE model was fed with the 3D coordinates of the five markers as point displacement constraints, and generated 48 sets of nodal coordinates via the commercial software ABAQUS. The material properties of PDMS were the same as those in Section 3.3. The FE model contained 5394 eight-node hexahedral elements with incompatible modes (C3D8IH), which deliver much better accuracy than the standard hexahedral element under bending deformation.

Learning-Based Model Configuration and Error Metrics

We implemented the proposed deep learning framework using PyTorch, and trained the neural networks with a batch size of 128 using the L2 loss [Image Omitted. See PDF], where $x_{i}$ and $x_{i}^{\star}$ are, respectively, the predicted and label nodal displacements, and n is the number of nodes. The data were sampled using a time window of five frames (≈0.03 s). To enhance the generalizability of the model, both the input and output were expressed relative to their values in the stable initial state, and then normalized to zero mean and unit variance. All six MLPs in our architecture have four hidden layers with 128 neurons each, and ReLU is the activation function. Dropout (p = 0.5) was adopted to alleviate overfitting. The RMSE was utilized to evaluate model performance quantitatively [Image Omitted. See PDF].
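The preprocessing steps described above (values relative to the initial state, normalization to zero mean and unit variance, and a five-frame time window) can be sketched as follows, together with one common form of the RMSE. This is a minimal standard-library illustration, not the PyTorch implementation used in the study. It assumes that the first frame is the stable initial state and that the RMSE is the root of the mean squared Euclidean error over the n nodal displacements, since the original equations are not available in this text. The function names and toy values are hypothetical.

```python
import math

def relative_to_initial(frames):
    """Subtract the first frame (assumed here to be the stable initial state)."""
    ref = frames[0]
    return [[v - r for v, r in zip(frame, ref)] for frame in frames]

def standardize(frames):
    """Scale every channel to zero mean and unit variance."""
    n, dims = len(frames), len(frames[0])
    means = [sum(f[d] for f in frames) / n for d in range(dims)]
    # A constant channel has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((f[d] - means[d]) ** 2 for f in frames) / n) or 1.0
            for d in range(dims)]
    scaled = [[(f[d] - means[d]) / stds[d] for d in range(dims)] for f in frames]
    return scaled, means, stds

def sliding_windows(frames, width=5):
    """Window i holds the `width` most recent frames ending at frame i."""
    return [frames[i - width + 1:i + 1] for i in range(width - 1, len(frames))]

def rmse(pred, label):
    """Root of the mean squared Euclidean error over n nodal displacements."""
    n = len(pred)
    return math.sqrt(sum(math.dist(p, q) ** 2 for p, q in zip(pred, label)) / n)

# Toy light-intensity sequence (assumed values): 7 frames, 2 hypothetical PD channels.
light = [[10.0, 20.0], [11.0, 20.5], [12.0, 22.0], [11.5, 21.0],
         [10.5, 20.0], [12.5, 23.0], [13.0, 22.5]]
scaled, means, stds = standardize(relative_to_initial(light))
windows = sliding_windows(scaled, width=5)  # 7 - 5 + 1 = 3 windows
```

In a training pipeline, the normalization statistics would normally be computed on the training split only and reused for the test split, so that no test information leaks into the model input.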

For the ablation study (Figure 7D), RMSE is the mean error on 48 nodes in 700 frames; for the study of prediction stability (Figure 8C), RMSE is the mean error on 48 nodes. To determine whether a time-series model is suitable for this study, we used the k-order autocorrelation function (ACF) to analyze the light signal and node displacement [Image Omitted. See PDF], where k = 1, 2, 3, … is the lag value, $y_{i}$ is the data at the ith time step, $\bar{y}$ is the average value, and m is the last time step.
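The equation for the ACF is not reproduced in this text. A commonly used sample form that matches the symbols defined above is $r_{k} = \sum_{i=1}^{m-k}(y_{i}-\bar{y})(y_{i+k}-\bar{y}) / \sum_{i=1}^{m}(y_{i}-\bar{y})^{2}$, and the sketch below assumes this form. The function name and the test signals are hypothetical. A value of $r_{k}$ close to 1 at small lags suggests that a sample depends strongly on its recent history, which is the property that motivates a time-series model.

```python
import math

def acf(y, k):
    """Sample autocorrelation of the sequence y at lag k (assumed standard form)."""
    m = len(y)
    mean = sum(y) / m
    denom = sum((v - mean) ** 2 for v in y)
    num = sum((y[i] - mean) * (y[i + k] - mean) for i in range(m - k))
    return num / denom

# A slowly varying signal (assumed): neighboring samples are strongly correlated.
smooth = [math.sin(2.0 * math.pi * i / 50.0) for i in range(200)]
# An alternating signal (assumed): neighboring samples are anticorrelated.
alternating = [1.0, -1.0] * 10

r1_smooth = acf(smooth, 1)
r1_alternating = acf(alternating, 1)
```

Applied to each PD channel and each nodal coordinate, a slowly decaying ACF over increasing lag k would support the use of a time window and a history module.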

Conclusion

In this study, we presented a shape decoding framework for a light-transmission-based soft skin sensor, utilizing multiphysics FEA for design optimization, mechanical FEA for data enrichment, and deep learning for shape modeling. The multiphysics FEA uses ray optics and mechanics principles. It plays an important role in the prefabrication design analysis of the sensor, allowing us to optimize the distribution of sensing units on the skin sensor in a short time. The effect of the distance and offset angle between LEDs and PDs on light transmission was investigated in the analysis, and low-resolution issues were identified. Many-to-one mappings in the sensing data could be eliminated after design optimization. The resulting real-time shape decoding performance was demonstrated using a simple test, shown in Video S2, Supporting Information, in which a corner of the sensor was lifted. The A5-sized prototype indicates that a relatively short distance (≈75 mm) and a large angle (≈90°) between PD and LED could reduce light energy loss and promote data recognizability for data-driven modeling. The interpolation of sparse data using the mechanical FEA provides significantly more data for model training, easing the requirement for dense markers on the skin sensor when recording high-resolution and complicated deformation. We described the skin sensor deformation using 48 nodal coordinates interpolated through this analysis from the coordinates obtained by five EM markers. Thus, a dataset comprising the measured continuous light signals and the enriched nodal coordinates was collected. The repeatability test shows that the data of the fish-shaped sensor remain reliable even after 1000 deformation cycles, with acceptable noise (<0.05%) and a difference in light intensity below 0.31%. Before training, we analyzed the data and determined that the skin motion exhibited spatial locality and temporal continuity. Results indicate that the averaged coordinates in the three groups of nodes can be used to characterize the skin deformation, and the ACF reveals that both the light signal and the nodal coordinates are affected by their history. We developed the mapping from light intensity to skin shape based on an autoregressive (AR) model in which the time window and patch-wise processing were utilized. The trained model could reconstruct the nodal displacement with an RMSE of 0.27 mm (for the 700-frame test data). The predicted skin shape was close to the ground truth, as supported by Video S3, Supporting Information, even for complex motion (e.g., a combination of bending and torsion). The ablation study on the model architecture implies that the three key components of the framework, namely, the time window, AR, and patch-wise processing, are beneficial to model accuracy.

The proposed LED–PD-based optical sensing could be integrated into artificial skin to enable the perception of human body motions, or enclosed in soft robots to offer shape information in robot–environment interaction. It is important to note that our study focused on 3D morphological changes, and further research is needed to investigate other sensing modalities such as accurate proprioception involving localized pressure or stretching (e.g., multipoint fingertips). In summary, our study presents a shape-sensing framework for an LED–PD-based soft waveguide sensor. The FE-based analyses for sparse-to-dense data processing and design optimization, and the autoregressive shape prediction model can also be extended to other transducing techniques such as electrical impedance^[⁴⁹^] or acoustic-based methods.^[^44,50^]

Acknowledgements

C.H.M. and Y.L. contributed equally to the study and should be considered as co-first authors. Y.L. and K.W.K. are co-corresponding authors. Research in this article was supported by the Research Grants Council (RGC) of Hong Kong (grant nos. 17205919, 17207020, 17209021, and T42-409/18-R) and the Innovation and Technology Commission (ITC) (grant no. MRP/029/20X) of the HKSAR Government under the InnoHK initiative, Hong Kong, via Multi-Scale Medical Robotics Center Limited and Centre for Garment Production Limited.

Conflict of Interest

The authors declare no conflict of interest.

Data Availability Statement

Research data are not shared.

© 2024. This work is published under http://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.

Abstract

Optical waveguides create interesting opportunities in the area of soft sensing and electronic skins due to their potential for high flexibility, quick response time, and compactness. The loss or change of light intensities inside a waveguide can be measured and converted into useful sensing feedback such as strain or shape sensing. Compared to other approaches such as those based on microelectromechanical system modules or flexible conductors, the entire sensor state can be characterized by fewer sensing nodes and less encumbering wiring, allowing greater scalability. Herein, simple light-emitting diodes (LEDs) and photodetectors (PDs) combined with an intelligent shape decoding framework are utilized to enable 3D shape sensing of a self-contained flexible substrate. Multiphysics finite element analysis is leveraged to optimize the PDs/LEDs layout and enrich ground-truth data from sparse to dense points for model training. The mapping from light intensities to overall sensor shape is achieved with an autoregression-based model that considers temporal continuity and spatial locality. The sensing framework is evaluated on an A5-sized flexible sensor prototype and a fish-shaped prototype, where sensing accuracy (RMSE = 0.27 mm) and repeatability (Δ light intensity <0.31% over 1000 cycles) are tested underwater.

Details

Title: Intelligent Shape Decoding of a Soft Optical Waveguide Sensor

Author: Chi-Hin Mak¹; Li, Yingqi¹; Wang, Kui¹; Wu, Mengjie¹; Di-Lang Ho, Justin¹; Dou, Qi²; Kam-Yim Sze¹; Althoefer, Kaspar³; Ka-Wai Kwok¹

¹ Department of Mechanical Engineering, The University of Hong Kong, Hong Kong, P. R. China
² Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, P. R. China
³ School of Electronic Engineering and Computer Science, Queen Mary University of London, London, UK

Section: Research Articles

Publication year: 2024

Publication date: Feb 2024

Publisher: John Wiley & Sons, Inc.

e-ISSN: 26404567

Source type: Scholarly Journal

Language of publication: English

DOI: https://doi.org/10.1002/aisy.202300082

ProQuest document ID: 2928475005
