Inspired by bird flight, flapping‐wing robots have gained significant attention due to their high maneuverability and energy efficiency. However, the development of their perception systems faces several challenges, mainly related to payload restrictions and the effects of flapping strokes on sensor data. The limited resources of lightweight onboard processors further constrain the online processing required for autonomous flight. Event cameras exhibit several properties suitable for ornithopter perception, such as low latency, robustness to motion blur, high dynamic range, and low power consumption. This article explores the use of event‐based vision for online processing onboard flapping‐wing robots. First, the suitability of event cameras under flight conditions is assessed through experimental tests. Second, the integration of event‐based vision systems onboard flapping‐wing robots is analyzed. Finally, the performance, accuracy, and computational cost of some widely used event‐based vision algorithms are experimentally evaluated when integrated into flapping‐wing robots flying in indoor and outdoor scenarios under different conditions. The results confirm the benefits and suitability of event‐based vision for online perception onboard ornithopters, paving the way for enhanced autonomy and safety in real‐world flight operations.
Introduction
Flapping-wing robots, also known as ornithopters, are aerial platforms that generate lift and thrust by mimicking the flight mechanism of birds and insects. These robots have high maneuverability and combine glide and flapping flight modes to minimize energy consumption. Compared to multirotor and fixed-wing platforms, flapping-wing robots consume less energy and are less dangerous in the event of a collision. In addition, their wide range of potential applications has motivated a significant research and development interest in recent years.[1–3]
The design of perception systems for flapping-wing robots faces various challenges and limitations.[4] First, ornithopters have strict restrictions on payload and weight distribution, which directly affect the number, size, and shape of sensors, electronics, batteries, and other components onboard. In addition, their agile movements and flapping strokes produce strong mechanical vibrations that impose significant constraints on the sensors that can be used for online perception. Moreover, online onboard processing plays a crucial role in aerial robot perception. It enables the real-time decision-making and control required to achieve high degrees of safety, autonomy, and responsiveness through autonomous functionalities such as obstacle detection, collision avoidance, and navigation, among others. Fast reactivity is particularly critical considering the agile maneuvers of flapping-wing platforms. However, their constrained payload capacity also imposes complex restrictions on the processing units that can be used, requiring small, lightweight computers whose resources can be limiting for some applications.
The ornithopters’ constrained payload and onboard computational resources limit the number and type of sensors that can be used and the type of onboard perception processing. Sensors such as light detection and ranging (LiDARs), ultrasound sensors, radars, and infrared cameras, which are widely used in multirotors, present several problems when used for flapping-wing robots, whose payload is in the range of a few hundred grams.[5] In contrast, vision sensors are small and lightweight while providing rich information about the environment. Most existing works have proposed using vision-based perception for ornithopter autonomy.[4,6] Some works have used schemes based on traditional frame-based cameras in monocular[7–9] or stereo[10–14] configurations. Others have proposed perception schemes based on event cameras.[15,16] These novel bioinspired sensors, which operate asynchronously and independently at each pixel, mimic animal vision systems in both speed and energy efficiency,[17] offering advantages such as low latency, high robustness against motion blur, and high dynamic range, among others.[18,19] In a recent previous work,[20] we compared frame-based and event cameras for flapping-wing robot perception, concluding that although event-based technology has a lower degree of maturity, it suits the requirements of ornithopters. Taking that work as the starting point, we analyze the use of event-based vision for online perception on board flapping-wing robots with a broader approach, evaluating commonly used event-based vision algorithms, their suitability for indoor and outdoor ornithopter perception, their ease of integration, and the available resources and support tools.
This article broadly analyzes the suitability of event-based vision for the online onboard perception of flapping-wing robots. We intend to answer these three questions:
Q1: Are event cameras suitable for online onboard perception considering the challenging flight conditions of flapping-wing robots?
Q2: Are event cameras suitable to be integrated (HW&SW) on flapping-wing robots considering the constraints of these platforms?
Q3: Are the performance and computational cost of event-based algorithms feasible to enable online onboard perception tasks for ornithopters?
We aim to answer these questions by performing three analyses: 1) evaluation of the suitability and robustness of event cameras in experiments that mimic the conditions found in flapping-wing flights; 2) review of the available event-based devices, resources, and support tools, focusing on their HW&SW integrability on flapping-wing platforms; and 3) experimental evaluation of the performance and computational cost of some widely used event-based vision algorithms when integrated into flapping-wing robots flying in different scenarios (including the platform in Figure 1). To the best of the authors’ knowledge, this is the first work broadly analyzing the use of event vision for large-scale flapping-wing robots. In addition, we provide the data recorded onboard two different flapping-wing robots in indoor and outdoor scenarios used in the experimental evaluation of this work.
[IMAGE OMITTED. SEE PDF]
The rest of the article is structured as follows: Section 2 briefly summarizes the main works in the topics addressed in the article. The experimental analysis of the suitability of event cameras for the perception challenges of flapping-wing flight is presented in Section 3. The HW&SW integrability of event-based vision onboard ornithopters is analyzed in Section 4. The evaluation of widely used event-based vision algorithms for ornithopters’ onboard perception is presented in Section 5. Finally, Section 6 concludes the article with the findings and main future steps.
Related Work
Event Cameras
In recent years, event cameras have attracted increasing interest in fields such as artificial intelligence, computer vision, neuromorphics, and robotics. The origins of event cameras trace back to 1991 with pioneering efforts in neuromorphic vision technology and the emergence of the silicon retina,[21] which mimicked biological eyes’ processing capabilities. This foundation led to the EU's CAVIAR project,[22] which advanced early AER-based event vision systems. In 2008, the first commercial event cameras (128 × 128 resolution, 40 μm pixel size; now iniVation's DVS128) were introduced.[23–26] By 2019, major players like Sony and Samsung entered the market, signaling growing commercial interest in event-based sensors. Technological advancements continued rapidly, with the first HD event cameras[27] (1280 × 720 resolution, 5 μm pixel size) launching in 2021. In 2022, Meta began showing interest in this innovative technology,[28] followed by a significant collaboration between Prophesee and Qualcomm in 2023 and with AMD and Lucid Vision Labs (among others) in 2024. Foundational studies introducing novel event camera designs have explored their advantages over traditional sensors.[23–27,29–34] However, there are few works that analyze and compare frame-based and event-based vision. Addressing this gap, Holešovský et al.[35,36] conducted an experimental analysis using a speed-controlled spinning disk and a bullet fired at various velocities to compare the performance of an ATIS HVGA Gen3 event camera, a DVS240 event camera, and two high-speed global-shutter cameras. Their results demonstrated the advantages of event cameras, particularly in terms of bandwidth efficiency, and also explored the limitations of event-based sensors in pixel latency and readout bandwidth, especially in highly cluttered scenes. Similarly, Barrios et al.[37] carried out a comparison employing a GENIE M640 CCD camera and a non-commercial event CMOS camera connected to a Powerlink IEEE 61158 industrial network for controlling a two-axis planar robot during object tracking. The results showcased the event camera's capability to enable the robot to track the target with greater speed, accuracy, and stability, especially under varying light conditions. The work of Censi et al.[38] proposed a formal evaluation of diverse sensor families using a power-performance curve. The study focused on contrasting traditional CCD/CMOS sensors with neuromorphic vision sensors and revealed the task-dependent dominance of different sensors across various sensing power ranges. Cox et al.[39] introduced a theoretical methodology to assess the performance of event and frame cameras. Their approach involved employing system-level models and surrogate performance metrics for target recognition tasks.
Furthermore, data-processing perspectives have been explored in comparative studies between frame and event cameras. Farabet et al.[40] conducted a comparison between frame-based convolutional neural networks and frame-free spiking neural networks for object recognition applications. Implementation examples using VLSI chips and field-programmable gate arrays (FPGAs) were provided, analyzing differences in computational speed, scalability, multiplexing, and signal representation. Rebecq et al.[41,42] proposed an image reconstruction method based on a recurrent neural network trained with simulated events, evaluating and comparing the quality of the reconstructed images using standard computer vision algorithms such as visual-inertial odometry and object classification. Further, some studies shed light on the use of events and frames for specific tasks (e.g., the work from Rodríguez-Gómez et al.[43] on visual stabilization).
Flapping-Wing Robot Perception
The development of online onboard perception systems for flapping-wing robots is a challenging problem. Traditionally, control and guidance methods for ornithopters have relied on external sensors such as motion capture systems.[44–46] In addition, some studies have performed off-board processing of visual sensors.[6,7,47–51] However, recent advances in the size and weight reduction of visual sensors and processors have enabled the integration of fully onboard perception systems for ornithopters. One of the first methods proposed for ornithopters was the obstacle avoidance method of Wagter et al.[11] and Tijmons et al.[12] which used a lightweight stereo camera with a CPU module to detect static obstacles using disparity maps. More recently, Gómez et al.[15] presented an ornithopter guidance method based on event cameras. The algorithm tracks line pattern references from events and feeds a visual servoing controller to guide the robot toward a goal position. Rodríguez-Gómez et al.[16] presented an event-based dynamic obstacle avoidance method for ornithopters, which quickly detects moving obstacles and triggers evasive actions controlling ornithopter tail deflections. The use of event-based frequency processing onboard a flapping-wing robot is explored in Tapia et al.[52] These works use different types of vision sensors (frame-based, event-based, and/or stereo setups) but do not conclude which sensor is the most suitable for flapping-wing robots. In a previous work,[20] we compared frame and event cameras for ornithopters, concluding that although event cameras have a lower level of maturity, they are more suitable for flapping-wing robot onboard perception than frame-based cameras. This article represents a step forward and evaluates the use of the complete event-based vision system for ornithopter onboard perception from a comprehensive perspective by assessing 1) the operational limits of event-based vision considering effects such as vibration level; 2) the ease of integration of event cameras on board ornithopters; and 3) the capacity of event-based systems to perform different perception tasks on board our flapping-wing robots indoors and outdoors.
Flight Challenges
This section intends to answer Q1: Are event cameras suitable for online onboard perception considering the challenging flight conditions of flapping-wing robots? For this purpose, it is necessary to consider what these challenging conditions are and how they affect onboard vision systems. As stated in ref. [20], the flight of ornithopters imposes substantial requirements on the onboard vision system. In search of efficiency and robustness, we need to know the operational limits of event-based vision with respect to several aspects.
Agile movements, strong vibrations, and changes in tilt angle caused by flapping strokes lead to sudden changes in the visual scene captured by the camera. Therefore, the onboard vision system requires a high temporal resolution to ensure that these rapid changes can be perceived. The ability to capture data with a precise temporal resolution enables a high degree of responsiveness and reactive decision-making, which is particularly critical in dynamic scenarios. However, high temporal resolution is generally associated with generating a large amount of information during flight. Hence, it is also necessary to analyze the bandwidth of the onboard sensors. In addition, to avoid information loss, the vision system must be robust to the motion blur that can be caused by the rapid shifts and vibrations mentioned earlier. Although event cameras are significantly robust to blur, they are not entirely insensitive.[53] Moreover, vibrations, fast motions, or changes in the tilt angle can cause abrupt fluctuations in the lighting captured by the camera. This potentially leads to information loss as the camera fails to adapt to changing lighting conditions. Hence, a high dynamic range is critical, particularly in outdoor scenarios. The analysis of operation in dark lighting conditions is also relevant for indoor applications.
We have experimentally analyzed the response of event-based vision systems in terms of 1) temporal resolution; 2) bandwidth; 3) motion blur; 4) dynamic range; and 5) operation in dark conditions. These analyses include experiments in test benches designed to mimic the flapping-wing flight conditions.
Temporal Resolution
Ornithopter vibration and agile maneuvers require perception systems capable of effectively capturing and analyzing objects with very high temporal resolution. This is particularly crucial, for instance, for sense-and-avoid systems. Unlike traditional cameras that operate by capturing a fixed number of frames per second, event cameras capture visual information asynchronously, hence enabling temporal resolutions that are mainly limited by the camera's refractory period. To illustrate the potential of event sensors in terms of temporal resolution, we designed an experiment in which different cameras were used to detect a blinking LED, whose frequency varied from 1 Hz to 1 kHz. The temporal resolution of event-based cameras provides a significant advantage for detecting and tracking rapid movements. In the context of ornithopter perception, it is essential to consider the combined effects of the flapping motion and the detected target dynamics. These factors can increase the operational frequency demand of the detection system. Therefore, the operational flapping frequency of the ornithopter (yellow area in Figure 2) should be regarded as a minimum requirement for the onboard detection system. The dynamic vision sensor (DVS) of a DAVIS346, a DVXplorer Mini, and two frame-based sensors (with frame rates of 30 and 40 Hz) were evaluated. To avoid any possible degradation produced by motion blur (see the discussion in Section 3.3), the LED was placed so that it covered a sufficient number of pixels, ensuring its detection even if some pixels did not produce events. In the case of the DVS and the DVX, a blink of the LED is detected as two consecutive events with opposite polarities (ON and OFF) in the corresponding pixels. For the frame cameras, a blink is detected as an abrupt change in the mean intensity of the LED's pixels.
[IMAGE OMITTED. SEE PDF]
As shown in Figure 2, the DVS and DVX can correctly detect blinking LEDs in the entire range of frequencies analyzed. In contrast, frame-based detection of the blinking frequency is limited by the Nyquist frequency, that is, half of the camera frame rate: 15 and 20 Hz, respectively. A frame camera with a temporal resolution similar to that of the DVS or the DVX would require a frame rate of >2 kHz, which would entail bigger and heavier cameras that are far beyond the payload capabilities of ornithopters. Besides, processing frames at high frequencies requires dedicated hardware (e.g., GPU, parallelization) while increasing payload and power consumption. Results for higher frequencies are not shown because the event camera's refractory period would deteriorate the ON/OFF trigger detection.
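To make the event-based detection scheme concrete, the following minimal sketch (in Python) estimates the blinking frequency seen by an event camera: a blink is counted as an ON event followed by an OFF event at a pixel covered by the LED, and the frequency is obtained from the time between consecutive activations. The event tuple layout and the refractory_us filtering value are assumptions of this illustration, not parameters reported in the experiment.

```python
import numpy as np

def estimate_blink_frequency(events, pixel, refractory_us=100):
    """Estimate the LED blinking frequency [Hz] observed at a single pixel.

    events: iterable of (t_us, x, y, polarity) tuples, polarity in {+1, -1}.
    pixel:  (x, y) coordinates of a pixel covered by the LED.
    A blink is counted as an ON event followed by an OFF event.
    """
    on_times = []
    last_on = None
    for t_us, x, y, p in events:
        if (x, y) != pixel:
            continue
        if p > 0:                       # ON event: LED turns on
            last_on = t_us
        elif p < 0 and last_on is not None:
            on_times.append(last_on)    # full ON/OFF pair -> one blink
            last_on = None
    if len(on_times) < 2:
        return None
    periods_us = np.diff(on_times)                        # time between activations
    periods_us = periods_us[periods_us > refractory_us]   # drop spurious retriggers
    return 1e6 / np.median(periods_us)                    # median period -> Hz
```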
Bandwidth
The lightweight computers that can be mounted on board ornithopters impose severe limitations in terms of computing power (i.e., bits/s that can be processed). Standard cameras suffer from the bandwidth-latency trade-off (BW ∝ 1/Δt). Conversely, event cameras are not governed by a frame rate and, hence, can achieve a variable bandwidth while maintaining a low latency. Given an electronic camera configuration (i.e., the current biases that control bandwidth, contrast threshold, and refractory period, among others), the event generation rate depends mainly on the relative motion between the camera and the scene. Therefore, it is necessary to quantitatively evaluate the amount of information to be processed in both frame and event cameras. For that purpose, we analyzed the number of pixels (APS) and events (DVS) generated per second by the DAVIS346 event camera in the Soccer, Testbed, and Hills scenarios from the GRIFFIN Perception Dataset.[4] Both sensors have the same resolution (346 × 260), have their pixel coordinates aligned, and share the same optics (i.e., same FoV, AFoV, focal length, distortion, etc.). Table 1 shows the number of events and pixels generated per second for three different scenarios in the dataset. It is worth noting that the values are lower for the DVS since it only captures changes in brightness, hence reducing the amount of redundant information from the scene (e.g., static background such as open sky). This can also be observed in Figure 3, where the number of events and pixels per millisecond in the sequence Hills Base 1 are shown. The influence of the different ornithopter flight stages on event generation can be easily noticed: 1) launching (low event generation); 2) flapping (high event generation); and 3) landing (abrupt event generation). We can conclude that frame-based sensors suffer from oversampling during most of the flight but also from undersampling during certain aggressive maneuvers (e.g., ornithopter landing at t = 38 s in Figure 3). In contrast, the ability of event cameras to generate information according to the scene dynamics improves the efficiency of information capture.
Table 1 Mean, median, and maximum number of pixels and events generated in all the Soccer, Hills, and Testbed sequences.[4] MEPS stands for million events per second, MPPS stands for million pixels per second.
| | Soccer | | | Hills | | | Testbed | | |
| | Mean | Median | Max | Mean | Median | Max | Mean | Median | Max |
| APS [MPPS] | 3.60 | 3.60 | 3.60 | 3.60 | 3.60 | 3.60 | 3.60 | 3.60 | 3.60 |
| DVS [MEPS] | 0.42 | 0.39 | 8.35 | 0.39 | 0.30 | 8.69 | 0.91 | 0.80 | 4.16 |
| DVS [%]a) | 3.50 | 3.25 | 69.58 | 3.25 | 2.50 | 72.42 | 7.58 | 6.67 | 34.67 |
[IMAGE OMITTED. SEE PDF]
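For reference, the per-second figures in Table 1 can be reproduced with a computation along the following lines, a sketch assuming sorted event timestamps in seconds and the DAVIS346 resolution; the 40 FPS value is inferred from the constant 3.60 MPPS reported for the APS (346 × 260 × 40 ≈ 3.6 M pixels per second).

```python
import numpy as np

def bandwidth_stats(event_t_s, width=346, height=260, fps=40, bin_s=1.0):
    """Compare the DVS event rate with the fixed APS pixel rate.

    event_t_s: sorted array of event timestamps in seconds.
    Returns the (mean, median, max) events/s over bins of bin_s seconds and
    the constant pixels/s generated by the APS.
    """
    bins = np.arange(event_t_s[0], event_t_s[-1] + bin_s, bin_s)
    counts, _ = np.histogram(event_t_s, bins=bins)   # events per bin
    eps = counts / bin_s                             # events per second
    pps = width * height * fps                       # APS pixel rate (constant)
    return eps.mean(), np.median(eps), eps.max(), pps
```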
Motion Blur
Global and rolling shutter frame-based cameras suffer from motion blur, mainly when the exposure time is relatively long with respect to the dynamics of the visual scene. In contrast, event cameras are more robust against motion blur due to their asynchronous nature, low latency, and high temporal resolution. Ornithopter flapping strokes produce aggressive tilting motions that affect the information captured by the cameras.[4] Although the work in ref. [54] presents a mechanical stabilizer to reduce tilt motion, the strict payload requirements of flapping-wing robots restrict the integration of gimbal devices. Furthermore, although some software deblurring or stabilization approaches may be feasible for low levels of blur,[55] they introduce a significant delay in the processing pipeline that prevents real-time operation.
Despite their higher robustness to motion blur when compared to frame-based cameras [ref. 20, Section 5.2], event cameras can also suffer from motion blur under certain conditions. However, the effect observed in an event-based sensor when the relative camera-scene motion increases is quite different from that experienced by a conventional sensor. According to Benosman et al.,[53] as the speed increases, motion blur causes events to form clusters rather than sharp edges, resulting in a sparser motion flow in event-based systems. Additionally, the camera does not generate enough events for all spatial locations, and as a result, some pixels are not activated. This effect, produced by the latencies of the sensor when capturing the light, was evaluated by mounting a DAVIS346 on a platform that mimics the horizontal flapping motion of ornithopters; see Appendix A. The camera was set pointing toward a black canvas with a white horizontal line under uniform and constant illumination conditions, and the camera-canvas distance was such that the line was present within the camera FoV during the whole experiment. Events are mainly triggered by the changes in pixel intensities caused by the platform's oscillatory motion. Thus, in these experiments, events are triggered at the projections of the line edges on the image plane. During the experiment, the platform oscillated first at 5.0 Hz and then gradually reduced its frequency to 2.5 Hz.
To estimate the motion blur, we composed event images by accumulating a fixed number of events into frames and drawing them as black pixels in a white image. Figure 4 shows the event frames obtained by accumulating 1000 events at different oscillation frequencies. At higher frequencies, motion blur becomes visible as holes (white pixels) among the black pixels that describe the shape of the line. Another effect can be observed: the thickness of the line increases with higher frequencies. This occurs since the number of activated pixels decreases with the increase in frequency (i.e., the line's speed), as previously discussed. We define the percentage of triggered events (PTE) as the number of triggered events divided by the area of the patch caused by the line in each event frame. That is, PTE is an indicator of events lost by motion blur in event cameras. Figure 5 shows the experimental results obtained. At 5.0 Hz, PTE is around 74.38% and increases as the oscillation frequency reduces, evidencing that flapping strokes produce motion blur in event cameras. This experiment cannot directly measure the total amount of lost data, as the sensor might lose one or more events triggered at the same pixel, but it provides a valid approximation.
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
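A minimal sketch of the event-frame construction and PTE computation used in this experiment is shown below; the binary line-patch mask is assumed to be available (e.g., obtained by fitting the line and dilating it to its nominal thickness), a step not detailed in the text.

```python
import numpy as np

def event_frame(events_xy, shape=(260, 346), n_events=1000):
    """Accumulate the last n_events into a binary event frame (True = triggered)."""
    frame = np.zeros(shape, dtype=bool)
    for x, y in events_xy[-n_events:]:
        frame[y, x] = True
    return frame

def percentage_triggered_events(frame, line_mask):
    """PTE: pixels triggered inside the line patch divided by the patch area [%]."""
    triggered = np.logical_and(frame, line_mask).sum()
    return 100.0 * triggered / line_mask.sum()
```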
Dynamic Range
Due to mechanical vibrations and the sudden tilt angle changes caused by the flapping strokes, the cameras on board ornithopters can experience sudden, drastic changes in lighting conditions, including brightly lit scenes, dark scenes, and also cases in which the camera FoV includes both bright and dark scene parts at the same time. Therefore, dynamic range is a critical aspect to consider when selecting the onboard sensors. In [ref. 20, Section 5.1], a DVS sensor was evaluated for ArUco detection in event-reconstructed images under different dynamic range conditions, showing excellent performance up to 80 dB. In that experiment, under strong dynamic range conditions, noise and an insufficient number of events prevented the correct reconstruction used for the detection. To remove that influence and complement the analysis, we designed a setup consisting of a spinning dot (a black rotating disk with a white dot on it), which was partially illuminated with a strong light source (mimicking a visual scene partially affected by direct sunlight). The disk spun at a constant velocity, low enough to assume that no motion blur was produced. We pointed our DAVIS346 camera toward the disk together with other frame-based cameras (Intel RealSense D345 and ELP Mini720p; hereinafter RS and ELP, respectively) to detect the dot. We recorded a sequence of 30 s for each camera and then processed the images using a circle detection algorithm based on the Hough transform. A detection is considered when the number of votes received in a Hough space cell is higher than a threshold τ = 0.6τmax, where τmax is computed by assuming that all pixels of the dot perimeter produce a vote in the same coordinates. As a result, the algorithm was able to reject false positives while maintaining accurate detection. The input images for the Hough detection were obtained using the Canny edge detector (for frame-based cameras) or by accumulating events in temporal windows of 3 ms (for the DVS). Then, we computed the number of dot detections performed in different disk sectors. Figure 6 shows a polar histogram with the number of dot detections in each disk sector for each sensor; the disk sector affected by the light source is shown in gray. The DVS was able to detect the dot in the illuminated area, and the number of detections was similar to that in the non-illuminated sector. In contrast, all frame-based cameras failed to detect the dot in the illuminated area due to sensor saturation.
[IMAGE OMITTED. SEE PDF]
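The dot-detection criterion can be sketched as follows, using OpenCV's Hough circle transform as a stand-in for the custom Hough voting described above. The expected dot radius r_px and the Canny threshold are assumptions of this illustration; the vote threshold follows the τ = 0.6τmax rule with τmax approximated by the dot perimeter 2πr.

```python
import cv2
import numpy as np

def detect_dot(gray, r_px, tau_ratio=0.6):
    """Detect the white dot with a Hough circle transform.

    gray:  8-bit single-channel image (an APS frame, or an event image built
           from a 3 ms accumulation window drawn as white pixels on black).
    r_px:  expected dot radius in pixels (assumed known from the setup).
    """
    tau_max = 2.0 * np.pi * r_px                 # votes if every perimeter pixel voted
    circles = cv2.HoughCircles(
        gray, cv2.HOUGH_GRADIENT, dp=1, minDist=2 * r_px,
        param1=100,                              # internal Canny high threshold
        param2=int(tau_ratio * tau_max),         # accumulator vote threshold
        minRadius=int(0.8 * r_px), maxRadius=int(1.2 * r_px))
    return [] if circles is None else circles[0]  # list of (x, y, radius)
```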
Dark Lighting Conditions
Perception using visual sensors becomes particularly challenging in dark scenes. Although, in some cases, increasing the exposure time of frame-based sensors can mitigate the problem, it might also increase motion blur and noise levels. We evaluated the performance of the DAVIS346's DVS and active pixel sensor (APS), the ELP, and the RS under three different conditions: pitch-dark (≈0 lx), dark (≈5 lx), and well-lit conditions (≈100 lx). All cameras were set with parallel optical axes that pointed to a pattern with four lines. The pattern was moved slowly to generate events while minimizing motion blur. In addition, the three different lighting conditions were applied sequentially (≈10 s between every change). Line detection was performed using the Hough transform applied to frames and event images (generated by accumulating 1000 events per image, see Figure 7). For a fair evaluation, the frame-based sensors were configured with autoexposure. The DVS was the only sensor capable of detecting lines during the whole experiment. As expected, pitch-dark conditions hinder line detection for frame-based cameras even with high exposure times. Conversely, although the abrupt changes in the lighting conditions generated events that led to false line detections, the event camera was able to detect all lines satisfactorily in all the tested lighting conditions.
[IMAGE OMITTED. SEE PDF]
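A sketch of the line detection pipeline used in this experiment is given below: frames are converted to edge maps with the Canny detector, whereas event images (1000 accumulated events drawn as white pixels) are fed directly to the standard Hough transform. The Canny and vote thresholds are illustrative values, not taken from the text.

```python
import cv2
import numpy as np

def detect_lines(img, is_event_image, min_votes=60):
    """Detect lines with the standard Hough transform.

    img: 8-bit grayscale APS/ELP/RS frame, or an event image obtained by
         drawing the last 1000 events as white pixels on a black canvas.
    Returns an array of (rho, theta) pairs, or an empty list.
    """
    edges = img if is_event_image else cv2.Canny(img, 50, 150)
    lines = cv2.HoughLines(edges, rho=1, theta=np.pi / 180, threshold=min_votes)
    return [] if lines is None else lines[:, 0, :]
```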
Discussion
Event cameras demonstrate a strong potential for onboard perception in flapping-wing robots by addressing the challenges posed by their dynamic flight. Their asynchronous nature allows them to capture only relevant scene changes, reducing data load while maintaining responsiveness during agile maneuvers. Although motion blur can affect frame-based cameras’ performance under extreme conditions (e.g., vibrations and aggressive flapping strokes), event cameras remain more robust. The reliability and high autonomy of flapping-wing robots make them suitable platforms for both indoor and outdoor operations. Hence, a high dynamic range and good performance under different light conditions are properties to be emphasized. All the results presented in this section highlight the suitability of event cameras for onboard vision in ornithopters. However, the growing interest in R&D for ornithopter technology is leading to platforms with higher payload capacities, with smoother flapping strokes, and capable of performing smoother trajectories. This may enable the integration of multi-sensor approaches. The use of event cameras with additional sensors can enhance the performance and reliability of perception systems by mitigating individual sensor limitations, leading to a more robust and adaptable perception system through data fusion.
Integration Challenges
This section aims to answer Q2: Are event cameras suitable to be integrated (HW&SW) on flapping-wing robots considering the constraints of these platforms? The strict payload and weight distribution requirements of the flapping-wing robots severely restrict the integration of onboard sensors. The size and weight of the camera are critical and can severely affect the flight ability and controllability of these platforms. Recent advances in event-based technology point to an increasing miniaturization of devices with increasing spatial resolution—now over 1 Mpx.[34,56] In addition, the limited payload of the ornithopter prevents the use of high-capacity batteries. Hence, low energy consumption becomes a critical aspect to be considered. Frame-based camera technology has been in development for years. Consequently, there are a large number of algorithms, datasets, and benchmarks available. On the contrary, event-based vision does not have such a long history, and the direct use of frame-based algorithms and data is not always feasible. Therefore, from a software integration point of view, it is also necessary to analyze the availability of resources and support tools. In this section, we analyze the integrability of event cameras onboard flapping-wing robots by reviewing sensor miniaturization, resolution, energy consumption, and the availability of resources and support tools.
Different large-scale flapping-wing aerial platforms have been developed in recent years, such as RoBird,[57] the Festo SmartBird, RoboRaven,[58–61] and Beihawk.[62] Other bioinspired platforms with a shorter wingspan are Bat Bot,[63] Dove,[64] Thunder I,[65] and UST-Bird.[66] Besides these platforms, other small-scale and insect-like solutions[67–69] are DelFly,[6,11,70] the Flapper Nimble+ by Flapper Drones, the Nano Hummingbird,[71] KUBeetle-S,[72,73] MetaFly by BionicBird, and Robobee[45,74–78] (which inspired the robots in refs. [49,50] and Bee+[79]). The GRIFFIN ERC Advanced Grant project (GRIFFIN, Action 7 882 479: General compliant aerial Robotic manipulation system Integrating Fixed and Flapping-wings to INcrease range and safety) aims to develop novel ornithopters with advanced perception and manipulation capabilities. One of the platforms developed in this project is E-Flap,[5,80–83] the first flapping-wing platform with onboard processing and perception capabilities, a remarkable payload, and the ability to fly at low speed, enabling interaction with the environment.[84–86] In addition, the Hybrid robot[87] offers autonomous navigation capabilities with an onboard autopilot capable of switching between fixed-wing and flapping-wing flight modes. Other research projects of great relevance include the PortWings ERC Advanced Grant[88,89] and the DelFly Project.[2,6] To bridge the gap between the experimental setup and real-world conditions, we conducted an analysis to examine the feasibility of integrating the event cameras detailed in Table 2 into several flapping-wing robots (encompassing data from both research literature and commercially available designs) listed in Table 3. The main goal is to provide insights into the possibility of enhancing the performance of the robots and expanding their operational capabilities through the use of advanced event-based perception technology.
Table 2 Summary of the main specifications of some event cameras from various companies. Data collected from ref. [18] and from manufacturers’ datasheets and product briefs.
| Manufacturer | Camera | Sensor | Ref. | Weight [g] | Dimensions [mm] | Consumption | Max BW [MEPS] | Latency [μs] | Spatial Res. [px] |
| iniVationd) | DVS128 | [26] | 65 | 40 × 60 × 25 | 60 mA@5 v | 1 | >12 | 128 × 128 | |
| DVS240 | [32] | 75 | 40 × 60 × 25 | 180 mA@5 v | 12 | >12 | 240 × 180 | ||
| DAVIS240 | [32] | 75 | 56 × 55 × 27 | 180 mA@5 v | 12 | >12 | 240 × 180 | ||
| DAVIS346 | – | 100 | 40 × 60 × 25 | 180 mA@5 v | 12 | <1000 | 346 × 260 | ||
| DVXplorer | – | 100 | 40 × 60 × 25 | 140 mA@5 v | 165 | <1000 | 640 × 480 | ||
| DVXplorer Lite | – | 75 | 40 × 60 × 25 | 140 mA@5 v | 100 | <1000 | 320 × 240 | ||
| DVXplorer Mini | – | 21 | 29 × 29 × 32 | 140 mA@5 v | 450 | <1000 | 640 × 480 | ||
| Prophesee | EVK1 | ATIS | [30] | 82.5 | 60 × 38 × 50 | 50–175 mWb) | – | >3 | 304 × 240 |
| Gen3 ATIS | – | 82.5 | 60 × 38 × 50 | 25–87 mWb) | 66 | 40–200 | 480 × 360 | ||
| Gen3 CD | – | 82.5 | 60 × 38 × 50 | 36–95 mWb) | 66 | 40–200 | 640 × 480 | ||
| Gen4 CD | [27] | 82.5 | 60 × 38 × 50 | 32–84 mWb) | 1066 | 20–150 | 1280 × 720 | ||
| EVK2 | Gen4 CD | [27] | 260 | 102 × 58 × 42 | 7500 mW | 1066 | 220 | 1280 × 720 | |
| EVK3 | Gen3 CD | – | 112 | 108 × 76 × 45 | 4500 mW | 1.6 Gbps | 220 | 640 × 480 | |
| Gen4 CD | [27] | 112 | 108 × 76 × 45 | 4500 mW | 1.6 Gbps | 220 | 1280 × 720 | ||
| GenX320 | – | 112 | 108 × 76 × 45 | 4500 mW | 1.6 Gbps | 220 | 320 × 320 | ||
| EVK4 | IMX636Esc) | – | 40 | 30 × 30 × 36 | 500 mW | 1.6 Gbps | 220 | 1280 × 720 | |
| Samsung | DVS-Gen2 | [33] | no case | no case | 27–50 mWb) | 300 | 65–410 | 640 × 480 | |
| DVS-Gen3 | – | no case | no case | 40 mWb) | 600 | 50 | 640 × 480 | ||
| DVS-Gen4 | [34] | no case | no case | 130 mWb) | 1200 | 150 | 1280 × 960 | ||
| CelePixele) | CeleX-IV | [156] | no case | no case | – | 200 | >10 | 768 × 640 | |
| CeleX-V | [56] | no case | no case | 400 mWb) | 140 | >8 | 1280 × 800 | ||
| Insightnessf) | Rino3 | – | 15 | 350 × 350 | 20–70 mWb) | 20 | 125 | 320 × 262 |
Table 3 Summary of the main characteristics of some flapping-wing robots from various research institutions and manufacturers with their specifications.
| Robot | Manufacturer/Institution | Weight [g] | Wingspan [m] | Flap. Freq. [Hz] | Payload [g] | Battery [mAh@LiPo cells] |
| Robobee[45] | Harvard U. | 0.075 | 0.035 | 120 | – | – |
| Four-wings[157] | UW | 0.143 | 0.056 | 160 | – | – |
| Bee+[79] | USC | 0.095 | 0.033 | 100 | – | – |
| DelFly Micro[6] | TU Delft | 3.07 | 0.10 | 30 | 0.4a) | 100-120@1S |
| DelFly Explorer[11] | TU Delft | 20 | 0.28 | 10–14 | 4a) | 250@1S |
| DelFly Nimble[70] | TU Delft | 29 | 0.33 | 17 | 4a) | 250@1S |
| Flapper Nimble+b) | Flapper Drones | 102 | 0.49 | 12–20 | 25 | 300@2S |
| N. Hummingbird[71] | AeroVironment | 19 | 0.165 | – | – | – |
| KUBeetle-S[73] | Konkuk U. | 15.8 | 0.20 | 18 | 4 | 160@1S |
| MetaFlyc) | BionicBird | 9.5 | 0.29 | – | – | – |
| MetaBirdd) | BionicBird | 9.5 | 0.33 | – | – | – |
| X-Flye) | BionicBird | 12 | 0.38 | – | – | 60@1S |
| Bat Bot[63] | UIUC | 93 | 0.469 | 10 | – | – |
| Dove[64] | NWPU | 220 | 0.50 | 12 | – | – |
| Thunder I[65] | NMSU | 350 | 0.70 | 5.88 | – | 800@3S |
| UST-Bird[66] | USTB | 83.2 | 0.80 | 4 | 18 | 180@3S |
| USTB-Hawk[158] | USTB | 985 | 1.78 | 4.108 | 192 | 7000@3S |
| RoboRaven I[61] | UMD | 285 | 1.168 | 4 | 43.8 | 370@2S |
| RoboRaven II[61] | UMD | 301.6 | 1.330 | 4 | 80.4 | 370@2S |
| RoboRaven III[61] | UMD | 317 | 1.330 | 4 | 71 | 370@2S |
| RoboRaven IV[61] | UMD | 438.1 | 1.168 | 4 | 272.9 | 370@2S |
| RoboRaven V[61] | UMD | 438.1 | 1.168 | 4 | 272.9 | 950@2S |
| Beihawk[62] | Beihang U. | 1200 | 1.50 | 10 | – | – |
| E-Flap[5] | U. Seville | 510 | 1.50 | 5.5 | 520 | 450@4S |
| Hybrid[87] | U. Seville | 930 | 1.50 | 3.0 | 300 | 450@4S |
| RoBird B. Eagle[57] | U. Twentel) | 2100 | 1.76 | 4 | 1000 | – |
| RoBird P. Falcon[57] | U. Twentel) | 730 | 1.12 | 5.5 | 100 | – |
| SmartBirdf) | Festo | 450 | 2.00 | – | – | 450@2S |
| BionicOpterg) | Festo | 175 | 0.63 | 15–20 | – | – |
| eMotionButterfliesh) | Festo | 32 | 0.50 | 1–2 | – | 7.4@2S |
| BionicFlyingFoxi) | Festo | 580 | 2.28 | – | – | – |
| BionicSwiftj) | Festo | 42 | 0.68 | – | – | – |
| BionicBeek) | Festo | 34 | 0.24 | 15–20 | – | 300@1S |
Miniaturization
Similarly to frame-based vision systems, the improvement in event vision technology has led to smaller sensors, as evidenced in Figure 8. This clear trend toward sensor miniaturization supports the possibility of developing smaller and lighter event cameras. Figure 9 shows the evolution in weight and volume of some commercial devices manufactured by iniVation, one of the most relevant event camera companies. There is a widespread trend among several event camera manufacturers toward achieving levels of miniaturization comparable to those of frame-based vision sensors. It is also worth mentioning SPECK (https://www.synsense.ai/products/speck-2) by SynSense, the first fully event-based neuromorphic vision system-on-chip (SoC), which integrates a DVS and weighs only a few grams. To assess the feasibility of integrating event camera models into existing flapping-wing robots, Table 4 presents a comparison between the original weights of these cameras and the payload capacities of the ornithopters. While this analysis provides a preliminary insight into the potential for incorporating event-based sensors, a successful integration would also require consideration of additional factors such as 1) weight distribution; 2) volume; 3) camera connector and cable weights; and 4) weight reduction (e.g., replacement of heavy cases with lighter ones), among others.
[IMAGE OMITTED. SEE PDF]
[IMAGE OMITTED. SEE PDF]
Table 4 Comparison of the weight of event-based vision cameras (Table 2) and the payload capacity of flapping-wing robots (Table 3). A green cell indicates that the camera could be mounted onboard (considering payload restrictions only), while a red cell indicates the opposite.
[TABLE CONTENT OMITTED. SEE PDF]
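The criterion behind Table 4 reduces to a simple payload comparison, as in the sketch below; the optional margin factor is an assumption added here to account for cables and mounts, not part of the table, which considers payload restrictions only.

```python
def payload_compatible(camera_weight_g, robot_payload_g, margin=1.0):
    """Return True if the camera weight fits within the robot payload.

    margin < 1 reserves part of the payload for connectors, cables, and
    mounting hardware (an illustrative assumption, not from Table 4)."""
    return camera_weight_g <= margin * robot_payload_g

# Examples with values taken from Tables 2 and 3:
# payload_compatible(100, 520)  -> True   (DAVIS346 on E-Flap)
# payload_compatible(100, 25)   -> False  (DAVIS346 on Flapper Nimble+)
```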
Spatial Resolution
High-resolution cameras offer more detailed information about the scene, leading, for example, to finer grain mapping or more accurate detection algorithms. Event cameras are trending toward higher sensor resolutions, as illustrated in Figure 10. However, high resolution involves some drawbacks, such as increased weight and power consumption. There are different opinions on the benefits of using high-resolution event cameras to solve standard computer vision tasks. Higher resolutions involve higher computational resources to process the generated events. In particular, Gehrig and Scaramuzza[90] pointed out that low-resolution event cameras can achieve better performance than high-resolution cameras while using significantly less bandwidth.
[IMAGE OMITTED. SEE PDF]
Energy Consumption
Ornithopters’ payload constraints also affect the onboard batteries. Therefore, the power consumption of the different components mounted onboard requires careful consideration. First, it has been experimentally demonstrated that flapping-wing robots consume about 90% less energy in gliding flight mode than in flapping flight mode.[91] Thus, planning the flight stages to minimize flapping and manage energy efficiently is highly relevant. In addition, the results in ref. [87] suggest that flapping can be more efficient under certain conditions. The importance of gliding to save energy is a major paradigm shift with respect to multirotor platforms. Event cameras fit this shift perfectly. Whereas frame-based cameras always generate images at a constant rate, the event generation rates of event cameras depend on the motion speed. Event cameras are not only more energy efficient than frame cameras, but their consumption is also reduced at lower event generation rates.[91] During gliding, event cameras have lower event generation rates than during flapping, resulting in lower energy consumption. In addition, lower event generation rates imply less information to be processed by the onboard computer and, consequently, further energy consumption reductions.
To assess the relation between the event generation rate and the power consumption of the camera, and the potential limitation this consumption could pose to the integration of these sensors onboard an ornithopter, we selected a DAVIS346 (180 mA@5 v in Table 2) and mounted it on the motorized mechanism described in Appendix A. We selected this model because it consumes the most power among all the cameras we have available, serving as an upper bound for many other models with lower power consumption. The setup incorporated a VectorNav VN-200 inertial navigation system to measure pitch rates and an INA219 power monitor to measure power consumption. The camera power consumption was recorded as the pitch rate progressively increased from 0 to 6 Hz (a typical operational range for various flapping-wing robots, as seen in Table 3). The power consumption of event cameras is influenced by the number of triggered events; hence, the experiment was conducted in three types of scenarios with different levels of internal motion: static (no motion in the scene; 0.04–0.24 MEPS), low-dynamic (some objects in the scene moved slowly; 0.43–3.11 MEPS), and high-dynamic (some objects moved fast; 1.68–11.58 MEPS). Figure 11 presents the average power consumed by the camera as a function of the pitch rate. The results include the consumption not directly produced by event generation (e.g., APS, IMU, noisy events, LEDs, other electronic components), which is ≈0.7 W, measured with the camera completely static in a scenario without movement. Instantaneous pitch rates were derived from the VectorNav's attitude measurements using the Hilbert transform. The results revealed that the camera's power consumption is compatible with the typical battery capacities of the ornithopters listed in Table 3, even in the cases with higher consumption (high-dynamic scenario), with the exception of very small-scale devices such as Robobee, MetaFly, and DelFly Micro, among others. Therefore, both the DAVIS346 and other lower-consumption models (e.g., DVS240, DVXplorer, Prophesee Gen3 CD/ATIS, Samsung DVS-Gen2–4, among others) seem to be suitable for mounting onboard most ornithopters from the energy requirements perspective.
[IMAGE OMITTED. SEE PDF]
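The derivation of instantaneous pitch rates from the attitude measurements can be sketched as follows using the Hilbert transform; the sampling frequency and the mean-removal step are assumptions of this illustration.

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_pitch_rate(pitch_deg, fs_hz):
    """Estimate the instantaneous oscillation frequency [Hz] of the pitch signal.

    pitch_deg: evenly sampled pitch angle measurements (e.g., from the VN-200).
    fs_hz:     sampling frequency of the attitude measurements.
    """
    centered = pitch_deg - np.mean(pitch_deg)       # remove the static offset
    analytic = hilbert(centered)                    # analytic signal
    phase = np.unwrap(np.angle(analytic))           # instantaneous phase [rad]
    return np.diff(phase) * fs_hz / (2.0 * np.pi)   # phase rate -> frequency [Hz]
```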
Resources and Support Tools
The growing interest in event cameras has fostered a growing research community that is rapidly increasing the number of publications in the field and the availability of resources. The Event-based Vision Resources collection is one of the largest resource compilations for event vision research. It includes publications classified by category, datasets, software, and other relevant information. Event cameras have also received strong interest from the industry. Companies such as iniVation (acquired by SynSense) and Prophesee not only manufacture event-based vision sensors but also offer comprehensive software tools to enhance their usage. For example, Prophesee presents the Metavision SDK (Metavision Intelligence Suite) built upon the OpenEB open-source architecture, and iniVation provides a framework based on the DV GUI, complemented by DV-processing, an open library of reusable algorithms. In addition, major companies such as Samsung[33,34,92–95] and Sony[96,97] are actively contributing to the production of these event-based visual sensors. Other companies like CelePixel (acquired by OmniVision/Will Semiconductor), Insightness (acquired by Sony), and SynSense have also developed their own products. The community can also benefit from various open-source libraries and tools such as jAER, libcaer, the RPG ROS DVS package, Prophesee's Metavision, the iniVation SDK, the CelePixel SDK, and ASAP,[98,99] which facilitate the deployment of software applications and research implementations.
Publicly available datasets play a key role in the development of novel event vision algorithms. Datasets have been developed for general computer vision, aerial robotics, and some specific applications. For tasks such as optical flow, motion segmentation, HDR image reconstruction, visual odometry, and SLAM, datasets such as the Event Camera Dataset,[100] EVIMO,[101,102] TUM-VIE,[103] EDS,[104] DSEC,[105] VECtor,[106] ECMD,[107] HKU,[108] and others presented in refs. [42,109,110] offer diverse scenarios for the development and evaluation of event-based vision algorithms. In aerial robotics, the multivehicle stereo event camera dataset (MVSEC),[111] UZH-FPV drone racing dataset,[112] and multi-robot, multi-sensor, multi-environment event dataset (M3ED),[113] among many others, include sequences of event camera data along with other sensors onboard different aerial vehicles within diverse scenarios including urban and forest environments. Regarding ornithopter onboard perception, the only available dataset is the GRIFFIN Perception Dataset.[4] Furthermore, other specialized datasets such as CED,[114] MNIST-DVS,[115] ATIS PLANE,[116] DVS Gesture Dataset,[117] and PEDRo[118] are tailored to specific applications, providing researchers with resources to address a wide variety of challenges in event-based vision.
Event camera simulators also play a crucial role in the generation of synthetic data for various scenarios. ESIM[119] provides an open-source platform whose architecture tightly integrates the rendering engine and the event simulator for adaptive sampling of the visual signal. The work in ref. [120] proposes v2e, a toolbox to generate realistic synthetic DVS events from intensity frames. The work in ref. [121] presents an extended DVS pixel simulator, which simplifies the latency and noise models. Finally, ref. [122] provides a real-time simulation of events using a frame-based camera.
Discussion
Commercially available event cameras are now suitable for integration in flapping-wing robots due to their miniaturization, robustness, and energy efficiency, all of which address the constraints of these platforms. The advances in event camera technology, including small, lightweight designs and higher resolutions, are aligned with the strict payload and weight requirements of flapping-wing robots. However, the superiority in miniaturization and ease of integration of frame-based technologies, primarily due to many more years of advancement and development, is undeniable. Tiny and lightweight devices are available at very affordable prices, while event cameras still maintain high prices. Similarly, the amount of software resources for frame-based vision is immense (e.g., OpenCV, scikit-image, and numerous datasets). Despite the efforts to standardize frameworks, event-based vision researchers still tend to use custom libraries and tools. Although event cameras are viable for HW&SW deployment into low-payload platforms such as ornithopters, such integration is still challenging, and a meticulous design is critical to ensure that all requirements and constraints are fulfilled.
Processing Challenges
Using event cameras for onboard ornithopter perception requires event-based algorithms capable of coping with the strong vibrations and sudden motions of flapping-wing flight while fulfilling the strict computational requirements to enable online execution on the resource-constrained computers that can be mounted on ornithopters. Addressing these requirements poses significant challenges. This section aims to answer Q3: Are the performance and computational cost of event-based algorithms feasible to enable online onboard perception tasks for ornithopters? First, event-based algorithms should be executed online, which is very challenging considering the high event generation rates while the ornithopter is performing agile motions or flying in flapping mode. In addition, ornithopter vibrations and sudden motions during flapping can significantly degrade the performance of event-based algorithms that work successfully under less demanding conditions. In general, the performance of event-based algorithms onboard ornithopters cannot always be predicted. For instance, event methods based on frame-like representations (e.g., event images and time surfaces) may be primarily affected by the strong vibrations when the ornithopters fly in flapping mode but can be satisfactory when they fly in gliding mode. Conversely, event-by-event processing tends to offer better adaptability to gliding and flapping modes but may become computationally intensive when large event rates need to be processed, which can cause processing bottlenecks.
In this section, we analyze the performance and computational cost of event-based vision methods for common perception tasks: corner detection, line detection, feature tracking, object tracking, and visual odometry. The analyses were performed using experimental datasets recorded in ornithopter flights in indoor and outdoor scenarios, which involve very different conditions that strongly affect the performance and computational cost of event-based algorithms. Indoors, the ornithopter flies only in flapping mode (outdoors, in both flapping and gliding modes), involving stronger vibrations and higher event generation rates. In addition, the flights and distances to objects indoors are shorter, and the lighting conditions are more uniform than outdoors. In this evaluation, we have used data from our previous contribution[20] recorded at the GRVC Robotics Lab indoor testbed (hereinafter, the TORRICELLI dataset) together with new outdoor flights (hereinafter, the SAETA dataset) recorded specifically for this work at two airfields near Seville (Spain). New sequences were recorded using a modified version of the Hybrid platform[87] without propellers (see Figure 1). It was equipped with a DAVIS346 event camera, an ELP RGB monocular camera, a Matek H743 Mini V3 autopilot, and a Khadas VIM3 as the processing unit. The DAVIS346 has two sensors: a DVS that outputs events and a frame-based APS. Further details of the platforms used for data acquisition are summarized in Appendix B. Both the TORRICELLI and SAETA datasets are publicly available. A detailed description of the datasets and the sensor calibration process is presented in Appendix C.
All algorithms were executed on the same single-board computer, a Khadas VIM3 (Ubuntu 20.04, 2.2 GHz quad-core ARM Cortex-A73 + 1.8 GHz dual-core ARM Cortex-A53, 4 GB DDR4 RAM) that is installed on board the robots. We use the Real-Time Factor (RTF) for computational cost evaluation. Given a set of sequences {S1, …, SN} of events or images, where d(Sk) is the duration of the k-th sequence and t(Sk) is the time the algorithm requires to completely process the k-th sequence, the mean RTF is defined as RTF = (1/N) Σk t(Sk)/d(Sk), so that values below 1 indicate faster-than-real-time processing.
We also use the algorithm Rate as an evaluation metric, defined as the number of events (or frames) processed per second, i.e., the frequency at which the algorithm provides its output.
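Both metrics reduce to simple ratios; the following sketch summarizes how they are computed in this evaluation, assuming that the per-sequence processing times and durations are measured externally.

```python
def mean_rtf(processing_times_s, durations_s):
    """Mean Real-Time Factor: average ratio of processing time to sequence
    duration. RTF < 1 means the algorithm runs faster than real time."""
    ratios = [t / d for t, d in zip(processing_times_s, durations_s)]
    return sum(ratios) / len(ratios)

def algorithm_rate(n_items_processed, processing_time_s):
    """Algorithm Rate: events (or frames) processed per second of computation."""
    return n_items_processed / processing_time_s
```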
Corner Detection
We evaluate the performance of well-known event-based corner detection algorithms when processing events collected on a flapping-wing robot. The eHarris,[123] eFast,[124] and Arc[125] corner detection methods are selected due to their remarkable performance and ability to exploit the asynchronous nature of event cameras. For a fair validation, eFast and eHarris also integrate the event filter included in ref. [125]. Corners are extracted from checkerboards in the TORRICELLI and SAETA datasets, and their Ground Truth (GT) references are manually annotated from grayscale frames. Corner detection performance is measured using the Accuracy, Precision, Recall, and F1-score metrics. In this context, a True Positive corresponds to a corner feature that lies within a radius of 3.5 px from the GT; otherwise, it is set as a False Positive. In contrast, False Negatives are GT references without any detected corner within a distance of 3.5 px. To ensure temporal consistency between GT references and event corner features, we consider only features in a time window of 5 ms before the timestamp of each GT sample. Moreover, our evaluation includes the Harris corner detector[126] implemented in the OpenCV library to compare the performance of the event-based detectors against a well-known frame-based method. Table 5 summarizes the average results obtained in the TORRICELLI and SAETA datasets. eHarris reports small performance improvements compared to its frame-based counterpart. eFast and Arc report lower performance than Harris except in Precision, as the event-based methods present very few false positives. Regarding computational cost, we evaluated the RTF for each corner detection algorithm on the Khadas VIM3 onboard computer. Arc was the only event-based corner detector that operated with an RTF lower than 1, indicating that it could process events in real time even under the high event rates generated during flapping-wing flights.
Table 5 Average corner detection results. KEPS stands for thousand events per second, FPS stands for frames per second.
| | | | | TORRICELLI | | | | SAETA | | | |
| Method | Ref. | RTF | Rate | Accuracy | Precision | Recall | F1-score | Accuracy | Precision | Recall | F1-score |
| eHarris | [123] | 46.67 | 36.27 KEPS | 0.91 | 0.89 | 0.91 | 0.90 | 0.96 | 0.98 | 0.92 | 0.96 |
| eFast | [124] | 2.78 | 605.39 KEPS | 0.79 | 0.96 | 0.60 | 0.74 | 0.90 | 0.97 | 0.77 | 0.87 |
| Arc | [125] | 0.83 | 2005.73 KEPS | 0.88 | 0.97 | 0.80 | 0.88 | 0.93 | 0.97 | 0.86 | 0.93 |
| Harris | [126] | 0.30 | 40 FPS | 0.89 | 0.97 | 0.81 | 0.88 | 0.95 | 0.96 | 0.92 | 0.95 |
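For clarity, the matching rule used to compute the corner detection metrics (3.5 px radius, one detection per GT corner) can be sketched as follows; the greedy one-to-one assignment is an implementation choice of this illustration, and the 5 ms temporal filtering is assumed to be applied beforehand.

```python
import numpy as np

def match_corners(detections, gt_corners, radius_px=3.5):
    """Match detected corners against ground-truth corners.

    detections: (N, 2) array of detected corner coordinates (already limited to
                the 5 ms window preceding the GT timestamp).
    gt_corners: (M, 2) array of manually annotated GT corners.
    Returns (true_positives, false_positives, false_negatives).
    """
    gt_used = np.zeros(len(gt_corners), dtype=bool)
    tp = fp = 0
    for det in detections:
        dists = np.linalg.norm(gt_corners - det, axis=1) if len(gt_corners) else np.array([])
        if len(dists):
            dists[gt_used] = np.inf              # each GT corner matched at most once
        if len(dists) and dists.min() <= radius_px:
            tp += 1
            gt_used[int(np.argmin(dists))] = True
        else:
            fp += 1
    fn = int((~gt_used).sum())                   # GT corners left without a detection
    return tp, fp, fn
```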
Line Detection
We opted to evaluate custom non-optimized implementations of ELiSeD [ref. 127, Algorithm 1] and the event-based line detector in [ref. 128, Algorithm 1] (hereinafter, eLD; although the original work uses an ATIS event camera, the light intensity measurement is not used by the method). We set the circular buffer of ELiSeD to a fixed size of 5000 events. Although ELiSeD produces line segments from event clusters (support regions, [ref. 127, Section 3B]), we also computed the (ρ, θ) coordinates of the detected lines for comparative purposes. Likewise, although eLD does not directly require the value θ of the line, we calculated it for evaluation. Additionally, we used the Hough transform implemented in the OpenCV library as a detection baseline for grayscale frames. We evaluated the detection of two horizontal lines located on one of the target boards. Lines not intersecting the target board, or whose orientation θ deviated significantly from that of a horizontal line (i.e., (ρ, 0) in our convention), were not considered for evaluation. We used the mean distance Eρ between the detected and ground truth (GT) lines, and the mean number of detected lines NL, to quantify the detection quality. The distances were normalized by the height of the target board in the image plane to make the metric independent of the distance between the robot and the target. The GT lines were obtained by manually labeling the images from the APS. In the case of ELiSeD, metrics were computed for each GT line using the closest-in-time detection. Table 6 summarizes the results obtained in the TORRICELLI and SAETA sequences. Regarding computational performance, the RTF values for both ELiSeD and eLD indicate that the implementations are close to real-time performance when executed on the Khadas VIM3, which has very limited computational resources. This suggests that, with code optimization and parallelization, these event-based line detection methods may be suitable for online onboard processing during flapping-wing flights.
Table 6 Average line detection results. Notation: KEPS stands for thousand events per second, FPS stands for frames per second.
| | | | | TORRICELLI | | SAETA | |
| Method | Ref. | RTF | Rate | Eρ | NL | Eρ | NL |
| ELiSeD | [127] | 1.69 | 694.49 KEPS | 0.52 | 1.88 | 0.43 | 1.74 |
| eLD | [128] | 1.43 | 720.12 KEPS | 0.54 | 1.90 | 0.45 | 1.78 |
| Hough | [159] | 0.82 | 40 FPS | 0.44 | 1.78 | 0.41 | 1.77 |
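The normalized line distance Eρ can be sketched as follows for the near-horizontal lines considered here; under the convention used above, the ρ difference approximates the pixel distance between a detected line and its GT counterpart.

```python
def line_distance_error(rho_detected, rho_gt, board_height_px):
    """Normalized distance between a detected (near-horizontal) line and its GT
    line. Dividing by the target-board height in the image plane makes the
    metric independent of the robot-to-target distance."""
    return abs(rho_detected - rho_gt) / board_height_px
```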
Feature Tracking
We have evaluated the performance of the well-known eKLT method[129] onboard our flapping-wing platform. We have also included results using the frame-based KLT algorithm implemented in the OpenCV library.[130] We use the track-normalized error (TNE) and relative feature age (RFA) metrics defined in [ref. 129, Section 5.2.2]. However, we manually annotated the corners of a checkerboard as GT instead of using KLT features. Tracks are initialized using GT, and we assume that a feature has been lost if its error, in terms of Euclidean distance in pixels, with respect to the GT is higher than 2 px. The results obtained are presented in Table 7, and the evolution of the feature tracks corresponding to the four external corners of the checkerboard in the boards_indoors_2 sequence is shown in Figure 12. Although the accuracy and feature age of KLT are better than those of eKLT, it must be considered that the use of events enables continuous tracking between frames (i.e., higher temporal resolution), which is critical for tracking fast-moving objects. In terms of computational cost, eKLT has a high RTF when executed on the Khadas VIM3. As stated in [ref. 129, Section 5.3], eKLT "is able to process about 17 000 events per second", while the dataset used by the authors "reaches between 54 000–130 000 events per second", and hence "there is room for improvement using a more distributed, i.e., parallelized, platform". We ran eKLT on a laptop computer (Intel i7, 16 cores, 2.7 GHz, 16 GB RAM), obtaining RTF = 13.17 and Rate = 15.28 KEPS. This indicates that it processes events slower than real time, making it unsuitable for online processing under the high event rates experienced during flapping-wing flight.
Table 7 Feature tracking evaluation. TNE stands for track-normalized error, and RFA stands for relative feature age. For comparison, we report the results from one indoor (poster_6dof) and one outdoor (outdoor_forward5) sequence in ref. [129], where TNE∗ and RFA∗ use KLT features as GT. KEPS stands for thousand events per second, FPS stands for frames per second.
| Method | References | RTF | Rate | TNE (TORRICELLI) | RFA (TORRICELLI) | TNE (SAETA) | RFA (SAETA) | TNE∗ (poster_6dof) | RFA∗ (poster_6dof) | TNE∗ (outdoor_forward5) | RFA∗ (outdoor_forward5) |
| eKLT | [129] | 66.32 | 9.03 KEPS | 0.97 | 0.20 | 1.41 | 0.11 | 0.64 | 0.45 | 0.80 | 0.25 |
| KLT | [130] | 0.35 | 40 FPS | 0.84 | 1.00 | 0.91 | 1.00 | – | – | – | – |
[IMAGE OMITTED. SEE PDF]
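For reference, the sketch below illustrates the frame-based KLT baseline: tracks are initialized from the manually annotated GT corners and declared lost when the Euclidean error with respect to the GT exceeds 2 px, as described above. The Lucas-Kanade parameters and helper names are illustrative assumptions and do not reproduce our exact configuration.

```python
import cv2
import numpy as np

LOST_THRESHOLD_PX = 2.0  # a feature is considered lost beyond this GT error (as in the text)

def track_with_klt(frames, gt_tracks):
    """Track annotated corners across grayscale frames with pyramidal Lucas-Kanade.

    frames: list of grayscale images.
    gt_tracks: array of shape (num_frames, num_corners, 2) with GT corner positions.
    Returns, for each feature, the list of (frame_index, x, y) until the track is lost.
    """
    lk_params = dict(winSize=(21, 21), maxLevel=3,
                     criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 30, 0.01))
    prev = frames[0]
    pts = gt_tracks[0].astype(np.float32).reshape(-1, 1, 2)  # initialize tracks from GT
    alive = np.ones(len(pts), dtype=bool)
    tracks = [[(0, *p.ravel())] for p in pts]

    for k in range(1, len(frames)):
        nxt, status, _ = cv2.calcOpticalFlowPyrLK(prev, frames[k], pts, None, **lk_params)
        for i, (p, ok) in enumerate(zip(nxt, status.ravel())):
            if not alive[i]:
                continue
            err = np.linalg.norm(p.ravel() - gt_tracks[k, i])
            if not ok or err > LOST_THRESHOLD_PX:
                alive[i] = False  # feature lost: error w.r.t. GT exceeds the threshold
            else:
                tracks[i].append((k, *p.ravel()))
        prev, pts = frames[k], nxt
    return tracks
```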
Object Tracking
We evaluated an adapted version of the event-based clustering and tracking method in [ref. 131, Section 3]. GT was obtained by manually annotating objects in the sequences and was used to initialize the tracks. Clusters are initialized using N events. An event at coordinates x is inserted into the cluster if its Euclidean distance to the cluster centroid is smaller than R. After every insertion, events older than τ are removed from the cluster. The parameters were experimentally selected: N = 400, R = 30 px, and τ = 0.02 s. The tracking error is calculated as the distance between the centroid of the cluster and the GT centroid. The results are presented in Table 8. Frame-based CSRT,[132] KCF,[133] and MOSSE[134] trackers from the OpenCV library were also tested for comparison. Among the frame-based methods, only CSRT could track the target without losing it, although with an RTF greater than 1 (i.e., slower than real time). In contrast, the event-based algorithm tracked the cluster throughout the sequences while achieving a very low RTF.
Table 8 Object tracking evaluation. Errors (in pixels) with respect to the ground truth. eC&T stands for event-based clustering and tracking. MEPS stands for million events per second, FPS stands for frames per second.
| Method | References | RTF | Rate | Mean (TORRICELLI) | Std dev (TORRICELLI) | Median (TORRICELLI) | Max (TORRICELLI) | Mean (SAETA) | Std dev (SAETA) | Median (SAETA) | Max (SAETA) |
| eC&T | [131] | 0.33 | 20.41 MEPS | 2.87 | 1.19 | 2.86 | 5.55 | 2.24 | 1.61 | 1.84 | 9.92 |
| CSRT | [132] | 1.36 | 40 FPS | 2.70 | 2.86 | 1.41 | 11.04 | 1.80 | 1.13 | 1.41 | 4.47 |
| KCF | [133] | – | – | – | – | – | – | – | – | – | – |
| MOSSE | [134] | – | – | – | – | – | – | – | – | – | – |
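The sketch below illustrates our reading of the clustering and tracking scheme described above (insert an event if it lies within R px of the cluster centroid, then discard events older than τ). It is a simplified illustration under those assumptions, not the adapted implementation evaluated in Table 8.

```python
from collections import deque
import numpy as np

class EventClusterTracker:
    """Simplified event cluster tracker: keeps recent events near the cluster
    centroid and reports the centroid as the object position."""

    def __init__(self, init_events, radius_px=30.0, tau_s=0.02):
        # init_events: iterable of (t, x, y); in our tests N = 400 events were used
        self.radius = radius_px
        self.tau = tau_s
        self.events = deque((float(t), float(x), float(y)) for t, x, y in init_events)

    @property
    def centroid(self):
        pts = np.array([(x, y) for _, x, y in self.events])
        return pts.mean(axis=0)

    def update(self, t, x, y):
        """Insert the event if it lies within `radius` of the centroid,
        then drop events older than `tau`."""
        if np.linalg.norm(np.array([x, y]) - self.centroid) < self.radius:
            self.events.append((t, x, y))
            while self.events and t - self.events[0][0] > self.tau:
                self.events.popleft()
        return self.centroid
```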
Visual Odometry
Visual odometry is an essential perception task that enables autonomous navigation capabilities. However, the strong rotational and translational motions and vibrations caused by flapping-wing flight impose severe challenges even for well-consolidated frame-based visual odometry methods resulting from decades of research. Significant efforts in the field of visual odometry have resulted in the emergence of numerous methods capable of operating under diverse conditions and scenarios.[104,135–148] However, we obtained very low accuracy with several event-only approaches when evaluating the gps_outdoors_1 and gps_outdoors_2 sequences from the SAETA dataset and several sequences from the GRIFFIN Perception Dataset.[4] In many cases, the drift in the trajectory was several orders of magnitude greater than the length of the GT trajectory. Although tests in a broader range of conditions are necessary to draw definitive conclusions, this evidences the need for further research in this area.
Figure 13 shows the trajectories estimated with the DVS and APS sensors in one of the outdoor sequences using VINS-Mono.[149] For the DVS, we used event images generated with temporal windows of 30 ms. Although VINS-Mono is designed to operate on intensity images, we were able to estimate the motion of the robot with event images. As shown, the estimation with event images is comparable to that obtained when processing frame images from the APS. We could not execute VIO in real time onboard the Khadas VIM3; a laptop computer (Intel i7, 16 cores, 2.7 GHz, 16 GB RAM) was needed to achieve real-time performance.
[IMAGE OMITTED. SEE PDF]
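As an illustration of the event-image generation used to feed VINS-Mono, the sketch below accumulates per-pixel event counts over 30 ms windows and normalizes them to an 8-bit image. The accumulation and normalization scheme shown is an assumption for clarity rather than our exact implementation.

```python
import numpy as np

def events_to_image(events, height=260, width=346, window_s=0.03, t0=None):
    """Accumulate events within a temporal window into an 8-bit image.

    events: array of shape (N, 4) with columns (t, x, y, polarity).
    Returns a grayscale image where brighter pixels received more events.
    """
    t = events[:, 0]
    if t0 is None:
        t0 = t[0]
    mask = (t >= t0) & (t < t0 + window_s)
    win = events[mask]
    img = np.zeros((height, width), dtype=np.float32)
    np.add.at(img, (win[:, 2].astype(int), win[:, 1].astype(int)), 1.0)  # rows = y, cols = x
    if img.max() > 0:
        img = img / img.max()
    return (255 * img).astype(np.uint8)
```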
Discussion
The performance and computational cost of event-based algorithms for ornithopter onboard perception vary between tasks. Low-level event-based methods such as corner detection and object tracking run efficiently with minimal processing delays, achieving RTF values significantly lower than 1, and are suitable for real-time onboard processing even during flapping-wing flight. However, more complex tasks such as feature tracking and visual odometry have higher computational demands and fail to run in real time on resource-constrained computers like the Khadas VIM3. This indicates that, while some event-based perception tasks are feasible for real-time onboard execution, others require further algorithmic optimization or more powerful onboard processing units. In addition, high-level navigation and decision-making methods for event cameras are still at an early development stage and struggle to meet real-time demands under the highly dynamic conditions of ornithopter flight. Thus, the overall capability of the perception system is context-dependent, and further research is required to handle both performance and computational constraints effectively. A careful design of the onboard perception software is therefore crucial for ornithopter robots.
Although traditional frame-based methods are less suitable for aggressive flight, they can offer adequate performance during gliding. Image processing methods have a much higher level of maturity, and the algorithms are usually maintained by the community. Frame-based algorithms often provide better accuracy thanks to the texture and rich intensity information of images, but at a lower rate, whereas event-based algorithms enable higher responsiveness due to the high temporal resolution and minimal latency of events, often at the cost of accuracy and robustness. These complementary features suggest that data fusion schemes leveraging the synergy between the two sensors could provide a feasible and efficient solution. This has already been exploited in several methods, such as those in refs. [129,150,151], and also appears to be a promising research direction for ornithopter perception algorithms. Another aspect to consider is the recent evolution of single-board computers (SBCs) toward more powerful devices (e.g., higher processing frequency, more RAM, GPU acceleration) with lighter weight, higher energy efficiency, and lower cost.[152,153] Devices such as the Nvidia Jetson, Intel NUC, Odroid, Khadas VIM, or LattePanda, which are widely used onboard aerial robots, have offered new models with increasing performance in recent years. This trend indicates that running more computationally demanding algorithms on lightweight aerial robots is becoming increasingly feasible.
Conclusions
This article has presented a comprehensive analysis of the use of event-based vision systems for onboard perception in flapping-wing flying robots, focusing on their feasibility and on the main challenges, requirements, and constraints. Event cameras, known for their high temporal resolution, high dynamic range, and low power consumption, have been identified as a promising solution to address the constraints imposed by the unique dynamics of ornithopter flight. This study has experimentally evaluated event cameras on test benches and in indoor and outdoor scenarios with two different flapping-wing robots. The analysis covers three areas: the ability to operate under flapping-wing flight dynamics and strong vibrations, the ease of integration on these aerial platforms, and the evaluation of event-based vision algorithms for ornithopter perception.
These analyses confirm that event cameras offer advantages that align well with the challenges faced by flapping-wing robots, making them particularly suitable for aggressive flights. Although the integration of event cameras is now technologically feasible, there are still challenges in terms of miniaturization, cost, and software support, particularly compared to more mature frame-based technologies. The computational demands of event-based algorithms can vary, with real-time performance often limited during high-event-rate activities, such as flapping flight, but more feasible during gliding. The results confirm the feasibility of event-based vision but also underline the need for continued algorithmic advancements to handle the specific challenges posed by the rapid and agile motion of flapping wings. A hybrid approach combining event-based and frame-based sensors may provide the most effective solution, leveraging the strengths of both technologies.
Future work will focus on the analysis of hybrid processing schemes that combine events and frames, along with other types of sensors, whose synergy could enable flapping-wing aerial robots to autonomously perform increasingly complex tasks.
Appendix A - Flapping Emulator Mechanism
The mechanism used to emulate the flapping-wing motion in the benchmark experiments is shown in Figure A1. The ornithopters’ flapping strokes were emulated by varying the pitch angle coupled with a linear translation motion. The mechanism uses a brushed DC motor controlled by an HW-687 pulse-width-modulated speed regulator. The pitch angle ranges within [−30, 30]° and the longitudinal translation within [−2.5, 2.5] cm with respect to the initial pose. The mechanism has a total weight of 298 g and includes a VectorNav VN-200 inertial navigation system to measure linear and angular velocities. The CAD files for replication can be found here ().
[IMAGE OMITTED. SEE PDF]
Appendix B - Flapping-Wing Aerial Robots
All data used in this work were collected with the E-Flap[5] and Hybrid (ref. [87] without propellers) platforms. Both platforms mounted a DAVIS346 event camera and a Khadas VIM3 single-board computer (SBC). Their main specifications are presented in Table A1 and their mass distributions in Figure A2.
Table A1 Specifications of the flapping-wing robot platforms used in the experimental evaluation of this work.
| Specification | E-Flapa) | Hybridb) |
| Used in | boards_indoors, human_indoors | boards_outdoors, gps_outdoors |
| Weight [kg] | 0.510 | 0.930 |
| Payload [kg] | 0.520 | 0.300 |
| Wingspan [m] | 1.5 | 1.5 |
| Battery voltage [V] | 16.5 | 16.8 |
| Battery capacity [mAh] | 450 | 450 |
| Max. velocity [m s−1] | 5.63 | 5.00 |
| Max. flapping freq. [Hz] | 5.5 | 3.0 |
| Flapping amplitude [°] | 30–50 | 35–35 |
[IMAGE OMITTED. SEE PDF]
Appendix C - Datasets
In this section, we describe the datasets () that contain some of the sequences used for the evaluation of the event-based processing algorithms in Section 5: the TORRICELLI dataset, recorded in the indoor testbed of the GRVC Robotics Lab (15 × 21 × 8 m), and the SAETA dataset, recorded at two outdoor airfields near Seville (Spain). The datasets contain events, grayscale images, and IMU measurements from a DAVIS346 monocular camera (346 × 260 resolution, APS at 30 fps), RGB images from an ELP monocular camera (640 × 480 resolution, 30 fps), IMU measurements from a VectorNav VN200 inertial navigation system, robot poses from an OptiTrack motion capture system (28 cameras), and IMU and GPS measurements from a Matek H743 Mini V3 autopilot (ArduPlane firmware, GPS receiver). The sequences were captured onboard the flapping-wing aerial platforms described in Appendix B. The list of sequences is presented in Table A2.
Table A2 List of indoor and outdoor sequences.
| Sequence | Dataset | Duration [s] | Size [GB] | DAVIS346 | ELP | VN200 | GPS | MOCAP |
| boards_indoors_1 | TORRICELLI | 72.398 | 2.358 | ✓ | ✓ | ✓ | ⨯ | ✓ |
| boards_indoors_2 | TORRICELLI | 60.398 | 1.911 | ✓ | ✓ | ✓ | ⨯ | ✓ |
| human_indoors_1 | TORRICELLI | 52.498 | 1.678 | ✓ | ✓ | ✓ | ⨯ | ✓ |
| human_indoors_2 | TORRICELLI | 68.096 | 2.159 | ✓ | ✓ | ✓ | ⨯ | ✓ |
| boards_outdoors_1 | SAETA | 70.995 | 2.434 | ✓ | ✓ | ✓ | ⨯ | ⨯ |
| boards_outdoors_2 | SAETA | 55.198 | 1.939 | ✓ | ✓ | ✓ | ⨯ | ⨯ |
| gps_outdoors_1 | SAETA | 160.092 | 1.616 | ✓ | ⨯ | ⨯ | ✓ | ⨯ |
| gps_outdoors_2 | SAETA | 153.487 | 1.427 | ✓ | ⨯ | ⨯ | ✓ | ⨯ |
Dataset Format
Sequences are provided in rosbag format. All data are precisely timestamped following the POSIX time standard. We use standard ROS types: sensor_msgs/Image for images, sensor_msgs/Imu for IMU measurements, sensor_msgs/NavSatFix for GPS measurements, and geometry_msgs/PoseStamped for the mocap pose. Events use the dvs_msgs/EventArray type. The list of topics is presented in Table A3, and a minimal reading example is sketched after the table.
Table A3 List of ROS topics.
| Topic | Description | Freq. [Hz] |
| /dvs/events | DAVIS346 events | 30 |
| /dvs/image_raw | DAVIS346 grayscale images | 40 |
| /dvs/imu | DAVIS346 IMU | 1000 |
| /elp/image_raw | ELP RGB images | 30 |
| /vn/imu | VectorNav VN200 IMU | 200 |
| /mocap/pose | OptiTrack pose | 120 |
| /mavros/imu | Autopilot IMU | 10 |
| /mavros/gps | Autopilot GPS | 3 |
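A minimal example of reading a sequence, assuming a ROS 1 environment with the rosbag and dvs_msgs packages installed (the bag filename is illustrative):

```python
import rosbag

bag_path = "boards_indoors_1.bag"  # any sequence from Table A2
num_events, num_frames = 0, 0

with rosbag.Bag(bag_path) as bag:
    for topic, msg, t in bag.read_messages(topics=["/dvs/events", "/dvs/image_raw"]):
        if topic == "/dvs/events":
            # dvs_msgs/EventArray: each event carries x, y, ts, and polarity
            num_events += len(msg.events)
        else:
            # sensor_msgs/Image; convert with cv_bridge if an OpenCV image is needed
            num_frames += 1

print(f"{num_events} events and {num_frames} frames read from {bag_path}")
```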
Intrinsic IMU Calibration
The IMU calibration (noise densities and biases) is computed using the Allan variance method () from a 2 h sequence where the IMUs were placed on a vibration-isolated platform. The calibration results for the DAVIS346 IMU (davis_imu.yaml), the VectorNav VN200 IMU (vn_imu.yaml), and the Matek H743 Mini V3 autopilot IMU (mavros_imu.yaml) are provided. The original rosbag files (davis_imu.bag, vn_imu.bag, and mavros_imu.bag) used to obtain the results are also provided.
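For illustration, the sketch below computes the overlapping Allan deviation of a rate signal (e.g., one gyroscope axis), from which the noise density and bias instability can be read off the characteristic slopes. It is a simplified example under standard assumptions, not the exact tool used to produce the provided calibration files.

```python
import numpy as np

def allan_deviation(samples, fs, taus):
    """Overlapping Allan deviation of a rate signal (e.g., gyro in rad/s).

    samples: 1D array of raw measurements sampled at fs [Hz].
    taus: iterable of cluster times [s].
    Returns one Allan deviation per tau (NaN where tau is too large for the data).
    """
    theta = np.cumsum(samples) / fs  # integrate rate to angle (or velocity)
    adev = []
    for tau in taus:
        m = int(tau * fs)  # samples per cluster
        if m < 1 or 2 * m >= len(theta):
            adev.append(np.nan)
            continue
        d = theta[2 * m:] - 2 * theta[m:-m] + theta[:-2 * m]
        avar = np.sum(d ** 2) / (2 * tau ** 2 * (len(theta) - 2 * m))
        adev.append(np.sqrt(avar))
    return np.array(adev)
```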
Intrinsic Camera Calibration
We used the pinhole camera model (parameters: focal lengths fu, fv and principal point pu, pv) with the radial-tangential distortion model (parameters: radial k1, k2 and tangential p1, p2). Cameras were calibrated with Kalibr ()[154,155] using a 6 × 6 AprilGrid (specifications in aprilgrid.yaml). Calibration results are provided as yaml files. The rosbag files (calibration_indoors.bag, calibration_outdoors.bag, and calibration_outdoors_gps.bag) used for calibration are also provided in case another calibration method is preferable.
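As an illustration of this calibration model, the sketch below projects a 3D point with the pinhole model and radial-tangential distortion using OpenCV. The numeric parameter values are placeholders; the actual values are those in the provided yaml files.

```python
import numpy as np
import cv2

# Placeholder intrinsics; the actual values are read from the provided yaml files.
fu, fv, pu, pv = 300.0, 300.0, 173.0, 130.0    # pinhole: focal lengths and principal point
k1, k2, p1, p2 = -0.35, 0.12, 0.0005, -0.0003  # radial-tangential distortion

K = np.array([[fu, 0.0, pu],
              [0.0, fv, pv],
              [0.0, 0.0, 1.0]])
dist = np.array([k1, k2, p1, p2])

# Project a 3D point expressed in the camera frame onto the distorted image plane.
point_3d = np.array([[[0.1, -0.05, 1.0]]], dtype=np.float64)
rvec = tvec = np.zeros(3)  # identity pose: the point is already in camera coordinates
pixel, _ = cv2.projectPoints(point_3d, rvec, tvec, K, dist)
print("projected pixel:", pixel.ravel())
```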
Extrinsic Calibration
Camera-camera and camera-IMU extrinsic calibrations were also computed with Kalibr. The values of ITD, VTD, ATD, and ETD (see Figure A3) are provided in the corresponding yaml files. The remaining transformations can be computed as compositions of the provided matrices. The camera-IMU temporal shifts were also computed with Kalibr. The same rosbag files used for the intrinsic camera calibration can be used if another calibration method is preferable.
[IMAGE OMITTED. SEE PDF]
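As a hint for dataset users, the sketch below shows how additional extrinsics can be obtained by composing and inverting the provided 4 × 4 homogeneous transforms. The yaml key name and loading convention are assumptions, since the exact layout is defined by the provided calibration files.

```python
import numpy as np
import yaml

def load_transform(yaml_path, key="T"):
    """Load a 4x4 homogeneous transform from a yaml file (key name is an assumption)."""
    with open(yaml_path) as f:
        return np.array(yaml.safe_load(f)[key]).reshape(4, 4)

def compose(T_ab, T_bc):
    """Return T_ac as the composition of two homogeneous transforms."""
    return T_ab @ T_bc

def invert(T):
    """Invert a rigid-body homogeneous transform without a general matrix inverse."""
    R, t = T[:3, :3], T[:3, 3]
    Ti = np.eye(4)
    Ti[:3, :3] = R.T
    Ti[:3, 3] = -R.T @ t
    return Ti
```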
Acknowledgements
This work was funded by the European Research Council as part of the GRIFFIN ERC Advanced Grant 2017 (Action 788247). Partial funding was obtained from the Plan Estatal de Investigación Científica y Técnica y de Innovación of the Ministerio de Universidades del Gobierno de España (FPU19/04692). The authors thank Mario Hernández for his help with the design of the experimental setups and José Manuel Carmona for his support during the flapping-wing robot flight experiments. The authors extend their gratitude to Geert Folkertsma from the University of Twente, Abdessattar Abdelkefi from New Mexico State University, Hoang Vu Phan from the École Polytechnique Fédérale de Lausanne, Qiang Fu from the University of Science and Technology Beijing, and Christophe de Wagter and Guido de Croon from the Technical University of Delft for providing valuable details and specifications regarding their flapping-wing platforms. The authors also thank Guillermo Gallego from the Technical University of Berlin for his assistance in gathering specifications of event cameras. R.T. also thanks Sara Ruiz-Moreno for her collaboration and valuable advice.
Conflict of Interest
The authors declare no conflict of interest.
Author Contributions
Raul Tapia: conceptualization (lead); investigation (lead); methodology (lead); supervision (lead); visualization (lead); writing—original draft (lead); writing—review & editing (lead). Javier Luna-Santamaria: conceptualization (supporting); investigation (supporting); methodology (supporting); writing—original draft (supporting); writing—review & editing (supporting). Ivan Gutierrez Rodriguez: conceptualization (supporting); investigation (supporting); methodology (supporting); writing—original draft (supporting); writing—review & editing (supporting). Juan Pablo Rodríguez-Gómez: conceptualization (supporting); investigation (supporting); methodology (supporting); writing—original draft (supporting); writing—review & editing (supporting). José Ramiro Martínez-de Dios: conceptualization (supporting); supervision (supporting); writing—original draft (supporting); writing—review & editing (supporting). Anibal Ollero: funding acquisition (lead); supervision (supporting); writing—review & editing (supporting).
Data Availability Statement
The data that support the findings of this study are available in the supplementary material of this article.
D. Mackenzie, Science 2012, 335, 1430.
G. C. H. E. Croon, Sci. Rob. 2020, 5, eabd0233.
A. Ollero, M. Tognon, A. Suarez, D. Lee, A. Franchi, IEEE Trans. Rob. 2022, 38, 626.
J. P. Rodríguez‐Gómez, R. Tapia, J. L. Paneque, P. Grau, A. Gómez Eguíluz, J. R. Martínez‐de Dios, A. Ollero, IEEE Rob. Autom. Lett. 2021, 6, 1066.
R. Zufferey, J. Tormo‐Barbero, M. M. Guzmán, F. J. Maldonado, E. Sanchez‐Laulhe, P. Grau, M. Pérez, J. A. Acosta, A. Ollero, IEEE Rob. Autom. Lett. 2021, 6, 3097.
G. C. H. E. Croon, K. M. E. Clercq, R. Ruijsink, B. D. W. Remes, C. Wagter, Int. J. Micro Air Veh. 2009, 1, 71.
F. Garcia Bermudez, R. Fearing, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems St. Louis, MO, October 2009, pp. 5027–5032.
D. A. Olejnik, B. P. Duisterhof, M. Karásek, K. Y. W. Scheper, T. Van Dijk, G. C. H. E. Croon, Unmanned Syst. 2020, 08, 287.
Y. Jin, Y. Ren, T. Song, Z. Jiang, G. Song, in Int. Conf. on Computing, Networks and Internet of Things, Xiamen, China May 2023, pp. 278–285.
S. Tijmons, G. C. H. E. Croon, B. Remes, C. Wagter, R. Ruijsink, E. J. Kampen, Q. Chu, in Advances in Aerospace Guidance, Navigation and Control, Springer 2013, pp. 463–482.
C. Wagter, S. Tijmons, B. D. W. Remes, G. C. H. E. Croon, in IEEE Int. Conf. on Robotics and Automation Hong Kong, China May 2014, pp. 4982–4987.
S. Tijmons, G. C. H. E. Croon, B. D. W. Remes, C. De Wagter, M. Mulder, IEEE Trans. Rob. 2017, 33, 858.
S. Tijmons, C. De Wagter, B. Remes, G. C. H. E. De Croon, Aerospace 2018, 5, 69.
K. Y. Scheper, M. Karásek, C. De Wagter, B. D. Remes, G. C. De Croon, in IEEE Int. Conf. on Robotics and Automation, Brisbane, QLD, Australia May 2018, pp. 5546–5552.
A. Gómez Eguíluz, J. P. Rodríguez‐Gómez, R. Tapia, F. J. Maldonado, J. A. Acosta, J. R. Martínez‐de Dios, A. Ollero, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Prague, Czech Republic September 2021, pp. 1958–1965.
J. P. Rodríguez‐Gómez, R. Tapia, M. M. Guzmán Garcia, J. R. Martínez‐de Dios, A. Ollero, IEEE Rob. Autom. Lett. 2022, 7, 5413.
J. Ribeiro‐Gomes, J. Gaspar, A. Bernardino, Front. Rob. AI 2023, 10.
G. Gallego, T. Delbrück, G. Orchard, C. Bartolozzi, B. Taba, A. Censi, S. Leutenegger, A. J. Davison, J. Conradt, K. Daniilidis, D. Scaramuzza, IEEE Trans. Pattern Anal. Mach. Intell. 2022, 44, 154.
B. Chakravarthi, A. A. Verma, K. Daniilidis, C. Fermuller, Y. Yang, Recent Event Camera Innovations: A Survey, http://arxiv.org/abs/2408.13627 (accessed: 2024).
R. Tapia, J. P. Rodríguez‐Gómez, J. A. Sanchez‐Diaz, F. J. Gañán, I. G. Rodríguez, J. Luna‐Santamaria, J. R. Martínez‐De Dios, A. Ollero, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems 2023, pp. 3025–3032.
M. A. Mahowald, C. Mead, Sci. Am. 1991, 264, 76.
R. Serrano‐Gotarredona, M. Oster, P. Lichtsteiner, A. Linares‐Barranco, R. Paz‐Vicente, F. Gomez‐Rodriguez, L. Camunas‐Mesa, R. Berner, M. Rivas‐Perez, T. Delbruck, S. C. Liu, R. Douglas, P. Hafliger, G. Jimenez‐Moreno, A. Civit Ballcels, T. Serrano‐Gotarredona, A. J. Acosta‐Jimenez, B. Linares‐Barranco, IEEE Trans. Neural Networks 2009, 20, 1417.
P. Lichtsteiner, T. Delbruck, in IEEE Workshop on Charge‐Coupled Devices and Advanced Image Sensors, Karuizawa, Nagano, Japan June 2005.
P. Lichtsteiner, T. Delbruck, in Research in Microelectronics and Electronics, IEEE Vol. 2 2005, pp. 202–205.
P. Lichtsteiner, C. Posch, T. Delbruck, in IEEE Int. Solid State Circuits Conf., San Francisco, CA February 2006, pp. 2060–2069.
P. Lichtsteiner, C. Posch, T. Delbruck, IEEE Journal of Solid‐State Circuits 2008, 43, 566.
T. Finateu, A. Niwa, D. Matolin, K. Tsuchimoto, A. Mascheroni, E. Reynaud, P. Mostafalu, F. Brady, L. Chotard, F. LeGoff, H. Takahashi, H. Wakabayashi, Y. Oike, C. Posch, in IEEE Int. Solid‐State Circuits Conf. 2020, pp. 112–114.
T. Stoffregen, H. Daraei, C. Robinson, A. Fix, in IEEE/CVF Winter Conf. on Applications of Computer Vision 2022, pp. 3937–3945.
C. Posch, D. Matolin, R. Wohlgenannt, in IEEE Int. Solid‐State Circuits Conf. 2010, pp. 400–401.
C. Posch, D. Matolin, R. Wohlgenannt, IEEE Journal of Solid‐State Circuits 2011, 46, 259.
T. Serrano‐Gotarredona, B. Linares‐Barranco, IEEE Journal of Solid‐State Circuits 2013, 48, 827.
C. Brandli, R. Berner, M. Yang, S. C. Liu, T. Delbruck, IEEE J. Solid‐State Circuits 2014, 49, 2333.
B. Son, Y. Suh, S. Kim, H. Jung, J. S. Kim, C. Shin, K. Park, K. Lee, J. Park, J. Woo, Y. Roh, H. Lee, Y. Wang, I. Ovsiannikov, H. Ryu, in IEEE Int. Solid‐State Circuits Conf. San Francisco, CA February 2017, pp. 66–67.
Y. Suh, S. Choi, M. Ito, J. Kim, Y. Lee, J. Seo, H. Jung, D. H. Yeo, S. Namgung, J. Bong, S. Yoo, S. H. Shin, D. Kwon, P. Kang, S. Kim, H. Na, K. Hwang, C. Shin, J. S. Kim, P. K. J. Park, J. Kim, H. Ryu, Y. Park, in IEEE Int. Symp. on Circuits and Systems, Seville, Spain October 2020, pp. 1–5.
O. Holešovský, V. Hlaváč, R. Škoviera, R. Vítek, in Computer Vision Winter Workshop, Rogaka Slatina, Slovenia February 2020.
O. Holešovský, R. Škoviera, V. Hlaváč, R. Vítek, Sensors 2021, 21, 1137.
J. Barrios‐Avilés, T. Iakymchuk, J. Samaniego, L. D. Medus, A. Rosado‐Muñoz, Electronics 2018, 7, 304.
A. Censi, E. Mueller, E. Frazzoli, S. Soatto, in IEEE Int. Conf. on Robotics and Automation, Seattle, WA, May 2015, pp. 3319–3326.
J. Cox, A. Ashok, N. Morley, Unconv. Imaging Adapt. Opt. 2020, 11508, 63.
C. Farabet, R. Paz, J. Perez‐Carrasco, C. Zamarreño, A. Linares‐Barranco, Y. LeCun, E. Culurciello, T. Serrano‐Gotarredona, B. Linares‐Barranco, Front. Neurosci. 2012, 6.
H. Rebecq, R. Ranftl, V. Koltun, D. Scaramuzza, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Long Beach, CA June 2019, pp. 3852.
H. Rebecq, R. Ranftl, V. Koltun, D. Scaramuzza, IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 1964.
J. P. Rodríguez‐Gómez, J. R. Martínez‐de Dios, A. Ollero, G. Gallego, IEEE Rob. Autom. Lett. 2024, 9, 8802.
F. Y. Hsiao, H. K. Hsu, C. L. Chen, L. J. Yang, J. F. Shen, J. Appl. Sci. Eng. 2012, 15, 213.
K. Y. Ma, P. Chirarattananon, S. B. Fuller, R. J. Wood, Science 2013, 340, 603.
F. J. Maldonado, J. Á. Acosta, J. Tormo‐Barbero, P. Grau, M. M. Guzmán, A. Ollero, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Las Vegas, Nevada October 2020, pp. 1385–1390.
G. C. H. E. Croon, E. Weerdt, C. Wagter, B. D. W. Remes, in IEEE Int. Conf. on Robotics and Biomimetics, Tianjin, China December 2010, pp. 1606–1611.
G. C. H. E. Croon, E. de Weerdt, C. De Wagter, B. D. W. Remes, R. Ruijsink, IEEE Trans. Rob. 2012, 28, 529.
P. E. J. Duhamel, N. O. Pérez‐Arancibia, G. L. Barrows, R. J. Wood, in IEEE Int. Conf. on Robotics and Automation, Saint Paul, MN May 2012, pp. 4228–4235.
P. E. J. Duhamel, N. O. Pérez‐Arancibia, G. L. Barrows, R. J. Wood, IEEE/ASME Trans. Mechatron. 2013, 18, 556.
S. Ryu, U. Kwon, H. J. Kim, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Daejeon, Korea (South) October 2016, pp. 5645–5650.
R. Tapia, J. R. Martínez‐de Dios, A. Ollero, IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 9630.
R. Benosman, C. Clercq, X. Lagorce, S. H. Ieng, C. Bartolozzi, IEEE Trans. Neural Networks Learn. Syst. 2014, 25, 407.
E. Pan, X. Liang, W. Xu, IEEE Sens. J. 2020, 20, 8017.
P. Zhang, H. Liu, Z. Ge, C. Wang, E. Y. Lam, IEEE Trans. Image Proc. 2024, 33, 2318.
S. Chen, M. Guo, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops, Long Beach, CA June 2019, pp. 1682–1683.
G. A. Folkertsma, W. Straatman, N. Nijenhuis, C. H. Venner, S. Stramigioli, IEEE Rob. Autom. Mag. 2017, 24, 22.
J. Gerdes, A. Holness, A. Perez‐Rosado, L. Roberts, A. Greisinger, E. Barnett, J. Kempny, D. Lingam, C. H. Yeh, H. A. Bruck, S. K. Gupta, Soft Rob. 2014, 1, 275.
A. Perez‐Rosado, H. A. Bruck, S. K. Gupta, J. Mech. Rob. 2016, 8.
A. E. Holness, H. A. Bruck, S. K. Gupta, Int. J. Micro Air Veh. 2018, 10, 50.
H. A. Bruck, S. K. Gupta, Biomimetics 2023, 8, 485.
Z. Jiao, L. Wang, L. Zhao, W. Jiang, Aerosp. Sci. Technol. 2021, 116, 106870.
A. Ramezani, X. Shi, S. J. Chung, S. Hutchinson, in IEEE Int. Conf. on Robotics and Automation, Stockholm, Sweden May 2016, pp. 3219–3226.
W. Yang, L. Wang, B. Song, Int. J. Micro Air Veh. 2018, 10, 70.
M. Hassanalian, A. Abdelkefi, M. Wei, S. Ziaei‐Rad, Acta Mech. 2017, 228, 1097.
H. Huang, W. He, J. Wang, L. Zhang, Q. Fu, IEEE/ASME Trans. Mechatron. 2022, 27, 5484.
N. Franceschini, F. Ruffier, J. Serres, Curr. Biol. 2007, 17, 329.
E. W. Hawkes, D. Lentink, J. R. Soc. Interface 2016, 13, 20160730.
F. Ruffier, Science 2018, 361, 1073.
M. Karásek, F. T. Muijres, C. De Wagter, B. D. W. Remes, G. C. H. E. de Croon, Science 2018, 361, 1089.
M. Keennon, K. Klingebiel, H. Won, in AIAA Aerospace Sciences Meeting Including the New Horizons Forum and Aerospace Exposition, Nashville, TN January 2012.
H. V. Phan, S. Aurecianus, T. Kang, H. C. Park, Int. J. Micro Air Veh. 2019, 11.
H. V. Phan, S. Aurecianus, T. K. L. Au, T. Kang, H. C. Park, IEEE Rob. Autom. Lett. 2020, 5, 5059.
R. J. Wood, IEEE Trans. Rob. 2008, 24, 341.
S. B. Fuller, A. Sands, A. Haggerty, M. Karpelson, R. J. Wood, in IEEE Int. Conf. on Robotics and Automation, Karlsruhe, Germany May 2013, pp. 1374–1380.
K. Y. Ma, S. M. Felton, R. J. Wood, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Vilamoura‐Algarve, Portugal October 2012, pp. 1133–1140.
E. F. Helbling, S. B. Fuller, R. J. Wood, in IEEE Int. Conf. on Robotics and Automation, Hong Kong, China May 2014, pp. 5516–5522.
R. Malka, A. L. Desbiens, Y. Chen, R. J. Wood, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Chicago, IL September 2014, pp. 2879–2885.
X. Yang, Y. Chen, L. Chang, A. A. Calderón, N. O. Pérez‐Arancibia, IEEE Rob. Autom. Lett. 2019, 4, 4270.
M. Guzmán, C. R. Páez, F. J. Maldonado, R. Zufferey, J. Tormo‐Barbero, J. Acosta, A. Ollero, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems 2021, pp. 6358–6365.
L. Calvente, J. Á. Acosta, A. Ollero, in Aerial Robotic Systems Physically Interacting with the Environment, Biograd na Moru, Croatia October 2021, pp. 1–6.
V. Perez‐Sanchez, A. E. Gomez‐Tamm, E. Savastano, B. C. Arrue, A. Ollero, Appl. Sci. 2021, 11, 2930.
E. Savastano, V. Perez‐Sanchez, B. Arrue, A. Ollero, IEEE Rob. Autom. Lett. 2022, 7, 8076.
A. Gómez Eguíluz, J. P. Rodríguez‐Gómez, J. L. Paneque, P. Grau, J. R. Martínez‐de Dios, A. Ollero, in Proc. of the Workshop on Research, Education and Development of Unmanned Aerial Systems, Cranfield, UK November 2019, pp. 335–343.
R. Zufferey, J. Tormo‐Barbero, D. Feliu‐Talegón, S. R. Nekoo, J. Á. Acosta, A. Ollero, Nat. Commun. 2022, 13, 7713.
S. R. Nekoo, D. Feliu‐Talegon, R. Tapia, A. C. Satue, J. R. Martínezde Dios, A. Ollero, Robotica 2023, 41, 3022.
D. Gayango, R. Salmoral, H. Romero, J. M. Carmona, A. Suarez, A. Ollero, IEEE Rob. Autom. Lett. 2023, 8, 4243.
R. Rashad, F. Califano, A. J. van der Schaft, S. Stramigioli, IMA J. Math. Control Inf. 2020, 37, 1400.
F. Califano, R. Rashad, A. Dijkshoorn, L. G. Koerkamp, R. Sneep, A. Brugnoli, S. Stramigioli, Annu. Rev. Control 2021, 51, 37.
D. Gehrig, D. Scaramuzza, Are High‐Resolution Event Cameras Really Needed? http://arxiv.org/abs/2203.14672 (accessed: 2022).
R. Tapia, A. C. Satue, S. R. Nekoo, J. R. Martínez‐de Dios, A. Ollero, in IEEE Int. Conf. on Robotics and Automation Workshops, London, United Kingdom May 2023.
F. Cladera, A. Bisulco, D. Kepple, V. Isler, D. D. Lee, in IEEE Int. Conf. on Image Processing, Abu Dhabi, United Arab Emirates October 2020, pp. 3084–3088.
D. R. Kepple, D. Lee, C. Prepsius, V. Isler, I. M. Park, D. D. Lee, in European Conf. on Computer Vision (Eds: A. Vedaldi, H. Bischof, T. Brox, J. M. Frahm), Glasgow, UK August 2020, pp. 500–516.
A. Bisulco, F. Cladera, V. Isler, D. D. Lee, in IEEE Int. Conf. on Robotics and Automation, Xi'an, China May 2021, p. 14098.
Z. Wang, F. Cladera, A. Bisulco, D. Lee, C. J. Taylor, K. Daniilidis, M. A. Hsieh, D. D. Lee, V. Isler, IEEE Rob. Autom. Lett. 2022, 7, 8737.
K. Kodama, Y. Sato, Y. Yorikado, R. Berner, K. Mizoguchi, T. Miyazaki, M. Tsukamoto, Y. Matoba, H. Shinozaki, A. Niwa, T. Yamaguchi, C. Brandli, H. Wakabayashi, Y. Oike, in IEEE Int. Solid‐State Circuits Conf. San Francisco, CA, February 2023, pp. 92–94.
A. Niwa, F. Mochizuki, R. Berner, T. Maruyarma, T. Terano, K. Takamiya, Y. Kimura, K. Mizoguchi, T. Miyazaki, S. Kaizu, H. Takahashi, A. Suzuki, C. Brandli, H. Wakabayashi, Y. Oike, in IEEE Int. Solid‐State Circuits Conf., San Francisco, CA, February 2023, pp. 4–6.
R. Tapia, A. Gómez Eguíluz, J. R. Martínez‐de Dios, A. Ollero, in IEEE Int. Conf. on Robotics and Automation Workshops, Virtual Conference, May 2020, pp. 1–3.
R. Tapia, J. R. Martínez‐de Dios, A. Gómez Eguíluz, A. Ollero, Auton. Rob. 2022, 46, 879.
E. Mueggler, H. Rebecq, G. Gallego, T. Delbruck, D. Scaramuzza, Int. J. Rob. Res. 2017, 36, 142.
A. Mitrokhin, C. Ye, C. Fermüller, Y. Aloimonos, T. Delbruck, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Macau, China November 2019, pp. 6105–6112.
L. Burner, A. Mitrokhin, C. Fermüller, Y. Aloimonos, EVIMO2: An event Camera Dataset for Motion Segmentation, Optical Flow, Structure from Motion, and Visual Inertial Odometry in Indoor Scenes with Monocular or Stereo Algorithms, http://arxiv.org/abs/2205.03467 (accessed: 2022).
S. Klenk, J. Chui, N. Demmel, D. Cremers, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Prague, Czech Republic September 2021, pp. 8601–8608.
J. Hidalgo‐Carrió, G. Gallego, D. Scaramuzza, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition, New Orleans, LA June 2022, pp. 5771–5780.
M. Gehrig, W. Aarents, D. Gehrig, D. Scaramuzza, IEEE Rob. Autom. Lett. 2021, 6, 4947.
L. Gao, Y. Liang, J. Yang, S. Wu, C. Wang, J. Chen, L. Kneip, IEEE Rob. Autom. Lett. 2022, 7, 8217.
P. Chen, W. Guan, F. Huang, Y. Zhong, W. Wen, IEEE Trans. Intell. Veh. 2024, 9, 407.
P. Chen, W. Guan, P. Lu, IEEE Rob. Autom. Lett. 2023, 8, 3661.
F. Barranco, C. Fermuller, Y. Aloimonos, T. Delbruck, Front. Neurosci. 2016, 10.
B. Rueckauer, T. Delbruck, Front. Neurosci. 2016, 10.
A. Z. Zhu, D. Thakur, T. Özaslan, B. Pfrommer, V. Kumar, K. Daniilidis, IEEE Rob. Autom. Lett. 2018, 3, 2032.
J. Delmerico, T. Cieslewski, H. Rebecq, M. Faessler, D. Scaramuzza, in IEEE Int. Conf. on Robotics and Automation, Montreal, QC, Canada May 2019, pp. 6713–6719.
K. Chaney, F. Cladera, Z. Wang, A. Bisulco, M. A. Hsieh, C. Korpela, V. Kumar, C. J. Taylor, K. Daniilidis, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops, Vancouver, BC, Canada June 2023, pp. 4016–4023.
C. Scheerlinck, H. Rebecq, T. Stoffregen, N. Barnes, R. Mahony, D. Scaramuzza, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops, Long Beach, CA June 2019, pp. 1684–1693.
T. Serrano‐Gotarredona, B. Linares‐Barranco, Front. Neurosci. 2015, 9.
S. Afshar, T. J. Hamilton, J. Tapson, A. van Schaik, G. Cohen, Front. Neurosci. 2019, 12.
A. Amir, B. Taba, D. Berg, T. Melano, J. McKinstry, C. Di Nolfo, T. Nayak, A. Andreopoulos, G. Garreau, M. Mendoza, J. Kusnitz, M. Debole, S. Esser, T. Delbruck, M. Flickner, D. Modha, IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Honolulu, HI July 2017, pp. 7388–7397.
C. Boretti, P. Bich, F. Pareschi, L. Prono, R. Rovatti, G. Setti, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops, Vancouver, BC, Canada June 2023, pp. 4065–4070.
H. Rebecq, D. Gehrig, D. Scaramuzza, in Conf. on Robot Learning, Zürich, Switzerland October 2018, pp. 969–982.
Y. Hu, S. C. Liu, T. Delbruck, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition Workshops, Nashville, TN June 2021, pp. 1312–1321.
D. Joubert, A. Marcireau, N. Ralph, A. Jolley, A. van Schaik, G. Cohen, Front. Neurosci. 2021, 15.
A. Ziegler, D. Teigland, J. Tebbe, T. Gossard, A. Zell, in IEEE Int. Conf. on Robotics and Automation, London, UK May 2023, pp. 11669–11675.
V. Vasco, A. Glover, C. Bartolozzi, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Daejeon, Korea (South), October 2016, pp. 4144–4149.
E. Mueggler, C. Bartolozzi, D. Scaramuzza, in British Machine Vision Conf., London, HK, September 2017, pp. 1–11.
I. Alzugaray, M. Chli, IEEE Rob. Autom. Lett. 2018, 3, 3177.
C. Harris, M. Stephens, in Alvey Vision Conf. Manchester, UK September 1988, pp. 23.1–23.6.
C. Brändli, J. Strubel, S. Keller, D. Scaramuzza, T. Delbruck, in IEEE Int. Conf. on Event‐based Control, Communication, and Signal Processing, Krakow, Poland June 2016, pp. 1–7.
D. Reverter Valeiras, X. Clady, S. H. Ieng, R. Benosman, IEEE Trans. Neural Networks Learn. Syst. 2019, 30, 1218.
D. Gehrig, H. Rebecq, G. Gallego, D. Scaramuzza, Int. J. Comput. Vision 2020, 128, 601.
J. Y. Bouguet, Intel Corporation, Technical Report 5, 2001.
J. P. Rodríguez‐Gomez, A. Gómez Eguíluz, J. R. Martínez‐de Dios, A. Ollero, in IEEE Int. Conf. on Robotics and Automation, Paris, France. August 2020, pp. 8518–8524.
A. Lukežič, T. Vojíř, L. Čehovin Zajc, J. Matas, M. Kristan, Int. J. Comput. Vision 2018, 126, 671.
J. F. Henriques, R. Caseiro, P. Martins, J. Batista, in European Conf. on Computer Vision (Eds: A. Fitzgibbon, S. Lazebnik, P. Perona, Y. Sato, C. Schmid), Firenze, Italy October 2012, pp. 702–715.
D. S. Bolme, J. R. Beveridge, B. A. Draper, Y. M. Lui, IEEE/CVF Conf. on Computer Vision and Pattern Recognition, San Francisco, CA, USA June 2010, pp. 2544–2550.
H. Kim, S. Leutenegger, A. J. Davison, in European Conf. on Computer Vision (Eds: B. Leibe, J. Matas, N. Sebe, M. Welling), Amsterdam, The Netherlands October 2016, pp. 349–364.
H. Rebecq, T. Horstschaefer, G. Gallego, D. Scaramuzza, IEEE Rob. Autom. Lett. 2017, 2, 593.
A. Z. Zhu, N. Atanasov, K. Daniilidis, in IEEE/CVF Conf. on Computer Vision and Pattern Recognition, Honolulu, HI July 2017, pp. 5816–5824.
H. Rebecq, T. Horstschaefer, D. Scaramuzza, in British Machine Vision Conf. 2017, pp. 1–8.
A. R. Vidal, H. Rebecq, T. Horstschaefer, D. Scaramuzza, IEEE Rob. Autom. Lett. 2018, 3, 994.
F. Mahlknecht, D. Gehrig, J. Nash, F. M. Rockenbauer, B. Morrell, J. Delaune, D. Scaramuzza, IEEE Rob. Autom. Lett. 2022, 7, 8651.
X. Liu, H. Xue, X. Gao, H. Liu, B. Chen, S. S. Ge, IEEE Trans. Instrum. Meas. 2023, 72, 1.
A. Gupta, P. Sharma, D. Ghosh, D. Ghose, S. K. Muthukumar, in IEEE Int. Conf. on Electronics, Computing and Communication Technologies, Bangalore, India July 2023, pp. 1–6.
W. Guan, P. Chen, Y. Xie, P. Lu, IEEE Trans. Autom. Sci. Eng. 2024, 21, 6277.
S. Klenk, M. Motzet, L. Koestler, D. Cremers, in Int. Conf. on 3D Vision 2024, pp. 739–749.
W. Guan, P. Chen, H. Zhao, Y. Wang, P. Lu, Adv. Intell. Syst. 2024, 6, 2400243.
M. S. Lee, J. H. Jung, Y. J. Kim, C. G. Park, IEEE Rob. Autom. Lett. 2024, 9, 1003.
J. Lu, H. Feng, W. Liu, B. Hu, in IEEE Advanced Information Technology, Electronic and Automation Control Conf. 2024, Vol. 7, pp. 1510–1515.
R. Pellerito, M. Cannici, D. Gehrig, J. Belhadj, O. Dubois‐Matra, M. Casasco, D. Scaramuzza, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems, Abu Dhabi, United Arab Emirates October 2024, pp. 8966–8973.
T. Qin, P. Li, S. Shen, IEEE Trans. Rob. 2018, 34, 1004.
A. Tomy, A. Paigwar, K. S. Mann, A. Renzaglia, C. Laugier, in IEEE Int. Conf. on Robotics and Automation, Philadelphia, PA, USA May 2022, pp. 933–939.
D. Gehrig, D. Scaramuzza, Nature 2024, 629, 1034.
F. Kaup, S. Hacker, E. Mentzendorff, C. Meurisch, D. Hausheer, NetSys Lab, Technical Report NetSys‐TR‐2018‐01 2018.
U. Iqbal, T. Davies, P. Perez, Sensors 2024, 24, 4830.
P. Furgale, J. Rehder, R. Siegwart, in IEEE/RSJ Int. Conf. on Intelligent Robots and Systems 2013, pp. 1280–1286.
J. Rehder, J. Nikolic, T. Schneider, T. Hinzmann, R. Siegwart, in IEEE Int. Conf. on Robotics and Automation, Stockholm, Sweden May 2016, pp. 4304–4311.
M. Guo, J. Huang, S. Chen, in IEEE Int. Symp. on Circuits and Systems, Baltimore, MD May 2017, p. 1.
S. B. Fuller, IEEE Rob. Autom. Lett. 2019, 4, 570.
X. Wu, W. He, Q. Wang, T. Meng, X. He, Q. Fu, IEEE Trans. Ind. Electron. 2023, 70, 8215.
P. V. C. Hough, US Patent 3069654A, 1962.