1 Introduction
At 16:10 (JST) on 1 January 2024, shallow reverse faulting produced an Mw 7.5 earthquake that propagated from the northernmost point of Suzu, Ishikawa Prefecture (Fig. ). The disaster poses unique challenges for the disaster geophysics community due to its context and aftermath. The intraplate faulting occurred on the relatively inert western coast of Honshu Island, Japan, following a 3-year-long earthquake swarm. The affected areas were the Suzu, Noto, Wajima, Nanao, Shika, and Anamizu municipalities. The Prime Minister's Office of Japan has provided transcripts for several press conferences and emergency meetings reporting actions taken to address monitoring and relief operations. Initial reporting provided information on near-instant tsunami impacts around the main shock's epicenter (north Suzu), quickly followed by a comprehensive tsunami warning along the entire peninsula's coast. Subsequent statements confirmed catastrophic damage to infrastructure throughout the peninsula from ground shaking, land deformation, liquefaction, and landslides, which caused varied damage to buildings, interrupted roads, and started a fire. Later press releases, medical reports, and news outlets confirmed impacts to critical services, including water supply, sewage, power, and telecommunications. The disaster and its aftermath ultimately resulted in injuries and fatalities, the prevention of which represents an overarching focus of disaster research (preventable disaster deaths).
The spatial distribution of infrastructure impacts is of particular importance to disaster research. Such data can inform emergency response studies (physical dynamics simulation, damage detection, damage estimation, evacuation simulation, etc.), long-term recovery studies (socioeconomic studies, disaster epidemiology, disaster prevention, probabilistic hazard, etc.), and ultimately the development of more informed codes and regulation. Disaster damage visual assessments are critical to develop a comprehensive corpus of disaster impacts to infrastructure and to inform studies such as the aforementioned.
Figure 1. Seismic context of the 2024 Noto Peninsula earthquake showing the distribution of the earthquake swarm following the aftershocks.
[Figure omitted. See PDF]
These visual assessments can be carried out by an on-the-ground survey team, ensuring the highest degree of fidelity and granularity. However, such an investigation is often resource-intensive, carries inherent risk of harm, and may be highly invasive. Alternative methods generally employ remote sensing data and have historically been carried out by experts and institutions .
Human visual assessments have informed several studies that have contributed to a deeper understanding of seismic and tsunami building damage. , for example, conducted a limited-scope visual assessment of the 2011 Great East Japan Earthquake and Tsunami using multi-source multi-modal imagery to generate fragility curves of port structures in Ishinomaki. Here, modality refers to the viewing angle (aerial or satellite orthophoto, standard aerial photography, ground-level photo, or aerial oblique photo, among others), while source refers to the type and capabilities of the sensor used to capture the imagery.
More recently, automated methods employing pre-trained machine learning models have been explored. Such methods generally use vertical imagery as input to a machine learning model that automatically classifies building damage. These automated image-based assessments carry inherent limitations beyond those of human interpretation, such as a limited capability to generalize between domains and between hazard-induced damage types (e.g., damage to buildings from tsunami, earthquake, and fire all look different).
Figure 2. Mismatch between original building polygons provided by the Geospatial Information Authority of Japan (GSI) and orthophoto imagery. Inconsistencies appear more prevalent in rural areas, where the building inventory is often inconsistent with or not representative of vertical imagery. As part of the classification process, newly identified buildings were added to the database, while existing building footprints were adjusted to match the orthophoto imagery. In cases of ambiguity, the original ambiguous building polygon is marked as 99 (missing or inconsistent). One or more new building polygons are drawn over the ambiguous polygon based on the most recent pre-event vertical imagery to more faithfully represent the pre-disaster condition; the new polygons are then classified as described in Sect. (basemap attribution: © Google Earth, 2024 Airbus, CNES/Airbus, Japan Hydrographic Association, Landsat/Copernicus, Maxar Technologies 2024).
[Figure omitted. See PDF]
Table 1. Characteristics and date of each GSI vertical mosaic.
| Date taken | GSI mosaic name | Notes |
| 2 Jan 2024 | Suzu | modest overcast (east), inland snow buildup (mild) |
| 2 Jan 2024 | Wajima-Naka | mostly overcast, inland snow buildup (modest) |
| 2 Jan 2024 | Wajima-Higashi | minimal overcast (east), inland snow buildup (mild) |
| 5 Jan 2024 | Suzu | mild overcast (west), otherwise clear |
| 5 Jan 2024 | Nanao | minimal overcast (southwest), generally clear, heavy desaturation |
| 5 Jan 2024 | Anamizu | major overcast, central coast clear |
| 11 Jan 2024 | Wajima-Naka | minimal overcast (southeast), snow buildup (modest) |
| 11 Jan 2024 | Anamizu | minimal overcast (center), inland snow buildup (modest) |
| 11 Jan 2024 | Wajima-Nishi | mild overcast (south), snow buildup (modest) |
| 17 Jan 2024 | Nanao | clear, snow buildup (heavy), slight desaturation |
| 17 Jan 2024 | Wajima-Nishi | clear, inland snow buildup (modest), coast snow buildup (mild) |
| 17 Jan 2024 | Anamizu | clear, snow buildup (heavy) |
Figure 3. Building damage across the Noto Peninsula is split between failure modes consequent to different, often compounding hazards. Vertical aerial imagery by , oblique imagery by © .
[Figure omitted. See PDF]
A preliminary investigation of the official building footprint inventory, provided by the Geospatial Information Authority of Japan (GSI), revealed large discrepancies with pre-event imagery (Fig. ). Moreover, the variable aerial survey periods, image capture quality, meteorological conditions (Table ), and different mechanisms driving building failure (such as fire, tsunami, earthquake; Fig. ) contributed to a visually fragmented and inconsistent domain.
Despite these challenges, the unprecedented availability of open-source and multi-source data provided us with a unique opportunity for a rapid visual damage assessment. With the above considerations in mind, we opted for a manual approach to curate this dataset. We hope that this dataset will serve as a reference for future studies and as a benchmark for automated methods. The primary sources for the investigation were post-disaster vertical imagery captured by GSI and made available online. In addition, oblique imagery of select portions of Noto Peninsula was made available by Kokusai Kogyo (KKC) through the free version of their proprietary aerial survey database (for details regarding licensing, usage, and distribution, see Sect. ). The post-disaster imagery data informed the classification of the public GSI building footprint inventory vector data. Our criteria were developed iteratively in response to limitations presented by the data. Following an initially limited-scope investigation, the assessment was made available to the public for a progressive appraisal at the online portal: 
With this work, we aim to do the following:
prove the feasibility of multi-modal, multi-source visual assessment methodologies,
provide a measure of expected accuracy when employing such methods, and
contribute a damage dataset of a profoundly complex disaster, with significant multi-hazard interactions and impacts.
Figure 4. Building damage visual assessment workflow illustrating the working group's approach to multi-source and multi-modal data, each stage of inclusion, and the expert-feedback-driven iterative validation process. Pre-event orthophoto by © Google Earth 2024, post-event aerial by , all oblique imagery courtesy of © .
[Figure omitted. See PDF]
2 Methods
Herein, we relate the methodology used to generate the dataset, including considerations, challenges, and limitations encountered during the creation of the dataset (Fig. ). The working group was formed in response to the disaster and included members from Tohoku University, the International Research Institute of Disaster Science (IRIDeS), and the Faculty of Social and Environmental Studies at Tokoha University, who provided a secondary survey sample to conduct the technical validation. The assessment was conducted by our internal working group, which included a mix of civil engineers, geophysicists, and disaster researchers. We conducted the assessment in a collaborative manner, with each member of the working group contributing to the assessment of different areas of the peninsula, followed by a quality control round to ensure consistency. The assessment progress was publicly documented through a web portal, which allowed for real-time feedback from the public and other researchers, as more parts of the peninsula were documented and served online. After the initial assessment, we conducted a secondary round of validation using limited-scope surveys by independent research teams and crowd-sourced feedback from the public.
2.1 Data sourcing
The initial review consisted of a general overview of available data from official sources. The Government of Japan provides basic geographic information through the GSI (
Table 2. Criteria used for binary classification of the entire Noto Peninsula building damage visual assessment.
| Database value | Class label | Criterion |
| 0 | Survived | Damage does not appear to affect the bearing mode of the structure – includes the following. Partial damage of the roof requiring replacement or repair. Buildings in the vicinity of structurally unsound buildings that appear structurally sound. Undamaged buildings. |
| 1 | Destroyed | Structurally unsound based on visual interpretation – includes the following. Partially or completely washed-away buildings. Partially collapsed, completely collapsed, or severely inclined buildings. Partially or completely buried buildings. Buildings burned to the degree that they are structurally unsound. |
| 9 | Obstructed view | Building is marked by a footprint according to the GSI registry but is visually obstructed – includes the following. Buildings under cloud cover. Buildings under sunshade such that they are indistinguishable from their surroundings. Buildings under canopy cover such that structural features are indistinguishable. |
| 99 | Missing or inconsistent | Buildings whose GSI registry footprint is significantly inconsistent relative to available imagery – includes the following. Building footprints that do not match an existing building across pre-event and post-event imagery even when allowing for a degree of vertical shift. Building footprints that demarcate a non-existing building across pre-event and post-event imagery. |
Table 3. Comparison of popular reference damage scales for building damage visual assessment with approximate relative equivalences.
| Damage grade | Damage index | Damage index acronyms | Copernicus EMS | Present study |
| D0 | 0.0 | Nd0 | Possibly damaged | Survived |
| D1 | 0.1–0.2 | Md1 | Damaged | Survived |
| D2 | 0.2–0.4 | Md2 | Damaged | Survived |
| D3 | 0.4–0.6 | Ud3, Gd3, Ed3, Rd3, Sd3 | Damaged | Survived |
| D4 | 0.6–0.8 | Ud4, Gd4, Ed4, Sd4 | Destroyed | Destroyed |
| D5 | 0.8–0.9 | Ud5-, Ud5+, Gd5-, Gd5+, Sd5 | Destroyed | Destroyed |
| D5 | 0.9–1.0 | Cd5+ | Destroyed | Destroyed |
2.2 General methods
We provide a short summary of the general methodology used to conduct the visual assessment (refer to Table  for precise criteria). 
Areas with oblique images were treated first.
Each aerial was screened for cloud cover and potentially obstructed buildings were marked as 9 unless the oblique imagery provided a clear view of the building, in which case the oblique was used to classify the building.
For each building we checked first against the pre-disaster aerial for footprint geometry consistency; in cases of mismatch or ambiguity we classified the existing polygon as 99 and added a new polygon (based on the pre-disaster aerial) to the database.
In the case of coherence between the footprint polygon and the pre-disaster aerial, we classified the building as 0 or 1 (survived/destroyed) using the post-disaster vertical imagery and corroborated against the oblique image if available and clear.
After each section was completed, we conducted a quality control round to ensure consistency across the assessment.
After the initial assessment, we conducted a secondary round of validation using limited-scope surveys by independent research teams and crowd-sourced feedback from the public.
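The per-building steps above can be sketched as a small decision function. This is a minimal illustration, not the working group's tooling: the function and argument names are hypothetical, each boolean stands for a judgement made by a human annotator, and the order of checks (obstruction first, then footprint consistency) is an assumption taken from the order in which the steps are listed.

```python
from enum import IntEnum


class DamageClass(IntEnum):
    """Database values as listed in the classification criteria table."""
    SURVIVED = 0
    DESTROYED = 1
    OBSTRUCTED_VIEW = 9
    MISSING_OR_INCONSISTENT = 99


def classify_building(vertical_view_clear: bool,
                      oblique_view_clear: bool,
                      footprint_matches_pre_event: bool,
                      structurally_unsound: bool) -> DamageClass:
    """Encode the order in which annotator judgements are consulted.

    All arguments are hypothetical names for human visual judgements;
    nothing here is computed from imagery.
    """
    # Cloud or shade screening: a clear oblique image can rescue an
    # obstructed vertical view; otherwise the building is marked 9.
    if not (vertical_view_clear or oblique_view_clear):
        return DamageClass.OBSTRUCTED_VIEW
    # Footprint geometry is checked against the pre-disaster aerial.
    # On mismatch the existing polygon is marked 99 (a redrawn polygon
    # would then be classified separately).
    if not footprint_matches_pre_event:
        return DamageClass.MISSING_OR_INCONSISTENT
    # Coherent footprint with a usable view: binary damage decision.
    if structurally_unsound:
        return DamageClass.DESTROYED
    return DamageClass.SURVIVED
```

A redrawn polygon would be passed through the same function with `footprint_matches_pre_event=True`, since it is drawn from the pre-disaster aerial by construction.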
The assessment began as a limited-scope pilot investigation for the tsunami-affected area but expanded to include the entire peninsula. Based on the pilot investigation, we conceived an initial “binary+” classification schema that was eventually formalized into the final classification system with minimal adjustment – these classes are defined as reported in Table .
We provide an approximate equivalence table between classification methods in Table .
Figure 5. Oblique coverage by Kokusai Kogyo and inherent confidence of visual assessment. Inset basemap © Google Earth 2024.
[Figure omitted. See PDF]
This decision is in part supported by previous findings by , who mention that crowd sourcing yields bias towards edge classes (“no damage” and “destroyed”) at the expense of middle classes (“damaged” and/or “possibly damaged”). Where possible, the visual assessment is supported by oblique imagery, which proved invaluable in many instances. This was especially true for edge cases such as areas with poor visibility, densely packed areas wherein buildings collapsed vertically (“pancake” collapse), or overcast areas. Initially we considered a multi-class damage assessment to fully leverage the oblique imagery. However, a combination of cloud obstruction in the vertical imagery and limited coverage of the oblique imagery did not allow a comprehensive assessment (Fig. ).
As mentioned in Sect. , significant discrepancies exist between the GSI building inventory and the pre-event orthophotos. These discrepancies are particularly pronounced in rural areas where the GSI building inventory is often missing or inconsistent with the orthophoto baseline. These mismatches range from minor misalignments to geometry changes (in the case of additions, refurbishing, or knock-down rebuild) and significant changes in the building footprint that are unattributable even when cross-checking the historical imagery. When reasonable, we attempted to adjust the GSI building footprints to match the orthophoto baseline; this is done in order to preserve the original GSI building metadata (see Sect. ). However, in most cases the ambiguity is significant enough that the building is marked as 99 (missing or inconsistent; Table ) and new buildings are drawn in its place.
Figure 6. Examples of change in blue tarp coverage in Wajima between aerial imagery capture missions. The figure highlights challenges faced through potential issues with coverage, atmospheric conditions, and source mismatch. Aerial imagery courtesy of .
[Figure omitted. See PDF]
Since only 16 % of buildings lie within the viewing angle of oblique imagery, a multi-class assessment was deemed less viable. make a case for the inclusion of blue-tarp-covered buildings as a separate class in their deep learning classification framework: they note that the presence of blue-tarp-covered structures correlated with moderate–heavy building damage classes. Although we initially considered including tarp-covered buildings as a distinct class, tarp presence is inconsistent between the GSI-provided vertical images; this is observable for segments where overlapping orthophotos are available, such as Wajima. Figure provides an example of the mismatch in tarp presence between mission dates. In other instances, such as Anamizu-machi, spotty cloud cover makes the identification of tarp-covered buildings particularly challenging. Ultimately, a conservative approach was deemed preferable.
2.3 Tsunami damage assessment
The tsunami that impacted the eastern coast of the Noto Peninsula was purportedly generated in part by the rupturing of several offshore active faults. In addition, seismic activity may have aggravated submarine landslides in southern Toyama Bay, leading to subsequent tsunami amplification that was ultimately responsible for much of the damage experienced along the eastern coast of the Noto Peninsula .
Figure 7. Estimated inundation area provided by GSI; backdrop: .
[Figure omitted. See PDF]
The estimated tsunami inundation extent lies almost entirely along the northeastern coast of the Noto Peninsula area and stretches from the northernmost point of the Suzu municipality to the Nanao municipality in the south. conducted several surveys of the tsunami inundation area and provided comprehensive information on the inundation and run-up heights of the tsunami. On the western coast, only a small extent on the northern portion of the Shika municipality was indicated as inundated (Fig. ).
Table 4. Approach to mismatches between footprint polygons and the orthophoto baseline.
| Case | Action |
| Polygon does not reflect the shape in the orthophoto | Adjust (add, split, merge) |
| Polygon does not appear to correspond to a pre-event or post-event building | Mark building as 99 |
| Polygon does not exist, building is evident on pre-disaster orthophoto | Polygon is manually added |
The intersection between the estimated tsunami inundation and the GSI building inventory was the first portion of the damage assessment to be carried out as a preliminary measure. A total of 3261 building polygons were originally included. This pilot investigation served as the basis for the initial criteria, which were then expanded to cover the entire peninsula. Notably, it was during this pilot that we first observed that the GSI building inventory was often inconsistent with the pre-event orthophoto baseline. In these cases we handled mismatches with the heuristic noted in Table ; this was later folded into the final classification system and process heuristic explained at the top of this section.
2.4 Earthquake damage assessment
The scope of the visual assessment was expanded upon completion of the tsunami assessment. The criteria were adjusted to include modes of damage exogenous to tsunami-induced failure: including considerations for landslide displacement and burial, as well as fire damage. Moreover, concessions were made for sunshade and buildings under canopy, conditions that seldom affect urbanized coastal settlements where tree cover is diminished. The final damage inventory for the whole domain resolves to 140 208 buildings, 25 685 of which were digitized manually. The large proportional disparity in digitized buildings between the tsunami-affected areas (4.1 %) and the entire domain (18.3 %) is largely due to vast portions of building footprints in the countryside being mismatched.
2.5 Crowd-sourced feedback
An initial version of the database was made public for viewing on 11 February 2024 
A second set of review information was made available by limited-scope surveys (conducted by research teams) that provided photo evidence to assist the technical validation process. This data informed a quantitative statistical analysis of error margins for the assessment.
Table 5. Details regarding table attributes contained in the GeoPackage dataset.
| Attribute | Type length | Valid entries | Description |
| | | [1–140 208] | Unique identifier for the building (original). |
| | | GSI serialization standard or “manual” | Serial feature identifier from the original XML file. |
| | | [0, 1, 9, 99] | Damage class attributed as part of this assessment, as per Table . |
| | | [0, 1, 9, 99] | Damage class after technical validation (Sect. ) attributed as part of this assessment. |
| | | array or NULL | Oblique image source number from KKC inventory (where available); see Sect. for access to the KKC repository. |
| | | Prefecture-City-Town (Japanese) | Municipality name from e-Stat*. |
| | | [single, multi] | Confidence level of the assessment as per Fig. based on oblique coverage. |
| | | [0, 1] | Whether building intersects fire-impacted polygon. |
| | | [0, 1] | Whether building intersects slope failure polygons. |
| | | [0, 1] | Whether building intersects tsunami inundation polygons. |
| | | Real | Modified Mercalli Intensity inherited from the layer. |
| | | | Vector geometry of the building footprint. |
* Available at the following. 
Figure 8. GSI mesh tiles considered for the assessment, available at: 
[Figure omitted. See PDF]
3 Data description
In this section we describe the structure of the dataset, technical notes, attributes, and secondary sources.  The database is stored as the GeoPackage  
The dataset uses coordinate reference system (CRS) 
Figure 9. Dataset validation areas are estimated from imagery provided by an independent survey team. A path was fit through the location metadata of each photo. We assume a 40 m buffer around the path to be a reasonable visible area for the survey team, judging by the photographic evidence provided. Inset basemap courtesy of © Google Earth 2024.
[Figure omitted. See PDF]
4 Technical validation
The database was split into working subsections to annotate using single-source or multi-source remote sensing imagery; we refer to this process as human annotation. Once completely classified, each subsection was reviewed by a different team member and integrated into the live database. Our iterative validation was twofold. First, through our open web API, we collected voluntary requests for correction, each submission requiring photographic evidence. Each building for which a correction was submitted was assigned a new validated damage class (Table ), provided that the submitted evidence conformed to our binary damage classification (Table ).
Data provided by two independent on-site photographic surveys, by Tokoha University and Tohoku University, were used to validate portions of the database similar to how crowd-sourced data were handled. The surveys provide coherent coverage of four major settlements: Wajima, Suzu, Anamizu, and Monzenmachi (Wajima), as well as scattered inland rural settlements (Fig. ).
Since the data are unbiased with respect to the database (i.e., all damaged buildings were documented along the survey path irrespective of the damage assessment class), the coverage was used to estimate the accuracy of human annotation. Each photo was taken from ground level and geotagged, forming a dense set of nodes. An approximate path was generated using a range-limited nearest-neighbor algorithm. Finally, the intersection between the building database and a 40 m buffer (a reasonable field of view, assumed from photo inspection) around the paths was taken as the surveyed extent. Our initial labels (human annotation) are taken as the estimates and measured against the surveyed (corrected) ground truth, and we report standard classification metrics in Table .
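The path-and-buffer procedure described above can be sketched in plain Python. This is an illustrative approximation under stated assumptions, not the authors' implementation: coordinates are assumed to be in a projected CRS with units of metres, the linking range `max_range` is a hypothetical parameter (the source does not state its value), and each building is reduced to a single representative point rather than intersecting full footprint polygons with a buffer polygon.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def chain_paths(photos: Sequence[Point], max_range: float) -> List[List[Point]]:
    """Greedy range-limited nearest-neighbor chaining of geotagged photos.

    Starting from the first unvisited photo, repeatedly hop to the nearest
    unvisited photo; if it lies farther than max_range, close the current
    path and start a new one.
    """
    remaining = list(photos)
    paths: List[List[Point]] = []
    while remaining:
        path = [remaining.pop(0)]
        while remaining:
            last = path[-1]
            nearest = min(remaining, key=lambda p: math.dist(last, p))
            if math.dist(last, nearest) > max_range:
                break
            remaining.remove(nearest)
            path.append(nearest)
        paths.append(path)
    return paths


def distance_to_segment(p: Point, a: Point, b: Point) -> float:
    """Euclidean distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0.0:
        return math.dist(p, a)
    # Projection parameter clamped to the segment's extent.
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def within_buffer(p: Point, paths: List[List[Point]], buffer_m: float = 40.0) -> bool:
    """True if p lies within buffer_m of any path (or isolated photo)."""
    for path in paths:
        if len(path) == 1 and math.dist(p, path[0]) <= buffer_m:
            return True
        for a, b in zip(path, path[1:]):
            if distance_to_segment(p, a, b) <= buffer_m:
                return True
    return False
```

Buildings for which `within_buffer` returns true would form the surveyed extent against which the initial labels are scored. A production workflow would more likely use a GIS library to buffer the path and intersect it with the footprint polygons directly.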
Table 6. Classification statistics for independently surveyed areas, showing the approximate accuracy of the visual assessment against comprehensive ground documentation.
| Class | Precision | Recall | F1 score | Samples |
| Survived | 0.95 | 0.99 | 0.97 | 1666 |
| Destroyed | 0.99 | 0.84 | 0.91 | 559 |
The harmonic mean of the F1 scores of the survived and destroyed classes is 0.939, suggesting high confidence in the assessment. A spatial representation of the survey coverage is given in Fig. ; notably, a large portion of the surveyed areas is outside of multi-source coverage, suggesting that, despite the limitations described above, the proposed visual assessment framework is robust. We hope that this exercise in crowd-sourced and survey validation will permit further statistical investigations into the features and limitations of manual image-based rapid building damage visual assessments.
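As a quick consistency check on the reported metrics, the per-class scores and the aggregate value can be recomputed from the tabulated precision and recall. The sketch below assumes that the third numeric column of the classification table is the F1 score and that the aggregate 0.939 is the harmonic mean of the two rounded per-class scores; both assumptions reproduce the published figures.

```python
from statistics import harmonic_mean


def f1_score(precision: float, recall: float) -> float:
    """F1 is the harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


# Precision and recall as reported for the independently surveyed areas.
reported = {
    "survived": {"precision": 0.95, "recall": 0.99},
    "destroyed": {"precision": 0.99, "recall": 0.84},
}

# Round to two decimals, matching the precision of the published table.
f1_by_class = {
    name: round(f1_score(m["precision"], m["recall"]), 2)
    for name, m in reported.items()
}

# Aggregate score: harmonic mean of the two rounded per-class F1 values.
aggregate = round(harmonic_mean(f1_by_class.values()), 3)
```

The harmonic mean penalizes the weaker class more than an arithmetic mean would, so the aggregate is pulled towards the lower recall of the destroyed class.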
5 Discussion
The unique nature of the disaster is reflected in its varied impacts on buildings, such as ground shaking, subsidence, uplifting, tsunami surge, soil liquefaction, landslide, and fire, among others (Fig. ). The dataset provides a comprehensive visual assessment of building damage across the Noto Peninsula, including all the aforementioned impacts. With this contribution, we aim to provide a reference for future studies and a benchmark for automated methods.
To guarantee a high degree of consistency across all working members, our classes needed to be as clear-cut as they were manageable. A potentially useful third class would necessarily split the “survived” class, analogous to the scale provided by Copernicus Emergency Management Service (CEMS) . Roof damage and horizontally displaced rubble are generally the only visible signs of a damage spectrum between ideal “no-damage” and “destroyed” classes. Any potential third class would be predicated on the presence of these defining characteristics.
Regrettably, timing and weather conditions severely limited the return period of sufficiently clear or redundant vertical imagery (Table ).  Major seasonal pressure systems accompanied modest seasonal snowfall across the peninsula over the first 3 weeks of the year.  The cloud cover and snow buildup effectively impeded the classification of 9456 buildings (criteria details are given in Table ).  Notable among the mosaics described in Table  is the 
The winter season poses particular challenges for image classification – much of the natural environment tends towards darker, less saturated colors due to a combination of factors: houses in the Noto Peninsula generally feature traditional black roof tiles. Overcast weather can decrease color saturation by reducing available light (hence reflection). Relative darkening can reduce the contrast between dark roofs and the background environment in both cities (concrete and asphalt colors) and the countryside (deciduous vegetation tends towards browns and greys). Many of these challenges can hamper the visibility of roof cracks, missing tiles, exposed roof beams, and scattered rubble that may be used to distinguish between classification grades.
The Copernicus guidelines for CEMS make a case to diverge from EMS-98, citing that “such methodologies are fundamentally designed for ground-based field assessments, and thus are not intentionally tailored to be used with remotely sensed images”. Moreover, EMS-98 only considers masonry and reinforced concrete buildings, which are inappropriate for a context such as Japan where wooden buildings are overwhelmingly prevalent. Okada et al. provide a damage index (and equivalent grades) better suited to the context of the Noto Peninsula; however, this index is also conceived for ground-based assessments and relies on accurately assessing the condition of load-bearing walls and pillars. In principle, the CEMS index provides the most appropriate framework for our use case; however, CEMS is not designed with a consideration for multi-source, multi-modal data. Obliques can allow for vastly more granular classification contingent on the viewing angle and distance from the target building. The following items contributed to our final decision to use a binary classification.
Only 16 % of buildings lie within the viewing angle of oblique imagery.
While oblique imagery provides significant redundancy, not all buildings visible in the frame are clear; draw distance and image resolution are significantly more variable than in vertical imagery where the distance to the target is more consistent.
The failure modes vary depending on the hazard (e.g., tsunami, landslide, fire; Fig. ), and hence some sort of equivalence is needed to compare the different failure modes relative to the same scale.
Oblique cover is split between failure modes: for example, in Suzu, earthquake and tsunami damage is present, while in Wajima, earthquake and fire damage is present. Finally, along the north coast between Wajima and Suzu, the majority of the damage is caused by landslides and slope failure.
Ultimately, we valued consistency and comparability over the potential for a conditional, more granular classification. For the purposes of this project a binary classification was deemed preferable – a breakdown of how our assessment relates to popular reference scales is given in Table . We fully endorse and encourage the use of this dataset by the research community and beyond as the starting point for more granular and detailed assessments of the damage now that significantly more information is available.
5.1 On multi-hazard failure modes
The dataset can inform studies that aim to understand the different multi-hazard failure modes given the different impacts listed above – explore multi-hazard damage detection models but focus on aggregating each hazard discretely by type.
Figure 10. Different impacts across contiguous areas of the Noto Peninsula illustrate how multiple hazards may manifest across a single event with extreme proximity. Basemap courtesy of © Google Earth 2024, obliques by © .
[Figure omitted. See PDF]
However, as Fig. illustrates, multi-hazard failures not only occur within the same domain, but can also be present in contiguous sections of the same town. In the figure, we show how earthquake damage is often compounded by fire, landslide, or tsunami damage; in cases of more populated areas, multiple hazards are present at once, as can be seen in Wajima where fire damage, landslide damage, and earthquake damage are all present.
5.2 Machine learning applications
In the field of disaster geo-informatics, our dataset can serve as training data for machine learning tasks. In its current form, the dataset can be used to test pre-trained models such as those proposed by , , and . In this context our dataset offers a new, valuable, out-of-domain test set . A speculative framework, specifically focused on the multi-hazard nature of the Noto earthquake disaster discussed above, is illustrated in Fig. .
Figure 11. A speculative framework that might be used to investigate multi-hazard failure modes (for illustrative purposes). Basemaps courtesy of © Google Earth 2024.
[Figure omitted. See PDF]
Combined with population data, our database can enable more granular quantitative research into injury and mortality.
Figure 12. (a) Histogram of aggregated building damage. (b) Empirical fragility function (red solid line) for earthquake-affected buildings relative to PGV. We provide two wood building fragility functions for “major” and “moderate+” damage classes (dotted line and dashed line, respectively) proposed by for buildings built between 2001 and 2016 and affected by the 2016 Kumamoto earthquake.
[Figure omitted. See PDF]
5.3 Statistical approaches and baseline model
To encourage the research community's engagement, we provide a statistical baseline of the damage across the non-inundated portion of the Noto Peninsula dataset (Fig. ).
Figure 13. Composite of exacerbating hazards (in addition to seismic impact). Inundation, fire area, and slope failure/sedimentation extents are provided by the . Peak ground acceleration estimates provided by in collaboration with .
[Figure omitted. See PDF]
We propose an aggregated seismic empirical fragility function relative to the peak ground velocity (PGV) registered during the event. Importantly, this fragility function is built on the subset of data not affected by aggravating hazards (inundation, fire, or landslide) illustrated in Fig. .
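Hence we assume that the damage is solely due to seismic shaking. We fit the aggregated data using a lognormal distribution (Eq. 1) and estimate the parameters using ordinary least squares (, ).
$$P(\mathrm{PGV}) = \Phi\left(\frac{\ln \mathrm{PGV} - \mu}{\sigma}\right) \qquad (1)$$
Here $\Phi$ is the standard normal cumulative distribution function, and $\mu$ and $\sigma$ are the mean and standard deviation of $\ln \mathrm{PGV}$.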
As a frame of reference, we report two fragility functions proposed by for new wooden buildings affected by the Kumamoto earthquake in 2016. Our baseline fragility function suggests that buildings in the Noto Peninsula were about as vulnerable as wood buildings built between 2001 and 2016 and destroyed in the Kumamoto earthquake.
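A lognormal fragility fit of the kind described in this section can be sketched with the standard library alone. The sketch assumes binned data (a representative PGV and an observed damage ratio strictly between 0 and 1 per bin) and performs the least-squares fit on lognormal probability paper, regressing ln PGV on the standard normal quantile of the damage ratio. The function names, the binning, and the choice of regression direction are assumptions for illustration; the source states only that a lognormal form and ordinary least squares were used.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple

STD_NORMAL = NormalDist()  # standard normal, mean 0 and standard deviation 1


def fragility(pgv: float, mu: float, sigma: float) -> float:
    """Lognormal fragility: probability of damage at a given PGV."""
    return STD_NORMAL.cdf((math.log(pgv) - mu) / sigma)


def fit_lognormal_fragility(pgv: Sequence[float],
                            damage_ratio: Sequence[float]) -> Tuple[float, float]:
    """Estimate (mu, sigma) by ordinary least squares.

    Uses the linear relation ln(PGV) = mu + sigma * z, where z is the
    standard normal quantile of the observed damage ratio. Ratios must
    lie strictly between 0 and 1, otherwise inv_cdf raises an error.
    """
    z = [STD_NORMAL.inv_cdf(p) for p in damage_ratio]
    ln_pgv = [math.log(v) for v in pgv]
    n = len(z)
    z_mean = sum(z) / n
    ln_mean = sum(ln_pgv) / n
    s_zz = sum((zi - z_mean) ** 2 for zi in z)
    s_zl = sum((zi - z_mean) * (li - ln_mean) for zi, li in zip(z, ln_pgv))
    sigma = s_zl / s_zz           # slope of the regression line
    mu = ln_mean - sigma * z_mean  # intercept of the regression line
    return mu, sigma
```

With this parameterization, `math.exp(mu)` is the PGV at which half of the buildings are expected to be damaged, which makes fitted curves easy to compare with reference functions such as those for the Kumamoto earthquake.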
6 Data availability
The database is provided as a standard GeoPackage containing a single vector layer accessible through any software implementing the Geospatial Data Abstraction Library (GDAL/OGR) such as QGIS or ArcGIS. Each entry is represented by a building footprint with seven attributes summarized in Table . 
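Because a GeoPackage is an SQLite container, its layer and attribute names can be inspected without GIS software. The sketch below uses only the Python standard library and the `gpkg_contents` table defined by the GeoPackage standard. The function names and the file name in the usage note are hypothetical, and the actual layer and attribute names should be read from the file rather than assumed.

```python
import sqlite3


def list_feature_layers(gpkg_path):
    """Return the names of vector (feature) layers registered in a GeoPackage."""
    con = sqlite3.connect(gpkg_path)
    try:
        rows = con.execute(
            "SELECT table_name FROM gpkg_contents WHERE data_type = 'features'"
        ).fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]


def list_columns(gpkg_path, layer):
    """Return (column_name, declared_type) pairs for one layer table."""
    con = sqlite3.connect(gpkg_path)
    try:
        # PRAGMA statements do not accept bound parameters, so the
        # identifier is double-quoted with embedded quotes escaped.
        quoted = '"' + layer.replace('"', '""') + '"'
        rows = con.execute(f"PRAGMA table_info({quoted})").fetchall()
    finally:
        con.close()
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [(row[1], row[2]) for row in rows]
```

For example, `list_feature_layers("noto_buildings.gpkg")` (a placeholder file name) should return the single building layer, and `list_columns` on that layer should list the geometry column together with the attribute columns. Geometries are stored as binary blobs, so reading the footprints themselves is better done through a GDAL-based reader.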
The database is available in its most updated version from our public repository at https://doi.org/10.5281/zenodo.11055711.
Epicenter and intensity contours are available at the USGS event page (
Earthquake swarm data are available through the Japan Meteorological Agency (JMA)'s website (
Post-event raster orthophotography, inundation, fire, and slope failure vector extents are available through the GSI's dedicated Noto Peninsula earthquake page (https://www.gsi.go.jp/BOUSAI/20240101, last access: 25 April 2024, Japanese).
Oblique imagery is provided by KKC. For disaster events, KKC may make their products available for free through the BOIS portal (
7 Conclusions
We present a comprehensive building damage database for the Noto Peninsula earthquake of 2024, developed through a multi-source, multi-modal visual assessment of building damage.
The particular circumstances of this event, timeliness of data availability, degree of coverage, and access to in situ survey information presented a singular opportunity to develop and validate this new dataset through a unique framework.
Providing this dataset offers the opportunity to study impacts of multi-hazard disasters on building damage. Figure illustrates how different hazards manifested across contiguous areas of the Noto Peninsula. Understanding the different impacts may provide valuable insights for disaster response and recovery planning. Future studies may leverage our dataset to develop novel multi-hazard models that can predict building damage across different impacts (a speculative framework is shown in Fig. ). With this contribution we hope to enrich the global corpus of disaster building damage datasets.
We provide the hand-curated building inventory as a GeoPackage through the public repository at https://doi.org/10.5281/zenodo.11055711. Each building was classified through human inspection into four classes: survived, destroyed, obstructed view, and missing or inconsistent. Limited-scope validation was conducted through crowd-sourced community feedback submitted via our online portal and through independent survey data collected by experts in the field. In its immediate form, the dataset may be used to do the following:
Train site-specific statistical and machine learning models for building damage assessment.
Test domain adaptation frameworks for building damage assessment by testing pre-trained models on our new out-of-domain dataset as illustrated by .
Fine-tune pre-trained models on our dataset to improve performance across datasets as shown in .
Develop novel multi-hazard models that can predict building damage across different impacts.
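For the model-testing use cases listed above, predictions from a pre-trained model can be scored against the dataset labels. The following is a minimal sketch, assuming that only buildings labeled "survived" or "destroyed" are scored and that the other two classes are excluded. The label strings follow the class names given above, while the function name and the exclusion rule are illustrative assumptions rather than part of the dataset.

```python
def binary_damage_metrics(y_true, y_pred, positive="destroyed"):
    """Precision, recall, and F1 for the 'destroyed' class.

    Pairs whose reference label is neither 'survived' nor 'destroyed'
    (e.g. obstructed view) are dropped before scoring (an assumption).
    """
    scored = [
        (t, p) for t, p in zip(y_true, y_pred)
        if t in ("survived", "destroyed")
    ]
    tp = sum(1 for t, p in scored if t == positive and p == positive)
    fp = sum(1 for t, p in scored if t != positive and p == positive)
    fn = sum(1 for t, p in scored if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1, "n": len(scored)}
```

Reporting the number of scored buildings alongside the metrics makes it clear how many entries were excluded because the reference label was not a damage state.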
In combination with additional data sources, such as population data and post-disaster information, our dataset can inform further investigation into disaster logistics, evacuation, injury, and mortality. We hope that this dataset will serve as a reference for future studies on building damage assessment, disaster response, and recovery planning.
Author contributions
RV: writing (original draft preparation), validation, visualization. RV and SW: methodology, software, formal analysis, data curation, investigation. CYH, JM, XD, SI, KW, and YE: data curation. BA, EM, and AM: conceptualization, investigation, supervision, field survey, writing (review and editing). SK: conceptualization, supervision, field survey, writing (review and editing), funding acquisition.
Competing interests
The contact author has declared that none of the authors has any competing interests.
Disclaimer
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
Special issue statement
This article is part of the special issue “Methodological innovations for the analysis and management of compound risk and multi-risk, including climate-related and geophysical hazards (NHESS/ESD/ESSD/GC/HESS inter-journal SI)”. It is not associated with a conference.
Acknowledgements
This study was supported by JSPS KAKENHI (Grants-in-Aid for Scientific Research; 21H05001, 22K21372, and 22H01741), JST SICORP (JPMJSC2311), and the SIP of CSTI (JPJ012289).
Financial support
This study was supported by JSPS KAKENHI (Grants-in-Aid for Scientific Research; 21H05001, 22K21372, and 22H01741), JST SICORP (JPMJSC2311), and the SIP of CSTI (JPJ012289).
Review statement
This paper was edited by Kirsten Elger and reviewed by two anonymous referees.
© 2025. This work is published under https://creativecommons.org/licenses/by/4.0/ (the "License"). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
We present a building damage dataset following the 2024 Noto Peninsula earthquake. The database was compiled from freely available, multi-source, remote sensing data, verified through opt-in crowd-sourced information. The dataset consists of georeferenced polygons representing the pre-event building footprints of 140 208 structures. Each building was classified through visual inspection using pre-disaster and post-disaster vertical, oblique, survey, and verifiable news reporting imagery. Entries were validated using voluntary submission data sourced through a web API hosting a live version of the database. We calculate classification metrics for a subset of the database where ground survey photographs were provided by independent surveyors. An average 
Details
; Tanaka, Satoshi 4; Koshimura, Shunichi 2
1 Department of Civil and Environmental Engineering, Tohoku University, Aoba 468-1, Aramaki, Aoba-ku, Sendai, 980-8572, Japan
2 International Research Institute of Disaster Science (IRIDeS), Tohoku University, Aoba 468-1, Aramaki, Aoba-ku, Sendai, 980-8572, Japan
3 Department of Civil Engineering and Architecture, School of Engineering, Aoba-6-6-06 Aramaki, Aoba Ward, Sendai, Miyagi, 980-8572, Japan
4 Faculty of Social and Environmental Studies, Department of Social and Environmental Studies, Tokoha University, Yayoi-cho 6-1, Suruga-ku, Shizuoka city, 422-8581, Japan





