Goose Research · April 2026 · Preprint

Understanding UAV Crash Patterns: An Empirical Analysis of 8,668 Real-World PX4 Flights

We trained and validated a crash-prediction model on 40,229 community-submitted PX4 ULog files, an 8× scale-up from our initial dataset of 4,800 samples. Using a 56-feature extraction pipeline and gradient-boosted classification, we identify the dominant crash predictors and the sensor signals that most reliably separate failed flights from healthy ones. Results are consistent across both dataset versions: maximum roll angle and peak G-force dominate model importance (about 59% of XGBoost gain between them in v2), and IMU accelerometer clipping is roughly 19× more frequent in crashed flights than in normal ones.

Metric	Value
Training samples (v2)	40,229
Crash rate (v2 dataset)	17.6%
Model features	56
CV AUC (XGBoost)	1.000

Dataset growing · ~480 logs/hour streamed from PX4 flight.review public database

Section 1

Dataset & Methodology

Logs were sourced from the PX4 flight.review public database — a community-driven repository of real-world UAV telemetry. Each log is a binary ULog file containing synchronized timeseries from all onboard sensors, flight controller state, and autopilot estimates. We stream logs continuously, parsing each with our open-source forensic engine and storing 190 features per flight in a structured SQLite database for analysis and ML training.
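As a rough illustration of the storage step, the sketch below writes per-flight features into SQLite in a long (one row per feature) layout. The table name, column names, log identifier, and the `peak_g_overall` value are hypothetical placeholders; the actual Goose schema is not described here and may differ.

```python
import sqlite3
from typing import Dict


def store_features(conn: sqlite3.Connection, log_id: str,
                   features: Dict[str, float]) -> None:
    """Upsert one flight's features, one row per (log_id, feature)."""
    # Hypothetical schema: a long table avoids hard-coding 190 columns.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS flight_features ("
        "log_id TEXT NOT NULL, feature TEXT NOT NULL, value REAL, "
        "PRIMARY KEY (log_id, feature))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO flight_features VALUES (?, ?, ?)",
        [(log_id, name, value) for name, value in features.items()],
    )
    conn.commit()


# Usage with an in-memory database and placeholder values.
conn = sqlite3.connect(":memory:")
store_features(conn, "example-log", {"max_roll_deg": 68.0, "peak_g_overall": 4.2})
row_count = conn.execute("SELECT COUNT(*) FROM flight_features").fetchone()[0]
```

The composite primary key makes re-processing a log idempotent: storing the same flight twice replaces its rows instead of duplicating them.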

Crash labels were assigned using a multi-signal telemetry heuristic: flights with crash_confidence ≥ 0.80 (derived from altitude drop rate, G-force signature, attitude divergence, and motor cutoff patterns) were labeled crash-positive. Flights with zero confidence and duration ≥ 30 s were labeled clean.
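The labeling rule above can be sketched as a small function. The thresholds (0.80 and 30 s) come from the text; the function name and the choice to return `None` for flights that match neither rule are assumptions, since the text does not say how intermediate-confidence or short flights are handled.

```python
from typing import Optional

CRASH_THRESHOLD = 0.80        # from the text: crash_confidence >= 0.80
MIN_CLEAN_DURATION_S = 30.0   # from the text: clean flights last >= 30 s


def label_flight(crash_confidence: float, duration_s: float) -> Optional[int]:
    """Return 1 (crash-positive), 0 (clean), or None (left unlabeled)."""
    if crash_confidence >= CRASH_THRESHOLD:
        return 1
    if crash_confidence == 0.0 and duration_s >= MIN_CLEAN_DURATION_S:
        return 0
    # Assumption: flights matching neither rule are excluded from training.
    return None


examples = [
    label_flight(0.85, 12.0),  # high confidence -> crash
    label_flight(0.0, 45.0),   # zero confidence, long enough -> clean
    label_flight(0.40, 45.0),  # intermediate confidence -> unlabeled
    label_flight(0.0, 10.0),   # zero confidence but too short -> unlabeled
]
```

Leaving the ambiguous middle band unlabeled gives a cleaner training signal, at the cost of the model never seeing borderline flights.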

Vehicle Type Distribution

Quadcopter

74.5%

Fixed-wing

8.2%

VTOL

7.6%

Hexacopter

Octocopter

3.3%

Hardware Platforms (top 6, >50 logs each)

Platform	Logs	Crash Rate
PX4 SITL (simulator; test/dev scenarios)	819	62.9%
MICOAIR H743	87	48.3%
HKUST NXT DUAL	511	40.3%
MICOAIR H743 V2	573	37.7%
CUAV X7 Nano	127	36.2%
PX4 FMU V6C (flagship; most common real HW)	1,278	31.7%

Section 2

Crash Rate Analysis

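The overall crash rate is 30.7%: nearly 1 in 3 logged flights ends in a crash or anomaly event. This descriptive figure differs from the 17.6% crash share of the labeled v2 training set (7,083 of 40,229 samples) reported in Section 5. Rates vary substantially by vehicle configuration and autonomy level.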

Crash Rate by Vehicle Type

Vehicle type	Crash rate
VTOL (highest)	37.4%
Quadcopter	31.5%
Fixed-wing	28.9%
Octocopter	28.7%
Hexacopter (lowest; motor redundancy)	21.1%

Crash Rate by Primary Flight Mode

Flight mode	Crash rate
Mission (autonomous; fully autonomous nav)	42.9%
Position hold	34.1%
Altitude hold	26.6%
Manual (lowest; pilot in loop)	26.5%

Finding 1: Mission mode (fully autonomous flight) carries a 62% higher relative crash rate than manual flight (42.9% vs 26.5%, a gap of 16.4 percentage points). This points to GPS dependency, path planning edge cases, and failsafe handling as likely disproportionate contributors to real-world UAV incidents.

Finding 2: Hexacopters crash at the lowest rate of any vehicle class (21.1%), consistent with motor redundancy — a single motor failure can be tolerated without loss of control in a hex configuration.
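The "62% higher" figure in Finding 1 is a relative increase, not a difference in percentage points. A minimal sketch of the arithmetic, using the two rates from the flight-mode table (the function name is illustrative):

```python
def relative_increase(rate_a: float, rate_b: float) -> float:
    """Relative increase of rate_a over rate_b, as a fraction."""
    return rate_a / rate_b - 1.0


mission, manual = 0.429, 0.265  # crash rates from the flight-mode table

rel = relative_increase(mission, manual)     # about 0.619, i.e. ~62% higher
abs_gap_points = (mission - manual) * 100.0  # about 16.4 percentage points

print(f"relative: {rel:.1%}, absolute: {abs_gap_points:.1f} points")
```

Quoting both numbers avoids the common misreading that mission-mode flights crash 62 percentage points more often.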

Section 3

Sensor Coverage & System Faults

Not all sensors are present in every flight log. GPS is absent in 35.8% of flights, indicating widespread GPS-denied or GPS-degraded operations in the community fleet.

Sensor Presence Across Fleet

Sensor	Present in flights
Vibration (IMU)	98.6%
Attitude (IMU)	98.4%
CPU Load	98.2%
EKF Estimator	97.6%
Barometer	95%
Battery	91.2%
RC Link	82.2%
Magnetometer	70.5%
GPS (absent in 35.8% of flights)	64.2%

Finding 3: 46.3% of all flights entered EKF dead-reckoning mode — operating without GPS confirmation for at least part of the flight. This is the single most common fault-adjacent state in the dataset, and represents a critical vulnerability: position estimates degrade silently until GPS re-acquisition.

Failsafe Events (% of all flights)

RC signal lost (41.7% crash rate when triggered)	12.1%
Battery warning triggered	6.3%
Critical system failure	6.3%
Motor failure detected	0.85%
Imbalanced prop detected	0.43%

EKF Fault Rates: Yaw rejection 4.6% · Velocity rejection 1.3% · Horizontal position rejection 1.1% · Magnetometer fault 1.0% · Dead reckoning 46.3%

Section 4

Pre-Crash Signal Analysis

Comparing telemetry means between crashed and normal flights reveals large, systematic differences across attitude, vibration, power, and control loop channels.

Feature	Crashed (mean)	Normal (mean)	Ratio
Max roll angle	68.0°	13.6°	5.0×
Max pitch angle	38.3°	13.8°	2.8×
IMU accel clip events	3,999	206	19.4×
Min battery voltage	16.3 V	20.6 V	—
Rate pitch error RMS	121 °/s	15.3 °/s	7.9×
Rate oscillation amp (pitch)	29.6	6.4	4.6×

Condition	Crash rate	n
IMU clipping > 100 events	65%	738 flights
Freefall detected	64.9%	222 flights
After RC signal loss	41.7%	1,044 flights

Crashed flights show 5× higher maximum roll angle and 19× more IMU accelerometer clipping events than normal flights. These two signals alone achieve near-complete class separation in the dataset.
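Conditional crash rates such as "crash rate when IMU clipping > 100 events" amount to filtering flights on a feature threshold and averaging the crash label. The sketch below shows that computation on toy data; the function name and the example values are illustrative and are not drawn from the dataset.

```python
from typing import Iterable, Tuple


def conditional_crash_rate(flights: Iterable[Tuple[float, int]],
                           threshold: float) -> Tuple[float, int]:
    """Crash rate among flights whose feature value exceeds `threshold`.

    Each flight is (feature_value, crashed) with crashed in {0, 1}.
    Returns (rate, n); rate is NaN when no flight qualifies.
    """
    selected = [crashed for value, crashed in flights if value > threshold]
    n = len(selected)
    rate = sum(selected) / n if n else float("nan")
    return rate, n


# Toy (clip_events, crashed) pairs, for illustration only.
toy = [(250, 1), (120, 1), (101, 0), (40, 0), (3, 0), (900, 1), (150, 0)]
rate, n = conditional_crash_rate(toy, threshold=100)
```

Reporting `n` alongside the rate matters: a 65% rate over 738 flights is far more informative than the same rate over a handful.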

Section 5

Machine Learning Results

We trained an XGBoost gradient-boosted classifier on 40,229 labeled samples (v2 dataset, 8× larger than the initial 4,800-sample v1 run) using 5-fold stratified cross-validation. All features were clipped at the 0.1th / 99.9th percentile to remove outliers before imputing missing values with −1. The model reaches the same near-perfect AUC at both dataset sizes, which suggests that the top predictive features are stable rather than artefacts of small sample size.
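The preprocessing order described above (clip first, then impute) can be sketched in plain Python. The percentile definition used here (linear interpolation between order statistics) and the function names are assumptions; the text does not specify which percentile method was used.

```python
import math
from typing import List, Optional


def percentile(sorted_vals: List[float], q: float) -> float:
    """Linearly interpolated percentile (q in [0, 100]) of a sorted, non-empty list."""
    pos = (len(sorted_vals) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_vals[lo] * (1.0 - frac) + sorted_vals[hi] * frac


def clip_then_impute(column: List[Optional[float]], lo_q: float = 0.1,
                     hi_q: float = 99.9, fill: float = -1.0) -> List[float]:
    """Clip observed values to [lo_q, hi_q] percentiles, then fill missing with `fill`."""
    present = sorted(v for v in column if v is not None)
    if not present:
        return [fill] * len(column)
    lo, hi = percentile(present, lo_q), percentile(present, hi_q)
    # Missing values are filled after clipping, so the -1 sentinel never
    # influences the percentile bounds and is never clipped itself.
    return [fill if v is None else min(max(v, lo), hi) for v in column]


column = [float(i) for i in range(1001)] + [None]
cleaned = clip_then_impute(column)
```

Computing the bounds on observed values only keeps the sentinel distinguishable for features whose valid range is non-negative; for features that can legitimately equal −1, a different sentinel would be needed.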

Setting	Value
Model	XGBoost 3.2
Training samples	40,229
Crash / Normal	7,083 / 33,146
CV folds	5-fold stratified
CV AUC	1.000
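Stratification matters here because the classes are imbalanced (7,083 crash vs 33,146 normal). The sketch below illustrates the idea with a hand-rolled fold assignment; it is not the authors' code, and in practice a library routine such as scikit-learn's `StratifiedKFold` would normally be used.

```python
import random
from typing import List


def stratified_folds(labels: List[int], k: int = 5, seed: int = 0) -> List[int]:
    """Assign each sample a fold id in [0, k), preserving the class ratio per fold."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal shuffled indices round-robin so class counts differ by at most 1.
        for rank, i in enumerate(idx):
            fold_of[i] = rank % k
    return fold_of


# Class counts taken from the training summary above.
labels = [1] * 7083 + [0] * 33146
fold_of = stratified_folds(labels)
crash_per_fold = [
    sum(1 for y, f in zip(labels, fold_of) if f == fold and y == 1)
    for fold in range(5)
]
# crash_per_fold == [1417, 1417, 1417, 1416, 1416]
```

Each fold therefore holds close to the overall 17.6% crash share, so per-fold AUC values are comparable.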

Feature Importance — Top 10 (XGBoost gain)

A single feature, maximum roll angle, accounts for about 40% of total XGBoost gain importance. This is consistent with the raw means analysis: roll divergence is the clearest signature of loss of control.

Rank	Feature	Gain share
1	max_roll_deg	39.98%
2	peak_g_overall	10.33%
3	peak_g_last20pct	8.37%
4	att_roll_err_rms	6.98%
5	max_pitch_deg	6.84%
6	horiz_dist_m	5.06%
7	motor_cutoff_tilt	3.54%
8	att_roll_err_p95	2.91%
9	att_pitch_err_p95	2.04%
10	rate_roll_err_p95	1.38%
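Summing the listed gain shares shows how concentrated the importance is: the top three features account for about 58.7% and the top ten for about 87.4%, leaving roughly 12.6% spread over the remaining 46 features. A short sketch of that cumulative sum, with the values copied from the ranking above:

```python
from itertools import accumulate

# Top-10 gain shares in percent, in rank order.
gain_pct = [39.98, 10.33, 8.37, 6.98, 6.84, 5.06, 3.54, 2.91, 2.04, 1.38]

# Running total after each rank, rounded to the precision of the inputs.
cumulative = [round(c, 2) for c in accumulate(gain_pct)]
top3_share = cumulative[2]
top10_share = cumulative[-1]
```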

Note on label circularity: Training labels are derived from the same telemetry heuristics used in our crash detector (crash_confidence ≥ 0.80). The AUC of 1.000 reflects the model replicating the heuristic rather than independent ground truth. Feature importances remain informative, since they identify which signals carry the most discriminative information for these labels, but they may partly reflect the heuristic's own inputs (altitude drop rate, G-force signature, attitude divergence, and motor cutoff patterns). The AUC is consistent across v1 (4,800 samples) and v2 (40,229 samples), confirming stability. Human expert ground-truth labeling remains a planned future milestone.
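The circularity effect is easy to reproduce on synthetic data: when a label is a threshold on a signal the model can see, a perfect AUC follows by construction. In the sketch below, the roll values are random, the 45° threshold is a hypothetical stand-in for the real multi-signal heuristic, and the pairwise AUC helper is a simple illustration rather than the evaluation code used for the results above.

```python
import random
from typing import List


def auc(scores: List[float], labels: List[int]) -> float:
    """ROC AUC as P(score_pos > score_neg) over all pairs; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


rng = random.Random(0)
# Synthetic per-flight maximum roll angles in degrees (not real data).
max_roll = [rng.uniform(0.0, 90.0) for _ in range(400)]

# Heuristic label derived from the same signal the "model" scores on.
heuristic_label = [1 if r >= 45.0 else 0 for r in max_roll]
# Labels unrelated to the signal, for contrast.
independent_label = [rng.randint(0, 1) for _ in max_roll]

circular_auc = auc(max_roll, heuristic_label)        # exactly 1.0 by construction
independent_auc = auc(max_roll, independent_label)   # close to chance level
```

The contrast is the point: an AUC of 1.0 against heuristic labels says the features can reproduce the heuristic, while performance against independently assigned labels is a separate question.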

Section 6

Key Findings

Maximum roll angle is the single most predictive crash signal, capturing ~40% of XGBoost model importance. Crashed flights exhibit 5× higher maximum roll than normal flights (68.0° vs 13.6° mean).

Autonomous mission mode carries a 62% higher relative crash rate than manual flight (42.9% vs 26.5%), pointing to autopilot navigation failure modes as a likely disproportionate source of real-world incidents.

IMU accelerometer clipping is 19× more common in crashed flights. Flights with >100 clipping events crash at 65%, making heavy clipping one of the strongest single-feature predictors available.

GPS is absent in 35.8% of flights. 46.3% of all flights enter EKF dead-reckoning at some point. GPS dependency without adequate fallback is a systemic vulnerability across the community fleet.

VTOL vehicles crash most frequently (37.4%), likely due to transition-phase complexity. Hexacopters crash least (21.1%), consistent with motor redundancy providing a meaningful safety margin.

41.7% of flights with an RC signal loss event end in a crash. The remaining 58.3% do not, which is consistent with failsafe handling recovering a majority of these flights and underlines the value of effective failsafe configuration.

Battery minimum voltage is 4.3 V lower on average in crashed flights (16.3 V vs 20.6 V). This suggests that deep discharge and potential brownout conditions may be an underappreciated crash contributor.

Section 7

Limitations & Future Work

Crash labels are derived from telemetry heuristics rather than human expert verification. Ground-truth labeling by certified UAV safety investigators is a planned future milestone that would enable true out-of-distribution AUC measurement.

The dataset is biased toward PX4 firmware and the subset of operators who voluntarily submit logs to flight.review. ArduPilot, DJI, and commercial fleet logs are not represented.

Feature extraction runs on the complete flight log rather than a sliding window. Pre-crash precursor detection (identifying degradation in the 5–30 seconds before failure) requires temporal modeling not yet implemented.

The model currently classifies at the flight level. Per-segment classification (was takeoff healthy? was the approach phase nominal?) is a planned extension that would substantially increase operational utility.

Data Availability

Analyze Your Own Logs

The Goose forensic engine is open-source. Run it locally on your own hardware, with no cloud upload required: supply a PX4 ULog and receive a full forensic report with findings, confidence scores, and timeseries visualization in seconds.


Goose Flight Research · April 2026 · v2 dataset: 40,229 samples · growing at ~480 logs/hour

Data sourced from PX4 flight.review public database · Analysis engine: Goose-Core (open source)
