Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft

2024-05-09Unverified0· sign in to hype

Debabrata Pal, Anvita Singh, Saumya Saumya, Shouvik Das

Unverified — Be the first to reproduce this paper.

Abstract

The intrinsic capability of the Human Vision System (HVS) to perceive depth of field and failure of Instrument Landing Systems (ILS) stimulates a pilot to perform a vision-based manual landing over an autoland approach. However, harsh weather creates challenges, and a pilot must have a clear view of runway elements before the minimum decision altitude. To aid in manual landing, a vision-based system trained to clear weather-induced visual degradations requires a robust landing dataset under various climatic conditions. Nevertheless, to acquire a dataset, flying an aircraft in dangerous weather impacts safety. Also, this system fails to generate reliable warnings, as localization of runway elements suffers from projective distortion while landing at crosswind. To combat, we propose to synthesize harsh weather landing images by training a prompt-based climatic diffusion network. Also, we optimize a weather distillation model using a novel diffusion-distillation loss to learn to clear these visual degradations. Precisely, the distillation model learns an inverse relationship with the diffusion network. Inference time, pre-trained distillation network directly clears weather-impacted onboard camera images, which can be further projected to display devices for improved visibility.Then, to tackle crosswind landing, a novel Regularized Spatial Transformer Networks (RuSTaN) module accurately warps landing images. It minimizes the localization error of runway object detector and helps generate reliable internal software warnings. Finally, we curated an aircraft landing dataset (AIRLAD) by simulating a landing scenario under various weather degradations and experimentally validated our contributions.

Tasks

All Language Modeling Language Modelling Self-Supervised Learning

Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft

Abstract

Tasks

Reproductions