Smart grid monitoring electrical systems and Industrial IoT sensors fusion with vision Transformers
DOI: https://doi.org/10.7492/qy7pvv45
Abstract
This paper introduces a smart grid monitoring platform that fuses electrical system measurements, IoT sensor streams, and visual data using a Multi-Modal Cross-Attention Vision Transformer (MM-ViT). The model, implemented in PyTorch, combines time-series electrical measurements (voltage, current, harmonics, and temperature) with RGB and thermal images of grid assets. The cross-attention mechanism lets the model learn complementary relationships between sensor behavior and the visual condition of equipment, which improves fault detection, anomaly detection, and equipment health assessment. Experimental results show that the MM-ViT outperforms traditional sensor-only, vision-only, and hybrid deep learning models, with higher accuracy, fewer false alarms, and more stable forecasts. These results indicate that transformer-based multi-modal fusion can support more reliable situational awareness and proactive decision-making in modern smart grid infrastructure. The work contributes a scalable and robust method for next-generation power system monitoring.
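
To make the fusion idea concrete, the following is a minimal PyTorch sketch of cross-attention between sensor tokens and image patch tokens. It is an illustration under stated assumptions, not the MM-ViT described in the abstract: the class name `CrossAttentionFusion`, the embedding width, patch size, number of heads, and the three-class output are all hypothetical, and positional embeddings, the thermal image branch, and the transformer encoder stacks are omitted for brevity.

```python
import torch
import torch.nn as nn


class CrossAttentionFusion(nn.Module):
    """Hypothetical sketch: sensor tokens (queries) attend to image patches (keys/values)."""

    def __init__(self, sensor_channels=4, dim=64, heads=4, patch=16, num_classes=3):
        super().__init__()
        # Each time step carries 4 channels (assumed order: voltage, current,
        # harmonics, temperature) and is projected to a dim-wide token.
        self.sensor_proj = nn.Linear(sensor_channels, dim)
        # ViT-style patch embedding: a strided convolution over the RGB image.
        self.patch_embed = nn.Conv2d(3, dim, kernel_size=patch, stride=patch)
        self.cross_attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)
        self.head = nn.Linear(dim, num_classes)

    def forward(self, sensors, image):
        # sensors: (batch, time, channels); image: (batch, 3, height, width)
        q = self.sensor_proj(sensors)                            # (B, T, dim)
        kv = self.patch_embed(image).flatten(2).transpose(1, 2)  # (B, N, dim)
        # Cross-attention: queries come from sensors, keys/values from the image.
        attended, weights = self.cross_attn(q, kv, kv)           # weights: (B, T, N)
        fused = self.norm(q + attended)                          # residual + layer norm
        logits = self.head(fused.mean(dim=1))                    # pool over time steps
        return logits, weights


if __name__ == "__main__":
    torch.manual_seed(0)
    model = CrossAttentionFusion()
    sensors = torch.randn(2, 32, 4)     # 2 samples, 32 time steps, 4 channels
    image = torch.randn(2, 3, 64, 64)   # 64x64 RGB gives a 4x4 grid of 16 patches
    logits, weights = model(sensors, image)
    print(logits.shape, weights.shape)
```

The returned attention weights give, for each time step, a distribution over image patches, which is one way to inspect which regions of an asset the model associates with a given sensor reading. A thermal image could be handled in this sketch by a second patch embedding whose tokens are concatenated with the RGB tokens; whether the paper does this is not stated in the abstract.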














