Geographic Information Systems (GIS) are used to store, analyze, and visualize spatial data. Scientists were using spatial data and analysis to reveal important information long before such systems existed. This kind of analysis can help identify the source of a disease, locate pollution in an area, or profile consumers. One of the earliest examples is John Snow’s spatial analysis of the 1854 Broad Street cholera outbreak. He mapped the streets, the water pumps, and the cholera cases, and noticed that the cases clustered around the Broad Street pump. Today, many GIS tools, such as QGIS and ArcGIS, help us visualize and analyze spatial data. Machine learning techniques can improve the results of a spatial analysis further, for example by predicting where a disease is likely to spread next.


This project aims to investigate two datasets. The first is a COVID-19 dataset that contains time series of the number of confirmed cases per country. The second is an air traffic dataset from The OpenSky Network 2020 that contains all public flight information from 01.01.2019 to 31.05.2020. The student will analyze each dataset individually and then in combination. The student should then look for a connection between them, e.g., by training a neural network to model the spread of the disease.
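
As a starting point for the combined analysis, the two datasets could be joined per country and day before any model is trained. The following is a minimal sketch in Python with pandas, not a prescribed method. The column names (country, date, confirmed, dest_country, day) and the toy values are hypothetical and chosen only for illustration; the real datasets use their own schemas, and mapping flights to a destination country is assumed to have been done beforehand.

```python
import pandas as pd


def combine_cases_and_flights(cases: pd.DataFrame, flights: pd.DataFrame) -> pd.DataFrame:
    """Join daily confirmed cases with daily arriving-flight counts per country.

    Assumed (hypothetical) columns:
      cases:   country, date, confirmed   (cumulative confirmed cases)
      flights: dest_country, day          (one row per flight)
    """
    # Count the flights arriving in each country on each day.
    arrivals = (
        flights.groupby(["dest_country", "day"])
        .size()
        .reset_index(name="arrivals")
        .rename(columns={"dest_country": "country", "day": "date"})
    )
    # Keep every case record; days without recorded flights get 0 arrivals.
    merged = cases.merge(arrivals, on=["country", "date"], how="left")
    merged["arrivals"] = merged["arrivals"].fillna(0).astype(int)
    merged = merged.sort_values(["country", "date"]).reset_index(drop=True)
    # Turn cumulative counts into daily new cases within each country.
    # The first day of each country has no predecessor, so it keeps its own count.
    merged["new_cases"] = (
        merged.groupby("country")["confirmed"].diff().fillna(merged["confirmed"])
    )
    return merged


# Toy data, invented for illustration only.
cases = pd.DataFrame({
    "country": ["A", "A", "A", "B", "B"],
    "date": pd.to_datetime(
        ["2020-03-01", "2020-03-02", "2020-03-03", "2020-03-01", "2020-03-02"]
    ),
    "confirmed": [1, 3, 6, 0, 2],
})
flights = pd.DataFrame({
    "dest_country": ["A", "A", "B"],
    "day": pd.to_datetime(["2020-03-01", "2020-03-01", "2020-03-02"]),
})

merged = combine_cases_and_flights(cases, flights)
print(merged)
```

The resulting table has one row per country and day, with the number of arriving flights next to the number of new cases. It could serve as the input for exploratory plots or, later, as training data for a neural network.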