Privacy concerns make real-world data difficult to gather. Some researchers therefore suggest generating a synthetic dataset from a real-world one. In this project, we aim to generate synthetic data using the Breadcrumbs dataset.


This project consists of three steps:

  1. Implementation (or adaptation) of the synthetic data generation code proposed by Kulkarni and Garbinato [3].
  2. Evaluation of the synthetic data generation using the Breadcrumbs dataset.
  3. Improvements to the synthetic data generation code.
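
To give a feel for the workflow in the three steps above, here is a minimal sketch in Python. It is not the method of [3], which uses RNNs: a first-order Markov chain stands in as a much simpler generator, and the location names, the toy sequences, and all function names (`fit_markov`, `generate`, `visit_distribution`, `total_variation`) are hypothetical and unrelated to the actual Breadcrumbs data format. The sketch fits a generator on "real" sequences, samples synthetic ones, and compares the two with one simple metric.

```python
import random
from collections import Counter, defaultdict

def fit_markov(sequences):
    """Count transitions between consecutive location labels."""
    transitions = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            transitions[current][nxt] += 1
    return transitions

def generate(transitions, start, length, rng):
    """Sample one synthetic sequence from the fitted transition counts."""
    seq = [start]
    while len(seq) < length:
        options = transitions.get(seq[-1])
        if not options:
            break  # no outgoing transition was observed for this location
        locations, counts = zip(*options.items())
        seq.append(rng.choices(locations, weights=counts, k=1)[0])
    return seq

def visit_distribution(sequences):
    """Fraction of all visits that fall on each location."""
    counts = Counter(loc for seq in sequences for loc in seq)
    total = sum(counts.values())
    return {loc: c / total for loc, c in counts.items()}

def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

# Toy, made-up sequences (assumption: NOT the Breadcrumbs format).
real = [
    ["home", "work", "cafe", "home"],
    ["home", "work", "home"],
    ["home", "cafe", "work", "home"],
]

model = fit_markov(real)                                           # step 1: generator
rng = random.Random(0)                                             # fixed seed for repeatability
synthetic = [generate(model, "home", 4, rng) for _ in range(100)]

distance = total_variation(visit_distribution(real),
                           visit_distribution(synthetic))          # step 2: evaluation
print(f"Total variation distance between visit distributions: {distance:.3f}")
```

A distance near 0 means the synthetic visits are distributed like the real ones. Step 3 would then amount to changing the generator and checking whether this metric, along with any privacy-oriented metrics, improves.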

This subject is offered ONLY as a semester project, and it is a guided project.

To learn more about the Breadcrumbs dataset, we suggest reading the Breadcrumbs paper [1] and the Breadcrumbs Dataset Description [2].


Students should be comfortable with algorithms and machine learning. The preferred programming languages are Java and Python.


[1] Vaibhav Kulkarni, Arielle Moro, Bertil Chapuis, and Benoît Garbinato. 2017. Extracting Hotspots without A-priori by Enabling Signal Processing over Geospatial Data. In Proceedings of the 25th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems (SIGSPATIAL ’17). Association for Computing Machinery, New York, NY, USA, Article 79, 1–4. DOI:

[2] Breadcrumbs Dataset Description,

[3] Vaibhav Kulkarni and Benoît Garbinato. 2017. Generating synthetic mobility traffic using RNNs. In Proceedings of the 1st Workshop on Artificial Intelligence and Deep Learning for Geographic Knowledge Discovery (GeoAI ’17). Association for Computing Machinery, New York, NY, USA, 1–4.