Accelerated Distributed Platform for Spatial Queries

Supervisor: Iman Elghandour
Semester: Fall 2019

It is now common to query terabytes of spatial data. Several new frameworks extend distributed computing platforms such as Hadoop and Spark so that they can process spatial queries efficiently. They do so by providing (1) mechanisms to store and index spatial data efficiently; and (2) packages of built-in spatial operations for these platforms. Meanwhile, it is also now common to speed up Hadoop and Spark using accelerators such as GPUs and FPGAs. The objective of this master's thesis is to build a framework that efficiently executes spatial queries on an extended implementation of Spark that can run its tasks on GPUs.

Deliverables of the master's thesis project
  • An overview of spatial queries and frameworks for processing big spatial data.
  • A study of the best approaches to representing spatial data while it is queried by Spark and GPUs.
  • An implementation of common spatial operations and computational geometry algorithms on GPUs and Spark.
  • An experimental validation of the developed system.
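
To make "common spatial operations" concrete, here is a minimal CPU-only sketch of two such operations: a rectangular range query and a point-in-polygon test. It is an illustration under stated assumptions, not the project's design. The function names (`range_query`, `point_in_polygon`) and the columnar layout (separate `xs` and `ys` arrays) are hypothetical choices made for this example. A columnar layout is one candidate representation that tends to suit GPU memory access, but the thesis itself is meant to study which representation works best.

```python
from typing import List, Sequence, Tuple


def range_query(xs: Sequence[float], ys: Sequence[float],
                min_x: float, min_y: float,
                max_x: float, max_y: float) -> List[int]:
    """Return indices of points inside an axis-aligned rectangle (inclusive).

    Points are stored column-wise (xs[i], ys[i]); this layout is an
    assumption made for the example, not something the project prescribes.
    """
    return [i for i, (x, y) in enumerate(zip(xs, ys))
            if min_x <= x <= max_x and min_y <= y <= max_y]


def point_in_polygon(px: float, py: float,
                     polygon: Sequence[Tuple[float, float]]) -> bool:
    """Ray-casting test: count polygon edges crossed by a ray going right.

    An odd number of crossings means the point is inside. Points lying
    exactly on the boundary are not treated specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line y = py can be crossed.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


if __name__ == "__main__":
    xs = [0.5, 2.0, 1.5, -1.0]
    ys = [0.5, 2.0, 0.2, 0.0]
    print(range_query(xs, ys, 0.0, 0.0, 1.6, 1.0))
    square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    print(point_in_polygon(0.5, 0.5, square))
```

Both operations evaluate each point independently of the others, which is the property that lets such work be split across Spark partitions or GPU threads.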