Low-latency Inference inside a Smart Storage Node

Supervisors: Zsolt István
Semester: Spring 2021
Tags: FPGA, Data Management, Machine Learning

(This project will be carried out in collaboration with Xilinx Research Labs in Dublin)

Machine learning operators are increasingly used in data management systems. In this project, we will explore the challenges and benefits of integrating inference operators from FINN [1] into a so-called Smart Storage system [2]. Both the inference and the data management aspects will be handled by an FPGA, in order to provide a small energy footprint and guarantee high access bandwidth at the same time (see details below). During the project, we will explore questions such as: How should the key-value contents be organized to make low-latency inference possible? What are the challenges of integrating this functionality into a Smart Storage node at the FPGA circuit level?

For inference, we will rely on FINN [1], an experimental framework from Xilinx Research Labs to explore deep neural network inference on FPGAs. It specifically targets quantized neural networks, with emphasis on generating dataflow-style architectures customized for each network.
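To make "quantized" concrete, the following is a minimal software sketch of one quantized layer: inputs and weights are small integers, and the activation is a comparison against a per-neuron threshold, so no floating-point arithmetic is needed after quantization. This is not FINN's API; the function names, the 2-bit width, the scale, and the weights are all hypothetical values chosen for illustration.

```python
def quantize(values, scale, bits=2):
    """Map real-valued inputs to signed integers of the given bit width."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    # Divide by the scale, round to the nearest integer, clip to [lo, hi].
    return [max(lo, min(hi, round(v / scale))) for v in values]


def quantized_dense(x_q, weights_q, thresholds):
    """Integer matrix-vector product followed by a threshold activation.

    Each output is 1 if the integer accumulator reaches the neuron's
    threshold, and 0 otherwise.
    """
    out = []
    for row, threshold in zip(weights_q, thresholds):
        acc = sum(w * x for w, x in zip(row, x_q))  # integer multiply-accumulate
        out.append(1 if acc >= threshold else 0)
    return out


# Hypothetical example values (not taken from any real network).
x_q = quantize([0.9, -0.4, 0.1], scale=0.5)   # 2-bit signed range is -2..1
weights_q = [[1, -1, 1], [-1, 1, 0]]
thresholds = [1, 0]
y = quantized_dense(x_q, weights_q, thresholds)
```

Because every operation is a narrow integer multiply-accumulate or a comparison, a layer of this kind maps naturally onto a pipelined hardware circuit, which is the style of architecture the paragraph above refers to.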

For the Smart Storage Node we will use Caribou [2]. Caribou nodes are built with FPGAs; each node stores key-value pairs in main memory and exposes a simple interface over TCP/IP that software clients can connect to. Caribou is “smart” because it is possible to offload filtering into the storage nodes, which can also perform scans on the data. In the current design, filtering is a combination of regular expression matching and predicate evaluation, but other types of processing can easily be added to the processing pipeline. Caribou is “distributed” because it runs on multiple FPGAs that replicate the data using a leader-based consensus protocol that offers both low latency and high throughput. Caribou provides a “storage service” because it stores key-value pairs in a Cuckoo hash table and implements slab-based storage allocation.
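The two storage-side ideas above, a Cuckoo hash table and a scan with an offloaded filter, can be illustrated with a small software model. This is only a sketch under stated assumptions, not Caribou's implementation or interface: the class name CuckooKV, the two-table layout, the salted SHA-256 hash functions, the table size, and the example records are all hypothetical, and slab allocation and replication are left out.

```python
import hashlib
import re


class CuckooKV:
    """Toy two-table cuckoo hash map with a filtered scan (illustrative only)."""

    def __init__(self, slots=8, max_kicks=16):
        self.slots = slots
        self.max_kicks = max_kicks
        self.tables = [[None] * slots, [None] * slots]

    def _index(self, key, which):
        # Two hash functions, derived by salting one digest with the table id.
        digest = hashlib.sha256(bytes([which]) + key.encode()).digest()
        return int.from_bytes(digest[:4], "big") % self.slots

    def get(self, key):
        # A lookup probes at most two slots, one per table.
        for which in (0, 1):
            entry = self.tables[which][self._index(key, which)]
            if entry is not None and entry[0] == key:
                return entry[1]
        return None

    def put(self, key, value):
        # Overwrite in place if the key is already stored.
        for which in (0, 1):
            idx = self._index(key, which)
            entry = self.tables[which][idx]
            if entry is not None and entry[0] == key:
                self.tables[which][idx] = (key, value)
                return True
        # Otherwise insert, evicting the resident and moving it to its
        # slot in the other table, for a bounded number of steps.
        item, which = (key, value), 0
        for _ in range(self.max_kicks):
            idx = self._index(item[0], which)
            item, self.tables[which][idx] = self.tables[which][idx], item
            if item is None:
                return True
            which = 1 - which
        # Too full: this toy gives up and drops the last displaced entry.
        # A real design would resize or rehash instead.
        return False

    def scan(self, pattern, predicate=lambda value: True):
        """Yield pairs whose value matches the regex and passes the predicate."""
        regex = re.compile(pattern)
        for table in self.tables:
            for entry in table:
                if entry and regex.search(entry[1]) and predicate(entry[1]):
                    yield entry


# Hypothetical records; the filter runs "inside" the store, so only
# matching pairs would need to be sent back to the client.
store = CuckooKV()
store.put("user:1", "name=ada;age=36")
store.put("user:2", "name=bob;age=17")
store.put("user:3", "name=eve;age=52")

adults = sorted(
    key
    for key, _ in store.scan(r"age=\d+", lambda v: int(v.split("age=")[1]) >= 18)
)
```

The bounded number of probes per lookup is what makes this kind of table attractive for a hardware pipeline. An inference operator could, in principle, slot into the same place as the predicate in scan, which is the integration question this project asks.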

Student Profile
Skills needed: VHDL/Verilog coding; debugging FPGA projects, at least in simulation; ideally some Go and Python experience
Skills to be acquired: designing HW/SW systems; working with network-facing FPGA designs; possibly HLS coding

[1] FINN, Xilinx Research Labs
[2] Caribou