Systems

Benchmarking Druid and Presto

In recent years, the proliferation of internet technology has created a surge in machine-generated events. These events generally have three parts: a timestamp, dimensions, and metrics. For example, advertising impression data has dimensions such as publisher, gender, and country, and metrics such as clicks and price. Individually, these events carry little useful information and are of low value. In the past, companies were willing to discard this data because of the time and resources required to extract any meaning from it.
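As a rough illustration of the timestamp/dimensions/metrics shape, the sketch below models a few impression events and sums one metric per dimension value. The `Event` class, the `rollup` helper, and all field values are hypothetical and chosen only for the example; they are not taken from the benchmark itself.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Event:
    """One machine-generated event (hypothetical layout for illustration)."""
    timestamp: datetime
    dimensions: dict  # categorical attributes, e.g. publisher, country
    metrics: dict     # numeric measurements, e.g. clicks, price


def rollup(events, dimension, metric):
    """Sum a metric over all events, grouped by one dimension's value."""
    totals = defaultdict(float)
    for event in events:
        totals[event.dimensions[dimension]] += event.metrics[metric]
    return dict(totals)


# Made-up sample data: a single event says little, the aggregate says more.
events = [
    Event(datetime(2020, 1, 1, 0, 0), {"publisher": "example.com", "country": "US"}, {"clicks": 1, "price": 0.25}),
    Event(datetime(2020, 1, 1, 0, 1), {"publisher": "example.com", "country": "IN"}, {"clicks": 2, "price": 0.40}),
    Event(datetime(2020, 1, 1, 0, 2), {"publisher": "news.example", "country": "US"}, {"clicks": 1, "price": 0.10}),
]

clicks_by_publisher = rollup(events, "publisher", "clicks")
```

Grouping by a dimension and aggregating a metric is the kind of query that makes such low-value individual events useful in bulk.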

SurfStore

Getting people to agree on something is hard. Getting machines connected through an asynchronous network to agree on something is harder. Raft is an easy-to-understand consensus protocol that uses leader election and log replication to achieve consensus. This project built a cloud-based file storage system that can survive server, datacenter, and network failures. It implements leader election and log replication from the Raft consensus protocol to maintain a consistent state across servers.
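The leader-election half of Raft can be sketched in a few lines. The code below is a minimal, in-memory illustration of the vote-granting rule and the majority check, not the project's implementation: the `Server` class, `run_election`, and the direct method calls standing in for network RPCs are all assumptions made for the example, and timeouts, heartbeats, and log replication are left out.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Server:
    """Hypothetical in-memory stand-in for one Raft server."""
    server_id: int
    current_term: int = 0
    voted_for: Optional[int] = None
    log: List[Tuple[int, str]] = field(default_factory=list)  # (term, command)

    def last_log(self) -> Tuple[int, int]:
        """Return (term of last entry, index of last entry); (0, 0) if empty."""
        if not self.log:
            return (0, 0)
        return (self.log[-1][0], len(self.log))

    def handle_request_vote(self, term: int, candidate_id: int,
                            last_log_term: int, last_log_index: int) -> bool:
        # Reject candidates from an older term.
        if term < self.current_term:
            return False
        # A newer term resets this server's vote.
        if term > self.current_term:
            self.current_term = term
            self.voted_for = None
        # Grant the vote only if the candidate's log is at least as
        # up to date: compare last term first, then last index.
        up_to_date = (last_log_term, last_log_index) >= self.last_log()
        if self.voted_for in (None, candidate_id) and up_to_date:
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate: Server, peers: List[Server]) -> bool:
    """Candidate starts a new term, votes for itself, and asks peers.

    Returns True if it collected votes from a strict majority.
    """
    candidate.current_term += 1
    candidate.voted_for = candidate.server_id
    votes = 1
    last_term, last_index = candidate.last_log()
    for peer in peers:
        if peer.handle_request_vote(candidate.current_term,
                                    candidate.server_id,
                                    last_term, last_index):
            votes += 1
    return votes > (len(peers) + 1) // 2
```

The up-to-date check is what keeps a server with a stale log from becoming leader and overwriting entries that a majority has already stored.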

Comparing Cloud Models

A comparative study of virtual machines, containers, and serverless computing.

Load Value Prediction

Predicting the values loaded by machine instructions to aid instruction-level parallelism.
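To give a feel for the idea, here is a sketch of a last-value predictor, one of the simplest textbook schemes: it guesses that a load instruction will return the same value it returned last time. This is an illustrative technique chosen for the example, not necessarily the one used in the project, and the class name, the `accuracy` helper, and the trace are all made up.

```python
class LastValuePredictor:
    """Predicts that a load returns the value it produced last time."""

    def __init__(self):
        # Maps a load instruction's address (PC) to its last loaded value.
        self.table = {}

    def predict(self, pc):
        """Return the predicted value, or None if this PC is unseen."""
        return self.table.get(pc)

    def update(self, pc, value):
        """Record the value the load actually produced."""
        self.table[pc] = value


def accuracy(trace):
    """Fraction of loads in a (pc, value) trace that were predicted correctly."""
    predictor = LastValuePredictor()
    correct = 0
    for pc, value in trace:
        if predictor.predict(pc) == value:
            correct += 1
        predictor.update(pc, value)
    return correct / len(trace)


# Hypothetical trace: one load that always returns 7, one that changes.
trace = [(0x400, 7), (0x400, 7), (0x404, 1), (0x400, 7), (0x404, 2)]
hit_rate = accuracy(trace)
```

A correct prediction lets instructions that depend on the load start executing before the memory access completes, which is where the parallelism gain comes from.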

Malware Detection using Machine Learning

Predicting whether an executable is malicious or benign.