Alluxio


Alluxio is an open-source virtual distributed file system. It began as the research project "Tachyon" at the University of California, Berkeley's AMPLab, where it was created as Haoyuan Li's Ph.D. thesis, advised by Professors Scott Shenker and Ion Stoica. Alluxio sits between computation and storage in the big data analytics stack. It provides a data abstraction layer for computation frameworks, enabling applications to connect to numerous storage systems through a common interface. The software is published under the Apache License.
Data-driven applications, such as data analytics, machine learning, and AI workloads, use APIs provided by Alluxio to access data from various storage systems at high speed. Popular frameworks running on top of Alluxio include Presto, Apache Spark, Apache Hive, and TensorFlow.
Alluxio can be deployed on-premises, in the cloud, or in a hybrid cloud environment. It can run on bare metal or in containerized environments such as Kubernetes, Docker, and Apache Mesos.

History

Alluxio was started by Haoyuan Li at UC Berkeley's AMPLab in 2013 and open-sourced in 2014. In 2018, Alluxio had more than 1,000 contributors, making it one of the most active projects in the data ecosystem.

Enterprises that use Alluxio

The following is a list of notable enterprises that have used or are using Alluxio: