A hardware–software blueprint for flexible deep learning specialization
Thierry Moreau, Tianqi Chen, Luis Vega, Jared Roesch, Eddie Yan, Lianmin Zheng, Josh Fromm, Ziheng Jiang, Luis Ceze, Carlos Guestrin, Arvind Krishnamurthy
July 2019Abstract
This article describes the Versatile Tensor Accelerator (VTA), a programmable DL architecture designed to be extensible in the face of evolving workloads. VTA achieves “flexible specialization” via a parameterizable architecture, two-level Instruction Set Architecture (ISA), and a Just in Time (JIT) compiler.

Ph.D. Student
His research center around co-designing efficient algorithms and systems for machine learning.