
Exploring the Memorization-Generalization Continuum in Deep Learning

Human learners appreciate that some facts demand memorization whereas other facts support generalization. For example, English verbs have irregular cases that must be memorized (e.g., go-went) and regular cases that generalize well (e.g., …

Just-in-Time Dynamic Batching

Batching is an essential technique for improving computational efficiency in deep learning frameworks. While batch processing for models with static feed-forward computation graphs is straightforward to implement, batching for dynamic computation graphs …
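As a rough illustration of the underlying idea: in a dynamic computation graph, pending nodes that share an operator type and input shape can be collected at runtime and executed as a single vectorized call. The NumPy sketch below shows that grouping step only; PendingOp, batch_key, and run_batched are invented names for illustration, not the paper's API.

    # A minimal sketch of just-in-time dynamic batching over a toy set of
    # pending graph nodes; all names here are illustrative.
    from collections import defaultdict
    import numpy as np

    class PendingOp:
        """One node of a dynamic computation graph awaiting execution."""
        def __init__(self, kind, x):
            self.kind = kind      # operator type, e.g. "relu"
            self.x = x            # input tensor for this node
            self.out = None       # filled in after batched execution

    def batch_key(op):
        # Ops are batchable together when they share an operator type
        # and an input shape.
        return (op.kind, op.x.shape)

    def run_batched(ops):
        """Group same-kind, same-shape ops; execute each group as one call."""
        groups = defaultdict(list)
        for op in ops:
            groups[batch_key(op)].append(op)
        for (kind, _), group in groups.items():
            stacked = np.stack([op.x for op in group])   # one batched tensor
            if kind == "relu":
                result = np.maximum(stacked, 0.0)        # single vectorized call
            else:
                raise NotImplementedError(kind)
            for op, out in zip(group, result):           # scatter results back
                op.out = out

    ops = [PendingOp("relu", np.random.randn(4)) for _ in range(8)]
    run_batched(ops)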

TVM: An Automated End-to-End Optimizing Compiler for Deep Learning

There is an increasing need to bring machine learning to a wide diversity of hardware devices. Current frameworks rely on vendor-specific operator libraries and optimize for a narrow range of server-class GPUs. Deploying workloads to new platforms – …
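For context, TVM exposes a tensor-expression API in which an operator is declared as a computation, given a schedule, and compiled for a target backend. The sketch below follows the classic vector-add example from TVM's tutorials; note that the te.create_schedule path belongs to older TVM releases, and the exact API surface (e.g., .numpy() vs. .asnumpy() on arrays) has shifted across versions.

    # Declare, schedule, and compile a vector add with TVM's te API
    # (older-style API; details vary by TVM version).
    import numpy as np
    import tvm
    from tvm import te

    n = 1024
    A = te.placeholder((n,), name="A")
    B = te.placeholder((n,), name="B")
    C = te.compute((n,), lambda i: A[i] + B[i], name="C")  # the computation

    s = te.create_schedule(C.op)        # default schedule; TVM can transform it
    fadd = tvm.build(s, [A, B, C], target="llvm", name="vector_add")

    dev = tvm.cpu(0)
    a = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
    b = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
    c = tvm.nd.array(np.zeros(n, dtype="float32"), dev)
    fadd(a, b, c)                       # run the compiled kernel
    np.testing.assert_allclose(c.numpy(), a.numpy() + b.numpy())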

Learning to Optimize Tensor Programs

We introduce a learning-based framework to optimize tensor programs for deep learning workloads. Efficient implementations of tensor operators, such as matrix multiplication and high-dimensional convolution, are key enablers of effective deep learning …
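The general shape of such a learning-based search can be sketched without the paper's machinery: sample a few candidate schedules from a knob space, measure them, fit a statistical cost model, and let the model rank the untried candidates. Everything below (knob_space, measure, the synthetic cost surface) is illustrative, not the paper's actual system.

    # A self-contained sketch of cost-model-guided schedule search.
    import itertools
    import random
    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    # Search space: tiling factors for a hypothetical matmul schedule.
    knob_space = list(itertools.product([1, 2, 4, 8, 16],    # tile_x
                                        [1, 2, 4, 8, 16],    # tile_y
                                        [0, 1]))             # vectorize?

    def measure(knobs):
        """Stand-in for compiling and timing a schedule on real hardware."""
        tx, ty, vec = knobs
        # Synthetic cost surface: moderate tiles + vectorization run fastest.
        return abs(tx - 8) + abs(ty - 4) + (0 if vec else 3) + random.random()

    # 1. Measure a small random sample of the space.
    sampled = random.sample(knob_space, 20)
    costs = [measure(k) for k in sampled]

    # 2. Fit a cost model mapping knobs -> measured runtime.
    model = GradientBoostingRegressor().fit(np.array(sampled), costs)

    # 3. Use the model to rank all remaining candidates cheaply.
    remaining = [k for k in knob_space if k not in sampled]
    pred = model.predict(np.array(remaining))
    best = remaining[int(np.argmin(pred))]
    print("model-predicted best knobs:", best)

In a real system the measured candidates would be compiled programs timed on the target device, and the model's top predictions would themselves be measured and fed back into training.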

Efficient Deep Learning Inference on Edge Devices

Deploying deep learning (DL) models on edge devices is becoming increasingly popular. However, the huge diversity of edge devices, each with its own computation and memory constraints, makes efficient deployment challenging. In this paper, we propose a two-stage …