Design Patterns for Production NLP Systems


This post is an excerpt from the final chapter of our upcoming book on Deep Learning and NLP with PyTorch. The book is still a draft under review, so your comments on this section are appreciated!

Differentiable Dynamic Programs and SparseMAP Inference


Two exciting NLP papers at ICML 2018! The ICML 2018 accepted papers are out, and I am excited about two that I will briefly outline here. I think both papers are phenomenally good and will bring structured prediction in NLP back to modern deep learning architectures.

When Men and Women talk to Siri


Different gender error rates for speech products exist mainly because of 1) our lack of better models even when the data is balanced, and 2) the inherent hardness of the problem.

Everything is a Model


I review a recent systems paper from Google, why it is a wake-up call to the industry, and the recipe it provides for nonlinear product thinking.

Two Recent Results in Transfer Learning for Music and Speech


In this post, I want to highlight two recent complementary results on transfer learning applied to audio -- one related to music, another related to speech.

When (not) to use Deep Learning for NLP


We are preparing for the second edition of our PyTorch-based Deep Learning for NLP training. It's a two-day affair, crammed with learning and hands-on model building, where we perform the intricate dance of introducing topics from the ground up while making sure folks are never far from the state of the art. Compared to our first attempt in NYC this year, we are adding new content and revising existing content to explain some basic ideas better. One subtopic I am quite excited to add is a discussion of "When to use Deep Learning for NLP and when not to". This post expands on that.

The Two Tribes of Language Researchers


Sometimes it's useful to put people in boxes to understand where they are coming from and the conversations they like to have. Let's talk about my tribe -- the NLP folks.

A Billion Words and The Limits of Language Modeling


In this post, I will talk about language models, when (and when not) to use LSTMs for language modeling, and some state-of-the-art results.

Is BackPropagation Necessary?


In the previous post, we saw how the backprop algorithm itself is a bottleneck in training, and how the Synthetic Gradient approach proposed by DeepMind reduces/avoids network locking during training. While very clever, there is something unsettling about the solution. It seems very contrived, and definitely resource intensive. For example, a simple feed-forward network under the scheme has a Rube-Goldbergesque feel to it (image courtesy Jaderberg et al., 2016: a fully unlocked feed-forward net using DNI). Every time you see a solution that looks unnatural, you want to go back and ask whether we are solving the right problem, or even asking the right question. Naturally, this raises the question: is backprop the right way to train neural networks? To answer this, let's step back a few paces. All machine learning algorithms are solving one kind of optimization problem or another. A majority of those optimization problems (esp. those involving real-world tasks) are non…
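To make the "unlocking" idea above concrete, here is a minimal NumPy sketch of a synthetic-gradient update -- an all-linear toy of my own, not the paper's DNI implementation. A hidden layer updates immediately using a gradient predicted by a small module `M`, and `M` itself is trained against the true gradient once the rest of the backward pass delivers it. The tiny regression task (learn y = 2x) and all names are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy task: learn y = 2x with a linear "hidden" layer and an output layer.
W1 = rng.normal(size=(4, 1))          # hidden layer weights
W2 = rng.normal(size=(1, 4)) * 0.1    # output layer weights
M = np.zeros((4, 4))                  # synthetic-gradient module: h -> dL/dh

lr = 0.05
for step in range(3000):
    x = rng.uniform(-1.0, 1.0, size=(1, 1))
    y = 2.0 * x

    h = x @ W1.T                      # forward through the hidden layer
    # The hidden layer updates immediately with the *predicted* gradient,
    # without waiting for the rest of the network -- the "unlock".
    g_hat = h @ M.T
    W1 -= lr * g_hat.T @ x

    # The rest of the forward/backward pass; the true gradient is used
    # only to train the synthetic-gradient module M.
    y_hat = h @ W2.T
    err = y_hat - y
    g_true = err @ W2                 # true dL/dh
    W2 -= lr * err.T @ h
    M -= lr * (g_hat - g_true).T @ h

pred = float(np.array([[1.0]]) @ W1.T @ W2.T)
print(pred)
```

If the predictor `M` tracks the true gradient, the network should end up with pred near 2.0 for x = 1 -- while the hidden layer never waited on a full backward pass to update.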

Synthetic Gradients .. Cool or Meh?


The promise of backpropagating with estimated gradients.

Copyright © 2021. Delip Rao