Archive | Machine Learning

When Men and Women talk to Siri

Just adding a short note to elaborate on a recent Twitter conversation I had about differential error rates in speech products for male and female speakers. TLDR: gender-dependent error rates in speech products exist mainly because of 1) our lack of better models even when the data is balanced, 2) inherent hardness of the […]

Continue Reading 0

Everything is a Model

TLDR: I review a recent systems paper from Google, explain why it is a wake-up call to the industry, and describe the recipe it provides for nonlinear product thinking. Here, I will be enumerating my main takeaways from a recent paper, “The Case for Learned Index Structures” by Tim Kraska, Alex Beutel, Ed Chi, Jeffrey Dean, and […]

Continue Reading 0

Is BackPropagation Necessary?

In the previous post, we saw how the backprop algorithm itself is a bottleneck in training, and how the Synthetic Gradient approach proposed by DeepMind reduces/avoids network locking during training. While very clever, there is something unsettling about the solution. It seems very contrived, and definitely resource intensive. For example, a simple feed-forward network under the […]

Continue Reading 6

Synthetic Gradients .. Cool or Meh?

Synthetic what now? DeepMind recently published a paper on Synthetic Gradients. This post is about that — what they are, and whether it makes sense for your average Deep Joe to use them. A Computational Graph is the best data structure to represent deep networks. (D)NN training and inference algorithms are examples of data flow algorithms, and […]

Continue Reading 12

Gradient Noise Injection Is Not So Strange After All

Yesterday, I wrote about a gradient noise injection result at ICLR 2016, and noted that the authors of the paper, despite detailed experimentation, were very wishy-washy in their explanation of why it works. Fortunately, my Twitter friends, particularly Tim Vieira and Shubhendu Trivedi, grounded this much better than the authors themselves! Shubhendu pointed out Rong Ge (of MSR) […]

Continue Reading 0

Should you get the new NVIDIA DGX-1 for your startup/lab?

NVIDIA announced DGX-1, their new “GPU supercomputer”. The spec is impressive. Performance, even more so (training AlexNet in 2 hours on 1 node). It costs $129K. Running it draws around 3 kW. That’s like keeping an oven going. The cheapest (per hour) comparable config you can currently get from AWS is g2.8xlarge: So for $129K you […]
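The teaser's comparison can be sketched as back-of-envelope arithmetic. Note the AWS hourly rate below is an assumption for illustration (roughly the 2016-era on-demand price for g2.8xlarge), not a figure from the post:

```python
# Back-of-envelope: how many g2.8xlarge hours does the DGX-1's sticker price buy?
DGX1_PRICE_USD = 129_000          # DGX-1 price quoted in the post
AWS_RATE_USD_PER_HOUR = 2.60      # ASSUMED on-demand rate for g2.8xlarge (illustrative)

hours = DGX1_PRICE_USD / AWS_RATE_USD_PER_HOUR
years_continuous = hours / (24 * 365)

print(f"${DGX1_PRICE_USD:,} buys about {hours:,.0f} GPU-instance hours")
print(f"That is roughly {years_continuous:.1f} years of continuous use")
```

Under that assumed rate, the purchase price works out to several years of round-the-clock cloud GPU time, before accounting for the ~3 kW power bill, spot pricing, or hardware depreciation.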

Continue Reading 0

© 2016 Delip Rao. All Rights Reserved.