# Archive | Data Science

## Synthetic Gradients .. Cool or Meh?

Synthetic what now? DeepMind recently published a paper on Synthetic Gradients. This post is about that — what they are, and whether it makes sense for your average Deep Joe to use them. A Computational Graph is the best data structure to represent deep networks. (D)NN training and inference algorithms are examples of data-flow algorithms, and […]

## The Unreasonable Popularity of TensorFlow

In this post, I will look at how TensorFlow has gained momentum over competing projects. Unless you’re living away from all of this on a beach (or under a rock, if you wish), you already know TensorFlow is a Computational Graph framework, and you hear it tossed around in the context of Deep Learning/Neural Networks. I […]

## Gradient Noise Injection Is Not So Strange After All

Yesterday, I wrote about a gradient noise injection result at ICLR 2016, and noted that the authors of the paper, despite detailed experimentation, were very wishy-washy in their explanation of why it works. Fortunately, my Twitter friends, particularly Tim Vieira and Shubhendu Trivedi, grounded this much better than the authors themselves! Shubhendu pointed out Rong Ge (of MSR) […]

Results in Deep Learning never cease to surprise me. One ICLR 2016 paper from the Google Brain team suggests a simple one-line code change to improve your parameter estimation across the board — by adding Gaussian noise to the computed gradients. Typical SGD updates parameters by taking a step in the direction of the gradient (simplified): […]
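The one-line change described above can be sketched like this — a minimal NumPy version, where `sigma` is a hypothetical fixed noise scale (the paper actually anneals the noise over training steps):

```python
import numpy as np

def sgd_step_with_noise(params, grads, lr=0.01, sigma=0.01, rng=None):
    """One SGD update with Gaussian noise added to the computed gradient.

    `sigma` is a made-up constant here for illustration; the paper decays
    the noise variance as training progresses.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy_grads = grads + rng.normal(0.0, sigma, size=grads.shape)
    return params - lr * noisy_grads

# Toy usage: minimize f(x) = x^2, whose gradient is 2x.
x = np.array([5.0])
for _ in range(500):
    x = sgd_step_with_noise(x, 2 * x, lr=0.05, sigma=0.01)
# x is now close to 0
```

The only difference from plain SGD is the `noisy_grads` line — hence the "one-line change".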

## Should you get the new NVIDIA DGX-1 for your startup/lab?

NVIDIA announced DGX-1, their new “GPU supercomputer”. The spec is impressive. Performance, even more so (training AlexNet in 2 hours with 1 node). It costs $129K. Running it would draw around 3 kW — that’s like keeping an oven going. The cheapest (per hour) best config you can currently get from AWS is g2.8xlarge. So for $129K you […]

## Science, Practice, and Frustratingly Simple Ideas

Yesterday, I wrote (excitedly) about stochastic depth in neural networks. The reactions I saw to that paper ranged from “dang! I should’ve thought of that” to, umm, shall we say, annoyed? This reaction is not surprising at all. The idea was one of those “Frustratingly Simple” ideas that worked. If you read the paper, there […]

## Stochastic Depth Networks will Become the New Normal

.. in deep learning that is. Update: This post apparently made a lot of people mad. Check out my next post after this :-) Every day a half dozen or so new deep learning papers come out on arXiv, but very few catch my eye. Yesterday, I read about “Deep Networks with Stochastic Depth”. I think, like dropout […]
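The idea is dropout at the level of whole residual blocks: during training, randomly skip a block and keep only the identity shortcut. A minimal sketch, assuming a fixed survival probability (the paper actually decays it linearly with depth):

```python
import numpy as np

def residual_forward(x, blocks, survival_prob=0.8, training=True, rng=None):
    """Forward pass through residual blocks with stochastic depth.

    `blocks` is a list of callables f(x). During training, each block is
    dropped entirely with probability 1 - survival_prob, leaving only the
    identity shortcut. At test time every block runs, scaled by its
    survival probability (the expected training-time contribution).
    """
    rng = np.random.default_rng() if rng is None else rng
    for f in blocks:
        if training:
            if rng.random() < survival_prob:
                x = x + f(x)              # block survives: identity + residual
            # else: block is skipped, identity shortcut only
        else:
            x = x + survival_prob * f(x)  # expected contribution at test time
    return x

# Toy usage: two residual blocks that each add half their input.
blocks = [lambda z: 0.5 * z, lambda z: 0.5 * z]
out = residual_forward(np.array([1.0]), blocks, survival_prob=0.5, training=False)
```

The test-time scaling by `survival_prob` mirrors how standard dropout rescales activations at inference.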

## Chances are your Models are Racist, Sexist, or both

At Joostware, I build ML and NLP products. Joostware is an entirely bootstrapped garage product-development studio, and consulting is a big part of the business. I work with clients (mostly early startups) on their next big idea, and help them mature those ideas and bring them to reality. One amazing thing I notice, increasingly, […]

## Word Embedding as a Learning To Rank Problem

I’m writing more about word embeddings. Weird. I know. They are frustratingly useful in product development, and opaque when it comes to understanding the gotchas. So every time I come across a paper that improves my understanding, I get all excited. Like when Levy et al. showed how the competing embedding methods were pretty much […]

## Swivel by Google — a bizarre word embedding paper

A new word embedding paper came out of Google that promises to look at things missed by Word2Vec and GloVe, provide a better understanding, and a better embedding. All word embedding schemes try to map a discrete word to a real vector. So if V is your vocabulary, a word embedding is a function f : V → ℝ^d. Of course, the […]
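Concretely, such a function is usually realized as a row lookup into a |V| × d matrix — a minimal sketch, with a made-up toy vocabulary and dimension:

```python
import numpy as np

# A word embedding is a lookup: word -> row of a |V| x d real matrix.
# The vocabulary and dimension here are toy values for illustration.
vocab = {"cat": 0, "dog": 1, "tensor": 2}
d = 4
E = np.random.default_rng(0).normal(size=(len(vocab), d))  # embedding table

def embed(word):
    """f : V -> R^d, realized as a row lookup into E."""
    return E[vocab[word]]

v = embed("cat")  # a length-4 real vector
```

What the competing schemes (Word2Vec, GloVe, Swivel) differ on is how the entries of `E` get learned, not this basic lookup structure.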