Abuse ΘϜ Notation

Optimal Bike-pooling for a Common Destination

You are an intern at a prestigious AI lab. Assume there are researchers in this lab. The lab is located at the origin of a metric space. The researchers live at various points in this space. Everyday, these researchers travel to the lab from their respective homes and back. Each researcher owns an eco-friendly motor bike which can carry at most 2 passengers. They decide to bike-pool to work such that the total distance traveled is minimized. They entrust you to come up with an algorithm which can find the optimal sharing scheme.

Read More

Posterior Sampling for RL

This post is about a simple yet very efficient Bayesian algorithm for minimizing cumulative regret in Episodic Fixed Horizon Markov Decision Processes(EFH-MDP). First introduced as a heuristic by Strens under the name of “Bayesian Dynamic Programming”, it was later shown by Osband et.al that it is in fact an efficient RL algorithm. They also gave it a more informative name - Posterior Sampling for Reinforcement Learning (PSRL).

Read More

Markov Decision Processes

This blog post is about Episodic Fixed Horizon Markov Decision Processes (EFH-MDP). MDPs are a very generic framework and can be used to model complex systems in manufacturing, control and robotics to state a few. These are also used to model environments in Reinforcement Learning (RL).

Read More

Multi Armed Bandits and Exploration Strategies

This blog post is about the Multi Armed Bandit(MAB) problem and about the Exploration-Exploitation dilemma faced in reinforcement learning. MABs find applications in areas such as advertising, drug trials, website optimization, packet routing and resource allocation.

Read More

A Derivation of Backpropagation in Matrix Form

Backpropagation is an algorithm used to train neural networks, used along with an optimization routine such as gradient descent. Gradient descent requires access to the gradient of the loss function with respect to all the weights in the network to perform a weight update, in order to minimize the loss function. Backpropagation computes these gradients in a systematic way. Backpropagation along with Gradient descent is arguably the single most important algorithm for training Deep Neural Networks and could be said to be the driving force behind the recent emergence of Deep Learning.

Read More

Formulating Linear Programs

Linear Programming(LP) is a widely used technique in algorithm design and analysis, especially for getting good approximations for NP hard problems. The first step in LP would be to formulate the the problem as an optimization problem with linear constraints where the quantity we are trying to optimize is also linear. It is a good excersize to formulate simple problems as linear programs. Linear programming problems typically expressed in their canonical form look like this:

Read More

Ratio of Heads and Tails

This is my first blog post. In this blog, I would be writing about Algorithms, Machine Learning, Optimization and Mathematics. I wanted to begin with something simple, so this post would be about a problem from Probability.

Read More