Online Portfolio Selection - Cover's Universal Portfolio

In my last blog, I mentioned that a desirable feature of an online portfolio selection algorithm is minimax regret optimality, i.e., we want an algorithm $\text{Alg}^\star$ satisfying:

$$\text{Alg}^\star = \operatorname*{argmin}_{\text{Alg}} \; \max_{[r]_1^T} \; R_T(\text{Alg}, [r]_1^T)$$

$\text{Alg}^\star$ would be the best possible algorithm against any sequence of returns $r_1, \dots, r_T$. Cover’s Universal Portfolio algorithm is precisely one such candidate for $\text{Alg}^\star$. It was introduced by Thomas M. Cover in a pioneering paper called Universal Portfolios, which predates the theory of online learning and regret analysis, so the paper itself does not use the term regret.

Cover’s Algorithm

Cover’s idea is to assign a non-negative weight $w_{t-1}(x)$ to every point $x \in \Delta_n$, based on the returns $r_1, \dots, r_{t-1}$, and then pick the point $x_t$ as a weighted average of the points in $\Delta_n$. Since $\Delta_n$ is a convex set, this weighted average again lies in $\Delta_n$ and is given by:

$$x_t = \frac{\int_{x \in \Delta_n} x \, w_{t-1}(x) \, dx}{\int_{x \in \Delta_n} w_{t-1}(x) \, dx}$$

Intuitively, if a point $x$ produces high returns, it should be weighted more heavily. Cover’s choice of weight function is the simplest possible one: the wealth that portfolio $x$ would have accumulated so far. He picks:

$$w_{t-1}(x) = \prod_{i=1}^{t-1} (r_i \cdot x)$$

So, Cover states his algorithm as follows:

$$x_t = \frac{\int_{x \in \Delta_n} x \prod_{i=1}^{t-1} (r_i \cdot x) \, dx}{\int_{x \in \Delta_n} \prod_{i=1}^{t-1} (r_i \cdot x) \, dx}$$

Another way to state the same algorithm is by considering the density $p_{t-1}(x) \propto \prod_{i=1}^{t-1} (r_i \cdot x)$. In closed form, this is:

$$p_{t-1}(x) = \frac{\prod_{i=1}^{t-1} (r_i \cdot x)}{\int_{x \in \Delta_n} \prod_{i=1}^{t-1} (r_i \cdot x) \, dx} = \frac{w_{t-1}(x)}{\int_{x \in \Delta_n} w_{t-1}(x) \, dx}$$

Then $x_t$ is simply the expectation:

$$x_t = \int_{x \in \Delta_n} x \, p_{t-1}(x) \, dx = \mathbb{E}_{p_{t-1}}[X]$$
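To see what this expectation looks like numerically, here is a minimal sketch that approximates $x_t = \mathbb{E}_{p_{t-1}}[X]$ by drawing uniform samples from the simplex and averaging them with weights $w_{t-1}(x)$. The function name and the toy returns are just for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def cover_portfolio_mc(past_returns, n_samples=100_000):
    """Approximate x_t = E_{p_{t-1}}[X] by self-normalized importance sampling:
    draw uniform points on the simplex and weight them by w_{t-1}(x) = prod_i (r_i . x)."""
    n = past_returns.shape[1]
    # Dirichlet(1, ..., 1) is the uniform distribution on the simplex Delta_n
    samples = rng.dirichlet(np.ones(n), size=n_samples)   # shape (n_samples, n)
    weights = np.prod(samples @ past_returns.T, axis=1)   # w_{t-1}(x) for each sample
    return weights @ samples / weights.sum()              # weighted average of the samples

# toy example: 3 assets, 4 past rounds of gross returns r_1, ..., r_4
past = np.array([[1.01, 0.99, 1.00],
                 [0.98, 1.03, 1.00],
                 [1.02, 0.97, 1.00],
                 [1.00, 1.01, 0.99]])
print(cover_portfolio_mc(past))   # the next portfolio, a point on the simplex summing to 1
```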

Interpretation as Continuous Exponential Weights

The Hedge algorithm due to Freund and Schapire, also known as Exponential Weights, is one of the most influential algorithms in Online Learning. It has applications in Machine Learning, Boosting, Bandits, Game Playing, solving Linear Programs, Approximation Algorithms, and many more. See the survey by Arora, Hazan, and Kale for more details.

The weights for Cover’s algorithm can be recast as exponential weights using a simple trick:

$$w_{t-1}(x) = \prod_{i=1}^{t-1} (r_i \cdot x) = \exp\left(\sum_{i=1}^{t-1} \log(r_i \cdot x)\right) = \exp\left(-\sum_{i=1}^{t-1} f_i(x)\right)$$

For convenience, we define the convex loss function $f_i(x) = -\log(r_i \cdot x)$. The density $p_{t-1}(x)$ is simply the exponential weights density:

$$p_{t-1}(x) = \frac{\exp\left(-\sum_{i=1}^{t-1} f_i(x)\right)}{\int_{x \in \Delta_n} \exp\left(-\sum_{i=1}^{t-1} f_i(x)\right) dx}$$

Since Exponential Weights is a form of Online Mirror Descent / Follow the Regularized Leader with the negative entropy regularizer, the density $p_{t-1}$ can be expressed as the solution to the optimization problem:

$$p_{t-1} = \operatorname*{argmax}_{p \in \mu(\Delta_n)} \; H(p) - \sum_{i=1}^{t-1} \langle f_i, p \rangle$$

Here, $\mu(\Delta_n)$ denotes the set of probability measures on $\Delta_n$, and $H(p)$ is the differential entropy of $p$:

$$H(p) = \mathbb{E}_p[-\log p(X)] = -\int_{x \in \Delta_n} p(x) \log p(x) \, dx$$

$\langle f_i, p \rangle$ is shorthand for:

$$\langle f_i, p \rangle = \mathbb{E}_p[f_i(X)] = \int_{x \in \Delta_n} f_i(x) \, p(x) \, dx$$

Rewriting it in a more familiar minimization form, we have:

$$p_{t-1} = \operatorname*{argmin}_{p \in \mu(\Delta_n)} \; \sum_{i=1}^{t-1} \langle f_i, p \rangle - H(p) = \operatorname*{argmin}_{p \in \mu(\Delta_n)} \; \mathbb{E}_p\left[\sum_{i=1}^{t-1} f_i(X) + \log p(X)\right]$$
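To see why this optimization recovers the exponential weights density, write $Z = \int_{x \in \Delta_n} \exp\left(-\sum_{i=1}^{t-1} f_i(x)\right) dx$. Then:

$$\sum_{i=1}^{t-1} \langle f_i, p \rangle - H(p) = \mathbb{E}_p\left[\log p(X) + \sum_{i=1}^{t-1} f_i(X)\right] = \mathbb{E}_p\left[\log \frac{p(X)}{\exp\left(-\sum_{i=1}^{t-1} f_i(X)\right)/Z}\right] - \log Z = \mathrm{KL}\left(p \,\|\, p_{t-1}\right) - \log Z$$

Since $\log Z$ does not depend on $p$ and the KL divergence is minimized (at zero) when the two densities coincide, the minimizer is exactly $p_{t-1}$.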

Interpretation as a Bayesian Update

Define the likelihood $L(r \mid x) \propto (r \cdot x)$ and let $p_0(x)$ be the uniform distribution over $\Delta_n$. Then the density $p_1(x)$ can be interpreted as the Bayesian update:

$$p_1(x) = p(x \mid r_1) \propto L(r_1 \mid x) \, p_0(x) \propto (r_1 \cdot x)$$

Similarly, $p_2(x)$ is:

$$p_2(x) = p_1(x \mid r_2) \propto L(r_2 \mid x) \, p_1(x) \propto (r_1 \cdot x)(r_2 \cdot x)$$

Continuing in this way, we have:

$$p_t(x) = p_{t-1}(x \mid r_t) \propto L(r_t \mid x) \, p_{t-1}(x) \propto \prod_{i=1}^{t} (r_i \cdot x)$$
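Here is a tiny illustration of this sequential update for two stocks: the posterior over the fraction $x$ held in stock 1 is kept on a grid and multiplied by the likelihood $(r_t \cdot x)$ each day. The grid size and toy returns are arbitrary.

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 1001)     # candidate fractions x of wealth in stock 1
posterior = np.ones_like(grid)         # p_0: uniform prior over [0, 1]
posterior /= posterior.sum()

daily_returns = [(1.02, 0.99), (0.97, 1.01), (1.03, 1.00)]   # (r_t[1], r_t[2]) pairs

for r1, r2 in daily_returns:
    x_t = np.dot(grid, posterior)          # today's portfolio: E_{p_{t-1}}[X]
    likelihood = r2 + grid * (r1 - r2)     # L(r_t | x) is proportional to r_t . (x, 1 - x)
    posterior *= likelihood                # Bayes update: p_t is proportional to L * p_{t-1}
    posterior /= posterior.sum()
    print(f"fraction in stock 1: {x_t:.4f}")
```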

Running Cover’s Algorithm

Computing $x_t$ exactly involves evaluating two $n$-dimensional integrals, $\int_{x \in \Delta_n} x \prod_{i=1}^{t-1} (r_i \cdot x) \, dx$ and $\int_{x \in \Delta_n} \prod_{i=1}^{t-1} (r_i \cdot x) \, dx$, and taking their ratio.

Case of two stocks

Consider the case of just two stocks. If we invest a proportion $x$ of our wealth in stock 1, then stock 2 naturally gets the remaining $1 - x$. So, in this simple case, the return of portfolio $x$ on the return vector $(r[1], r[2])$ is given by:

$$r[1] \, x + r[2] \, (1 - x) = r[2] + x \, (r[1] - r[2])$$

So, we can compute the following:

$$x_t = \frac{\int_0^1 x \prod_{i=1}^{t-1} \left(r_i[2] + x \, (r_i[1] - r_i[2])\right) dx}{\int_0^1 \prod_{i=1}^{t-1} \left(r_i[2] + x \, (r_i[1] - r_i[2])\right) dx}$$

Cover’s portfolio vector is $(x_t, 1 - x_t)$. The integrands in the numerator and denominator are polynomials of degree $t$ and $t-1$, respectively, so both integrals can be computed exactly using an off-the-shelf polynomial library (like numpy.polynomial). In the experiments section, I show how to implement this.
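As a preview, here is a minimal sketch of that computation with numpy.polynomial: at each step we keep the wealth polynomial $\prod_{i < t} \left(r_i[2] + x \, (r_i[1] - r_i[2])\right)$ and take the ratio of two integrals over $[0, 1]$. The function names and toy returns are my own.

```python
import numpy as np
from numpy.polynomial import Polynomial as P

def integral_01(poly):
    """Integrate a numpy Polynomial over [0, 1]."""
    antideriv = poly.integ()
    return antideriv(1.0) - antideriv(0.0)

def cover_two_stocks(returns):
    """Fraction of wealth placed in stock 1 at each step by Cover's algorithm.

    returns: array of shape (T, 2); returns[i] = (r_i[1], r_i[2]) are gross returns."""
    x = P([0.0, 1.0])          # the polynomial "x"
    wealth = P([1.0])          # prod_{i < t} (r_i[2] + x * (r_i[1] - r_i[2]))
    fractions = []
    for r1, r2 in returns:
        fractions.append(integral_01(x * wealth) / integral_01(wealth))
        wealth = wealth * P([r2, r1 - r2])   # fold in today's return for the next round
    return np.array(fractions)

# toy example: 5 days of gross returns for two stocks
rets = np.array([[1.02, 0.99], [0.97, 1.01], [1.03, 1.00], [0.98, 1.02], [1.01, 0.99]])
print(cover_two_stocks(rets))   # first entry is 0.5: the uniform average over [0, 1]
```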

General case

In higher dimensions, computing $x_t$ exactly would be prohibitively expensive. The only tractable route is to use Markov Chain Monte Carlo techniques to sample from $p_{t-1}(x)$ and then approximate the expectation $\mathbb{E}_{p_{t-1}}[X] = x_t$ to compute the portfolio.

We previously noted that Cover’s density has a particular form:

$$p_{t-1}(x) \propto \exp\left(-\sum_{i=1}^{t-1} f_i(x)\right)$$

where $f_i(x) = -\log(r_i \cdot x)$ is convex. Such densities occur quite often, and sampling from them efficiently is a problem of great interest called Log-Concave Sampling. The first polynomial-time algorithm for Cover’s universal portfolio was given by Kalai and Vempala using this technique; their runtime is $O(T^{14} n^4)$. Since then, there have been several improvements in log-concave sampling. I’m especially excited about a particular kind of MCMC called Mirror Langevin Monte Carlo, which is suitable for constrained sampling. As this is an active area of research, I do not know how well these methods work, but I plan to implement them in the future.
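To give a flavor of the sampling route, here is a toy sketch that estimates $x_t = \mathbb{E}_{p_{t-1}}[X]$ with a plain independence Metropolis sampler using uniform Dirichlet proposals. This is neither the Kalai-Vempala algorithm nor Mirror Langevin; it is just the simplest MCMC one can write for this density, and all names and parameters are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def potential(x, past_returns):
    """sum_i f_i(x) = -sum_i log(r_i . x), i.e. -log p_{t-1}(x) up to a constant."""
    return -np.sum(np.log(past_returns @ x))

def cover_portfolio_mcmc(past_returns, n_steps=50_000, burn_in=5_000):
    """Estimate x_t = E_{p_{t-1}}[X] via independence Metropolis on the simplex."""
    n = past_returns.shape[1]
    x = np.ones(n) / n
    energy = potential(x, past_returns)
    total, kept = np.zeros(n), 0
    for step in range(n_steps):
        proposal = rng.dirichlet(np.ones(n))          # uniform proposal on the simplex
        prop_energy = potential(proposal, past_returns)
        # with a uniform proposal, the acceptance ratio is p(proposal) / p(current)
        if np.log(rng.random()) < energy - prop_energy:
            x, energy = proposal, prop_energy
        if step >= burn_in:
            total += x
            kept += 1
    return total / kept

past = np.array([[1.01, 0.99, 1.00],
                 [0.98, 1.03, 1.00],
                 [1.02, 0.97, 1.00]])
print(cover_portfolio_mcmc(past))
```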

Regret

Cover showed that the regret of his algorithm on any sequence of returns is bounded by:

$$\sup_{[r]_1^T} R_T(\text{Cover}, [r]_1^T) \le (n-1) \log(T+1)$$

Moreover, we have a matching lower bound, showing that Cover’s algorithm is minimax regret optimal. The worst-case regret for any algorithm has to be at least:

$$\Omega(n \log T)$$
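To unpack what the upper bound buys us, recall that $R_T$ measures the gap in log-wealth to the best constant rebalanced portfolio chosen in hindsight. Exponentiating the bound gives:

$$\frac{\text{Wealth}_T(\text{Cover})}{\text{Wealth}_T(\text{BCRP})} \ge \frac{1}{(T+1)^{n-1}}$$

For example, with $n = 2$ stocks over $T = 252$ trading days, Cover’s wealth is guaranteed to be at least $1/253$ of the wealth of the best constant rebalanced portfolio, no matter what the return sequence is. Since the best single stock is itself a (degenerate) constant rebalanced portfolio, the same guarantee holds against it.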

Experiments

I implemented the two-stock version of Cover’s algorithm in the following gist. I used the numpy.polynomial package for manipulating polynomials and evaluating their integrals.

We use a different benchmark here: the Uniform Constant Rebalanced Portfolio (UCRP). This strategy rebalances daily to the equal-weight portfolio vector $(0.5, 0.5)$.

There are several experiments, on both synthetic and real stock data. In all cases, the y-axis is log-wealth.

Synthetic data

  • Two different stocks: I use log-normal returns with means (0.005,0.0) and variance 0.1.
  • Stock and Cash: I use log-normal returns with mean 0.005 and variance 0.1. Cash has a constant return of 1.
  • Stock and its inverse: I use log-normal returns with mean $0.0$ and variance $0.1$. If the stock has a percentage change $x$, then the inverse has a percentage change $-x$. The relation between $x$ and the stock’s return $r$ is $x = (r - 1) \times 100$. Let $r'$ be the return of the inverse, so $r'$ satisfies $-x = (r' - 1) \times 100$. Substituting, we see that $r' = 2 - r$. In this case, it is easy to see that UCRP has a daily return of exactly 1 (see the short sketch after this list).
  • Correlated Stocks: I use log-normal returns with mean $0$ and correlation coefficient $\rho$.
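Here is the short sanity check promised above for the inverse-stock construction. This snippet is my own, with arbitrary parameters, and reads "mean" and "variance" as those of the underlying normal log-returns.

```python
import numpy as np

rng = np.random.default_rng(0)

# assuming "mean 0.0 and variance 0.1" refer to the underlying normal log-returns
T = 250
r = np.exp(rng.normal(loc=0.0, scale=np.sqrt(0.1), size=T))   # gross returns of the stock
x = (r - 1) * 100                                             # percentage change of the stock
r_inv = 1 + (-x) / 100                                        # the inverse moves by -x percent
print(np.allclose(r_inv, 2 - r))                              # True: r' = 2 - r

ucrp_daily = 0.5 * r + 0.5 * r_inv                            # UCRP's daily gross return
print(np.allclose(ucrp_daily, 1.0))                           # True: UCRP's wealth stays at 1
```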

Real data

As a form of risk control, I assume the portfolio is closed at the end of each day, and the daily return is close/open, so nothing is held overnight.
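For concreteness, the close/open convention amounts to something like the following; the DataFrame and its column names are hypothetical, not the actual data pipeline from the gist.

```python
import pandas as pd

# hypothetical OHLC data; real prices would come from whatever data source you use
prices = pd.DataFrame({
    "AMC_open":  [9.80, 10.10, 9.95],
    "AMC_close": [10.05, 9.90, 10.20],
    "GME_open":  [24.50, 25.10, 24.80],
    "GME_close": [25.00, 24.70, 25.30],
})

# gross daily return under the close/open convention: enter at the open, exit at the close
returns = pd.DataFrame({
    "AMC": prices["AMC_close"] / prices["AMC_open"],
    "GME": prices["GME_close"] / prices["GME_open"],
})
print(returns)
```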

  • Two different stocks: I consider meme stocks AMC and GME.
  • Stock and Cash: I consider Bitcoin and cash.
  • Stock and its inverse: I consider TQQQ and SQQQ. Due to the way leveraged ETFs work, even UCRP loses money here.
(Embedded gist: the two-stock implementation of Cover’s algorithm.)

References

Written on November 27, 2022