Online Portfolio Selection - Soft Bayes

I presented a few online portfolio selection algorithms in my previous blog posts (1, 2, 3, 4, 5). Here is a quick summary of the algorithms discussed so far:

Cover’s Universal Portfolio Algorithm

$$x_t = \frac{\int_{x \in \Delta_n} x \prod_{s=1}^{t-1} (r_s^\top x)\, dx}{\int_{x \in \Delta_n} \prod_{s=1}^{t-1} (r_s^\top x)\, dx}$$

For any sequence of returns $r_1,\ldots,r_T \in (0,1]^n$, the regret of Cover's algorithm is $O(n\log T)$ and the per-iteration run time is $O(n^4 T^{14})$.
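
The integrals above can be approximated by plain Monte Carlo: sample portfolios uniformly from the simplex and weight each sample by the wealth it has accumulated so far. The sketch below (function name and sampling scheme are my own) is only meant to illustrate the formula; it is not the polynomial-time implementation behind the run time quoted above.

import numpy as np

def cover_universal_mc(returns, num_samples=10000, seed=0):
    """Monte Carlo approximation of Cover's universal portfolio.

    returns: (T, n) array of price relatives in (0, 1]^n.
    Returns a (T, n) array of the portfolio weights used in each period.
    """
    rng = np.random.default_rng(seed)
    T, n = returns.shape
    # Portfolios sampled uniformly from the simplex, i.e. Dirichlet(1, ..., 1).
    samples = rng.dirichlet(np.ones(n), size=num_samples)
    wealth = np.ones(num_samples)            # prod_{s < t} r_s^T x for each sample
    weights = np.empty((T, n))
    for t in range(T):
        # x_t is the wealth-weighted average of the sampled portfolios.
        weights[t] = wealth @ samples / wealth.sum()
        wealth *= samples @ returns[t]
    return weights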

Follow-The-Leader

$$x_t = \arg\max_{x \in \Delta_n} \sum_{s=1}^{t-1} \log(r_s^\top x)$$

For any sequence of returns $r_1,\ldots,r_T \in (0,1]^n$, the regret of FTL is $O\!\left(\frac{n}{\hat{r}^2}\log(T)\right)$ and the per-iteration run time is $O(n^3 T)$. Here $\hat{r} = \min_{i,t} r_t[i]$.
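
Since the FTL objective is concave over the simplex, each step can be computed with any off-the-shelf constrained solver. Below is a minimal sketch using scipy's SLSQP (the function name and tolerances are my own); it illustrates the update but is not necessarily the implementation behind the run time quoted above.

import numpy as np
from scipy.optimize import minimize

def ftl_step(past_returns):
    """One Follow-The-Leader step: maximize sum_s log(r_s^T x) over the simplex.

    past_returns: (t-1, n) array of the observed price relatives r_1, ..., r_{t-1}.
    """
    n = past_returns.shape[1]
    if past_returns.shape[0] == 0:
        return np.ones(n) / n               # no data yet: play the uniform portfolio
    neg_log_wealth = lambda x: -np.sum(np.log(past_returns @ x))
    res = minimize(
        neg_log_wealth,
        np.ones(n) / n,
        method="SLSQP",
        bounds=[(1e-12, 1.0)] * n,          # keep the logs well defined
        constraints=[{"type": "eq", "fun": lambda x: np.sum(x) - 1.0}],
    )
    return res.x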

Note that the FTL algorithm does not need to know $\hat{r}$ a priori and it is never used in the algorithm. However, the regret bound depends on $\hat{r}$. This is called a return-dependent regret bound. In the case of Cover's algorithm, the regret bound is return-independent. It is possible to convert a return-dependent regret bound to a return-independent one using the standard universalization trick. This yields a regret of $O(n(\log(T))^{1/3}T^{2/3})$.
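
To see where the exponents come from, assume (loosely, this is my reading of the trick rather than a statement from the original analysis) that mixing the played portfolio with the uniform portfolio at rate $\alpha$ costs an extra $O(\alpha T)$ regret while guaranteeing an effective $\hat{r}$ of order $\alpha/n$. Balancing the two terms for FTL then recovers the bound above:

$$\alpha T + \frac{n}{(\alpha/n)^2}\log(T) = \alpha T + \frac{n^3}{\alpha^2}\log(T), \qquad \alpha \asymp n\left(\frac{\log(T)}{T}\right)^{1/3} \;\Longrightarrow\; O\!\left(n\,(\log(T))^{1/3}\,T^{2/3}\right).$$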

Exponentiated Gradient

$$x_{t+1}[j] = \frac{x_t[j]\exp\!\left(\eta\,\frac{r_t[j]}{r_t^\top x_t}\right)}{\sum_{k=1}^{n} x_t[k]\exp\!\left(\eta\,\frac{r_t[k]}{r_t^\top x_t}\right)}$$

For any sequence of returns $r_1,\ldots,r_T \in (r,1]^n$, the regret of Exponentiated Gradient with $\eta = r\sqrt{\log(n)/t}$ is $O\!\left(\frac{1}{r}\sqrt{T\log(n)}\right)$ and the per-iteration run time is $O(n)$.
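
The update itself is a single multiplicative step followed by a renormalization; a minimal sketch (the function name is my own):

import numpy as np

def eg_step(x_t, r_t, eta):
    """One Exponentiated Gradient step for portfolio selection."""
    w = x_t * np.exp(eta * r_t / np.dot(r_t, x_t))   # multiplicative update
    return w / np.sum(w)                             # renormalize onto the simplex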

Exponentiated Gradient needs to know $r$ a priori to achieve this regret. We can obtain a return-dependent regret bound by using adaptive algorithms like AdaHedge.

For any sequence of returns $r_1,\ldots,r_T \in (0,1]^n$, the regret of AdaHedge is $O\!\left(\frac{1}{\hat{r}}\sqrt{T\log(n)}\right)$ and the per-iteration run time is $O(n)$.

By using the standard universalization trick, we can convert the return-dependent regret bound to a return-independent regret bound. This yields a regret of $O(n^{1/2}(\log(n))^{1/4}T^{3/4})$.

Smooth Selection

$$x_t = \arg\max_{x \in \Delta_n} \sum_{s=1}^{t-1} \log(r_s^\top x) + \sum_{i=1}^{n} \log(x[i])$$

Smooth Selection is similar to FTL, but with an additional logarithmic barrier term. Its regret bound and run time are similar to those of FTL.
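
Relative to the FTL sketch above, the only change is the extra barrier term in the objective; a minimal sketch under the same assumptions:

import numpy as np
from scipy.optimize import minimize

def smooth_selection_step(past_returns, n):
    """One Smooth Selection step: the FTL objective plus a log barrier on the coordinates."""
    def neg_objective(x):
        log_wealth = np.sum(np.log(past_returns @ x)) if past_returns.shape[0] else 0.0
        return -(log_wealth + np.sum(np.log(x)))
    res = minimize(
        neg_objective,
        np.ones(n) / n,
        method="SLSQP",
        bounds=[(1e-12, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda x: np.sum(x) - 1.0}],
    )
    return res.x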

Online Newton Step and Follow the Approximate Leader

Let $f_t(x) = -\log(r_t^\top x)$, so $\nabla f_t(x) = -\frac{r_t}{r_t^\top x}$. The ONS algorithm is:

$$x_t = \arg\min_{x \in \Delta_n}\ \frac{\epsilon}{2}\|x\|_2^2 + \sum_{s=1}^{t-1}\left(f_s(x_s) + \nabla f_s(x_s)^\top(x - x_s) + \frac{r}{8}\left(\nabla f_s(x_s)^\top(x - x_s)\right)^2\right)$$

The FTAL algorithm is:

$$x_t = \arg\min_{x \in \Delta_n}\ \sum_{s=1}^{t-1}\left(f_s(x_s) + \nabla f_s(x_s)^\top(x - x_s) + \frac{r}{8}\left(\nabla f_s(x_s)^\top(x - x_s)\right)^2\right)$$

FTAL is the same as ONS with $\epsilon = 0$.

For any sequence of returns $r_1,\ldots,r_T \in (r,1]^n$, the regret of ONS with $\epsilon = n/r$ and of FTAL is $O\!\left(\frac{n}{r}\log(T)\right)$. The per-iteration run time is $O(n^3)$. Both ONS and FTAL need to know $r$ a priori to achieve this regret.
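
For illustration, the surrogate in either update can be minimized directly over the simplex with a generic solver. The sketch below is my own (setting eps=0 gives FTAL) and is not the standard $O(n^3)$ ONS implementation, which maintains the quadratic incrementally rather than re-solving it from scratch.

import numpy as np
from scipy.optimize import minimize

def ons_step(past_returns, past_plays, r, eps, n):
    """One ONS / FTAL step: minimize the quadratic surrogate over the simplex.

    past_returns, past_plays: lists of the observed r_s and of the portfolios x_s played.
    r: assumed lower bound on the returns; eps: regularization (0 gives FTAL).
    """
    grads = [-r_s / np.dot(r_s, x_s) for r_s, x_s in zip(past_returns, past_plays)]

    def surrogate(x):
        val = 0.5 * eps * np.dot(x, x)
        for g, x_s in zip(grads, past_plays):
            d = np.dot(g, x - x_s)
            val += d + (r / 8.0) * d ** 2
        return val

    res = minimize(
        surrogate,
        np.ones(n) / n,
        method="SLSQP",
        bounds=[(0.0, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda x: np.sum(x) - 1.0}],
    )
    return res.x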

In a recent paper, I showed that a variant of FTAL with adaptive curvature has a return-dependent regret bound.

$$x_t = \arg\min_{x \in \Delta_n}\ \sum_{s=1}^{t-1}\left(f_s(x_s) + \nabla f_s(x_s)^\top(x - x_s) + \frac{r_s^\top x_s}{2}\left(\nabla f_s(x_s)^\top(x - x_s)\right)^2\right)$$

For any sequence of returns $r_1,\ldots,r_T \in (0,1]^n$, the regret of FTAL with adaptive curvature is $O\!\left(\frac{n}{\hat{r}}\log(T)\right)$. The per-iteration run time is $O(n^3)$. Here $\hat{r} = \min_{i,t} r_t[i]$.
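
The only difference from the ONS/FTAL surrogate above is that the fixed coefficient $\frac{r}{8}$ is replaced by the observed $\frac{r_s^\top x_s}{2}$; a one-term sketch:

import numpy as np

# One term of the adaptive-curvature surrogate (notation as in the ONS sketch above).
def adaptive_curvature_term(g_s, x_s, r_s, x):
    d = np.dot(g_s, x - x_s)
    return d + 0.5 * np.dot(r_s, x_s) * d ** 2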

By using the standard universalization trick, we can convert the return-dependent regret bound to a return-independent regret bound. This yields a regret of $O(n\sqrt{T\log(T)})$.

Soft Bayes

The Soft Bayes algorithm with a constant learning rate $\eta \in [0, 1/2]$ is:

$$x_{t+1}[i] = x_t[i]\left(1 - \eta + \eta\,\frac{r_t[i]}{r_t^\top x_t}\right)$$

If we know $T$ in advance, we can pick $\eta = \sqrt{\log(n)/(Tn)}$ and get a regret bound of $O(\sqrt{Tn\log(n)})$ for any sequence of returns $r_1,\ldots,r_T \in (0,1]^n$. The per-iteration run time is $O(n)$.
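
A minimal sketch of this fixed-learning-rate version, assuming the horizon $T$ is known (the function name is my own):

import numpy as np

def soft_bayes_fixed(returns):
    """Soft Bayes with a constant learning rate, assuming the horizon T is known."""
    T, n = returns.shape
    eta = np.sqrt(np.log(n) / (T * n))
    x = np.ones(n) / n
    weights = np.empty((T, n))
    for t in range(T):
        weights[t] = x
        # Element-wise multiplicative update; the new x still sums to one.
        x = x * (1 - eta + eta * returns[t] / np.dot(returns[t], x))
    return weights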

If we do not know T in advance, the Soft Bayes algorithm needs to be modified to use a changing learning rate. The update is:

$$x_{t+1}[i] = \frac{\eta_{t+1}}{\eta_t}\,x_t[i]\left(1 - \eta_t + \eta_t\,\frac{r_t[i]}{r_t^\top x_t}\right) + \left(1 - \frac{\eta_{t+1}}{\eta_t}\right)\frac{1}{n}$$

Using the learning rate $\eta_t = \sqrt{\log(n)/(2tn)}$, we get a regret bound of $O(\sqrt{Tn\log(n)})$ for any sequence of returns $r_1,\ldots,r_T \in (0,1]^n$. The per-iteration run time is $O(n)$.

Experiments

To compare the performance of these algorithms, I ran them on datasets used in the book Online Portfolio Selection: Principles and Algorithms. Their code, written in MATLAB, is available here. I converted all the datasets to CSV format and rewrote the code in Python. The code is available here.

The code for the Soft Bayes algorithm is below:


import numpy as np
import pandas as pd

def SoftBayes(df):
    """Soft Bayes with the time-varying learning rate eta_t = sqrt(log(n)/(2 n t)).

    df has one row of price relatives per period and one column per asset;
    returns a DataFrame with the portfolio weights used in each period.
    """
    rows, cols = df.shape

    weights = pd.DataFrame(index=df.index, columns=df.columns)

    # Start from the uniform portfolio.
    x_t_plus_1 = np.ones(cols) / cols

    for t in range(rows):
        weights.iloc[t] = x_t_plus_1

        eta_t = np.sqrt(np.log(cols) / (2 * cols * (t + 1)))
        eta_t_plus_1 = np.sqrt(np.log(cols) / (2 * cols * (t + 2)))

        # Multiplicative Soft Bayes step, then shrink towards the uniform
        # portfolio to account for the decreasing learning rate.
        update = 1 - eta_t + eta_t * df.iloc[t] / np.dot(df.iloc[t], x_t_plus_1)
        mix = eta_t_plus_1 / eta_t
        x_t_plus_1 = mix * x_t_plus_1 * update + (1 - mix) / cols

    return weights
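
As a usage sketch, the wealth numbers below can be reproduced by feeding a dataset of price relatives to SoftBayes and compounding the per-period portfolio returns; the file name here is a hypothetical stand-in for any of the converted CSV files.

import numpy as np
import pandas as pd

# Hypothetical input: one row per trading period, one column of price relatives per asset.
df = pd.read_csv("nyse-o.csv")
weights = SoftBayes(df)

# Per-period portfolio returns r_t^T x_t and the cumulative wealth, starting from 1.
period_returns = (df.values * weights.values.astype(float)).sum(axis=1)
print(np.cumprod(period_returns)[-1])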

The final wealth achieved by each algorithm on each dataset is shown below.

| Dataset | BCRP       | UCRP      | SoftBayes | AdaEG     | FTAL       | FTRL       |
|---------|------------|-----------|-----------|-----------|------------|------------|
| msci    | 1.505525   | 0.926836  | 0.926564  | 0.92754   | 0.77702    | 0.908417   |
| djia    | 1.239825   | 0.812726  | 0.811729  | 0.815449  | 0.632082   | 0.735799   |
| nyse-o  | 282.407002 | 27.075246 | 27.061927 | 27.110306 | 116.549524 | 110.307681 |
| nyse-n  | 126.632982 | 31.551706 | 31.489244 | 31.711363 | 30.623937  | 40.214315  |
| sp500   | 4.307386   | 1.648714  | 1.646224  | 1.655227  | 1.511053   | 1.45872    |
| tse     | 6.816451   | 1.595225  | 1.594725  | 1.596756  | 0.901531   | 0.882498   |

The wealth of these algorithms for the NYSE-O dataset is plotted below:

[Figure: wealth over time on the NYSE-O dataset]

Written on December 31, 2023