Online Portfolio Selection - Soft Bayes

I presented a few online portfolio selection algorithms in my previous blogs (1, 2, 3, 4, 5). Here is a quick summary of the algorithms discussed so far:

Cover’s Universal Portfolio Algorithm

\[x_t = \frac{\int_{x \in \Delta_n} x \prod_{s=1}^{t-1} (r_s^\top x) dx}{\int_{x \in \Delta_n} \prod_{s=1}^{t-1} (r_s^\top x) dx}\]

For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of Cover’s algorithm is \(O(n \log T)\) and the per iteration run time is \(O(n^4 T^{14})\).
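The integrals above are intractable to evaluate exactly beyond a few assets. As a rough illustration (not the implementation behind the stated run time), here is a Monte Carlo sketch that estimates both integrals with uniform samples from the simplex; the function name and the array R_past of past returns are my own conventions:

import numpy as np

def cover_portfolio(R_past, num_samples=100_000, seed=0):
    """Monte Carlo approximation of Cover's portfolio given past returns.

    R_past: (t-1, n) array with rows r_1, ..., r_{t-1}.
    Uniform samples from the simplex (Dirichlet(1,...,1)) estimate both
    the numerator and denominator integrals.
    """
    n = R_past.shape[1]
    rng = np.random.default_rng(seed)
    X = rng.dirichlet(np.ones(n), size=num_samples)  # points uniform on the simplex
    # Wealth of each sampled portfolio over the past rounds.
    # (Log-space products would be safer numerically for long horizons.)
    wealth = np.prod(X @ R_past.T, axis=1)
    return (X * wealth[:, None]).sum(axis=0) / wealth.sum()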

Follow-The-Leader

\[x_{t} = \arg\max_{x\in \Delta_n} \sum_{s=1}^{t-1} \log(r_s^\top x)\]

For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of FTL is \(O\left(\frac{n}{\hat{r}^2} \log(T)\right)\) and the per iteration run time is \(O(n^3 T)\). Here \(\hat{r} = \min_{i,t} r_t[i]\).

Note that FTL does not need to know \(\hat{r}\) a priori; it never appears in the algorithm, only in the regret bound. Such a bound is called return-dependent, whereas Cover’s bound is return-independent. A return-dependent regret bound can be converted to a return-independent one using the standard universalization trick, which here yields a regret of \(O(n (\log(T))^{1/3} T^{2/3})\).
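For concreteness, here is a minimal FTL sketch. The maximization over the simplex is handed to a generic solver (scipy's SLSQP); the function name and the array R_past of past returns are my own conventions:

import numpy as np
from scipy.optimize import minimize

def ftl_portfolio(R_past):
    """Follow-The-Leader: maximize log-wealth over past returns R_past ((t-1, n))."""
    n = R_past.shape[1]
    if len(R_past) == 0:
        return np.ones(n) / n                        # no data yet: uniform portfolio
    neg_log_wealth = lambda x: -np.sum(np.log(R_past @ x))
    res = minimize(
        neg_log_wealth,
        x0=np.ones(n) / n,
        bounds=[(1e-12, 1.0)] * n,                   # keep r_s^T x strictly positive
        constraints=[{"type": "eq", "fun": lambda x: x.sum() - 1.0}],
        method="SLSQP",
    )
    return res.x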

Exponentiated Gradient

\[x_{t+1}[j] = \frac{x_{t}[j]\exp(\eta \frac{r_{t}[j]}{r_{t}^\top x_{t}})}{\sum_{k=1}^n x_{t}[k]\exp(\eta \frac{r_{t}[k]}{r_{t}^\top x_{t}})}\]

For any sequence of returns \(r_1,\ldots,r_T \in (r,1]^n\), the regret of Exponentiated Gradient with \(\eta_t = r\sqrt{\log(n)/t}\) is \(O\left(\frac{1}{r} \sqrt{T \log(n)}\right)\) and the per iteration run time is \(O(n)\).
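The update itself is one multiplicative reweighting followed by a normalization. A minimal sketch of a single step, assuming x and r are numpy arrays:

import numpy as np

def eg_step(x, r, eta):
    """One Exponentiated Gradient step for portfolio selection."""
    w = x * np.exp(eta * r / np.dot(r, x))   # multiplicative reweighting
    return w / w.sum()                       # renormalize onto the simplex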

Exponentiated Gradient needs to know \(r\) a priori to achieve this regret. We can obtain a return-dependent regret bound by using adaptive algorithms like AdaHedge.

For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of AdaHedge is \(O\left(\frac{1}{\hat{r}} \sqrt{T \log(n)}\right)\) and the per iteration run time is \(O(n)\).

By using the standard universalization trick, we can convert the return-dependent regret bound to a return-independent regret bound. This yields a regret of \(O\left(n^{1/2}\log(n)^{1/4} T^{3/4}\right)\).

Smooth Selection

\[x_{t} = \arg\max_{x\in \Delta_n} \sum_{s=1}^{t-1} \log(r_s^\top x) + \sum_{i=1}^n \log(x[i])\]

Smooth Selection is similar to FTL, but adds a logarithmic barrier term to the objective. Its regret bound and run time are similar to FTL's.
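A minimal sketch, identical to the FTL sketch above except for the log-barrier added to the objective:

import numpy as np
from scipy.optimize import minimize

def smooth_selection(R_past):
    """FTL objective plus a log-barrier that keeps the portfolio in the interior."""
    n = R_past.shape[1]
    objective = lambda x: -(np.sum(np.log(R_past @ x)) + np.sum(np.log(x)))
    res = minimize(
        objective,
        x0=np.ones(n) / n,
        bounds=[(1e-12, 1.0)] * n,
        constraints=[{"type": "eq", "fun": lambda x: x.sum() - 1.0}],
        method="SLSQP",
    )
    return res.x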

Online Newton Step and Follow the Approximate Leader

Let \(f_t(x) = -\log(r_t^\top x)\). So, \(\nabla f_t(x) = -\frac{r_t}{r_t^\top x}\). The ONS algorithm is:

\[x_{t} = \arg\min_{x\in \Delta_n} \frac{\epsilon}{2} \|x\|_2^2 + \sum_{s=1}^{t-1}\left( f_s(x_s) + \nabla f_s(x_s)^\top (x-x_s) + \frac{r}{8}(\nabla f_s(x_s)^\top (x-x_s))^2 \right)\]

The FTAL algorithm is:

\[x_{t} = \arg\min_{x\in \Delta_n} \sum_{s=1}^{t-1} \left(f_s(x_s) + \nabla f_s(x_s)^\top (x-x_s) + \frac{r}{8}(\nabla f_s(x_s)^\top (x-x_s))^2 \right)\]

FTAL is the same as ONS with \(\epsilon = 0\).

For any sequence of returns \(r_1,\ldots,r_T \in (r,1]^n\), the regret of ONS with \(\epsilon=n/r\) and FTAL is \(O\left(\frac{n}{r} \log(T)\right)\). The per iteration run time is \(O(n^{3})\). Both ONS and FTAL need to know \(r\) a priori to achieve this regret.

In a recent paper, I showed that a variant of FTAL with adaptive curvature has a return-dependent regret bound:

\[x_{t} = \arg\min_{x\in \Delta_n} \sum_{s=1}^{t-1} \left(f_s(x_s) + \nabla f_s(x_s)^\top (x-x_s) + \frac{r_s^\top x_s}{2}(\nabla f_s(x_s)^\top (x-x_s))^2 \right)\]

For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of FTAL with adaptive curvature is \(O\left(\frac{n}{\hat{r}} \log(T)\right)\). The per iteration run time is \(O(n^{3})\). Here \(\hat{r} = \min_{i,t} r_t[i]\).
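Each round of this update minimizes a convex quadratic over the simplex, and the quadratic can be accumulated across rounds, so the per-round cost stays independent of \(t\). Below is a minimal sketch of one round under that accumulation; the closed forms for A and b come from expanding the squared terms, and the function and variable names are my own:

import numpy as np
from scipy.optimize import minimize

def ftal_adaptive_step(A, b, x_t, r_t):
    """Fold round t into the accumulated quadratic and solve for x_{t+1}.

    Up to a constant, the surrogate losses sum to (1/2) x^T A x + b^T x with
    A = sum_s c_s g_s g_s^T and b = sum_s (1 - c_s g_s^T x_s) g_s, where
    g_s = -r_s / (r_s^T x_s) and the adaptive curvature is c_s = r_s^T x_s.
    """
    n = len(x_t)
    c = np.dot(r_t, x_t)                  # adaptive curvature r_t^T x_t
    g = -r_t / c                          # gradient of -log(r_t^T x) at x_t
    A = A + c * np.outer(g, g)
    b = b + (1.0 - c * np.dot(g, x_t)) * g
    quad = lambda x: 0.5 * x @ A @ x + b @ x
    res = minimize(quad, x0=x_t, jac=lambda x: A @ x + b,
                   bounds=[(0.0, 1.0)] * n,
                   constraints=[{"type": "eq", "fun": lambda x: x.sum() - 1.0}],
                   method="SLSQP")
    return A, b, res.x

Setting the curvature to the constant \(r/4\) instead of \(r_t^\top x_t\) recovers the FTAL update above, since \(\frac{r}{8} = \frac{1}{2}\cdot\frac{r}{4}\).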

By using the standard universalization trick, we can convert the return-dependent regret bound to a return-independent regret bound. This yields a regret of \(O\left(n \sqrt{T \log(T)}\right)\).

Soft Bayes

The Soft Bayes algorithm with a constant learning rate \(\eta \in [0,1/2]\) is:

\[x_{t+1} = x_t\left(1-\eta+\eta \frac{r_t}{r_t^\top x_t}\right)\]

If we know \(T\) in advance, we can pick \(\eta = \sqrt{\log(n)/(Tn)}\) and get a regret bound of \(O\left(\sqrt{Tn\log(n)}\right)\) for any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\). The per iteration run time is \(O(n)\).
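A single constant-rate step is one line; note that it stays on the simplex because the EG-style correction term averages to one under \(x_t\). A minimal sketch, assuming x and r are numpy arrays:

import numpy as np

def soft_bayes_step(x, r, eta):
    """One Soft Bayes step with constant learning rate eta in [0, 1/2].

    Sums to one whenever x does: the correction term contributes eta and
    the (1 - eta) term keeps the rest of the mass on the current portfolio.
    """
    return x * (1 - eta + eta * r / np.dot(r, x))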

If we do not know \(T\) in advance, the Soft Bayes algorithm needs to be modified to use a changing learning rate. The update is:

\[x_{t+1} = x_t\left(1-\eta_t+\eta_t \frac{r_t}{r_t^\top x_t}\right) \frac{\eta_{t+1}}{\eta_t} + \left(1-\frac{\eta_{t+1}}{\eta_t}\right)\frac{1}{n}\]

Using the learning rate \(\eta_t = \sqrt{\log(n)/(2tn)}\), we get a regret bound of \(O\left(\sqrt{Tn\log(n)}\right)\) for any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\). The per iteration run time is \(O(n)\).

Experiments

To compare the performance of these algorithms, I ran them on the datasets used in the book Online Portfolio Selection: Principles and Algorithms. Their code, written in MATLAB, is available here. I converted all the datasets to CSV format and rewrote the code in Python. The code is available here.

The code for the Soft Bayes algorithm is below:


import numpy as np
import pandas as pd

def SoftBayes(df):
    """Soft Bayes with time-varying learning rate eta_t = sqrt(log(n)/(2tn)).

    df: DataFrame of price relatives, one row per period and one column per asset.
    Returns a DataFrame of portfolio weights, one row per period.
    """
    rows, cols = df.shape

    weights = pd.DataFrame(index=df.index, columns=df.columns)

    # Start from the uniform portfolio.
    x_t_plus_1 = np.ones(cols)/cols

    for t in range(rows):
        weights.iloc[t] = x_t_plus_1

        eta_t = np.sqrt(np.log(cols)/(2*cols*(t+1)))
        eta_t_plus_1 = np.sqrt(np.log(cols)/(2*cols*(t+2)))

        # Soft Bayes update, mixed with the uniform portfolio to handle
        # the shrinking learning rate.
        x_t_plus_1 = x_t_plus_1*(1 - eta_t + eta_t*df.iloc[t]/np.dot(df.iloc[t], x_t_plus_1))*eta_t_plus_1/eta_t \
            + (1 - eta_t_plus_1/eta_t)/cols
    return weights
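To reproduce wealth numbers like those below, fold the weights into a cumulative product of per-period portfolio returns. A hypothetical helper along these lines, called as wealth(df, SoftBayes(df)):

def wealth(df, weights):
    """Final wealth of a weight sequence on price-relative returns df, starting from 1."""
    period_returns = (df * weights).sum(axis=1)   # r_t^T x_t for each period
    return period_returns.prod()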

The final wealth achieved by each algorithm on each dataset is shown below.

| Dataset | BCRP | UCRP | SoftBayes | AdaEG | FTAL | FTRL |
|---------|------|------|-----------|-------|------|------|
| msci | 1.505525 | 0.926836 | 0.926564 | 0.92754 | 0.77702 | 0.908417 |
| djia | 1.239825 | 0.812726 | 0.811729 | 0.815449 | 0.632082 | 0.735799 |
| nyse-o | 282.407002 | 27.075246 | 27.061927 | 27.110306 | 116.549524 | 110.307681 |
| nyse-n | 126.632982 | 31.551706 | 31.489244 | 31.711363 | 30.623937 | 40.214315 |
| sp500 | 4.307386 | 1.648714 | 1.646224 | 1.655227 | 1.511053 | 1.45872 |
| tse | 6.816451 | 1.595225 | 1.594725 | 1.596756 | 0.901531 | 0.882498 |

The wealth of these algorithms for the NYSE-O dataset is plotted below:

[Figure: wealth of the algorithms over time on the NYSE-O dataset]

Written on December 31, 2023