Online Portfolio Selection - Soft Bayes
I presented a few online portfolio selection algorithms in my previos blogs (1, 2, 3, 4, 5). Here is a quick summary of the algorithms discussed so far:
Cover’s Universal Portfolio Algorithm
\[x_t = \frac{\int_{x \in \Delta_n} x \prod_{s=1}^{t-1} (r_s^\top x) dx}{\int_{x \in \Delta_n} \prod_{s=1}^{t-1} (r_s^\top x) dx}\]For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of Cover’s algorithm is \(O(n \log T)\) and the per iteration run time is \(O(n^4 T^{14})\).
Follow-The-Leader
\[x_{t} = \arg\max_{x\in \Delta_n} \sum_{s=1}^{t-1} \log(r_s^\top x)\]For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of FTL is \(O\left(\frac{n}{\hat{r}^2} \log(T)\right)\) and the per iteration run time is \(O(n^3 T)\). Here \(\hat{r} = \min_{i,t} r_t[i]\).
Note that the FTL algorithm does not need to know \(\hat{r}\) aprior and it is never used in the algorithm. However, the regret bound depends on \(\hat{r}\). This is called a return-dependent regret bound. In the case of Cover’s algorithm, the regret bound is return-independent. It is possible to convert a return-dependent regret bound to a return-independent regret bound using the standard universalization trick. This yields a regret of \(O(n (\log(T))^{1/3} T^{2/3})\).
Exponentiated Gradient
\[x_{t+1}[j] = \frac{x_{t}[j]\exp(\eta \frac{r_{t}[j]}{r_{t}^\top x_{t}})}{\sum_{k=1}^n x_{t}[k]\exp(\eta \frac{r_{t}[k]}{r_{t}^\top x_{t}})}\]For any sequence of returns \(r_1,\ldots,r_T \in (r,1]^n\), the regret of Exponentiated Gradient with \(\eta = r\sqrt{\log(n)/t}\) is \(O\left(\frac{1}{r} \sqrt{T \log(n)}\right)\) and the per iteration run time is \(O(n)\).
Exponentiated Gradient needs to know $r$ aprior to achieve this regret. We can obtain a return-dependent regret bound by using adaptive algorithms like AdaHedge.
For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of AdaHedge is \(O\left(\frac{1}{\hat{r}} \sqrt{T \log(n)}\right)\) and the per iteration run time is \(O(n)\).
By using the standard universalization trick, we can convert the return-dependent regret bound to a return-independent regret bound. This yields a regret of \(O\left(n^{1/2}\log(n)^{1/4} T^{3/4}\right)\).
Smooth Selection
\[x_{t} = \arg\max_{x\in \Delta_n} \sum_{s=1}^{t-1} \log(r_s^\top x) + \sum_{i=1}^n \log(x[i])\]Smooth Selection is similar to FTL, but with an additional logarithmic barrier term. It has a similar regret bound and run time as FTL.
Online Newton Step and Follow the Approximate Leader
Let \(f_t(x) = -\log(r_t^\top x)\). So, \(\nabla f_t(x) = -\frac{r_t}{r_t^\top x}\). The ONS algorithm is:
\[x_{t} = \arg\min_{x\in \Delta_n} \frac{\epsilon}{2} \|x\|_2^2 + \sum_{s=1}^{t-1}\left( f_s(x_s) + \nabla f_s(x_s)^\top (x-x_s) + \frac{r}{8}(\nabla f_s(x_s)^\top (x-x_s))^2 \right)\]The FTAL algorithm is:
\[x_{t} = \arg\min_{x\in \Delta_n} \sum_{s=1}^{t-1} \left(f_s(x_s) + \nabla f_s(x_s)^\top (x-x_s) + \frac{r}{8}(\nabla f_s(x_s)^\top (x-x_s))^2 \right)\]FTAL is the same as ONS with \(\epsilon = 0\).
For any sequence of returns \(r_1,\ldots,r_T \in (r,1]^n\), the regret of ONS with \(\epsilon=n/r\) and FTAL is \(O\left(\frac{n}{r} \log(T)\right)\). The per iteration run time is \(O(n^{3})\). Both ONS and FTAL need to know \(r\) aprior to achieve this regret.
In a recent paper, I showed that an FTAL with adaptive curvature has a return-dependent regret bound.
\[x_{t} = \arg\min_{x\in \Delta_n} \sum_{s=1}^{t-1} \left(f_s(x_s) + \nabla f_s(x_s)^\top (x-x_s) + \frac{r_s^\top x_s}{2}(\nabla f_s(x_s)^\top (x-x_s))^2 \right)\]For any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\), the regret of FTAL with adaptive curvature is \(O\left(\frac{n}{\hat{r}} \log(T)\right)\). The per iteration run time is \(O(n^{3})\). Here \(\hat{r} = \min_{i,t} r_t[i]\).
By using the standard universalization trick, we can convert the return-dependent regret bound to a return-independent regret bound. This yields a regret of \(O\left(n \sqrt{T \log(T)}\right)\).
Soft Bayes
The Soft Bayes algorithm with a constant learning rate \(\eta \in [0,1/2]\) is:
\[x_{t+1} = x_t\left(1-\eta+\eta \frac{r_t}{r_t^\top x_t}\right)\]If we know \(T\) in advance, we can pick \(\eta = \sqrt{\log(n)/(Tn)}\) and get a regret bound of \(O\left(\sqrt{Tn\log(n)}\right)\) for any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\). The per iteration run time is \(O(n)\).
If we do not know \(T\) in advance, the Soft Bayes algorithm needs to be modified to use a changing learning rate. The update is:
\[x_{t+1} = x_t\left(1-\eta_t+\eta_t \frac{r_t}{r_t^\top x_t}\right) \frac{\eta_{t+1}}{\eta_t} + \left(1-\frac{\eta_{t+1}}{\eta_t}\right)\frac{1}{n}\]Using the learning rate \(\eta_t = \sqrt{\log(n)/(2tn)}\), we get a regret bound of \(O\left(\sqrt{Tn\log(n)}\right)\) for any sequence of returns \(r_1,\ldots,r_T \in (0,1]^n\). The per iteration run time is \(O(n)\).
Experiments
To compare the performance of these algorithms, I ran them on datasets used in the book Online Portfolio Selection: Principles and Algorithms. Their code, writtein in MATLAB is available here. I converted all the datasets to CSV format and rewrote the code in Python. The code is available here.
The code for the Soft Bayes algorithm is below:
def SoftBayes(df):
rows, cols = df.shape
weights = pd.DataFrame(index=df.index, columns=df.columns)
x_t_plus_1 = np.ones(cols)/cols
for t in range(rows):
weights.iloc[t] = x_t_plus_1
eta_t = np.sqrt(np.log(cols)/ (2*cols * (t+1)))
eta_t_plus_1 = np.sqrt(np.log(cols)/ (2*cols * (t+2)))
x_t_plus_1 = x_t_plus_1 * (1-eta_t + eta_t*df.iloc[t]/ np.dot(df.iloc[t],x_t_plus_1))*eta_t_plus_1/eta_t + (1-eta_t_plus_1/eta_t)*1/cols
return weights
The wealth of various algorithms on the datasets is shown below.
Dataset | BCRP | UCRP | SoftBayes | AdaEG | FTAL | FTRL |
---|---|---|---|---|---|---|
msci | 1.505525 | 0.926836 | 0.926564 | 0.92754 | 0.77702 | 0.908417 |
djia | 1.239825 | 0.812726 | 0.811729 | 0.815449 | 0.632082 | 0.735799 |
nyse-o | 282.407002 | 27.075246 | 27.061927 | 27.110306 | 116.549524 | 110.307681 |
nyse-n | 126.632982 | 31.551706 | 31.489244 | 31.711363 | 30.623937 | 40.214315 |
sp500 | 4.307386 | 1.648714 | 1.646224 | 1.655227 | 1.511053 | 1.45872 |
tse | 6.816451 | 1.595225 | 1.594725 | 1.596756 | 0.901531 | 0.882498 |
The wealth of these algorithms for the NYSE-O dataset is plotted below: