Out-of-Sample Performance
Every number on this page comes directly from the most recent bi-weekly pipeline run. No synthesized backtest, no carried-forward figures. The panel below is short — this is a live research process with a small OOS window, not a polished marketing backtest.
HYPOTHETICAL PERFORMANCE RESULTS have many inherent limitations. No representation is being made that any account will or is likely to achieve profits or losses similar to those shown. Past performance, whether actual or hypothetical, is not indicative of future results.
Every CAGR, Sharpe, MaxDD, and Calmar number below is from a hypothetical long-short equities portfolio: long the top quintile of predictions, short the bottom quintile, equally weighted, rebalanced monthly. The figures assume no leverage and ignore stock borrow cost and slippage.
The options overlay shown per ticker on the Signal Dashboard (optROC, atmROC, otm20ROC, putCrROC) is a forward Monte Carlo from today's IV surface — not a historical backtest.
We attempted a real historical backtest against IBKR's historical options API on 2026-04-24. It was structurally blocked: IBKR's contract-resolution endpoint does not reliably return already-expired option contracts even with includeExpired=True, so we could not retrieve premiums for the historical entry dates that the walk-forward OOS panel needs. The options overlay is therefore a forward-only track: the first realised P&L row lands on live_vs_backtest.csv around 2026-05-23, and a full year of forward data is complete by 2027-04. No historical options CAGR / Sharpe / MaxDD is published here until forward data is deep enough to be statistically meaningful.
Walk-Forward OOS Panel — Equities L/S
Monthly OOS Information Coefficient
19 months · mean = 0.027
Information Coefficient = rank correlation between predicted 1-month forward return and realised cross-sectional return. Positive bars mean the ranking was directionally useful that month; negative bars mean it was not. With 19 data points, individual months carry little statistical weight.
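As a minimal sketch of the definition above, the monthly IC can be computed as a Spearman rank correlation over one month's cross-section. The function names are illustrative and are not taken from the pipeline; the sketch assumes the inputs are not constant, since a constant series has no defined rank correlation.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation; undefined (division by zero) for constant input."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def information_coefficient(predicted, realised):
    """Spearman rank correlation for one month's cross-section of stocks."""
    return pearson(average_ranks(predicted), average_ranks(realised))
```

The panel-level figure is then the plain mean of the monthly values, one per OOS month.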
Hypothetical L/S Equities Portfolio
Long top-quintile / short bottom-quintile of the model's cross-sectional ranking, equally weighted within each leg, rebalanced monthly. No leverage, no execution costs, no stock borrow cost. Stocks only — the options overlay is not in these numbers. Point estimates only; with 19 OOS months the confidence intervals on every number below are wide.
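The construction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the pipeline's code: the function names are hypothetical, the monthly L/S return is taken as the long-leg average minus the short-leg average, and each leg holds the floor of 20% of the universe (at least one name).

```python
def long_short_return(predictions, realised, quantile=0.2):
    """One-month return of an equal-weight long-top / short-bottom portfolio.

    predictions: dict of ticker -> model score for the month
    realised:    dict of ticker -> realised 1-month return
    No costs, borrow fees, or slippage are modelled.
    """
    ranked = sorted(predictions, key=predictions.get)  # lowest score first
    n_leg = max(1, int(len(ranked) * quantile))
    short_leg = ranked[:n_leg]
    long_leg = ranked[-n_leg:]
    long_ret = sum(realised[t] for t in long_leg) / n_leg
    short_ret = sum(realised[t] for t in short_leg) / n_leg
    return long_ret - short_ret


def cumulative_path(monthly_returns, start=1.0):
    """Compound monthly L/S returns into a path that begins at `start`."""
    path = [start]
    for r in monthly_returns:
        path.append(path[-1] * (1 + r))
    return path
```

Calling `long_short_return` once per OOS month and feeding the results to `cumulative_path` gives a path that starts at 1.00.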
Cumulative Path
start = 1.00
This path is the live walk-forward, not a smoothed backtest. Treat the headline numbers as point estimates with wide confidence intervals: the bootstrap 95% CI on Sharpe spans [−0.37, +1.62]. The significance battery below shows DSR and stationary bootstrap still failing at 5% with n=42.
Regime-Conditional Performance
Point-in-time VIX

| Regime | Months | % of panel | Mean monthly L/S | Annualised Sharpe | Hit rate |
|---|---|---|---|---|---|
| Moderate | 19 | 100% | +1.86% | +0.74 | 53% |
Regimes are assigned from the VIX level at the end of each prediction month, not from the current snapshot. The headline Sharpe blends these regimes, so the forward-looking expected Sharpe depends on which regime you're actually in. If the next 12 months are mostly Moderate, the realised Sharpe will be closer to the Moderate bucket's number than to the panel-wide headline. In the current panel all 19 months fall in the Moderate bucket, so the two numbers coincide.
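A rough sketch of how the per-regime columns can be derived from a list of (regime, monthly L/S return) pairs. It assumes a zero risk-free rate and sqrt(12) annualisation, neither of which is stated on this page; the VIX thresholds that define each regime are not published here, so labels are taken as given.

```python
from collections import defaultdict
from statistics import mean, stdev


def regime_table(rows):
    """rows: iterable of (regime_label, monthly_ls_return) pairs."""
    by_regime = defaultdict(list)
    for regime, ret in rows:
        by_regime[regime].append(ret)
    total = sum(len(rets) for rets in by_regime.values())

    table = {}
    for regime, rets in by_regime.items():
        # Sharpe needs at least two months and non-zero dispersion.
        sd = stdev(rets) if len(rets) > 1 else 0.0
        sharpe = mean(rets) / sd * 12 ** 0.5 if sd > 0 else float("nan")
        table[regime] = {
            "months": len(rets),
            "share_of_panel": len(rets) / total,
            "mean_monthly": mean(rets),
            "annualised_sharpe": sharpe,
            "hit_rate": sum(r > 0 for r in rets) / len(rets),
        }
    return table
```

With a single populated regime, the bucket's statistics are the panel's statistics by construction.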
Significance Battery
Deflated Sharpe
Fail
Bailey & López de Prado (2014). Corrects the Sharpe ratio for multiple-testing bias across the search space.
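For reference, the Deflated Sharpe Ratio from that paper can be sketched as below. The inputs are per-period (here monthly, not annualised) Sharpe ratios. The number of trials and the cross-trial variance of Sharpe ratios are not published on this page, so any values passed in are the caller's assumptions; the function and parameter names are illustrative.

```python
from math import e, sqrt
from statistics import NormalDist

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def expected_max_sharpe(sr_variance, n_trials):
    """Expected maximum Sharpe among n_trials (>= 2) skill-less strategies."""
    z = NormalDist()
    return sqrt(sr_variance) * (
        (1 - EULER_GAMMA) * z.inv_cdf(1 - 1 / n_trials)
        + EULER_GAMMA * z.inv_cdf(1 - 1 / (n_trials * e))
    )


def deflated_sharpe(sr, n_obs, skew, kurt, sr_variance, n_trials):
    """Probability that the true Sharpe exceeds the multiple-testing benchmark.

    sr:          observed per-period Sharpe ratio
    n_obs:       number of return observations
    skew, kurt:  skewness and (non-excess) kurtosis of returns; 0 and 3 if normal
    sr_variance: variance of per-period Sharpe ratios across the trials
    n_trials:    number of strategy configurations tried (must be >= 2)
    """
    sr0 = expected_max_sharpe(sr_variance, n_trials)
    denom = sqrt(1 - skew * sr + (kurt - 1) / 4 * sr ** 2)
    return NormalDist().cdf((sr - sr0) * sqrt(n_obs - 1) / denom)
```

A test at the 5% level passes only when the returned probability exceeds 0.95, which becomes harder as the number of trials grows or the sample shortens.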
Stationary Bootstrap
Fail
Politis & Romano (1994). Two-sided test for mean monthly IC ≠ 0, robust to autocorrelation.
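A minimal sketch of such a test, assuming the monthly IC series is available as a list. Blocks have geometrically distributed lengths and wrap around the end of the series, as in the stationary bootstrap; the null is imposed by centring the series on its own mean. The mean block length, resample count, and seed are illustrative choices, not the pipeline's settings.

```python
import random
from statistics import mean


def stationary_bootstrap_sample(series, mean_block, rng):
    """Draw one resample: continue the current block or jump to a random start."""
    n = len(series)
    p_new_block = 1.0 / mean_block
    idx = rng.randrange(n)
    out = []
    for _ in range(n):
        out.append(series[idx])
        if rng.random() < p_new_block:
            idx = rng.randrange(n)   # start a new block
        else:
            idx = (idx + 1) % n      # extend the block, wrapping circularly
    return out


def bootstrap_p_value(series, mean_block=3.0, n_boot=2000, seed=0):
    """Two-sided p-value for H0: the mean of `series` is zero."""
    rng = random.Random(seed)
    observed = mean(series)
    centred = [x - observed for x in series]  # impose the null
    hits = 0
    for _ in range(n_boot):
        resample = stationary_bootstrap_sample(centred, mean_block, rng)
        if abs(mean(resample)) >= abs(observed):
            hits += 1
    return (hits + 1) / (n_boot + 1)
```

The test fails at 5% whenever the returned p-value is 0.05 or larger.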
Purged Hold-Out
Inconsistent
CPCV-lite: drop the last k months, re-run the walk-forward, and require the surviving IC to match the full-panel sign and ≥ 50% of its magnitude.
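The acceptance rule can be sketched as a comparison of two mean ICs. The walk-forward re-run itself is outside this sketch: `surviving_ics` stands for whatever monthly ICs that re-run produces, and the function name is hypothetical.

```python
from statistics import mean


def holdout_consistent(full_ics, surviving_ics, min_ratio=0.5):
    """Check the surviving mean IC against the full-panel mean IC.

    Passes only if both means share a sign and the surviving mean keeps
    at least `min_ratio` of the full-panel magnitude.
    """
    full = mean(full_ics)
    surviving = mean(surviving_ics)
    same_sign = full * surviving > 0
    return same_sign and abs(surviving) >= min_ratio * abs(full)
```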
Factor Information Coefficients
4 kept · 7 dropped

| Factor | Mean IC | t-stat | p-value | n months | Significant | Kept |
|---|---|---|---|---|---|---|
| low_vol_60d | -0.112 | -1.70 | 0.101 | 25 | no | ✓ |
| beta_residual | +0.102 | 1.84 | 0.079 | 25 | no | ✓ |
| sector_neutral_momentum | +0.092 | 1.66 | 0.110 | 25 | no | ✓ |
| momentum_12_1 | +0.088 | 1.38 | 0.179 | 25 | no | ✓ |
| price_acceleration | -0.033 | -0.58 | 0.569 | 25 | no | ✗ |
| volume_trend | +0.031 | 0.77 | 0.447 | 25 | no | ✗ |
| ret_5d | -0.030 | -0.55 | 0.588 | 25 | no | ✗ |
| momentum_3_1 | +0.028 | 0.47 | 0.641 | 25 | no | ✗ |
| volume_shock | -0.022 | -0.53 | 0.603 | 25 | no | ✗ |
| short_term_reversal | -0.022 | -0.37 | 0.711 | 25 | no | ✗ |
| volatility_ratio | +0.021 | 0.74 | 0.468 | 25 | no | ✗ |
Each factor is tested cross-sectionally per month over a 25-month panel. “Kept” factors survive the IC screen (p < 0.10 or |mean IC| ≥ 0.05) and enter the ensemble. “Significant” means p < 0.05 from a t-test on the monthly IC series. At this sample size, expect most factors to fall short of individual significance even when the ensemble has predictive value.
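The screen can be reproduced from the table's own mean IC and p-value columns. The sketch below applies the stated rule and also shows the t-statistic for a monthly IC series; converting that statistic to a p-value needs a Student-t distribution, which is left out to keep the example to the standard library. Function names are illustrative.

```python
from statistics import mean, stdev

# (factor, mean IC, p-value) copied from the table above.
FACTORS = [
    ("low_vol_60d", -0.112, 0.101),
    ("beta_residual", 0.102, 0.079),
    ("sector_neutral_momentum", 0.092, 0.110),
    ("momentum_12_1", 0.088, 0.179),
    ("price_acceleration", -0.033, 0.569),
    ("volume_trend", 0.031, 0.447),
    ("ret_5d", -0.030, 0.588),
    ("momentum_3_1", 0.028, 0.641),
    ("volume_shock", -0.022, 0.603),
    ("short_term_reversal", -0.022, 0.711),
    ("volatility_ratio", 0.021, 0.468),
]


def passes_ic_screen(mean_ic, p_value, p_cut=0.10, ic_cut=0.05):
    """Keep a factor if p < 0.10 or |mean IC| >= 0.05."""
    return p_value < p_cut or abs(mean_ic) >= ic_cut


def ic_t_stat(monthly_ics):
    """One-sample t-statistic for H0: the mean monthly IC is zero."""
    n = len(monthly_ics)
    return mean(monthly_ics) / (stdev(monthly_ics) / n ** 0.5)


kept = [name for name, ic, p in FACTORS if passes_ic_screen(ic, p)]
dropped = [name for name, ic, p in FACTORS if not passes_ic_screen(ic, p)]
```

Three of the four kept factors pass on the magnitude criterion alone; only beta_residual also clears p < 0.10.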
Note on low_vol_60d: the factor is defined as the negative of 60-day realised vol so that, under the classical low-vol anomaly, high factor values would map to low-vol stocks and positive forward returns. The panel IC over 2022–2026 came out negative — i.e. in this mega-cap sample high-vol names outperformed, driven by the post-2022 AI tech rally. The ensemble learns the realised sign from the panel; the IC screen kept the factor on the magnitude criterion (|IC| ≥ 0.05), not the direction criterion. Treat this as regime-specific: in a future period where low-vol mean-reverts to the historical anomaly, the ensemble would need to retrain.
Current Ensemble
Model Weights
Source: inverse_oos_mae
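The label suggests weights proportional to the inverse of each model's out-of-sample mean absolute error. The sketch below shows that reading only; it is an assumption about what `inverse_oos_mae` means, and the model names in the usage note are hypothetical.

```python
def inverse_mae_weights(oos_mae):
    """Map model name -> weight proportional to 1 / OOS MAE, normalised to sum to 1.

    oos_mae: dict of model name -> strictly positive out-of-sample MAE.
    """
    inverse = {name: 1.0 / mae for name, mae in oos_mae.items()}
    total = sum(inverse.values())
    return {name: value / total for name, value in inverse.items()}
```

For example, `inverse_mae_weights({"model_a": 1.0, "model_b": 2.0})` gives model_a twice the weight of model_b, because its error is half as large.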
Regime & Universe
See the current signal snapshot, or read the methodology behind every stage.