Methodology

This page documents the V11 pipeline in full: every step is described, every assumption is stated, and every reported result is out-of-sample.

Pipeline Overview

1. Universe & Data Collection

46 US mega-cap equities drawn from the S&P 100 and filtered for options liquidity (daily open interest of 500+ contracts). Price, volume, and fundamental data come from standard market data providers and are reconciled daily.

2. Factor Construction

Nine cross-sectional factors are computed per stock. Each is tested for its information coefficient (IC), and factors with |IC| < 0.01 are pruned. The remaining factors are winsorized at the 5th and 95th percentiles, then z-score normalized.
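The pruning and normalization step can be sketched in plain Python as follows. This is a minimal illustration, not the pipeline's code: it assumes IC means the Spearman rank correlation between a factor and forward returns, and the function names (`rank_ic`, `winsorize`, `zscore`, `prepare_factor`) are hypothetical.

```python
from statistics import mean, pstdev

def rank(values):
    """Average ranks (ties share the mean of their positions)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; returns 0.0 if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0

def rank_ic(factor, fwd_returns):
    """Assumed IC definition: Spearman rank correlation."""
    return pearson(rank(factor), rank(fwd_returns))

def percentile(sorted_vals, q):
    """Linearly interpolated percentile for q in [0, 1]."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])

def winsorize(values, lower=0.05, upper=0.95):
    """Clip values to the [lower, upper] percentile band."""
    s = sorted(values)
    lo, hi = percentile(s, lower), percentile(s, upper)
    return [min(max(v, lo), hi) for v in values]

def zscore(values):
    """Cross-sectional z-score; all zeros if the input is constant."""
    mu, sd = mean(values), pstdev(values)
    return [(v - mu) / sd if sd > 0 else 0.0 for v in values]

def prepare_factor(factor, fwd_returns, min_abs_ic=0.01):
    """Return the normalized factor, or None if it is pruned."""
    if abs(rank_ic(factor, fwd_returns)) < min_abs_ic:
        return None
    return zscore(winsorize(factor))
```

Winsorizing before z-scoring keeps a single extreme value from inflating the standard deviation and compressing every other stock's score.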

3. Adaptive Ensemble

Ridge, ElasticNet, and Gradient Boosting models are trained on the factor matrix. Ensemble weights are set by inverse out-of-sample (OOS) mean absolute error (MAE), with a regime adjustment. James-Stein shrinkage is then applied to the blended predictions.
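A rough sketch of the weighting and shrinkage logic, under stated assumptions: the regime adjustment is modelled as hypothetical multiplicative tilts that are renormalized, and the shrinkage is the positive-part James-Stein estimator toward the cross-sectional mean with a caller-supplied noise variance. None of these details are confirmed by the description above, and all function names are illustrative.

```python
def inverse_mae_weights(mae_by_model):
    """Weights proportional to 1 / OOS MAE, normalized to sum to 1."""
    inv = {m: 1.0 / e for m, e in mae_by_model.items()}
    total = sum(inv.values())
    return {m: v / total for m, v in inv.items()}

def apply_regime_tilt(weights, tilts):
    """Hypothetical regime adjustment: scale weights, then renormalize."""
    tilted = {m: w * tilts.get(m, 1.0) for m, w in weights.items()}
    total = sum(tilted.values())
    return {m: v / total for m, v in tilted.items()}

def blend(preds_by_model, weights):
    """Weighted average of per-stock predictions across models."""
    n = len(next(iter(preds_by_model.values())))
    return [sum(weights[m] * preds_by_model[m][i] for m in weights)
            for i in range(n)]

def james_stein(preds, noise_var):
    """Positive-part James-Stein shrinkage toward the mean of preds.

    noise_var is an assumed estimate of the prediction noise variance.
    """
    n = len(preds)
    if n < 4:
        return list(preds)  # the n - 3 factor needs at least 4 points
    mu = sum(preds) / n
    ss = sum((p - mu) ** 2 for p in preds)
    if ss == 0:
        return list(preds)
    shrink = max(0.0, 1.0 - (n - 3) * noise_var / ss)
    return [mu + shrink * (p - mu) for p in preds]
```

For example, `apply_regime_tilt(weights, {"gboost": 1.5})` would raise the Gradient Boosting weight in a regime where it is expected to do better, while keeping the weights summing to 1.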

4. Monte Carlo & Vol Surface

Each stock is simulated with 80,000 geometric Brownian motion (GBM) paths. The model-implied volatility is compared against the market at-the-money (ATM) implied volatility (IV) to compute the volatility edge metric.
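As an illustration, the simulation and comparison might look like the sketch below. It simulates terminal prices only, and it assumes the volatility edge is the simple difference between model volatility and market ATM IV (positive meaning options look cheap). Both the sign convention and the function names are assumptions, not the pipeline's definitions.

```python
import math
import random

def simulate_terminal_prices(s0, mu, sigma, horizon_years,
                             n_paths=80_000, seed=0):
    """Terminal prices under GBM: S_T = S_0 * exp((mu - sigma^2/2) T + sigma sqrt(T) Z)."""
    rng = random.Random(seed)
    drift = (mu - 0.5 * sigma ** 2) * horizon_years
    diffusion = sigma * math.sqrt(horizon_years)
    return [s0 * math.exp(drift + diffusion * rng.gauss(0.0, 1.0))
            for _ in range(n_paths)]

def annualized_vol(terminal_prices, s0, horizon_years):
    """Annualized volatility recovered from simulated log returns."""
    logs = [math.log(p / s0) for p in terminal_prices]
    m = sum(logs) / len(logs)
    var = sum((x - m) ** 2 for x in logs) / (len(logs) - 1)
    return math.sqrt(var / horizon_years)

def vol_edge(model_vol, market_atm_iv):
    """Assumed definition: positive when the model sees more vol than the market prices."""
    return model_vol - market_atm_iv
```

With 80,000 paths, the sampling error on the recovered volatility is small relative to typical gaps between model and market volatility, which is presumably why a large path count is used.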

5. Options Strategy Selection

Five strategies are evaluated per stock: stock, ATM call, OTM Δ20 call, bull call spread, and put credit spread. Each is re-priced with the evidence-based spread model (R² = 0.405), and selection follows the conviction × vol-edge matrix.

6. Portfolio Construction

Positions are sized with a drawdown-adjusted Kelly criterion, subject to intra-sector diversification constraints. Options allocation is capped at 30% for the HIGH tier and 15% for the MED tier.
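The exact drawdown adjustment is not specified here, so the following is only one plausible sketch. It assumes the continuous-time Kelly fraction (expected excess return divided by variance), a fractional-Kelly multiplier, and a linear scale-down that reaches zero at a hypothetical maximum drawdown. The 30% and 15% caps come from the text; everything else is illustrative.

```python
OPTIONS_CAP = {"HIGH": 0.30, "MED": 0.15}  # caps stated in the text

def kelly_fraction(expected_excess_return, variance):
    """Continuous-time Kelly fraction mu / sigma^2, floored at zero."""
    if variance <= 0:
        raise ValueError("variance must be positive")
    return max(0.0, expected_excess_return / variance)

def drawdown_adjusted_kelly(expected_excess_return, variance,
                            current_drawdown, max_drawdown=0.20,
                            kelly_multiplier=0.5):
    """Assumed adjustment: scale Kelly linearly to zero at max_drawdown."""
    scale = max(0.0, 1.0 - current_drawdown / max_drawdown)
    base = kelly_fraction(expected_excess_return, variance)
    return kelly_multiplier * base * scale

def cap_options_allocation(requested, tier):
    """Apply the tier cap; tiers without a cap get no options allocation."""
    return min(requested, OPTIONS_CAP.get(tier, 0.0))
```

Treating the LOW tier as a zero options allocation is consistent with the "Stock only" row of the conviction × vol-edge matrix.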

Factor Definitions

| # | Factor | Description |
|---|--------|-------------|
| 1 | Momentum (12-1) | Trailing 12-month return excluding the most recent month |
| 2 | Short-Term Reversal | 1-month lagged return capturing mean-reversion |
| 3 | Earnings Surprise | Latest quarterly EPS surprise as % of consensus |
| 4 | ROE Change | Year-over-year change in return on equity |
| 5 | Analyst Revision | Net 90-day EPS estimate revisions normalized by price |
| 6 | Volatility Ratio | 30d / 90d realized volatility ratio |
| 7 | Volume Trend | 20-day volume SMA relative to 60-day baseline |
| 8 | Beta Residual | CAPM beta-adjusted residual return |
| 9 | IV Percentile | 30-day IV percentile rank over 252 trading days |
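Several of these factors reduce to a few lines of arithmetic. The sketch below covers four of them under assumed input conventions: lists ordered oldest first, month-end closes for momentum, daily simple returns for volatility, and a percentile rank defined as the share of days in the lookback window with IV below the current value. The function names are hypothetical.

```python
import math
from statistics import mean, stdev

def momentum_12_1(monthly_closes):
    """Return from 12 months ago to 1 month ago, skipping the latest month."""
    if len(monthly_closes) < 13:
        raise ValueError("need at least 13 month-end closes")
    return monthly_closes[-2] / monthly_closes[-13] - 1.0

def realized_vol(daily_returns, window):
    """Annualized sample standard deviation over the last `window` days."""
    return stdev(daily_returns[-window:]) * math.sqrt(252)

def volatility_ratio(daily_returns, short=30, long=90):
    """30d / 90d realized volatility ratio."""
    return realized_vol(daily_returns, short) / realized_vol(daily_returns, long)

def volume_trend(daily_volumes, short=20, long=60):
    """20-day average volume relative to the 60-day average."""
    return mean(daily_volumes[-short:]) / mean(daily_volumes[-long:])

def iv_percentile(iv_history, lookback=252):
    """Share of days in the lookback window with IV below the latest value."""
    window = iv_history[-lookback:]
    current = window[-1]
    return sum(1 for v in window if v < current) / len(window)
```

A volatility ratio above 1 indicates that recent realized volatility is running above its 90-day level, and a volume trend above 1 indicates rising activity.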

Ensemble Architecture

Ridge (32%)

L2-regularized linear baseline. Stable, low variance. Dominates in low-vol regimes.

ElasticNet (31%)

L1+L2 for automatic feature selection. Identifies predictive factor subsets.

Gradient Boosting (38%)

Non-linear tree ensemble. Captures factor interactions. Its weight is elevated in high-VIX regimes.

Evidence-Based Spread Model

Specification

Algorithm: Ridge Regression
Training Data: 282 real observations
Goodness of Fit: R² = 0.405
Cost Application: Half-spread per leg
Strategies Eliminated: 14 / 46
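The "half-spread per leg" rule can be illustrated with a small sketch. It assumes the spread model outputs a predicted bid-ask spread as a fraction of premium for each leg, and that a strategy is eliminated when its expected profit is not positive after costs. The elimination rule and all names here are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Leg:
    premium: float     # per-share option premium at mid price
    spread_pct: float  # predicted bid-ask spread as a fraction of premium

def execution_cost(legs):
    """Half the predicted spread is charged once on every leg."""
    return sum(0.5 * leg.spread_pct * leg.premium for leg in legs)

def net_expected_pnl(gross_expected_pnl, legs):
    """Expected profit after subtracting execution costs."""
    return gross_expected_pnl - execution_cost(legs)

def survives_costs(gross_expected_pnl, legs):
    """Assumed elimination rule: keep only strategies still profitable net of costs."""
    return net_expected_pnl(gross_expected_pnl, legs) > 0.0
```

For a two-leg spread with `Leg(5.0, 0.04)` and `Leg(2.0, 0.15)`, the cost is 0.10 + 0.15 = 0.25 per share, so a gross expected profit of 0.20 would not survive while 0.40 would.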

Key Finding

Bid-ask spreads vary dramatically across tickers and moneyness. OTM options on lower-liquidity names can have spreads exceeding 15% of premium, making apparently profitable strategies deeply unprofitable after execution costs.

Conviction × Vol-Edge Matrix

| Tier | Criteria | Vol Cheap | Vol Rich |
|------|----------|-----------|----------|
| HIGH | Pred > 30%, EC > 85% | OTM Call (Δ20) | ATM Call |
| MED | Pred 15–30%, EC 65–85% | ATM Call | Put Credit |
| LOW | Pred < 15% or EC < 65% | Stock only | Stock only |
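The matrix translates directly into a lookup. In this sketch, Pred and EC are passed as fractions (0.30 for 30%), and combinations the table does not spell out, such as Pred above 30% with EC between 65% and 85%, are assumed to fall into MED. How "vol cheap" is derived from the volatility edge is left to the caller. Function names are hypothetical.

```python
STRATEGY_MATRIX = {
    ("HIGH", True): "OTM call (Δ20)",
    ("HIGH", False): "ATM call",
    ("MED", True): "ATM call",
    ("MED", False): "Put credit spread",
}

def conviction_tier(pred, ec):
    """Map Pred and EC (as fractions) to a conviction tier."""
    if pred < 0.15 or ec < 0.65:
        return "LOW"
    if pred > 0.30 and ec > 0.85:
        return "HIGH"
    return "MED"  # assumption: everything between LOW and HIGH is MED

def select_strategy(pred, ec, vol_cheap):
    """Pick the strategy for a stock from its tier and vol regime."""
    tier = conviction_tier(pred, ec)
    if tier == "LOW":
        return "Stock only"
    return STRATEGY_MATRIX[(tier, bool(vol_cheap))]
```

For instance, `select_strategy(0.35, 0.90, vol_cheap=True)` lands in the HIGH row and the Vol Cheap column of the matrix.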

This methodology document describes quantitative research techniques. Not investment advice.