parameter exploration with quant_rv and heatmap

For v1.2.0 we take a step back from 1.1.0 to meet some of the new goal requirements right off the bat, and to play and explore. In particular, we remove the code to test QQQ (or other ETFs) and the related vars. Next we change the code to make it easy to explore parameters (like the volatility threshold) and see how the strategy works and what it actually does. Finally we craft a heatmap tool to help us explore the volatility space more thoroughly and try to understand it better. Along the way, let’s find out with examples how backtesting (and optimizing parameters) can be quite misleading.

So here’s the new code for v1.2.0. In tune with the new goals, we’ve (1) removed the leverage variable and (2) removed the QQQ loading code. To make simple explorations easier, we’ve saved the original strategy code by itself (for posterity) but removed it from the plotting; instead we plot two independent strategies (# Strategy 1 and # Strategy 2 in the comments), each set up with three vars: one to control the lookback period for the volatility calculation, one to set a threshold volatility level for the strategy signal, and a label for the plot legend. I’m using labels like “rv20d15”, where “20d” means a 20-day lookback period and “15” is the volatility threshold (0.15) used for signal generation. And “rv” is realized volatility, as you have already realized. You can label as you like, or make the labels autogenerate from your lookback and vol thresholds (R’s paste() is your friend there; see an example at the bottom of this code).

### quant_rv v1.2.0 by babbage9010 and friends
# new setup to compare two strategies and parameters
### released under MIT License

# Step 1: Load necessary libraries and data
library(quantmod)
library(PerformanceAnalytics)

date_start <- as.Date("2006-07-01")
date_end <- as.Date("2019-12-31")
symbol_bench1  <- "SPY"  # benchmark for comparison
symbol_signal1 <- "SPY"  # S&P 500 symbol (use SPY or ^GSPC)
symbol_trade1  <- "SPY"  # ETF to trade

data_spy <- getSymbols(symbol_bench1, src = "yahoo", from = date_start, to = date_end, auto.assign = FALSE)
prices_benchmark <- Ad(data_spy) #SPY ETF, Adjusted(Ad) for the benchmark
prices_signal1 <- Cl(data_spy) #SPY ETF, Close(Cl) for the signal (realized vol)
prices_trade1 <- Op(data_spy) #SPY data, Open(Op) for our trading

# Step 2: Calculate ROC series
roc_signal1 <-   ROC(prices_signal1, n = 1, type = "discrete")
roc_benchmark <- ROC(prices_benchmark, n = 1, type = "discrete")
roc_trade1 <-    ROC(prices_trade1, n = 1, type = "discrete")

# Step 3: Develop the trading strategies
# Strategy 1: A benchmark strategy
lookback_period1 <- 20
threshold1 <- 0.15
label_strategy1 <- "Strategy 1: rv20d15"
volatility1 <- runSD(roc_signal1, n = lookback_period1) * sqrt(252)
signal_1 <- ifelse(volatility1 < threshold1, 1, 0)
signal_1[is.na(signal_1)] <- 0

# Strategy 2: The one plotted first, with Daily Returns
lookback_period2 <- 22
threshold2 <- 0.17
label_strategy2 <- "Strategy 2: rv22d17"
volatility2 <- runSD(roc_signal1, n = lookback_period2) * sqrt(252)
signal_2 <- ifelse(volatility2 < threshold2, 1, 0)
signal_2[is.na(signal_2)] <- 0

# Step 4: Backtest the strategies
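# Note: the signal uses the Close of day t, we act at the Open of day t+1,
# and the open-to-open return covering that hold lands on day t+2, hence Lag(signal, 2)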
returns_strategy1 <- roc_trade1 * Lag(signal_1, 2)
returns_strategy1 <- na.omit(returns_strategy1)
returns_strategy2 <- roc_trade1 * Lag(signal_2, 2)
returns_strategy2 <- na.omit(returns_strategy2)

# Calculate benchmark returns
returns_benchmark <- roc_benchmark 
returns_benchmark <- Lag(returns_benchmark, 2)
returns_benchmark <- na.omit(returns_benchmark)

# Step 5: Evaluate performance and risk metrics
# add an "exposure" metric (informative, not evaluative)
exposure <- function(vec){ sum(vec != 0) / length(vec) }
comparison <- cbind(returns_strategy2, returns_benchmark, returns_strategy1)
colnames(comparison) <- c(label_strategy2, "Benchmark SPY total return", label_strategy1)
stats_rv <- rbind(table.AnnualizedReturns(comparison), maxDrawdown(comparison))
charts.PerformanceSummary(comparison, main = "Realized Vol Strategies vs S&P 500 Benchmark")
exposure_s2 <- exposure(returns_strategy2)
exposure_s1 <- exposure(returns_strategy1)
print( paste("Exposure for Strategy 2:", exposure_s2) ) 
print( paste("Exposure for Strategy 1:", exposure_s1) ) 

And here’s a plot with two strategies on it: the original rv20d15, and one with a slightly longer lookback and higher threshold, rv22d17.

You can now use this code to explore the two degrees of freedom we’ve used so far, two strategies at a time. There’s nothing sacred about a 20-trading-day (~30 calendar day) lookback period. Try longer and shorter periods with the same signal threshold, or vary the threshold while holding the lookback constant, or just play.

There is one distinct, overriding pattern you should note with a little exploration, and it is part of the logical basis for this quant_rv strategy: low threshold values yield lower annual returns than higher threshold values. It’s (fortunately) a bit more nuanced than that, but that’s the gist of it, and the reason is simply that with low vol thresholds the strategy is “in” the market on only a fraction of the trading days, and higher thresholds make this fraction larger (more trading days fall below the raised threshold). This fraction of invested trading days is typically called the “exposure” of a strategy, and for some reason I can’t find this simple statistic anywhere in the PerformanceAnalytics library, so I wrote a simple function(1) for it myself in the v1.2.0 code above. It’s just the count of invested days divided by the count of investable days, and the code prints the exposure of the two strategies to the RStudio console. With a really high vol threshold, quant_rv approximates a buy-and-hold strategy in the traded ETF. With a really low threshold, quant_rv cherry-picks market days where the volatility is likely to be very low, and hence more likely to be positive (up) days.

Part of the nuance to this trend is that it’s not exactly linear, at least in my exploration so far. There’s a bit of a sweet spot, which I can hint at with a simple plot. I’ve modified the above code to plot six return series: four strategies with increasing threshold values (all traded at the market Open), plus the benchmark (Adjusted, to include dividends), plus a second benchmark which is SPY without dividends (just as our strategies are calculated without dividends, as mentioned in a previous post). Here’s that plot.
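
If you want to reproduce something like that plot, here’s a rough sketch of how I’d build it from the v1.2.0 objects above. The thresholds4 vector, the strat_list loop and the Close-based “no dividends” benchmark are my own illustration, not the exact code behind the figure:

# four thresholds at a 20-day lookback, plus SPY total return and SPY price-only return
thresholds4 <- c(0.10, 0.15, 0.20, 0.30)
vol20 <- runSD(roc_signal1, n = 20) * sqrt(252)
strat_list <- lapply(thresholds4, function(th) {
  sig <- ifelse(vol20 < th, 1, 0)
  sig[is.na(sig)] <- 0
  roc_trade1 * Lag(sig, 2)
})
returns_nodiv <- Lag(ROC(Cl(data_spy), n = 1, type = "discrete"), 2)  # SPY price return, no dividends
comparison6 <- na.omit(cbind(do.call(cbind, strat_list), returns_benchmark, returns_nodiv))
colnames(comparison6) <- c("rv20d10", "rv20d15", "rv20d20", "rv20d30",
                           "SPY total return", "SPY price only")
charts.PerformanceSummary(comparison6, main = "rv thresholds vs SPY benchmarks")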

The lowest returns come from the two strategies with the lowest thresholds and the lowest market exposure. The red one (Strategy 2: rv20d15) is our original starting place and has about 60% market exposure, while the black one (#1: rv20d10) has only 34% exposure over this time period. What’s interesting is that as the vol threshold gets to around 20% (0.20), the strategies produce returns in the ballpark of the buy-and-hold strategy sans dividends, at lower market exposure (77% for rv20d20 and 92% for rv20d30 in the plot). That means they also have lower variance, which means they’ll have higher Sharpe ratios (similar returns with lower variance => higher Sharpe, lower risk profile).

Can I really tell all that from this one plot of a few strategies? No. Pleeeze. I did a lot of playing. And then I made a heatmap(2). Or two. Here they are:

So, I made a for loop, nested another one inside it, and in there calculated the strategy for each and every combination of the vol thresholds and lookback periods you see here (takes about 7 minutes on my M1 MacBook Air, not optimized I’m sure). The heatmap on the left shows the annual returns for these strategies, with the lowest returns (lightest shades) corresponding to the lowest vol thresholds (which would also be the lowest exposures, but I didn’t make a heatmap of those). Then I made another heatmap of the Sharpe ratio for each strategy. They look broadly similar, but there are definite differences.
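
My actual heatmap code will go up on GitHub (see footnote 2), but here’s a rough sketch of the idea, assuming the v1.2.0 objects above are already in memory. The grid ranges, the helper names and the use of base image() for the plotting are my assumptions, not the exact code behind the figures:

# sweep the lookback x threshold grid and record annualized return and Sharpe (sketch)
sweep_lookbacks  <- seq(4, 70, by = 2)          # lookback periods (days)
sweep_thresholds <- seq(0.05, 0.60, by = 0.05)  # annualized vol thresholds

grid_return <- matrix(NA_real_, nrow = length(sweep_lookbacks), ncol = length(sweep_thresholds))
grid_sharpe <- grid_return

for (i in seq_along(sweep_lookbacks)) {
  vol_i <- runSD(roc_signal1, n = sweep_lookbacks[i]) * sqrt(252)
  for (j in seq_along(sweep_thresholds)) {
    sig <- ifelse(vol_i < sweep_thresholds[j], 1, 0)
    sig[is.na(sig)] <- 0
    rets <- na.omit(roc_trade1 * Lag(sig, 2))
    grid_return[i, j] <- Return.annualized(rets)[1]
    grid_sharpe[i, j] <- SharpeRatio.annualized(rets)[1]
  }
}

# two quick heatmaps: annual return (left) and Sharpe ratio (right)
par(mfrow = c(1, 2))
image(sweep_thresholds, sweep_lookbacks, t(grid_return),
      xlab = "vol threshold", ylab = "lookback (days)", main = "Annualized return")
image(sweep_thresholds, sweep_lookbacks, t(grid_sharpe),
      xlab = "vol threshold", ylab = "lookback (days)", main = "Annualized Sharpe")
par(mfrow = c(1, 1))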

First, the annual return pattern. This is cool: just like we said, the returns are lowest with the strictest (lowest) vol thresholds, gradually rising toward the right (higher thresholds), but they seem to peak (reddest) in the 0.30-0.55 range and drop fractionally toward the highest values (where exposure is nearly 100%). What’s cool about that? What’s not cool about higher returns with lower market exposure/variance? In fact, if you look at the orangey area as a whole, it’s a nice big blob covering the right half of the heatmap. The Sharpe ratio heatmap is similar: a big orange blob (with some outliers, more on that later), a few reddish spots, but definitely a blurry Sharpe high in the 10-60 day lookback vs 0.15-0.45 threshold range.

There are a number of things to like about this, but the main one is that it’s a nice, persistent pattern across a broad parameter space, indicating it’s a real phenomenon that should provide us a small but real strategy edge. This is the logical basis I was hoping for when I started in this direction.

I’d like to digress for a moment here to talk about strategy optimization. Pleeeze. There’s all kinds of software out there faster and better than this for optimizing 17 different degrees of freedom to find you the most amazing strategy backtest you could ever hope for, and it’s (mostly) all crap. Look at that Sharpe map again, and find the darkest red, up there toward the upper left corner. Let’s explore that a bit. We’ll look at two adjacent strategies and compare them using our code (see the parameter settings sketched below) to see what’s going on.
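
To run that comparison with the v1.2.0 code, just repoint the Strategy 1 and Strategy 2 parameters at the outlier and a near neighbor, then re-run Steps 3 through 5. The neighbor threshold of 0.07 here is my guess at “just a fraction higher”, not a value taken from the heatmap itself:

lookback_period1 <- 70
threshold1 <- 0.06
label_strategy1 <- "Strategy 1: rv70d06"
lookback_period2 <- 70
threshold2 <- 0.07   # assumed near neighbor, just above the outlier threshold
label_strategy2 <- "Strategy 2: rv70d07"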

Ok, you really can’t see the details there (run it yourself), BUT you can see that the rv70d06 strategy (which has a Sharpe ratio of 0.72) has daily returns that are all in one cluster, in late 2017 to early 2018, when market vol had collapsed, the market melted upward, and all was prelude to the February 2018 volpocalypse when the XIV short-vol ETN collapsed. Exposure for this one strategy is only 1% of this time period. The other strategy, with a threshold just a fraction higher, has similar returns but 4% exposure, and that extra exposure increases the variance and drops the Sharpe ratio dramatically.

NOBODY would look twice at that strategy anyway, with such a tiny market exposure, except maybe a naïve machine learning demo optimizing for Sharpe ratio. But let’s look at another outlier on the Sharpe heatmap, the one down in the lower center. Here’s a plot with two strategies that are very close neighbors.

Boom: a couple of days’ difference in late 2008 is all it takes to make a big difference in the outcome for these two strategies. Strategy 1 (rv4d21) hit the Great Financial Crisis tumble perfectly, and even Strategy 2 did much better than the benchmark. Every strategy that outperforms its near neighbors has a similar tale, either avoiding big losses or picking up some good gains that the others missed.

None of these simple strategies is close to good enough for meeting our goals. This post was purely about exploring the performance of simple realized volatility strategies to get a feel for how they behave, and whether there’s potential for us here. I think there is, based on the patterns I see in the heatmaps. More soon.

Footnotes:
(1) “simple” function is right, and it’s probably not very robust, but it seems good enough for us, for now. In particular, it doesn’t account for NAs in the return series, or returns that are now zeros but were formerly NAs, and it probably doesn’t properly handle the lookback_period days at the beginning of the time series where we don’t have returns. So it’s more precise than it is accurate, but for a time series of more than 1000 days it seems close enough for jazz. Anybody want to code up a better one?
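
For what it’s worth, here’s one hedged sketch of a slightly more careful version. The name exposure2 and the option of passing the (lagged) signal instead of the returns are my own choices, not code from quant_rv:

# exposure2: fraction of investable days actually invested (sketch, not battle-tested)
exposure2 <- function(returns_vec, signal_vec = NULL) {
  if (!is.null(signal_vec)) {
    s <- na.omit(signal_vec)          # use the (lagged) signal directly, so
    return(sum(s != 0) / length(s))   # zero-return invested days still count
  }
  r <- na.omit(returns_vec)           # fallback: count nonzero-return days
  sum(r != 0) / length(r)
}
# e.g. exposure2(returns_strategy2, signal_vec = Lag(signal_2, 2))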

(2) About my heatmap code: I’m putting it into GitHub soon (as a very non-optimized demo, version 0.0.1), but I should also note that after I started using it, a couple of different strange errors started appearing in the RStudio console. One is this, a known issue with RStudio that they’re working on:

Error in exists(cacheKey, where = .rs.WorkingDataEnv, inherits = FALSE) : invalid first argument

RStudio sez: “❗❗ Fix starts with build 2023.06.1+437 ❗❗”
but I say: “Just ignore it (the code actually runs fine despite it), or save your work and restart RStudio for now.”

Error 2: this one shows up randomly after I’ve run the heatmap code, sometimes after another heatmap run, sometimes after a quant_rv run, and it produces wonky scaling on the quant_rv plots.

Error in par(op) : invalid value specified for graphical parameter "pin"

I’ve searched for the problem, and the most informative post was this one from seven years ago: https://stackoverflow.com/questions/31793271/high-resolution-heatmap-in-r, but it didn’t help (other than showing me how to type par() in the console to see that my pin and plt params were funky). Resetting them to known working values didn’t fix the problem. It goes away if I restart RStudio, so that’s what I do.

EDIT (17 Aug 2023):
FIXED! I think, for now. Error #2, that is. In a later post, I started using multiple plots (like 4 or 8 graphs on one figure) and used this par() call to set it up:
par(mfrow = c(4, 2))
That sets up a four-row, two-column layout for the eight charts. And then the other plots started showing up small again. Finally I realized: OH! The small charts are just being sized into the same 4×2 frame, each taking up 1/8th of the space. So I added
par(mfrow = c(1, 1))
to the end of my script, which resets the layout back to the normal full-window frame I wanted. Yay! Hope someone finds this useful.


