We just published our **monthly newsletter** (a few days late, but better-late-than-never, right?). And already we’ve gotten comments in on two things: #1 is our puzzle, but a close #2 is my commentary on annualized standard deviation. And, as I point out, the recent source for this discussion is a question that came up at last month’s **Performance Measurement Think Tank**.

Given the comments, I thought I’d continue the discussion here, with an example that I sent to one of the folks who chimed in. Consider the following:

- **Composite’s 36-month annualized return = 11.14%**
- **Benchmark’s 36-month annualized return = 10.65%**
- **Composite’s non-annualized standard deviation = 2.36%**
- **Benchmark’s non-annualized standard deviation = 2.47%**
- **Composite’s annualized standard deviation = 8.17%**
- **Benchmark’s annualized standard deviation = 8.54%**

**Annualized Standard Deviation Question #1**

How do you interpret the annualized standard deviations? What meaning do you draw from them?

**Annualized Standard Deviation Question #2**

Comparing the annualized standard deviation values with their respective non-annualized, do you have any different interpretation?

**Annualized Standard Deviation Question #3**

Standard deviation is associated with a normal distribution; we typically require at least 30 values in our distribution to have any statistical significance, so the 36 monthly returns meet and exceed this level. And even though returns are not usually normally distributed, they’re close enough that we can still draw inferences from the numbers. This includes the fact that the average return, +/- one standard deviation, will capture roughly two-thirds of the distribution. And so, the composite’s average monthly return, +/- its non-annualized standard deviation, will capture two-thirds (or roughly 24) of the 36 monthly returns. Can we make any similar assessment using the annualized standard deviation?
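As a rough check on the "two-thirds" rule of thumb, here is a minimal sketch that assumes the monthly returns are exactly normal (which, as noted, real returns are not). It computes the share of a normal distribution lying within one standard deviation of the mean and scales it to 36 observations. The variable names are illustrative only.

```python
from statistics import NormalDist

# Share of a normal distribution within one standard deviation of its mean.
standard_normal = NormalDist()
share_within_one_sd = standard_normal.cdf(1.0) - standard_normal.cdf(-1.0)

n_months = 36  # number of monthly returns in the example above
expected_count = share_within_one_sd * n_months

print(f"Share within +/- 1 SD: {share_within_one_sd:.4f}")
print(f"Expected count out of {n_months}: {expected_count:.1f}")
```

Under the normality assumption the share is a little over two-thirds, so the expected count lands between 24 and 25 of the 36 months. An actual return series could easily differ by a few months either way.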

**Why do we annualize standard deviation?**

What’s the point in annualizing it in this context? In my view, none, as I am not aware of any.

As you probably know, this statistic is now required for both the composite and its benchmark for GIPS(R) (Global Investment Performance Standards) compliant firms. And I recall someone suggesting that firms should also display their 36-month annualized return along with it. And so, I’ve done that above. What meaning does this provide?

Again, I am not aware of any. There is no relation between the annualized standard deviation and the annualized return.

Why do we annualize standard deviation? I believe because we tend to annualize statistics. But I believe we should be able to draw the same conclusions from a risk perspective by comparing non-annualized composite and benchmark standard deviations as we do by comparing their annualized values.
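To illustrate why the comparison should come out the same either way, here is a small sketch that assumes the usual square-root-of-12 scaling for monthly data. It uses the rounded monthly figures from the example above, so the annualized results come out close to, though not exactly equal to, the published 8.17% and 8.54% (which were presumably computed from unrounded inputs).

```python
import math

# Rounded non-annualized (monthly) standard deviations from the example above.
composite_monthly_sd = 0.0236
benchmark_monthly_sd = 0.0247

# Assumption: annualize by multiplying by the square root of 12.
factor = math.sqrt(12)
composite_annual_sd = composite_monthly_sd * factor
benchmark_annual_sd = benchmark_monthly_sd * factor

# The composite-to-benchmark ratio is untouched by the scaling.
monthly_ratio = composite_monthly_sd / benchmark_monthly_sd
annual_ratio = composite_annual_sd / benchmark_annual_sd

print(f"Composite annualized SD: {composite_annual_sd:.2%}")
print(f"Benchmark annualized SD: {benchmark_annual_sd:.2%}")
print(f"Ratio (monthly):    {monthly_ratio:.4f}")
print(f"Ratio (annualized): {annual_ratio:.4f}")
```

Because both series are multiplied by the same constant, the ratio and the ordering are identical before and after annualizing.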

**The roles of Standard Deviation**

We cannot lose sight of the fact that standard deviation, within the context of GIPS compliance, serves two purposes:

- **As an acceptable measure of dispersion: here, the average account annual return, +/- one standard deviation has relevance, as it gives us a sense of how “tight” the distribution of annual returns is. It lets us know if the manager was consistent in their management of the accounts within the composite’s strategy, or if they were “all over the board.”**
- **As a proxy for risk: The Spaulding Group’s research has found that standard deviation is the #1 risk measure. What does it measure? Volatility (or, if you prefer, variability). It’s been used as a risk measure for more than 50 years. It’s what is used by CAPM and the Sharpe ratio, as well as information ratio and other statistics. With this approach, we couldn’t care less what the average return is: it has no meaning. If we know only the composite’s standard deviation (annualized or not), it means nothing. But, when we compare it to the benchmark’s, we are able to decide if more (for comparatively higher standard deviation) or less (for a lower standard deviation relative to the benchmark’s) risk was taken.**

Let’s consider what I propose as answers to the above questions:

**Q#1 Answer**

The annualized standard deviation, like the non-annualized, presents a measure of volatility. Since the composite has a lower value than the benchmark, we conclude that less risk was taken. And while Bill Sharpe used non-annualized values in his eponymous risk-adjusted measure, it is quite common to employ annualized values, and so, the annualized standard deviation would be plugged into the denominator (i.e., we can annualize the statistics and divide, or divide the un-annualized values and then annualize the result).
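The two routes to an annualized Sharpe ratio can be sketched as follows. This assumes arithmetic annualization (mean monthly excess return times 12, standard deviation times the square root of 12); the monthly excess return used here is a made-up illustrative number, not a figure from the example.

```python
import math

# Hypothetical mean monthly excess return (illustrative only, not from the post).
mean_monthly_excess = 0.0060
# Composite's non-annualized standard deviation from the example.
monthly_sd = 0.0236

# Route A: annualize numerator and denominator, then divide.
sharpe_route_a = (mean_monthly_excess * 12) / (monthly_sd * math.sqrt(12))

# Route B: divide the monthly values, then annualize the ratio.
sharpe_route_b = (mean_monthly_excess / monthly_sd) * math.sqrt(12)

print(f"Route A: {sharpe_route_a:.4f}")
print(f"Route B: {sharpe_route_b:.4f}")
```

The two agree because 12 divided by the square root of 12 is the square root of 12. If the numerator is instead a geometrically annualized return, the two routes will generally differ slightly.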

**Q#2 Answer**

NO! The composite’s non-annualized standard deviation, like the annualized, is lower, so we interpret this to mean that less risk was taken.

**Q#3 Answer**

No, we cannot. That was one of my points in the newsletter, as well as an article I wrote for **The Journal of Performance Measurement(R)**.

We have to keep in mind the difference:

- **Are we using standard deviation as a measure of dispersion? Then knowing the average annual return, and having the standard deviation, we can identify the span across which roughly two-thirds of our returns fit.**
- **If we’re using it as a risk measure, then it’s strictly about volatility, and annualizing doesn’t seem to make much of a difference when it comes to comparing the composite and benchmark values.**

Note: recall that we are measuring the *dispersion of annual returns* within the context of GIPS’s dispersion; we aren’t annualizing a monthly standard deviation: the standard deviation is of annualized returns. Contrast this with what we do with risk, where we’re measuring standard deviation of 36 monthly returns. Here is where we annualize the result.

**Have a different view? Another opinion? Some insights you want to share?**

Please chime in! I would very much like to see other views on this.

The real important point that I wanted to make is that we need to know whether we’re using the statistic as a measure of dispersion (where comparing standard deviation to the distribution’s mean has value) or volatility (where it doesn’t).

1) Annualization is a way of standardizing on a measure to make comparisons easier. If I say that the average male height is 5.5 feet in some country and you say it is 66 inches, we are both saying the same thing. If you then said that the standard deviation was 6 inches and I said it was 0.5 feet, again we would be saying the same thing but both be internally inconsistent in our measurements. Annualizing has become a standard in the investment industry. To be consistent with the scale for returns and to be consistent across firms, it makes sense to annualize standard deviations. Given that it is only a linear transformation, you would not expect to draw any conclusions different from those drawn by comparing the portfolio’s and benchmark’s monthly standard deviations.

2) Please define what test for significance you are using for saying that less than 30 observations are not significant. I know that confidence intervals can be calculated around a standard deviation, but am not aware of any significance testing.

3) Volatility is the measure that connects geometric average returns to arithmetic average returns. Assume you have two portfolios. Both have an average return of 1% per month. One (P1) returns 1% every month and so has a standard deviation of 0%; the other (P2) returns 6% one month followed by -4% the next and consequently has a standard deviation of 5%. The annual return for P1 is 12.7%, while the annual return for P2 is 11.0%. This difference is directly related to the difference in volatility. For normal distributions, it has been shown that the average geometric return is approximately equal to the arithmetic average return less one-half the variance. (Obviously, neither P1 nor P2 is normally distributed.) Given this, the variance of returns is extremely important to understanding expectation of terminal wealth and should be of great interest to investors.
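The P1/P2 arithmetic can be checked with a short sketch. It assumes 12 monthly returns per portfolio, the population standard deviation (which is what gives exactly 5% for P2), and compounding of monthly returns; the function names are illustrative.

```python
from statistics import mean, pstdev

# Twelve monthly returns for each portfolio, as described in the comment.
p1 = [0.01] * 12
p2 = [0.06, -0.04] * 6

def annual_return(monthly_returns):
    """Compound the monthly returns into a single cumulative return."""
    growth = 1.0
    for r in monthly_returns:
        growth *= 1.0 + r
    return growth - 1.0

def geometric_monthly(monthly_returns):
    """Geometric average monthly return over the period."""
    return (1.0 + annual_return(monthly_returns)) ** (1.0 / len(monthly_returns)) - 1.0

for name, series in (("P1", p1), ("P2", p2)):
    arithmetic = mean(series)
    sd = pstdev(series)  # population standard deviation
    approx_geometric = arithmetic - sd ** 2 / 2  # arithmetic mean less half the variance
    print(
        f"{name}: mean={arithmetic:.2%} sd={sd:.2%} "
        f"annual={annual_return(series):.2%} "
        f"geometric={geometric_monthly(series):.4%} approx={approx_geometric:.4%}"
    )
```

Even though P2 is far from normally distributed, the half-variance approximation lands within a couple of thousandths of a percentage point of its actual geometric monthly return.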

Steve,

I appreciate your rather detailed response. I can’t address everything right now, but will at least touch on a bit of it.

As I just pointed out to Carl, while I agree that we annualize for comparability reasons, would we really look at the annualized standard deviations and try to compare them to the annualized returns? What conclusion could we draw?

I think the key question remains: can we draw any different conclusions by comparing the composite and benchmark’s annualized standard deviations as we do with their non-annualized? I think not. But, perhaps we can.

As for the need for 30, it’s a statistical guideline: I’ll dig it out of one of my stat books and share it shortly.

Best wishes,

Dave

David,

Why we annualise risk is a good question.

Technically, to do it at all we have to assume that the returns are independent of each other – actually we know they are not, so the calculation itself (multiplying by the square root of periodicity) is not valid.

Yet we all do it – and to the extent we all do it consistently it’s probably OK – at least we are comparing like with like. I guess we do it because we tend to use annualised returns and therefore it makes sense to use annualised risk.

Best regards

Carl

Carl,

As always, thanks for chiming in.

The point about “comparing like with like” is what I am curious about, as there really is no relationship between a composite’s 3-year annualized return and its 3-year annualized standard deviation. Or, perhaps one might suggest we compare it against the most recent one-year period’s return. But is there really anything to be gained from comparing them? How does one compare them? I think the comparison is solely between the composite’s and benchmark’s 3-year standard deviation, and whether that number is annualized or not, the comparison will be the same: that is, they will maintain their relative size differences (this is, I believe, a mathematical certainty).

The challenge that our Performance Measurement Think Tank member brought up was the same as I did in my article: can we in any way look at the distribution of returns for the 36-month period and relate them to the annualized standard deviation, as we do with dispersion, and the answer is “no.” But a bigger question: would we want to? I tried to address this by saying that unlike dispersion, where the distribution of returns relative to its mean has some value, volatility is quite different.

Further discussion, perhaps in person, or perhaps over dinner, would be worthwhile!

Best wishes,

Dave

David,

Thank you for bringing this up, I probably would not have tried to understand the “why” of it without the article.

I do respectfully disagree that there is no point to annualizing the standard deviation when we are trying to provide a measure of risk/volatility/variability. When provided, the annualized standard deviation is shown along with calendar year returns (so annual returns) for all managers. If a non-annualized standard deviation of 36 monthly returns is provided, we have the standard deviation scaled to a one-month return rather than scaled to an annual return. That is fine if all the potential client is doing is comparing risk to a benchmark, but not sufficient if the potential client wants to get a rough idea of the return-to-risk trade-off that is characteristic of the portfolio. If you annualize the standard deviation, you can deal with both questions at the same time. This is why having the 3-year annualized return along with the 36-month standard deviation is desirable, since it makes this return-to-risk estimate even less “rough”.

Ultimately, the best case would be to have the non-annualized standard deviation for a statistically significant number of annual returns rather than monthly. However, that long of a track record would exclude many products. Then you would have an annually scaled standard deviation with annual returns so both comparisons could be made.

I realize I am putting aside the non-normal distribution of returns because standard deviation is still the most widely used measure and I have not yet seen a viable, better alternative.

I forgot to mention that I do recognize that many would not believe that using the 36 month annualized standard deviation and the annual returns to get a rough idea of return to risk profile is a valid measure of return to risk, and I agree. However, it is something that potential clients do. Sharpe ratios or estimates of them for arbitrary trailing periods are commonly used. I wish that there were a way to provide those over economically significant time periods rather than trailing time periods, but I haven’t thought or heard of a good way to identify those significant time periods and have everyone agree with them or have a pre-defined way of identifying them.

Vinay, I’m not actually saying NOT to (though I guess the implication is probably there … a bias, perhaps) but more of a “WHY?” The inquiry that I received at our recent Think Tank was “how do we interpret it?,” and it was because we tend to want to add and subtract one standard deviation, to capture two-thirds of the distribution. However, the mistake in this case is that we’re not looking at the distribution (for the 36-month, ex post standard deviation) in the same way as we do for “internal dispersion.”

But since we’re looking at volatility / variability, and the returns we’re looking at are actually monthly, then it probably makes more sense to see a monthly standard deviation.

However, if you prefer annual, it’s fine: the comparison between the benchmark and portfolio will be proportionately the same (monthly vs. annualized), so the same conclusion(s) should be drawn. Just don’t try to compare that figure to the 36-month annualized returns!

σ, as we know, is also used ex ante. So say the non-annualised SD is 2% (often just called volatility). Parametric VaR at 95% would be 1.645 × 2% = 3.29%, or $3,290 for a $100,000 position. All fine and roughly comparable to a historical VaR calculation. You can then annualise σ or VaR (it makes no difference which) by multiplying by t^(1/2). This now gives a whopping VaR of $52,019.

But what if it’s a volatile stock and SD is 7% …? Annualised VaR is now 130% ie more than your position. Is annualised σ a valid measure in this situation? In extreme situations you might go over 100% in ex post as well.
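The 2% example can be reproduced with the sketch below. It assumes a one-sided 95% normal quantile of 1.645 and t = 250 periods per year; the value of t is an assumption here, chosen because it reproduces the $52,019 figure. Variable names are illustrative.

```python
import math

z_95 = 1.645        # one-sided 95% quantile of the standard normal (rounded)
sigma = 0.02        # non-annualised standard deviation from the comment
position = 100_000  # position size in dollars
t = 250             # assumed number of periods per year

# Parametric VaR for a single period, as a fraction and in dollars.
var_pct = z_95 * sigma
var_dollars = var_pct * position

# Scaling by the square root of time; scaling sigma or VaR gives the same result.
annualised_var_dollars = var_dollars * math.sqrt(t)
annualised_via_sigma = z_95 * (sigma * math.sqrt(t)) * position

print(f"Single-period VaR: {var_pct:.2%} = ${var_dollars:,.0f}")
print(f"Annualised VaR:    ${annualised_var_dollars:,.0f}")
```

Nothing in the scaling caps the result at the size of the position, which is why a large enough σ and t can push the annualised figure past 100%.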

Paul, interesting points.

Yes, standard deviation IS used in ex ante risk, too. To annualize and project a loss greater than 100% would probably cause some to strongly reconsider their portfolio’s makeup.

Don’t see how you’re getting your results, though. Annualizing 7% yields 24.2%. Perhaps I’m missing something.

David

This area needs a bit of clarification of terms and calculations, both Ex-Post and Ex-Ante.

I have always found the standard used by Carl in his book, Chapter 4, to be the best way of standardising – which is the idea of annualising – which is to multiply σ by √t where t = 250/#observations even if simplified to √12 for monthly or √4 for quarterly.

However, there are many out there who disregard the number of observations and just multiply whatever σ they have by √250, which is about 15.81, which is how I got the 130%. You have multiplied by √12. I see no basis in GIPS for doing this, and the 3rd edition 2012 GIPS handbook provides no examples that I can see.

The 36 months in GIPS as I see it can be treated as √250/36 or √250/375. What is your view?

Paul,

Multiplying a series of monthly standard deviations by the square root of 12 (i.e., the square root of time) is quite standard. I am not familiar with the notion of taking the number of observations into consideration, and don’t necessarily think it’s “the best way.” I do not know where Carl got this from; would have to review this part of his book to see if he cites something or if it’s his own creation.

Where does “250” come from, for example?

Using √12 for monthly or √4 for quarterly has been done for decades, I believe.

Granted, there are some (e.g., Paul Kaplan of Morningstar) who soundly dismiss this approach, as it only applies to an arithmetic, not geometric, series. I am exploring Paul’s argument in greater depth, and may report on it in a future post, newsletter, and/or article.

As always, thanks for chiming in!

David, 250 is a ‘sort of’ accepted standard for the number of business days in a year.

Paul, I suspected it might be something like this. However, why would we use business days?

I did a post some time ago about a vendor we encountered who annualizes rates of return using trade days: I came up with 10 reasons why this made no sense. Not sure this application does, either. Again, I’ll need to see Carl’s write up on this to get a better understanding. Thanks!

Paul/David

It’s just the number of observations in the annual period

4 quarters

12 months

52 weeks

255 to 260 business days – the number of business days varies, of course, in different markets – some firms might assume a higher figure, up to 260, to avoid underestimating risk.

It’s a very well established market standard – we all do it – but to repeat, technically we have to assume returns are independent and we know they are not – so we shouldn’t really.

Hope that helps

Best regards

Carl
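The periodicity convention described above can be sketched as a simple lookup. The daily count of 250 is an assumption for illustration (the thread mentions figures from 250 to 260), and the function name is hypothetical.

```python
import math

# Observations per year for each return frequency.
# The daily figure is an assumption; conventions in the thread range from 250 to 260.
PERIODS_PER_YEAR = {"quarterly": 4, "monthly": 12, "weekly": 52, "daily": 250}

def annualise_sd(sd, frequency):
    """Scale a per-period standard deviation by the square root of periods per year."""
    return sd * math.sqrt(PERIODS_PER_YEAR[frequency])

sd = 0.07  # the 7% standard deviation discussed earlier in the thread
for frequency in PERIODS_PER_YEAR:
    print(f"{frequency:>9}: {annualise_sd(sd, frequency):.1%}")
```

The same 7% becomes about 24.2% if it is treated as a monthly figure and about 110.7% if it is treated as a daily one, so the multiplier has to match the frequency of the returns the standard deviation was computed from.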

Thanks, Carl. Hopefully, not days, as they’re TOO NOISY. As for “we shouldn’t, really,” I believe you are correct, but also, “we all do it.”

Appreciate you chiming in! Expect to see you in Boston!

Dave

David, Carl – I still think the logic behind this is dead flaky. I have spoken to others since and multiplying by SQRT12 has become a sort of industry standard.

But how can you equate say 24 observations in a month with 12 observations in a year as per GIPS by just multiplying both by SQRT 12?

Paul, “flaky” may, indeed, be an appropriate term for this method. The area is most undoubtedly worthy of some academic (or near-academic) research, to demonstrate this and to identify the appropriate methodology.

Fundamentally, someone needs to answer the question “what does it mean to annualize a statistic?” For returns, the geometric approach can be proven solid. But how does one do that with standard deviation? A project worthy of someone’s (es’) time. I’ll add it to my list. Thanks for your comments.

Dave

A lesson in regression should be helpful. Once again, you need to consider the ‘why’ of providing standard deviation/variance (which has its roots in the sum of squared errors (SSE)).

I’m not sure how seriously I take someone with a nom de plume of “Whacko,Jacko,” but I will trust that the person behind it has at least some knowledge in this area; and no doubt, you are correct. Thanks for chiming in.

Whacko (I agree their name lacks instant credibility) is correct in their logic for why the numbers are multiplied by the square root of 12. At the risk of stating the obvious, if we expressed everything in variance terms, and we wanted to convert from monthly to annual, we would simply multiply by 12. Since variance is an additive function, it is a simple transformation. (This is one reason why most risk attribution will look at contribution to tracking variance as compared to contribution to tracking error.) If we then convert this to a standard deviation, we would take the square root of the variance. This is equivalent to multiplying the standard deviation by the square root of 12.

Carl is also correct that there is an assumption of no serial correlation in the returns if you convert monthly to annual. Mark Kritzman from State Street quantified what he referred to as interval error at a recent conference that I attended (https://northinfo.com/documents/738.pdf). This assumption has been shown to be inaccurate and therefore introduces error into the number. While you could keep everything in monthly terms, it becomes a trade off between this error and a common timing convention. Mathematicians might argue the other way, but I applaud that a decision was made to force consistency.
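To see how serial correlation disturbs the square-root-of-12 rule, here is a minimal sketch. It is not Kritzman's interval-error calculation; it assumes a simple AR(1)-style structure in which the autocorrelation at lag k is rho to the power k, and it treats the annual return as the sum of 12 monthly returns (as with log returns). The function name and the rho values are illustrative.

```python
import math

def annualisation_factor(n_periods, rho):
    """Ratio of the SD of an n-period sum to the single-period SD,
    assuming the lag-k autocorrelation equals rho ** k."""
    covariance_terms = sum((n_periods - k) * rho ** k for k in range(1, n_periods))
    return math.sqrt(n_periods + 2 * covariance_terms)

sqrt_12 = math.sqrt(12)
for rho in (-0.1, 0.0, 0.1):
    factor = annualisation_factor(12, rho)
    print(f"rho={rho:+.1f}: factor={factor:.4f} (sqrt(12)={sqrt_12:.4f})")
```

With rho equal to zero the covariance terms vanish, variance is additive, and the factor is exactly the square root of 12. Under these assumptions, positive autocorrelation makes the square-root-of-12 rule understate annual volatility, and negative autocorrelation makes it overstate it.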

Steve,

Thanks, and thanks for sharing the paper for Mark (I’ll review it when I return home from Vienna); we may reach out to see if he’d like to speak at PMAR next year.

Forcing consistency has benefits, no doubt; but with no explanatory power, there’s something lacking. Most investment firms, for example, consistently use TWRR to calculate sub-portfolio return; however, in my view, as well as that of a growing number of more enlightened folks, IRR (MWRR) should be used. To be consistently wrong is not a good thing.

I agree with Carl, too, on his points. But trying to interpret is problematic.

Paul Kaplan of Morningstar wrote an article for JPM a couple years ago challenging the use of the square root of 12 to annualize risk measures; someone else wrote a similar paper in the current (Spring) issue, which I will shortly read. This speaks to your point about mathematicians and their arguments, though I think statisticians are probably more appropriate critics.

Sometimes we do things for expediency’s sake; the annualization (*SQRT(12)) is just one example. Yes, we can argue that it’s flawed, for one reason or another. But, is it worth the effort to do something else? I’m not sure: it’s probably worth some discussion. Perhaps that’s something we’ll take up, too, at PMAR 2018!