VaR is a way of measuring the likelihood that a portfolio will suffer a large loss in some period of time, or the maximum amount that you are likely to lose with some probability (say, 99%). It does this by: (1) looking at historical data about asset price changes and correlations; (2) using that data to estimate the probability distributions of those asset prices and correlations; and (3) using those estimated distributions to calculate the maximum amount you will lose 99% of the time. At a high level, Nocera’s conclusion is that VaR is a useful tool even though it doesn’t tell you what happens the other 1% of the time.
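The three steps above can be sketched in a few lines of code. This is a minimal illustration, not how any bank actually computes VaR: it assumes (step 2's output) that daily returns are normal with 1% volatility, and invents a $1,000,000 portfolio purely for the example.

```python
import random

random.seed(42)
# Steps (1)-(2): pretend historical data led us to model daily portfolio
# returns as normal with mean 0 and 1% volatility (an assumption, not a fact).
simulated_returns = [random.gauss(0.0, 0.01) for _ in range(100_000)]

# Step (3): on a $1,000,000 portfolio, the 99% one-day VaR is the loss
# at the 1st percentile of the simulated profit-and-loss distribution.
portfolio = 1_000_000
pnl = sorted(r * portfolio for r in simulated_returns)
var_99 = -pnl[int(0.01 * len(pnl))]
print(round(var_99))
```

Under the normal assumption the 1st percentile sits about 2.33 standard deviations below the mean, so this comes out near 2.33% of the portfolio, roughly $23,000. That number says nothing about how bad the other 1% of days can get.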

naked capitalism already has one withering critique of the article out. There, Yves Smith focuses on the assumption, mentioned but not explored by Nocera, that the events in question (changes in asset prices) are normally distributed. To summarize, for decades people have known that financial events are not normally distributed – they are characterized by both skew and kurtosis (see her post for charts). Kurtosis, or “fat tails,” means that extreme events are more likely than would be predicted by a normal distribution. Yet, Smith continues, VaR modelers continue to assume normal distributions (presumably because they have certain mathematical properties that make them easier to work with), which leads to results that are simply incorrect. It’s a good article, and you’ll probably learn something.

While Smith focuses on the problem of using the wrong mathematical tools, and Nocera mentions the problem of not using enough historical data – “All the triple-A-rated mortgage-backed securities churned out by Wall Street firms and that turned out to be little more than junk? VaR didn’t see the risk because it generally relied on a two-year data history” – I want to focus on another weakness of VaR: the fact that the real world changes.

Even leaving aside the question of which distribution (normal or otherwise) to use, VaR assumes the likelihood of future events is dictated by some distribution, and that that distribution can be estimated using past data. A simple example is a weighted coin that you find on the street. You flip it 1,000 times and it comes up heads 600 times, tails 400 times. You infer that it has a 60% likelihood of coming up heads; from that, you can calculate the probability distribution for how many heads will come up if you flip it 10 more times, and if you want to bet on those coin flips you can calculate your VaR. Your 60% is just an estimate – you don’t know that the true probability is 60% – but you can safely assume that the physical properties of the coin are not going to change, and you can use statistics to estimate how accurate your estimate is. Put another way, your sample (the 1,000 test flips) is drawn from the same population as the thing you are trying to predict (the next 10 flips).
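Here is the coin-flip VaR worked out explicitly. The bet (win $1 per head, lose $1 per tail, a stake I am inventing for illustration) turns the binomial distribution of heads into a distribution of payoffs, and the 99% VaR is the loss at its 1st percentile.

```python
import math

# Estimate the coin's bias from the 1,000 test flips: 600 heads.
n_test, heads = 1000, 600
p_hat = heads / n_test  # 0.6

# Distribution of heads in the next 10 flips, assuming the bias is fixed.
n_next = 10
pmf = [math.comb(n_next, k) * p_hat**k * (1 - p_hat)**(n_next - k)
       for k in range(n_next + 1)]

# Hypothetical bet: win $1 per head, lose $1 per tail, so payoff = 2k - 10.
# The 99% VaR is the loss at the 1st percentile of the payoff distribution:
# the smallest number of heads whose cumulative probability reaches 1%.
cum = 0.0
for k in range(n_next + 1):
    cum += pmf[k]
    if cum >= 0.01:
        var_99 = -(2 * k - n_next)  # loss at the 1st percentile
        break
print(var_99)  # 6: with 99% confidence you lose no more than $6
```

That is, 99% of the time you get at least 2 heads out of 10 and lose no more than $6. Nothing in the calculation bounds what happens the other 1% of the time, and everything rests on the coin's bias staying fixed.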

By contrast, imagine you have two basketball teams, the Bulls and the Knicks, who have played 1,000 games, and the Knicks have won 600. You follow the same methodology, bet a lot of money that the Knicks will win at least 5 of the next 10 games – and then the Bulls draft Michael Jordan. See the problem?
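The problem can be made concrete with a small simulation. The win probabilities here (0.6 historically, 0.3 after the draft) are invented for illustration; the point is that the estimate fitted on history says nothing about the regime change.

```python
import math
import random

def prob_at_least(k_min, n, p):
    """P(X >= k_min) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(k_min, n + 1))

random.seed(0)
# 1,000 historical games in which the Knicks' true win probability is 0.6.
history = [random.random() < 0.6 for _ in range(1000)]
p_hat = sum(history) / len(history)  # lands close to 0.6

# Our bet: the Knicks win at least 5 of the next 10 games.
# Under the historical estimate, this looks very likely.
prob_under_estimate = prob_at_least(5, 10, p_hat)

# Then the Bulls draft Jordan: suppose the Knicks' true win probability
# drops to 0.3.  The historical sample contains no trace of this.
prob_now = prob_at_least(5, 10, 0.3)
print(prob_under_estimate, prob_now)
```

The model, fitted on 1,000 past games, remains confident in the bet; the actual probability has collapsed, and no amount of pre-Jordan data could have revealed that.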

Now, are asset prices more like coin flips or like basketball teams? On an empirical level, they may be more like coin flips; their probability distributions aren’t likely to change as dramatically as when the Bulls draft Jordan, or the Celtics trade for Kevin Garnett and Ray Allen. But on a fundamental level, they are more like basketball teams. The outcome of a coin flip is dictated by physical processes, governed by the laws of mechanics, that we know are going to operate the same way time after time. Asset prices, by contrast, are the product of individual decisions by thousands, millions, or even billions of people (when it comes to, say, wheat futures), and are affected as well by random shocks such as the weather. We have little idea what underlying mechanisms produce those prices, and all the simplifying assumptions we make (like rational profit-maximizing agents) are pure fiction. Whatever the underlying process for price changes is, if it winds up distributed in a manner similar to some mathematical function, it’s by accident; and more importantly, no one tells us when that process changes.

Going back to asset prices: To estimate the probability distribution of price changes, you need a sample that reflects your population of interest as closely as possible. Unfortunately, your sample can only be drawn from the past, and your population of interest is the future. So you really face two different risks. You face the risk that, in the current state of the world (assuming you can estimate that perfectly), an unlikely event will occur. You also face the risk that the state of the world will change. VaR, at best (assuming solutions to Smith’s criticisms), can quantify the first risk, not the second.

Let’s say you are just interested in your VaR for tomorrow. The chances that the real world will change significantly from today to tomorrow are small, but you still have the question of deciding how far back to draw your sample from. Is tomorrow’s behavior going to be most similar to the behavior over the last 30 days, the last 30 months, or the last 30 years? It depends on when the real world last changed – and you have no good way of knowing that (although there are statistical ways to guess). And when you try to look at your VaR for the next quarter, or year, you have the additional risk of the world changing under your feet.
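The lookback-window problem is easy to demonstrate. In this sketch the return series is synthetic and the numbers (1% daily volatility for 900 calm days, 5% for the last 30) are invented: the world "changed" 30 days ago, and the two windows give very different answers.

```python
import random

random.seed(1)
# Synthetic daily returns: 900 calm days (1% volatility), then the world
# changes and the last 30 days run at 5% volatility.  Purely illustrative.
returns = ([random.gauss(0.0, 0.01) for _ in range(900)]
           + [random.gauss(0.0, 0.05) for _ in range(30)])

def hist_var_99(sample):
    """99% one-day VaR by historical simulation: the loss at the 1st
    percentile of the sample's return distribution."""
    s = sorted(sample)
    return -s[max(0, int(0.01 * len(s)) - 1)]

var_short_window = hist_var_99(returns[-30:])  # sees only the new regime
var_long_window = hist_var_99(returns)         # diluted by the calm years
print(var_short_window, var_long_window)
```

The short window reacts to the new regime but is noisy; the long window is statistically comfortable but mostly describes a world that no longer exists. Statistics can tell you which window fits the data better; it cannot tell you which regime tomorrow belongs to.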

To put it another way, what happened in the last two years? One explanation is that the models were intrinsically faulty (wrongly specified). One explanation is that the models didn’t go back far enough to incorporate data about steep falls in housing prices. And one explanation is that no amount of data would have helped, because the world changed.

I want to apply this thinking to a question that has annoyed me for years. You often hear personal finance types saying, “over every 30-year period, no matter what year you start in, stocks always outperform bonds.” Their data usually go back about 100 years. So this sounds like you have 70 data points (you don’t have the results for the last 30 starting years), right? Nope. If that were the case, you could start your 30-year period on every single trading day in those first 70 years, which would give you about 17,500 data points. Maybe you have 3 data points, because you have 3 non-overlapping (and hence arguably independent) 30-year periods. But this all assumes that during the 30 years starting right now, the stocks basketball team and the bonds basketball team have the same relative strengths that they did over the last 100 years, which is a big assumption. There are other reasons to believe stocks will have higher returns than bonds, but the fact that for ten years everyone has been assuming stocks must do better than bonds leads me to believe it may not happen this time – at least if you take, say, 2000 as your starting point. (I suppose I should mention that about 63% of my non-cash financial assets are in stocks, more if you include REITs.)
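The counting argument above is just arithmetic, using a round 250 trading days per year:

```python
# "Over every 30-year period, stocks beat bonds" -- how many data points?
years_of_data = 100
window = 30
trading_days_per_year = 250  # rough round figure

annual_starts = years_of_data - window                 # one per starting year
daily_starts = annual_starts * trading_days_per_year   # one per starting day
independent_periods = years_of_data // window          # non-overlapping periods

print(annual_starts, daily_starts, independent_periods)  # 70 17500 3
```

Overlapping windows share almost all of their years, so the 70 (or 17,500) observations are nowhere near independent; 3 is the honest count.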

There was one part of Nocera’s article that I liked a lot:

At the height of the bubble, there was so much money to be made that any firm that pulled back because it was nervous about risk would forsake huge short-term gains and lose out to less cautious rivals. The fact that VaR didn’t measure the possibility of an extreme event was a blessing to the executives. It made black swans [unlikely events] all the easier to ignore. All the incentives — profits, compensation, glory, even job security — went in the direction of taking on more and more risk, even if you half suspected it would end badly. After all, it would end badly for everyone else too. As the former Citigroup chief executive Charles Prince famously put it, “As long as the music is playing, you’ve got to get up and dance.” Or, as John Maynard Keynes once wrote, a “sound banker” is one who, “when he is ruined, is ruined in a conventional and orthodox way.”

This, I think, is an accurate picture of what was going on. If you were a senior executive at an investment bank, even if you knew you were in a bubble that was going to collapse, it was still in your interests to play along, for at least two reasons: the enormous short-term compensation to be made outweighed the relatively paltry financial risk of being fired in a bust (given severance packages, and the fact that in a downturn all CEO compensation would plummet); and bucking the trend incurred résumé risk in a way that playing along didn’t.

If you were an individual trader, the incentives might have been the opposite: shorting the market was an opportunity to make a name for yourself and open your own hedge fund, while buying more mortgage-backed securities would just keep you in the same bonus tier as everyone else. But it’s the CEOs who called the shots, and their personal risk aversion was what mattered. Or, in the brilliant words of John Dizard (cited in the naked capitalism article):

A once-in-10-years-comet-wiping-out-the-dinosaurs disaster is a problem for the investor, not the manager-mammal who collects his compensation annually, in cash, thank you. He has what they call a “résumé put”, not a term you will find in offering memoranda, and nine years of bonuses.

For a complete list of Beginners articles, see the Financial Crisis for Beginners page.

Originally published at Baseline Scenario and reproduced here with the author’s permission.

Your article alludes to an important point about risk and its assessment via the widely accepted VaR, Value at Risk: asset prices are products of the ongoing fluctuations of innumerable human decisions, which are the outcomes of various individual paradigms, all very difficult to quantify mathematically.

Looking at VaR from a systems science perspective, where the whole is greater than the sum of its parts, one could ask: “What is RISK? What is VALUE?” Value is not simply money or price. Value should be understood as a system’s emergence, its creativity. It inherently contains the expressions and outcomes of efficiency, effectiveness, risk management, and the proportionate cost that went into its creation, be it a product or a service.

Risk is a component of value. Risk should not be separated from value, but it must be differentiated from uncertainty: the first is quantifiable by our senses, the second isn’t. Mathematical models incorporate our assumptions, which may or may not be based on true sensory perception, hence the variability of VaR interpretations. An example: crossing a street while looking and listening to the approaching traffic represents “risk,” as the probabilities of a safe passage are “calculated” by our situation-appropriate senses (vision and hearing). In this instance, to rely on the sense of touch, for example, would be inappropriate, and our safety would then be simply a matter of chance. On the other hand, the same initial scenario would turn into “uncertainty” if vision and hearing were eliminated from the decision tree and some other senses were used. In the absence of sensory quantification of risk, we could simply calculate the odds of getting hit at a certain street location or at a certain time. But that may give us only the average probability and will say nothing about a specific instance; risk management lies in the specifics of an instance, not in averages. As a general rule, uncertainty may be converted to risk through our quantifying senses, e.g., in the above example, by using eyes and ears, not touch.

Various senses have an uneven ability to sort out signals from the environment. The key to our understanding of what is arbitrarily happening outside of us, and even within us, is how we “understand” what information our senses have allowed to enter our awareness domain, as well as being attentive to the paradigm used (by those who made the decisions and those who judged them). The incoming (filtered and compressed) signals eventually hit our cognitive threshold, which, even among the brightest of us, seems to be set at a low capacity, about 25 bits per second, compared to the totality of the available information.

A full understanding of value, as derived within an applicable system and its cycles, would allow for a more comprehensive understanding of all interacting components, including risk.