TFA Interview Guide on Statistics for Risk Professionals

When Absolute Returns, Discrete Returns, and Continuous Returns Are Most Appropriate


Absolute Returns

Absolute returns are most appropriate in personal finance, where the focus is on accumulating specific dollar amounts rather than comparing performance. If your goal is to accumulate $100,000 for retirement, tracking absolute returns provides direct feedback on your progress.


  • Accounting and financial reporting contexts require absolute returns because financial statements must report actual monetary amounts rather than percentages. Similarly, tax calculations depend on absolute gains and losses measured in currency units.


  • For certain types of absolute return strategies in institutional portfolio management, where the manager's objective is to generate positive returns regardless of market conditions rather than to beat a benchmark, absolute returns provide an appropriate performance measure. These strategies are evaluated on their ability to generate positive dollar returns, not on their relative performance versus a benchmark index.


Discrete Returns

Discrete returns are most appropriate when dealing with discrete time intervals—daily, weekly, monthly, or annual returns calculated at specific points in time. They're ideal when you need to aggregate returns across different assets within a single time period, as in portfolio return calculations.


  • For performance reporting and benchmarking, discrete returns provide the standard measure. Investment managers are typically evaluated based on their discrete returns relative to benchmark indices, and these returns are what investors ultimately care about in practical terms.


  • In applications where the focus is on single-period analysis rather than multi-period compounding, discrete returns offer optimal simplicity without sacrificing meaningful information. They're also preferred when communicating with less sophisticated audiences who may not be familiar with logarithmic calculations.


Continuous Returns

Continuous returns are most appropriate when conducting theoretical analysis, particularly in continuous-time models. Any application of the Black-Scholes model or its extensions requires continuous returns due to the model's mathematical structure.


  • For time-series analysis and econometric modeling, continuous returns offer significant advantages. Their additivity simplifies regression analysis, and their better conformity to normal distributions facilitates statistical inference. When building forecasting models or conducting statistical tests, continuous returns often provide more reliable results.


  • In risk management applications, particularly those involving Value-at-Risk (VaR) calculations or scenario analysis, continuous returns offer both computational convenience and better statistical properties. Many risk models assume normally distributed returns, an assumption that works better with continuous returns.


  • When analyzing returns over multiple periods, continuous returns eliminate the complexities of multiplicative compounding. If you need to calculate average returns, aggregate returns over time, or decompose total returns into components from different periods, continuous returns provide mathematical simplicity.
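As a quick illustration, the sketch below computes all three return measures for a short hypothetical price series and confirms the additivity property of continuous returns (the prices are made up for the example):

```python
# A minimal sketch contrasting the three return measures on hypothetical prices
import numpy as np

prices = np.array([100.0, 105.0, 102.0, 110.0])

absolute_returns = np.diff(prices)                # dollar P&L per period
discrete_returns = np.diff(prices) / prices[:-1]  # simple percentage returns
continuous_returns = np.diff(np.log(prices))      # log (continuous) returns

# Additivity: log returns sum to the log of total growth,
# while discrete returns must be compounded multiplicatively.
total_discrete = np.prod(1 + discrete_returns) - 1
total_continuous = continuous_returns.sum()

print(total_discrete)              # 0.10 (i.e., 110/100 - 1)
print(np.expm1(total_continuous))  # same 0.10, recovered from log space
```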



Which Shock Type Should Be Used for Equities? 'Discrete Proportional Shock' or 'Continuous Proportional Shock', and In Which Scenarios Would Each Be Most Appropriate in Risk Management?


In risk management, the concept of "shocks" represents sudden changes in asset prices that affect portfolio values. The choice between discrete and continuous returns for modeling these shocks has important implications for risk measurement and stress testing.


  • Discrete Proportional Shocks are typically used when modeling risks that occur at discrete intervals: daily market closes, monthly rebalancing dates, or quarterly reporting periods. They're appropriate when you assume that returns are compounded periodically, or when you're analyzing how a portfolio would respond to specific percentage moves in asset prices.


    For example, a stress test might examine how a portfolio would perform if stocks fell 20%, bonds fell 5%, and commodities rose 10%. These discrete percentage shocks directly map to how traders and portfolio managers think about market moves.


    Discrete shocks are also natural for certain types of scenario analysis. If you're modeling the impact of a specific historical event (like the 2008 financial crisis), you would typically use the actual discrete returns that occurred during that event.


  • Continuous Proportional Shocks are preferred when building models that assume continuous price evolution or when statistical properties matter more than intuitive interpretation. Risk models based on Value-at-Risk (VaR) or Expected Shortfall (ES) often use continuous returns because they better approximate normal distributions.


    For simulation-based risk models (Monte Carlo methods), continuous returns provide computational advantages. The additivity property simplifies the generation of multi-period scenarios, and the approximate normality facilitates the use of standard random number generators.


    Continuous returns are also preferred when the risk model incorporates correlations among multiple assets. The mathematical properties of continuous returns make correlation matrices more stable and easier to work with than those based on discrete returns.


    In derivatives pricing, particularly options, continuous returns are essentially mandatory due to the mathematical structure of pricing models. The Black-Scholes model and most other derivatives pricing frameworks assume that asset prices follow geometric Brownian motion with continuously compounded returns. Many other derivative models, including interest rate models (Vasicek, Cox-Ingersoll-Ross), commodity models, and credit default swap pricing models, similarly rely on continuous-time frameworks that require continuous returns.
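A minimal sketch of the Monte Carlo point above, assuming normally distributed daily log returns with hypothetical drift and volatility parameters; the additivity of log returns means a 10-day scenario is just the sum of 10 daily draws:

```python
# Sketch: multi-period scenario generation with continuous (log) returns
import numpy as np

rng = np.random.default_rng(42)
mu, sigma = 0.0002, 0.01       # assumed daily drift and volatility of log returns
n_days, n_scenarios = 10, 100_000
s0 = 100.0

# No multiplicative compounding needed: sum the daily log returns per scenario
daily_log_returns = rng.normal(mu, sigma, size=(n_scenarios, n_days))
ten_day_log_return = daily_log_returns.sum(axis=1)
terminal_prices = s0 * np.exp(ten_day_log_return)

# 99% 10-day VaR from the simulated P&L distribution
pnl = terminal_prices - s0
var_99 = -np.percentile(pnl, 1)
print(f"99% 10-day VaR: {var_99:.2f}")
```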



When Would You Prefer the Median Over the Mean in Financial Risk Analytics?

In financial risk management, measures of central tendency are used to summarize patterns in returns, profit and loss (PnL), stress test outcomes, and exposure profiles. However, not all averages are equal, and in skewed or irregular data, the median often provides a more robust and realistic picture than the mean.


  • Mean (Arithmetic Average): Is the sum of all observations divided by the number of observations.


  • Median: Is the middle value when the data is sorted in ascending order.


The key difference is that the mean is sensitive to outliers, while the median is robust, remaining unaffected by extreme values. For example, consider a portfolio with the following 5-day daily returns (in %):

(-1.2, 0.3, 0.5, 0.4, 8.0)

Mean = 1.6% (inflated due to the outlier of +8.0%); Median = 0.4% (better reflects the "typical" daily return)
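These figures are easy to verify, for instance with numpy:

```python
# Quick check of the figures above
import numpy as np

returns = np.array([-1.2, 0.3, 0.5, 0.4, 8.0])  # daily returns in %
print(np.mean(returns))    # 1.6  -> pulled up by the +8.0% outlier
print(np.median(returns))  # 0.4  -> the "typical" day
```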


Interpretation:


  • The mean suggests the portfolio is performing extremely well, but this is misleading: the result is driven by a single high-return day. The mean can be distorted by infrequent, large events.


  • The median reflects a more stable, consistent performance, which is a more realistic view of the actual investment experience across most days because the median is resistant to distortion by outliers.


When to Prefer the Median in Risk Management?


  • In non-normal return/PnL distributions with positive or negative skew, the mean can overestimate or underestimate expected losses. For instance, in credit risk modeling, a few borrowers may default with huge losses, pulling the average loss higher, but the median may show that most loans perform well, which is critical for understanding core credit portfolio behavior.


  • In Value-at-Risk (VaR) and Expected Shortfall (ES) risk modeling, particularly with the historical simulation method, distributions of losses often show high kurtosis (fat tails) or extreme losses, especially during financial crises. In such distributions, the mean is highly sensitive to tail events, but the median remains stable. For instance, when analyzing PnL during historical stress events, the median provides better insight into the typical stress outcome, while the mean might be driven by a single outlier.


  • In investment management, when reporting fund performance across time, or return distributions across investor accounts, the median return offers a clearer picture of what the “typical investor” experienced, particularly when a few accounts had extreme gains or losses due to leverage or tactical positioning, or managers want to reduce narrative bias caused by outliers. Median is useful for marketing material, client dashboards, and peer group comparison, especially in ESG and alternative investment portfolios.


While the median is preferred in skewed or extreme-market conditions, the mean is still necessary:


  • Regulatory capital models, particularly those under Basel III or Solvency II, often rely on expected values (means) to estimate required capital buffers.


    • In credit risk models, regulators may require the use of Expected Credit Loss (ECL), which is calculated as: ECL = PD × LGD × EAD. Each component (Probability of Default, Loss Given Default, and Exposure at Default) is estimated using mean values derived from historical or modeled data. For example, with PD = 2%, LGD = 45%, and EAD = $1,000,000: ECL = 0.02 × 0.45 × $1,000,000 = $9,000.


    • In market risk, the Internal Models Approach (IMA) may still involve expected shortfall (ES), which is itself an average of the tail distribution beyond a Value-at-Risk (VaR) threshold.


    These frameworks demand consistency, so expected (mean) values provide a standardized and additive measure of risk for capital allocation across business lines and risk types.


  • In investment management, long-term strategic decisions, such as portfolio construction, retirement planning, or asset-liability modeling (ALM), depend heavily on the expected return of assets.


    • The Capital Asset Pricing Model (CAPM) uses the expected return of a security to determine the required return.


    • In utility theory, investors are assumed to make decisions based on the expected utility of returns, not the median or the most frequent outcome.


  • Monte Carlo simulations involve generating a large number of random scenarios to model uncertain variables, such as future asset/portfolio values, path-dependent option payoffs, or pension fund liabilities. Over thousands (or millions) of iterations, the law of large numbers ensures that the average (mean) of these simulations converges to the expected value. For instance, simulating the future value of a stock portfolio under a Geometric Brownian Motion (GBM) model will return a distribution of possible outcomes, and the mean of these outcomes is considered the expected value.
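A minimal sketch of that GBM point, with hypothetical drift and volatility parameters; note that the median lands below the mean because the lognormal terminal distribution is right-skewed, which ties back to the mean-versus-median discussion above:

```python
# Simulation mean converges to the analytical expected value S0 * exp(mu * T)
import numpy as np

rng = np.random.default_rng(0)
s0, mu, sigma, T = 100.0, 0.07, 0.20, 1.0   # assumed parameters
n = 1_000_000

z = rng.standard_normal(n)
s_T = s0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * z)

print(s_T.mean())           # simulation mean, ~107.25
print(s0 * np.exp(mu * T))  # analytical expected value, 107.25
print(np.median(s_T))       # median ~105.13: below the mean (right skew)
```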


Thus, risk professionals must evaluate whether a measure of central tendency is being used for insight or for decision-making, and choose accordingly.



Why is Range Not Always a Reliable Measure of Dispersion?

The range is the simplest measure of dispersion, calculated as the difference between the maximum and minimum values in a dataset:


Range = Maximum Value − Minimum Value


It provides a quick sense of the spread or span of the data.


While it's intuitive, the range is not robust because it only considers the two extreme values and ignores the distribution of the rest of the data. It tells you the distance between the furthest observations, but it tells you nothing about how values behave in between. Because it focuses on extremes, the range is heavily affected by outliers.


For example,

Dataset A: (10, 12, 13, 14, 15), Range = 15 − 10 = 5

Dataset B (with outlier): (10, 12, 13, 14, 100), Range = 100 − 10 = 90


Though the bulk of both datasets is similar, the presence of a single extreme value in Dataset B distorts the range, making it appear much more dispersed than it truly is.


Because of its limitations, the range is rarely used in professional investment or risk management settings. Instead, analysts prefer more robust measures of dispersion, such as:


  • Standard Deviation: Measures the typical (root-mean-square) deviation from the mean.

  • Interquartile Range (IQR): Focuses on the middle 50% of the data, ignoring outliers.

  • Mean Absolute Deviation (MAD): Averages the absolute differences from the mean.


These alternatives account for the entire data distribution and are more effective in real-world, noisy financial datasets.
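The sketch below computes all four measures for both datasets; note how the range explodes while the IQR is unchanged:

```python
# Comparing dispersion measures on the two datasets above
import numpy as np

a = np.array([10, 12, 13, 14, 15])
b = np.array([10, 12, 13, 14, 100])   # same bulk, one outlier

for name, x in [("A", a), ("B", b)]:
    data_range = x.max() - x.min()
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    std = x.std(ddof=1)                   # sample standard deviation
    mad = np.mean(np.abs(x - x.mean()))   # mean absolute deviation
    print(name, data_range, iqr, std.round(2), mad.round(2))

# Range jumps from 5 to 90, while the IQR stays at 2:
# it ignores the extremes and tracks the middle 50% of the data.
```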



What is Standard Deviation, and Why Is It Used as a Measure of Risk in Finance?

Standard deviation is a statistical measure that quantifies the variability or dispersion of a dataset around its mean. In finance, it serves as the primary measure of investment risk because it captures the uncertainty of returns.


Mathematical Definition:

Sample Standard Deviation: SD = √[Σ(Xi - X̄)² / (n-1)]


What Standard Deviation Measures:

Standard deviation tells us the typical distance that returns deviate from their average. It answers the question: "How much do returns typically vary from what we expect?"


Example: Two stocks both average 10% annual return over 10 years:


  • Stock A: Returns range from 8% to 12% with a standard deviation of 2%. Very predictable, low variability.


  • Stock B: Returns range from -20% to +40% with a standard deviation of 15%. Highly unpredictable, high variability.


Both have the same average return, but Stock B is much riskier.


Why Standard Deviation is a Measure of Risk:


  • Uncertainty Measurement: A higher standard deviation means greater uncertainty about what return you'll actually receive. You might get the average return, or you might get something very different.


  • Downside Potential: While standard deviation measures both upside and downside variability, assets with high standard deviation have greater potential for losses significantly below the mean.


  • Return Predictability: Low standard deviation means returns are more predictable and consistent. High standard deviation means returns are erratic and hard to forecast.


  • Empirical Rule (68-95-99.7): Assuming an approximately normal distribution:

    • 68% of returns fall within ±1 standard deviation of the mean.

    • 95% fall within ±2 standard deviations of the mean.

    • 99.7% fall within ±3 standard deviations of the mean.
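A quick empirical check of the rule on simulated normal returns (the parameters here are arbitrary assumptions):

```python
# Checking the 68-95-99.7 rule on simulated normal daily returns
import numpy as np

rng = np.random.default_rng(1)
returns = rng.normal(0.0005, 0.01, size=100_000)  # assumed mean 0.05%, vol 1%

mu, sd = returns.mean(), returns.std()
for k in (1, 2, 3):
    share = np.mean(np.abs(returns - mu) <= k * sd)
    print(f"within ±{k} SD: {share:.1%}")   # ~68.3%, ~95.4%, ~99.7%
```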



Walk me through calculating the standard deviation step-by-step.

To calculate the standard deviation of a dataset of asset returns, the following steps are typically involved:


  1. Calculate the Mean (Average) Return: The mean represents the average value of all returns in the dataset.


Mean Return = Σ(Returns) / Count of Returns

Where:

'Returns' is an array of daily (or periodic) returns.

'Count of Returns' (or 'n') is the total number of return points in the dataset or the number of trading days.


  2. Calculate the Deviation of Each Return Point from the Mean: The mean is subtracted from each return point to determine how much it deviates:


Deviation = Returns − Mean Return

This shows how far each return lies above or below the mean.


  3. Square Each Deviation: Since some deviations will be negative and some positive, they might offset each other; squaring them ensures all values are positive:


Squared Deviation = (Returns − Mean Return)²

The deviations (Returns - Mean Return) can be positive (for values above the mean) or negative (for values below the mean), and they always sum to zero. Squaring (Returns - Mean Return)² accomplishes two things: it eliminates negative values, and it gives greater weight to larger deviations. Summing these squared deviations gives the total squared deviation across all data points.


  4. Calculate the Variance: Variance is the average of the squared deviations and represents the dispersion of values around the mean, but in squared terms:


Variance = Σ(Returns − Mean Return)² / n

This formula captures the essence of measuring variability: for each data point, we calculate how far it deviates from the mean, square that deviation, and then average all these squared deviations.


However, in most practical financial applications, we don't have access to the entire population of possible returns; we only have a sample. This distinction is crucial because it affects how we calculate variance. The use of (n - 1) in the denominator instead of n is called Bessel's correction, and it provides an unbiased estimate of the population variance. The mathematical reason for this correction relates to the fact that we're using the sample mean rather than the true population mean, which introduces a downward bias that must be corrected. For large samples, the difference between dividing by n or (n - 1) becomes negligible, but for small samples, this correction is important.


  5. Compute the Standard Deviation: Taking the square root of the variance converts the result back into the same unit as the original dataset:


Standard Deviation = √[Σ(Returns − Mean Return)² / (n − 1)]

Where:

'Returns' is an array of daily (or periodic) returns.

'Mean Return' is the average of all return points in the dataset.

'Count of Returns' is the total number of return points in the dataset or the number of trading days.


In financial applications, we almost always use the sample formula (n - 1) because we're working with historical return data that represents a sample from the broader population of possible future returns.


Sample Standard Deviation: SD = √[Σ(Xi - X̄)² / (n-1)]


This final step ensures that standard deviation can be compared directly with asset returns and other financial metrics.
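The five steps translate directly into code; the sketch below uses a small hypothetical return series and checks the result against numpy's built-in function:

```python
# The five steps above, written out explicitly on hypothetical daily returns
import numpy as np

returns = np.array([0.012, -0.008, 0.005, 0.015, -0.003])

mean_return = returns.sum() / len(returns)                 # Step 1: mean
deviations = returns - mean_return                         # Step 2: deviations
squared_deviations = deviations ** 2                       # Step 3: square them
variance = squared_deviations.sum() / (len(returns) - 1)   # Step 4: sample variance (n - 1)
std_dev = variance ** 0.5                                  # Step 5: square root

print(std_dev)
print(np.std(returns, ddof=1))  # same result; ddof=1 applies Bessel's correction
```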



What is the Relationship Between Variance and Standard Deviation?

Variance and standard deviation are both measures of dispersion that quantify how much individual data points deviate from the mean of a dataset. They are closely related mathematically, but serve different purposes in interpretation. The mathematical relationship is shown below:


σ² = Σ(Xi − X̄)² / n

σ = √(σ²)

Variance (σ²) is the simple average of the squared deviations from its mean. Standard deviation (σ) is the square root of that value, bringing it back to the original unit of measurement.


Why Use Standard Deviation Instead of Variance?

Variance is often used in covariance matrices, which are inputs to portfolio optimization and risk aggregation models. Standard deviation, however, is the measure used directly because of:


  • Interpretability: Standard deviation is expressed in the same units as the data or mean (dollars, returns, interest rates), making it easier to understand and communicate. For example, a volatility of 5% standard deviation is intuitive; a variance of 0.0025 isn't.


  • Portfolio Risk Application: Standard deviation is used to quantify the risk or volatility of returns of a portfolio, and is often applied directly in Value-at-Risk (VaR) models, scenario design, and other risk metrics.


  • Variance, though useful in calculations (especially in optimization and analytical derivations), is more abstract because it is in squared units.



Why Do We Divide by (n-1) Instead of n When Calculating Sample Variance?

When estimating the variance of a population based on a sample, we use the following formula:

s² = Σ(Xi − X̄)² / (n − 1)

Dividing by 'n-1' instead of 'n' is known as applying Bessel’s correction, and it is used to remove the bias in the estimation of the true population variance.


When we compute variance from a sample, we're using the sample mean as a proxy for the true population mean. This introduces downward bias because the deviations are calculated from a mean that is itself estimated from the sample data. As a result, the sample variance tends to underestimate the actual population variance if we divide by 'n'. Dividing by 'n-1' adjusts for underestimation and results in an unbiased estimate of the population variance.


Suppose you take a sample of 5 asset returns from a larger population time series. If you divide by 5, the estimate will be slightly too low on average. By dividing by 4 (i.e., n-1), you inflate the variance just enough to offset that bias, giving you a more accurate estimate of the true underlying variability.
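A simulation sketch makes the bias visible: drawing many 5-observation samples from a distribution with known variance 1.0, the divide-by-n estimator lands near 0.8 on average (i.e., (n − 1)/n of the truth), while the (n − 1) estimator is centered on 1.0:

```python
# Bias of divide-by-n variance on many small samples (true variance = 1.0)
import numpy as np

rng = np.random.default_rng(7)
samples = rng.standard_normal((200_000, 5))     # 200k samples of size n = 5

biased = samples.var(axis=1, ddof=0).mean()     # divide by n
unbiased = samples.var(axis=1, ddof=1).mean()   # divide by n - 1

print(biased)    # ~0.80: systematically too low
print(unbiased)  # ~1.00: Bessel's correction removes the bias
```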


When Is This Important in Risk Management?


  • In backtesting investment or risk models, where only a finite set of historical returns is available.

  • In portfolio risk estimation, when you’re computing volatility or covariances from sampled return data.

  • In stress testing and VaR modeling, the statistical accuracy of inputs affects tail risk projections.



What is the difference between covariance and correlation?

Covariance and correlation both measure the relationship between two variables, but they differ in scale and interpretability.


  • Covariance measures the directional relationship between two variables in their original units. It tells us whether variables tend to move together (positive covariance) or in opposite directions (negative covariance), but its magnitude depends on the scale of the variables being measured. For example, if we're measuring stock returns in dollars versus percentages, we'll get vastly different covariance values.


  • Correlation is the standardized version of covariance, always bounded between -1 and +1. It's calculated by dividing covariance by the product of the two variables' standard deviations. This standardization removes the scale dependency, making correlation universally interpretable and comparable across different datasets. A correlation of +0.8 means the same thing whether we're comparing stock returns, temperatures, or any other variables.


The key advantage of correlation is that it provides both direction and strength of the relationship, while covariance only provides direction and an unstandardized magnitude.
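A short sketch of the scale-dependence point: rescaling the same simulated return series from decimal to percentage units multiplies the covariance by 10,000 but leaves the correlation untouched (the series here are made up for illustration):

```python
# Covariance depends on units; correlation does not
import numpy as np

rng = np.random.default_rng(3)
x = rng.normal(0, 0.01, 500)              # returns in decimal form
y = 0.5 * x + rng.normal(0, 0.01, 500)    # a related return series

print(np.cov(x, y)[0, 1])                   # tiny number in decimal units
print(np.cov(x * 100, y * 100)[0, 1])       # 10,000x larger in percent units
print(np.corrcoef(x, y)[0, 1])              # correlation, unit-free
print(np.corrcoef(x * 100, y * 100)[0, 1])  # identical after rescaling
```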



Why is correlation preferred over covariance in portfolio management?

Correlation is preferred in portfolio management for several practical reasons:


  • Correlation is scale-independent. When comparing relationships between different asset pairs (stocks vs bonds, equities vs commodities), correlation provides a standardized measure that allows meaningful comparison. A correlation of 0.3 between two tech stocks can be directly compared to a correlation of 0.3 between stocks and bonds.


  • Correlation's bounded nature (-1 to +1) makes it intuitive to interpret. Portfolio managers can immediately understand that a correlation of 0.9 indicates a strong positive relationship, while 0.1 indicates a weak relationship. This clarity facilitates decision-making about diversification benefits.


  • Correlation matrices are easier to communicate to clients and stakeholders. When presenting portfolio construction decisions, explaining that two assets have a correlation of 0.3 is more meaningful than saying they have a covariance of 0.00045.


However, it's important to note that in the actual mathematical calculations of portfolio variance, we use the covariance matrix, not the correlation matrix. But for analysis, interpretation, and communication, correlation is the preferred measure.



What does a correlation coefficient of zero mean? Does it mean the variables are independent?

A correlation coefficient of zero indicates that there is no linear relationship between the two variables. This is a crucial distinction: it specifically means no linear relationship, not no relationship at all.


Zero correlation does NOT necessarily mean the variables are independent. Variables can have strong non-linear relationships while showing zero linear correlation. The classic example is Y = X². If we calculate the correlation between X and X², we might find it close to zero if X is symmetrically distributed around zero, even though Y is perfectly determined by X.
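The classic example is easy to reproduce on simulated data:

```python
# Y = X²: perfectly dependent, yet (near-)zero linear correlation
# because X is symmetric around zero
import numpy as np

rng = np.random.default_rng(5)
x = rng.standard_normal(1_000_000)   # symmetric around zero
y = x ** 2                           # fully determined by x

print(np.corrcoef(x, y)[0, 1])       # ~0.00 despite total dependence
```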


In finance, this distinction matters. Two assets might show low correlation during normal market conditions but exhibit strong relationships during extreme events. For instance, many asset pairs that appear uncorrelated in calm markets become highly correlated during crises, a phenomenon known as "correlation breakdown."


True independence is a stronger condition than zero correlation. If two variables are independent, they will have zero correlation, but zero correlation does not guarantee independence. Independence means the joint probability distribution equals the product of the marginal distributions, which is a more stringent requirement than merely having no linear relationship.



Walk me through how you would calculate the correlation coefficient between two stocks. What data would you need?

To calculate the correlation coefficient between two stocks, I would follow this systematic process:


  1. Data Collection: I would need historical price data for both stocks over the same time period. Typically, I'd use daily closing prices for the past 1-3 years (250-750 trading days). The data must be aligned: same dates, accounting for any splits or dividends.


  2. Calculate Returns: Convert prices to returns using either discrete returns: (P_t - P_{t-1})/P_{t-1}, or continuous returns: ln(P_t/P_{t-1}). For most applications, I'd use continuous returns as they have better statistical properties and are additive over time.


  3. Calculate Means: Calculate the mean return for each stock:

    1. Stock A Mean: μA = Σ(RA) / n

    2. Stock B Mean: μB = Σ(RB) / n


  4. Calculate Deviations: For each observation, calculate the deviation from the mean:

    1. Stock A Deviations: (RAi - μA)

    2. Stock B Deviations: (RBi - μB)


  5. Calculate Covariance

    Cov(A,B) = Σ[(RAi - μA)(RBi - μB)] / (n - 1)


  6. Calculate Standard Deviations

    σA = √[Σ(RAi - μA)² / (n - 1)]

    σB = √[Σ(RBi - μB)² / (n - 1)]


  7. Calculate Correlation

    ρ(A,B) = Cov(A,B) / (σA × σB)


In practice, I would use Excel's CORREL function or the .corr() method from Python's pandas library, but understanding the underlying calculation is essential for interpreting results and troubleshooting issues.
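A compact sketch of these steps, using short made-up price series in place of real aligned market data:

```python
# Correlation between two stocks, step by step, then checked against pandas
import numpy as np
import pandas as pd

prices_a = pd.Series([100.0, 101.5, 100.8, 102.3, 103.1])  # hypothetical closes
prices_b = pd.Series([50.0, 50.4, 50.1, 51.0, 51.2])

# Step 2: continuous (log) returns
ra = np.log(prices_a / prices_a.shift(1)).dropna()
rb = np.log(prices_b / prices_b.shift(1)).dropna()

# Steps 3-7 rolled into the definition: Cov(A,B) / (σA × σB)
cov_ab = ((ra - ra.mean()) * (rb - rb.mean())).sum() / (len(ra) - 1)
rho = cov_ab / (ra.std(ddof=1) * rb.std(ddof=1))

print(rho)
print(ra.corr(rb))   # pandas' built-in .corr() gives the same answer
```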



How does correlation impact portfolio diversification? Explain with an example.

Correlation is the mathematical foundation of diversification benefits. The key insight is that portfolio risk depends not just on individual asset risks but critically on how assets move relative to each other.


Mathematical Foundation: For a two-asset portfolio, portfolio variance is:


σp² = w₁²σ₁² + w₂²σ₂² + 2w₁w₂σ₁σ₂ρ₁₂


The correlation term (ρ₁₂) in this equation determines the diversification benefit.


Example: Two assets with equal weight (50% each):

  • Asset 1: Expected Return = 12%, Standard Deviation = 20%

  • Asset 2: Expected Return = 10%, Standard Deviation = 20%


  1. Perfect Positive Correlation (ρ = +1.0)

    σp² = (0.5)²(0.2)² + (0.5)²(0.2)² + 2(0.5)(0.5)(0.2)(0.2)(1.0)

    σp = 20%

    No diversification benefit, portfolio risk equals the weighted average of individual risks.


  2. Zero Correlation (ρ = 0.0)

    σp² = (0.5)²(0.2)² + (0.5)²(0.2)² + 0

    σp = 14.14%

    Significant diversification, portfolio risk reduced by about 29% compared to scenario 1.


  3. Negative Correlation (ρ = -0.5)

    σp² = (0.5)²(0.2)² + (0.5)²(0.2)² + 2(0.5)(0.5)(0.2)(0.2)(-0.5)

    σp = 10%

    Substantial diversification, portfolio risk reduced by 50% compared to scenario 1.
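All three scenarios can be verified directly from the two-asset variance formula above, using the example's weights and volatilities:

```python
# Portfolio volatility under the three correlation scenarios
import numpy as np

w1 = w2 = 0.5      # equal weights
s1 = s2 = 0.20     # both assets have 20% standard deviation

for rho in (1.0, 0.0, -0.5):
    var_p = w1**2 * s1**2 + w2**2 * s2**2 + 2 * w1 * w2 * s1 * s2 * rho
    print(f"rho = {rho:+.1f}: portfolio vol = {np.sqrt(var_p):.2%}")
# rho = +1.0: 20.00%, rho =  0.0: 14.14%, rho = -0.5: 10.00%
```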


The lower the correlation between assets, the greater the diversification benefit. This is why portfolio managers actively seek assets with low or negative correlations; they can reduce overall portfolio risk without proportionally reducing expected returns.


In practice, finding truly negatively correlated assets is challenging. Most equity assets show positive correlation (typically 0.3 to 0.7), which is why diversification across asset classes (stocks, bonds, commodities, and real estate) is important.



What are the limitations of using correlation for portfolio construction?

While correlation is fundamental to portfolio theory, it has several important limitations that practitioners must understand:


  • Non-Stationarity (Time-Varying Correlations): Correlations are not constant; they change over time based on market conditions. During crises, correlations typically increase dramatically as most risk assets fall simultaneously. This means that diversification benefits may disappear precisely when investors need them most. Historical correlations may not predict future relationships, especially across different market regimes.


  • Linear Relationships Only: Correlation measures only linear relationships. Assets might have strong non-linear dependencies that correlation misses entirely. For example, some assets might show low correlation in normal markets but strong relationships during tail events (tail dependence). Traditional correlation analysis wouldn't capture this asymmetric relationship.


  • Focus on Second Moments: Correlation captures the relationship between means and variances (first and second moments) but ignores higher moments like skewness and kurtosis. Two assets might have low correlation while both exhibit negative skewness (a tendency toward large negative returns), creating concentrated tail risk that correlation analysis fails to flag.


  • Outlier Sensitivity: Correlation calculations can be heavily influenced by outliers or extreme observations. A few extreme data points can distort the correlation estimate, potentially leading to poor portfolio construction decisions.


  • Symmetric Treatment of Deviations: Correlation treats positive and negative co-movements symmetrically, but investors often care more about downside co-movements. When two assets both crash, that's worse than when they both surge, but correlation treats these scenarios equally.


  • Sample Size and Estimation Error: Correlation estimates from historical data contain estimation error, especially with limited data. For portfolios with many assets, the correlation matrix has N(N-1)/2 unique correlations to estimate; with 50 assets, that's 1,225 correlations. This creates substantial estimation error.


Better Approaches: To address these limitations, practitioners use dynamic correlation models (DCC-GARCH), copula-based dependence measures, regime-switching models, robust correlation estimators, and scenario analysis and stress testing, and they focus specifically on tail risk and downside correlation.
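As a simple illustration of the estimation-error point, a rolling-window correlation fluctuates widely even on simulated data where the true correlation is held constant at 0.3, so apparent "regime changes" can be pure sampling noise:

```python
# Rolling 60-day correlation on simulated data with a fixed true correlation
import numpy as np
import pandas as pd

rng = np.random.default_rng(11)
n, true_rho = 1000, 0.3
z1 = rng.standard_normal(n)
z2 = true_rho * z1 + np.sqrt(1 - true_rho**2) * rng.standard_normal(n)

ra, rb = pd.Series(z1), pd.Series(z2)
rolling_corr = ra.rolling(60).corr(rb)

print(rolling_corr.min(), rolling_corr.max())  # wide swings around 0.3
```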

 
 
 
