Financial markets are dynamic, and understanding how interest rates relate to the factors that drive them is crucial. This project studies interest rates using US Treasury interest rate data from January 2022 onward, compares regression models for estimating yield curves, and assesses the performance of these models in large-volume, liquid market conditions.

The yield curve represents the relationship between the rate of return and the maturity of a given class of securities. This relationship drives a wide range of market activity, so its significance is unquestionable. Its shape also reflects the state of the economy, i.e., it can signal recessions. For these reasons, it is very important to estimate the yield curve properly and accurately. Various models have been developed for its estimation; however, polynomial regression models have always been the starting point of the study.

**Prerequisites**

**Module:** Modeling Term-Structure of Interest Rates

Lectures: Yield Curve Construction – Interpolation Methods | Advanced Interpolation Methods – Vandermonde Matrix | Newton Divided Difference | Modeling Yield Curve – Regression Models (Single Factor)

In this project, you need to explain the use of polynomial regression models in analyzing interest rates and the importance of understanding the relationship between tenors and interest rates.

You must clearly state the assumptions made during the analysis. These may include:

About the data: the provided US Treasury interest rate data is accurate and reliable (cite the source), and no major economic events significantly influence interest rates during the observed period (in the case of time-series analysis).

About the relationship between variables: a polynomial relationship adequately represents the complex dynamics of the curve, and the relationship between tenors and interest rates does not undergo abrupt, unaccounted-for changes. State any other relevant assumptions regarding the models used and their underlying methods.

**Market Data for Interest Rates:** https://home.treasury.gov/resource-center/data-chart-center/interest-rates/TextView?type=daily_treasury_yield_curve&field_tdr_date_value=2023

**Project Requirements**

**Actual vs. Predicted Interest Rates:** Create a table comparing the actual interest rates with the predicted interest rates for each degree of polynomial. In the context of this project on modeling interest rates using polynomial regression analysis, the comparison between actual and predicted interest rates involves assessing how well these polynomial regression models are capturing the observed data.

**Actual Interest Rates:** The actual interest rates are the real-world interest rates observed or recorded in the dataset. These are the "ground truth" values that you aim to model and predict using the polynomial regression equations.

**Predicted Interest Rates:** The predicted interest rates are the values generated by the polynomial regression models. After fitting the models to the observed data, you use the polynomial equations to predict interest rates at the given values of the independent variable (the tenor).

The goal of the table comparing actual and predicted interest rates is to assess the performance of the polynomial regression models at different degrees, which helps in understanding how well these models capture the underlying patterns in the data.

If the predicted interest rates closely match the actual rates across different tenors, it indicates a good fit of the model to the data. Deviations between actual and predicted rates help identify areas where the model may struggle to capture the underlying patterns.
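The comparison described above can be sketched as follows. This is a minimal illustration, not a prescribed implementation: the `tenors` and `actual` arrays are hypothetical placeholder values (not actual Treasury data), the choice of degrees 1 to 4 is an assumption, and the helper name `fit_and_predict` is made up for this example. Replace the placeholders with one day's curve from the Treasury source.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical placeholder values, NOT actual Treasury yields.
tenors = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 20.0, 30.0])  # years
actual = np.array([4.70, 4.85, 4.75, 4.40, 4.15, 3.95, 3.90, 3.85, 4.10, 3.95])  # percent

degrees = [1, 2, 3, 4]  # assumed set of polynomial degrees to compare


def fit_and_predict(x, y, degree):
    """Fit a least-squares polynomial of the given degree and predict at x."""
    coefs = P.polyfit(x, y, degree)  # ordered beta_0, beta_1, ..., beta_degree
    return P.polyval(x, coefs)


predicted = {d: fit_and_predict(tenors, actual, d) for d in degrees}

# Print one row per tenor: actual rate, then the prediction of each degree.
header = f"{'Tenor':>6} {'Actual':>8} " + " ".join(f"{'Deg ' + str(d):>8}" for d in degrees)
print(header)
for i, t in enumerate(tenors):
    cells = " ".join(f"{predicted[d][i]:>8.3f}" for d in degrees)
    print(f"{t:>6.2f} {actual[i]:>8.3f} {cells}")
```

Reading across a row shows how each degree tracks the observed rate at that tenor; reading down a column shows where a given degree drifts away from the data.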

**Coefficient Table:** Create a table presenting the coefficients for each degree of the polynomial function. Include columns for each coefficient (**β0**, **β1**, ..., **βn**). In the context of regression analysis, coefficients are the parameters assigned to each independent variable in a mathematical equation. These coefficients are determined by fitting a model to observed data. In this project, you are using polynomial regression models to estimate yield curves, so the coefficients are associated with the terms of the polynomial equation.

Creating a "Coefficient Table" involves presenting the coefficients for each degree of the polynomial function in a structured and organized format – using polynomial regression models of different degrees to estimate yield curves.

*The coefficients β0, β1, β2, and so on, are estimated from the regression analysis and indicate the intercept, linear impact, quadratic impact, cubic impact, and so on, on interest rates.*

A positive coefficient indicates that the corresponding term contributes positively to the dependent variable; a negative coefficient indicates a negative contribution. Magnitude matters: larger absolute values mean a stronger impact on the dependent variable. Keep in mind that each coefficient multiplies a different power of the tenor, so magnitudes are not directly comparable across terms.
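One way to lay out the coefficient table is sketched below. The data arrays are hypothetical placeholders (not actual Treasury data), and the layout, with blank cells for coefficients that a lower-degree model does not have, is an assumption of this example rather than a required format.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical placeholder values, NOT actual Treasury yields.
tenors = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 20.0, 30.0])  # years
actual = np.array([4.70, 4.85, 4.75, 4.40, 4.15, 3.95, 3.90, 3.85, 4.10, 3.95])  # percent

degrees = [1, 2, 3, 4]  # assumed set of polynomial degrees
max_degree = max(degrees)

# One row per model, one column per coefficient beta_0 .. beta_max_degree.
# NaN marks coefficients that do not exist for a lower-degree model.
coef_table = np.full((len(degrees), max_degree + 1), np.nan)
for row, d in enumerate(degrees):
    coef_table[row, : d + 1] = P.polyfit(tenors, actual, d)

print(f"{'Degree':>6} " + " ".join(f"{'beta_' + str(j):>12}" for j in range(max_degree + 1)))
for row, d in enumerate(degrees):
    cells = " ".join(f"{'':>12}" if np.isnan(c) else f"{c:>12.6f}" for c in coef_table[row])
    print(f"{d:>6} {cells}")
```

Note that `numpy.polynomial.polynomial.polyfit` returns coefficients from the lowest power upward, which matches the β0, β1, ..., βn ordering used here; the older `np.polyfit` returns them in the opposite order.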

**Regression Residuals:** Create a table presenting the residuals for each degree of the polynomial function. In the context of polynomial regression analysis, residuals are the vertical distances between the observed data points and the corresponding points predicted by the regression model. Mathematically, the residual for each observation is the difference between the actual value (Y) and the predicted value (Yhat), that is, Y - Yhat. Creating a table of regression residuals helps in assessing the accuracy and goodness of fit of the polynomial models.

The table of residuals provides a systematic way to examine how well the polynomial regression models capture the variability in the data. It helps identify patterns in the discrepancies between the actual and predicted values.

In a time-series analysis of residuals, residuals that are randomly scattered around zero suggest that the model is capturing the variability in the data well. Patterns or trends in the residuals might indicate areas where the model performs poorly or fails to capture certain features of the data.
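The residual table can be built along the following lines. As before, the data arrays are hypothetical placeholders (not actual Treasury data) and the degrees 1 to 4 are an assumed choice. Residuals are computed as actual minus predicted.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical placeholder values, NOT actual Treasury yields.
tenors = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 20.0, 30.0])  # years
actual = np.array([4.70, 4.85, 4.75, 4.40, 4.15, 3.95, 3.90, 3.85, 4.10, 3.95])  # percent

degrees = [1, 2, 3, 4]  # assumed set of polynomial degrees

# Residual = actual value minus the value predicted by the fitted polynomial.
residuals = {}
for d in degrees:
    coefs = P.polyfit(tenors, actual, d)
    residuals[d] = actual - P.polyval(tenors, coefs)

print(f"{'Tenor':>6} " + " ".join(f"{'Deg ' + str(d):>9}" for d in degrees))
for i, t in enumerate(tenors):
    print(f"{t:>6.2f} " + " ".join(f"{residuals[d][i]:>9.4f}" for d in degrees))

# Summary row: sum of squared residuals per degree.
print(f"{'SSR':>6} " + " ".join(f"{np.sum(residuals[d] ** 2):>9.4f}" for d in degrees))
```

Because every model here includes an intercept, the least-squares residuals of each degree sum to approximately zero; what matters for diagnosis is whether their signs run in long streaks along the tenor axis, which would suggest a shape the polynomial has missed.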

**R-squared (Coefficient of Determination):** It is a statistical measure that represents the proportion of the total variance in the dependent variable that is explained by the independent variables in a regression model. In the context of polynomial regression, having a table of R-squared values for different degrees of the polynomial helps evaluate the goodness of fit of the models.

R-squared = 1 - (Sum of Squares of Residuals / Total Sum of Squares)

where,

Sum of Squares of Residuals = Summation of (Y - Yhat)^2

Total Sum of Squares = Summation of (Y - Ybar)^2

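A higher R-squared value indicates that a larger proportion of the variability in the dependent variable is explained by the model. Note that R-squared computed on the fitted data never decreases when the polynomial degree is increased, so the highest value alone does not identify the best model. Comparing R-squared values across different degrees, and checking how much each additional degree actually adds, helps identify the degree that provides the best balance between model complexity and explanatory power.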
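The formula above translates directly into code. The sketch below uses hypothetical placeholder data (not actual Treasury data). The adjusted R-squared column is an addition that the brief does not ask for; it is included only as one common way to penalize extra terms, and the function names are made up for this example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical placeholder values, NOT actual Treasury yields.
tenors = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 20.0, 30.0])  # years
actual = np.array([4.70, 4.85, 4.75, 4.40, 4.15, 3.95, 3.90, 3.85, 4.10, 3.95])  # percent


def r_squared(y, y_hat):
    """R-squared = 1 - (sum of squares of residuals / total sum of squares)."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return float(1.0 - ss_res / ss_tot)


def adjusted_r_squared(r2, n_obs, degree):
    """Adjusted R-squared (not required by the brief); penalizes extra terms."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - degree - 1)


r2_table = {}
for d in [1, 2, 3, 4]:  # assumed set of polynomial degrees
    y_hat = P.polyval(tenors, P.polyfit(tenors, actual, d))
    r2 = r_squared(actual, y_hat)
    r2_table[d] = (r2, adjusted_r_squared(r2, len(actual), d))

print(f"{'Degree':>6} {'R2':>10} {'Adj. R2':>10}")
for d, (r2, adj) in r2_table.items():
    print(f"{d:>6} {r2:>10.4f} {adj:>10.4f}")
```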

**Your Research Report Must Include!**

You need to summarize the findings, discuss the trade-off between model complexity and goodness of fit, evaluate whether higher-degree polynomials result in overfitting or capturing noise in the data, and consider the risk of underfitting if the polynomial degree is too low to capture essential patterns.
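One possible way to probe overfitting, offered here as a suggestion rather than a requirement of the brief, is leave-one-out cross-validation: refit each polynomial with one tenor held out and measure the error at the held-out point. The sketch below uses hypothetical placeholder data (not actual Treasury data), and the helper names `loo_rmse` and `in_sample_rmse` are made up for this example.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical placeholder values, NOT actual Treasury yields.
tenors = np.array([0.25, 0.5, 1.0, 2.0, 3.0, 5.0, 7.0, 10.0, 20.0, 30.0])  # years
actual = np.array([4.70, 4.85, 4.75, 4.40, 4.15, 3.95, 3.90, 3.85, 4.10, 3.95])  # percent


def in_sample_rmse(x, y, degree):
    """Root-mean-square residual when fitting and evaluating on the same points."""
    y_hat = P.polyval(x, P.polyfit(x, y, degree))
    return float(np.sqrt(np.mean((y - y_hat) ** 2)))


def loo_rmse(x, y, degree):
    """Leave-one-out RMSE: hold out each point in turn, refit, predict it."""
    errors = []
    for i in range(len(x)):
        keep = np.arange(len(x)) != i  # boolean mask dropping observation i
        coefs = P.polyfit(x[keep], y[keep], degree)
        errors.append(y[i] - P.polyval(x[i], coefs))
    return float(np.sqrt(np.mean(np.square(errors))))


print(f"{'Degree':>6} {'In-sample':>10} {'LOO':>10}")
for d in [1, 2, 3, 4]:  # assumed set of polynomial degrees
    print(f"{d:>6} {in_sample_rmse(tenors, actual, d):>10.4f} {loo_rmse(tenors, actual, d):>10.4f}")
```

An in-sample error that keeps falling while the held-out error rises is a typical sign that the higher degree is fitting noise. Holding out the shortest or longest tenor forces the polynomial to extrapolate, so those points can dominate the held-out error for high degrees.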

Performance: Identify the degree of the polynomial that performs best based on a combination of low residuals and high R-squared, and justify why this degree is chosen as the optimal model.

**References:** If any external sources or libraries were used during the analysis, provide proper references. Include details on datasets, statistical methods, or relevant literature that influenced the research.

**Team Collaboration and Selection Requirement:**

For this project, candidates have the flexibility to choose their mode of operation:

**Standalone Mode:** If you feel confident and would like to take on the challenge individually, you're welcome to work on the project as a standalone participant. This will give you an opportunity to showcase your individual strengths and decision-making skills.

**Team Collaboration:** Alternatively, if you believe that collaboration will enhance the quality and depth of your analysis, you are encouraged to form a team. Collaborative efforts often bring diverse perspectives, leading to richer insights and more comprehensive results.

**Self-Selection:** Candidates are free to select their own teammates. If you already have someone in mind, align with them and inform the project coordinator of your team composition.

**Team Size:** While there's no strict limit, we recommend teams of 2-3 members for effective collaboration and equitable distribution of work.

**Commitment Agreement:** Ensure that all members of the team are equally committed to the project.

Please note: Whether you choose to work individually or in a team, the assessment criteria will remain consistent. The emphasis will be on the depth of analysis, quality of insights, and presentation of findings.
