AI For Trading: Backtest (118)

What is a backtest?



Backtest Validity

A "valid" backtest must satisfy:

  • 1. The profit calculation is realistic: the simulated profit and loss could actually be achieved by trading the same instruments that the backtest trades.
  • 2. There is no lookahead bias.
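
As a rough illustration of the second condition, here is a minimal sketch in plain Python. The prices and the momentum rule are made up for this example and are not taken from the course. The idea is that the position held during period t may only depend on information available before period t starts, so the signal has to be lagged by one period.

```python
def daily_returns(prices):
    """Simple returns: r[t] = prices[t + 1] / prices[t] - 1."""
    return [prices[t] / prices[t - 1] - 1.0 for t in range(1, len(prices))]


def momentum_signal(returns):
    """Hypothetical rule: go long (+1) after an up move, short (-1) otherwise."""
    return [1 if r > 0 else -1 for r in returns]


def backtest_pnl(returns, signal, lag):
    """Sum of position * return, where the position for period t is signal[t - lag].

    lag=0 sizes the position with the very return it is about to earn (lookahead bias).
    lag=1 sizes the position with the previous period's signal only.
    """
    return sum(signal[t - lag] * returns[t] for t in range(lag, len(returns)))


prices = [100.0, 102.0, 101.0, 103.0, 102.0, 104.0, 103.0]  # made-up data
rets = daily_returns(prices)
sig = momentum_signal(rets)

biased_pnl = backtest_pnl(rets, sig, lag=0)  # earns |return| every period: too good to be true
honest_pnl = backtest_pnl(rets, sig, lag=1)  # same rule, but tradable in real time
print(biased_pnl, honest_pnl)
```

With `lag=0` the backtest collects the absolute value of every return, which no real strategy can do. The same rule with `lag=1` loses money on this zig-zag price path, which shows how large the distortion from lookahead bias can be.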

Backtest Overfitting


Trading in larger sizes than would be optimal.

Backtest Best Practices

    1. Use cross-validation to achieve just the right amount of model complexity.
    2. Always keep an out-of-sample test dataset. Only look at the results of a test on this dataset once all model decisions have been made. If you let the results of this test influence decisions about the model, you no longer have an estimate of generalization error.
    3. Be wary of creating multiple model configurations. If the Sharpe ratio of a backtest is 2, but there are 10 model configurations, this is a kind of multiple comparison bias. This is different from repeatedly tweaking the parameters to get a Sharpe ratio of 2.
    4. Be careful about your choice of time period for validation and testing. Be sure that the test period is not special in any way.
    5. Be careful about how often you touch the data. Use the test data only once, when your validation process is finished and your model is fully built. Too many tweaks in response to tests on validation data are likely to cause the model to increasingly fit the validation data.
    6. Keep track of the dates on which modifications to the model were made, so that you know the date on which a provable out-of-sample period commenced. If a model hasn't changed for 3 years, then its performance over the past 3 years is a measure of out-of-sample performance.
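
The points about keeping a held-out test set and touching it only once can be sketched as follows. This is a minimal illustration, not a prescribed workflow: the function `chronological_split`, the class `OneShotTestSet`, and the sample data are all hypothetical names and values introduced here. The split keeps observations in time order (no shuffling), and the test set is wrapped so that a second access fails loudly.

```python
class OneShotTestSet:
    """Wraps the test data and refuses to hand it out more than once."""

    def __init__(self, data):
        self._data = data
        self._used = False

    def take(self):
        if self._used:
            raise RuntimeError("test set already used; results are no longer out-of-sample")
        self._used = True
        return self._data


def chronological_split(observations, n_valid, n_test):
    """Split time-ordered observations into (train, validation, test) without shuffling."""
    if n_valid < 1 or n_test < 1 or n_valid + n_test >= len(observations):
        raise ValueError("need at least one observation in each split")
    test_start = len(observations) - n_test
    valid_start = test_start - n_valid
    train = observations[:valid_start]            # oldest data: fit the model here
    valid = observations[valid_start:test_start]  # tune model complexity here
    test = OneShotTestSet(observations[test_start:])  # newest data: evaluate once
    return train, valid, test


# Made-up daily observations: (ISO date string, return)
history = [("2020-01-%02d" % day, 0.001 * day) for day in range(1, 11)]
train, valid, test = chronological_split(history, n_valid=2, n_test=2)
print(len(train), len(valid))
```

Calling `test.take()` returns the held-out observations the first time and raises `RuntimeError` afterwards. A guard like this is only a reminder; the real discipline is in not feeding test results back into model decisions.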

Traditional ML is about fitting a model until it works. Finance is different: you can't keep adjusting parameters to get a desired result. Maximizing the in-sample Sharpe ratio is not a good goal, because it would probably make the out-of-sample Sharpe ratio worse. It's very important to follow good research practices.
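
To see why chasing the in-sample Sharpe ratio is risky, the following sketch simulates many "strategies" that are pure noise and then selects the one with the best in-sample Sharpe ratio. Everything here is an assumption made for illustration: the function names, the 1% daily volatility, the 252-day year, the zero risk-free rate, and the number of configurations. Because each strategy has zero expected return by construction, any high in-sample Sharpe ratio is produced by selection alone.

```python
import math
import random
import statistics


def sharpe_ratio(returns, periods_per_year=252):
    """Annualized Sharpe ratio of per-period returns, assuming a zero risk-free rate."""
    return statistics.mean(returns) / statistics.stdev(returns) * math.sqrt(periods_per_year)


def simulate_selection(n_configs=50, n_days=252, seed=0):
    """Pick the configuration with the best in-sample Sharpe among pure-noise strategies."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_configs):
        # Each "strategy" is noise with zero expected return, so its true Sharpe ratio is 0.
        in_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        out_sample = [rng.gauss(0.0, 0.01) for _ in range(n_days)]
        results.append((sharpe_ratio(in_sample), sharpe_ratio(out_sample)))
    # Select on in-sample performance only, as an over-eager researcher would.
    best_in, best_out = max(results, key=lambda pair: pair[0])
    return best_in, best_out, results


best_in, best_out, _ = simulate_selection()
print("best in-sample Sharpe: %.2f" % best_in)
print("same strategy out-of-sample: %.2f" % best_out)
```

The selected in-sample Sharpe ratio is the maximum over all configurations, so it is biased upward, while the out-of-sample figure for the same strategy is just another draw around zero. Increasing `n_configs` makes the gap larger on average.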


Gradient Boosting

In our exercise about overfitting, we're going to use a type of model that we haven't yet encountered in the course, but that's popular and well-known, and has been used successfully in machine learning competitions: gradient boosted trees. Here we're going to give you a short introduction to gradient boosting so that you have an intuition for how the model works.

We've already studied ensembling. Boosting is another type of ensembling: it combines weak learners into a strong learner, and it is typically done with decision trees as the weak learners. The video below will give you a quick introduction to boosting by telling you about the first successful boosting algorithm, AdaBoost.
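
For intuition about how weak learners add up to a strong one, here is a minimal sketch of gradient boosting for regression with squared-error loss, written in plain Python. It is not the course's implementation and not AdaBoost. It assumes a single numeric feature, uses depth-1 trees (stumps) as the weak learners, and all function names and data are made up for the example. With squared error, the negative gradient is simply the residual, so each round fits a stump to the current residuals and adds a scaled copy of it to the model.

```python
def fit_stump(xs, residuals):
    """Find the threshold split on a 1-D feature that minimizes squared error."""
    best = None
    # Skip the largest x as a threshold: it would leave the right side empty.
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gradient_boosting(xs, ys, n_rounds=50, learning_rate=0.1):
    """Start from the mean of ys, then repeatedly fit stumps to the residuals."""
    base = sum(ys) / len(ys)
    predictions = [base] * len(ys)
    stumps = []
    for _ in range(n_rounds):
        residuals = [y - p for y, p in zip(ys, predictions)]  # negative gradient of squared loss
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
    return base, stumps, learning_rate


def predict(model, x):
    base, stumps, learning_rate = model
    return base + learning_rate * sum(stump_predict(s, x) for s in stumps)


def mse(model, xs, ys):
    return sum((y - predict(model, x)) ** 2 for x, y in zip(xs, ys)) / len(ys)


xs = [float(i) for i in range(10)]  # made-up feature values
ys = [x ** 2 for x in xs]           # made-up target
baseline = fit_gradient_boosting(xs, ys, n_rounds=0)  # predicts the mean only
boosted = fit_gradient_boosting(xs, ys, n_rounds=50)
print(mse(baseline, xs, ys), mse(boosted, xs, ys))
```

The training error falls as rounds are added, which is also why the number of rounds and the learning rate are natural knobs for overfitting: they should be chosen on validation data rather than on the training fit.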