Versioning
Logs the master dataset changes
Master 1.0.3 - 2023/04/14
Train - Test split
X_train_full and y_train_full files were added for download. They include the last 3 months' data and the resolved targets. Unresolved targets are set to NaN.
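Because unresolved targets appear as NaN in y_train_full, a model for a given target should usually be fitted only on rows where that target is resolved. The sketch below shows one way to filter them with pandas. The "moon" and "id" column names and the toy values are assumptions made for illustration, not taken from the actual files.

```python
import numpy as np
import pandas as pd

# Toy stand-in for y_train_full; column layout and values are hypothetical.
y_train_full = pd.DataFrame({
    "moon": [0, 0, 1, 1],
    "id": ["a", "b", "a", "b"],
    "target_w": [0.25, 0.75, 0.50, 1.00],
    "target_b": [0.50, 0.25, np.nan, np.nan],  # not yet resolved for moon 1
})

def resolved_rows(y: pd.DataFrame, target: str) -> pd.DataFrame:
    """Return only the rows where the given target has been resolved."""
    return y[y[target].notna()]

resolved_w = resolved_rows(y_train_full, "target_w")  # all rows kept
resolved_b = resolved_rows(y_train_full, "target_b")  # moon 1 dropped
```

Filtering per target rather than dropping any row with a NaN keeps the most recent moons available for the shorter-horizon targets.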
Master 1.0.2 - 2023/04/07
Features
One feature from the gordon strategy was removed due to a minor fix.
Master 1.0.1 - 2023/03/31
Features
Some features were removed from the master dataset. The feature names changed as a result, since features are named by their order within their respective strategies.
Master 1.0 - 2023/03/24
Features
The feature processing has changed: the features are no longer orthogonalized to risk factors.
Targets
The targets changed from specific compounded returns over the target's span to raw compounded returns over the respective span.
Master 0.1 - 2023/03/17
Features
Feature names range from Feature_1 to Feature_n. They are prefixed with their respective strategy.
Features are orthogonal to risk factors.
Features are quantized in a Gaussian way within each moon's cross-section.
Targets
Computed as the cumulative product of the specific return of each asset over different time horizons, with a 2-day lag:
target_w: 7 days
target_r: 30 days
target_g: 60 days
target_b: 90 days
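To make the "cumulative product with a lag" idea concrete, here is a minimal sketch for a single asset. It rests on several assumptions not stated in this changelog: returns are given as a daily series, the lag means the window starts 2 days after the observation day, and the compounded return is expressed as the product of (1 + r) minus 1. The function name and signature are hypothetical.

```python
import numpy as np

def forward_compounded_return(returns, horizon, lag=2):
    """Compound `horizon` daily returns starting `lag` days after each day.

    Entries whose window runs past the end of the series are left as NaN,
    which mirrors how unresolved targets are reported.
    """
    returns = np.asarray(returns, dtype=float)
    n = len(returns)
    out = np.full(n, np.nan)
    for t in range(n):
        start = t + lag          # assumed lag convention
        end = start + horizon    # exclusive end of the window
        if end <= n:
            out[t] = np.prod(1.0 + returns[start:end]) - 1.0
    return out

# 12 days of a constant 1% daily return, 7-day horizon (as for target_w).
daily = [0.01] * 12
target_7d = forward_compounded_return(daily, horizon=7)
```

In this toy series only the first four days have a complete 7-day window after the 2-day lag, so the remaining entries stay NaN.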
The targets are quantized in a fat-tailed way within each moon's cross-section.
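Cross-sectional quantization means that each value is ranked only against the other assets of the same moon. The exact Gaussian and fat-tailed bin schemes are not specified here, so the sketch below substitutes plain equal-count bins to illustrate the per-moon mechanics only. The "moon" column name, the function name and the n_bins parameter are assumptions.

```python
import pandas as pd

def quantize_by_moon(df: pd.DataFrame, column: str, n_bins: int = 5) -> pd.Series:
    """Map `column` to n_bins equal-count levels in [0, 1] within each moon."""
    grouped = df.groupby("moon")[column]
    rank = grouped.rank(method="first")   # 1..n within each moon
    count = grouped.transform("count")    # n, repeated on every row of the moon
    bin_index = ((rank - 1) * n_bins) // count  # 0..n_bins-1
    return bin_index / (n_bins - 1)

raw = pd.DataFrame({
    "moon": [0, 0, 0, 0, 1, 1, 1, 1],
    "value": [3.0, 1.0, 2.0, 4.0, 10.0, 40.0, 20.0, 30.0],
})
quantized = quantize_by_moon(raw, "value", n_bins=2)
```

Because ranks are computed per moon, the very different scales of moon 0 and moon 1 in the toy data end up on the same 0-1 grid.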
Train - Test split
The dataset is split into three parts:
X_train contains the features up to the last moon - 90 days.
y_train contains the targets up to the last moon - 90 days.
X_test contains the features of the last 90 days.
Embargo: the 90-day span of X_test, which you must predict, covers the part of the data for which the targets are not yet fully computed, since target_b is a 90-day target.
Moons are continuous between the train set and the test set: the first moon of the test set is the max moon of the train set + 1.
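A split of this shape can be sketched by holding out the most recent moons. How many moons make up 90 days is not stated here, so the n_test_moons parameter below is hypothetical, as are the "moon" and "id" column names. The strategy prefix on the feature name is omitted for brevity.

```python
import pandas as pd

def split_by_moon(df: pd.DataFrame, n_test_moons: int):
    """Hold out the last `n_test_moons` moons (n_test_moons >= 1) as the test set."""
    moons = sorted(df["moon"].unique())
    test_moons = set(moons[-n_test_moons:])
    is_test = df["moon"].isin(test_moons)
    return df[~is_test], df[is_test]

data = pd.DataFrame({
    "moon": [0, 0, 1, 1, 2, 2, 3, 3],
    "id": ["a", "b"] * 4,
    "Feature_1": [0.0, 1.0, 0.25, 0.75, 0.5, 0.5, 1.0, 0.0],
})
train, test = split_by_moon(data, n_test_moons=1)

# Continuity check: the test set starts right after the last train moon.
is_continuous = test["moon"].min() == train["moon"].max() + 1
```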
Example submission
The example_submission file is generated with a linear regression fitted with sklearn.
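As an illustration of that kind of baseline, the sketch below fits scikit-learn's LinearRegression on synthetic data and clips the predictions to the 0-1 range. It is not the script used to produce the actual example_submission file. The data, the feature names without a strategy prefix, and the choice of a single target are assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
feature_cols = ["Feature_1", "Feature_2"]  # strategy prefixes omitted

# Synthetic stand-ins for X_train, y_train and X_test.
X_train = pd.DataFrame(rng.normal(size=(200, 2)), columns=feature_cols)
y_train = pd.DataFrame({
    "target_w": 0.5 + 0.1 * X_train["Feature_1"] + rng.normal(scale=0.05, size=200)
})
X_test = pd.DataFrame(rng.normal(size=(50, 2)), columns=feature_cols)

model = LinearRegression().fit(X_train[feature_cols], y_train["target_w"])

# Clip so that every submitted value lies in the 0-1 range.
pred = np.clip(model.predict(X_test[feature_cols]), 0.0, 1.0)
submission = pd.DataFrame({"target_w": pred})
```

A real submission would repeat this for each target column and carry over the moon and id columns from X_test.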
Submission
A submission file must respect the following rules to be valid:
The column names must be the same as those in the example_submission file.
Values must be in the 0-1 range.
NaNs are not accepted.
The number of moons must be the same as in the example_submission file.
The ids in each moon must be the same as in the example_submission file.
No column may be constant within any moon.
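These rules can be checked locally before uploading. The function below is a minimal sketch of such a check, not the platform's validator. The "moon" and "id" column names, the function name and the error messages are assumptions.

```python
import pandas as pd

def validate_submission(sub, example, moon_col="moon", id_col="id"):
    """Return a list of rule violations; an empty list means no issue was found."""
    if set(sub.columns) != set(example.columns):
        return ["column names differ from example_submission"]
    errors = []
    value_cols = [c for c in sub.columns if c not in (moon_col, id_col)]
    values = sub[value_cols]
    if values.isna().any().any():
        errors.append("NaNs are not accepted")
    if ((values < 0) | (values > 1)).any().any():
        errors.append("values must be in the 0-1 range")
    if sub[moon_col].nunique() != example[moon_col].nunique():
        errors.append("number of moons differs from example_submission")
    # Comparing (moon, id) pairs checks the ids of every moon at once.
    if set(zip(sub[moon_col], sub[id_col])) != set(zip(example[moon_col], example[id_col])):
        errors.append("ids per moon differ from example_submission")
    # A column with a single distinct value inside a moon is constant there.
    if (sub.groupby(moon_col)[value_cols].nunique() <= 1).any().any():
        errors.append("constant column within a moon")
    return errors

example = pd.DataFrame({
    "moon": [0, 0, 1, 1],
    "id": ["a", "b", "a", "b"],
    "target_w": [0.2, 0.8, 0.4, 0.6],
})
good = example.assign(target_w=[0.1, 0.9, 0.3, 0.7])
bad = example.assign(target_w=[0.5, 0.5, 0.3, 1.2])  # constant in moon 0, 1.2 > 1

good_errors = validate_submission(good, example)
bad_errors = validate_submission(bad, example)
```

Returning every violation at once, rather than stopping at the first, makes it easier to fix a file in a single pass.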