We tested the following hyperparameter space:
| Parameter | From | To | Steps |
|---|---|---|---|
| Loss Function | BPR-MAX | TOP1-MAX | - |
| Final Activation Function | ELU-0.5 | Linear | - |
| Learning Rate | 0.1, 0.5 | 0.01, 0.1 | 10, 5 |
| Momentum | 0.00 | 0.90 | 0.10 |
| Drop-Out | 0.00 | 0.90 | 0.10 |
| Constrained Embedding | True | False | - |

| Dataset | Loss Function | Final Activation Function | Learning Rate | Momentum | Drop-Out | Constrained Embedding |
|---|---|---|---|---|---|---|
| RSC15 | TOP1-MAX | Linear | 0.04 | 0.0 | 0.3 | True |
| RETAILROCKET | TOP1-MAX | Linear | 0.03 | 0.2 | 0.3 | True |
| ZALANDO | BPR-MAX | ELU-0.5 | 0.1 | 0.3 | 0.1 | False |
| DIGINETICA | TOP1-MAX | Linear | 0.05 | 0.0 | 0.4 | True |
| DIGINETICA (STAMP) | TOP1-MAX | ELU-0.5 | 0.07 | 0.0 | 0.6 | True |
| 8TRACKS | TOP1-MAX | ELU-0.5 | 0.04 | 0.3 | 0.8 | True |
| AOTM | TOP1-MAX | ELU-0.5 | 0.04 | 0.6 | 0.0 | False |
| NOWPLAYING | TOP1-MAX | ELU-0.5 | 0.05 | 0.1 | 0.6 | True |
| 30MUSIC | TOP1-MAX | ELU-0.5 | 0.05 | 0.1 | 0.6 | True |

We tested the following hyperparameter space:
| Parameter | From | To | Steps |
|---|---|---|---|
| Number of Epochs | 10 | 30 | 10 |
| Decay Rate | 0.0 | 0.9 | 10 |
| Initial Learning Rate | 0.001, 0.0001 | 0.01, 0.001 | 10, 10 |

| Dataset | Number of Epochs | Decay Rate | Initial Learning Rate |
|---|---|---|---|
| RSC15 | 20 | 0 | 0.0007 |
| RETAILROCKET | 10 | 0.6 | 0.0008 |
| ZALANDO | 30 | 0.7 | 0.009 |
| DIGINETICA | 20 | 0.1 | 0.0009 |
| DIGINETICA (STAMP) | 20 | 0.1 | 0.0008 |
| 8TRACKS | 30 | 0.0 | 0.0008 |
| AOTM | 30 | 0 | 0.004 |
| NOWPLAYING | 20 | 0.9 | 0.0005 |
| 30MUSIC | 10 | 0.4 | 0.003 |
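
The From/To/Steps rows of the search-space table describe evenly spaced candidate values. The following is a minimal sketch of how such a row could be expanded into a list of candidates. It assumes that the Steps entry of 10 for Decay Rate is a number of values and that the Steps entry of 10 for Number of Epochs is a step size; this reading is inferred from the selected values and is not stated in the tables. The helper name `linear_grid` is hypothetical.

```python
def linear_grid(start, stop, num, ndigits=6):
    """Return `num` evenly spaced values from `start` to `stop`, inclusive.

    Values are rounded to `ndigits` decimals to hide floating-point noise.
    """
    if num < 2:
        return [round(float(start), ndigits)]
    step = (stop - start) / (num - 1)
    return [round(start + i * step, ndigits) for i in range(num)]


# Decay Rate: From 0.0, To 0.9, assumed to mean 10 evenly spaced values.
decay_rates = linear_grid(0.0, 0.9, 10)

# Number of Epochs: From 10, To 30, assumed step size of 10 (three values).
epochs = [int(v) for v in linear_grid(10, 30, 3)]

print(decay_rates)
print(epochs)
```

Under these assumptions every Decay Rate and Number of Epochs value in the per-dataset table falls on the generated grid.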
We tested the following hyperparameter space:
| Parameter | From | To | Steps |
|---|---|---|---|
| Learning Rate | 0.1, 0.5 | 0.01, 0.1 | 10, 5 |

| Dataset | Learning Rate |
|---|---|
| RSC15 | 0.0008 |
| RETAILROCKET | 0.01 |
| ZALANDO | 0.007 |
| DIGINETICA | 0.0007 |
| DIGINETICA (STAMP) | 0.008 |
| 8TRACKS | 0.002 |
| AOTM | 0.004 |
| NOWPLAYING | 0.004 |
| 30MUSIC | 0.007 |

We tested the following hyperparameter space:
| Parameter | From | To | Steps |
|---|---|---|---|
| Learning Rate | 0.01, 0.001 | 0.001, 0.0001 | 10, 5 |
| Iterations | 10 | 30 | 10 |
| Negative Sampling | True | False | - |

| Dataset | Learning Rate | Iterations | Negative Sampling |
|---|---|---|---|
| RETAILROCKET | 0.006 | 10 | True |
| DIGINETICA | 0.003 | 10 | False |
| DIGINETICA (STAMP) | 0.009 | 20 | False |
| AOTM | 0.005 | 30 | True |

We tested the following hyperparameter space:
| Parameter | From | To | Steps | Options |
|---|---|---|---|---|
| Learning Rate | 0.001, 0.0001 | 0.0001, 0.00001 | 10, 10 | - |
| Memory Size | - | - | - | 128, 256, 512 |

| Dataset | Learning Rate | Memory Size |
|---|---|---|
| RSC15 | 0.0002 | 256 |
| RETAILROCKET | 0.0003 | 512 |
| ZALANDO | 0.0005 | 256 |
| DIGINETICA | 0.0002 | 256 |
| 8TRACKS | 0.0008 | 256 |
| AOTM | 0.00001 | 256 |
| NOWPLAYING | 0.0005 | 128 |
| 30MUSIC | 0.0009 | 128 |

We tested the following hyperparameter space:
| Parameter | From | To | Steps |
|---|---|---|---|
| Learning Rate | 0.01 | 0.0001 | 20 |
| L2 Regularization | 0.0001 | 0.000001 | 20 |
| LR Decay | 0.0 | 0.9 | 10 |
| LR Decay Steps | 3 | 7 | 3 |

| Dataset | Learning Rate | L2 Regularization | LR Decay | LR Decay Steps |
|---|---|---|---|---|
| RSC15 | 0.0007 | 0.00001 | 0.1 | 7 |
| RETAILROCKET | 0.0002 | 0.000003 | 0.54 | 3 |
| ZALANDO | 0.006 | 0.000005 | 0.28 | 3 |
| DIGINETICA | 0.0001 | 0.000007 | 0.63 | 3 |
| 8TRACKS | 0.002 | 0.00005 | 0.46 | 7 |
| AOTM | 0.001 | 0.00006 | 0.1 | 7 |
| NOWPLAYING | 0.006 | 0.000007 | 0.1 | 3 |
| 30MUSIC | 0.006 | 0.00003 | 0.36 | 3 |

We tested the following hyperparameter space:
| Parameter | From | To | Steps | Options |
|---|---|---|---|---|
| Steps | 2, 14 | 15, 30 | 13, 4 | - |
| Weighting | - | - | - | Div, Linear, Quadratic, Log |

| Dataset | Steps | Weighting |
|---|---|---|
| RSC15 | 8 | Div |
| RETAILROCKET | 7 | Div |
| ZALANDO | 3 | Quadratic |
| DIGINETICA | 25 | Div |
| DIGINETICA (STAMP) | 30 | Quadratic |
| 8TRACKS | 25 | Log |
| AOTM | 6 | Div |
| NOWPLAYING | 9 | Quadratic |
| 30MUSIC | 30 | Quadratic |

We tested the following hyperparameter space:
| Parameter | Options |
|---|---|
| Number of Neighbors | 50, 100, 500, 1000, 1500 |
| Sample Size | 500, 1000, 2500, 5000, 10000 |
| Similarity | Cosine, Jaccard |

| Dataset | Number of Neighbors | Sample Size | Similarity |
|---|---|---|---|
| RSC15 | 500 | 10000 | Jaccard |
| RETAILROCKET | 50 | 5000 | Cosine |
| ZALANDO | 50 | 10000 | Cosine |
| DIGINETICA | 100 | 500 | Cosine |
| DIGINETICA (STAMP) | 1000 | 5000 | Cosine |
| 8TRACKS | 1000 | 1000 | Cosine |
| AOTM | 50 | 1000 | Cosine |
| NOWPLAYING | 50 | 2500 | Jaccard |
| 30MUSIC | 100 | 500 | Cosine |
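
When a search space is given as option lists, as it is for Number of Neighbors, Sample Size, and Similarity in this section, the candidate configurations are the combinations of one option per parameter. The sketch below enumerates them and draws a random subset. Whether the full grid or only a sample of it was evaluated is not stated here, so both are shown; the dictionary keys and the sample size of 10 are illustrative assumptions.

```python
import itertools
import random

# Option lists copied from the search-space table; key names are hypothetical.
search_space = {
    "neighbors": [50, 100, 500, 1000, 1500],
    "sample_size": [500, 1000, 2500, 5000, 10000],
    "similarity": ["cosine", "jaccard"],
}

# Full Cartesian product: one option per parameter in every combination.
keys = list(search_space)
all_configs = [
    dict(zip(keys, values))
    for values in itertools.product(*search_space.values())
]

# Random subset of the grid, seeded so that the draw is repeatable.
rng = random.Random(0)
sampled_configs = rng.sample(all_configs, k=10)

print(len(all_configs))  # 5 * 5 * 2 = 50 combinations
```

The grid here is small, so exhaustive evaluation is feasible; for the larger option tables in this document a random subset keeps the number of runs manageable.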
We tested the following hyperparameter space:
| Parameter | Options |
|---|---|
| Number of Neighbors | 50, 100, 500, 1000, 1500 |
| Sample Size | 500, 1000, 2500, 5000, 10000 |
| Weighting | Same, Div, Linear, Quadratic, Log |
| Weighting Score | Same, Div, Linear, Quadratic, Log |
| IDF Weighting | False, 1, 2, 5, 10 |

| Dataset | Number of Neighbors | Sample Size | Weighting | Weighting Score | IDF Weighting |
|---|---|---|---|---|---|
| RSC15 | 100 | 1000 | Quadratic | Quadratic | False |
| RETAILROCKET | 1500 | 2500 | Same | Linear | 10 |
| ZALANDO | 50 | 10000 | Log | Quadratic | 10 |
| DIGINETICA | 500 | 5000 | Quadratic | Div | 5 |
| DIGINETICA (STAMP) | 50 | 2500 | Log | Quadratic | 1 |
| 8TRACKS | 100 | 5000 | Quadratic | Quadratic | False |
| AOTM | 50 | 100 | Div | Quadratic | False |
| NOWPLAYING | 100 | 2500 | Quadratic | Quadratic | False |
| 30MUSIC | 100 | 10000 | Quadratic | Quadratic | False |

We tested the following hyperparameter space:
| Parameter | Options |
|---|---|
| Number of Neighbors | 100, 200, 500, 1000, 1500, 2000 |
| Sample Size | 1000, 2500, 5000, 10000 |
| Similarity | Cosine, Dot Product |
| Session Position Weighting | 0.00001, L/8, L/4, L/2, L, 2L (L=average session length in the dataset) |
| Session Neighborhood Weighting | 2.5, 5, 10, 20, 40, 80, 100 |
| Item Neighborhood Weighting | 0.00001, L/8, L/4, L/2, L, 2L (L=average session length in the dataset) |
| Item Position Weighting | 0.00001, L/8, L/4, L/2, L, 2L (L=average session length in the dataset) |
| IDF Weighting | False, 1, 2, 5, 10 |

| Dataset | Number of Neighbors | Sample Size | Similarity | Session Position Weighting | Session Neighborhood Weighting | Item Neighborhood Weighting | Item Position Weighting | IDF Weighting |
|---|---|---|---|---|---|---|---|---|
| RSC15 | 1500 | 10000 | Cosine | 0.5 | 5 | 2 | 2 | 2 |
| RETAILROCKET | 1000 | 2500 | Dot Product | 3.62 | 100 | 0.4525 | 3.62 | 1 |
| ZALANDO | 1500 | 10000 | Cosine | 3.13 | 100 | 3.13 | 1.56 | 1 |
| DIGINETICA | 1500 | 5000 | Cosine | 4.9 | 40 | 4.9 | 1.225 | 10 |
| 8TRACKS | 2000 | 2500 | Cosine | 5.68 | 100 | 22.72 | 0.00001 | False |
| AOTM | 200 | 5000 | Cosine | 7.05 | 80 | 14.1 | 0.00001 | 5 |
| NOWPLAYING | 200 | 1000 | Dot Product | 10.2 | 40 | 2 | 1.275 | False |
| 30MUSIC | 2000 | 2000 | Dot Product | 8.4 | 40 | 4.2 | 0.00001 | 1 |
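
Several options in this search space are expressed relative to L, the average session length of the dataset. A minimal sketch of how such a candidate list could be derived is given below. The value L = 3.62 is an assumption inferred from the RETAILROCKET row (which contains 3.62 and 0.4525 = 3.62 / 8), and the function name `length_relative_options` is hypothetical.

```python
def length_relative_options(avg_session_length, floor=0.00001):
    """Build the candidate list 0.00001, L/8, L/4, L/2, L, 2L for a dataset.

    `avg_session_length` is L; `floor` is the fixed smallest option.
    """
    L = avg_session_length
    return [floor, L / 8, L / 4, L / 2, L, 2 * L]


# Assumed average session length for RETAILROCKET, inferred from the table.
options = length_relative_options(3.62)
print(options)
```

Because the candidates scale with L, the same option table yields different absolute values for each dataset, which is why the per-dataset values in this section differ so widely.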
We tested the following hyperparameter space:
| Parameter | Options |
|---|---|
| Number of Neighbors | 100, 200, 500, 1000, 1500, 2000 |
| Sample Size | 1000, 2500, 5000, 10000 |
| Similarity | Cosine, Dot Product |
| Session Position Weighting | 0.00001, L/8, L/4, L/2, L, 2L (L=average session length in the dataset) |
| Session Neighborhood Weighting | 2.5, 5, 10, 20, 40, 80, 100 |
| Item Neighborhood Weighting | 0.00001, L/8, L/4, L/2, L, 2L (L=average session length in the dataset) |

| Dataset | Number of Neighbors | Sample Size | Session Position Weighting | Session Neighborhood Weighting | Item Neighborhood Weighting |
|---|---|---|---|---|---|
| RSC15 | 1000 | 10000 | 0.00001 | 10 | 2 |
| RETAILROCKET | 500 | 1000 | 1.81 | 100 | 0.4525 |
| ZALANDO | 100 | 1000 | 1.56 | 100 | 3.13 |
| DIGINETICA | 500 | 10000 | 1.225 | 20 | 4.9 |
| 8TRACKS | 500 | 10000 | 5.68 | 100 | 11.36 |
| AOTM | 500 | 1000 | 28.2 | 100 | 14.1 |
| NOWPLAYING | 2000 | 2500 | 0.00001 | 100 | 20.4 |
| 30MUSIC | 1000 | 10000 | 0.00001 | 100 | 4.2 |

We tested the following hyperparameter space:
| Parameter | Options |
|---|---|
| Expert | StdExpert, DirichletExpert |
| Max Considered Context Length | 5,10,20,30,40,50,75 |
| Number of Recent Candidates (Only for Adaptive Configuration) | 5, 10, 20, 30, 40, 50, 75 |

| Dataset | Expert | Max Considered Context Length | Number of Recent Candidates |
|---|---|---|---|
| All | StdExpert | 50 | 1000 |