The comment in the sample code discusses using a subset of the full dataset to make training faster, and suggests that the code that follows takes the first 100 samples from the dataset. However, the code in the repository actually skips the first 100 items and puts the rest of the full set into the X_digits, Y_digits matrices that are evaluated.
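The difference comes down to which side of the slice colon the index sits on. The sketch below is only an illustration: the exact slicing expression used in the repository is not quoted here, and `X_full`, `y_full`, and the 500-sample size are hypothetical stand-ins rather than the real data.

```python
# Hypothetical stand-in data: 500 samples, each a short feature list.
# The real dataset and variable names in the repository may differ.
X_full = [[i, i + 1, i + 2] for i in range(500)]
y_full = list(range(500))

# What the comment describes: keep only the first 100 samples.
X_first, y_first = X_full[:100], y_full[:100]

# What the code is reported to do: skip the first 100 and keep the rest.
X_rest, y_rest = X_full[100:], y_full[100:]

print(len(X_first), len(X_rest))
```

If the intent really is to speed up training, the `[:100]` form is the one that matches the comment; otherwise the comment should be updated to describe the `[100:]` behaviour.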