The original ImageNet is a dataset of ~1.3M images spanning 1000 classes. Imagenette is a much smaller subset covering just 10 distinct classes
Normalization is the process of transforming the input data so it has mean 0 and standard deviation 1
If Normalize is added to batch_tfms without stats (i.e., not via Normalize.from_stats()), FA calculates the stats from a single batch of your data
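A minimal sketch of passing explicit stats, assuming an Imagenette-style folder with train/ and val/ splits (the 224px resize is my choice, not required):

    from fastai.vision.all import *

    path = untar_data(URLs.IMAGENETTE_160)  # or any train/val image folder

    # Normalize with ImageNet stats -- the right call when fine-tuning
    # a model that was pretrained on ImageNet
    dls = ImageDataLoaders.from_folder(
        path, valid='val',
        item_tfms=Resize(224),
        batch_tfms=[*aug_transforms(),
                    Normalize.from_stats(*imagenet_stats)])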
Training on images that gradually increase in size as training progresses. For example, first fit_one_cycle at size 128, then fine_tune at 224
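A minimal progressive-resizing sketch (the dataset, resnet34, and the epoch counts are placeholder choices):

    from fastai.vision.all import *

    path = untar_data(URLs.IMAGENETTE_160)

    def get_dls(bs, size):
        dblock = DataBlock(blocks=(ImageBlock, CategoryBlock),
                           get_items=get_image_files,
                           get_y=parent_label,
                           item_tfms=Resize(460),
                           batch_tfms=[*aug_transforms(size=size, min_scale=0.75),
                                       Normalize.from_stats(*imagenet_stats)])
        return dblock.dataloaders(path, bs=bs)

    # Phase 1: train quickly on small (128px) images
    learn = cnn_learner(get_dls(bs=128, size=128), resnet34, metrics=accuracy)
    learn.fit_one_cycle(4)

    # Phase 2: swap in larger (224px) images and keep training
    learn.dls = get_dls(bs=64, size=224)
    learn.fine_tune(5)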
*** Did not try
TTA (test-time augmentation) is when you apply data augmentation to test images and average the predictions. In FA, do learn.tta(n=4)
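For example, assuming learn is an already-trained Learner (as in the sketches above):

    # Average predictions over 4 augmented copies of each validation image
    preds, targs = learn.tta(n=4)
    print(accuracy(preds, targs))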
Slower, because each additional augmented image needs an inference pass as well
Mixup is when you mix images by adding weighted versions of two images together, i.e., their pixels are combined as a weighted sum. The labels are combined with the same weights. Implemented in fa2 by adding a callback to the learner. You will also want to use the LabelSmoothingCrossEntropy loss func. Example: cnn_learner(dls, …, cbs=MixUp(), loss_func=LabelSmoothingCrossEntropy())
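A fuller, runnable version of that call; resnet34 and the metric are my fill-ins for the elided arguments, and dls is assumed to exist (e.g., from the normalization sketch above):

    from fastai.vision.all import *

    # MixUp blends random pairs of images (and their labels) each batch;
    # alpha parameterizes the Beta distribution the mixing weight is drawn from
    learn = cnn_learner(dls, resnet34,
                        loss_func=LabelSmoothingCrossEntropy(),
                        cbs=MixUp(alpha=0.4),
                        metrics=accuracy)
    learn.fit_one_cycle(5)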
Because it allows the model to optimize predictions toward values between 0 and 1, instead of always forcing them to exactly 0 or 1
It is harder to see what’s in a MixUp image. Also, the model now needs to predict continuous values instead of a clean 1/0
If there are mislabels in your dataset, trying to optimize to hard 1/0 predictions can be harmful. Label smoothing encourages the model to be less confident by keeping labels between 0 and 1: shave a little off the correct class and add a little “epsilon” of probability to every other class. Example for a one-hot-encoded target (with epsilon = 0.4): [0, 1, 0, 0] -> [0.1, 0.7, 0.1, 0.1]
Mislabeled data, or data that is hard to label. I think label smoothing could also give higher fidelity to the data if you start encoding prediction + confidence level, e.g. [0, 1, 0, 0] -> [0, 0.9, 0, 0]. Open question: do you care that all values sum to 1, or just that they are constrained between 0 and 1 (a range)?
The correct class gets 1 - epsilon + epsilon/N and every other class gets epsilon/N, where N = 5. With epsilon = 0.1 (the FA default), that’s 0.92 for the correct class and 0.02 for each of the rest
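A quick sketch that reproduces both examples above (smooth_targets is my own helper, not a fastai function):

    import torch

    def smooth_targets(one_hot, eps=0.1):
        # Correct class gets 1 - eps + eps/N; every other class gets eps/N
        n = one_hot.shape[-1]
        return one_hot * (1 - eps) + eps / n

    print(smooth_targets(torch.tensor([0., 1., 0., 0.]), eps=0.4))
    # tensor([0.1000, 0.7000, 0.1000, 0.1000])
    print(smooth_targets(torch.tensor([0., 1., 0., 0., 0.]), eps=0.1))
    # tensor([0.0200, 0.9200, 0.0200, 0.0200, 0.0200])

In FA you don’t build these targets yourself; passing loss_func=LabelSmoothingCrossEntropy(eps=0.1) does it inside the loss.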
If the dataset is large, find a smaller subset that is representative of the whole. This will let you run experiments much faster. The key is finding the smallest dataset that is still representative and able to discriminate between techniques. For example, ImageNet (1000 classes) -> Imagenette (10 classes)
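FA ships the Imagenette URLs, so grabbing the small proxy dataset is a one-liner:

    from fastai.vision.all import *

    # 160px version: smallest download, fastest experiments
    path = untar_data(URLs.IMAGENETTE_160)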