[[neural network]] [[gradient descent]] [[keras]]

- [[feed forward neural net]]
- [[recurrent neural network]]
- [[convolutional neural network]]
- [[transformer]]
- [[generative adversarial network]]
- [[diffusion model]]
- [[autoencoder]]

## dropout

Introduced by Srivastava et al. in the paper "Dropout: A Simple Way to Prevent Neural Networks from Overfitting" (co-authors include Geoffrey Hinton, Alex Krizhevsky and Ilya Sutskever). Randomly dropping units (and their connections) during training prevents units from co-adapting, which reduces overfitting; the Keras sketch at the end of the early stopping section below includes a `Dropout` layer.

## early stopping

The hyperparameter `patience` sets how many consecutive epochs without improvement in the monitored validation metric (e.g. validation accuracy) are tolerated before training stops early. Too much patience is inefficient, but too little may halt training at a local minimum rather than a better solution it could still reach. When training stops, the weights are rolled back to those of the best epoch, so previous model weights must be saved to allow this rollback (this can be non-trivial for very large models!).
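A minimal Keras sketch combining both ideas. The toy random data, the two-layer classifier, the dropout rate of 0.5, `patience=5`, and monitoring `val_loss` are all illustrative assumptions, not values from this note or the paper:

```python
import numpy as np
from tensorflow import keras

# Hypothetical toy data standing in for a real train/validation split.
x_train, y_train = np.random.rand(800, 20), np.random.randint(0, 2, 800)
x_val, y_val = np.random.rand(200, 20), np.random.randint(0, 2, 200)

model = keras.Sequential([
    keras.Input(shape=(20,)),
    keras.layers.Dense(32, activation="relu"),
    keras.layers.Dropout(0.5),  # randomly zero 50% of units each training step
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy",
              metrics=["accuracy"])

early_stop = keras.callbacks.EarlyStopping(
    monitor="val_loss",         # watch validation loss for improvement
    patience=5,                 # tolerate 5 epochs without improvement
    restore_best_weights=True,  # roll back to the best epoch's weights
)

model.fit(x_train, y_train,
          validation_data=(x_val, y_val),
          epochs=100,
          callbacks=[early_stop])
```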
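With `restore_best_weights=True`, Keras keeps the best weights in memory; for very large models, a `ModelCheckpoint` callback that writes the best weights to disk is the usual alternative.

## learning rate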