This book shows machine learning enthusiasts and practitioners how to get the best of both worlds by deriving Fisher kernels from deep learning models. In addition, the book shares insight on how to store and retrieve large-dimensional Fisher vectors using feature selection and compression techniques. Feature selection and feature compression are two of the most popular off-the-shelf methods for reducing data's high-dimensional memory footprint and thus making it suitable for large-scale visual retrieval and classification. Kernel methods long remained the de facto standard for solving large-scale object classification tasks using low-level features, until the revival of deep models in 2006. Later, they made a comeback with improved Fisher vectors in 2010. However, their supremacy was always challenged by various versions of deep models, now considered to be the state of the art for solving various machine learning and computer vision tasks. Although the two research paradigms differ significantly, the excellent performance of Fisher kernels on the Image Net large-scale object classification dataset has caught the attention of numerous kernel practitioners, and many have drawn parallels between the two frameworks for improving the empirical performance on benchmark classification tasks. Exploring concrete examples on different data sets, the book compares the computational and statistical aspects of different dimensionality reduction approaches and identifies metrics to show which approach is superior to the other for Fisher vector encodings. It also provides references to some of the most useful resources that could provide practitioners and machine learning enthusiasts a quick start for learning and implementing a variety of deep learning models and kernel functions.