With the resurgence of neural networks in the 2010s, deep learning has become essential for machine learning practitioners and even many software engineers. This book provides a comprehensive introduction for data scientists and software engineers with machine learning experience. You'll start with deep learning basics and move quickly to the details of important advanced architectures, implementing everything from scratch along the way.
Author Seth Weidman shows you how neural networks work using a first principles approach. You'll learn how to apply multilayer neural networks, convolutional neural networks, and recurrent neural networks from the ground up. With a thorough understanding of how neural networks work mathematically, computationally, and conceptually, you'll be set up for success on all future deep learning projects.
This book provides:
Extremely clear and thorough mental models, accompanied by working code examples and mathematical explanations, for understanding neural networks
Methods for implementing multilayer neural networks from scratch, using an easy-to-understand object-oriented framework
Working implementations and clear-cut explanations of convolutional and recurrent neural networks
Implementation of these neural network concepts using the popular PyTorch framework
ASIN : 1492041416
Publisher : O’Reilly Media
Publication date : October 15, 2019
Edition : 1st
Language : English
Print length : 250 pages
ISBN-10 : 1492041416
ISBN-13 : 978-1492041412
Item Weight : 2.31 pounds
Dimensions : 7 x 0.5 x 9 inches
Best Sellers Rank: #188,113 in Books; #49 in Computer Neural Networks; #99 in Python Programming; #331 in Artificial Intelligence & Semantics
Customer Reviews: 4.2 out of 5 stars (191)
13 reviews for Deep Learning from Scratch: Building with Python from First Principles
Original price: $65.99. Current price: $50.16.


Kindle Customer –
Excellent book written by an educator, but definitely check the github…
This book steps you through coding layers, backpropagation, optimization, and the trainer for the neural net; all the little bits and pieces that you'll want to know in depth. It is well organized and clearly explained. The author comes back to frame ideas as computational graphs over and over, and that provides an anchor for the reader as concepts become more complex. You will apply your code to model data sets, many of them the "usual suspects" that everyone should know.

You will want to check the author's github for this book. In some later chapters, parts of the code are omitted from print in the book, but are present in the github repo. There are also some minor corrections and updates that have been made to the repo since the book was printed. This was an excellent resource for me.
zuluwhiskey –
Not bad
I personally disliked the layout of this book, and I think some concepts could have been handled better. But I do have A LOT of highlighting in some chapters; the explanations and examples were very good. If you aren't really comfortable with object-oriented programming, you might want to look for another book.
O –
Code examples are very intuitive!
- Concepts are explained very clearly
- An explanation of how Jacobian tensors are used to compute partial derivatives of matrix transformations (best placed in the Appendix) would clarify how the back-propagation equations in the book were obtained.
* e.g. Computing full Jacobians in our code would use up significant computer resources; therefore, rather than computing them for back-prop, we use only their essential values (hence the partial derivatives in the code examples)
* Jacobians make the concept of backprop intuitive to understand
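To make that point concrete, here is a minimal sketch (my own illustration, not code from the book) of backpropagating through a matrix multiplication: the full Jacobian is never materialized; backprop computes the vector-Jacobian product directly, which for a linear transformation collapses to two matrix multiplications.

```python
import numpy as np

# Forward pass: a linear transformation Y = X @ W
rng = np.random.default_rng(0)
X = rng.standard_normal((32, 10))   # batch of 32 inputs
W = rng.standard_normal((10, 5))    # weights
Y = X @ W

# Suppose dL_dY is the upstream gradient of the loss with respect to Y.
dL_dY = rng.standard_normal(Y.shape)

# The full Jacobian dY/dW would have shape (32, 5, 10, 5), far too big
# to materialize for real layers. Backprop instead computes the
# vector-Jacobian product directly, which reduces to two matmuls:
dL_dW = X.T @ dL_dY       # gradient with respect to the weights
dL_dX = dL_dY @ W.T       # gradient with respect to the inputs

print(dL_dW.shape, dL_dX.shape)  # (10, 5) (32, 10)
```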
kowman –
This book will teach you how to code neural networks – MLPs, ConvNets, RNNs with LSTM and GRU cells!
This book is the one book I have found that actually teaches how to code these networks from scratch. The vast majority of other books are simply theoretical in nature, or use a toolkit like Theano, TensorFlow, or PyTorch, which gives little understanding of how neural networks actually work. It is one thing to read the paper by Hochreiter on LSTM cells but another to have a complete code implementation. Furthermore, knowledge of precisely how the networks work is necessary to design new types of architectures.
Amazonic customer –
Great
Very informative
Nate –
Miserably Written. You do the Author’s Job for Them.
In the first chapter, the promise to include all code is already broken: several functions are called that are not written down anywhere. So if you want to follow along, you need to reverse engineer a lot of what the author has done. Not sure why they thought this was acceptable.

Edit after more reading: The notation is a disaster. Derivations are left unfinished even in the appendix. Meaningless diagrams. Abuse of the notation the author makes up (without warning!). Major parts of the code are just left out. The author is a maniac and I'm convinced the positive reviews are fake. Don't bother with this book, it's horrible. The author seems to know their stuff, but they left about half the book out.

Information is presented out of order. Critical information is relegated to a footnote and motivated long after it should be. The math and code, while on their own easy enough to read, seem to come from nowhere, as the author just presents in a stream-of-thought manner. There does not seem to have been an editor, as there is no other way a text could come out this manic.
calvinnme –
The content is very instructive, the printed book is AWFUL
I think some other reviewer said the same thing I am about to say, but sometimes reinforcement in a review is helpful. This book will teach you deep learning in the Python language. It has code, exposition, and figures. And the figures are why I give it 3 stars: they are printed in black and white. If you look at the actual book as it was intended to be, on O'Reilly's website, you see those figures are in color and far more instructive than the black-and-white images the printed book contains. Bravo to the author, boo to the publisher. Maybe if enough of us complain, they will do this and other O'Reilly books justice and start printing images in color again, even if we have to pay a little bit more.
TSR –
Could be better. Certainly a 2nd Ed is warranted
Some great points to be noted: an illustrative approach and lots of code. I like that, less talk and more code. Just getting started on it.

But in the math intro, why does the author not use lim h→0 (f(a+h) − f(a))/h, and instead use lim h→0 (f(a+h) − f(a−h))/2h, as the mathematically precise definition of the derivative? It's an odd choice when the norm for introducing the derivative is the former. I mean, I'm no math major, but isn't the latter used when f(x) is symmetric about the point a, i.e. f(a+h) = f(a−h)? It's such a restrictive definition. Anyway, I would like to be corrected. The code examples are cool.
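For what it's worth, the centered form is not restrictive: it is the standard central-difference approximation, whose error shrinks like O(h²) versus O(h) for the one-sided form, so it is generally the more accurate choice when approximating derivatives numerically. A quick sketch (my own, not code from the book) comparing the two:

```python
import numpy as np

def forward_diff(f, a, h=1e-5):
    # One-sided (forward) difference: error is O(h)
    return (f(a + h) - f(a)) / h

def central_diff(f, a, h=1e-5):
    # Centered difference: error is O(h**2), so it is usually
    # far more accurate for the same step size h
    return (f(a + h) - f(a - h)) / (2 * h)

f = np.sin          # true derivative at a is cos(a)
a = 1.0
true_value = np.cos(a)

print(abs(forward_diff(f, a) - true_value))   # roughly 4e-06
print(abs(central_diff(f, a) - true_value))   # roughly 1e-11
```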
Eli Emanuel –
I found this to be a great book that covers the mathematics, the Python code, and diagram illustrations of what would otherwise seem to be abstruse concepts. In just seven short chapters, the author manages to:

1) Instruct the reader about the mathematics involved in deep learning in a clear, concise, and comprehensive manner.
2) Expound on the concepts and theories involved in neural networks and deep learning models through Python code and visual aids such as diagrams.
3) Illustrate how to build neural networks and deep learning models from scratch.

Chapter 1) Foundations: Chapter one touches on calculus (chain rule, derivatives), linear algebra (vectors, matrices, and operations), and nested functions. The author works through an example and elucidates the mathematics involved, demonstrates the code line by line, and gives visual aids such as diagrams to illustrate the process flow of the concepts.

Chapter 2) Fundamentals: Chapter two is an extension of the mathematical principles, deep learning concepts, and explanations of chapter one. Here the reader is exposed to building a traditional linear regression model from scratch, as well as building a neural network. The author gives sufficient explanation of why the neural network model is able to give a more accurate prediction of the 'target' from the 'features'.

Chapter 3) Deep Learning from Scratch: In chapter three, the reader gets to learn about 'layers', 'operations', and 'classes'. Towards the end of the chapter, the reader sees various aspects of the deep learning process integrated into a deep learning example built from scratch.

Chapter 4) Extensions: Chapter four starts off with the 'loss function'; the chapter also covers activation functions other than 'Sigmoid' and explains why these activation functions might accelerate learning. Next, the chapter covers 'momentum' and illustrates that momentum is the most important extension of the stochastic gradient descent optimization technique. The chapter then briefly discusses three essential techniques, namely: (1) learning rate decay, (2) weight initialization, and (3) dropout. The reader will learn how each of these techniques enables the neural network to find successively more optimal solutions. The concepts in this chapter are elucidated by breaking them down into three aspects: (1) Math, (2) Intuition, and (3) Code.

Chapter 5) Convolutional Neural Networks: In chapter five, the reader will learn about 'convolution operations' and 'feature maps'. The author elaborates on the processes involved in the 'multichannel convolution operation', explaining that feature maps are each set of features detected by a particular set of weights. In another section, the author further elaborates that "feature maps is a linear combination of convolving m1 different filters over that same location in each of the corresponding m1 feature maps from the prior layer." The author also expounds on 'convolutional filters' and 'convolutional layers', and illustrates '1D convolutions with batches: forward pass & backward pass' before moving on to 2D convolutions and the code involved in '2D convolutions: forward pass & backward pass'.
The author then instructs that three Python functions, '_output', '_input_grad', and '_param_grad', are needed to create a 'Conv2DOperation', which forms the core of the 'Conv2DLayers' used in the CNNs illustrated in the book.

Chapter 6) Recurrent Neural Networks: In chapter six, the reader will learn that 'Recurrent Neural Networks' (RNNs) are a class of neural network architectures meant for handling sequences of data, designed to take in such sequences and return a correct prediction as output. In this chapter, the author uncovers a key limitation of the framework the book has been using so far, namely 'Handling Branching'. The author touches on 'Automatic Differentiation' and shows how 'Gradient Accumulation' works in Python code. Later in the chapter, the author reintroduces RNNs, illustrating the 'First Class for RNNs: RNNLayer' and the 'Second Class for RNNs: RNNNodes', and then puts the two classes together. The RNN is implemented in code, and the forward and backward methods are expounded upon. In the next part of the chapter, the author discusses 'Vanilla RNNNodes' and their limitations, and illustrates two advanced variants of the vanilla RNN, namely: (1) 'Gated Recurrent Units' (GRU) and (2) 'Long Short Term Memory' (LSTM). Chapter six has no lack of visual illustrations, flowcharts, and diagrams on the workings of RNNs.

Chapter 7) PyTorch: In chapter seven, the author mainly covers PyTorch, starting off with 'Tensors' and deep learning with PyTorch, then the elements of PyTorch, namely: model, layer, optimizer, and loss. The author then goes through examples of neural networks implemented with PyTorch. Next, the author touches upon 'Convolutional Neural Networks' (CNNs) and 'Long Short Term Memory' (LSTM) using PyTorch. In the later parts of the chapter, the author illustrates 'Unsupervised Learning', with autoencoders used as an example.

Appendix A) Deep Dive: In this section, the author fills in the gaps in the explanations of mathematical concepts illustrated in the book, such as the 'Matrix Chain Rule'. The author supplements the explanation of the 'Gradient of the Loss with Respect to the Bias Terms', and also adds a section on how to implement 'Convolutions via Matrix Multiplication' efficiently in NumPy.

There is no lack of mathematical illustration, diagrams of concepts and the process flow of neural networks and deep learning, or lines of code intended to implement deep learning from scratch in this book. The book is readily comprehensible for the lay person, and I would recommend it to anyone interested in deep learning and neural networks. Also, check the website for this book as well as the author's GitHub page, and try to implement the code, modifying it a little to supplement your understanding of the whole deep learning process. Overall, although this book is short in length and a lot of concepts could be further illustrated, I would rate this book a 5/5.
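To give a flavor of the object-oriented pattern these reviews describe, here is a minimal sketch of an Operation base class with a forward and a backward pass. The method names '_output' and '_input_grad' follow those mentioned above, but the exact class layout is my assumption, not the book's verbatim code:

```python
import numpy as np

class Operation:
    """Base class: forward computes and caches the output;
    backward routes the upstream gradient through the local derivative.
    A sketch of the pattern only; the book's design may differ."""

    def forward(self, input_):
        self.input_ = input_
        self.output = self._output()
        return self.output

    def backward(self, output_grad):
        return self._input_grad(output_grad)

    def _output(self):
        raise NotImplementedError

    def _input_grad(self, output_grad):
        raise NotImplementedError

class Sigmoid(Operation):
    def _output(self):
        return 1.0 / (1.0 + np.exp(-self.input_))

    def _input_grad(self, output_grad):
        # Chain rule: upstream gradient times the local derivative
        return output_grad * self.output * (1.0 - self.output)

# Usage: forward through the operation, then backward
op = Sigmoid()
y = op.forward(np.array([[0.5, -1.0]]))
dx = op.backward(np.ones_like(y))
```

Layers are then built by chaining such operations, with each backward call applying the chain rule to the gradient as it flows back through the network.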
MB –
Very disappointing, overly simplistic, and not worth the money
Lara Rosenberg –
I was looking for a book that would explain the mathematics of neural networks, and this book covers that even if you've never taken Calculus or Linear Algebra before. However, if you have, you might want to skip the first part of the book (but do check out the code included in the repo!). Explanations are very clear and a lot of care is put into describing concepts in different ways. The only downside I've found is that the mathematical notation is a bit flimsy. Overall, a pretty good book!
quidam –
It's a cookbook by someone who has just a bachelor of science in mathematics.
Shlok Dave –
This is how each and every complex topic must be learnt. An excellent and unique book. A perfect foundation for a subject as vast and complex as NN.