This sample provides the Preface and Chapter 1. It outlines the book’s approach and objectives, then walks through a minimal ML example to give you a feel for the style: compact math, runnable code, and clear visuals.

Preface

Deep learning now powers search, recommendations, translation, generative media, and large language models. This book gives a clear, code‑first path to understanding and applying these ideas with PyTorch, focusing on practical workflows, compact math, and visuals that build intuition.

What This Book Aims to Do

  • Make core ideas tangible with runnable code you can adapt.

  • Keep math concise: just enough to reason about shapes, derivatives, and objectives.

  • Provide visual feedback at every step: boundaries, loss curves, feature maps.

  • Teach good habits for data hygiene, evaluation, and reliable training.

Who It’s For

Python programmers who want a grounded introduction to modern deep learning with PyTorch. If you’re comfortable with basic Python and NumPy, you’re ready. A short refresher is included in the appendices of the full book.

How the Book Is Organized

The full book progresses in five parts, followed by practical appendices:

  • Part I — Foundations of Machine Learning: the classic data → features → model → metrics loop and where traditional approaches strain.

  • Part II — Neural Networks and PyTorch Basics: tensors, autograd, layers, activations, and clean training loops with nn.Module.

  • Part III — Supervised Deep Learning in Practice: input pipelines, regularization, learning‑rate schedules, CNNs, and robust training at scale.

  • Part IV — Toward Large Language Models: sequences and embeddings, attention and transformers, and reliable large‑model training (DDP, AMP, checkpoints).

  • Part V — Broader Context and Next Steps: risks, documentation, governance, and a practical learning path with projects and readings.

Appendices provide quick references for Python/NumPy, probability, linear algebra, calculus, installation and environment, full scripts, notebooks, and a glossary.

Why These Topics Matter

  • Representation learning: modern models learn features directly from raw data, reducing fragile hand‑crafted pipelines.

  • Scale and transfer: performance improves with data and compute; pretrained models can be adapted efficiently.

  • Reliability: clean input pipelines, stable training loops, and thoughtful evaluation make models trustworthy and reproducible.

  • Responsibility: understanding risks, documentation, and safeguards is essential when models affect people and decisions.

How to Use This Sample

This sample includes the Preface and the first chapter. Skim the Preface to understand the approach and scope. Then read Chapter 1 for a compact, hands‑on introduction to the ML workflow. The full book continues step‑by‑step from these foundations to working neural networks and transformers.

Enjoy the journey—and build something you can show.

1. Introduction to Machine Learning

What machine learning is, how it’s used, and where deep learning fits. Think of it as a chef refining a recipe: cook a batch, taste it, tweak the seasoning (the parameters), and repeat until it’s just right.

You’ll Need
  • Hardware: a computer with internet (no GPU required).

  • Software: Python 3.10+, NumPy, scikit-learn, Matplotlib; a terminal and an editor.

  • Alternative: use the Colab-ready notebook to skip local setup.

  • Data: none.

  • Validate: python -m code.env_check prints Python + device; python code/hello_world.py says Hello.

  • If stuck: see Section 1.5.

At a Glance
  • Definitions: the classic formulations from Samuel and Mitchell

  • Categories: supervised, unsupervised, reinforcement

  • Workflow: data → features → model → training → evaluation

  • Context: why deep learning became dominant (representation learning at scale)

You’ll Learn
  • Orient yourself in the ML landscape and vocabulary.

  • Walk through the standard workflow using a tiny regression.

  • Read and question metrics; avoid common beginner pitfalls.

1.1. What Is Machine Learning?

“Machine learning is the field of study that gives computers the ability to learn without being explicitly programmed.” — Arthur Samuel (1959)

“A computer program is said to learn from experience \(E\) with respect to some class of tasks \(T\) and performance measure \(P\) if its performance at tasks in \(T\), as measured by \(P\), improves with experience \(E\).” — Tom M. Mitchell (1997)

Put simply: ML uses data to tune a model’s parameters so it gets better at a task. We define what “better” means with a metric (performance measure) and we supply experience as data. Like tasting and retasting a sauce, we evaluate, adjust, and converge on something delicious (useful).
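
To make Mitchell’s definition concrete, here is a minimal sketch with invented toy numbers: the observed data play the role of experience \(E\), predicting \(y\) from \(x\) with a line is the task \(T\), and mean squared error is the performance measure \(P\). Learning means adjusting the parameters \(w\) and \(b\) so that this number goes down.

import numpy as np

# Experience E: observed examples
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, 2.1, 2.9])

# Task T: predict y from x with a line y_hat = w*x + b (parameters w, b)
w, b = 1.0, 0.0
y_hat = w * x + b

# Performance measure P: mean squared error -- lower is better
print(np.mean((y - y_hat) ** 2))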

1.2. Types of ML

  • Supervised learning: learn from labeled examples \((x, y)\) (e.g., predict house prices). You provide both the picture and its caption; the model learns to write the caption.

  • Unsupervised learning: find structure in unlabeled \(x\) (e.g., cluster customers). You give a basket of mixed fruit and ask the model to sort by similarity.

  • Reinforcement learning: learn to act via trial-and-error with rewards (e.g., game playing). The model is an agent in an environment learning by consequences: explore, get treats or bumps, adjust policy.

Most of this book focuses on supervised learning because it provides the clearest path to fundamentals you’ll reuse everywhere; the short sketch below contrasts the supervised and unsupervised cases in code.
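
A minimal scikit-learn sketch of the first two categories (the toy data and cluster count are invented for illustration):

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.cluster import KMeans

X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])

# Supervised: labels y are provided, and the model learns the x -> y mapping
y = np.array([0.0, 1.1, 1.9, 10.2, 10.9, 12.1])
reg = LinearRegression().fit(X, y)
print(reg.predict([[5.0]]))   # predict a label for a new, unseen input

# Unsupervised: no labels; the model groups the inputs by similarity
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(km.labels_)             # cluster assignment for each input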

1.3. The ML Workflow

At a high level, the steps repeat across tasks and models:

Frame the problem

Define inputs, outputs, and evaluation. What are we predicting? How will we know we succeeded? Write down your target metric and a “good enough” threshold before training.

Prepare data

Split data into train/validation/test; create features and targets; keep a held-out test set untouched until the end. Treat data hygiene like kitchen hygiene: clean tools, clear labels, and no cross‑contamination between train and test.

Choose a model

Start simple (linear/logistic regression) before fancier models. Simpler baselines are like a flashlight in a dark cave — they keep you oriented.

Train

Fit the model on the training data and tune hyperparameters on the validation data.

Evaluate and iterate

Measure on validation/test; analyze errors; refine features/model. Don’t chase a single number; also look at error cases and residual plots.
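
Putting the steps together on synthetic data, the loop looks roughly like the sketch below (the dataset, metric threshold, and names are invented for illustration; a separate validation split for hyperparameter tuning is omitted for brevity):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error

# Frame: predict y from x; call it good enough if test MAE is below 0.5
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(200, 1))
y = 2.0 * X.ravel() + 1.0 + rng.normal(0, 0.3, size=200)

# Prepare: hold out a test set and leave it untouched while iterating
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Choose + train: start with a simple baseline
model = LinearRegression().fit(X_train, y_train)

# Evaluate: measure on held-out data, then inspect errors before iterating
mae = mean_absolute_error(y_test, model.predict(X_test))
print(f"test MAE: {mae:.3f}")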

1.4. A Minimal Example: Linear Regression

In [1]: import numpy as np
In [2]: from sklearn.linear_model import LinearRegression

In [3]: X = np.array([[0.0],[1.0],[2.0],[3.0]])
In [4]: y = np.array([0.0, 1.0, 2.1, 2.9])
In [5]: model = LinearRegression().fit(X, y)
In [6]: model.coef_.ravel().round(3).tolist(), round(float(model.intercept_), 3)
Out[6]: ([0.98], 0.03)

Let’s unpack what happened:

  • We constructed toy inputs X (a column vector) and targets y.

  • LinearRegression().fit(...) finds the best slope and intercept in the least-squares sense.

  • Printing coefficients shows the learned line; you can predict with model.predict.

In [7]: pred = model.predict(X)
In [8]: list(zip(np.round(X.ravel(), 1).tolist(), np.round(y, 2).tolist(), np.round(pred, 2).tolist()))
Out[8]: [(0.0, 0.0, 0.03), (1.0, 1.0, 1.01), (2.0, 2.1, 1.99), (3.0, 2.9, 2.97)]
In [9]: # Optional: visualize the fit
   ...: import matplotlib.pyplot as plt; plt.style.use('seaborn-v0_8')
   ...: xx = np.linspace(0, 3, 50).reshape(-1, 1)
   ...: plt.figure(figsize=(4,3))
   ...: plt.scatter(X, y, label='data')
   ...: plt.plot(xx, model.predict(xx), 'r-', label='fit')
   ...: plt.xlabel('x'); plt.ylabel('y'); plt.legend(); plt.tight_layout(); plt.show()
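
As a quick sanity check, the slope and intercept can be recomputed from the closed-form least-squares formulas. The snippet below continues the session, as an illustrative aside rather than part of the companion scripts:

In [10]: x = X.ravel()
    ...: slope = float(((x - x.mean()) * (y - y.mean())).sum() / ((x - x.mean()) ** 2).sum())
    ...: intercept = float(y.mean() - slope * x.mean())
    ...: round(slope, 2), round(intercept, 2)
Out[10]: (0.98, 0.03)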

Tip: For first experiments, prefer tiny, transparent datasets. You should be able to reason about the expected outcome before you run the model.

See scripts and figure generators in Appendix F — Chapter 1 Scripts, and the companion Colab-ready notebook.

Figure 1. Linear regression on a toy dataset (scatter of the toy points with the fitted line)

The fitted line tracks the trend of the tiny dataset; we’ll formalize error measures (MAE/MSE/R²) in Chapter 3.

1.5. First Run (Environment Check)

Before you dive deeper, verify your environment. From the project root:

python -m code.env_check
python code/hello_world.py
python code/ch01/minimal_regression_sklearn.py

Expected:

  • env_check prints your Python version and, if PyTorch is installed, basic device info (CPU or CUDA). It’s robust: if PyTorch isn’t present yet, it simply reports CPU; a minimal sketch of such a check follows this list.

  • hello_world.py prints a friendly greeting, proving Python can run project scripts.

  • The minimal regression script prints a coefficient and intercept, then predictions.
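
For orientation, a minimal check along these lines might look like the following; the actual code/env_check.py may differ in detail:

import platform

print(f"Python {platform.python_version()}")

try:
    import torch  # optional: report device info only if PyTorch is installed
    device = "cuda" if torch.cuda.is_available() else "cpu"
    print(f"PyTorch {torch.__version__}, device: {device}")
except ImportError:
    print("PyTorch not installed yet; running on CPU")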

1.6. Sanity Box

Quick Checks
  • Does the sign/magnitude of the learned slope make intuitive sense?

  • If you duplicate a training row, do the results change much? (They shouldn’t; a quick sketch follows this list.)

  • If you shuffle the training rows (keeping each input with its target), is the result stable? If not, the data is too small or the model too sensitive.
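
A quick sketch of the duplicate-row and shuffle checks, reusing the toy data from Section 1.4 (illustrative only, not part of the companion scripts):

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data from Section 1.4
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([0.0, 1.0, 2.1, 2.9])
print(LinearRegression().fit(X, y).coef_)            # baseline slope

# Duplicate one training row: the slope should barely move
X_dup, y_dup = np.vstack([X, X[:1]]), np.concatenate([y, y[:1]])
print(LinearRegression().fit(X_dup, y_dup).coef_)

# Shuffle the rows (keeping each x with its y): the fit is unchanged
idx = np.random.default_rng(0).permutation(len(X))
print(LinearRegression().fit(X[idx], y[idx]).coef_)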

1.7. Common Pitfalls

  • Using the test set for tuning (leaks signal; keep it for the very end).

  • Comparing models with different target scalings (normalize consistently; a sketch follows this list).

  • Trusting a single metric without context (inspect residuals and errors).
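
One habit that guards against both leakage and inconsistent scaling: fit any scaler on the training split only and reuse it unchanged everywhere else. A sketch with invented data (feature scaling shown; the same discipline applies to target scaling):

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
X_train, X_test = train_test_split(X, test_size=0.2, random_state=0)

# Fit the scaler on the training split only...
scaler = StandardScaler().fit(X_train)

# ...then apply the same transformation to validation/test data.
# Fitting on test data (or on all data) would leak test-set statistics.
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)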

1.8. Exercises

  • Recreate the example with your own X, y pairs and verify the line.

  • Add noise to y and see how the intercept/slope change.

  • Split data into train/test and report mean absolute error on test.

1.9. Where We’re Heading Next

We’ll build up from linear models to neural networks, explaining tensors, autograd, and the PyTorch way.