Build intelligent systems with solid software, ML, and AI engineering practices.
Preface
This sample gives you a feel for the book’s hands‑on, engineering‑first approach. You will see how we pair concise explanations with runnable assets: notebooks for guided exploration and small, composable Python scripts you can reuse. The full book expands this pattern across the entire lifecycle: software engineering habits that make ideas travel well; ML workflows that are reproducible and comparable; and AI engineering that combines retrieval, LLMs, and orchestration into dependable applications.
- Skim the concepts, then run the companion code in the public repository. Each example is small by design: you can understand it in minutes, and adapt it to your own work.
- Prefer clarity over ceremony: we seed randomness, add flags, and print verifiable outputs so you can test and compare runs quickly.
- Choose your path: work locally with a virtual environment, or open notebooks directly in Google Colab using the links provided in the code repo.

- Reliable software practices for data and AI: CLIs, packaging, tests.
- ML in practice: configuration, tracking, pipelines, and deployment.
- AI systems beyond models: retrieval‑augmented generation, agents, scaling, and responsible AI.
1. The Engineering Mindset
From toy scripts to reliable systems: cultivate reproducibility, scalability, and clarity from day one.
- Engineering vs. experimentation: trade-offs and mindsets.
- Reproducibility and reliability as first-class goals.
- Running example: evolving a tiny script toward a service.
- Frame problems with engineering constraints and outcomes.
- Identify minimal process to make work reproducible.
1.1. Why Engineering Patterns Matter
Engineering turns ideas into dependable systems. Instead of “it worked once in my notebook,” we aim for repeatable runs, clear interfaces, and verifiable outcomes. Small, consistent habits—like seeding randomness, adding a CLI wrapper, and writing one tiny test—compound to yield reliability and faster iteration.
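To make the seeding habit concrete, here is a minimal sketch (function name `draws` is ours, not from the book's listing) showing that an isolated, seeded RNG makes runs exactly repeatable:

```python
import random

def draws(seed: int, n: int = 3) -> list[float]:
    """Return n pseudo-random floats from an isolated, seeded RNG."""
    rng = random.Random(seed)  # isolated RNG, no shared global state
    return [rng.random() for _ in range(n)]

# Same seed, same values -- across calls and across processes.
assert draws(42) == draws(42)
# Different seeds diverge.
assert draws(42) != draws(7)
```

Because `random.Random(seed)` is its own instance, the result does not depend on anything else in the program that touches the global `random` module.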
1.2. A Tiny Running Example
We’ll start with a tiny, fully self-contained CLI that computes a simple statistic with a seed for reproducibility. This illustrates the mindset: a minimal interface, clear inputs, deterministic behavior, and a concise output that validates the result. In Chapter 2, we’ll package the logic; in Chapter 3, we’ll preview wrapping it in a small HTTP service.
# Software, ML, and AI Engineering
# (c) Dr. Yves J. Hilpisch
# AI-Powered by GPT-5.
"""A tiny CLI that computes a reproducible mean and stdev.

Run:
    python code/ch01/mini_cli.py --seed 42 --n 5
"""
from __future__ import annotations  # postpone annotations
import argparse  # parse command-line flags
import random  # deterministic RNG with a seed
import statistics as stats  # mean and stdev utilities


def compute_values(seed: int, n: int) -> tuple[list[float], float, float]:
    """Generate n pseudo-random floats deterministically and summarize.

    Args:
        seed: RNG seed for reproducibility.
        n: number of values to generate (> 0).

    Returns:
        A tuple of (values, mean, stdev).
    """
    rng = random.Random(seed)  # create an isolated RNG
    values = [rng.random() for _ in range(n)]  # n floats in [0, 1)
    mu = stats.fmean(values)  # compute mean (float)
    # Use population stdev for a tiny demo; either is fine when stated.
    sigma = stats.pstdev(values)  # population standard deviation
    return values, mu, sigma  # return both raw values and summary


def build_parser() -> argparse.ArgumentParser:
    """Create the CLI argument parser."""
    p = argparse.ArgumentParser(
        description="Compute a reproducible mean and stdev."
    )
    p.add_argument(
        "--seed", type=int, default=123, help="RNG seed (int)."
    )
    p.add_argument(
        "--n",
        type=int,
        default=5,
        help="Number of values to generate (>0).",
    )
    return p  # return configured parser


def main() -> None:
    """Entry point for the CLI."""
    args = build_parser().parse_args()  # parse flags from sys.argv
    if args.n <= 0:  # validate the input
        raise SystemExit("--n must be > 0")  # exit with a clear error
    values, mu, sigma = compute_values(args.seed, args.n)  # run computation
    # Print a concise, verifiable summary (copy-pastable into tests if needed).
    print(
        f"seed={args.seed} n={args.n} mean={mu:.6f} stdev={sigma:.6f}"
    )  # user-facing output


if __name__ == "__main__":  # only run when executed as a script
    main()  # invoke the CLI entry
$ # Run the CLI twice with different seeds
$ python code/ch01/mini_cli.py --seed 42 --n 5
$ python code/ch01/mini_cli.py --seed 1 --n 5
1.3. From Toy Script to System — A Roadmap
- Script → CLI: add a parser, explicit inputs, and a single-line output.
- CLI → Package: move the core function under src/ and write one unit test.
- Package → Service: expose the function via a tiny HTTP endpoint.
- Service → Container: ship with a Dockerfile, pin versions, and echo config.
- Container → Observability: add basic logs, a health check, and one metric.
1.5. Exercises
1.5.1. Reproducible CLI (warm-up)
Run the listing above with two different seeds and compare outputs.
$ python code/ch01/mini_cli.py --seed 1 --n 5
$ python code/ch01/mini_cli.py --seed 2 --n 5
- Goal: see deterministic results per seed; confirm the output format seed=… n=… mean=… stdev=….
- Deliverable: a short note (2–3 sentences) explaining what stays the same and what changes across runs, and why.
1.5.2. Minimal test (determinism)
Create code/ch01/test_mini_cli.py that imports compute_values and asserts that for a fixed seed the returned (mu, sigma) stay constant across two calls.
- Hint: use assert mu1 == mu2 and sigma1 == sigma2 for the same (seed, n).
- Deliverable: one test function file. This sets up Chapter 2's packaging and CI topics.
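One possible shape for such a test (a sketch, not the book's solution): in your file you would import compute_values from mini_cli, but here the same logic is inlined so the sketch stands alone.

```python
import random
import statistics as stats

def compute_values(seed: int, n: int):
    """Inline stand-in for mini_cli.compute_values (same logic)."""
    rng = random.Random(seed)
    values = [rng.random() for _ in range(n)]
    return values, stats.fmean(values), stats.pstdev(values)

def test_compute_values_is_deterministic():
    # Two independent calls with the same (seed, n) must agree exactly.
    _, mu1, sigma1 = compute_values(42, 5)
    _, mu2, sigma2 = compute_values(42, 5)
    assert mu1 == mu2 and sigma1 == sigma2
```

pytest discovers any function named test_* automatically, so `pytest code/ch01/` would run this without further wiring.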
1.5.3. Improve user feedback (logging flag)
Add --verbose to the CLI to print a one-line configuration echo (seed, n). Keep the default output unchanged unless --verbose is supplied.
- Guideline: briefly echo configuration at startup to aid debugging.
- Deliverable: show two runs (with/without --verbose) and the difference in output.
1.5.4. Input validation (UX)
Make the error for invalid --n more helpful (e.g., --n must be > 0; got {value}). Add a second check that rejects nonsensical seeds (e.g., non-integer if you extend the parser).
- Guideline: fail fast with clear messages; avoid stack traces for user mistakes.
- Deliverable: commands that demonstrate each validation message.
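One idiomatic way to get clear messages without stack traces is a custom argparse type function; the name positive_int below is ours, not from the listing:

```python
import argparse

def positive_int(text: str) -> int:
    """argparse type: accept only integers strictly greater than zero."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected an integer; got {text!r}")
    if value <= 0:
        raise argparse.ArgumentTypeError(f"must be > 0; got {value}")
    return value

parser = argparse.ArgumentParser()
parser.add_argument("--n", type=positive_int, default=5)

print(parser.parse_args(["--n", "3"]).n)  # valid input parses normally: 3
# parser.parse_args(["--n", "0"]) exits with a short usage error, e.g.:
#   error: argument --n: must be > 0; got 0
```

argparse turns ArgumentTypeError into a one-line usage error and a non-zero exit code, so the user never sees a traceback.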
1.5.5. Packaging preview (bridge to Chapter 2)
Move compute_values into a src/ package (e.g., src/mymltool/core.py) and adapt the CLI to import it.
- Guideline: keep one entry point (main()) and one pure function (compute_values).
- Deliverable: a short note describing the new layout and how to run it.
- Reference: see Chapter 2 for full packaging and tests.
1.6. Where We’re Heading Next
We ground these ideas with concrete tools and layout in Chapter 2.
1.7. Further Sources
- argparse — standard command-line parsing in Python; build small, robust CLIs.
- PEP 8 — the canonical Python style guide; helps write consistent, readable code.
- pytest — lightweight Python testing; write small tests and run them quickly.
- The Twelve‑Factor App — guidelines for building reliable, portable, and observable services.