diff --git a/chapter_linear-classification/generalization-classification.md b/chapter_linear-classification/generalization-classification.md
index 2cddbce53f..a51214580a 100644
--- a/chapter_linear-classification/generalization-classification.md
+++ b/chapter_linear-classification/generalization-classification.md
@@ -436,7 +436,7 @@
 On the other hand, a fixed classifier is useless---it generalizes
 perfectly, but fits neither the training data nor the test data.
 The central question of learning
-has thus historically been framed as a tradeoff
+has thus historically been framed as a trade-off
 between more flexible (higher variance) model classes
 that better fit the training data but risk overfitting,
 versus more rigid (higher bias) model classes
diff --git a/chapter_linear-regression/weight-decay.md b/chapter_linear-regression/weight-decay.md
index ea454c9e2e..6af653a7a5 100644
--- a/chapter_linear-regression/weight-decay.md
+++ b/chapter_linear-regression/weight-decay.md
@@ -137,7 +137,7 @@
 To penalize the size of the weight vector,
 we must somehow add $\| \mathbf{w} \|^2$ to the loss function,
 but how should the model trade off the standard loss for this new additive penalty?
-In practice, we characterize this tradeoff
+In practice, we characterize this trade-off
 via the *regularization constant* $\lambda$,
 a nonnegative hyperparameter
 that we fit using validation data:
diff --git a/chapter_recurrent-modern/beam-search.md b/chapter_recurrent-modern/beam-search.md
index 2fa951c8f9..3d36e49af9 100644
--- a/chapter_recurrent-modern/beam-search.md
+++ b/chapter_recurrent-modern/beam-search.md
@@ -240,7 +240,7 @@
 arising when the beam size is set to 1.
 Sequence searching strategies include
 greedy search, exhaustive search, and beam search.
-Beam search provides a tradeoff between accuracy and
+Beam search provides a trade-off between accuracy and
 computational cost
 via the flexible choice
 of the beam size.