MOST COMMON MISTAKES IN MACHINE LEARNING AND HOW TO AVOID THEM
Welcome
About the Author
Dedication
Protecting the Neural Tree
Preface
Supplemental Material
Conventions
Acknowledgements
Introduction
Terminology
1 Not understanding the data
2 Reporting train performance
3 Not setting a seed value
4 Including irrelevant features
5 Ignoring differences in scales
6 Using the test set for fine-tuning
7 Only reporting accuracy
8 Not comparing against a baseline
9 Not accounting for variance
10 Injecting data into the test set
11 Not shuffling the training data
12 Not saving the results
13 Not parallelizing
14 Encoding categories as integers
15 Forgetting that data changes over time
16 Ignoring inter-user variance
17 Wasting unlabeled data
Appendix
A Set Up Your Environment
B Datasets
B.1 CALIFORNIA-HOUSING
B.2 DIAGNOSTIC
B.3 DIGITS
B.4 IRIS
B.5 WINE
B.6 WISDM
Citing this Book
References
Dedication
To My Family, who have put up with me despite my continuous mistakes.