Hands-On Machine Learning with R

Boehmke, Brad; Greenwell, Brandon M.

Taylor & Francis Ltd

11/2019

484

Hardcover

English

9781138495685

15 to 20 days

938

Description not available.
I FUNDAMENTALS
1. Introduction to Machine Learning 1.1 Supervised learning 1.1.1 Regression problems 1.1.2 Classification problems 1.2 Unsupervised learning 1.3 Roadmap 1.4 The data sets
2. Modeling Process 2.1 Prerequisites 2.2 Data splitting 2.2.1 Simple random sampling 2.2.2 Stratified sampling 2.2.3 Class imbalances 2.3 Creating models in R 2.3.1 Many formula interfaces 2.3.2 Many engines 2.4 Resampling methods 2.4.1 k-fold cross validation 2.4.2 Bootstrapping 2.4.3 Alternatives 2.5 Bias variance trade-off 2.5.1 Bias 2.5.2 Variance 2.5.3 Hyperparameter tuning 2.6 Model evaluation 2.6.1 Regression models 2.6.2 Classification models 2.7 Putting the processes together
3. Feature & Target Engineering 3.1 Prerequisites 3.2 Target engineering 3.3 Dealing with missingness 3.3.1 Visualizing missing values 3.3.2 Imputation 3.4 Feature filtering 3.5 Numeric feature engineering 3.5.1 Skewness 3.5.2 Standardization 3.6 Categorical feature engineering 3.6.1 Lumping 3.6.2 One-hot & dummy encoding 3.6.3 Label encoding 3.6.4 Alternatives 3.7 Dimension reduction 3.8 Proper implementation 3.8.1 Sequential steps 3.8.2 Data leakage 3.8.3 Putting the process together

II SUPERVISED LEARNING
4. Linear Regression 4.1 Prerequisites 4.2 Simple linear regression 4.2.1 Estimation 4.2.2 Inference 4.3 Multiple linear regression 4.4 Assessing model accuracy 4.5 Model concerns 4.6 Principal component regression 4.7 Partial least squares 4.8 Feature interpretation 4.9 Final thoughts
5. Logistic Regression 5.1 Prerequisites 5.2 Why logistic regression 5.3 Simple logistic regression 5.4 Multiple logistic regression 5.5 Assessing model accuracy 5.6 Model concerns 5.7 Feature interpretation 5.8 Final thoughts
6. Regularized Regression 6.1 Prerequisites 6.2 Why regularize? 6.2.1 Ridge penalty 6.2.2 Lasso penalty 6.2.3 Elastic nets 6.3 Implementation 6.4 Tuning 6.5 Feature interpretation 6.6 Attrition data 6.7 Final thoughts
7. Multivariate Adaptive Regression Splines 7.1 Prerequisites 7.2 The basic idea 7.2.1 Multivariate regression splines 7.3 Fitting a basic MARS model 7.4 Tuning 7.5 Feature interpretation 7.6 Attrition data 7.7 Final thoughts
8. K-Nearest Neighbors 8.1 Prerequisites 8.2 Measuring similarity 8.2.1 Distance measures 8.2.2 Pre-processing 8.3 Choosing k 8.4 MNIST example 8.5 Final thoughts
9. Decision Trees 9.1 Prerequisites 9.2 Structure 9.3 Partitioning 9.4 How deep? 9.4.1 Early stopping 9.4.2 Pruning 9.5 Ames housing example 9.6 Feature interpretation 9.7 Final thoughts
10. Bagging 10.1 Prerequisites 10.2 Why and when bagging works 10.3 Implementation 10.4 Easily parallelize 10.5 Feature interpretation 10.6 Final thoughts
11. Random Forests 11.1 Prerequisites 11.2 Extending bagging 11.3 Out-of-the-box performance 11.4 Hyperparameters 11.4.1 Number of trees 11.4.2 mtry 11.4.3 Tree complexity 11.4.4 Sampling scheme 11.4.5 Split rule 11.5 Tuning strategies 11.6 Feature interpretation 11.7 Final thoughts
12. Gradient Boosting 12.1 Prerequisites 12.2 How boosting works 12.2.1 A sequential ensemble approach 12.2.2 Gradient descent 12.3 Basic GBM 12.3.1 Hyperparameters 12.3.2 Implementation 12.3.3 General tuning strategy 12.4 Stochastic GBMs 12.4.1 Stochastic hyperparameters 12.4.2 Implementation 12.5 XGBoost 12.5.1 XGBoost hyperparameters 12.5.2 Tuning strategy 12.6 Feature interpretation 12.7 Final thoughts
13. Deep Learning 13.1 Prerequisites 13.2 Why deep learning 13.3 Feedforward DNNs 13.4 Network architecture 13.4.1 Layers and nodes 13.4.2 Activation 13.5 Backpropagation 13.6 Model training 13.7 Model tuning 13.7.1 Model capacity 13.7.2 Batch normalization 13.7.3 Regularization 13.7.4 Adjust learning rate 13.8 Grid Search 13.9 Final thoughts
14. Support Vector Machines 14.1 Prerequisites 14.2 Optimal separating hyperplanes 14.2.1 The hard margin classifier 14.2.2 The soft margin classifier 14.3 The support vector machine 14.3.1 More than two classes 14.3.2 Support vector regression 14.4 Job attrition example 14.4.1 Class weights 14.4.2 Class probabilities 14.5 Feature interpretation 14.6 Final thoughts
15. Stacked Models 15.1 Prerequisites 15.2 The Idea 15.2.1 Common ensemble methods 15.2.2 Super learner algorithm 15.2.3 Available packages 15.3 Stacking existing models 15.4 Stacking a grid search 15.5 Automated machine learning 15.6 Final thoughts
16. Interpretable Machine Learning 16.1 Prerequisites 16.2 The idea 16.2.1 Global interpretation 16.2.2 Local interpretation 16.2.3 Model-specific vs. model-agnostic 16.3 Permutation-based feature importance 16.3.1 Concept 16.3.2 Implementation 16.4 Partial dependence 16.4.1 Concept 16.4.2 Implementation 16.4.3 Alternative uses 16.5 Individual conditional expectation 16.5.1 Concept 16.5.2 Implementation 16.6 Feature interactions 16.6.1 Concept 16.6.2 Implementation 16.6.3 Alternatives 16.7 Local interpretable model-agnostic explanations 16.7.1 Concept 16.7.2 Implementation 16.7.3 Tuning 16.7.4 Alternative uses 16.8 Shapley values 16.8.1 Concept 16.8.2 Implementation 16.8.3 XGBoost and built-in Shapley values 16.9 Localized step-wise procedure 16.9.1 Concept 16.9.2 Implementation 16.10 Final thoughts

III DIMENSION REDUCTION
17. Principal Components Analysis 17.1 Prerequisites 17.2 The idea 17.3 Finding principal components 17.4 Performing PCA in R 17.5 Selecting the number of principal components 17.5.1 Eigenvalue criterion 17.5.2 Proportion of variance explained criterion 17.5.3 Scree plot criterion 17.6 Final thoughts
18. Generalized Low Rank Models 18.1 Prerequisites 18.2 The idea 18.3 Finding the lower ranks 18.3.1 Alternating minimization 18.3.2 Loss functions 18.3.3 Regularization 18.3.4 Selecting k 18.4 Fitting GLRMs in R 18.4.1 Basic GLRM model 18.4.2 Tuning to optimize for unseen data 18.5 Final thoughts
19. Autoencoders 19.1 Prerequisites 19.2 Undercomplete autoencoders 19.2.1 Comparing PCA to an autoencoder 19.2.2 Stacked autoencoders 19.2.3 Visualizing the reconstruction 19.3 Sparse autoencoders 19.4 Denoising autoencoders 19.5 Anomaly detection 19.6 Final thoughts

IV CLUSTERING
20. K-means Clustering 20.1 Prerequisites 20.2 Distance measures 20.3 Defining clusters 20.4 k-means algorithm 20.5 Clustering digits 20.6 How many clusters? 20.7 Clustering with mixed data 20.8 Alternative partitioning methods 20.9 Final thoughts
21. Hierarchical Clustering 21.1 Prerequisites 21.2 Hierarchical clustering algorithms 21.3 Hierarchical clustering in R 21.3.1 Agglomerative hierarchical clustering 21.3.2 Divisive hierarchical clustering 21.4 Determining optimal clusters 21.5 Working with dendrograms 21.6 Final thoughts
22. Model-based Clustering 22.1 Prerequisites 22.2 Measuring probability and uncertainty 22.3 Covariance types 22.4 Model selection 22.5 My basket example 22.6 Final thoughts

Bibliography
Index
This title belongs to the following subject(s):
ROC Curve; Partial Dependence Plots; regularized regression; Variable Importance Scores; random forests; Bagged Decision Trees; R packages; Variable Importance Measures; machine learning methods; GLM Model; gradient boosting machines; Data Set; Grid Search; MARS Model; Lower RMSE; Random Forest; Hyperparameter Tuning; KNN Model; Support Vector Machines; ML Algorithm; MLR; Elbow Method; Silhouette Method; PLS; Hidden Layers; Loss Function; Minimum MSE; Stochastic Gradient Boosting; MARS