
Book Information
· Category: Foreign Books > Economics & Business > Statistics
· ISBN: 9780367473228
· Pages: 342
· Publication date: 2020-09-01
Table of Contents
Part I. Introduction

1. Preface
   What this book is not about · The targeted audience · How this book is structured · Companion website · Why R? · Coding instructions · Acknowledgements · Future developments
2. Notations and data
   Notations · Dataset
3. Introduction
   Context · Portfolio construction: the workflow · Machine Learning is no Magic Wand
4. Factor investing and asset pricing anomalies
   Introduction · Detecting anomalies · Simple portfolio sorts · Factors · Predictive regressions, sorts, and p-value issues · Fama-MacBeth regressions · Factor competition · Advanced techniques · Factors or characteristics? · Hot topics: momentum, timing and ESG · Factor momentum · Factor timing · The green factors · The link with machine learning · A short list of recent references · Explicit connections with asset pricing models · Coding exercises
5. Data preprocessing
   Know your data · Missing data · Outlier detection · Feature engineering · Feature selection · Scaling the predictors · Labelling · Simple labels · Categorical labels · The triple barrier method · Filtering the sample · Return horizons · Handling persistence · Extensions · Transforming features · Macro-economic variables · Active learning · Additional code and results · Impact of rescaling: graphical representation · Impact of rescaling: toy example · Coding exercises

Part II. Common supervised algorithms

6. Penalized regressions and sparse hedging for minimum variance portfolios
   Penalized regressions · Simple regressions · Forms of penalization · Illustrations · Sparse hedging for minimum variance portfolios · Presentation and derivations · Example · Predictive regressions · Literature review and principle · Code and results · Coding exercise
7. Tree-based methods
   Simple trees · Principle · Further details on classification · Pruning criteria · Code and interpretation · Random forests · Principle · Code and results · Boosted trees: AdaBoost · Methodology · Illustration · Boosted trees: extreme gradient boosting · Managing loss · Penalization · Aggregation · Tree structure · Extensions · Code and results · Instance weighting · Discussion · Coding exercises
8. Neural networks
   The original perceptron · Multilayer perceptron (MLP) · Introduction and notations · Universal approximation · Learning via back-propagation · Further details on classification · How deep should we go? And other practical issues · Architectural choices · Frequency of weight updates and learning duration · Penalizations and dropout · Code samples and comments for vanilla MLP · Regression example · Classification example · Custom losses · Recurrent networks · Presentation · Code and results · Other common architectures · Generative adversarial networks · Auto-encoders · A word on convolutional networks · Advanced architectures · Coding exercise
9. Support vector machines
   SVM for classification · SVM for regression · Practice · Coding exercises
10. Bayesian methods
   The Bayesian framework · Bayesian sampling · Gibbs sampling · Metropolis-Hastings sampling · Bayesian linear regression · Naive Bayes classifier · Bayesian additive trees · General formulation · Priors · Sampling and predictions · Code

Part III. From predictions to portfolios

11. Validating and tuning
   Learning metrics · Regression analysis · Classification analysis · Validation · The variance-bias tradeoff: theory · The variance-bias tradeoff: illustration · The risk of overfitting: principle · The risk of overfitting: some solutions · The search for good hyperparameters · Methods · Example: grid search · Example: Bayesian optimization · Short discussion on validation in backtests
12. Ensemble models
   Linear ensembles · Principles · Example · Stacked ensembles · Two-stage training · Code and results · Extensions · Exogenous variables · Shrinking inter-model correlations · Exercise
13. Portfolio backtesting
   Setting the protocol · Turning signals into portfolio weights · Performance metrics · Discussion · Pure performance and risk indicators · Factor-based evaluation · Risk-adjusted measures · Transaction costs and turnover · Common errors and issues · Forward-looking data · Backtest overfitting · Simple safeguards · Implication of non-stationarity: forecasting is hard · General comments · The no free lunch theorem · Example · Coding exercises

Part IV. Further important topics

14. Interpretability
   Global interpretations · Simple models as surrogates · Variable importance (tree-based) · Variable importance (agnostic) · Partial dependence plot · Local interpretations · LIME · Shapley values · Breakdown
15. Two key concepts: causality and non-stationarity
   Causality · Granger causality · Causal additive models · Structural time-series models · Dealing with changing environments · Non-stationarity: yet another illustration · Online learning · Homogeneous transfer learning
16. Unsupervised learning
   The problem with correlated predictors · Principal component analysis and autoencoders · A bit of algebra · PCA · Autoencoders · Application · Clustering via k-means · Nearest neighbors · Coding exercise
17. Reinforcement learning
   Theoretical layout · General framework · Q-learning · SARSA · The curse of dimensionality · Policy gradient · Principle · Extensions · Simple examples · Q-learning with simulations · Q-learning with market data · Concluding remarks · Exercises

Part V. Appendix

Data Description · Solution to exercises