Book Information
· Category: Foreign Books > Science/Mathematics/Ecology > Mathematics > Probability and Statistics > General
· ISBN: 9781118877432
· Pages: 464
· Publication date: 2016-06-13
Table of Contents
Dedication i
Foreword xvii
Preface xviii
Acknowledgments xx
PART I PRELIMINARIES
CHAPTER 1 Introduction 3
1.1 What is Business Analytics? 3
1.2 What is Data Mining? 5
1.3 Data Mining and Related Terms 5
1.4 Big Data 6
1.5 Data Science 7
1.6 Why Are There So Many Different Methods? 8
1.7 Terminology and Notation 9
1.8 Road Maps to This Book 11
Order of Topics 12
CHAPTER 2 Overview of the Data Mining Process 15
2.1 Introduction 15
2.2 Core Ideas in Data Mining 16
2.3 The Steps in Data Mining 19
2.4 Preliminary Steps 20
2.5 Predictive Power and Overfitting 28
2.6 Building a Predictive Model with JMP Pro 33
2.7 Using JMP Pro for Data Mining 42
2.8 Automating Data Mining Solutions 42
Data Mining Software Tools (Herb Edelstein) 44
Problems 47
PART II DATA EXPLORATION AND DIMENSION REDUCTION
CHAPTER 3 Data Visualization 52
3.1 Uses of Data Visualization 52
3.2 Data Examples 54
Example 1: Boston Housing Data 54
Example 2: Ridership on Amtrak Trains 55
3.3 Basic Charts: Bar Charts, Line Graphs, and Scatterplots 55
Distribution Plots 58
Heatmaps: Visualizing Correlations and Missing Values 61
3.4 Multi-Dimensional Visualization 63
Adding Variables: Color, Hue, Size, Shape, Multiple Panels, Animation 63
Manipulations: Re-scaling, Aggregation and Hierarchies, Zooming and Panning, Filtering 67
Reference: Trend Line and Labels 70
Scaling Up: Large Datasets 72
Multivariate Plot: Parallel Coordinates Plot 73
Interactive Visualization 74
3.5 Specialized Visualizations 76
Visualizing Networked Data 76
Visualizing Hierarchical Data: Treemaps 77
Visualizing Geographical Data: Maps 78
3.6 Summary of Major Visualizations and Operations, According to Data Mining Goal 80
Prediction 80
Classification 81
Time Series Forecasting 81
Unsupervised Learning 82
Problems 83
CHAPTER 4 Dimension Reduction 85
4.1 Introduction 85
4.2 Curse of Dimensionality 86
4.3 Practical Considerations 86
Example 1: House Prices in Boston 87
4.4 Data Summaries 88
4.5 Correlation Analysis 91
4.6 Reducing the Number of Categories in Categorical Variables 92
4.7 Converting a Categorical Variable to a Continuous Variable 94
4.8 Principal Components Analysis 94
Example 2: Breakfast Cereals 95
Principal Components 101
Normalizing the Data 102
Using Principal Components for Classification and Prediction 104
4.9 Dimension Reduction Using Regression Models 104
4.10 Dimension Reduction Using Classification and Regression Trees 106
Problems 107
PART III PERFORMANCE EVALUATION
CHAPTER 5 Evaluating Predictive Performance 111
5.1 Introduction 111
5.2 Evaluating Predictive Performance 112
Benchmark: The Average 112
Prediction Accuracy Measures 113
5.3 Judging Classifier Performance 115
Benchmark: The Naive Rule 115
Class Separation 115
The Classification Matrix 116
Using the Validation Data 117
Accuracy Measures 117
Cutoff for Classification 118
Performance in Case of Unequal Importance of Classes 122
Asymmetric Misclassification Costs 123
5.4 Judging Ranking Performance 127
5.5 Oversampling 131
Problems 138
PART IV PREDICTION AND CLASSIFICATION METHODS
CHAPTER 6 Multiple Linear Regression 141
6.1 Introduction 141
6.2 Explanatory vs. Predictive Modeling 142
6.3 Estimating the Regression Equation and Prediction 143
Example: Predicting the Price of Used Toyota Corolla Automobiles 144
6.4 Variable Selection in Linear Regression 149
Reducing the Number of Predictors 149
How to Reduce the Number of Predictors 150
Manual Variable Selection 151
Automated Variable Selection 151
Problems 160
CHAPTER 7 k-Nearest Neighbors (k-NN) 165
7.1 The k-NN Classifier (Categorical Outcome) 165
Determining Neighbors 165
Classification Rule 166
Example: Riding Mowers 166
Choosing k 167
Setting the Cutoff Value 169
7.2 k-NN for a Numerical Response 171
7.3 Advantages and Shortcomings of k-NN Algorithms 172
Problems 174
CHAPTER 8 The Naive Bayes Classifier 176
8.1 Introduction 176
Example 1: Predicting Fraudulent Financial Reporting 177
8.2 Applying the Full (Exact) Bayesian Classifier 178
8.3 Advantages and Shortcomings of the Naive Bayes Classifier 187
Problems 191
CHAPTER 9 Classification and Regression Trees 194
9.1 Introduction 194
9.2 Classification Trees 195
Example 1: Riding Mowers 196
9.3 Growing a Tree 198
Growing a Tree Example 198
Growing a Tree with CART 203
9.4 Evaluating the Performance of a Classification Tree 203
Example 2: Acceptance of Personal Loan 203
9.5 Avoiding Overfitting 204
Stopping Tree Growth: CHAID 205
Pruning the Tree 207
9.6 Classification Rules from Trees 208
9.7 Classification Trees for More Than Two Classes 210
9.8 Regression Trees 210
Prediction 213
Evaluating Performance 214
9.9 Advantages and Weaknesses of a Tree 214
9.10 Improving Prediction: Multiple Trees 216
9.11 CART and Measures of Impurity 218
Measuring Impurity 218
Problems 221
CHAPTER 10 Logistic Regression 224
10.1 Introduction 224
10.2 The Logistic Regression Model 226
Example: Acceptance of Personal Loan 227
Model with a Single Predictor 229
Estimating the Logistic Model from Data: Computing Parameter Estimates 231
10.3 Evaluating Classification Performance 234
Variable Selection 236
10.4 Example of Complete Analysis: Predicting Delayed Flights 237
Data Preprocessing 240
Model Fitting, Estimation, and Interpretation: A Simple Model 240
Model Fitting, Estimation, and Interpretation: The Full Model 241
Model Performance 243
Variable Selection 245
10.5 Appendix: Logistic Regression for Profiling 249
Appendix A: Why Linear Regression Is Inappropriate for a Categorical Response 249
Appendix B: Evaluating Explanatory Power 250
Appendix C: Logistic Regression for More Than Two Classes 253
Problems 257
CHAPTER 11 Neural Nets 260
11.1 Introduction 260
11.2 Concept and Structure of a Neural Network 261
11.3 Fitting a Network to Data 261
Example 1: Tiny Dataset 262
Computing Output of Nodes 263
Preprocessing the Data 266
Training the Model 267
Using the Output for Prediction and Classification 272
Example 2: Classifying Accident Severity 273
Avoiding Overfitting 275
11.4 User Input in JMP Pro 277
11.5 Exploring the Relationship Between Predictors and Response 280
11.6 Advantages and Weaknesses of Neural Networks 281
Problems 282
CHAPTER 12 Discriminant Analysis 284
12.1 Introduction 284
Example 1: Riding Mowers 285
Example 2: Personal Loan Acceptance 285
12.2 Distance of an Observation from a Class 286
12.3 From Distances to Propensities and Classifications 288
12.4 Classification Performance of Discriminant Analysis 292
12.5 Prior Probabilities 293
12.6 Classifying More Than Two Classes 294
Example 3: Medical Dispatch to Accident Scenes 294
12.7 Advantages and Weaknesses 296
Problems 299
CHAPTER 13 Combining Methods: Ensembles and Uplift Modeling 302
13.1 Ensembles 303
Why Ensembles Can Improve Predictive Power 303
Simple Averaging 305
Bagging 306
Boosting 306
Advantages and Weaknesses of Ensembles 307
13.2 Uplift (Persuasion) Modeling 308
A-B Testing 308
Uplift 308
Gathering the Data 309
A Simple Model 310
Modeling Individual Uplift 311
Using the Results of an Uplift Model 312
Creating Uplift Models in JMP Pro 313
13.3 Summary 315
Problems 316
PART V MINING RELATIONSHIPS AMONG RECORDS
CHAPTER 14 Cluster Analysis 320
14.1 Introduction 320
Example: Public Utilities 322
14.2 Measuring Distance Between Two Observations 324
Euclidean Distance 324
Normalizing Numerical Measurements 324
Other Distance Measures for Numerical Data 326
Distance Measures for Categorical Data 327
Distance Measures for Mixed Data 327
14.3 Measuring Distance Between Two Clusters 328
14.4 Hierarchical (Agglomerative) Clustering 330
Single Linkage 332
Complete Linkage 332
Average Linkage 333
Centroid Linkage 333
Dendrograms: Displaying Clustering Process and Results 334
Validating Clusters 335
Limitations of Hierarchical Clustering 339
14.5 Nonhierarchical Clustering: The k-Means Algorithm 340
Initial Partition into k Clusters 342
Problems 350
PART VI FORECASTING TIME SERIES
CHAPTER 15 Handling Time Series 355
15.1 Introduction 355
15.2 Descriptive vs. Predictive Modeling 356
15.3 Popular Forecasting Methods in Business 357
Combining Methods 357
15.4 Time Series Components 358
Example: Ridership on Amtrak Trains 358
15.5 Data Partitioning and Performance Evaluation 362
Benchmark Performance: Naive Forecasts 362
Generating Future Forecasts 363
Problems 365
CHAPTER 16 Regression-Based Forecasting 368
16.1 A Model with Trend 368
Linear Trend 368
Exponential Trend 372
Polynomial Trend 374
16.2 A Model with Seasonality 375
16.3 A Model with Trend and Seasonality 378
16.4 Autocorrelation and ARIMA Models 378
Computing Autocorrelation 380
Improving Forecasts by Integrating Autocorrelation Information 383
Fitting AR Models to Residuals 384
Evaluating Predictability 387
Problems 389
CHAPTER 17 Smoothing Methods 399
17.1 Introduction 399
17.2 Moving Average 400
Centered Moving Average for Visualization 400
Trailing Moving Average for Forecasting 401
Choosing Window Width (w) 404
17.3 Simple Exponential Smoothing 405
Choosing Smoothing Parameter 406
Relation Between Moving Average and Simple Exponential Smoothing 408
17.4 Advanced Exponential Smoothing 409
Series with a Trend 409
Series with a Trend and Seasonality 410
Problems 414
PART VII CASES
CHAPTER 18 Cases 425
18.1 Charles Book Club 425
18.2 German Credit 434
Background 434
Data 434
18.3 Tayko Software Cataloger 439
18.4 Political Persuasion 442
Background 442
Predictive Analytics Arrives in US Politics 442
Political Targeting 442
Uplift 443
Data 444
Assignment 444
18.5 Taxi Cancellations 446
Business Situation 446
Assignment 446
18.6 Segmenting Consumers of Bath Soap 448
Appendix 451
18.7 Direct-Mail Fundraising 452
18.8 Predicting Bankruptcy 455
18.9 Time Series Case: Forecasting Public Transportation Demand 458
References 460
Data Files Used in the Book 461
Index 463