BOOKPRICE.co.kr
Book price comparison site

Social Media Data Mining and Analytics

Social Media Data Mining and Analytics (Paperback)

Gabor Szabo, Oscar Boykin (authors)

John Wiley & Sons Inc

83,250 won

New books

Store	Discounted price	Discount	Shipping	Benefits/extras	Effective lowest price
	66,600 won	-20%	0 won	2,000 won	64,600 won

Note: Search results may include other books.

Book information

· Title: Social Media Data Mining and Analytics (Paperback)
· Category: Foreign books > Computers > Database Management > Data Warehousing
· ISBN: 9781118824856
· Pages: 352
· Publication date: 2018-10-23

Table of contents

Introduction xvii

Chapter 1 Users: The Who of Social Media 1

Measuring Variations in User Behavior in Wikipedia 2

The Diversity of User Activities 3

The Origin of the User Activity Distribution 12

The Consequences of the Power Law 20

The Long Tail in Human Activities 25

Long Tails Everywhere: The 80/20 Rule (p/q Rule) 28

Online Behavior on Twitter 32

Retrieving Tweets for Users 33

Logarithmic Binning 36

User Activities on Twitter 37

Summary 39

Chapter 2 Networks: The How of Social Media 41

Types and Properties of Social Networks 42

When Users Create the Connections: Explicit Networks 43

Directed Versus Undirected Graphs 45

Node and Edge Properties 45

Weighted Graphs 46

Creating Graphs from Activities: Implicit Networks 48

Visualizing Networks 51

Degrees: The Winner Takes All 55

Counting the Number of Connections 57

The Long Tail in User Connections 58

Beyond the Idealized Network Model 62

Capturing Correlations: Triangles, Clustering, and Assortativity 64

Local Triangles and Clustering 64

Assortativity 70

Summary 75

Chapter 3 Temporal Processes: The When of Social Media 77

What Traditional Models Tell You About Events in Time 77

When Events Happen Uniformly in Time 79

Inter-Event Times 81

Comparing to a Memoryless Process 86

Autocorrelations 89

Deviations from Memorylessness 91

Periodicities in Time in User Activities 93

Bursty Activities of Individuals 99

Correlations and Bursts 105

Reservoir Sampling 106

Forecasting Metrics in Time 110

Finding Trends 112

Finding Seasonality 115

Forecasting Time Series with ARIMA 117

The Autoregressive Part (“AR”) 118

The Moving Average Part (“MA”) 119

The Full ARIMA(p, d, q) Model 119

Summary 121

Chapter 4 Content: The What of Social Media 123

Defining Content: Focus on Text and Unstructured Data 123

Creating Features from Text: The Basics of Natural Language Processing 125

The Basic Statistics of Term Occurrences in Text 128

Using Content Features to Identify Topics 129

The Popularity of Topics 138

How Diverse Are Individual Users’ Interests? 141

Extracting Low-Dimensional Information from High-Dimensional Text 144

Topic Modeling 145

Unsupervised Topic Modeling 147

Supervised Topic Modeling 155

Relational Topic Modeling 162

Summary 169

Chapter 5 Processing Large Datasets 171

Map Reduce: Structuring Parallel and Sequential Operations 172

Counting Words 174

Skew: The Curse of the Last Reducer 177

Multi-Stage MapReduce Flows 179

Fan-Out 180

Merging Data Streams 181

Joining Two Data Sources 183

Joining Against Small Datasets 186

Models of Large-Scale MapReduce 187

Patterns in MapReduce Programming 188

Static MapReduce Jobs 188

Iterative MapReduce Jobs 195

PageRank for Ranking in Graphs 195

K-means Clustering 199

Incremental MapReduce Jobs 203

Temporal MapReduce Jobs 204

Rollups and Data Cubing 205

Expanding Rollup Jobs 211

Challenges with Processing Long-Tailed Social Media Data 212

Sampling and Approximations: Getting Results with Less Computation 214

HyperLogLog 217

HyperLogLog Example 219

HyperLogLog on the Stack Exchange Dataset 221

Performance of HLL on Large Datasets 222

Bloom Filters 223

A Bloom Filter Example 226

Bloom Filter as Pre-Computed Membership Knowledge 228

Bloom Filters on Large Social Datasets 229

Count-Min Sketch 231

Count-Min Sketch—Heavy Hitters Example 233

Count-Min Sketch—Top Percentage Example 235

Aggregating Approximate Data Structures 235

Summary of Approximations 236

Executing on a Hadoop Cluster (Amazon EC2) 237

Installing a CDH Cluster on Amazon EC2 237

Providing IAM Access to Collaborators 241

Adding On-Demand Cluster Capabilities 242

Summary 243

Chapter 6 Learn, Map, and Recommend 245

Social Media Services Online 246

Search Engines 246

Content Engagement 246

Interactions with the Real World 248

Interactions with People 249

Problem Formulation 251

Learning and Mapping 253

Matrix Factorization 255

Learning, Training 257

Under- and Overfitting 257

Regularizing in Matrix Factorization 259

Non-Negative Matrix Factorization and Sparsity 260

Demonstration on Movie Ratings 261

Interpreting the Learned Stereotypes 265

Exploratory Analysis 269

Prediction and Recommendation 274

Evaluation 277

Overview of Methodologies 278

Nearest Neighbor-Based Approaches 278

Approaches Based on Supervised Learning 280

Predicting Movie Ratings with Logistic Regression 280

Common Issues with Features 288

Domain-Specific Applications 289

Summary 290

Chapter 7 Conclusions 293

The Surprising Stability of Human Interaction Patterns 293

Averages, Standard Deviations, and Sampling 296

Removing Outliers 303

Index 309

About the authors

Gabor Szabo (author)

Oscar Boykin (author)

Book database provided by: Aladin (www.aladin.co.kr)