BOOKPRICE.co.kr
Book price comparison site
[eBook Code] Markov Decision Processes in Artificial Intelligence (eBook Code, 1st)

Olivier Sigaud, Olivier Buffet (Editors) | Wiley-ISTE | 2013-03-04 | 275,150 KRW

New books

Bookstore   Sale price    Discount  Shipping  Benefits/Extra  Effective lowest price
Aladin      220,120 KRW   -20%      0 KRW     0 KRW           220,120 KRW

Note: search results may include other books.

Book information

· Title: [eBook Code] Markov Decision Processes in Artificial Intelligence (eBook Code, 1st)
· Category: Foreign Books > Technology & Engineering > Technology & Engineering > Electronics > General
· ISBN: 9781118620106
· Pages: 480

Table of Contents

Preface xvii

List of Authors xix

PART 1. MDPS: MODELS AND METHODS 1

Chapter 1. Markov Decision Processes 3
Frédérick GARCIA and Emmanuel RACHELSON

1.1. Introduction 3

1.2. Markov decision problems 4

1.3. Value functions 9

1.4. Markov policies 12

1.5. Characterization of optimal policies 14

1.6. Optimization algorithms for MDPs 28

1.7. Conclusion and outlook 37

1.8. Bibliography 37

Chapter 2. Reinforcement Learning 39
Olivier SIGAUD and Frédérick GARCIA

2.1. Introduction 39

2.2. Reinforcement learning: a global view 40

2.3. Monte Carlo methods 45

2.4. From Monte Carlo to temporal difference methods 45

2.5. Temporal difference methods 46

2.6. Model-based methods: learning a model 59

2.7. Conclusion 63

2.8. Bibliography 63

Chapter 3. Approximate Dynamic Programming 67
Rémi MUNOS

3.1. Introduction 68

3.2. Approximate value iteration (AVI) 70

3.3. Approximate policy iteration (API) 77

3.4. Direct minimization of the Bellman residual 87

3.5. Towards an analysis of dynamic programming in Lp-norm 88

3.6. Conclusions 93

3.7. Bibliography 93

Chapter 4. Factored Markov Decision Processes 99
Thomas DEGRIS and Olivier SIGAUD

4.1. Introduction 99

4.2. Modeling a problem with an FMDP 100

4.3. Planning with FMDPs 108

4.4. Perspectives and conclusion 122

4.5. Bibliography 123

Chapter 5. Policy-Gradient Algorithms 127
Olivier BUFFET

5.1. Reminder about the notion of gradient 128

5.2. Optimizing a parameterized policy with a gradient algorithm 130

5.3. Actor-critic methods 143

5.4. Complements 147

5.5. Conclusion 150

5.6. Bibliography 150

Chapter 6. Online Resolution Techniques 153
Laurent PÉRET and Frédérick GARCIA

6.1. Introduction 153

6.2. Online algorithms for solving an MDP 155

6.3. Controlling the search 167

6.4. Conclusion 180

6.5. Bibliography 180

PART 2. BEYOND MDPS 185

Chapter 7. Partially Observable Markov Decision Processes 187
Alain DUTECH and Bruno SCHERRER

7.1. Formal definitions for POMDPs 188

7.2. Non-Markovian problems: incomplete information 196

7.3. Computation of an exact policy on information states 202

7.4. Exact value iteration algorithms 207

7.5. Policy iteration algorithms 222

7.6. Conclusion and perspectives 223

7.7. Bibliography 225

Chapter 8. Stochastic Games 229
Andriy BURKOV, Laëtitia MATIGNON and Brahim CHAIB-DRAA

8.1. Introduction 229

8.2. Background on game theory 230

8.3. Stochastic games 245

8.4. Conclusion and outlook 269

8.5. Bibliography 270

Chapter 9. DEC-MDP/POMDP 277
Aurélie BEYNIER, François CHARPILLET, Daniel SZER and Abdel-Illah MOUADDIB

9.1. Introduction 277

9.2. Preliminaries 278

9.3. Multi-agent Markov decision processes 279

9.4. Decentralized control and local observability 280

9.5. Sub-classes of DEC-POMDPs 285

9.6. Algorithms for solving DEC-POMDPs 295

9.7. Applicative scenario: multirobot exploration 310

9.8. Conclusion and outlook 312

9.9. Bibliography 313

Chapter 10. Non-Standard Criteria 319
Matthieu BOUSSARD, Maroua BOUZID, Abdel-Illah MOUADDIB, Régis SABBADIN and Paul WENG

10.1. Introduction 319

10.2. Multicriteria approaches 320

10.3. Robustness in MDPs 327

10.4. Possibilistic MDPs 329

10.5. Algebraic MDPs 342

10.6. Conclusion 354

10.7. Bibliography 355

PART 3. APPLICATIONS 361

Chapter 11. Online Learning for Micro-Object Manipulation 363
Guillaume LAURENT

11.1. Introduction 363

11.2. Manipulation device 364

11.3. Choice of the reinforcement learning algorithm 367

11.4. Experimental results 370

11.5. Conclusion 373

11.6. Bibliography 373

Chapter 12. Conservation of Biodiversity 375
Iadine CHADÈS

12.1. Introduction 375

12.2. When to protect, survey or surrender cryptic endangered species 376

12.3. Can sea otters and abalone co-exist? 381

12.4. Other applications in conservation biology and discussions 391

12.5. Bibliography 392

Chapter 13. Autonomous Helicopter Searching for a Landing Area in an Uncertain Environment 395
Patrick FABIANI and Florent TEICHTEIL-KÖNIGSBUCH

13.1. Introduction 395

13.2. Exploration scenario 397

13.3. Embedded control and decision architecture 401

13.4. Incremental stochastic dynamic programming 404

13.5. Flight tests and return on experience 407

13.6. Conclusion 410

13.7. Bibliography 410

Chapter 14. Resource Consumption Control for an Autonomous Robot 413
Simon LE GLOANNEC and Abdel-Illah MOUADDIB

14.1. The rover’s mission 414

14.2. Progressive processing formalism 415

14.3. MDP/PRU model 416

14.4. Policy calculation 418

14.5. How to model a real mission 419

14.6. Extensions 422

14.7. Conclusion 423

14.8. Bibliography 423

Chapter 15. Operations Planning 425
Sylvie THIÉBAUX and Olivier BUFFET

15.1. Operations planning 425

15.2. MDP value function approaches 433

15.3. Reinforcement learning: FPG 442

15.4. Experiments 446

15.5. Conclusion and outlook 448

15.6. Bibliography 450

Index 453
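As a small taste of the material the table of contents opens with (section 1.6, "Optimization algorithms for MDPs"), here is a minimal value-iteration sketch in Python. The two-state MDP below (its states, transitions, and rewards) is an invented illustration, not an example from the book:

```python
# Minimal value-iteration sketch on a hypothetical two-state MDP.
# P[s][a] lists (next_state, probability); R[s][a] is the immediate reward.
GAMMA = 0.9  # discount factor

P = {
    0: {"stay": [(0, 1.0)], "go": [(1, 0.8), (0, 0.2)]},
    1: {"stay": [(1, 1.0)], "go": [(0, 1.0)]},
}
R = {
    0: {"stay": 0.0, "go": 1.0},
    1: {"stay": 2.0, "go": 0.0},
}

def value_iteration(P, R, gamma=GAMMA, eps=1e-8):
    """Iterate the Bellman optimality backup until the update is below eps."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            # Q-value of each action, then greedy improvement in place.
            q = {a: R[s][a] + gamma * sum(p * V[s2] for s2, p in P[s][a])
                 for a in P[s]}
            best = max(q.values())
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < eps:
            # Extract the greedy policy from the converged value function.
            policy = {s: max(P[s], key=lambda a: R[s][a] + gamma *
                             sum(p * V[s2] for s2, p in P[s][a]))
                      for s in P}
            return V, policy

V, policy = value_iteration(P, R)
print(policy)  # -> {0: 'go', 1: 'stay'}
```

In this toy problem, staying in state 1 earns 2 per step, so its value converges to 2/(1 - 0.9) = 20, and the greedy policy moves from state 0 toward state 1.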

About the Authors

Olivier Sigaud (Editor)
Olivier Buffet (Editor)
Book DB provided by Aladin (www.aladin.co.kr)