BOOKPRICE.co.kr
Book price comparison site

Deep Reinforcement Learning with Python: With Pytorch, Tensorflow and Openai Gym

Deep Reinforcement Learning with Python: With Pytorch, Tensorflow and Openai Gym (Paperback)

Nimish Sanghi (author)

Apress

81,200 won

New books

Store	Discounted price	Discount rate	Shipping	Benefits/extras	Effective lowest price
	66,580 won	-18%	0 won	3,330 won	63,250 won
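
The figures in the price row above appear to be related by simple arithmetic. The sketch below assumes the effective lowest price is the discounted price plus shipping minus the benefit amount, and that the discount rate is measured against the 81,200 won price shown earlier. The variable names are illustrative and not part of the site.

```python
# Figures copied from the price row above (all amounts in won).
list_price = 81_200
discounted_price = 66_580
shipping_fee = 0
benefit = 3_330

# Assumption: effective price = discounted price + shipping - benefit.
effective_price = discounted_price + shipping_fee - benefit

# Assumption: the discount rate is relative to the list price.
discount_pct = (list_price - discounted_price) / list_price * 100

print(effective_price)      # 63250
print(round(discount_pct))  # 18
```

Under these assumptions the result matches the 63,250 won and 18% shown in the row.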

Note: search results may include other books.

Book information

· Title: Deep Reinforcement Learning with Python: With Pytorch, Tensorflow and Openai Gym (Paperback)
· Category: Foreign books > Computers > Artificial intelligence (AI)
· ISBN: 9781484268087
· Pages: 382
· Publication date: 2021-04-02

Table of contents

Chapter 1: Deep Reinforcement Learning

Chapter Goal: Introduce the reader to the field of reinforcement learning and set the context for what they will learn in the rest of the book.

Sub-Topics:

1. Deep reinforcement learning

2. Examples and case studies

3. Types of algorithms with mind-map

4. Libraries and environment setup

5. Summary

Chapter 2: Markov Decision Processes

Chapter Goal: Help the reader understand the models and foundations on which all the algorithms are built.

Sub-Topics:

1. Agent and environment

2. Rewards

3. Markov reward and decision processes

4. Policies and value functions

5. Bellman equations

Chapter 3: Model Based Algorithms Using Dynamic Programming

Chapter Goal: Introduce the reader to dynamic programming and related algorithms.

Sub-Topics:

1. Introduction to OpenAI Gym environment

2. Policy evaluation/prediction

3. Policy iteration and improvement

4. Generalised policy iteration

5. Value iteration

Chapter 4: Model Free Solution Approaches

Chapter Goal: Introduce the reader to model-free methods, which form the basis for the majority of current solutions.

Sub-Topics:

1. Prediction and control with Monte Carlo methods

2. Exploration vs exploitation

3. TD learning methods

4. TD control

5. On-policy learning using SARSA

6. Off-policy learning using Q-learning

Chapter 5: Function Approximation and Deep Learning in Reinforcement Learning

Chapter Goal: Help readers understand value function approximation and the use of deep learning in reinforcement learning.

1. Limitations to tabular methods studied so far

2. Value function approximation

3. Linear methods and features used

4. Nonlinear function approximation using deep learning

Chapter 6: Deep Q-Learning

Chapter Goal: Help readers understand the core use of deep learning in reinforcement learning. Deep Q-learning and many of its variants are introduced here with in-depth code exercises.

1. Deep Q-networks (DQN)

2. Issues in Naive DQN

3. Introduce experience replay and target networks

4. Double Q-learning (DDQN)

5. Duelling DQN

6. Categorical 51-atom DQN (C51)

7. Quantile regression DQN (QR-DQN)

8. Hindsight experience replay (HER)

Chapter 7: Policy Gradient Algorithms

Chapter Goal: Introduce the reader to the concept of policy gradients and the related theory. Gain in-depth knowledge of common policy gradient methods through hands-on exercises.

1. Policy gradient approach and its advantages

2. The policy gradient theorem

3. REINFORCE algorithm

4. REINFORCE with baseline

5. Actor-critic methods

6. Advantage actor-critic (A2C/A3C)

7. Proximal policy optimization (PPO)

8. Trust region policy optimization (TRPO)

Chapter 8: Interpolation Between Policy Gradients and Q-Learning

Chapter Goal: Introduce the reader to the trade-offs between the two approaches and to ways of connecting these two seemingly dissimilar approaches. Gain in-depth knowledge of some landmark approaches.

1. Trade-off between policy gradients and Q-learning

2. The connection

3. Deep deterministic policy gradient (DDPG)

4. Twin delayed DDPG (TD3)

5. Soft actor-critic (SAC)

Chapter 9: Combining Planning and Learning

Chapter Goal: Introduce the reader to scalable approaches that are sample efficient on large-scale problems.

1. Model-based reinforcement learning

2. Dyna and its variants

3. Guided policy search

4. Monte Carlo tree search (MCTS)

5. AlphaGo

Chapter 10: Exploration vs Exploitation Revisited

Chapter Goal: Having covered most of the popular algorithms, readers now revisit the exploration vs exploitation dilemma, which is central to reinforcement learning.

1. Multi-armed bandits

2. Upper confidence bound

3. Thompson sampling

Book database provided by: Aladin (www.aladin.co.kr)
