BOOKPRICE.co.kr
Book price comparison site

Deep Reinforcement Learning with Python: With Pytorch, Tensorflow and Openai Gym

Deep Reinforcement Learning with Python: With Pytorch, Tensorflow and Openai Gym (Paperback)

Nimish Sanghi (author) | Apress | 2021-04-02 | 109,580 won

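New books

Bookstore | Discounted price | Discount rate | Shipping fee | Benefits/extras | Effective lowest price
Aladin | 54,790 won | -50% | 0 won | 550 won | 54,240 won
yes24 | (price not loaded)
Kyobo Book Centre | (price not loaded)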
Note: the search results may include other books.

Book information

· Title: Deep Reinforcement Learning with Python: With Pytorch, Tensorflow and Openai Gym (Paperback)
· Category: Foreign books > Computers > Artificial intelligence (AI)
· ISBN: 9781484268087
· Pages: 382

Table of contents

Chapter 1: Deep Reinforcement Learning
Chapter Goal: Introduce the reader to the field of reinforcement learning and set the context for what they will learn in the rest of the book.
Sub-topics:
1. Deep reinforcement learning
2. Examples and case studies
3. Types of algorithms with mind-map
4. Libraries and environment setup
5. Summary 

Chapter 2: Markov Decision Processes
Chapter Goal: Help the reader understand the models and foundations on which all the algorithms are built.
Sub-topics:
1. Agent and environment
2. Rewards
3. Markov reward and decision processes
4. Policies and value functions
5. Bellman equations

Chapter 3: Model Based Algorithms Using Dynamic Programming
Chapter Goal: Introduce the reader to dynamic programming and related algorithms.
Sub-topics:
1. Introduction to OpenAI Gym environment
2. Policy evaluation/prediction
3. Policy iteration and improvement
4. Generalised policy iteration
5. Value iteration

Chapter 4: Model Free Solution Approaches
Chapter Goal: Introduce the reader to model-free methods, which form the basis for the majority of current solutions.
Sub-topics:
1. Prediction and control with Monte Carlo methods
2. Exploration vs exploitation
3. TD learning methods
4. TD control
5. On-policy learning using SARSA
6. Off-policy learning using Q-learning

Chapter 5: Function Approximation and Deep Learning in Reinforcement Learning
Chapter Goal: Help readers understand value function approximation and the use of deep learning in reinforcement learning.
1. Limitations of the tabular methods studied so far
2. Value function approximation
3. Linear methods and features used
4. Nonlinear function approximation using deep learning

Chapter 6: Deep Q-Learning
Chapter Goal: Help readers understand the core use of deep learning in reinforcement learning. Deep Q-learning and many of its variants are introduced here with in-depth code exercises.
1. Deep Q-networks (DQN)
2. Issues in naive DQN
3. Experience replay and target networks
4. Double Q-learning (DDQN)
5. Duelling DQN
6. Categorical 51-atom DQN (C51)
7. Quantile regression DQN (QR-DQN)
8. Hindsight experience replay (HER)

Chapter 7: Policy Gradient Algorithms 
Chapter Goal: Introduce the reader to the concept of policy gradients and the related theory. Gain in-depth knowledge of common policy gradient methods through hands-on exercises.
1. Policy gradient approach and its advantages
2. The policy gradient theorem
3. REINFORCE algorithm
4. REINFORCE with baseline
5. Actor-critic methods
6. Advantage actor-critic (A2C/A3C)
7. Proximal policy optimization (PPO)
8. Trust region policy optimization (TRPO)

Chapter 8: Interpolation Between Policy Gradients and Q-Learning 
Chapter Goal: Introduce the reader to the trade-offs between the two approaches and to ways of connecting these two seemingly dissimilar approaches. Gain in-depth knowledge of some landmark approaches.
1. Trade-off between policy gradients and Q-learning
2. The connection
3. Deep deterministic policy gradient (DDPG)
4. Twin delayed DDPG (TD3)
5. Soft actor-critic (SAC)

Chapter 9: Combining Planning and Learning 
Chapter Goal: Introduce the reader to scalable, sample-efficient approaches.
1. Model-based reinforcement learning
2. Dyna and its variants
3. Guided policy search
4. Monte Carlo tree search (MCTS)
5. AlphaGo

Chapter 10: Exploration vs Exploitation Revisited 
Chapter Goal: Having gone through most of the popular algorithms, readers now revisit the exploration vs exploitation dilemma, which is central to reinforcement learning.
1. Multi-armed bandits
2. Upper confidence bound
3. Thompson sampling




About the author

Nimish Sanghi (author)
Book database provided by Aladin (www.aladin.co.kr)