Ryan Lee

I'm just a boy trying to find a place in this world.

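Traffic Simulation
Simulating traffic flow and pedestrian flow with Cellular Automata (CA).
Principles of Transportation Planning
A GUI for forecasting trip distribution with four growth-factor methods
Path Planning
Global path planning with improved A*, Dijkstra, Floyd, and 0-1 programming models
Machine Learning
Common machine learning code (BP, an ANN implemented in PyTorch, MNIST handwritten digit recognition, etc.)
Mathematical Modeling
A collection of common mathematical modeling algorithms
About Me
A bit about myself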
Chapter 5 Monte Carlo Learning

In this chapter we introduce a model-free approach for deriving the optimal policy.

Here, model-free means that we do not rely on a specific mathematical model to obtain the state value or action value. For example, in policy evaluation we solve the Bellman equation to obtain the state value, which is model-based because it needs the reward and transition probabilities. In the model-free setting we no longer use that equation. Instead, we rely on mean estimation methods.
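To make the idea of mean estimation concrete, here is a minimal sketch. It assumes a hypothetical `sample_return` function that stands in for running one episode and observing its return (the true mean of 1.0 and the noise level are made up for illustration). The estimate is simply the running average of the sampled returns, with no model involved.

```python
import random


def sample_return(rng):
    # Hypothetical stand-in for "run one episode and observe its return".
    # The true expected return is 1.0; the noise is an assumption for the demo.
    return 1.0 + rng.gauss(0.0, 0.5)


def mc_mean_estimate(num_samples, seed=0):
    rng = random.Random(seed)
    estimate = 0.0
    for k in range(1, num_samples + 1):
        g = sample_return(rng)
        # Incremental form of the sample mean:
        # estimate_k = estimate_{k-1} + (g_k - estimate_{k-1}) / k
        estimate += (g - estimate) / k
    return estimate


estimate = mc_mean_estimate(10_000)
print(estimate)
```

With more samples the average gets closer to the expected value, which is what lets us replace the model with data.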


RyanLee_ljx... About 4 min RL
Value Iteration and Policy Iteration

In the last chapter, we studied the Bellman optimality equation and introduced the iterative algorithm. In this chapter we introduce three model-based approaches for deriving the optimal policy. I recommend reading the PDF tutorial yourself. In this post I will mainly focus on the differences between value iteration, policy iteration, and truncated policy iteration.
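One way to see the relationship between the three algorithms is the number of evaluation sweeps performed after each greedy policy update. The sketch below is only an illustration under assumptions: the two-state deterministic model `MODEL`, the discount `GAMMA`, and the function names are all made up for this example. With `eval_sweeps=1` the loop behaves like value iteration, and with a large `eval_sweeps` it approximates policy iteration.

```python
GAMMA = 0.9
STATES = [0, 1]
ACTIONS = [0, 1]  # 0 = stay, 1 = move to the other state

# Hypothetical deterministic model: MODEL[s][a] = (next_state, reward).
# Being in (or entering) state 1 pays a reward of 1.
MODEL = {
    0: {0: (0, 0.0), 1: (1, 1.0)},
    1: {0: (1, 1.0), 1: (0, 0.0)},
}


def q_value(v, s, a):
    s_next, r = MODEL[s][a]
    return r + GAMMA * v[s_next]


def truncated_policy_iteration(eval_sweeps, num_iters=200):
    v = [0.0 for _ in STATES]
    policy = [0 for _ in STATES]
    for _ in range(num_iters):
        # Policy update: act greedily with respect to the current values.
        policy = [max(ACTIONS, key=lambda a: q_value(v, s, a)) for s in STATES]
        # Truncated policy evaluation: only `eval_sweeps` sweeps of the
        # Bellman equation for the current policy.
        for _ in range(eval_sweeps):
            v = [q_value(v, s, policy[s]) for s in STATES]
    return v, policy


v_vi, pi_vi = truncated_policy_iteration(eval_sweeps=1)    # value-iteration-like
v_pi, pi_pi = truncated_policy_iteration(eval_sweeps=50)   # policy-iteration-like
print(v_vi, pi_vi)
print(v_pi, pi_pi)
```

In this toy model both settings reach the same values and the same greedy policy; they differ only in how much evaluation work is done per policy update.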


RyanLee_ljx... About 2 min RL
Chapter 3 Optimal Policy and Bellman Optimality Equation

We know that the ultimate goal of RL is to find the optimal policy. In this chapter we show how to obtain the optimal policy through the Bellman optimality equation.

Optimal Policy

The state value can be used to evaluate whether a policy is good or not: if

$$v_{\pi_{1}}(s) \ge v_{\pi_{2}}(s), \quad \forall s \in \mathcal S$$


RyanLee_ljx... About 3 min RL
Chapter 2 Bellman Equation

In this chapter we introduce two key concepts and one important formula.

Revision

I recommend reading the motivating examples in the tutorial. Here I will skip that part and introduce the concepts directly.

Before delving into the content, we need to review the key concepts from the previous chapter.


RyanLee_ljx... About 4 min RL
Before reading

This blog series is mainly a set of notes on Mathematical Foundations of Reinforcement Learning by Shiyu Zhao of WindyLab at Westlake University.

You can find more about the book and related tutorial videos at this link.


RyanLee_ljx... Less than 1 min RL
Chapter 1 Basic Concepts of Reinforcement Learning

Reinforcement Learning (RL) can be illustrated with the grid world example.

We place an agent in an environment, and the agent's goal is to find a good route to the target. Every cell of the grid that the agent can occupy is a state. In each state the agent takes one action according to a certain policy. The goal of RL is to find a good policy that guides the agent through a sequence of actions, moving from one state to another from the start cell until it finally reaches the target.
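The following is a minimal sketch of these ideas, not the book's exact grid world. The 3x3 size, the target cell, the action names, and the hand-written `policy` are all assumptions chosen for illustration. States are cells, the policy maps each state to one action, and a rollout produces the sequence of states from the start to the target.

```python
SIZE = 3          # hypothetical 3x3 grid
TARGET = (2, 2)   # hypothetical target cell

# Each action is a (row, column) displacement.
ACTIONS = {
    "up": (-1, 0),
    "down": (1, 0),
    "left": (0, -1),
    "right": (0, 1),
    "stay": (0, 0),
}


def step(state, action):
    dr, dc = ACTIONS[action]
    r, c = state[0] + dr, state[1] + dc
    if 0 <= r < SIZE and 0 <= c < SIZE:
        return (r, c)
    return state  # hitting the boundary leaves the agent where it is


def policy(state):
    # Hand-written deterministic policy: move right until the target
    # column is reached, then move down.
    if state == TARGET:
        return "stay"
    if state[1] < TARGET[1]:
        return "right"
    return "down"


def rollout(start, max_steps=20):
    trajectory = [start]
    state = start
    for _ in range(max_steps):
        if state == TARGET:
            break
        state = step(state, policy(state))
        trajectory.append(state)
    return trajectory


trajectory = rollout((0, 0))
print(trajectory)
```

RL replaces the hand-written `policy` with one that is learned from rewards.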


RyanLee_ljx... About 3 min RL
Attention Mechanism

This article introduces a powerful technique in machine learning called the attention mechanism.

The core idea of the attention mechanism is to pay more attention to what matters. It allows a model to weigh the importance of different parts of the input dynamically rather than treating them equally. The model learns to assign higher weights to the most relevant elements.
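As a rough illustration of "assigning higher weights to the most relevant elements", here is a small pure-Python sketch of scaled dot-product attention, which is one common form of attention and is used here as an assumption about what the article goes on to cover. The toy `keys`, `values`, and `query` are made up. In a real model the queries, keys, and values come from learned projections.

```python
import math


def softmax(scores):
    # Subtract the maximum for numerical stability.
    m = max(scores)
    exps = [math.exp(x - m) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    d = len(query)
    # Relevance score of each key to the query, scaled by sqrt(d).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d)
        for key in keys
    ]
    weights = softmax(scores)
    # Output is the weighted average of the value vectors.
    dim = len(values[0])
    output = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(dim)
    ]
    return output, weights


keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
values = [[10.0], [20.0], [30.0]]
query = [1.0, 1.0]

output, weights = attention(query, keys, values)
print(weights)
print(output)
```

The third key is the most similar to the query, so it receives the largest weight and its value contributes most to the output.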


RyanLee_ljx... About 6 min ML
Control Variate

Introduction to Control Variate

Target

Reduce the variance of a random variable $X$.
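A minimal sketch of the idea, under assumptions chosen only for illustration: we estimate $E[e^U]$ for $U$ uniform on $[0, 1]$ and use $U$ itself, whose mean $0.5$ is known, as the control variate. The adjusted variable $Y = X - c\,(U - 0.5)$ has the same mean as $X$ but a smaller variance when $c$ is chosen as $\mathrm{Cov}(X, U) / \mathrm{Var}(U)$, which is estimated here from the same samples.

```python
import math
import random


def control_variate_demo(n=20_000, seed=0):
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n)]
    x = [math.exp(ui) for ui in u]          # target variable X = e^U

    mean_x = sum(x) / n
    mean_u = sum(u) / n
    cov_xu = sum((xi - mean_x) * (ui - mean_u) for xi, ui in zip(x, u)) / (n - 1)
    var_u = sum((ui - mean_u) ** 2 for ui in u) / (n - 1)
    c = cov_xu / var_u                       # estimated optimal coefficient

    # Adjusted samples: subtract the centered control variate (E[U] = 0.5).
    y = [xi - c * (ui - 0.5) for xi, ui in zip(x, u)]
    mean_y = sum(y) / n

    var_x = sum((xi - mean_x) ** 2 for xi in x) / (n - 1)
    var_y = sum((yi - mean_y) ** 2 for yi in y) / (n - 1)
    return mean_x, mean_y, var_x, var_y


mean_x, mean_y, var_x, var_y = control_variate_demo()
print(mean_x, mean_y)
print(var_x, var_y)
```

Both averages estimate the same quantity, $e - 1$, but the per-sample variance of the adjusted variable is much smaller because $U$ is strongly correlated with $e^U$.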


RyanLee_ljx... About 1 min ML