Name some approaches or algorithms you know for solving a problem in Reinforcement Learning.

1 Answer

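Dynamic Programming (DP): When the model of the environment (its transitions and rewards) is fully known, we can use DP to apply the Bellman equations iteratively, alternating between evaluating the value function of the current policy and improving the policy based on that value function.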
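To make the evaluate-then-improve loop concrete, here is a minimal policy iteration sketch. The 4-state chain environment, the reward of 1 for reaching the last state, the discount of 0.9, and all function names are assumptions made up for illustration; they are not part of the answer above.

```python
# Hypothetical 4-state chain: states 0..3, state 3 is terminal.
# Actions: 0 = left, 1 = right. Entering state 3 yields reward 1, otherwise 0.
N_STATES, TERMINAL, GAMMA = 4, 3, 0.9
ACTIONS = (0, 1)

def step_model(s, a):
    """Known, deterministic model: returns (next_state, reward)."""
    s2 = max(0, s - 1) if a == 0 else min(TERMINAL, s + 1)
    return s2, (1.0 if s2 == TERMINAL else 0.0)

def q_value(V, s, a):
    """One-step lookahead using the model and the current value estimates."""
    s2, r = step_model(s, a)
    return r + GAMMA * V[s2]

def evaluate_policy(policy, tol=1e-10):
    """Iterative policy evaluation: sweep the Bellman expectation backup until stable."""
    V = [0.0] * N_STATES  # the terminal state's value stays 0
    while True:
        delta = 0.0
        for s in range(TERMINAL):
            v_new = q_value(V, s, policy[s])
            delta = max(delta, abs(v_new - V[s]))
            V[s] = v_new
        if delta < tol:
            return V

def improve_policy(V):
    """Greedy improvement: in each state pick the action with the best lookahead."""
    return [max(ACTIONS, key=lambda a: q_value(V, s, a)) for s in range(TERMINAL)]

def policy_iteration():
    policy = [0] * TERMINAL  # start from "always go left"
    while True:
        V = evaluate_policy(policy)
        new_policy = improve_policy(V)
        if new_policy == policy:  # stable policy: stop
            return policy, V
        policy = new_policy

if __name__ == "__main__":
    print(policy_iteration())
```

Note that every backup calls `step_model` directly, which is only possible because the model is assumed known; the methods below drop that assumption.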

Monte-Carlo (MC) Methods: These learn from episodes of raw experience without modeling the environment dynamics, and use the observed mean return as an approximation of the expected return. One important requirement is that the episodes must be complete: every episode must eventually terminate, because the return is only known once the episode has ended.
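As a rough illustration of averaging observed returns, the sketch below runs first-visit MC prediction on a small random walk. The environment (states 1 to 5, terminals at 0 and 6, reward 1 only on the right exit, uniformly random moves) and every name in the code are assumptions chosen for the example.

```python
import random

def generate_episode(rng, start=3):
    """Hypothetical random walk: states 1..5, terminals 0 (reward 0) and 6 (reward 1)."""
    s, episode = start, []
    while s not in (0, 6):
        s2 = s + rng.choice((-1, 1))
        r = 1.0 if s2 == 6 else 0.0
        episode.append((s, r))  # (state, reward received on leaving it)
        s = s2
    return episode

def mc_first_visit(num_episodes=5000, gamma=1.0, seed=0):
    rng = random.Random(seed)
    returns_sum = {s: 0.0 for s in range(1, 6)}
    returns_cnt = {s: 0 for s in range(1, 6)}
    for _ in range(num_episodes):
        episode = generate_episode(rng)  # must run to termination
        G = 0.0
        first_visit_return = {}
        # Walk backwards so G is the return from each time step onward.
        for s, r in reversed(episode):
            G = r + gamma * G
            # Later overwrites come from earlier time steps, so the value
            # that survives belongs to the first visit of s.
            first_visit_return[s] = G
        for s, G_s in first_visit_return.items():
            returns_sum[s] += G_s
            returns_cnt[s] += 1
    # Value estimate = mean observed return per state.
    return {s: returns_sum[s] / returns_cnt[s] for s in returns_sum if returns_cnt[s]}

if __name__ == "__main__":
    print(mc_first_visit())
```

For this walk the true values are known to be s/6 for state s, so the estimate for the middle state should settle near 0.5.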

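Temporal-Difference (TD) Learning: Similar to Monte-Carlo methods, TD learning is model-free and learns from episodes of experience. However, TD learning can learn from incomplete episodes: it updates a state's value estimate from the observed reward plus the current estimate of the next state's value (bootstrapping), so we don't need to follow the episode up to termination before updating.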
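A minimal TD(0) sketch on the same kind of random walk shows the difference from MC: the update happens after every single step. The environment, the step size `alpha`, and the function name are assumptions for illustration only.

```python
import random

def td0_random_walk(num_episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """TD(0) prediction on a hypothetical random walk.

    States 1..5 are non-terminal; 0 and 6 are terminal. Exiting on the
    right gives reward 1, everything else gives reward 0.
    """
    rng = random.Random(seed)
    V = [0.0] * 7  # V[0] and V[6] are terminal and stay at 0
    for _ in range(num_episodes):
        s = 3
        while s not in (0, 6):
            s2 = s + rng.choice((-1, 1))
            r = 1.0 if s2 == 6 else 0.0
            # Update immediately, using the bootstrapped target r + gamma * V[s2]
            # instead of waiting for the full return of the episode.
            V[s] += alpha * (r + gamma * V[s2] - V[s])
            s = s2
    return V

if __name__ == "__main__":
    print(td0_random_walk())
```

With a constant step size the estimates keep fluctuating slightly around the true values rather than converging exactly; a decaying `alpha` is a common alternative.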

Policy Gradient: All the previous methods aim to learn the state-value or action-value function and then select actions accordingly. Policy Gradient methods instead learn the policy directly as a function parameterized by θ, so the aim is to find the θ that produces the highest expected return, typically by gradient ascent on that return.
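One simple instance of this idea is REINFORCE with a softmax policy. The sketch below applies it to a two-armed bandit, which is an assumption made to keep the example short (each "episode" is a single action); the arm probabilities, learning rate, and names are likewise hypothetical.

```python
import math
import random

def softmax(prefs):
    """Numerically stable softmax over a list of preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_bandit(steps=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    win_prob = (0.2, 0.8)  # hypothetical arms: arm 1 pays off more often
    theta = [0.0, 0.0]     # policy parameters, one preference per arm
    for _ in range(steps):
        probs = softmax(theta)
        a = 0 if rng.random() < probs[0] else 1
        reward = 1.0 if rng.random() < win_prob[a] else 0.0
        # For a softmax policy, d/d theta[k] of log pi(a) = 1[k == a] - probs[k].
        for k in range(len(theta)):
            grad_log_pi = (1.0 if k == a else 0.0) - probs[k]
            theta[k] += lr * reward * grad_log_pi  # stochastic gradient ascent
    return softmax(theta)

if __name__ == "__main__":
    print(reinforce_bandit())
```

No value function is stored anywhere: the only thing learned is `theta`, and the policy shifts probability toward the arm that tends to return more reward.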

Evolution Strategies (ES): ES searches for a good solution by imitating Darwin's theory of the evolution of species by natural selection: candidate solutions are varied at random, and the fitter ones are kept to produce the next generation. There are two prerequisites for applying ES: (i) our solutions can freely interact with the environment to see whether they solve the problem; (ii) we are able to compute a fitness score of how good each solution is. We don't have to know the environment's internal dynamics to solve the problem.
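A very small ES loop might look like the following. The fitness function here is a stand-in for "run the solution in the environment and score it", and its peak at (3, -2), the population size, the number of survivors, and the mutation scale `sigma` are all assumptions for illustration. This is one simple truncation-selection variant, not the only way to do ES.

```python
import random

def fitness(solution):
    """Hypothetical black-box score: higher is better, best at (3, -2)."""
    target = (3.0, -2.0)
    return -sum((x - t) ** 2 for x, t in zip(solution, target))

def evolve(generations=100, pop_size=50, elite=10, sigma=0.3, seed=0):
    rng = random.Random(seed)
    mean = [0.0, 0.0]  # current "parent" solution
    for _ in range(generations):
        # Mutation: sample candidates by adding Gaussian noise to the parent.
        population = [
            [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
            for _ in range(pop_size)
        ]
        # Selection: keep only the fittest candidates.
        population.sort(key=fitness, reverse=True)
        survivors = population[:elite]
        # Recombination: the next parent is the average of the survivors.
        mean = [sum(ind[i] for ind in survivors) / elite for i in range(len(mean))]
    return mean

if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

The loop only ever calls `fitness`, never a gradient or a model of the environment, which matches the two prerequisites listed above.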
