What do you understand by reward maximization?

2 Answers


1) Reward maximization is a term from reinforcement learning (RL): it is the goal of the RL agent.

2) In RL, a reward is the feedback signal the agent receives after taking an action that moves it from one state to another. It can be positive or negative.

3) If the agent takes a good action, it receives a positive reward; if it takes a bad action, it receives a negative reward (a penalty).

4) The agent's goal is to learn a policy that maximizes the total reward it collects over time. This is what is termed reward maximization.
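As a rough illustration of the four points above, the sketch below scores two hand-written policies on a tiny made-up corridor world and keeps the one with the higher total reward. The environment, the reward values (+1 for reaching the goal, -1 for every other step) and the names `step`, `run_episode`, `always_right` and `always_left` are assumptions chosen for illustration, not part of any library.

```python
# Hypothetical corridor world: states 0..4, the agent starts at 2, the goal is 4.
GOAL = 4
START = 2
MAX_STEPS = 10


def step(state, action):
    """Move left (-1) or right (+1); return (next_state, reward, done)."""
    next_state = min(max(state + action, 0), GOAL)
    if next_state == GOAL:
        return next_state, 1, True   # positive reward for reaching the goal
    return next_state, -1, False     # negative reward (penalty) otherwise


def run_episode(policy):
    """Follow `policy` from START and return the total reward collected."""
    state, total = START, 0
    for _ in range(MAX_STEPS):
        state, reward, done = step(state, policy(state))
        total += reward
        if done:
            break
    return total


def always_right(state):
    return +1


def always_left(state):
    return -1


# Reward maximization here means: prefer the policy with the larger total.
returns = {p.__name__: run_episode(p) for p in (always_right, always_left)}
best = max(returns, key=returns.get)
print(returns, "->", best)
```

A real RL agent would learn its policy from experience rather than choose between two fixed ones; the comparison only shows what "maximizing reward" measures.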

The RL agent works on the principle of reward maximization. This is why the agent must be trained so that it takes the best action in each state, that is, the action that leads to the maximum reward.

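The cumulative reward collected from time step t onward, one reward per action taken, is usually written as:

G_t = R_{t+1} + R_{t+2} + ... + R_T

Here R_{t+1} is the reward received after the action taken at step t, and T is the final step. This equation is an idealized representation of the rewards. In practice, cumulative rewards are generally not summed this way, because rewards further in the future are less certain.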
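The plain (undiscounted) sum of rewards described above can be sketched in a few lines. The function name `returns_to_go` and the reward values are made up for illustration; `rewards[t]` is assumed to hold the reward received after the action taken at step t.

```python
def returns_to_go(rewards):
    """Return, for every step t, the sum of all rewards from step t onward."""
    result = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each total reuses the one already computed for t + 1.
    for t in range(len(rewards) - 1, -1, -1):
        running += rewards[t]
        result[t] = running
    return result


episode_rewards = [1.0, 0.0, -1.0, 2.0]  # made-up rewards for one episode
g = returns_to_go(episode_rewards)
print(g)
```

Every reward counts equally in this sum, no matter how far in the future it arrives, which is the simplification that discounting later relaxes.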

Let me explain this with a small game. In the figure you can see a fox, some meat and a tiger.

Our RL agent is the fox and his end goal is to eat the maximum amount of meat before being eaten by the tiger.

Since this fox is a clever fellow, he eats the meat that is closer to him rather than the meat that is close to the tiger, because the closer he gets to the tiger, the higher his chances of being killed.

As a result, the rewards near the tiger are discounted, even if they are bigger chunks of meat. The discount reflects uncertainty: the tiger might kill the fox before he ever collects them.

The next thing to understand is how the discounting of rewards works.

To do this, we define a discount rate called gamma, whose value lies between 0 and 1. The smaller the gamma, the more heavily future rewards are discounted, so the agent favors immediate rewards; the larger the gamma, the more weight future rewards carry.
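To make the effect of gamma concrete, here is a minimal sketch using the standard discounted sum, in which a reward received k steps in the future is multiplied by gamma raised to the power k. The two reward sequences for the fox (a small chunk of meat nearby, a bigger chunk three steps further on near the tiger) and the names `discounted_return`, `near_path` and `far_path` are assumptions for illustration only.

```python
def discounted_return(rewards, gamma):
    """Sum the rewards, weighting the reward k steps ahead by gamma ** k."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie between 0 and 1")
    total = 0.0
    for k, reward in enumerate(rewards):
        total += (gamma ** k) * reward
    return total


# Hypothetical reward sequences for the fox.
near_path = [1.0]                 # small chunk, eaten on the first step
far_path = [0.0, 0.0, 0.0, 3.0]   # bigger chunk near the tiger, eaten on the fourth step

choice = {}
for gamma in (0.5, 0.9):
    near = discounted_return(near_path, gamma)
    far = discounted_return(far_path, gamma)
    choice[gamma] = "near" if near >= far else "far"
    print(f"gamma={gamma}: near={near:.3f}, far={far:.3f} -> {choice[gamma]}")
```

With these made-up numbers, the small gamma makes the nearby meat the better option, while the large gamma makes the distant, bigger chunk worth the wait.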
