SOLVED: Which of the options given below is commonly used to address reward hacking in ...

Which of the options given below is commonly used to address reward hacking in reinforcement learning by measuring the difference between two probability distributions?

a. Regularization

b. Gradient Descent

c. Policy Iteration

d. KL Divergence

Verified Answer

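Correct Option - d (KL Divergence)

KL divergence measures how much one probability distribution differs from another. In reinforcement learning from human feedback, a KL penalty is commonly added to the reward so that the trained policy stays close to a reference policy instead of drifting toward outputs that exploit flaws in the reward model.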
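To make the idea concrete, here is a minimal sketch of KL divergence for discrete distributions and of a KL-penalized reward. The function names `kl_divergence` and `penalized_reward`, the coefficient `beta`, and the example probabilities are assumptions chosen for illustration; they are not part of the exam question or of any particular library.

```python
import math


def kl_divergence(p, q):
    """Return KL(p || q) in nats for two discrete distributions (equal-length lists)."""
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of outcomes")
    total = 0.0
    for p_i, q_i in zip(p, q):
        if p_i == 0.0:
            continue  # terms with p_i = 0 contribute nothing
        if q_i == 0.0:
            return math.inf  # p puts mass where q has none
        total += p_i * math.log(p_i / q_i)
    return total


def penalized_reward(reward, policy_probs, reference_probs, beta=0.1):
    """Subtract a KL penalty from the raw reward (beta is a hypothetical coefficient)."""
    return reward - beta * kl_divergence(policy_probs, reference_probs)


if __name__ == "__main__":
    reference = [0.5, 0.5]  # hypothetical reference policy
    drifted = [0.9, 0.1]    # hypothetical policy that has drifted from the reference
    print(kl_divergence(reference, reference))
    print(kl_divergence(drifted, reference))
    print(penalized_reward(1.0, drifted, reference, beta=0.5))
```

The further the policy's distribution moves from the reference, the larger the penalty, so a high raw reward obtained by drifting far away is discounted. Note that KL divergence is not symmetric: KL(p || q) generally differs from KL(q || p).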

To get all Infosys Certified Generative AI Professional - Expert Exam questions Join Group https://bit.ly/infy_premium_group
