FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning

Source

@misc{fu_2024_furl,
    title={{FuRL}: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning}, 
    author={Fu, Yuwei and Zhang, Haichao and Wu, Di and Xu, Wei and Boulet, Benoit},
    year={2024},
    eprint={2406.00645},
    archivePrefix={arXiv},
    primaryClass={cs.AI},
    url={https://arxiv.org/abs/2406.00645}, 
}
(McGill, Horizon Robotics) arXiv

TL;DR

VLM rewards are fuzzy: they do not align well with actual goal reaching.
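
To make the fuzziness concrete, here is a minimal sketch. It assumes the VLM reward is the cosine similarity between an observation embedding and a goal-text embedding, which is a common formulation and not necessarily the exact one used in the paper. The function name `vlm_reward` and the embedding values are hypothetical, chosen for illustration only.

```python
import math


def vlm_reward(image_emb, text_emb):
    """Cosine similarity between an image embedding and a text embedding.

    Assumption: this stands in for a VLM-based reward; a real system would
    obtain both embeddings from a pretrained vision-language model.
    """
    dot = sum(a * b for a, b in zip(image_emb, text_emb))
    norm_image = math.sqrt(sum(a * a for a in image_emb))
    norm_text = math.sqrt(sum(b * b for b in text_emb))
    return dot / (norm_image * norm_text)


# Hypothetical 2-D embeddings (made-up numbers, not from the paper).
goal_text = [1.0, 0.0]   # embedding of the task description
at_goal = [1.0, 0.0]     # observation where the goal is actually reached
near_miss = [0.9, 0.1]   # observation that looks similar but is not a success

reward_success = vlm_reward(at_goal, goal_text)
reward_near_miss = vlm_reward(near_miss, goal_text)
print(reward_success, reward_near_miss)
```

In this toy setup the near-miss observation scores almost as high as the true success, so the similarity score alone does not say whether the goal was reached.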

Flash Reading

References