


UFT: Unifying Fine-Tuning of SFT and RLHF/DPO/UNA through a Generalized Implicit Reward Function

Wang, Zhichao; Bi, Bin; Zhu, Zixu; Mao, Xiangbo; Wang, Jun; et al. arXiv.org, Oct 28, 2024.
