Difference between revisions of "AI safety"

From GISAXS
(Research)
Line 18:
 
* 2023-05: [https://arxiv.org/abs/2305.03047 Principle-Driven Self-Alignment of Language Models from Scratch with Minimal Human Supervision]
 
* 2023-06: [https://arxiv.org/abs/2306.17492 Preference Ranking Optimization for Human Alignment]
 
+ * 2023-08: [https://arxiv.org/abs/2308.06259 Self-Alignment with Instruction Backtranslation]

Revision as of 13:46, 14 February 2025