- Natural Language Reinforcement Learning
  Paper • 2411.14251 • Published • 31
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Value
  Feature Extraction • 8B • Updated
- Benjamin-eecs/Llama-3.1-8B-Instruct-NLRL-TicTacToe-Policy
  Feature Extraction • 8B • Updated • 1
- Waterhorse/Llama-3.1-8B-Instruct-NLRL-Breakthrough-Value
  Feature Extraction • 8B • Updated • 2
Bo Liu
Benjamin-eecs
AI & ML interests
Reinforcement Learning, Reasoning, Machine Learning Systems
Recent Activity
upvoted a paper about 1 month ago
Paying Less Generalization Tax: A Cross-Domain Generalization Study of RL Training for LLM Agents
liked a dataset 2 months ago
facebook/principia-bench
upvoted a paper 3 months ago
On the Interplay of Pre-Training, Mid-Training, and RL on Reasoning Language Models