Xiangyi Li

xdotli

https://www.xiangyi.li

AI & ML interests

None yet

Recent Activity

upvoted a paper 3 days ago

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

upvoted a paper 8 days ago

StockBench: Can LLM Agents Trade Stocks Profitably In Real-world Markets?

submitted a paper 8 days ago

SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks

Organizations

Papers 1

arxiv:2602.12670

Models 0

None public yet

Datasets 4

xdotli/skillsbench-trajectories

Updated about 1 month ago • 2.68k

xdotli/xai-clash-eval

Updated Oct 13, 2024 • 14

xdotli/hn

Viewer • Updated Aug 30, 2024 • 642 • 3

xdotli/npr

Viewer • Updated Aug 15, 2024 • 1k • 4.08k