Projects and Papers
Exploring Bandit Algorithms
Authors: David Chen
CS 221 • 2025
This project evaluates classic and modern stochastic multi-armed bandit algorithms by comparing their theoretical regret bounds against empirical performance and runtime. I developed a simulation framework to test strategies such as ϵ-greedy, Thompson sampling, and information-directed sampling across independent and linear bandit settings. The analysis provides quantitative benchmarks and qualitative insight into how different exploration-exploitation strategies behave in practice.
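As a rough illustration of the kind of experiment such a simulation framework runs, the sketch below compares ϵ-greedy and Thompson sampling on an independent Bernoulli bandit and reports cumulative pseudo-regret. It is not the project's code: the helper name run_bandit, the arm means, the horizon, and ϵ = 0.1 are all assumptions chosen for the example.

```python
import random


def run_bandit(policy, means, horizon, seed=0, epsilon=0.1):
    """Play one Bernoulli bandit run and return cumulative pseudo-regret.

    Hypothetical helper for illustration, not the project's framework.
    """
    rng = random.Random(seed)
    k = len(means)
    wins = [0] * k    # reward-1 outcomes observed per arm
    losses = [0] * k  # reward-0 outcomes observed per arm
    best_mean = max(means)
    regret = 0.0

    for _ in range(horizon):
        if policy == "epsilon_greedy":
            if rng.random() < epsilon:
                arm = rng.randrange(k)  # explore uniformly at random
            else:
                # Exploit the best empirical mean; unplayed arms score
                # +inf so they are tried first when exploiting.
                def score(i):
                    n = wins[i] + losses[i]
                    return wins[i] / n if n else float("inf")

                arm = max(range(k), key=score)
        elif policy == "thompson":
            # Draw one sample per arm from its Beta(1 + wins, 1 + losses)
            # posterior (uniform prior) and play the largest sample.
            samples = [
                rng.betavariate(1 + wins[i], 1 + losses[i]) for i in range(k)
            ]
            arm = max(range(k), key=samples.__getitem__)
        else:
            raise ValueError(f"unknown policy: {policy}")

        # Observe a Bernoulli reward from the chosen arm.
        if rng.random() < means[arm]:
            wins[arm] += 1
        else:
            losses[arm] += 1

        # Pseudo-regret uses the true means, not the realized rewards.
        regret += best_mean - means[arm]

    return regret


if __name__ == "__main__":
    arm_means = [0.2, 0.5, 0.8]  # assumed arm means for the demo
    for name in ("epsilon_greedy", "thompson"):
        total = run_bandit(name, arm_means, horizon=2000, seed=0)
        print(f"{name}: cumulative pseudo-regret = {total:.1f}")
```

Averaging the returned regret over many seeds, and recording it at several horizons, gives an empirical curve that can be set against each algorithm's theoretical regret bound.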
Evaluating Stitching Capabilities of RvS Transformer Algorithms
Authors: David Chen
CS 234 • 2024
This paper benchmarks the ability of Transformer-based reinforcement learning methods to "stitch" suboptimal trajectories into optimal policies across challenging AntMaze environments. We introduce an enhanced Waypoint Transformer with a refined waypoint selection strategy that improves performance. These contributions provide a comprehensive evaluation of current sequence modeling approaches and suggest new avenues for goal-conditioned behavior cloning.