arxiv:2504.01931
Chakraborty
souradip24
AI & ML interests
Reinforcement Learning, Machine Learning, NLP
Recent Activity
upvoted a paper about 10 hours ago
Transfer Q Star: Principled Decoding for LLM Alignment
updated a model 2 months ago
souradip24/dpo-merged-vllm-r4-r3
published a model 2 months ago
souradip24/dpo-merged-vllm-r4-r3