Lize Pirenne

Inversta · Pangasius

AI & ML interests

LLMs, RL

Organizations

None yet

upvoted 2 papers about 8 hours ago

Hölder Policy Optimisation

Paper • 2605.12058 • Published 24 days ago • 21

Steered LLM Activations are Non-Surjective

Paper • 2604.09839 • Published 29 days ago • 13

upvoted a paper about 10 hours ago

Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces

Paper • 2605.02801 • Published May 4 • 9

upvoted a paper about 11 hours ago

OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents

Paper • 2605.05185 • Published 30 days ago • 102

upvoted 2 papers 2 days ago

Asymmetric Flow Models

Paper • 2605.12964 • Published 23 days ago • 22

Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers

Paper • 2605.06169 • Published 29 days ago • 233

upvoted a paper 27 days ago

LLM Safety From Within: Detecting Harmful Content with Internal Representations

Paper • 2604.18519 • Published Apr 20 • 26

upvoted 8 papers about 1 month ago

AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published Apr 14 • 164

Elucidating the SNR-t Bias of Diffusion Probabilistic Models

Paper • 2604.16044 • Published Apr 17 • 73

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 243

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82

The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping

Paper • 2604.11297 • Published Apr 13 • 144

Reinforcement Learning via Value Gradient Flow

Paper • 2604.14265 • Published Apr 15 • 7

Continuous Adversarial Flow Models

Paper • 2604.11521 • Published Apr 13 • 11

From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published Apr 15 • 30

upvoted 5 papers about 2 months ago

FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling

Paper • 2604.06916 • Published Apr 8 • 34

INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

Paper • 2604.07209 • Published Apr 8 • 38

DMax: Aggressive Parallel Decoding for dLLMs

Paper • 2604.08302 • Published Apr 9 • 53

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published Apr 6 • 114

GrandCode: Achieving Grandmaster Level in Competitive Programming via Agentic Reinforcement Learning

Paper • 2604.02721 • Published Apr 3 • 631