Two papers accepted at EMNLP 2025: "DyePack: Provably Flagging Test Set Contamination in LLMs Using Backdoors" and "Tool Preferences in Agentic LLMs are Unreliable".