Was looking at an ICLR 2025 Oral paper and I am shocked it got an oral [D]
After my last post about the score analysis of ICLR, I am now looking into the reviews themselves.
They evaluated LLM SQL code generation with a natural language metric rather than an execution metric, and when they tested it they found a false positive rate of around 20%. This is a major flaw, so how did it even get an oral?
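To make the distinction concrete, here is a rough sketch of why a text-based score can produce false positives where an execution check does not. Everything in it is my own illustration and not the paper's setup: the `orders` table, the two queries, and the use of `difflib.SequenceMatcher` as a stand-in for a natural language metric are all assumptions.

```python
import sqlite3
from difflib import SequenceMatcher


def text_similarity(pred: str, gold: str) -> float:
    """Token-level similarity in [0, 1]; a stand-in for a text-based metric."""
    return SequenceMatcher(None, pred.lower().split(), gold.lower().split()).ratio()


def execution_match(conn: sqlite3.Connection, pred: str, gold: str) -> bool:
    """True if both queries return the same rows (row order ignored)."""
    try:
        pred_rows = conn.execute(pred).fetchall()
    except sqlite3.Error:
        # A predicted query that fails to run counts as a miss.
        return False
    gold_rows = conn.execute(gold).fetchall()
    return sorted(pred_rows) == sorted(gold_rows)


def make_db() -> sqlite3.Connection:
    """Build a tiny in-memory database (hypothetical schema and data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, amount REAL, status TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, 10.0, "paid"), (2, 25.0, "open"), (3, 40.0, "paid")],
    )
    return conn


conn = make_db()
gold = "SELECT id FROM orders WHERE status = 'paid' AND amount > 20"
# One token differs (AND -> OR), but the result set changes.
near_miss = "SELECT id FROM orders WHERE status = 'paid' OR amount > 20"
# Conditions reordered: the text differs, the result set does not.
reordered = "SELECT id FROM orders WHERE amount > 20 AND status = 'paid'"

for name, pred in [("near_miss", near_miss), ("reordered", reordered)]:
    print(name, round(text_similarity(pred, gold), 2), execution_match(conn, pred, gold))
```

The near-miss query looks almost identical to the reference but returns different rows, which is the kind of case a text-based metric would wrongly accept. Comparing sorted rows ignores `ORDER BY`, so a real execution metric would need to handle ordering when the question asks for it.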
submitted by /u/Striking-Warning9533