arXiv:2503.17181v4
Abstract: Despite the rapid progress of large language models (LLMs) in code generation, existing evaluations focus on functional correctness or syntactic validity, overlooking how LLMs make critical design choices such as which library or programming language to use. To fill this gap, we perform...