Commit 21a7ca2 · Parent: 9dc61b0
Add MLCAD 2025 citation
Files changed:
- app.py (+6 -6)
- static/about.py (+7 -5)
app.py CHANGED

@@ -214,7 +214,7 @@ with gr.Blocks(
 
 <a href="http://arxiv.org/abs/2504.01986" target="_blank" style="text-decoration: none; margin-right: 10px;">
 <button style="background: #b31b1b; color: white; padding: 10px 14px; border-radius: 8px; border: none; font-size: 16px; cursor: pointer;">
-arXiv
+arXiv MLCAD 2025
 </button>
 </a>
 

@@ -235,7 +235,8 @@ with gr.Blocks(
 <p style="margin-bottom: 15px; text-align: start !important;">Welcome to the TuRTLe Model Leaderboard! TuRTLe is a <b>unified evaluation framework designed to systematically assess Large Language Models (LLMs) in RTL (Register-Transfer Level) generation</b> for hardware design.
 Evaluation criteria include <b>syntax correctness, functional accuracy, synthesizability, and post-synthesis quality</b> (PPA: Power, Performance, Area). TuRTLe integrates multiple benchmarks to highlight strengths and weaknesses of available LLMs.
 Use the filters below to explore different RTL benchmarks, simulators and models.</p>
-<p style="margin-top:10px; text-align:start !important;"> <span style="font-variant:small-caps; font-weight:bold;">UPDATE (JULY 2025)</span>:
+<p style="margin-top:10px; text-align:start !important;"> <span style="font-variant:small-caps; font-weight:bold;">UPDATE (JULY 2025)</span>: Our TuRTLe paper has been accepted to <a href="https://mlcad.org/symposium/2025/" target="_blank"><b>MLCAD 2025</b></a> which will be held in September in Santa Cruz, California!</p>
+<p style="margin-top: -6px; text-align:start !important;"> <span style="font-variant:small-caps; font-weight:bold;">UPDATE (JULY 2025)</span>: Verilator has been added as an additional simulator alongside Icarus Verilog. You can now filter and compare results by simulator</p>
 <p style="margin-top: -6px; text-align: start !important; "><span style="font-variant: small-caps; font-weight: bold;">UPDATE (JUNE 2025)</span>: We make our framework open-source on GitHub and we add 7 new recent models! For a total of 40 base and instruct models and 5 RTL benchmarks</p>
 </div>
 """

@@ -371,15 +372,14 @@ with gr.Blocks(
 <div style="max-width: 800px; margin: auto; padding: 20px; border: 1px solid #ccc; border-radius: 10px;">
 <ul style="font-size: 16px; margin-bottom: 20px; margin-top: 20px;">
 <li><a href="https://github.com/bigcode-project/bigcode-evaluation-harness" target="_blank">Code Generation LM Evaluation Harness</a></li>
+<li>Williams, S. Icarus Verilog [Computer software]. <a href="https://github.com/steveicarus/iverilog" target="_blank">https://github.com/steveicarus/iverilog</a></li>
+<li>Snyder, W., Wasson, P., Galbi, D., & et al. Verilator [Computer software]. <a href="https://github.com/verilator/verilator" target="_blank">https://github.com/verilator/verilator</a></li>
 <li>RTL-Repo: Allam and M. Shalan, “Rtl-repo: A benchmark for evaluating llms on large-scale rtl design projects,” in 2024 IEEE LLM Aided Design Workshop (LAD). IEEE, 2024, pp. 1–5.</li>
 <li>VeriGen: S. Thakur, B. Ahmad, H. Pearce, B. Tan, B. Dolan-Gavitt, R. Karri, and S. Garg, “Verigen: A large language model for verilog code generation,” ACM Transactions on Design Automation of Electronic Systems, vol. 29, no. 3, pp. 1–31, 2024.</li>
 <li>VerilogEval (I): M. Liu, N. Pinckney, B. Khailany, and H. Ren, “Verilogeval: Evaluating large language models for verilog code generation,” in 2023 IEEE/ACM International Conference on Computer Aided Design (ICCAD). IEEE, 2023, pp. 1–8.</li>
 <li>VerilogEval (II): N. Pinckney, C. Batten, M. Liu, H. Ren, and B. Khailany, “Revisiting VerilogEval: A Year of Improvements in Large-Language Models for Hardware Code Generation,” ACM Trans. Des. Autom. Electron. Syst., feb 2025. https://doi.org/10.1145/3718088</li>
 <li>RTLLM: Y. Lu, S. Liu, Q. Zhang, and Z. Xie, “Rtllm: An open-source benchmark for design rtl generation with large language model,” in 2024 29th Asia and South Pacific Design Automation Conference (ASP-DAC). IEEE, 2024, pp. 722–727.</li>
 </ul>
-<p style="font-size: 16px; margin-top: 15px;">
-Feel free to contact us:
-</p>
 </div>
 """
 )

@@ -388,7 +388,7 @@ with gr.Blocks(
 citation_button = gr.Textbox(
 value=CITATION_BUTTON_TEXT,
 label=CITATION_BUTTON_LABEL,
-lines=
+lines=14,
 elem_id="citation-button",
 show_copy_button=True,
 )
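For context, the lines=14 change only affects how tall the citation box renders. Below is a minimal standalone sketch of that widget, assuming a recent Gradio release and assuming the two constants are importable from static/about.py as the repo layout suggests; it is not the full app.

import gradio as gr

from static.about import CITATION_BUTTON_LABEL, CITATION_BUTTON_TEXT

with gr.Blocks() as demo:
    # Same arguments as the updated call in app.py; lines=14 keeps the
    # whole 12-line BibTeX entry visible without scrolling.
    citation_button = gr.Textbox(
        value=CITATION_BUTTON_TEXT,
        label=CITATION_BUTTON_LABEL,
        lines=14,
        elem_id="citation-button",
        show_copy_button=True,  # one-click copy of the snippet
    )

if __name__ == "__main__":
    demo.launch()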
static/about.py CHANGED

@@ -1,10 +1,12 @@
 CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results"
-CITATION_BUTTON_TEXT = r"""@
+CITATION_BUTTON_TEXT = r"""@inproceedings{garciagasulla2025turtleunifiedevaluationllms,
 title={TuRTLe: A Unified Evaluation of LLMs for RTL Generation},
 author={Dario Garcia-Gasulla and Gokcen Kestor and Emanuele Parisi and Miquel Albert\'i-Binimelis and Cristian Gutierrez and Razine Moundir Ghorab and Orlando Montenegro and Bernat Homs and Miquel Moreto},
+booktitle = {Proceedings of the 2025 ACM/IEEE International Symposium on Machine Learning for CAD},
+series = {MLCAD '25},
 year={2025},
-
-
-
+publisher = {Association for Computing Machinery},
+address = {New York, NY, USA},
+location = {Santa Cruz, CA, USA},
 url={https://arxiv.org/abs/2504.01986},
-}"""
+}"""
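One way to sanity-check that the new @inproceedings entry is well-formed BibTeX is to round-trip it through a parser. A sketch, assuming the third-party bibtexparser 1.x package (not a dependency of the Space itself):

import bibtexparser  # pip install bibtexparser

from static.about import CITATION_BUTTON_TEXT

# Parse the raw citation string and inspect the resulting entry.
db = bibtexparser.loads(CITATION_BUTTON_TEXT)
assert len(db.entries) == 1, "expected exactly one BibTeX entry"

entry = db.entries[0]
print(entry["ENTRYTYPE"])  # inproceedings
print(entry["ID"])         # garciagasulla2025turtleunifiedevaluationllms
print(entry["booktitle"])  # Proceedings of the 2025 ACM/IEEE International Symposium ...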