Update index.html
eternal8080 authored Jul 7, 2024
1 parent 897b38b commit 4ce6be7
Showing 1 changed file with 2 additions and 2 deletions.
Expand Up @@ -170,8 +170,8 @@ <h1 class="title is-1 publication-title">GeoEval: Benchmark for Evaluating LLMs
<div class="box m-5">
<div class="content has-text-centered">
<img src="static/images/lidia.jpg" alt="geometric reasoning" width="84%"/>
<p> Accuracy scores of one leading LLM (i.e., PoT GPT-4), four primary LMMs, random chance, and human performance on our proposed benchmark, across mathematical reasoning and visual context types. PoT refers to program-of-thought prompting, and PoT GPT-4 is a textual LLM augmented with the caption and OCR text. GPT-4V is manually evaluated via the playground chatbot. <b class="best-score-text" style="color: #C6011F"> The scores of Gemini Ultra are from the Gemini Team, Google.</b>
</p>
<p> The performance of models varies across subjects, revealing distinct strengths. The WizardMath-7B model significantly outperforms the others on flat geometry problems, such as lengths and lines. Conversely, on solid geometry problems involving cuboids and spheres, GPT-4V surpasses WizardMath-7B, indicating its superior capability on solid geometry questions.
</p>
</div>
</div>
