Showing 140 changed files with 396 additions and 0 deletions.
Binary file added
BIN
+5.88 KB
.../2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/1/merged-output.png
4 changes: 4 additions & 0 deletions
4
evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<text x="310" y="185" font-family="Noto Sans" font-size="40" stroke="none" fill="black">10</text> | ||
</svg> |
Binary file added
BIN
+6.1 KB
...results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/1/result.png
Binary file added
BIN
+5.89 KB
.../2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/2/merged-output.png
4 changes: 4 additions & 0 deletions
4
evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<text x="360" y="150" font-family="Noto Sans" font-size="40" stroke="none">10</text> | ||
</svg> |
Binary file added
BIN
+6.1 KB
...results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/2/result.png
Binary file added
BIN
+5.83 KB
.../2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/3/merged-output.png
4 changes: 4 additions & 0 deletions
4
evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<text x="310" y="180" font-family="Noto Sans" font-size="24" stroke="none" fill="black">10</text> | ||
</svg> |
Binary file added
BIN
+6.04 KB
...results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/3/result.png
Binary file added
BIN
+5.85 KB
...024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/1/merged-output.png
4 changes: 4 additions & 0 deletions
4
evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<text x="400" y="300" font-family="Noto Sans" font-size="24" stroke="none" fill="black">10</text> | ||
</svg> |
Binary file added
BIN
+6.04 KB
...sults/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/1/result.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
10 |
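Review note: this `result.out` holds the plain text `10` rather than an SVG document, and no `merged-output.png` or `result.png` is listed for this attempt. A minimal sketch of how such outputs could be told apart, assuming only Python's standard library; `classify_result` is a hypothetical helper and not part of the repository:

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def classify_result(text: str) -> str:
    """Return "svg" if text parses as XML with an <svg> root, else "text".

    Hypothetical helper for illustration; the evaluation code may decide
    this differently.
    """
    try:
        # Encode to bytes so an XML declaration with encoding="UTF-8" is honoured.
        root = ET.fromstring(text.strip().encode("utf-8"))
    except ET.ParseError:
        # Plain answers such as "10" are not well-formed XML.
        return "text"
    return "svg" if root.tag == f"{{{SVG_NS}}}svg" else "text"
```

Applied to the files in this commit, the one-line SVG outputs would be classified as `svg`, while a bare `10` would be classified as `text`.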
Binary file added
BIN
+5.85 KB
...024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/3/merged-output.png
4 changes: 4 additions & 0 deletions
4
evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<text x="420" y="300" font-family="Noto Sans" font-size="24" stroke="none" fill="black">10</text> | ||
</svg> |
Binary file added
BIN
+6.04 KB
...sults/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/3/result.png
Binary file added
BIN
+5.92 KB
...n_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/1/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"><text x="410" y="285" font-family="Noto Sans" font-size="40">10</text></svg> |
Binary file added
BIN
+6.1 KB
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/1/result.png
Binary file added
BIN
+5.9 KB
...n_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/2/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"><text x="410" y="370" font-family="Noto Sans" font-size="36" fill="black">10</text></svg> |
Binary file added
BIN
+6.08 KB
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/2/result.png
Binary file added
BIN
+6.03 KB
...n_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/3/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><text x='570' y='410' font-family='Noto Sans' font-size='64'>10</text></svg> |
Binary file added
BIN
+6.18 KB
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/3/result.png
Binary file added
BIN
+5.85 KB
...tion_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/1/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><text x='557' y='484' font-family='Noto Sans' font-size='18'>10</text></svg> |
Binary file added
BIN
+6.03 KB
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/1/result.png
Binary file added
BIN
+5.92 KB
...tion_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/2/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"><text x="360" y="715" font-family="Noto Sans" font-size="40">10</text></svg> |
Binary file added
BIN
+6.1 KB
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/2/result.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
10 |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,135 @@ | ||
# Ghostwriter evaluation results 2024-12-21_13-57-31 | ||
|
||
There are 4 scenarios and 4 test cases with 3 attempts (48 total tests). | ||
## Test: blank_math | ||
|
||
### claude_sonnet_latest_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluations/blank_math/input.png' border=1 width=200 /> | ||
|
||
``` | ||
10 | ||
``` | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_with_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o-mini_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o-mini_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/gpt-4o_with_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluations/blank_math/input.png' border=1 width=200 /> | ||
|
||
``` | ||
10 | ||
``` | ||
|
||
### claude_sonnet_latest_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/blank_math/claude_sonnet_latest_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
## Test: tic_tac_toe_1 | ||
|
||
### claude_sonnet_latest_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluations/tic_tac_toe_1/input.png' border=1 width=200 /> | ||
|
||
``` | ||
Your turn! Place an O anywhere you'd like. | ||
``` | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o-mini_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### claude_sonnet_latest_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
## Test: x_in_box | ||
|
||
### claude_sonnet_latest_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o-mini_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### claude_sonnet_latest_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
## Test: x_in_boxes | ||
|
||
### claude_sonnet_latest_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o-mini_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### gpt-4o_with_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o_with_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o_with_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o_with_seg/3/merged-output.png' border=1 width=200 /> | ||
|
||
### claude_sonnet_latest_no_seg | ||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/1/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/2/merged-output.png' border=1 width=200 /> | ||
|
||
<img src='../../evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/3/merged-output.png' border=1 width=200 /> | ||
|
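Review note: the counts in the report header (4 scenarios, 4 test cases, 3 attempts, 48 tests) match the directory layout visible in the file paths of this commit. A small sketch that enumerates the expected `result.out` paths, assuming that layout holds for every combination; `result_paths` is a hypothetical helper name:

```python
from itertools import product
from pathlib import PurePosixPath

RUN = "2024-12-21_13-57-31"
TESTS = ["blank_math", "tic_tac_toe_1", "x_in_box", "x_in_boxes"]
SCENARIOS = [
    "claude_sonnet_latest_with_seg",
    "gpt-4o-mini_no_seg",
    "gpt-4o_with_seg",
    "claude_sonnet_latest_no_seg",
]
ATTEMPTS = [1, 2, 3]


def result_paths(run: str = RUN) -> list[PurePosixPath]:
    """List one result.out path per (test, scenario, attempt) combination.

    The layout is inferred from the paths in this diff and is an assumption.
    """
    return [
        PurePosixPath("evaluation_results", run, test, scenario, str(attempt), "result.out")
        for test, scenario, attempt in product(TESTS, SCENARIOS, ATTEMPTS)
    ]
```

Comparing this list against the files actually present would be one way to spot attempts that produced no output at all.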
Binary file added
BIN
+16 KB
...24-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/1/merged-output.png
4 changes: 4 additions & 0 deletions
4
...uation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<circle cx="150" cy="200" r="15" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+6.13 KB
...ults/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/1/result.png
Binary file added
BIN
+16.2 KB
...24-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/2/merged-output.png
4 changes: 4 additions & 0 deletions
4
...uation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<circle cx="200" cy="250" r="30" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+6.24 KB
...ults/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/2/result.png
Binary file added
BIN
+16.1 KB
...24-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/3/merged-output.png
4 changes: 4 additions & 0 deletions
4
...uation_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<circle cx="140" cy="230" r="25" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+6.21 KB
...ults/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_no_seg/3/result.png
Binary file added
BIN
+16 KB
...-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/1/merged-output.png
4 changes: 4 additions & 0 deletions
4
...tion_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<circle cx="240" cy="380" r="20" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+6.14 KB
...ts/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/1/result.png
1 change: 1 addition & 0 deletions
1
...tion_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
Your turn! Place an O anywhere you'd like. |
Binary file added
BIN
+16 KB
...-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/3/merged-output.png
4 changes: 4 additions & 0 deletions
4
...tion_results/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<circle cx="250" cy="400" r="20" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+6.14 KB
...ts/2024-12-21_13-57-31/tic_tac_toe_1/claude_sonnet_latest_with_seg/3/result.png
Binary file added
BIN
+16.4 KB
...esults/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/1/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 768 1024'><circle cx='384' cy='600' r='60' stroke='black' stroke-width='5' fill='none' /></svg> |
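Review note: this output declares its canvas with `viewBox` only, whereas the other SVG outputs shown here set `width` and `height`. A hedged sketch of reading the canvas size from either form, assuming unitless numeric values; `canvas_size` is a hypothetical helper, not something the repository is known to contain:

```python
import xml.etree.ElementTree as ET
from typing import Optional, Tuple


def canvas_size(svg_text: str) -> Optional[Tuple[float, float]]:
    """Return (width, height) from width/height attributes, else from viewBox.

    Assumes plain numbers without units such as "px"; returns None when
    neither form is present.
    """
    root = ET.fromstring(svg_text.strip().encode("utf-8"))
    width, height = root.get("width"), root.get("height")
    if width is not None and height is not None:
        return float(width), float(height)
    view_box = root.get("viewBox")
    if view_box is not None:
        # viewBox is "min-x min-y width height", separated by spaces or commas.
        parts = view_box.replace(",", " ").split()
        if len(parts) == 4:
            return float(parts[2]), float(parts[3])
    return None
```

Both attribute styles found in this commit describe a 768 by 1024 canvas, so a check like this would treat them as equivalent.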
Binary file added
BIN
+6.63 KB
...ation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/1/result.png
Binary file added
BIN
+16.3 KB
...esults/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/2/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"><circle cx="384" cy="410" r="50" stroke="black" stroke-width="5" fill="none"/></svg> |
Binary file added
BIN
+6.53 KB
...ation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/2/result.png
Binary file added
BIN
+16.4 KB
...esults/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/3/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><circle cx='390' cy='510' r='50' stroke='black' stroke-width='5' fill='none'/></svg> |
Binary file added
BIN
+6.53 KB
...ation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o-mini_no_seg/3/result.png
Binary file added
BIN
+16.1 KB
...n_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/1/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' height='1024' width='768'><circle cx='246' cy='430' r='25' stroke='black' stroke-width='3' fill='none'/></svg> |
Binary file added
BIN
+6.19 KB
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/1/result.png
Binary file added
BIN
+16.2 KB
...n_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/2/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><circle cx='512' cy='512' r='35' stroke='black' stroke-width='3' fill='none' /></svg> |
Binary file added
BIN
+6.27 KB
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/2/result.png
Binary file added
BIN
+16.3 KB
...n_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/3/merged-output.png
1 change: 1 addition & 0 deletions
1
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><circle cx='384' cy='563' r='50' stroke='black' stroke-width='5' fill='none'/></svg> |
Binary file added
BIN
+6.53 KB
evaluation_results/2024-12-21_13-57-31/tic_tac_toe_1/gpt-4o_with_seg/3/result.png
Binary file added
BIN
+8.03 KB
...ts/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/1/merged-output.png
5 changes: 5 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<line x1="300" y1="400" x2="500" y2="600" stroke="black" stroke-width="2"/> | ||
<line x1="500" y1="400" x2="300" y2="600" stroke="black" stroke-width="2"/> | ||
</svg> |
Binary file added
BIN
+7.28 KB
...n_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/1/result.png
Binary file added
BIN
+8.05 KB
...ts/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/2/merged-output.png
5 changes: 5 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<line x1="284" y1="437" x2="484" y2="637" stroke="black" stroke-width="2"/> | ||
<line x1="484" y1="437" x2="284" y2="637" stroke="black" stroke-width="2"/> | ||
</svg> |
Binary file added
BIN
+7.28 KB
...n_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/2/result.png
Binary file added
BIN
+8.03 KB
...ts/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/3/merged-output.png
7 changes: 7 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
<?xml version="1.0" encoding="UTF-8"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<path d="M 300 400 L 500 600 M 500 400 L 300 600" | ||
stroke="black" | ||
stroke-width="2" | ||
fill="none"/> | ||
</svg> |
Binary file added
BIN
+7.28 KB
...n_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_no_seg/3/result.png
Binary file added
BIN
+8.69 KB
.../2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/1/merged-output.png
5 changes: 5 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<line x1="400" y1="572" x2="1040" y2="1185" stroke="black" stroke-width="2"/> | ||
<line x1="1040" y1="572" x2="400" y2="1185" stroke="black" stroke-width="2"/> | ||
</svg> |
Binary file added
BIN
+7.69 KB
...results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/1/result.png
Binary file added
BIN
+8.09 KB
.../2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/2/merged-output.png
7 changes: 7 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<path d="M 420 592 L 620 793 M 620 592 L 420 793" | ||
stroke="black" | ||
stroke-width="2" | ||
fill="none"/> | ||
</svg> |
Binary file added
BIN
+7.11 KB
...results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/2/result.png
Binary file added
BIN
+8.03 KB
.../2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/3/merged-output.png
5 changes: 5 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<line x1="420" y1="420" x2="620" y2="620" stroke="black" stroke-width="2"/> | ||
<line x1="620" y1="420" x2="420" y2="620" stroke="black" stroke-width="2"/> | ||
</svg> |
Binary file added
BIN
+7.12 KB
...results/2024-12-21_13-57-31/x_in_box/claude_sonnet_latest_with_seg/3/result.png
Binary file added
BIN
+9.36 KB
...ion_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/1/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 768 1024'><line x1='170' y1='350' x2='560' y2='750' stroke='black' stroke-width='3'/><line x1='560' y1='350' x2='170' y2='750' stroke='black' stroke-width='3'/></svg> |
Binary file added
BIN
+8.57 KB
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/1/result.png
Binary file added
BIN
+9.25 KB
...ion_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/2/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><line x1='180' y1='250' x2='590' y2='660' stroke='black' stroke-width='2'/><line x1='590' y1='250' x2='180' y2='660' stroke='black' stroke-width='2'/></svg> |
Binary file added
BIN
+8.45 KB
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/2/result.png
Binary file added
BIN
+8.42 KB
...ion_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/3/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 768 1024"><line x1="250" y1="394" x2="518" y2="662" stroke="black" stroke-width="2"/><line x1="518" y1="394" x2="250" y2="662" stroke="black" stroke-width="2"/></svg> |
Binary file added
BIN
+7.63 KB
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o-mini_no_seg/3/result.png
Binary file added
BIN
+8.65 KB
...uation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/1/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><line x1='420' y1='600' x2='1010' y2='1170' stroke='black' stroke-width='2'/><line x1='1010' y1='600' x2='420' y2='1170' stroke='black' stroke-width='2'/></svg> |
Binary file added
BIN
+7.62 KB
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/1/result.png
Binary file added
BIN
+8.72 KB
...uation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/2/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"><line x1="400" y1="572" x2="1040" y2="1185" stroke="black" stroke-width="3"/><line x1="1040" y1="572" x2="400" y2="1185" stroke="black" stroke-width="3"/></svg> |
Binary file added
BIN
+7.88 KB
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/2/result.png
Binary file added
BIN
+8.65 KB
...uation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/3/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><line x1='450' y1='600' x2='990' y2='1150' stroke='black' stroke-width='3'/><line x1='990' y1='600' x2='450' y2='1150' stroke='black' stroke-width='3'/></svg> |
Binary file added
BIN
+7.73 KB
evaluation_results/2024-12-21_13-57-31/x_in_box/gpt-4o_with_seg/3/result.png
Binary file added
BIN
+9.58 KB
.../2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/1/merged-output.png
11 changes: 11 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<!-- X in first box --> | ||
<path d="M 100,200 L 300,350 M 300,200 L 100,350" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in second box --> | ||
<path d="M 450,200 L 650,350 M 650,200 L 450,350" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in third box --> | ||
<path d="M 100,500 L 200,600 M 200,500 L 100,600" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+8.45 KB
...results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/1/result.png
Binary file added
BIN
+10.3 KB
.../2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/2/merged-output.png
11 changes: 11 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<!-- X in top left box --> | ||
<path d="M 100 200 L 300 400 M 300 200 L 100 400" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in top right box --> | ||
<path d="M 400 200 L 600 400 M 600 200 L 400 400" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in bottom box --> | ||
<path d="M 150 500 L 300 650 M 300 500 L 150 650" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+9.13 KB
...results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/2/result.png
Binary file added
BIN
+9.42 KB
.../2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/3/merged-output.png
14 changes: 14 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<!-- X in upper left box --> | ||
<path d="M 200 250 L 350 400 M 350 250 L 200 400" | ||
stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in upper right box --> | ||
<path d="M 450 250 L 600 400 M 600 250 L 450 400" | ||
stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in bottom box --> | ||
<path d="M 325 450 L 425 550 M 425 450 L 325 550" | ||
stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+8.31 KB
...results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_no_seg/3/result.png
Binary file added
BIN
+8.23 KB
...024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/1/merged-output.png
11 changes: 11 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,11 @@ | ||
<?xml version="1.0" encoding="UTF-8" standalone="no"?> | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024" viewBox="0 0 768 1024"> | ||
<!-- X in left box --> | ||
<path d="M170 154 L229 227 M229 154 L170 227" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in right box --> | ||
<path d="M559 152 L588 205 M588 152 L559 205" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in bottom box --> | ||
<path d="M307 1242 L456 1405 M456 1242 L307 1405" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+6.54 KB
...sults/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/1/result.png
Binary file added
BIN
+9.46 KB
...024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/2/merged-output.png
10 changes: 10 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024" viewBox="0 0 768 1024"> | ||
<!-- Left box X --> | ||
<path d="M145,568 L473,861 M473,568 L145,861" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- Right box X --> | ||
<path d="M831,763 L1152,1075 M1152,763 L831,1075" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- Bottom box X --> | ||
<path d="M317,1252 L446,1395 M446,1252 L317,1395" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+7.76 KB
...sults/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/2/result.png
Binary file added
BIN
+8.47 KB
...024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/3/merged-output.png
10 changes: 10 additions & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/3/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,10 @@ | ||
<svg xmlns="http://www.w3.org/2000/svg" width="768" height="1024"> | ||
<!-- X in top left box --> | ||
<path d="M160,144 L239,237 M239,144 L160,237" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in top right box --> | ||
<path d="M549,142 L598,215 M598,142 L549,215" stroke="black" stroke-width="2" fill="none"/> | ||
|
||
<!-- X in bottom box --> | ||
<path d="M297,1232 L466,1415 M466,1232 L297,1415" stroke="black" stroke-width="2" fill="none"/> | ||
</svg> |
Binary file added
BIN
+6.81 KB
...sults/2024-12-21_13-57-31/x_in_boxes/claude_sonnet_latest_with_seg/3/result.png
Binary file added
BIN
+9.53 KB
...n_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/1/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/1/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><line x1='100' y1='200' x2='250' y2='350' stroke='black' stroke-width='2'/><line x1='250' y1='200' x2='100' y2='350' stroke='black' stroke-width='2'/><line x1='350' y1='300' x2='500' y2='450' stroke='black' stroke-width='2'/><line x1='500' y1='300' x2='350' y2='450' stroke='black' stroke-width='2'/><line x1='150' y1='600' x2='230' y2='680' stroke='black' stroke-width='2'/><line x1='230' y1='600' x2='150' y2='680' stroke='black' stroke-width='2'/></svg> |
Binary file added
BIN
+8.35 KB
evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/1/result.png
Binary file added
BIN
+11.1 KB
...n_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/2/merged-output.png
1 change: 1 addition & 0 deletions
evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/2/result.out
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
<svg xmlns='http://www.w3.org/2000/svg' width='768' height='1024'><line x1='120' y1='268' x2='395' y2='543' stroke='black' stroke-width='3'/><line x1='395' y1='268' x2='120' y2='543' stroke='black' stroke-width='3'/><line x1='422' y1='395' x2='697' y2='670' stroke='black' stroke-width='3'/><line x1='697' y1='395' x2='422' y2='670' stroke='black' stroke-width='3'/><line x1='120' y1='668' x2='245' y2='793' stroke='black' stroke-width='3'/><line x1='245' y1='668' x2='120' y2='793' stroke='black' stroke-width='3'/></svg> |
Binary file added
BIN
+10.6 KB
evaluation_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/2/result.png
Binary file added
BIN
+10.4 KB
...n_results/2024-12-21_13-57-31/x_in_boxes/gpt-4o-mini_no_seg/3/merged-output.png