
Commit

Some clarifications from Laura
andreas-zeller committed Jan 7, 2025
1 parent 270defd commit 9b82c56
Showing 1 changed file with 22 additions and 2 deletions.
24 changes: 22 additions & 2 deletions notebooks/Alhazen.ipynb
@@ -685,6 +685,23 @@
"3. Computation of _feature vectors_ from a set of inputs, which will then be used as input for the decision tree"
]
},
+ {
+ "cell_type": "markdown",
+ "metadata": {},
+ "source": [
+ "### Internal and \"Friendly\" Feature Names\n",
+ "\n",
+ "We use two kinds of _names_ for features:\n",
+ "\n",
+ "* _internal_ names have the form `<SYMBOL>@N` and refer to the `N`-th expansion of `<SYMBOL>` (starting with 0).\n",
+ "  In `CALC_GRAMMAR`, for instance, `<function>@0` refers to the expansion of `<function>` to `\"sqrt\"`.\n",
+ "* _friendly_ names are more user-friendly (hence the name).\n",
+ "  The above feature `<function>@0` has the \"friendly\" name `<function> == \"sqrt\"`.\n",
+ "\n",
+ "We use internal names in all our interactions with the machine learner, as they are unambiguous and do not contain whitespace.\n",
+ "When showing the final results, we switch to \"friendly\" names."
+ ]
+ },
{
"cell_type": "markdown",
"metadata": {},
@@ -1003,7 +1020,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
- "The `friendly` format is a bit more concise and more readable:"
+ "The `friendly` representation is a bit more concise and more readable:"
]
},
{
@@ -3381,7 +3398,10 @@
"If the predicate evaluates to `True`, follow the left path; if it evaluates to `False`, follow the right path.\n",
"A leaf node (no children) will give you the final decision `class = BUG` or `class = NO_BUG`.\n",
"\n",
- "So if the predicate states `<function> == 'sqrt' <= 0.5`, this means that if the function is _not_ `sqrt`, follow the left (`True`) path. If it is `sqrt`, follow the right (`False`) path.\n",
+ "So if the predicate states `<function> == 'sqrt' <= 0.5`, this means that\n",
+ "\n",
+ "* If the function is _not_ `sqrt` (the predicate `<function> == 'sqrt'` is negative, see above, and hence less than 0.5), follow the left (`True`) path.\n",
+ "* If the function _is_ `sqrt` (the predicate `<function> == 'sqrt'` is positive), follow the right (`False`) path.\n",
"\n",
"The `samples` field shows the number of sample inputs that contributed to this decision.\n",
"The `gini` field (aka Gini impurity) indicates how many samples fall into the displayed class (`BUG` or `NO_BUG`).\n",
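The two ideas this commit documents, mapping an internal name `<SYMBOL>@N` to its "friendly" form and reading a node such as `<function> == 'sqrt' <= 0.5`, can be illustrated with a minimal sketch. It is not the notebook's implementation: `GRAMMAR`, `friendly_name`, and `takes_left_path` are hypothetical names, the grammar is a cut-down stand-in for `CALC_GRAMMAR`, and the feature encoding (1 if the expansion occurs in the input, 0 otherwise) is an assumption made for illustration.

```python
# Hypothetical miniature grammar; the notebook's CALC_GRAMMAR is larger.
# Only the first expansion ("sqrt") is confirmed by the text above.
GRAMMAR = {
    "<function>": ["sqrt", "sin", "cos", "tan"],
}


def friendly_name(internal: str, grammar: dict) -> str:
    """Turn an internal name `<SYMBOL>@N` into `<SYMBOL> == "expansion"`."""
    # Split at the last "@": the left part is the symbol, the right part is N.
    symbol, index = internal.rsplit("@", 1)
    expansion = grammar[symbol][int(index)]  # N counts from 0
    return f'{symbol} == "{expansion}"'


def takes_left_path(feature_value: float, threshold: float = 0.5) -> bool:
    """A node `feature <= threshold` sends an input left when the test is True."""
    return feature_value <= threshold


# Assumed encoding: 0 = expansion absent, 1 = expansion present.
print(friendly_name("<function>@0", GRAMMAR))
print(takes_left_path(0))  # function is not sqrt: left (True) path
print(takes_left_path(1))  # function is sqrt: right (False) path
```

Under this assumed encoding, a threshold of 0.5 simply separates "absent" from "present", which is why the left (`True`) branch corresponds to inputs whose function is not `sqrt`.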
