
Commit

stopped developing gessamans rule based multivariate LSx and instead switched to decision tree
kaiguender committed Aug 27, 2024
1 parent 620d04b commit 765184c
Showing 13 changed files with 868 additions and 11 deletions.
161 changes: 157 additions & 4 deletions _proc/02_levelSetKDEx_multivariate.ipynb
@@ -59,7 +59,7 @@
"text/markdown": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L31){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L33){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_multivariate\n",
"\n",
@@ -83,7 +83,7 @@
"text/plain": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L31){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L33){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_multivariate\n",
"\n",
@@ -135,7 +135,7 @@
"text/markdown": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L262){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L264){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_multivariate_opt\n",
"\n",
@@ -171,7 +171,7 @@
"text/plain": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L262){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L264){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_multivariate_opt\n",
"\n",
@@ -216,6 +216,158 @@
"show_doc(LevelSetKDEx_multivariate_opt)"
]
},
{
"cell_type": "markdown",
"id": "a831ecfe",
"metadata": {},
"source": [
"## Level-Set Approach based on Decision Tree"
]
},
{
"cell_type": "code",
"execution_count": 3,
"has_sd": true,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L492){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_DT\n",
"\n",
"> LevelSetKDEx_DT (estimator, max_depth:int=8, min_samples_leaf:int=100)\n",
"\n",
"*[`LevelSetKDEx`](https://kaiguender.github.io/dddex/levelsetkdex_univariate.html#levelsetkdex) turns any point forecasting model into an estimator of the underlying conditional density.\n",
"The name 'LevelSet' stems from the fact that this approach interprets the values of the point forecasts\n",
"as a similarity measure between samples. \n",
"TBD*\n",
"\n",
"| | **Type** | **Default** | **Details** |\n",
"| -- | -------- | ----------- | ----------- |\n",
"| estimator | | | Model with a .fit and .predict-method (implementing the scikit-learn estimator interface). |\n",
"| max_depth | int | 8 | Maximum depth of the decision tree used to generate the bins. |\n",
"| min_samples_leaf | int | 100 | Minimum number of samples required to be in a bin. |"
],
"text/plain": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L492){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_DT\n",
"\n",
"> LevelSetKDEx_DT (estimator, max_depth:int=8, min_samples_leaf:int=100)\n",
"\n",
"*`LevelSetKDEx` turns any point forecasting model into an estimator of the underlying conditional density.\n",
"The name 'LevelSet' stems from the fact that this approach interprets the values of the point forecasts\n",
"as a similarity measure between samples. \n",
"TBD*\n",
"\n",
"| | **Type** | **Default** | **Details** |\n",
"| -- | -------- | ----------- | ----------- |\n",
"| estimator | | | Model with a .fit and .predict-method (implementing the scikit-learn estimator interface). |\n",
"| max_depth | int | 8 | Maximum depth of the decision tree used to generate the bins. |\n",
"| min_samples_leaf | int | 100 | Minimum number of samples required to be in a bin. |"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#| echo: false\n",
"#| output: asis\n",
"show_doc(LevelSetKDEx_DT)"
]
},
{
"cell_type": "markdown",
"id": "30884783",
"metadata": {},
"source": [
"## LSx Multivariate Gessaman Rule"
]
},
{
"cell_type": "code",
"execution_count": 4,
"has_sd": true,
"metadata": {},
"outputs": [
{
"data": {
"text/markdown": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L612){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_multivariate_bin\n",
"\n",
"> LevelSetKDEx_multivariate_bin (estimator, nBinsPerDim:int=None)\n",
"\n",
"*[`LevelSetKDEx`](https://kaiguender.github.io/dddex/levelsetkdex_univariate.html#levelsetkdex) turns any point forecasting model into an estimator of the underlying conditional density.\n",
"The name 'LevelSet' stems from the fact that this approach interprets the values of the point forecasts\n",
"as a similarity measure between samples. \n",
"In this version of the LSx algorithm, we are applying the so-called Gessaman rule to create statistically\n",
"equivalent blocks of samples. In essence, the algorithm is a multivariate extension of the univariate\n",
"LevelSetKDEx algorithm based on bin-building. \n",
"We are creating equally sized bins of samples based on the point predictions of the samples specified via `X`\n",
"for every coordinate axis. Every bin of one axis is combined with the bins of all other axes resulting in\n",
"a total of nBinsPerDim^dim many bins. \n",
"Example: Let's say we have 100000 samples, the binSize is given as 20 and the number of dimensions\n",
"is 3. As the binSize is given as 20, we want to create 5000 bins altogether. Hence, there have to be\n",
"5000^(1/dim) = 5000^(1/3) ≈ 17 bins per dimension.  \n",
"IMPORTANT NOTE: The getWeights function is not yet finished and has to be completed.*\n",
"\n",
"| | **Type** | **Default** | **Details** |\n",
"| -- | -------- | ----------- | ----------- |\n",
"| estimator | | | Model with a .fit and .predict-method (implementing the scikit-learn estimator interface). |\n",
"| nBinsPerDim | int | None | Number of bins per dimension. |"
],
"text/plain": [
"---\n",
"\n",
"[source](https://github.com/kaiguender/dddex/blob/main/dddex/levelSetKDEx_multivariate.py#L612){target=\"_blank\" style=\"float:right; font-size:smaller\"}\n",
"\n",
"### LevelSetKDEx_multivariate_bin\n",
"\n",
"> LevelSetKDEx_multivariate_bin (estimator, nBinsPerDim:int=None)\n",
"\n",
"*`LevelSetKDEx` turns any point forecasting model into an estimator of the underlying conditional density.\n",
"The name 'LevelSet' stems from the fact that this approach interprets the values of the point forecasts\n",
"as a similarity measure between samples. \n",
"In this version of the LSx algorithm, we are applying the so-called Gessaman rule to create statistically\n",
"equivalent blocks of samples. In essence, the algorithm is a multivariate extension of the univariate\n",
"LevelSetKDEx algorithm based on bin-building. \n",
"We are creating equally sized bins of samples based on the point predictions of the samples specified via `X`\n",
"for every coordinate axis. Every bin of one axis is combined with the bins of all other axes resulting in\n",
"a total of nBinsPerDim^dim many bins. \n",
"Example: Let's say we have 100000 samples, the binSize is given as 20 and the number of dimensions\n",
"is 3. As the binSize is given as 20, we want to create 5000 bins altogether. Hence, there have to be\n",
"5000^(1/dim) = 5000^(1/3) ≈ 17 bins per dimension.  \n",
"IMPORTANT NOTE: The getWeights function is not yet finished and has to be completed.*\n",
"\n",
"| | **Type** | **Default** | **Details** |\n",
"| -- | -------- | ----------- | ----------- |\n",
"| estimator | | | Model with a .fit and .predict-method (implementing the scikit-learn estimator interface). |\n",
"| nBinsPerDim | int | None | Number of bins per dimension. |"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#| echo: false\n",
"#| output: asis\n",
"show_doc(LevelSetKDEx_multivariate_bin)"
]
},
{
"cell_type": "markdown",
"id": "c0d8dc43-8377-41b9-a5a7-aa5904038841",
@@ -295,6 +447,7 @@
{
"cell_type": "code",
"execution_count": null,
"id": "5faa5708",
"metadata": {
"language": "python"
},
Expand Down
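
Note on the `LevelSetKDEx_multivariate_bin` docstring in the notebook diff above: its worked example (100000 samples, binSize 20, 3 dimensions) can be checked with a few lines of Python. This is only a sketch of the arithmetic. The helper name `bins_per_dim` is hypothetical, and rounding to the nearest integer is an assumption; the commit does not show how the library itself rounds.

```python
def bins_per_dim(n_samples: int, bin_size: int, dim: int) -> int:
    """Hypothetical helper: bins per coordinate axis so that the grid has
    roughly n_samples / bin_size cells in total (nBinsPerDim ** dim)."""
    n_bins_total = n_samples / bin_size           # 100000 / 20 = 5000 cells wanted
    # Assumption: round to the nearest integer, never below one bin per axis.
    return max(1, round(n_bins_total ** (1 / dim)))


n = bins_per_dim(n_samples=100_000, bin_size=20, dim=3)
print(n)        # bins per dimension
print(n ** 3)   # total number of cells actually created
```

Because the cube root of 5000 is not an integer, the grid ends up with 17^3 = 4913 cells rather than exactly 5000, so the average cell holds slightly more than 20 samples.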
Binary file modified dddex/__pycache__/__init__.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/_modidx.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/baseClasses.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/crossValidation.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/levelSetKDEx_multivariate.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/levelSetKDEx_univariate.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/loadData.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/utils.cpython-39.pyc
Binary file not shown.
Binary file modified dddex/__pycache__/wSAA.cpython-39.pyc
Binary file not shown.
18 changes: 17 additions & 1 deletion dddex/_modidx.py
@@ -61,7 +61,15 @@
'dddex/crossValidation.py'),
'dddex.crossValidation.groupedTimeSeriesSplit': ( 'crossvalidation.html#groupedtimeseriessplit',
'dddex/crossValidation.py')},
'dddex.levelSetKDEx_multivariate': { 'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate',
'dddex.levelSetKDEx_multivariate': { 'dddex.levelSetKDEx_multivariate.LevelSetKDEx_DT': ( 'levelsetkdex_multivariate.html#levelsetkdex_dt',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_DT.__init__': ( 'levelsetkdex_multivariate.html#levelsetkdex_dt.__init__',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_DT.fit': ( 'levelsetkdex_multivariate.html#levelsetkdex_dt.fit',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_DT.getWeights': ( 'levelsetkdex_multivariate.html#levelsetkdex_dt.getweights',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate.__init__': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate.__init__',
'dddex/levelSetKDEx_multivariate.py'),
@@ -71,6 +79,14 @@
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate.getWeights': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate.getweights',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate_bin': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate_bin',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate_bin.__init__': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate_bin.__init__',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate_bin.fit': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate_bin.fit',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate_bin.getWeights': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate_bin.getweights',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate_opt': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate_opt',
'dddex/levelSetKDEx_multivariate.py'),
'dddex.levelSetKDEx_multivariate.LevelSetKDEx_multivariate_opt.__init__': ( 'levelsetkdex_multivariate.html#levelsetkdex_multivariate_opt.__init__',
Expand Down
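
Note on `LevelSetKDEx_DT`: the commit documents only the signature (`estimator`, `max_depth=8`, `min_samples_leaf=100`) and says that the decision tree generates the bins. The sketch below illustrates that idea using scikit-learn directly, not dddex code. It assumes the tree is fitted on the point forecasts with the observed targets as labels, and that the weights are uniform over the training samples sharing a leaf with the new forecast. The data, `leaf_weights`, and all variable names are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
# Stand-ins for multivariate point forecasts and the observed targets.
y_pred_train = rng.normal(size=(2000, 2))
y_train = y_pred_train + rng.normal(scale=0.5, size=(2000, 2))

# Assumption: the tree partitions the forecast space; each leaf acts as one bin.
tree = DecisionTreeRegressor(max_depth=8, min_samples_leaf=100, random_state=0)
tree.fit(y_pred_train, y_train)

# Leaf index (bin id) of every training sample.
train_leaf = tree.apply(y_pred_train)


def leaf_weights(y_pred_new: np.ndarray) -> np.ndarray:
    """Uniform weights over the training samples in the same leaf as y_pred_new."""
    leaf = tree.apply(y_pred_new.reshape(1, -1))[0]
    in_bin = train_leaf == leaf
    return in_bin / in_bin.sum()


w = leaf_weights(np.array([0.0, 0.0]))
print(w.shape, int((w > 0).sum()))
```

With `min_samples_leaf=100`, every bin holds at least 100 training samples, so each estimated conditional distribution rests on at least that many observations.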

0 comments on commit 765184c
