Merge pull request #246 from ing-bank/develop
Release
sbrugman authored Aug 19, 2022
2 parents 1bab51e + 6aa0249 commit 65125c3
Showing 46 changed files with 623 additions and 212 deletions.
18 changes: 18 additions & 0 deletions .github/workflows/release.yml
Expand Up @@ -21,3 +21,21 @@ jobs:
          github_token: ${{ secrets.GITHUB_TOKEN }}
          repository_username: __token__
          repository_password: ${{ secrets.PYPI_TOKEN }}

  merge-master-back-to-dev:
    runs-on: ubuntu-latest
    needs:
      - release
    steps:
      - uses: actions/checkout@v2
      - name: Set Git config
        run: |
          git config --local user.email "actions@github.com"
          git config --local user.name "Github Actions"
      - name: Merge master back to dev
        run: |
          git fetch --unshallow
          git checkout dev
          git pull
          git merge --no-ff master -m "chore: auto-merge master back to develop"
          git push
6 changes: 3 additions & 3 deletions .pre-commit-config.yaml
Expand Up @@ -10,15 +10,15 @@ repos:
files: '.*'
args: [ --profile=black, --project=popmon ]
- repo: https://github.com/PyCQA/flake8
rev: "4.0.1"
rev: "5.0.4"
hooks:
- id: flake8
additional_dependencies:
- flake8-comprehensions
- tryceratops
args: [ "--ignore=E501,E203,W503,TC003,TC101,TC300"]
- repo: https://github.com/asottile/pyupgrade
rev: v2.34.0
rev: v2.37.3
hooks:
- id: pyupgrade
args: ['--py36-plus','--exit-zero-even-if-changed']
Expand All @@ -34,7 +34,7 @@ repos:
language: system
pass_filenames: false
- repo: https://github.com/nbQA-dev/nbQA
rev: 1.3.1
rev: 1.4.0
hooks:
- id: nbqa-black
- id: nbqa-pyupgrade
27 changes: 27 additions & 0 deletions CITATION.cff
@@ -0,0 +1,27 @@
type: proceedings
title: "popmon: Analysis Package for Dataset Shift Detection"
authors:
- family-names: Brugman
given-names: Simon
- family-names: Sostak
given-names: Tomas
- family-names: Pradyot
given-names: Patil
- family-names: Baak
given-names: Max
year: "2022"
collection-title: "Proceedings of the 21st Python in Science Conference"
editors:
- family-names: Agarwal
given-names: Meghann
- family-names: Calloway
given-names: Chris
- family-names: Niederhut
given-names: Dillon
- family-names: Shupe
given-names: David
start: '161'
end: '168'
conference:
name: "Python in Science Conference"
address: "Austin, Texas"
3 changes: 2 additions & 1 deletion MANIFEST.in
@@ -1,3 +1,4 @@
include requirements.txt
include LICENSE
include NOTICE
include NOTICE
include extras.json
41 changes: 40 additions & 1 deletion README.rst
Expand Up @@ -10,7 +10,7 @@ Population Shift Monitoring
`popmon` works with both **pandas** and **spark datasets**.

`popmon` creates histograms of features binned in time-slices,
and compares the stability of the `profiles <https://popmon.readthedocs.io/en/latest/profiles.html>`_ and distributions of
and compares the stability of the profiles_ and distributions of
those histograms using `statistical tests <https://popmon.readthedocs.io/en/latest/comparisons.html>`_, both over time and with respect to a reference.
It works with numerical, ordinal, categorical features, and the histograms can be higher-dimensional, e.g. it can also track correlations between any two features.
`popmon` can **automatically flag** and alert on **changes observed over time**, such
Expand Down Expand Up @@ -199,7 +199,21 @@ Contributions of additional or improved integrations are welcome!
:height: 120
:target: https://github.com/elastic/kibana

Comparison and profile extensions
---------------------------------

External libraries or custom functionality can be easily added to Profiles_ and Comparisons_.
If you have developed an extension that could be of general use, please consider contributing it to the package.

Popmon currently integrates:

* `Diptest <https://github.com/RUrlus/diptest>`_

A Python/C++ implementation of Hartigan & Hartigan's dip test for unimodality.
The dip test tests for multimodality in a sample by taking the maximum difference, over all sample points, between the empirical distribution function and the unimodal distribution function that minimizes that maximum difference.
Other than unimodality, it makes no further assumptions about the form of the null distribution.

To enable this extension, install diptest using ``pip install diptest`` or ``pip install popmon[diptest]``.
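
The "maximum difference" in the description above can be illustrated with a small standard-library sketch. It measures the largest gap between the empirical distribution function of a sample and one fixed reference distribution (here the uniform distribution on [0, 1], which is unimodal). This is only the building block: the actual dip statistic additionally minimizes this gap over all unimodal distributions, which the sketch does not attempt. The names ``max_ecdf_distance`` and ``uniform_cdf`` are hypothetical and are not part of ``diptest`` or ``popmon``.

```python
def max_ecdf_distance(sample, cdf):
    """Largest gap between the empirical CDF of `sample` and a reference `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    if n == 0:
        raise ValueError("sample must not be empty")
    distance = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1) / n to i / n at x,
        # so the gap is checked on both sides of the jump.
        distance = max(distance, i / n - f, f - (i - 1) / n)
    return distance


def uniform_cdf(x):
    """CDF of the uniform distribution on [0, 1] (the fixed reference here)."""
    return min(1.0, max(0.0, x))


# All points sit in the lower half of [0, 1], so the gap to the uniform CDF is large.
d = max_ecdf_distance([0.1, 0.2, 0.3, 0.4], uniform_cdf)
print(round(d, 3))
```

For real use, the ``diptest`` package linked above computes the actual statistic.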

Resources
=========
Expand Down Expand Up @@ -252,6 +266,28 @@ Project contributors
This package was authored by ING Wholesale Banking Advanced Analytics.
Special thanks to the following people who have contributed to the development of this package: `Ahmet Erdem <https://github.com/aerdem4>`_, `Fabian Jansen <https://github.com/faab5>`_, `Nanne Aben <https://github.com/nanne-aben>`_, Mathieu Grimal.


Citing popmon
=============
If ``popmon`` has been relevant in your work, and you would like to acknowledge the project in your publication, we suggest citing the following paper:

* Brugman, S., Sostak, T., Patil, P., Baak, M. *popmon: Analysis Package for Dataset Shift Detection*. Proceedings of the 21st Python in Science Conference. 161-168 (2022). (`link <https://conference.scipy.org/proceedings/scipy2022/popmon.html>`_)

*In BibTeX format:*

.. code-block:: bibtex

    @InProceedings{ popmon-proc-scipy-2022,
      author    = { {S}imon {B}rugman and {T}omas {S}ostak and {P}radyot {P}atil and {M}ax {B}aak },
      title     = { popmon: {A}nalysis {P}ackage for {D}ataset {S}hift {D}etection },
      booktitle = { {P}roceedings of the 21st {P}ython in {S}cience {C}onference },
      pages     = { 161 - 168 },
      year      = { 2022 },
      editor    = { {M}eghann {A}garwal and {C}hris {C}alloway and {D}illon {N}iederhut and {D}avid {S}hupe },
    }

Contact and support
===================

Expand Down Expand Up @@ -298,3 +334,6 @@ Copyright ING WBAA. `popmon` is completely free, open-source and licensed under
.. |downloads| image:: https://pepy.tech/badge/popmon
:alt: PyPi downloads
:target: https://pepy.tech/project/popmon

.. _profiles: https://popmon.readthedocs.io/en/latest/profiles.html
.. _comparisons: https://popmon.readthedocs.io/en/latest/comparisons.html
5 changes: 3 additions & 2 deletions docs/autogenerate.sh
Expand Up @@ -5,8 +5,9 @@ rm -rf autogen
mkdir -p source/_static autogen

# auto-generate code documentation
sphinx-apidoc -f -H POPMON -o autogen ../popmon
mv autogen/modules.rst autogen/popmon_index.rst
export SPHINX_APIDOC_OPTIONS="members,show-inheritance,ignore-module-all"
sphinx-apidoc -f -M -H "API Documentation" -o autogen ../popmon
mv autogen/modules.rst autogen/code.rst
mv autogen/* source/

# remove auto-gen directory
1 change: 1 addition & 0 deletions docs/requirements.txt
@@ -1,2 +1,3 @@
sphinx_rtd_theme
myst_parser
sphinx_autodoc_typehints
4 changes: 2 additions & 2 deletions docs/source/code.rst
Expand Up @@ -2,6 +2,6 @@ API Documentation
=================

.. toctree::
:maxdepth: 2
:maxdepth: 4

popmon_index
popmon
6 changes: 6 additions & 0 deletions docs/source/comparisons.rst
Expand Up @@ -35,6 +35,12 @@ The comparisons registry can be consulted for available comparisons:
    print(Comparisons.get_keys())

Comparison extensions
---------------------

There are currently no comparison extensions available, and contributions are welcome.
Have a look at the :doc:`Profiles <profiles>` page for the available profile extensions.

Custom comparisons
------------------

6 changes: 5 additions & 1 deletion docs/source/conf.py
Expand Up @@ -38,6 +38,7 @@
"sphinx.ext.autodoc",
"sphinx.ext.mathjax",
"sphinx.ext.ifconfig",
"sphinx_autodoc_typehints",
]

# Add any paths that contain templates here, relative to this directory.
Expand Down Expand Up @@ -68,7 +69,7 @@

# List of patterns, relative to source directory, that match files and
# directories to ignore when looking for source files.
exclude_patterns = ["*test*", "popmon.tutorials.*", "popmon.decorators.*"]
exclude_patterns = ["*test*", "popmon.decorators.*"]

# The name of the Pygments (syntax highlighting) style to use.
pygments_style = "sphinx"
Expand All @@ -86,6 +87,9 @@

html_theme = "sphinx_rtd_theme"
html_theme_path = [sphinx_rtd_theme.get_html_theme_path()]
html_theme_options = {
"navigation_depth": 8,
}
# otherwise, readthedocs.org uses their theme by default, so no need to specify it

# Add any paths that contain custom static files (such as style sheets) here,
71 changes: 67 additions & 4 deletions docs/source/configuration.rst
Expand Up @@ -5,6 +5,72 @@ Report settings
Some more details on stability report settings, in particular how to set:
the reference dataset, binning specifications, monitoring rules, and where to plot boundaries.

Using ``Settings`` for configuration
------------------------------------

As of ``popmon`` v1.0.0, most options are specified on the ``Settings`` object, which is passed to the package.
Instantiating an object with the default settings and passing it to ``popmon`` is as simple as:

.. code-block:: python

    from popmon import Settings

    settings = Settings()
    df.pm_stability_report(settings=settings)

In the next example, we change the ``reference_type`` to ``"rolling"``:

.. code-block:: python

    from popmon import Settings

    settings = Settings()
    settings.reference_type = "rolling"
    df.pm_stability_report(settings=settings)

``reference_type`` is one of the options that is defined on the top level of ``Settings``.
Other parameters are logically grouped, such as the options related to the HTML report.
Changing grouped items works similarly:

.. code-block:: python

    from popmon import Settings

    settings = Settings()
    settings.report.title = "Report showing fewer stats"
    settings.report.extended_report = False
    settings.report.show_stats = ["distinct*", "filled*", "nan*"]
    df.pm_stability_report(settings=settings)

A full overview of settings is available in the :doc:`api documentation <popmon>` (or one could view the `config.py <https://github.com/ing-bank/popmon/blob/master/popmon/config.py>`_).
The settings management is created on top of `pydantic <https://github.com/samuelcolvin/pydantic>`_.
For detailed instructions on how the settings object can be used, for instance exporting, we refer to `their documentation <https://pydantic-docs.helpmanual.io/>`_.

The settings are validated on assignment, and when the validation fails a ``ValidationError`` will be raised.
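
To illustrate what "validated on assignment" means, here is a minimal standard-library sketch of the idea. It is not how ``popmon`` or pydantic implement it: ``MiniSettings`` is a hypothetical stand-in, the set of allowed values is an illustrative assumption, and a plain ``ValueError`` is raised instead of pydantic's ``ValidationError``.

```python
class MiniSettings:
    """Hypothetical stand-in for a settings object that validates on assignment."""

    # Assumption: an illustrative subset of values, not popmon's actual list.
    _allowed_reference_types = {"self", "rolling"}

    def __init__(self, reference_type="self"):
        # This assignment also goes through __setattr__, so defaults are checked too.
        self.reference_type = reference_type

    def __setattr__(self, name, value):
        # Reject invalid values at the moment they are assigned.
        if name == "reference_type" and value not in self._allowed_reference_types:
            raise ValueError(f"invalid reference_type: {value!r}")
        super().__setattr__(name, value)


settings = MiniSettings()
settings.reference_type = "rolling"  # accepted

try:
    settings.reference_type = "not-a-type"  # rejected at assignment time
except ValueError as err:
    rejected = str(err)

print(settings.reference_type, "|", rejected)
```

The benefit of this design is that a mistyped option fails where it is set, rather than later while the report is being generated.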

In some examples you may encounter an alternative syntax that has the same effect.
For completeness, we list them below:

.. code-block:: python

    from popmon import Settings

    # consider providing settings in the following way
    settings = Settings()
    settings.time_axis = "date"
    df.pm_stability_report(settings=settings)

    # This is identical to passing the parameters directly to the settings object
    settings = Settings(time_axis="date")
    df.pm_stability_report(settings=settings)

    # When not passing the `settings` argument, keyword arguments will be passed on to a newly instantiated
    # Settings object. This allows us to even do:
    df.pm_stability_report(time_axis="date")

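
The equivalence of these three forms follows from a common keyword-forwarding pattern, sketched below with the standard library only. ``MiniSettings`` and ``stability_report`` are hypothetical names for illustration and do not reproduce ``popmon``'s implementation. In particular, ignoring extra keywords when ``settings`` is given is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class MiniSettings:
    """Hypothetical settings container; not popmon's Settings class."""

    time_axis: str = ""
    reference_type: str = "self"


def stability_report(data, settings: Optional[MiniSettings] = None, **kwargs):
    """Hypothetical entry point that only resolves which settings to use."""
    if settings is None:
        # No settings object given: build one from the keyword arguments.
        settings = MiniSettings(**kwargs)
    return settings


# Attribute assignment, constructor arguments, and plain keywords agree.
by_attribute = MiniSettings()
by_attribute.time_axis = "date"

explicit = stability_report([], settings=MiniSettings(time_axis="date"))
implicit = stability_report([], time_axis="date")

print(by_attribute == explicit == implicit)
```
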
Binning specifications
----------------------
Expand Down Expand Up @@ -248,7 +314,7 @@ Global configuration

A number of settings are configured globally.
These can be found in the ``popmon.config`` module.
At the moment of writing these primarily cover parallel processing and descriptions of plots.
At the moment of writing this covers parallel processing.

The following snippet modifies the number of jobs and the backend used by ``joblib.Parallel``:

Expand All @@ -261,8 +327,5 @@ The following snippet modifies the number of jobs and the backend used by ``jobl
popmon.config.parallel_args["n_jobs"] = 4
popmon.config.parallel_args["backend"] = "threading"
# Disable `ing_matplotlib_theme`
popmon.config.themed = False
# Create report as usual
report = df.pm_stability_report(reference_type="self")
File renamed without changes.
17 changes: 7 additions & 10 deletions docs/source/popmon.alerting.rst
@@ -1,6 +1,11 @@
popmon.alerting package
=======================

.. automodule:: popmon.alerting
:members:
:show-inheritance:
:ignore-module-all:

Submodules
----------

Expand All @@ -9,21 +14,13 @@ popmon.alerting.alerts\_summary module

.. automodule:: popmon.alerting.alerts_summary
:members:
:undoc-members:
:show-inheritance:
:ignore-module-all:

popmon.alerting.compute\_tl\_bounds module
------------------------------------------

.. automodule:: popmon.alerting.compute_tl_bounds
:members:
:undoc-members:
:show-inheritance:

Module contents
---------------

.. automodule:: popmon.alerting
:members:
:undoc-members:
:show-inheritance:
:ignore-module-all:
17 changes: 7 additions & 10 deletions docs/source/popmon.analysis.comparison.rst
@@ -1,6 +1,11 @@
popmon.analysis.comparison package
==================================

.. automodule:: popmon.analysis.comparison
:members:
:show-inheritance:
:ignore-module-all:

Submodules
----------

Expand All @@ -9,21 +14,13 @@ popmon.analysis.comparison.comparisons module

.. automodule:: popmon.analysis.comparison.comparisons
:members:
:undoc-members:
:show-inheritance:
:ignore-module-all:

popmon.analysis.comparison.hist\_comparer module
------------------------------------------------

.. automodule:: popmon.analysis.comparison.hist_comparer
:members:
:undoc-members:
:show-inheritance:

Module contents
---------------

.. automodule:: popmon.analysis.comparison
:members:
:undoc-members:
:show-inheritance:
:ignore-module-all: