diff --git a/.bumpversion.cfg b/.bumpversion.cfg index e280f82..a15dc63 100644 --- a/.bumpversion.cfg +++ b/.bumpversion.cfg @@ -1,5 +1,5 @@ [bumpversion] -current_version = 0.9.2 +current_version = 0.9.3 tag = False [bumpversion:file:soam/__init__.py] diff --git a/CHANGELOG.md b/CHANGELOG.md index 669569d..f155a9d 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,11 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [0.9.3 - 2021-08-13] + +### Update +- Revamp docs. + ## [0.9.2 - 2021-08-13] ### Fixed diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 6ff91bd..5286623 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,5 +1,5 @@ -# Contributing to soam -Thanks for your interest in contributing to `soam` 🎉. These are the guidelines for contributions. Reading them will help you get started on how to make useful contributions. +# Contributing to SoaM +Thanks for your interest in contributing to SoaM 🎉. These are the guidelines for contributions. Reading them will help you get started on how to make useful contributions. ## Foreword This guide is not final. It will evolve over time, as we learn and add new voices to the project. Check it from time to time and feel free to make suggestions 😃 @@ -30,7 +30,7 @@ This guide is not final. It will evolve over time, as we learn and add new voice ## Code of Conduct One of our core values at Mutt is that **we are an open team**. We all make mistakes and need help fixing them. We foster psychological safety. We clearly express it when we don’t know something and ask for advice. -We expect everyone contributing to `soam` to follow this principle. Be kind, don't be rude, keep it friendly; learn, teach, ask and help. +We expect everyone contributing to SoaM to follow this principle. 
Be kind, don't be rude, keep it friendly; learn, teach, ask and help. ## Issues @@ -51,7 +51,30 @@ If you find a security related bug or any kind of security rellated issue, **ple ## Development Setup ### Installation -To set up your environment and start developing check this [guide](https://gitlab.com/mutt_data/soam/-/blob/master/documentation/source/developers_starting_point.md). + +The project runs with Python>=3.6 + +To install the dependencies in [editable mode](https://pip.pypa.io/en/stable/reference/pip_install/#install-editable) +in the root of the project run: + +```bash +pip install -e ".[dev]" +pip install -e ".[test]" +``` + +[//comment]: # (TODO: 'python setup.py develop' is not working, should be the same as 'pip install -e .') +[//comment]: # (TODO: 'python setup.py develop' is failing to obtain muttlib.) + +This will install the package in +[development mode](https://setuptools.readthedocs.io/en/latest/setuptools.html#develop-deploy-the-project-source-in-development-mode). + +Next steps: +* If you already have the project running the last step before making your first commit is to review the +[development pipeline](development_pipeline.md). +* If you want more information about the main classes or patterns in the project go to [classes document](classes.md). +* If you need to understand a library, technology or concept in the project you can check the +[references](references.md). + ### Pre-Commit for Version Control Integration We use [pre-commit](https://pre-commit.com) to run several code scans and hooks like linters and formatters, defined in `.pre-commit-config.yaml`, on each staged file that make the development cycle easier. @@ -63,7 +86,7 @@ pre-commit install -t push ``` ## Style guide -`soam` follows [PEP8](https://www.python.org/dev/peps/pep-0008/). +SoaM follows [PEP8](https://www.python.org/dev/peps/pep-0008/). 
If you installed the [pre-commit hooks](#pre-commit) you shouldn't worry too much about style, since they will fix it for you or warn you about styling errors. We use the following hooks: @@ -82,7 +105,7 @@ We use either [numpy style](https://numpydoc.readthedocs.io/en/latest/format.htm - Method/functions to explain what it does and what it's parameters are ## Testing -`soam` uses the [pytest framework](https://docs.pytest.org/en/latest/) to test `soam`. +SoaM uses the [pytest framework](https://docs.pytest.org/en/latest/) to test SoaM. To run the default test suite run this: ```bash @@ -109,20 +132,34 @@ nox --session tests [Regression testing](https://en.wikipedia.org/wiki/Regression_testing) to ensure new changes have not broken previously working features. ## Documentation -`soam` uses [Sphinx](https://www.sphinx-doc.org/en/master/) to autogenerate it's [docs](https://mutt_data.gitlab.io/soam/) that are automatically built from [docstrings](#docstrings) and pushed by the [CI jobs](#cicd-jobs). Check the [style guide](#style-guide) section for notes on docstrings. Pushing all the docs is too cumbersome. You can generate them locally by doing: +SoaM uses [Sphinx](https://www.sphinx-doc.org/en/master/) to autogenerate it's [docs](https://mutt_data.gitlab.io/soam/) that are automatically built from [docstrings](#docstrings) and pushed by the [CI jobs](#cicd-jobs). Check the [style guide](#style-guide) section for notes on docstrings. Pushing all the docs is too cumbersome. You can generate them locally by doing: ```bash pip install .[all] -cd docs -make html +cd documentation +rm -r build +sphinx-apidoc -f -o source ../soam # To create the modules documentation +make html # To bundle the documentation ``` And open `docs/build/html/index.html` on your browser of choice. +Note that for simple tests that don't depend on external libraries you can install only the Sphinx deps. 
+ +This documentation is created during CI using [GitLab Pages](https://docs.gitlab.com/ee/user/project/pages/). + Alternatively you can see the docs for the `master` branch [here.](https://mutt_data.gitlab.io/soam/index.html) +We are using the following extensions: + - [napoleon](https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html) to + create the rst files from the code documentation. + - [autodoc](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html) to + include the code documentation that napoleon generates. + - [m2r2](https://github.com/crossnox/m2r2) to easily include markdown files in the + documentation. + ## Versioning -`soam` uses [SemVer](https://semver.org). To keep things easy, we've included [bump2version](https://github.com/c4urself/bump2version/) as a dev dependency. You can use `bump2version minor` to increase the minor number. +SoaM uses [SemVer](https://semver.org). To keep things easy, we've included [bump2version](https://github.com/c4urself/bump2version/) as a dev dependency. You can use `bump2version minor` to increase the minor number. Please remember to bump the version when submitting your PR! @@ -132,7 +169,7 @@ Before fully deprecating a feature or making a breaking change, give users a `De ### Decorator -`soam` uses [deprecated](https://github.com/tantale/deprecated) decorators to implement `DeprecationWarning`. +SoaM uses [deprecated](https://github.com/tantale/deprecated) decorators to implement `DeprecationWarning`. Add a `DeprecationWarning` considering indicate: - How to achieve similar behavior if an alternative is available or a reason for the deprecation if no clear alternative is available. @@ -183,7 +220,7 @@ Deprecation warning must be added in minor releases and EOL will be on the next ## PRs Also called MRs (Merge Requests) in gitlab. 
-`soam` development follows a simple workflow: +SoaM development follows a simple workflow: - Assign yourself an issue - If there's none, [create it](#issues) - If you can't assign it yourself, ask someone to do it for you @@ -213,7 +250,7 @@ RFC stands for **R**equest **f**or **C**omments. It means you consider the issue ### CI/CD jobs -All commits pushed to branches in pull requests will trigger CI jobs that install `soam` in a gitlab-provided docker-env and all the extras, run all tests and check for linting. Look at [.gitlab-ci.yml](.gitlab-ci.yml) for more details on this and as well as the official [docs](https://docs.gitlab.com/ce/ci/README.html). Note that only PRs that pass the CI will be allowed to merge. +All commits pushed to branches in pull requests will trigger CI jobs that install SoaM in a gitlab-provided docker-env and all the extras, run all tests and check for linting. Look at [.gitlab-ci.yml](.gitlab-ci.yml) for more details on this and as well as the official [docs](https://docs.gitlab.com/ce/ci/README.html). Note that only PRs that pass the CI will be allowed to merge. `NOTE:` If your commit message contains [ci skip] or [skip ci], without capitalization, the job will be skipped i.e. no CI job will be spawned for that push. diff --git a/README.md b/README.md index edf53d8..d8801bf 100644 --- a/README.md +++ b/README.md @@ -2,7 +2,7 @@ [![pipeline status](https://gitlab.com/mutt_data/soam/badges/master/pipeline.svg)](https://gitlab.com/mutt_data/soam/-/commits/master) [![coverage report](https://gitlab.com/mutt_data/soam/badges/master/coverage.svg)](https://gitlab.com/mutt_data/soam/-/commits/master) [![pypi version](https://img.shields.io/pypi/v/soam?color=blue)](https://pypi.org/project/soam/) -SoaM is library created by [Mutt](https://muttdata.ai/). +SoaM is a [Prefect](https://docs.prefect.io/) based library created by [Mutt](https://muttdata.ai/). 
Its goal is to create a forecasting framework, this tool is developed with conjunctions of experience on previous projects. There come the name: Son of a Mutt = SoaM @@ -30,7 +30,7 @@ The process is structured in different stages: * Postprocessing: modifies the results based on business/real information or create analysis with the predicted values, such as an anomaly detection. -## Overview of the Steps Run in SoaM (planned) +## Overview of the Steps Run in SoaM ### Extraction This stage extracts data from the needed sources to build the condensed dataset for the next steps. This tends to be @@ -54,6 +54,14 @@ A variety of models are currently supported to fit and predict data. They can be * [Prophet](https://pypi.org/project/fbprophet) * [SARIMAX](https://www.statsmodels.org/dev/generated/statsmodels.tsa.statespace.sarimax.SARIMAX.html) +### Backtesting +#### Window policies +To do backtesting the data is splited in train and validation, there are two spliting methods: +- Sliding: create a fixed size window for the training data that ends at the beginning of the validation data. +- Expanding: create the training data from remaining data since the start of the series until the validation data. + +For more information review this document: [backtesting at scale](https://eng.uber.com/backtesting-at-scale/) + ### Postprocessing [//comment]: # (TODO: explain postprocessing stage chaining) This last stage is prepared to work on the forecasts generated by the pipeline. For example: @@ -199,6 +207,38 @@ pytest --mpl ## Contributing We appreciate for considering to help out maintaining this project. If you'd like to contribute please read our [contributing guidelines](https://mutt_data.gitlab.io/soam/CONTRIBUTING.html). 
+## CI + +To run the CI jobs locally you have to run it with [nox](https://nox.thea.codes/en/stable/): +In the project root directory, there is a noxfile.py file defining all the jobs, these jobs will be executed when calling from CI or you can call them locally. + +You can run all the jobs with the command `nox`, from the project root directory or run just one job with `nox --session test` command, for example. + +[//comment]: # (TODO: Link or explain how to run test and check locally) +[//comment]: # (TODO: Review the following CI explanation) + +The .gitlab-ci.yml file configures the gitlab CI to run nox. +Nox let us execute some test and checks before making the commit. +We are using: +* Linting job: + * [isort](https://pycqa.github.io/isort/) to reorder imports + * [pylint](https://github.com/PyCQA/pylint) to be pep8 compliant + * [black](https://github.com/psf/black) to format for code conventions + * [mypy](http://mypy-lang.org/) for static type checking +* [bandit](https://bandit.readthedocs.io/en/latest/) for security checks +* [pytest](https://docs.pytest.org/) to run all the tests in the test folder. +* [pyreverse](https://pythonhosted.org/theape/documentation/developer/explorations/explore_graphs/explore_pyreverse.html) to create diagrams of the project + +This runs on a gitlab machine after every commit. + +We are caching the environments for each job on each branch. +On every first commit of a branch, you will have to change the policy also if you add dependencies or a new package to the project. +Gitlab cache policy: +* `pull`: pull the cached files from the cloud. +* `push`: push the created files to the cloud. +* `pull-push`: pull the cached files and push the newly created files. 
+ + ## Rules of Thumb This section contains some recommendations when working with SoaM to avoid common mistakes: diff --git a/documentation/source/classes.md b/documentation/source/classes.md index d1bdfd3..7e21fff 100644 --- a/documentation/source/classes.md +++ b/documentation/source/classes.md @@ -17,8 +17,3 @@ Prefect Task and Flow states when they are updated. ##### Forecasting class diagram ![forecaster](../images/Forecaster_class_diagram.png) -https://gitlab.com/mutt_data/onboarding/-/blob/master/docs/modern_python_apps.md#documentation - -[//comment]: # (TODO: create some flow and class diagrams, some expected or possible architecture implementations.) - - diff --git a/documentation/source/developers_starting_point.md b/documentation/source/developers_starting_point.md deleted file mode 100644 index c6b549d..0000000 --- a/documentation/source/developers_starting_point.md +++ /dev/null @@ -1,39 +0,0 @@ -# Setting up the environment -The project runs with Python>=3.6 - -To install the dependencies in [editable mode](https://pip.pypa.io/en/stable/reference/pip_install/#install-editable) -in the root of the project run: - -```bash -pip install -e ".[dev]" -pip install -e ".[test]" -``` - -[//comment]: # (TODO: 'python setup.py develop' is not working, should be the same as 'pip install -e .') -[//comment]: # (TODO: 'python setup.py develop' is failing to obtain muttlib.) - -This will install the package in -[development mode](https://setuptools.readthedocs.io/en/latest/setuptools.html#develop-deploy-the-project-source-in-development-mode). - -[//comment]: # (TODO: We could use some dependency manager) -[//comment]: # (https://packaging.python.org/tutorials/managing-dependencies/#other-tools-for-application-dependency-management) - -In a temporary test folder using the same interpreter make a smoke test running: -```bash -soam init --output -``` -That should create the scaffold for a new project. 
- -Check that you have an up to date muttlib version or upgrade it: -```bash -pip install --upgrade git+https://gitlab.com/mutt_data/muttlib#egg=muttlib -``` -[//comment]: # (TODO: Include what version we are using.) - -# Next Steps -* If you already have the project running the last step before making your first commit is to review the -[development pipeline](./development_pipeline.html). -* If you want more information about the main classes or patterns in the project go to [classes document](./classes.html). -* If you need to understand a library, technology or concept in the project you can check the -[references](./references.html). - diff --git a/documentation/source/development_pipeline.md b/documentation/source/development_pipeline.md deleted file mode 100644 index 5ab196e..0000000 --- a/documentation/source/development_pipeline.md +++ /dev/null @@ -1,64 +0,0 @@ -# Sprints and tasks -[//comment]: # (TODO: Explain how we are using gitlab boards) -[//comment]: # (TODO: Link or explain the worklow to solver issues) - -# Debugging -[//comment]: # (TODO: Explain the setup to debug the project) - - -# Documentation -Code is documented following [numpydoc docstring]( -https://numpydoc.readthedocs.io/en/latest/format.html) guidelines. - -## Sphinx -We are using [Sphinx](https://www.sphinx-doc.org/en/master/) to bundle the project -documentation. - -We are using the following extensions: - - [napoleon](https://www.sphinx-doc.org/en/master/usage/extensions/napoleon.html) to - create the rst files from the code documentation. - - [autodoc](https://www.sphinx-doc.org/en/master/usage/extensions/autodoc.html) to - include the code documentation that napoleon generates. - - [m2r2](https://github.com/crossnox/m2r2) to easily include markdown files in the - documentation. - -To create the documentation locally. 
From the documentation dir: - -```bash -sphinx-apidoc -f -o source ../soam # To create the modules documentation -make html # To bundle the documentation -``` - -This documentation is created during CI using [GitLab Pages]( -https://docs.gitlab.com/ee/user/project/pages/). - -# CI - -To run the CI jobs locally you have to run it with [nox](https://nox.thea.codes/en/stable/): -In the project root directory, there is a noxfile.py file defining all the jobs, these jobs will be executed when calling from CI or you can call them locally. - -You can run all the jobs with the command `nox`, from the project root directory or run just one job with `nox --session test` command, for example. - -[//comment]: # (TODO: Link or explain how to run test and check locally) -[//comment]: # (TODO: Review the following CI explanation) - -The .gitlab-ci.yml file configures the gitlab CI to run nox. -Nox let us execute some test and checks before making the commit. -We are using: -* Linting job: - * isort to reorder imports - * pylint to be pep8 compliant - * black to format for code conventions - * mypy for static type checking -* Bandit for security checks -* Pytests to run all the tests in the test folder. -* Pyreverse to create diagrams of the project - -This runs on a gitlab machine after every commit. - -We are caching the environments for each job on each branch. -On every first commit of a branch, you will have to change the policy also if you add dependencies or a new package to the project. -Gitlab cache policy: -* `pull`: pull the cached files from the cloud. -* `push`: push the created files to the cloud. -* `pull-push`: pull the cached files and push the newly created files. diff --git a/documentation/source/index.rst b/documentation/source/index.rst index ca3c73b..4d5495a 100644 --- a/documentation/source/index.rst +++ b/documentation/source/index.rst @@ -1,4 +1,4 @@ -Welcome to soam's documentation! +Welcome to SoaM's documentation! 
================================ Contents @@ -10,13 +10,10 @@ Contents README CONTRIBUTING end2end - developers_starting_point - development_pipeline + mlflow_tracking project_structure classes - references modules - mlflow_tracking diff --git a/documentation/source/references.md b/documentation/source/references.md deleted file mode 100644 index 34e570f..0000000 --- a/documentation/source/references.md +++ /dev/null @@ -1,170 +0,0 @@ -# Libraries by relevance -This is the priority order to understand the dependencies: - - 1. prefect - 2. alembic - 3. sqlalchemy - 4. jinja - - - -# Slack -In our slack there is a channel #dev-soam, were we discuss project design -details, libraries and issues. There are also some other project related -documents pinned for you to read on. - - -# Theorical background references - -## Backtesting -### Window policies -To do backtesting the data is splited in train and validation, there are two spliting -methods: - - sliding: create a fixed size window for the training data that ends at the beginning - of the validation data. - - expanding: create the training data from remaining data since the start of the series - until the validation data. - -For more information review this document: [backtesting at scale](https://eng.uber.com/backtesting-at-scale/) - -# Core libraries references -This section contains references to the different libraries and their -associated project files. - -## numpydoc -All the documents use numpydoc to document. - -[numpydoc docs](https://numpydoc.readthedocs.io/en/latest/format.html) - -## sphinx -The numpydoc documentation is collected by sphinx. - -[sphinx docs](https://www.sphinx-doc.org/en/master/) - -## logging -The whole project uses the logging module from the standard library. - -[logging docs](https://docs.python.org/3/library/logging.html) - -## typing -The whole project uses type hinting. 
- -[typing docs](https://docs.python.org/3/library/typing.html) - - -## muttlib -Used in: cfg.py, helpers.py, savers.py - -[muttlib gitlab](https://gitlab.com/mutt_data/muttlib) - -## pathlib -Used in: cfg.py, forecast_plotter.py, helpers.py, mail_report.py, savers.py, -slack_report.py, utils.py - -[pathlib docs](https://docs.python.org/3/library/pathlib.html)
-[pathlib operators](https://docs.python.org/3/library/pathlib.html#operators) - - -## pkg_resources -Used in: cfg.py, console.py - -[pkg_resources docs](https://setuptools.readthedocs.io/en/latest/pkg_resources.html) - - -## decouple -Used in: cfg.py - -[decouple github](https://github.com/henriquebastos/python-decouple/) - -## click -Used in: console.py - -[click docs](https://click.palletsprojects.com/en/7.x/) - - -## datetime -Used in: constants.py, helpers.py, plot_utils.py, runner.py - -[datetime docs](https://docs.python.org/3/library/datetime.html) - - -## sqlalchemy -Used in: data_models.py, helpers.py - -[sqlalchemy docs](https://docs.sqlalchemy.org/en/13/) - - -## pandas -Used in: forecast_plotter.py, helpers.py, plot_utils.py, savers.py, -slack_report.py, step.py, utils.py - -[pandas docs](https://pandas.pydata.org/pandas-docs/stable/) - - -## contextlib -Used in: helpers.py - -[contextlib docs](https://docs.python.org/3/library/contextlib.html) - - -## Abstract Base Classes -Used in: helpers.py, savers.py, step.py - -[abc docs](https://docs.python.org/3/library/abc.html) - -## scikit-learn -Used in: helpers.py, step.py - -[scikit-learn glossary](https://scikit-learn.org/stable/glossary.html) - -## smtplib -Used in: mail_report.py - -[smtplib docs](https://docs.python.org/3/library/smtplib.html) - -## email std lib -Used in: mail_report.py - -[email std lib docs](https://docs.python.org/3/library/email.html) - - -## matplotlib -Used in: plot_utils.py - -[matplotlib contents](https://matplotlib.org/3.3.1/contents.html) - -## numpy -Used in: plot_utils.py - -[numpy docs](https://numpy.org/doc/) - -## prefect -Used in: runner.py, savers.py, step.py - -[prefect docs](https://docs.prefect.io/core/development/documentation.html) - -## filelock -Used in: savers.py - -[fileloc github](https://github.com/benediktschmitt/py-filelock) - -## slackapi -Used in: slack_report.py - -[slackapi github](https://github.com/slackapi/python-slackclient) - -## copy -Used in: 
utils.py - -[copy docs](https://docs.python.org/3/library/copy.html) - -# CI libraries references - -- [nox](https://nox.thea.codes/en/stable/) -- [mypy](http://mypy-lang.org/) -- [pylint](https://github.com/PyCQA/pylint) -- [isort](https://pycqa.github.io/isort/) -- [black](https://github.com/psf/black) -- [bandit](https://bandit.readthedocs.io/en/latest/) -- [pyreverse](https://pythonhosted.org/theape/documentation/developer/explorations/explore_graphs/explore_pyreverse.html) \ No newline at end of file diff --git a/soam/__init__.py b/soam/__init__.py index f1f3dca..f7baad6 100644 --- a/soam/__init__.py +++ b/soam/__init__.py @@ -1,3 +1,3 @@ """Version.""" -__version__ = '0.9.2' +__version__ = '0.9.3'