diff --git a/.gitlab-ci.yml b/.gitlab-ci.yml
index 09dfbac..e7aa6ae 100644
--- a/.gitlab-ci.yml
+++ b/.gitlab-ci.yml
@@ -99,6 +99,24 @@ pages:
     refs:
       - master
 
+# Deploy from master to the package registry
+# If PYPI_USERNAME/PYPI_PASSWORD are not set, this defaults to the GitLab
+# package registry, authenticating with the CI job token
+# The package is uploaded to PYPI_URL, which can be overridden
+# to publish somewhere other than GitLab's PyPI registry
+deploy:
+  stage: deploy_stage
+  script:
+    - pip install twine
+    - python setup.py sdist bdist_wheel
+    - export TWINE_USERNAME=${PYPI_USERNAME:=gitlab-ci-token}
+    - export TWINE_PASSWORD=${PYPI_PASSWORD:=$CI_JOB_TOKEN}
+    - export PYPI_REPO=${PYPI_URL:=https://gitlab.com/api/v4/projects/${CI_PROJECT_ID}/packages/pypi}
+    - python -m twine upload --verbose --repository-url ${PYPI_REPO} dist/*
+  only:
+    refs:
+      - master
+
 tag_release_version:
   stage: version_stage
   script:
@@ -110,7 +128,6 @@ tag_release_version:
     - ''
   only:
     refs:
-      - merge_requests
       - master
 
 release:
@@ -128,7 +145,6 @@ release:
     - ''
   only:
     refs:
-      - merge_requests
       - master
 
 check_version:
diff --git a/notebook/examples/quickstart.ipynb b/notebook/examples/quickstart.ipynb
index 38a1ff7..9339650 100644
--- a/notebook/examples/quickstart.ipynb
+++ b/notebook/examples/quickstart.ipynb
@@ -2,7 +2,6 @@
  "cells": [
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "# SOAM Quickstart\n",
     "How to make an end-to-end project using SOAM modules and tools.\n",
@@ -19,11 +18,11 @@
     "\n",
     "\n",
     "*Note: please ignore the # NBVAL_SKIP at the beginning of the cells!*\n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "## Extraction\n",
     "First of all we need the data; this stage takes care of that by extracting it from the required sources to build the condensed dataset for the next steps. This tends to be project dependent.\n",
@@ -34,30 +33,29 @@
     "2. `SOAM Extractor` object initialization.\n",
     "3. `SQL Query` constructed.\n",
     "4. `Extractor.run` method executed to extract the data from the DB based on the specified query and save it as a `pandas.DataFrame` object."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "1) DB Connection using `muttlib`. We have to set up the sqlite config and then retrieve it using muttlib's `get_client()` method."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 1,
-   "metadata": {},
-   "outputs": [],
    "source": [
     "from soam.workflow.time_series_extractor import TimeSeriesExtractor\n",
     "from muttlib.dbconn import get_client"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 2,
-   "metadata": {},
-   "outputs": [],
    "source": [
     "sqlite_cfg = {\n",
     " \"db_type\": \"sqlite\",\n",
@@ -65,27 +63,28 @@
     "}\n",
     "\n",
     "sqlite_client = get_client(sqlite_cfg)[1]"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "2) `SOAM Extractor` object initialization with the sqlite client of our DB and the table name given as input."
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 3,
-   "metadata": {},
-   "outputs": [],
    "source": [
     "extractor = TimeSeriesExtractor(db=sqlite_client, table_name='stock')"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "3) `SQL Query` constructed using a dictionary to set the configuration of the extraction query. The idea is to convert the full dataset to the desired time granularity and aggregation level by some categorical attribute(s).\n",
     "\n",
@@ -99,13 +98,12 @@
     "\n",
     "\n",
     "*Note: The `build_query_kwargs` dictionary sets the configuration of the extraction query. To see further options on how to build more complex queries, check the `TimeSeriesExtractor` documentation by executing `??TimeSeriesExtractor`*"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 4,
-   "metadata": {},
-   "outputs": [],
    "source": [
     "build_query_kwargs={\n",
     " 'columns': '*',\n",
@@ -115,22 +113,38 @@
     " 'extra_where_conditions': [\"symbol = 'AAPL'\"],\n",
     " 'order_by': [\"date ASC\"]\n",
     "}"
-   ]
+   ],
+   "outputs": [],
+   "metadata": {}
   },
   {
    "cell_type": "markdown",
-   "metadata": {},
    "source": [
     "4) `Extractor.run` method executed to extract the data from the DB based on the specified query *(given as a parameter)* and save it as a `pandas.DataFrame` object to facilitate data manipulation.\n"
-   ]
+   ],
+   "metadata": {}
   },
   {
    "cell_type": "code",
    "execution_count": 5,
-   "metadata": {},
+   "source": [
+    "import pandas as pd\n",
+    "df = extractor.run(build_query_kwargs=build_query_kwargs)\n",
+    "\n",
+    "df.head()"
+   ],
    "outputs": [
     {
+     "output_type": "execute_result",
      "data": {
+      "text/plain": [
+       " id date symbol avg_num_trades avg_price\n",
+       "0 1 2021-03-01 AAPL 80000.0 125.0\n",
+       "1 2 2021-03-02 AAPL 70000.0 126.0\n",
+       "2 3 2021-03-03 AAPL 80000.0 123.0\n",
+       "3 4 2021-03-04 AAPL 70000.0 121.0\n",
+       "4 5 2021-03-05 AAPL 80000.0 119.0"
+      ],
       "text/html": [