Commit 70edbc2

Merge pull request #67 from ThibTrip/add_compatibility_for_sqlalchemy2
[FEAT] Add compatibility for sqlalchemy2
2 parents: 5856c6a + d42746d


48 files changed: +625 −972 lines

.circleci/config.yml

+12-2
```diff
@@ -29,6 +29,13 @@ jobs:
             python3 -m venv venv
             . venv/bin/activate
             pip install -r requirements.txt
+
+      - run:
+          name: check linting with flake8
+          command: |
+            pip install flake8
+            flake8 . --exclude venv
+
       - save_cache:
           paths:
             - ./venv
@@ -40,14 +47,17 @@ jobs:
             # install package (fetches setup.py in current directory)
             pip install .
             # we need cryptography for MySQL
-            pip install codecov coverage numpy pytest pytest-benchmark pytest-cov
+            pip install codecov coverage flake8 numpy pytest pytest-benchmark pytest-cov
             pip install aiosqlite aiomysql asyncpg psycopg2 pymysql cx_Oracle cryptography tabulate npdoc_to_md
             # use pytest
-            ## first test with sqlalchemy latest i.e. sqlalchemy>=1.4 (after API changes, notably engine.has_table being deprecated)
+            ## first test with sqlalchemy latest i.e. sqlalchemy==2
             pytest -sxv pangres --cov=pangres --doctest-modules --sqlite_conn=sqlite:// --async_sqlite_conn=sqlite+aiosqlite:///test.db --pg_conn=postgresql://circleci_user:password@localhost:5432/circleci_test?sslmode=disable --async_pg_conn=postgresql+asyncpg://circleci_user:password@localhost:5432/circleci_test --mysql_conn=mysql+pymysql://circleci_user:password@127.0.0.1:3306/circleci_test --async_mysql_conn=mysql+aiomysql://circleci_user:password@127.0.0.1:3306/circleci_test --benchmark-group-by=func,param:engine,param:nb_rows --benchmark-columns=min,max,mean,rounds --benchmark-sort=name --benchmark-name=short
             ## second test with sqlalchemy<1.4 (before API changes)
             pip install sqlalchemy==1.3.24
             pytest -sxv pangres --cov=pangres --cov-append --doctest-modules --sqlite_conn=sqlite:// --pg_conn=postgresql://circleci_user:password@localhost:5432/circleci_test?sslmode=disable --mysql_conn=mysql+pymysql://circleci_user:password@127.0.0.1:3306/circleci_test --benchmark-group-by=func,param:engine,param:nb_rows --benchmark-columns=min,max,mean,rounds --benchmark-sort=name --benchmark-name=short
+            ## third test with sqlalchemy==1.4.46
+            pip install sqlalchemy==1.4.46
+            pytest -sxv pangres --cov=pangres --cov-append --doctest-modules --sqlite_conn=sqlite:// --pg_conn=postgresql://circleci_user:password@localhost:5432/circleci_test?sslmode=disable --mysql_conn=mysql+pymysql://circleci_user:password@127.0.0.1:3306/circleci_test --benchmark-group-by=func,param:engine,param:nb_rows --benchmark-columns=min,max,mean,rounds --benchmark-sort=name --benchmark-name=short
             codecov
 workflows:
   version: 2
```

.flake8

+4
```diff
@@ -0,0 +1,4 @@
+[flake8]
+ignore = E731
+max-line-length = 120
+
```
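The new `.flake8` ignores E731, flake8's rule against binding a lambda expression to a name. A minimal sketch of what that rule flags (the function names here are illustrative, not from pangres):

```python
# E731 flags assigning a lambda to a name; the `.flake8` above ignores it.
square = lambda x: x * x  # without the ignore, flake8 would report E731 here

# the flake8-preferred equivalent uses `def`:
def square_def(x):
    return x * x

# both behave identically
assert square(4) == square_def(4) == 16
```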

.gitignore

+2-1
```diff
@@ -1,4 +1,5 @@
-# Jupyter LSP
+# Custom
+.idea
 .virtual_documents

 # Byte-compiled / optimized / DLL files
```

README.md

+30-4
````diff
@@ -10,7 +10,7 @@
 _Thanks to [freesvg.org](https://freesvg.org/) for the logo assets_

 Upsert with pandas DataFrames (<code>ON CONFLICT DO NOTHING</code> or <code>ON CONFLICT DO UPDATE</code>) for PostgreSQL, MySQL, SQlite and potentially other databases behaving like SQlite (untested) with some additional optional features (see features). Upserting can be done with **primary keys** or **unique keys**.
-Pangres also handles the creation of non existing SQL tables and schemas.
+Pangres also handles the creation of non-existing SQL tables and schemas.


 # Features
@@ -31,10 +31,22 @@ Pangres also handles the creation of non existing SQL tables and schemas.
 * Python >= 3.6.4
 * See also ./pangres/requirements.txt

+## Requirements for sqlalchemy>=2.0
+
+For using `pangres` together with **`sqlalchemy>=2.0`** (sqlalchemy is one of pangres dependencies
+listed in requirements.txt) - you will need the following base requirements:
+* `alembic>=1.7.2`
+* `pandas>=1.4.0`
+* Python >= 3.8 (`pandas>=1.4.0` only supports Python >=3.8)
+
+## Requirements for asynchronous engines
+
+For using asynchronous engines (such as `aiosqlite`, `asyncpg` or `aiomysql`) you will need **Python >= 3.8**.
+
 # Gotchas and caveats

 ## All flavors
-1. We can't create JSON columns automatically but we can insert JSON like objects (list, dict) in existing JSON columns.
+1. We can't create JSON columns automatically, but we can insert JSON like objects (list, dict) in existing JSON columns.

 ## Postgres

@@ -102,10 +114,10 @@ Note:

 The wiki is generated with a command which uses my library [npdoc_to_md](https://github.com/ThibTrip/npdoc_to_md).
 It must be installed with `pip install npdoc_to_md` and you will also need the extra dependency `fire` which you
-can install with `pip install fire`.
+can install with `pip install fire`. Replace `$DESTINATION_FOLDER` with the folder of you choice in the command below:

 ```bash
-npdoc-to-md render-folder ./wiki/templates ./wiki
+npdoc-to-md render-folder ./wiki_templates $DESTINATION_FOLDER
 ```

 # Contributing
@@ -124,6 +136,8 @@ thanks to [**nb_conda_kernels**](https://github.com/Anaconda-Platform/nb_conda_k

 # Testing

+## Pytest
+
 You can test one or multiple of the following SQL flavors (you will of course need a live database for this): PostgreSQL, SQlite or MySQL.

 NOTE: in one of the tests of `pangres` we will try to drop and then create a PostgreSQL schema called `pangres_create_schema_test`. If the schema existed and was not empty an error will be raised.
@@ -156,3 +170,15 @@ Additionally, the following flags could be of interest for you:
 * `-x` for stopping at the first failure
 * `--benchmark-only` for only testing benchmarks
 * `--benchmark-skip` for skipping benchmarks
+
+## flake8
+
+flake8 must run without errors for pipelines to succeed.
+If you are not using the conda environment, you can install flake8 with: `pip install flake8`.
+
+To test flake8 locally you can simply execute this command:
+
+```
+flake8 .
+```
+
````
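For context on the upsert semantics the README describes, the `ON CONFLICT DO UPDATE` behaviour can be sketched with plain `sqlite3` from the standard library. The raw SQL below only illustrates the semantics (pangres generates such statements through sqlalchemy); the table and row values are made up:

```python
import sqlite3

# in-memory database with a primary key to upsert against
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
con.execute("INSERT INTO users VALUES (1, 'John Travolta')")

# ON CONFLICT DO UPDATE: row 1 conflicts and is updated, row 2 is inserted
con.executemany(
    "INSERT INTO users VALUES (?, ?) "
    "ON CONFLICT(id) DO UPDATE SET name=excluded.name",
    [(1, 'John Doe'), (2, 'Arnold Schwarzenegger')],
)
rows = list(con.execute("SELECT id, name FROM users ORDER BY id"))
assert rows == [(1, 'John Doe'), (2, 'Arnold Schwarzenegger')]
```

With `ON CONFLICT DO NOTHING` instead, the conflicting row 1 would simply be skipped.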

demos/gotchas_asynchronous_pangres.ipynb

+1-1
```diff
@@ -1309,7 +1309,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.7"
+   "version": "3.11.0"
   }
  },
 "nbformat": 4,
```

demos/pangres_demo.ipynb

+1-1
```diff
@@ -599,7 +599,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.9.7"
+   "version": "3.11.0"
   }
  },
 "nbformat": 4,
```

environment.yml

+11-6
```diff
@@ -1,15 +1,20 @@
 name: pangres-dev
+channels:
+  - conda-forge
 dependencies:
   - pip
-  - pytest
-  - pytest-cov
-  - pytest-benchmark
   - psycopg2
-  - pymysql
-  - tabulate
   - pip:
       - asyncpg
       - aiosqlite
       - aiomysql
       - cx_Oracle
-      - npdoc_to_md
+      - cryptography
+      - flake8
+      - mypy
+      - npdoc_to_md
+      - pymysql
+      - pytest
+      - pytest-benchmark
+      - pytest-cov
+      - tabulate
```

pangres/__init__.py

+8-5
```diff
@@ -1,5 +1,8 @@
-from pangres.core import aupsert, upsert
-from pangres.utils import adjust_chunksize, fix_psycopg2_bad_cols
-from pangres.examples import DocsExampleTable
-from pangres._version import __version__
-from pangres.exceptions import *
+from pangres.core import aupsert, upsert  # noqa: F401
+from pangres.utils import adjust_chunksize, fix_psycopg2_bad_cols  # noqa: F401
+from pangres.examples import DocsExampleTable  # noqa: F401
+from pangres._version import __version__  # noqa: F401
+from pangres.exceptions import (BadColumnNamesException, HasNoSchemaSystemException,  # noqa: F401
+                                UnnamedIndexLevelsException,  # noqa: F401
+                                DuplicateValuesInIndexException, DuplicateLabelsException,  # noqa: F401
+                                MissingIndexLevelInSqlException, TooManyColumnsForUpsertException)  # noqa: F401
```
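The wildcard import is replaced by explicit imports marked `# noqa: F401` (flake8's "imported but unused" warning), since these names exist purely to be re-exported from the package root. A tiny illustration of the pattern, using the standard library `json` module as a hypothetical stand-in for a pangres submodule:

```python
# explicit re-exports: flake8 would flag these as unused (F401) without the
# noqa marker, yet they deliberately form this module's public API.
from json import dumps as to_json  # noqa: F401  (re-exported, not used below)
from json import loads as from_json  # noqa: F401

# unlike `from json import *`, readers and linters can see exactly
# which names are exported; the names still work as usual:
assert from_json(to_json({"a": 1})) == {"a": 1}
```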

pangres/_version.py

+1-1
```diff
@@ -1 +1 @@
-__version__ = "4.1.2"
+__version__ = "4.1.3"
```

pangres/core.py

+34-31
```diff
@@ -6,30 +6,31 @@
 that will be directly exposed to its users.
 """
 import pandas as pd
-from sqlalchemy.engine.base import Connectable
-from typing import Optional, Union
+from sqlalchemy.engine import Connectable
+from typing import Union

 # local imports
 from pangres.executor import Executor
 from pangres.helpers import validate_chunksize_param
+from pangres.pangres_types import AsyncConnectable, AUpsertResult, UpsertResult


 # -

 # # upsert

-def upsert(con:Connectable,
-           df:pd.DataFrame,
-           table_name:str,
-           if_row_exists:str,
-           schema:Optional[str]=None,
-           create_schema:bool=False,
-           create_table:bool=True,
-           add_new_columns:bool=False,
-           adapt_dtype_of_empty_db_columns:bool=False,
-           chunksize:Optional[int]=None,
-           dtype:Union[dict,None]=None,
-           yield_chunks:bool=False):
+def upsert(con: Connectable,
+           df: pd.DataFrame,
+           table_name: str,
+           if_row_exists: str,
+           schema: Union[str, None] = None,
+           create_schema: bool = False,
+           create_table: bool = True,
+           add_new_columns: bool = False,
+           adapt_dtype_of_empty_db_columns: bool = False,
+           chunksize: Union[int, None] = None,
+           dtype: Union[dict, None] = None,
+           yield_chunks: bool = False) -> UpsertResult:
     """
     Insert updates/ignores a pandas DataFrame into a SQL table (or
     creates a SQL table from the DataFrame if it does not exist).
@@ -210,7 +211,7 @@ def upsert(con:Connectable,
     ...        if_row_exists='update',
     ...        dtype=dtype,
     ...        create_table=False)
-    >>> 
+    >>>
     >>> # Now we read from the database to check what we got and as you can see
     >>> # John Travolta was updated and Arnold Schwarzenegger was added!
     >>> with engine.connect() as connection:
@@ -299,32 +300,33 @@ def upsert(con:Connectable,
     # execute SQL operations
     if not yield_chunks:
         executor.execute(connectable=con, if_row_exists=if_row_exists, chunksize=chunksize)
+        return None
     else:
         return executor.execute_yield(connectable=con, if_row_exists=if_row_exists, chunksize=chunksize)


 # # Async upsert

-async def aupsert(con,
-                  df:pd.DataFrame,
-                  table_name:str,
-                  if_row_exists:str,
-                  schema:Optional[str]=None,
-                  create_schema:bool=False,
-                  create_table:bool=True,
-                  add_new_columns:bool=False,
-                  adapt_dtype_of_empty_db_columns:bool=False,
-                  chunksize:Optional[int]=None,
-                  dtype:Union[dict,None]=None,
-                  yield_chunks:bool=False):
+async def aupsert(con: AsyncConnectable,
+                  df: pd.DataFrame,
+                  table_name: str,
+                  if_row_exists: str,
+                  schema: Union[str, None] = None,
+                  create_schema: bool = False,
+                  create_table: bool = True,
+                  add_new_columns: bool = False,
+                  adapt_dtype_of_empty_db_columns: bool = False,
+                  chunksize: Union[int, None] = None,
+                  dtype: Union[dict, None] = None,
+                  yield_chunks: bool = False) -> AUpsertResult:
     """
     Asynchronous variant of `pangres.upsert`. Make sure to read its docstring
     before using this function!

     The parameters of `pangres.aupsert` are the same but parameter `con`
     will require an asynchronous connectable (asynchronous engine or asynchronous connection).

-    For example you can use PostgreSQL asynchronously with `sqlalchemy` thanks to
+    For example, you can use PostgreSQL asynchronously with `sqlalchemy` thanks to
     the library/driver `asyncpg`, or SQLite with `aiosqlite` or Mysql with `aiomysql`.

     **WARNING**
@@ -389,12 +391,12 @@ async def aupsert(con,
     >>> df = DocsExampleTable.df
     >>>
     >>> # Create table before inserting! This will avoid race conditions mentionned above
-    >>> # (here we are lazy so we'll use pangres to do that but we could also use a sqlalchemy ORM model)
-    >>> # By using `df.head(0)` we get 0 rows but we have all the information about columns, index levels
+    >>> # (here we are lazy, so we'll use pangres to do that, but we could also use a sqlalchemy ORM model)
+    >>> # By using `df.head(0)` we get 0 rows, but we have all the information about columns, index levels
     >>> # and data types that we need for creating the table.
     >>> # And in a second step (see coroutine `execute_upsert` that we define after)
     >>> # we will set all parameters that could cause structure changes
-    >>> # to False so we can run queries in parallel without worries!
+    >>> # to False, so we can run queries in parallel without worries!
     >>> async def setup():
     ...     await aupsert(con=engine, df=df.head(0),
     ...                   table_name='example',
@@ -455,6 +457,7 @@ async def aupsert(con,
     # execute SQL operations
     if not yield_chunks:
         await executor.aexecute(async_connectable=con, if_row_exists=if_row_exists, chunksize=chunksize)
+        return None
     else:
         # IMPORTANT! NO `await` because this returns an asynchronous generator
         return executor.aexecute_yield(async_connectable=con, if_row_exists=if_row_exists, chunksize=chunksize)
```
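Two recurring changes in this file are purely typing-related: `Optional[X]` is rewritten as the equivalent `Union[X, None]`, and the non-yielding branch gains an explicit `return None` so that both branches of the function visibly return a value. A small sketch of both points (`upsert_sketch` is a hypothetical stand-in, not pangres code):

```python
from typing import Iterator, Optional, Union

# Optional[X] is an alias for Union[X, None]; the two annotations are equivalent
assert Optional[str] == Union[str, None]

def upsert_sketch(yield_chunks: bool = False) -> Union[Iterator[int], None]:
    # mirrors upsert's control flow: the explicit `return None` makes the
    # non-yielding branch's return value obvious to readers and type checkers
    if not yield_chunks:
        return None
    return iter(range(3))

assert upsert_sketch(False) is None
assert list(upsert_sketch(True)) == [0, 1, 2]
```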

pangres/docs/fix_changelog.py

+16-16
```diff
@@ -4,23 +4,27 @@

 Execute the command below first. Make sure to replace the following variables:
 * $PATH_TO_PANGRES -> path to pangres repo on your computer (you have to clone it)
-* `-t $GITHUB_TOKEN` -> optionally give a github token (for much higher API quota)
+* `-t $GITHUB_TOKEN` -> optionally give a GitHub token (for much higher API quota)
 * $OUTPUT_PATH -> where to put the CHANGELOG.md file

-sudo docker run -it --rm -v "$(pwd)":$PATH_TO_PANGRES githubchangeloggenerator/github-changelog-generator -u ThibTrip -p pangres -t $GITHUB_TOKEN -o $OUTPUT_PATH --release-url https://github.com/ThibTrip/pangres/releases/tag/%s
+sudo docker run -it --rm -v "$(pwd)":$PATH_TO_PANGRES githubchangeloggenerator/github-changelog-generator\
+    -u ThibTrip -p pangres -t $GITHUB_TOKEN -o $OUTPUT_PATH\
+    --release-url https://github.com/ThibTrip/pangres/releases/tag/%s

 Usage:

 python fix_changelog.py $PATH_TO_CHANGELOG
 """
 import argparse
+import logging
 import re
 import sys
 from pathlib import Path

-# # Helpers

-# +
+logging.basicConfig(level=logging.INFO, format='%(asctime)s %(message)s')
+
+# region helpers
 re_section_release_notes = re.compile(r'^# [A-Z]{1,}')  # e.g. "# New Features" -> "# N"
 re_release_title_md = re.compile(r'## \[(?P<version>v[\d\.]+)\]')  # see https://regex101.com/r/g6yRM8/1

@@ -43,12 +47,10 @@ def get_release_notes(github_token=None):
     kwargs = dict(headers={'Authorization': f'token {github_token}'}) if github_token else {}
     response = requests.get('https://api.github.com/repos/ThibTrip/pangres/releases', **kwargs)
     response.raise_for_status()
-    return {d['tag_name']:adjust_levels_release_notes(d['body']) for d in response.json()}
+    return {d['tag_name']: adjust_levels_release_notes(d['body']) for d in response.json()}


 def add_release_notes_to_changelog(filepath, github_token=None, dryrun=False):
-    from loguru import logger  # pip install loguru
-
     # open original file
     with open(filepath, mode='r', encoding='utf-8') as fh:
         ch = fh.read()
@@ -69,10 +71,10 @@ def add_release_notes_to_changelog(filepath, github_token=None, dryrun=False):
         version = match_version['version']
         try:
             notes = release_notes[version]
-            logger.info(f'Adding release notes for version {version}')
+            logging.info(f'Adding release notes for version {version}')
             new_ch.extend([line, '\n', '**Release Notes**', '\n', '___', notes, '___', '\n'])
         except KeyError:
-            logger.warning(f'No release notes found for version {version}!')
+            logging.warning(f'No release notes found for version {version}!')
             continue
     new_ch = '\n'.join(new_ch)
     if not dryrun:
@@ -81,19 +83,17 @@ def add_release_notes_to_changelog(filepath, github_token=None, dryrun=False):
     else:
         print(new_ch)
     return new_ch
+# endregion


-# -
-
-# # Main
-
-# +
 def main():
     # parse arguments
     parser = argparse.ArgumentParser(description=sys.modules['__main__'].__doc__)
     parser.add_argument('filepath_change_log', metavar='filepath_change_log', type=str, help="Path to the changelog")
-    parser.add_argument('--github_token', action="store", type=str, default=None, help='Optional github token for higher API quota')
-    parser.add_argument('--dryrun', action="store_true", default=False, help='If True, simply prints what we would save otherwise overwrites the changelog')
+    parser.add_argument('--github_token', action="store", type=str, default=None,
+                        help='Optional github token for higher API quota')
+    parser.add_argument('--dryrun', action="store_true", default=False,
+                        help='If True, simply prints what we would save otherwise overwrites the changelog')
     args = parser.parse_args()
     add_release_notes_to_changelog(filepath=Path(args.filepath_change_log).resolve(), github_token=args.github_token,
                                    dryrun=args.dryrun)
```
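The `re_release_title_md` pattern kept in this script extracts version tags from changelog headings. A quick check of how it matches; the pattern is copied from the file above, while the sample heading is made up in the style of github-changelog-generator output:

```python
import re

# pattern from fix_changelog.py; captures e.g. "v4.1.3" from "## [v4.1.3](...)"
re_release_title_md = re.compile(r'## \[(?P<version>v[\d\.]+)\]')

# hypothetical changelog heading used only for this check
heading = '## [v4.1.3](https://github.com/ThibTrip/pangres/tree/v4.1.3) (2023-01-01)'
match = re_release_title_md.search(heading)
assert match is not None
assert match.group('version') == 'v4.1.3'
```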
