Commit 9344404

Waldemar Bauer authored and committed
[ADB] add lab 3
1 parent af7293c commit 9344404

1 file changed (+323 -0 lines changed): Advanced databases 2022/Lab 3 (Filtering data in SQLAlchemy)

# Filtering data in SQLAlchemy

The purpose of these laboratory classes is to familiarize participants with methods for creating and executing select queries with conditions.

The scope of these classes:
- using select() - to create a select query
- using query() - to create a query
- and_(), or_(), in_() - to add conditions to a query
- order_by() - to sort results
- label() - to create an alias
- limit() - to limit the results of a query

## Introduction

From the previous classes we know two methods of creating a database model in SQLAlchemy, based on:
- [mapper](https://docs.sqlalchemy.org/en/13/orm/mapping_api.html#sqlalchemy.orm.mapper)
- [Class representation](https://docs.sqlalchemy.org/en/13/orm/tutorial.html)

For both, we must first connect to the database:

```python
from sqlalchemy import create_engine

# url_to_database is the connection URL of your database
engine = create_engine(url_to_database)
```

We can use a script to initialize the mapper:

```python
from sqlalchemy import create_engine, MetaData, Table

metadata = MetaData()

# Reflect every table in the database; the table name is the dictionary key
dic_table = {}
for table_name in engine.table_names():
    dic_table[table_name] = Table(table_name, metadata, autoload=True, autoload_with=engine)

print(repr(dic_table['category']))
```
Here `dic_table` is a dictionary of table representations keyed by table name. The last line of the script prints the representation of the table named *category*.

If we want to use the object representation, we need to run this script:

```python
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base

from sqlalchemy import Column, Integer, String, Date, ForeignKey

session = (sessionmaker(bind=engine))()

Base = declarative_base()

class Category(Base):
    __tablename__ = 'category'
    category_id = Column(Integer, primary_key=True)
    name = Column(String(50))
    last_update = Column(Date)
    def __str__(self):
        return 'Category id:{0}\nCategory name: {1}\nCategory last_update: {2}'.format(self.category_id, self.name, self.last_update)
```
We are now ready to start creating database queries. The advantage of using an ORM is that you don't have to rewrite queries when changing the database engine. The disadvantage, however, is that we are limited to the query structures that the ORM supports.

If this does not suit us, we can of course run a query that we have written ourselves:

```python
stmt = 'select * from category'

results = engine.execute(stmt).fetchall()

print(results)
```

## Basic select

To build a query we can use this script:

```python
from sqlalchemy import select

# select * from category

mapper_stmt = select([dic_table['category']])
print('Mapper select: ')
print(mapper_stmt)

session_stmt = session.query(Category)
print('\nSession select: ')
print(session_stmt)
```

```sql
Mapper select:
SELECT category.category_id, category.name, category.last_update
FROM category

Session select:
SELECT category.category_id AS category_category_id, category.name AS category_name, category.last_update AS category_last_update
FROM category
```
As can be seen, the query built with the session adds aliases to the names of the returned columns. At this stage of building queries, this is the only difference.

To run a query based on the select class:
```python
mapper_results = engine.execute(mapper_stmt).fetchall()
print(mapper_results)
```
As a result of the script, we get a list of tuples representing the values of the table rows. Example:

```python
[(1, 'Action', datetime.datetime(2006, 2, 15, 9, 46, 27)), (2, 'Animation', datetime.datetime(2006, 2, 15, 9, 46, 27)), (3, 'Children', datetime.datetime(2006, 2, 15, 9, 46, 27)), (4, 'Classics', datetime.datetime(2006, 2, 15, 9, 46, 27)), (5, 'Comedy', datetime.datetime(2006, 2, 15, 9, 46, 27)), (6, 'Documentary', datetime.datetime(2006, 2, 15, 9, 46, 27)), (7, 'Drama', datetime.datetime(2006, 2, 15, 9, 46, 27)), (8, 'Family', datetime.datetime(2006, 2, 15, 9, 46, 27)), (9, 'Foreign', datetime.datetime(2006, 2, 15, 9, 46, 27)), (10, 'Games', datetime.datetime(2006, 2, 15, 9, 46, 27)), (11, 'Horror', datetime.datetime(2006, 2, 15, 9, 46, 27)), (12, 'Music', datetime.datetime(2006, 2, 15, 9, 46, 27)), (13, 'New', datetime.datetime(2006, 2, 15, 9, 46, 27)), (14, 'Sci-Fi', datetime.datetime(2006, 2, 15, 9, 46, 27)), (15, 'Sports', datetime.datetime(2006, 2, 15, 9, 46, 27)), (16, 'Travel', datetime.datetime(2006, 2, 15, 9, 46, 27))]
```
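
The calls used so far (`engine.table_names()`, `autoload=True`, `select([...])` with a list, and `engine.execute()`) follow the SQLAlchemy 1.3 API that this lab links to; they were deprecated in 1.4 and removed in 2.0. The following is a minimal sketch of the same reflect, select and execute flow, assuming SQLAlchemy 1.4 or newer. The in-memory SQLite database and the two-row `category` table are hypothetical stand-ins created only so that the sketch runs on its own.

```python
from sqlalchemy import MetaData, Table, create_engine, inspect, select, text

# Assumption: an in-memory SQLite database stands in for the lab database.
engine = create_engine("sqlite://")

# Hypothetical two-row sample of the category table, created only for this sketch.
with engine.begin() as conn:
    conn.execute(text(
        "CREATE TABLE category (category_id INTEGER PRIMARY KEY, name VARCHAR(50))"
    ))
    conn.execute(text(
        "INSERT INTO category (category_id, name) VALUES (1, 'Action'), (2, 'Animation')"
    ))

metadata = MetaData()

# inspect(engine).get_table_names() takes the place of engine.table_names();
# autoload_with=engine on its own triggers reflection.
dic_table = {
    table_name: Table(table_name, metadata, autoload_with=engine)
    for table_name in inspect(engine).get_table_names()
}

category = dic_table["category"]

# Columns are passed positionally instead of inside a list.
mapper_stmt = select(category.c.category_id, category.c.name)

# Statements are executed on an explicit connection instead of on the engine.
with engine.connect() as conn:
    mapper_results = conn.execute(mapper_stmt).fetchall()

print(mapper_results)
```

The rest of this lab keeps the 1.3-style calls, so the sketch is only relevant when a newer SQLAlchemy version is installed.
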
This form of presenting results is inconvenient if the rest of our software is object-oriented. To get the results as class instances, use:

```python
session_results = session_stmt.all()
# print all results
print('All results: ')
print(session_results)
# print information about the first category in the result list
print('\nFirst category:')
print(session_results[0])
```

```python
All results:
[<__main__.Category object at 0x000001F996CB8588>, <__main__.Category object at 0x000001F996CB83C8>, <__main__.Category object at 0x000001F996CB8FC8>, <__main__.Category object at 0x000001F996CB8948>, <__main__.Category object at 0x000001F996C97F88>, <__main__.Category object at 0x000001F996C97988>, <__main__.Category object at 0x000001F996C97EC8>, <__main__.Category object at 0x000001F996C97DC8>, <__main__.Category object at 0x000001F996C97B08>, <__main__.Category object at 0x000001F996C97C48>, <__main__.Category object at 0x000001F996C97C08>, <__main__.Category object at 0x000001F996C974C8>, <__main__.Category object at 0x000001F996C97CC8>, <__main__.Category object at 0x000001F996C7CB88>, <__main__.Category object at 0x000001F996C7CAC8>, <__main__.Category object at 0x000001F996C6A1C8>]

First category:
Category id:1
Category name: Action
Category last_update: 2006-02-15 09:46:27
```
As you can easily see, in this case the overloaded `__str__` method is used when a single object is printed. This approach is very useful when implementing business logic.
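
In the output above, the list is printed as `<__main__.Category object at 0x...>` entries while a single object is printed in the readable form. This happens because `print()` of a list formats each element with `__repr__`, which the class does not define, and `print()` of one object uses `__str__`. A small sketch with a plain Python class (a hypothetical stand-in for the mapped `Category`, with no database involved) shows the difference; adding `__repr__` is an optional suggestion, not part of the lab code.

```python
class Category:
    """Plain stand-in for the mapped Category class (no database needed)."""

    def __init__(self, category_id, name):
        self.category_id = category_id
        self.name = name

    def __str__(self):
        # Used by print(obj) and str(obj).
        return 'Category id:{0}\nCategory name: {1}'.format(self.category_id, self.name)

    def __repr__(self):
        # Used when the object is displayed inside a container such as a list.
        return 'Category(category_id={0!r}, name={1!r})'.format(self.category_id, self.name)


categories = [Category(1, 'Action'), Category(2, 'Animation')]

print(categories[0])  # formatted by __str__
print(categories)     # each element formatted by __repr__
```
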

If we want to create a query for selected columns then we use the following pattern:

```python
mapper_stmt = select([dic_table['category'].columns.category_id, dic_table['category'].columns.name])

session_stmt = session.query(Category.category_id, Category.name)
```
In both cases the query returns a list of row tuples rather than mapped objects. If you want to use object mapping, create a class that maps only the selected columns and build the session query in this way:

```python
class Category_filter(Base):
    __tablename__ = 'category'
    __table_args__ = {'extend_existing': True}
    category_id = Column(Integer, primary_key=True)
    name = Column(String(50))
    def __str__(self):
        return 'Category id:{0}\nCategory name: {1}'.format(self.category_id, self.name)

q = session.query(Category_filter)
print(q)
```

## Select with conditions

To start filtering according to a given criterion:
- mapper option:
```python
mapper_stmt = select([dic_table['category'].columns.category_id, dic_table['category'].columns.name]).where(dic_table['category'].columns.name == 'Games')
```
- session option:
```python
session_stmt = session.query(Category.category_id, Category.name).filter(Category.name == 'Games')
```

We can also use logical conditions, such as:
- or_
- and_
- in_

Example of using or_ and and_ in one query:
```python
from sqlalchemy import or_, and_

mapper_stmt = select([dic_table['category'].columns.category_id, dic_table['category'].columns.name]).\
    where(and_(
        or_(dic_table['category'].columns.category_id > 10, dic_table['category'].columns.category_id < 2),
        or_(dic_table['category'].columns.category_id > 3, dic_table['category'].columns.category_id < 5)))

session_stmt = session.query(Category_filter).\
    filter(and_(
        or_(Category_filter.category_id > 10, Category_filter.category_id < 2),
        or_(Category_filter.category_id > 3, Category_filter.category_id < 5)))
```

If we also want to use the in_ method:
```python
mapper_stmt = select([dic_table['category'].columns.category_id, dic_table['category'].columns.name]).\
    where(and_(
        or_(dic_table['category'].columns.category_id > 10, dic_table['category'].columns.category_id < 2),
        or_(dic_table['category'].columns.category_id > 3, dic_table['category'].columns.category_id < 5),
        dic_table['category'].columns.name.in_(['Sci-Fi', 'Horror', 'Action'])
    ))

# Same conditions for the session; in_ is placed inside and_ to match the mapper version
session_stmt = session.query(Category_filter).\
    filter(and_(
        or_(Category_filter.category_id > 10, Category_filter.category_id < 2),
        or_(Category_filter.category_id > 3, Category_filter.category_id < 5),
        Category_filter.name.in_(['Sci-Fi', 'Horror', 'Action'])
    ))
```
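
To see which rows these conditions keep, it can help to evaluate the same logic in plain Python. The sketch below assumes the sixteen categories shown in the result listing earlier; the dictionary and the helper function `passes_and_or` are illustrative only and do not touch the database. Note that for integer ids the second `or_` group (`id > 3` or `id < 5`) is true for every value, so the first group alone decides the result.

```python
# Category ids and names as they appear in the result listing earlier in this lab.
categories = {
    1: 'Action', 2: 'Animation', 3: 'Children', 4: 'Classics',
    5: 'Comedy', 6: 'Documentary', 7: 'Drama', 8: 'Family',
    9: 'Foreign', 10: 'Games', 11: 'Horror', 12: 'Music',
    13: 'New', 14: 'Sci-Fi', 15: 'Sports', 16: 'Travel',
}


def passes_and_or(category_id):
    """Plain-Python equivalent of and_(or_(id > 10, id < 2), or_(id > 3, id < 5))."""
    return (category_id > 10 or category_id < 2) and (category_id > 3 or category_id < 5)


# Rows kept by the and_/or_ condition.
selected_ids = [cid for cid in categories if passes_and_or(cid)]
print(selected_ids)  # [1, 11, 12, 13, 14, 15, 16]

# Adding the in_ condition narrows the result to the listed names.
wanted_names = ['Sci-Fi', 'Horror', 'Action']
selected_with_in = [cid for cid in selected_ids if categories[cid] in wanted_names]
print(selected_with_in)  # [1, 11, 14]
```
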

## Sort results in query
In both cases it is possible to sort using the order_by function. For ascending sorting, the query will look like this:
```python
mapper_stmt = select([dic_table['category'].columns.category_id, dic_table['category'].columns.name]).\
    where(and_(
        or_(dic_table['category'].columns.category_id > 10, dic_table['category'].columns.category_id < 2),
        or_(dic_table['category'].columns.category_id > 3, dic_table['category'].columns.category_id < 5))).\
    order_by(dic_table['category'].columns.name)

mapper_results = engine.execute(mapper_stmt).fetchall()

print(mapper_results)
```
```python
[(1, 'Action'), (11, 'Horror'), (12, 'Music'), (13, 'New'), (14, 'Sci-Fi'), (15, 'Sports'), (16, 'Travel')]
```
And in reverse:

```python
mapper_stmt = select([dic_table['category'].columns.category_id, dic_table['category'].columns.name]).\
    where(and_(
        or_(dic_table['category'].columns.category_id > 10, dic_table['category'].columns.category_id < 2),
        or_(dic_table['category'].columns.category_id > 3, dic_table['category'].columns.category_id < 5))).\
    order_by(dic_table['category'].columns.name.desc())

mapper_results = engine.execute(mapper_stmt).fetchall()

print(mapper_results)
```
```python
[(16, 'Travel'), (15, 'Sports'), (14, 'Sci-Fi'), (13, 'New'), (12, 'Music'), (11, 'Horror'), (1, 'Action')]
```

The same applies to sessions:

```python
session_stmt_asc = session.query(Category_filter).\
    filter(and_(
        or_(Category_filter.category_id > 10, Category_filter.category_id < 2),
        or_(Category_filter.category_id > 3, Category_filter.category_id < 5))).\
    order_by(Category_filter.name)

session_stmt_desc = session.query(Category_filter).\
    filter(and_(
        or_(Category_filter.category_id > 10, Category_filter.category_id < 2),
        or_(Category_filter.category_id > 3, Category_filter.category_id < 5))).\
    order_by(Category_filter.name.desc())
```

## Alias name

Of course, you can also give columns aliases with the label function. Examples of use:
```python
mapper_stmt = select([dic_table['category'].columns.category_id.label('id'), dic_table['category'].columns.name.label('category name')])
print(mapper_stmt)
```
```sql
SELECT category.category_id AS id, category.name AS "category name"
FROM category
```
```python
session_stmt = session.query(Category_filter.category_id.label('id'), Category_filter.name.label('category name'))
print(session_stmt)
```
```sql
SELECT category.category_id AS id, category.name AS "category name"
FROM category
```

## Limits on the results in query
To limit the number of records returned by the database, we can use the limit function. The following examples illustrate how it works:
```python
mapper_stmt = select([dic_table['category'].columns.category_id.label('id'), dic_table['category'].columns.name.label('category name')]).limit(3)
print(mapper_stmt)
```
```sql
SELECT category.category_id AS id, category.name AS "category name"
FROM category
LIMIT :param_1
```
```python
session_stmt = session.query(Category_filter.category_id.label('id'), Category_filter.name.label('category name')).limit(3)
print(session_stmt)
```
```sql
SELECT category.category_id AS id, category.name AS "category name"
FROM category
LIMIT %(param_1)s
```
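
The two listings above end in different placeholders, `:param_1` and `%(param_1)s`. This is presumably because a statement printed on its own is rendered with SQLAlchemy's default dialect, while a session query is rendered with the dialect of the engine the session is bound to. The sketch below, assuming SQLAlchemy 1.4 or newer and a hand-declared `category` table (no database connection is needed), compiles one statement for several dialects. It also illustrates the earlier point that the Python code stays the same when the database engine changes.

```python
from sqlalchemy import Column, Integer, MetaData, String, Table, select
from sqlalchemy.dialects import postgresql, sqlite

# Hand-declared table so that the sketch needs no database connection.
metadata = MetaData()
category = Table(
    'category', metadata,
    Column('category_id', Integer, primary_key=True),
    Column('name', String(50)),
)

stmt = select(category.c.category_id.label('id')).limit(3)

# Default rendering, as produced by print(stmt): named ":param_1" placeholder.
default_sql = str(stmt)

# Rendering for PostgreSQL: "%(param_1)s" placeholder.
postgres_sql = str(stmt.compile(dialect=postgresql.dialect()))

# Rendering for SQLite: positional "?" placeholders.
sqlite_sql = str(stmt.compile(dialect=sqlite.dialect()))

for title, sql in [('default', default_sql), ('postgresql', postgres_sql), ('sqlite', sqlite_sql)]:
    print('--- {0} ---'.format(title))
    print(sql)
```
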

## Exercise

Use all of these methods to create queries for the test database. Check their execution time using the [profiling and timing code methods](https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling.html).

Queries to write:
1. How many categories of films do we have in the rental?
2. Display the list of categories in alphabetical order.
3. Find the oldest and the youngest film in the rental.
4. How many rentals were there between 2005-07-01 and 2005-08-01?
5. How many rentals were there between 2010-01-01 and 2011-02-01?
6. Find the biggest payment in the rental.
7. Find all customers from Poland, Nigeria or Bangladesh.
8. Where do the staff members live?
9. How many staff members live in Argentina or Spain?
10. Which categories of films were rented by clients?
11. Find all categories of films rented in America.
12. Find all titles of films in which any of these actors played: Olympia Pfeiffer, Julia Zellweger or Ellen Presley.
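
For the timing part of the exercise, one possible approach outside a notebook is a small helper built on `time.perf_counter` from the standard library. The helper name `time_query` and the stand-in workload are hypothetical; this is only a sketch of the idea, and the IPython magics described in the linked chapter remain an alternative.

```python
import time


def time_query(run_query, repeats=5):
    """Run a zero-argument callable several times.

    Returns the result of the last run and the best (smallest) wall-clock
    time in seconds.
    """
    best = float('inf')
    result = None
    for _ in range(repeats):
        start = time.perf_counter()
        result = run_query()
        elapsed = time.perf_counter() - start
        best = min(best, elapsed)
    return result, best


# Stand-in workload so that the sketch runs without a database.
rows, seconds = time_query(lambda: sorted(range(1000), reverse=True))
print('rows: {0}, best time: {1:.6f} s'.format(len(rows), seconds))
```

In the lab, the callable would wrap a query instead, for example `lambda: session_stmt.all()`.
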