- It's the database you already have -
sqlite3
has been built into Python since 2006 - It's screamingly fast, and surprisingly powerful
- Amazing compatibility: bindings for every language, files work on every platform, fantastic track record for backwards compatibility, so it's safe to trust your data to a SQLite file
- Databases are just files on disk. You can create and discard them without any ceremony.
- It handles text (including JSON), integers, floating point numbers, and binary blobs. Which means it can store anything.
- It can handle up to 2.8TB of data(!)
- It has some interesting characteristics, for example Many Small Queries Are Efficient In SQLite
Let's download a database to play with - we'll use the database that powers the https://datasette.io/ website:
wget https://datasette.io/content.db
To access it from Python:
import sqlite3
db = sqlite3.connect("content.db")
print(db.execute("select sqlite_version()").fetchall())
# [('3.39.0',)]
# Show rows from the plugin_repos table
for row in db.execute("SELECT * FROM plugin_repos LIMIT 10"):
print(row)
# Each row is a tuple. We can change that like this:
db.row_factory = sqlite3.Row
for row in db.execute("SELECT * FROM plugin_repos LIMIT 10"):
print(row)
# This outputs <sqlite3.Row object at 0x7f5d3d8a3760>
# We can use dict() to turn those into dictionaries instead
for row in db.execute("SELECT * FROM plugin_repos LIMIT 10"):
print(dict(row))
Let's create a table:
db.execute("""
create table peps (
id integer primary key,
title text,
author text,
status text,
type text,
created text,
body text
);
""")
Here's a function I wrote that can parse a PEP:
def parse_pep(s):
intro, body = s.split("\n\n", 1)
pep = {}
current_key = None
current_value = None
for line in intro.split("\n"):
# If the line starts with whitespace, it's a continuation of the previous value
if line.startswith(" ") or line.startswith("\t"):
if current_key is not None:
current_value += " " + line.strip()
pep[current_key] = current_value.strip()
else:
# Split the line into key and value
parts = line.split(": ", 1)
if len(parts) == 2:
key, value = parts
# Update the current key and value
current_key = key
current_value = value
# Add the key-value pair to the pep dictionary
pep[current_key] = current_value.strip()
pep["Body"] = body.strip()
return pep
Let's fetch and parse the Zen of Python:
import urllib.request
zen = urllib.request.urlopen(
"https://raw.githubusercontent.com/python/peps/main/peps/pep-0020.rst"
).read().decode("utf-8")
pep = parse_pep(zen)
And insert that into our database:
db.execute("""
insert into peps (
id, title, author, status, type, created, body
) values (
?, ?, ?, ?, ?, ?, ?
)
""", (
pep["PEP"],
pep["Title"],
pep["Author"],
pep["Status"],
pep["Type"],
pep["Created"],
pep["Body"],
))
Since this is a dictionary already, we can use alternative syntax like this:
db.execute("delete from peps where id = 20")
db.execute("""
insert into peps (
id, title, author, status, type, created, body
) values (
:PEP, :Title, :Author, :Status, :Type, :Created, :Body
)
""", pep)
To confirm that it was correctly inserted:
print(db.execute("select * from peps").fetchall())
To update a record:
with db:
db.execute("""
update peps set author = ?
where id = ?
""", ["Tim Peters", 20])
This will run in a transaction.
To delete a record:
with db:
db.execute("""
delete from peps
where id = ?
""", [20])
Or to delete everything:
delete from peps
SQLite create table
is easier than many other databases, because there are less types to worry about. There are four types you need to worry about:
integer
real
text
blob
Unlike other databases, length limits are neither required or enforced - so don't worry about varchar(255)
, just use text
.
Tables automatically get an ID column called rowid
- an incrementing integer. This will be the primary key if you don't specify one.
If you specify integer primary key
it will be auto-incrementing and will actually map to that underlying rowid
.
You can set id text primary key
for a text primary key - this will not increment, you will have to set it to a unique value for each row yourself. You could do this with UUIDs generated using uuid.uuid4()
for example.
SQLite is loosely typed by default: you can insert any type into any column, even if it conflicts with the column type!
A lot of people find this very uncomfortable.
As-of SQLite 3.37.0 (2021-11-27) you can set strict mode on a table to opt-out of this loose typing:
create table peps (
id integer primary key,
title text,
author text,
body text
) strict
Here's an example of the impact transactions have on file-based databases:
>>> import sqlite3
>>> db = sqlite3.connect("/tmp/data.db")
>>> db.execute("create table foo (id integer primary key, name text)")
<sqlite3.Cursor object at 0x102ec5c40>
In another window:
% sqlite3 data.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE foo (id integer primary key, name text);
COMMIT;
>>> db.execute('insert into foo (name) values (?)', ['text'])
<sqlite3.Cursor object at 0x102ec5bc0>
In the other window:
% sqlite3 data.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE foo (id integer primary key, name text);
COMMIT;
>>> db.commit()
And now:
% sqlite3 data.db .dump
PRAGMA foreign_keys=OFF;
BEGIN TRANSACTION;
CREATE TABLE foo (id integer primary key, name text);
INSERT INTO foo VALUES(1,'text');
COMMIT;
A nicer pattern is to do this:
>>> with db:
... db.execute('insert into foo (name) values (?)', ['text'])
The with db:
wraps everything inside that block in a transaction.