Python 圈内三大 ORM SQLAlchemy VS Django ORM VS Peewee
那,既然已经可以访问数据库,本着『如无必要,勿增实体』的原则,为什么要不辞劳苦的用个库呢?
https://www.thoughtfulcode.com/orm-active-record-vs-data-mapper/
ORM 通常有 DataMapper 实现和 ActiveRecord 实现两种。
依照我的经验,ActiveRecord 使用起来的更接近对象 (Object) 的操作,DataMapper 使用起来更接近 (Table) 的操作。
SQLAlchemy 是 DataMapper 模式的实现,在该模式下,session 会暴露出来,即 Model 与 session 并不耦合。
DjangoORM 是 ActivityRecord 模式的一种实现,在该模式下,session 并不暴露出来,即 Model 与 session 耦合。
使用 Django ORM 的时候,往往是
b = Blog(**data)
b.save()
使用 SQLAlchemy 的时候,往往是
b = Blog(**data)
session.add(b)
session.session(b)
由于 Django 帮你屏蔽了 session 的操作。
在通常情况下,
举个例子,
一对多,Father 添加两个小孩(其中一个小孩是已经存在的)
在 DjangoORM 里面, 这里更像是一个 Set 的 add 操作。
father.children.add(new_child,exsit_child)
在 SQLAlchemy 里面,这里更像是一个 table 的 insert 操作。(麻蛋,你要说是一个 list 的 append 操作也行)
for child in (new_child,exsit_child):
if child in father.children:
father.children.append(child)
写 SQLAlchemy 更接近操作放在数据库里面的数据记录,而 DjangoORM 更接近操作一批放在数据库里面的对象。
由于 session 的使用姿势不同,所以往往会有很多使用上面的区别。
至于孰优孰劣,难以评判。
技术老大 (Flask 和 React 大神)倾向于使用 SQLAlchemy, 他认为
我 (Django 和 Vue 弱鸡)倾向于使用 DjangoORM, 我认为
Active Record, properly defined, is a design pattern where an object is represented as a record on a table in a relational database.
先看一组模型
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base() # 模型基类
from sqlalchemy import Column, Integer, String
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True) # 主键
name = Column(String)
fullname = Column(String)
password = Column(String)
def __repr__(self):
return "<User(name='%s', fullname='%s', password='%s')>" % (
self.name, self.fullname, self.password)
Base.metadata.create_all(engine)
可以看出,包含如下的部分:
https://github.com/zzzeek/sqlalchemy/blob/master/lib/sqlalchemy/sql/schema.py#L3685
母亲有若干个孩子,外键在孩子上。
class Parent(Base):
#...
children = relationship("Child", backref="parent")
class Child(Base):
#...
parent_id = Column(Integer, ForeignKey('parent.id'))
多个母亲共享一个孩子,外键在母亲上。
class Parent(Base):
child_id = Column(Integer, ForeignKey('child.id'))
child = relationship("Child")
class Child(Base):
# ...
One to One 是 One to Many 或者是 Many to One 的简化版本
# Many To One
class Parent(Base):
# ...
child_id = Column(Integer, ForeignKey('child.id'))
child = relationship("Child", backref=backref("parent", uselist=False))
class Child(Base):
# ...
# One To Many 改 One To One
class Parent(Base):
# ...
class Child(Base):
# ...
parent_id = Column(Integer, ForeignKey('parent.id'))
parent = relationship("Parent", backref=backref("child", uselist=False))
association_table = Table('association', Base.metadata,
Column('left_id', Integer, ForeignKey('left.id')),
Column('right_id', Integer, ForeignKey('right.id'))
)
class Parent(Base):
# ...
children = relationship("Child",
secondary=association_table,
backref="parents")
class Child(Base):
# ...
注意事项
执行删除 mapping 表的时候尽量这样。
myparent.children.remove(somechild)
当你想干掉 somechild 的时候,会执行
session.delete(somechild)
class Node(Base):
__tablename__ = 'node'
id = Column(Integer, primary_key=True)
parent_id = Column(Integer, ForeignKey('node.id'))
data = Column(String(50))
children = relationship("Node",
backref=backref('parent', remote_side=[id])
)
c1 = Child(name="苏轼")
session.add(c1)
session.flush()
p = Parent(name="苏辙")
p.best_child = c1
for c in [c1,c2,c3,c4]:
p.children.append(c)
session.add(c1)
session.commit()
过滤
filter(**kwargs)
filter(Singer.name == "周杰伦")
filter(Singer.name =! "周杰棍")
session.query(Entry).join(Blog,Blog.entry_id == Entry.id).filter(Blog.name = "SqlAlchemy CheatSheet")
query = session.query(Order) # query = session.query(Order)
query = query.filter(Order.name.like(f"%name%"))
我们先 type 一下表达式,找到 eq 的类型
type(model.column_name == 'asdf') → sqlalchemy.sql.elements.BinaryExpression
是一个二进制表达式。
# 等于
query.filter(User.name == 'ed')
# 不等于
query.filter(User.name != 'ed')
# Like(有的数据库不区分大小写)
query.filter(User.name.like('%ed%'))
# ILIKE (case-insensitive LIKE)
query.filter(User.name.ilike('%ed%'))
# IN
query.filter(User.name.in_(['ed', 'wendy', 'jack']))
# Not in
query.filter(~User.name.in_(['ed', 'wendy', 'jack']))
# IS NULL
query.filter(User.name == None)
## 如果你用了 pep8/linter 的话
query.filter(User.name.is_(None))
# IS NOT NULL:
query.filter(User.name != None)
## 如果你用了 pep8/linter 的话
query.filter(User.name.isnot(None))
# AND
## use and_()
query.filter(and_(User.name == 'ed', User.fullname == 'Ed Jones'))
## or send multiple expressions to .filter()
query.filter(User.name == 'ed', User.fullname == 'Ed Jones')
## or chain multiple filter()/filter_by() calls
query.filter(User.name == 'ed').filter(User.fullname == 'Ed Jones')
# OR
query.filter(or_(User.name == 'ed', User.name == 'wendy'))
# MATCH
query.filter(User.name.match('wendy'))
all()
first()
one()
one_or_none()
scalar()
YourModel.query.get((pk1, pk2))
同一个 Session 下面,取到的某一条数据对应的 objects 应该是一样的?
# 查询的是 SomeModel 里面所有的字段 即 select *
query = session.query(SomeModel)
# 查询的是 SomeModel 里面部分的字段 即 select acol, bcol
query = session.query(SomeModel.acol,SomeModel.bcol)
# 即 select acol , bcol
# alias
user_alias = aliased(User, name='user_alias')
for row in session.query(user_alias, user_alias.name).all():
# 即相当于 select name as name_label
print(row.user_alias)
# limit 和 offset
for u in session.query(User).order_by(User.id)[1:3]:
print(u)
# distinct
session.query(model.Name).distinct(model.Name.value).order_by(model.Name.value)
# order_by
User.query.order_by(User.popularity.desc(),User.date_created.desc()).limit(10).all()
单个 object 更新
blog.title = "大宝天天见"
session.add(blog)
session.commit()
批量更新
session.query.filter(Blog.content.like("% 敏感词 %")).update({
Blog.content: "依照相关 XX 无法查看"
})
一对多的更新
append
query.delete()
https://stackoverflow.com/questions/6044309/sqlalchemy-how-to-join-several-tables-by-one-query
for u, a in session.query(User, Address).\
filter(User.id==Address.user_id).\
filter(Address.email_address=='jack@google.com').\
all():
print(u)
print(a)
# <User(name='jack', fullname='Jack Bean', password='gjffdd')>
# <Address(email_address='jack@google.com')>
query.outerjoin(User.addresses) # LEFT OUTER JOIN
session.query(User).filter(User.name.like('%ed')).count()
from sqlalchemy import func
session.query(Table.column, func.count(Table.column)).group_by(Table.column).all()
self.session.query(func.count(Table.column1),Table.column1, Table.column2).group_by(Table.column1, Table.column2).all()
from sqlalchemy.sql import func
session.query(func.avg(Rating.field2).label('average')).filter(Rating.url==url_string.netloc)
Query 对象,下文中,我会聊到这个 Query 对象。这里先跳过。
from sqlalchemy import text
sql = text('select name from penguins')
result = db.engine.execute(sql)
names = []
for row in result:
names.append(row[0])
print names
from collections import namedtuple
Record = namedtuple('Record', result.keys())
records = [Record(*r) for r in result.fetchall()]
for r in records:
print(r)
from sqlalchemy.sql import text
connection = engine.connect()
# recommended
cmd = 'select * from Employees where EmployeeGroup == :group'
employeeGroup = 'Staff'
employees = connection.execute(text(cmd), group = employeeGroup)
get_or_create
def get_or_create(session, model, defaults=None, **kwargs):
instance = session.query(model).filter_by(**kwargs).first()
if instance:
return instance, False
else:
params = dict((k, v) for k, v in kwargs.iteritems() if not isinstance(v, ClauseElement))
params.update(defaults or {})
instance = model(**params)
session.add(instance)
return instance, True
session.query(Stuff).update({Stuff.foo: Stuff.foo + 1})
1) for c in session.query(Stuff).all():
c.foo += 1
session.commit()
2) session.query().\
update({"foo": (Stuff.foo + 1)})
session.commit()
3) conn = engine.connect()
stmt = Stuff.update().\
values(Stuff.foo = (Stuff.foo + 1))
conn.execute(stmt)
1) user.no_of_logins += 1
session.commit()
2) session.query().\
filter(User.username == form.username.data).\
update({"no_of_logins": (User.no_of_logins +1)})
session.commit()
3) conn = engine.connect()
stmt = User.update().\
values(no_of_logins=(User.no_of_logins + 1)).\
where(User.username == form.username.data)
conn.execute(stmt)
4) setattr(user, 'no_of_logins', user.no_of_logins+1)
session.commit()
https://stackoverflow.com/questions/5033547/sqlalchemy-cascade-delete
ondelete=‘CASCADE’))
models.User.query.delete()
如何追踪 Object 的变化?
https://stackoverflow.com/questions/1337095/sqlalchemy-inheritance
Flush 和 commit
https://stackoverflow.com/questions/4201455/sqlalchemy-whats-the-difference-between-flush-and-commit
x = Foo(bar=1)
print x.id
# None
session.add(x)
session.flush()
# BEGIN
# INSERT INTO foo (bar) VALUES(1)
# COMMIT
print x.id
qry = DBSession.query(User).filter(
and_(User.birthday <= '1988-01-17', User.birthday >= '1985-01-17'))
>>> from sqlalchemy.dialects import postgresql
>>> print str(q.statement.compile(dialect=postgresql.dialect()))
Where q is defined as:
https://stackoverflow.com/questions/6297404/multi-threaded-use-of-sqlalchemy
https://stackoverflow.com/questions/9619789/sqlalchemy-proper-session-handling-in-multi-thread-applications
https://stackoverflow.com/questions/34322471/sqlalchemy-engine-connection-and-session-difference
https://stackoverflow.com/questions/11769366/why-is-sqlalchemy-insert-with-sqlite-25-times-slower-than-using-sqlite3-directly
https://stackoverflow.com/questions/12223335/sqlalchemy-creating-vs-reusing-a-session
session 是个容器
https://stackoverflow.com/questions/18199053/example-of-what-sqlalchemy-can-do-and-django-orm-cannot
https://stackoverflow.com/questions/7389759/memory-efficient-built-in-sqlalchemy-iterator-generator
# 日志
import logging
logging.basicConfig()
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
SQLALCHEMY_TRACK_MODIFICATIONS = False
dict(u)
u.dict
https://stackoverflow.com/questions/1171166/how-can-i-profile-a-sqlalchemy-powered-application
如果用上 Flask+SQLAlchemy 一般也要带上,Flask-Migration 与 Flask-SQLAlchemy, 这两个库也是对 Alembic 和 SQLAlchemy 的浅封装。
那么,对于这个 ORM 库还有那些通用性的知识需要了解?
嗯,是时候了解本质了。
http://derrickgilland.com/posts/demystifying-flask-sqlalchemy/
ORM 的本质是 Data Access Layer 上的一层封装。如果你写原生 SQL, 即手写 DAL 的话,开发效率可能会大打折扣。
# ActiveRecord 风格写起来类似于 Django ORM, 大致是这样的
## AR 的模型定义
class User(db.models):
name = db.StringField(verbose="xyz")
## AR 的新增
user = User()
user.name = "123456"
user.save() ## 正好对应数据库中的一行
## AR 的查询
users = User.objects.filter(Q(name="黄老板的小姨子")).all()
# Data Mappers 风格写起来类似于 SQLAlchemy ORM, 大致是这样的
## SA 的定义
from sqlalchemy.ext.declarative import declarative_base
Base = declarative_base()
from sqlalchemy import Column, Integer, String
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
name = Column(String)
Base.metadata.create_all(engine)
## SA 的新增
user = User()
user.name = "123456"
session.add(user)
sessoon.commit() ## 嗯?其实也是对应数据库中的一行。
## SA 的查询
session.query(User).filter(User.name)
问题来了,这两者到底是什么,看起来似乎相差不大。
class Person:
lastname
firstname
children
# 数据操作
def findone(self):
pass
def insert(self):
pass
def update(self):
pass
def delete(self):
pass
# 业务逻辑
def getChildrenTax(self):
pass
lastName firstName numberOfDependents
insert update delete
getExemption isFlaggedForAudit getTaxableEarnings
An object that wraps a row in a database table or view, encapsulates the database access, and adds domain logic on that data.
The essence of an Active Record is a Domain Model (116) in which the classes match very closely the record structure of an underlying database. Each Active Record is responsible for saving and loading to the database and also for any domain logic that acts on the data. This may be all the domain logic in the application, or you may find that some domain logic is held in Transaction Scripts (110) with common and data-oriented code in the Active Record.
The data structure of the Active Record should exactly match that of the database: one field in the class for each column in the table. Type the fields the way the SQL interface gives you the data—don’t do any conversion at this stage. You may consider Foreign Key Mapping (236), but you may also leave the foreign keys as they are. You can use views or tables with Active Record, although updates through views are obviously harder. Views are particularly useful for reporting purposes.
objects correspond directly to the database tables: an isomorphic schema. If your business logic is complex, you’ll soon want to use your object’s direct relationships, collections, inheritance, and so forth. These don’t map easily onto Active Record, and adding them piecemeal gets very messy. That’s what will lead you to use Data Mapper (165) instead.
Another argument against Active Record is the fact that it couples the object design to the database design. This makes it more difficult to refactor either design as a project goes forward.
Active Record is a good pattern to consider if you’re using Transaction Script (110) and are beginning to feel the pain of code duplication and the difficulty in updating scripts and tables that Transaction Script (110) often brings. In this case you can gradually start creating Active Records and then slowly refactor behavior into them. It often helps to wrap the tables as a Gateway (466) first, and then start moving behavior so that the tables evolve to a Active Record.
其实为什么不选择设计成 ActiveRecord , 而是选择设计成 Data Mapper, 其实就可以回答这个问题:
虽然要设计成 ORM, 考虑到数量和性能因素,SQL 数据库(多个表)并不应该是表现像 Object 集合那样(换言之,也就是 AR 表现的像 Object 的集合一样)。
同时,出于更好的抽象,object 集合也应该表现的像表以及行
于是我们可以得出结论,可以在 SQLAlchemy 上面进行一定的封装,使得最后用起来非常的 Django ORM like,其实 SQLAlchemy 稍加定制还是可以很 Django ORM-like 的。
:TODO: 有机会看看那本书再修改一下本小节
这不,果然有人就这么搞了 https://github.com/absent1706/sqlalchemy-mixins
session.query(MyClass).filter(“foo={}”.format(getArgs[‘val’]))
https://www.eversql.com/django-vs-sqlalchemy-which-python-orm-is-better/
除了文档本身,作者在 Stack Overflow 上的回答都是非常值得阅读的。https://stackoverflow.com/users/34549/zzzeek
Patterns of Enterprise Application Architecture - Martin Fowler
ChangeLog: