Value Agnostic Types, Part II

October 29, 2011 at 12:01 PM | Code, SQLAlchemy

In last week's post I illustrated how to use a "value agnostic type" in conjunction with an ORM-mapped class, using the @hybrid_proprerty decorator. Pressed for details about the hypothetically better TypeDecorator approach, we'll illustrate that here. The TypeDecorator approach as I've mentioned has a few non-intuitive edges, though after working out the example they're not so bad. I do in fact use the TypeDecorator approach in my day job as often as I use the pure hybrid approach.

On SQL Objects and Types

The aspect of the approach I'm a little unsatisfied with is that SQLAlchemy currently doesn't provide a first class pathway to linking operators with the types they operate upon. This is a situation I'd like to improve in an upcoming release, but let's see what the general idea is.

SQLAlchemy separates the concerns of a "SQL-representation" object and a "type-representation" object. That is, when you deal with a Column associated with a table, it is distinct from its type:

someint = Column('int_col', Integer)
somestring = Column('str_col', String)

where above, we have two Column objects. Each has a type attribute, which refers to the SQL type of the column. Above, these are illustrated as Integer and String.

I've seen other systems that do things more directly:

someint = Integer("int_col")
somestring = String("str_col")

That approach is at first simpler, since you can now put special operators upon Integer and String specific to those types, thereby linking Python's typing behavior directly to that of the database. However this fails to model how the relational database actually works. Suppose we derive new SQL constructs from someint and somestring, a "label" and a "scalar subquery", respectively:

label_of_someint = someint.label("some label")
subquery = select([somestring]).as_scalar()

Now what's the type-dependent behavior of those two objects? In this case, the rationale for SQLAlchemy's approach is clear - each has a type attribute copied straight from the origin expression, which applies the same behavior to the derived construct as the original. On the other hand, the "expression-is-the-type" system would in theory require IntegerLabel, StringLabel, IntegerSubquery, StringSubquery classes - an explosion of subclasses and behaviors awkwardly linked together. Welding the datatype to the SQL expression type doesn't really work.

SQLAlchemy's approach is more comprehensive, but at the moment only has limited hooks into the type when an expression is generated. When an expression is constructed, the type is consulted specifically for a few things, including coercion of the operator, such as coercing the add() operator to the concat() operator in the case of strings, as well as hints on what the return type of the expression should be, and what kind of type the "other" side should be treated as, if not already known. It doesn't yet have a way to add new operators to the SQL construct which contains the type, or to replace a value with a SQL expression - something that would in particular help a lot with things like PostGIS. These problems are entirely solvable, however, and may occur in a future release.

As it turns out, these limitations aren't terrible in the case of our Amount type, and we'll use the TypeDecorator to create a new "currency" type that will work just as well without the ORM, that is with plain SQL expressions, as with it. ORM-mapped classes no longer will need to use @hybrid_property to coerce columns into the type. And we'll even get Amount objects back directly in result sets. Overall, after I tried this out, I realized it's definitely the better way to go.

Reintroducing the Amount type

We'll reuse the Amount type from last week. It's mostly the same, except we'll remove __clause_element__(), and additionally provide a classmethod version of the as_currency() method, which we'll use when dealing with SQL expressions. Because a SQLAlchemy custom type can't export new operations to the enclosing SQL construct, special operations on currency types will be done via this function called externally to the target, rather than a method call from the SQL expression construct itself:

from sqlalchemy.ext.hybrid import hybrid_method
from sqlalchemy import type_coerce, Numeric

class Amount(object):
    def __init__(self, amount, currency):
        self.currency = currency
        self.amount = amount

    def __add__(self, other):
        return Amount(
                self.amount +
                other.as_currency(self.currency).amount,
                self.currency
            )

    def __sub__(self, other):
        return Amount(
                self.amount -
                other.as_currency(self.currency).amount,
                self.currency
            )

    def __lt__(self, other):
        return self.amount < other.as_currency(self.currency).amount

    def __gt__(self, other):
        return self.amount > other.as_currency(self.currency).amount

    def __eq__(self, other):
        return self.amount == other.as_currency(self.currency).amount

    @hybrid_method
    def as_currency(self, other_currency):
        return Amount(
                    currency_lookup[(self.currency, other_currency)] * self.amount,
                    other_currency)

    @as_currency.expression
    def as_currency(self, target, other_currency):
        if hasattr(target, '__clause_element__'):
            target = target.__clause_element__()
        currency = target.type.currency
        return Amount(
                    currency_lookup[(currency, other_currency)]
                     * type_coerce(target, Numeric),
                    other_currency)

    def __str__(self):
        return "%s %s" % (self.amount, self.currency)

    def __repr__(self):
        return "Amount(%s)" % self

The basic thing we've added above is a class level version of as_currency(), which accepts a target and a currency. This would be used like:

Amount.as_currency(mytable.c.balance, "usd")

This method needs to take a peek at what it gets to see if it's directly a SQL expression, or an ORM-oriented expression that needs the __clause_element__() method to resolve down to the SQL expression. Consider this to be "rough edge" number one, if you will, though there's not much to it.

Another special step in that method is that we're using type_coerce(value, Numeric) to coerce the target, which will be a SQL expression using our new TypeDecorator, so that it's treated as a Numeric when multiplying by the exchange rate. If we didn't do that, the type coercion rule used by our type would take effect and assume the exchange rate is also an Amount type, which is not the case. While that coercion rule can be customized to handle an incoming Decimal value, it's more appropriately left as is, since expressions against Amount should always be using another Amount object.

Oddly enough we used @hybrid_method to provide the "dual purposed" behavior of as_currency() at the instance level and at the class level. Nothing to do with the ORM here, just a handy place to use it.

Our TypeDecorator is then pretty much from the docs:

from sqlalchemy import TypeDecorator, Numeric

class AmountType(TypeDecorator):
    impl = Numeric

    def __init__(self, currency):
        self.currency = currency
        super(AmountType, self).__init__()

    def process_bind_param(self, value, dialect):
        if value is not None:
            value = value.as_currency(self.currency).amount
        return value

    def process_result_value(self, value, dialect):
        if value is not None:
            value = Amount(value, self.currency)
        return value

Now if we modify our mapping from last week, we can take out the hybrid, and just use AmountType directly:

class BankAccount(Base):
    __tablename__ = 'bank_account'
    id = Column(Integer, primary_key=True)

    balance = Column('balance', AmountType("usd"))

    def __repr__(self):
        return "BankAccount(%s)" % self.balance

And that's it ! What changes with this usage, and what stays the same ? Most simple things stay the same, such as creating new BankAccount objects:

session.add_all([
    account,
    BankAccount(balance=Amount(10000, "cad")),
    BankAccount(balance=Amount(5700, "usd")),
    BankAccount(balance=Amount(89682, "aud")),
])

as well as basic comparison operations:

>>> print session.query(BankAccount).\
...    filter(BankAccount.balance == Amount(10000, "cad")).one()
SELECT bank_account.id AS bank_account_id,
        bank_account.balance AS bank_account_balance
FROM bank_account
WHERE bank_account.balance = ?
(9886.11,)
BankAccount(9886.1100 usd)

A change is when we want to use an Amount specific operator on the SQL side, namely as_currency(). We use the new class-level version, which produces a new Amount object given against the underlying balance column, coerced into a Numeric and multiplied by the exchange rate:

>>> print Amount.as_currency(BankAccount.balance, "cad")
:param_1 * bank_account.balance cad

In other words, "(exchange rate * numeric balance) in canadian dollars". The returned Amount object maintains the "hybrid" behavior we saw last week:

>>> print session.query(BankAccount).\
...    filter(Amount.as_currency(BankAccount.balance, "cad") > Amount(9999, "cad")).\
...    filter(Amount.as_currency(BankAccount.balance, "cad") < Amount(10001, "cad")).\
...    one()
SELECT bank_account.id AS bank_account_id, bank_account.balance AS bank_account_balance
FROM bank_account
WHERE ? * bank_account.balance > ? AND ? * bank_account.balance < ?
(1.01152, 9999.0, 1.01152, 10001.0)
BankAccount(9886.1100 usd)

With the typed approach, we can finally get Amount objects straight back from a result set:

>>> print session.query(BankAccount.balance).all()
SELECT bank_account.balance AS bank_account_balance FROM bank_account
[(Amount(4000.0000 usd),), (Amount(9886.1100 usd),), (Amount(5700.0000 usd),), (Amount(92338.3808 usd),)]

and it works without the ORM too !

>>> from sqlalchemy import select
>>> bank_account = BankAccount.__table__
>>> for amount, in session.execute(
...            select([bank_account.c.balance]).\
...            where(Amount.as_currency(bank_account.c.balance, "cad") > Amount(9999, "cad"))
...            ).fetchall():
...     print "Amount:", amount, \
...                 "As USD:", amount.as_currency("usd"), \
...                 "As CAD:", amount.as_currency("cad")
SELECT bank_account.balance FROM bank_account
WHERE ? * bank_account.balance > ?
(1.01152, 9999.0)
Amount: 9886.1100 usd As USD: 9886.1100 usd As CAD: 9999.9980 cad
Amount: 92338.3808 usd As USD: 92338.3808 usd As CAD: 93402.1190 cad

Downloads

Download the revised version of account_currency.py: