I know, "ORMs are great", but they suck too.
Let me cut right to the chase, ORMs model a static relationship, but SQL is about sets. Dynamic sets of data.
I was working on a Django project, and one of the exports on the backend was a little lacking. Okay, it was terrible. It wasn't anything I would have presented to a client in 15 years. It was beyond useless.
Actual output from a website's "export" feature.
What the client actually wants is a list of the profile fields and start times for some activities. The activities are a one-to-many relationship. You know the one:
Table A
-----------------
| id | f | j | k |
------------------
| 1 | a | b | c |
| 2 | d | e | f |
| 3 | h | i | j |
------------------
Table B
---------------------------
| id | q | ordinal | a_id |
---------------------------
| 2 | a | 1 | 1 |
| 3 | b | 2 | 1 |
| 5 | c | 3 | 1 |
| 6 | d | 1 | 2 |
| 7 | e | 2 | 2 |
---------------------------
And then the client wants output like this: (you know the one)
Report 1
-----------------------------------------
| id | f | j | k | b_1_q | b_2_q | b_n_q |
-----------------------------------------
| 1 | a | b | c | a | b | c |
| 2 | d | e | f | d | e | null |
| 3 | h | i | j | null | null | null |
-----------------------------------------
This is a simple activity log related to a profile. The output has to be limtied to N known items from table B - it cannot be dynamic.
The problem with this type of output and ORMs is that the relationship between tables A and B changes. Well, ok, it doesn't actually change, but ORMs cannot view the relationship in any way except their default way of viewing the world.
The SQL for producing Report 1 is simple.
SELECT A.*
,B.q as b_1_q
,C.q as b_2_q
,D.q as b_3_q
FROM table_a A
LEFT JOIN table_b AS B
ON B.a_id = A.id AND ordinal = 1
LEFT JOIN table_b AS C
ON C.a_id = A.id AND ordinal = 2
LEFT JOIN table_b AS D
ON D.a_id = A.id AND ordinal = 3
WHERE -- date range and what not
Traditional "model based" ORMs have no way of flattening out a one-to-many relationship, - at least none that I've seen are capable of doing this.
Build a Model for Reporting
The idea struck me that I could build another model that maps the data the way I need it for a report. But, I can't actually save the data this way. This model would be strictly read-only.
class ProfileExport(models.Model):
"""Complete pseudo-code"""
profile = models.OneToOneField(
'profiles.Profile',
editable=False,
)
activity1 = models.ManyToOneField(
'activity_log.Activity',
editable=False,
)
activity2 = models.ManyToOneField(
'activity_log.Activity',
editable=False,
)
activity3 = models.ManyToOneField(
'activity_log.Activity',
editable=False,
)
Something like this might construct the proper SQL to read and export, but all the methods for saving data would be meaningless at best and dangerous at worst.
I don't want to actually define this model as part of my system, I only want to construct an object to access the data this way.
Reporting Relationships are Fluid
So, if we accept that relationships mutate on an ad-hoc basis when it comes to reporting, systems that provide only a model based ORM are deficient, and will always have one of these three:
- poor data exports (see pic above)
- poor performing exports (thousands of queries per export)
- or poor quality export code (Model hacks for exporting)
One solution would be to hand-write SQL for reporting, but that's not exciting.
Fluid ORM Access
So why can't I define an ORM on the fly?
"""Complete pseudo-code """
profileexport = new models.Model
profileexport.profile = models.ManyToOneField(
'profiles.Profile',
editable=False,
)
profileexport.activity1 = models.ManyToOneField(
'activity_log.Activity',
editable=False,
)
profileexport.activity2 = models.ManyToOneField(
'activity_log.Activity',
editable=False,
)
profileexport.activity3 = models.ManyToOneField(
'activity_log.Activity',
editable=False,
)
Then I could just throw the definition away when I'm done using it. The value of mutating relationships only presents itself during read-only reporting situations, not when validating forms or cascading deletes. See OLTP vs OLAP.
All in all, this method is simply a thin wrapper around SQL, but sometimes you want that wrapper. That wrapper can save you from trying to decrypt hundres of lines of IF statements and hardcoded SQL fragements.
If this sort of on-the-fly ORM definition interests you, checkout my Data Access library MetroDB.
//FYI, here is the MetroDB syntax
$finder = new Metrodb_Dataitem('table_a');
$finder->_cols = ['table_a.*', 'B.q AS b_q_1', 'C.q as b_q_2', 'D.q as b_q_3'];
$finder->hasOne('table_b', 'B' 'a_id')
$finder->hasOne('table_b', 'C' 'a_id')
$finder->hasOne('table_b', 'D' 'a_id')
$finder->andWhere('date', '2016-11-03' '>=');
$finder->andWhere('B.ordinal', 1);
$finder->andWhere('C.ordinal', 2);
$finder->andWhere('D.ordinal', 3);
$finder->_rsltByPkey = false;
$rowList = $finder->findAsArray();