Polymorphic Association in Relational Database

Polymorphic Association

With polymorphic association, a model can belongs_to several models with a single association.

As an example, let’s think of a e-commerce site, both individuals and groups can make an order to buy products. These individuals and groups are not related (other than both being a type of User, maybe), and so they have different data.

Without polymorphic associations, we would have something like this

1
2
3
4
5
6
7
8
9
10
11
12
class Order
belongs_to :person
belongs_to :group
end

class Person
has_many :orders
end

class Group
has_many :orders
end

So, the Order table would have to competing foreign keys: group_id and person_id . This can be a problem, for example, when trying to find the owner of an order, we would have to make a point to check both columns to find correct foreign key, rather than relying on one.

This is a polymorphic association addresses this issue by condensing this functionality into one association. But it’s not easy to correctly represent the association in a relational database. There are four different ways:

  • The Join tables.
  • The Type column.
  • Reverse belongs to.
  • Exclusive arc.

The Type Column

A simple approach to connect an order to a resource is to use two columns on the order table: resource_type and resource_id . This approach was popularized by Ruby on Rails.

Source Code: Branch: type-column

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
class CreateOrders < ActiveRecord::Migration[6.0]
def change
create_table :orders do |t|
t.string :order_ref
t.integer :owner_id
t.string :owner_type

t.timestamps
end
end
end

class Order < ApplicationRecord
belongs_to :owner, polymorphic: true
end

class Person
has_many :orders, as: :owner
end

class Group
has_many :orders, as: :owner
end

# Seed
# Person creating
person = Person.create(name: "John Doe")

# Group creating
group = Group.create(group_name: "Jungles")

# Orders
Order.create(owner: person, order_ref: "Or001")
Order.create(owner: group, order_ref: "Or002")
Order.create(owner: person, order_ref: "Or003")

# Rails console
Person.first.orders
=>[#<Order id: 1, order_ref: "Or001", owner_id: 1, owner_type: "Person", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">,
#<Order id: 3, order_ref: "Or003", owner_id: 1, owner_type: "Person", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">]>

Order.first.owner
=>#<Person id: 1, name: "John Doe", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">

Order.second.owner
=>#<Group id: 1, group_name: "Jungles", created_at: "2020-04-21 06:58:51", updated_at: "2020-04-21 06:58:51">

# SQL query
# select *
# from orders
# where owner_type='Order' and owner_id=1;

The Rails convention for naming a polymorphic association use “-able”. This makes it clear in your relationship which class is polymorphic. But you can use whatever name that you like.

Warning

Polymorphic associations come with one huge red flag: compromised data integrity

In a normal belongs_to relationship, we use foreign keys for reference in an association.

They have more power than just forming a link, though. Foreign keys also prevent referential errors by requiring that the object referenced in the foreign table does, in fact, exist. If someone tries to create an object with a foreign key that references a null object, they will get an error.

Unfortunately, polymorphic classes can’t have foreign keys. We use the resource_type and resource_id columns in place of a foreign key. This means we lose the protection that foreign keys offer.

Rails and ActiveRecord help us out on the surface, but anyone with direct access to the database can create or update objects that reference null objects.

PROS

  • Easy to scale number of models: more models can be easily associated with the polymorphic class.
  • Follow the DRY principle, creates one class that can be used by many other classes.

CONS

  • More tables can make querying more difficult and expensive as the data grows.
  • Cannot have foreign key.
  • Your data integrity is compromised.

The Join tables

In this approach, we don’t create foreign keys on either of the tables that we want to relate together. Instead, we create a join tables to connect them together.

Source Code: Branch: join-tables

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
class CreateOrders < ActiveRecord::Migration[6.0]
def change
create_table :orders do |t|
t.string :order_ref

t.timestamps
end
end
end

class CreateOrdersPeople < ActiveRecord::Migration[6.0]
def change
create_table :orders_people do |t|
t.references :order, foreign_key: true, index: { unique: true }
t.references :person, foreign_key: true, index: true
t.timestamps
end
end
end

class CreateOrdersGroups < ActiveRecord::Migration[6.0]
def change
create_table :orders_groups do |t|
t.references :order, foreign_key: true, index: { unique: true }
t.references :group, foreign_key: true, index: true
t.timestamps
end
end
end

class Order < ApplicationRecord
has_one :orders_person
has_one :orders_group

has_one :person, through: :orders_person
has_one :group, through: :orders_group
end

class OrdersGroup < ApplicationRecord
belongs_to :order
belongs_to :group
end

class OrdersPerson < ApplicationRecord
belongs_to :order
belongs_to :person
end

class Person < ApplicationRecord
has_many :orders_persons
has_many :orders, through: :orders_persons
end

class Group < ApplicationRecord
has_many :orders_groups
has_many :orders, through: :orders_groups
end

# Seed
person = Person.create(name: "John Doe")

group = Group.create(group_name: "Jungles")

order1 = Order.create(order_ref: "Or001")
order2 = Order.create(order_ref: "Or002")
order3 = Order.create(order_ref: "Or003")

OrdersGroup.create(order: order1, group: group)
OrdersPerson.create(order: order2, person: person)
OrdersGroup.create(order: order3, group: group)

# Rails console
Person.first.orders
[#<Order id: 2, order_ref: "Or002", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">]

Group.first.orders
[#<Order id: 1, order_ref: "Or001", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">,
#<Order id: 3, order_ref: "Or003", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">]

Order.first.group
#<Group id: 1, group_name: "Jungles", created_at: "2020-04-21 07:45:03", updated_at: "2020-04-21 07:45:03">

Pros

  • We’re using foreign key constraints, so the database can ensure that any connection between an order and a resource (group/person) is valid.

Cons

  • There is no way to require that the resource (group/person) has an order. And it does not enforce uniqueness across the two owners (e.g one order could be incorrectly connected to both a person and a group).
1
2
3
4
5
6
order4 = Order.create(order_ref: "Or004")
person = Person.first
group = Group.first

OrdersGroup.create(order: order4, group: group)
OrdersPerson.create(order: order4, person: person)

Reverse Belongs-To

We know an order logically belongs to the person or group, but these relationships can be reversed. Instead of having the foreign key on the orders table, we will adding an order_id to people and groups tables.

Source Code: Branch: reverse-belongs-to

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
class CreatePeople < ActiveRecord::Migration[6.0]
def change
create_table :people do |t|
t.string :name
t.integer :order_id

t.timestamps
end
end
end

class CreateGroups < ActiveRecord::Migration[6.0]
def change
create_table :groups do |t|
t.string :group_name
t.integer :order_id

t.timestamps
end
end
end

class CreateOrders < ActiveRecord::Migration[6.0]
def change
create_table :orders do |t|
t.string :order_ref

t.timestamps
end
end
end

class Group < ApplicationRecord
belongs_to :order
end

class Person < ApplicationRecord
belongs_to :order
end

class Order < ApplicationRecord
has_one :group
has_one :person

def owner
@owner ||= group || person
end
end

# Seed
person = Person.create(name: "John Doe")
group = Group.create(group_name: "Jungles")

order1 = Order.create(order_ref: "Or001")
order2 = Order.create(order_ref: "Or002")
order3 = Order.create(order_ref: "Or003")

person.update(order: order1)
group.update(order: order3)

# Rails console
Order.first.person
=>#<Person id: 1, name: "John Doe", order_id: 1, created_at: "2020-04-21 08:48:26", updated_at: "2020-04-21 08:48:26">

Group.first.order
=>#<Order id: 3, order_ref: "Or003", created_at: "2020-04-21 08:48:26", updated_at: "2020-04-21 08:48:26">

Order.last.owner
=>#<Group id: 1, group_name: "Jungles", order_id: 3, created_at: "2020-04-21 08:48:26", updated_at: "2020-04-21 08:48:26">

Pros

  • All resources (Person / Group) are guaranteed to have a valid references to an order.

Cons

  • There is no way to prevent an orphan order record. (e.g order2)
  • There is no way to prevent a Group and a Person from referencing the same order at the database level.

Exclusive Belongs To (aka Exclusive Arc)

Instead of having owner_id and owner_type combo, we will add two foreign keys: group_id and person_id to the Orders table.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class CreatePeople < ActiveRecord::Migration[6.0]
def change
create_table :people do |t|
t.string :name

t.timestamps
end
end
end

class CreateGroups < ActiveRecord::Migration[6.0]
def change
create_table :groups do |t|
t.string :group_name

t.timestamps
end
end
end

class CreateOrders < ActiveRecord::Migration[6.0]
def change
create_table :orders do |t|
t.string :order_ref
t.references :group, foreign_key: true, index: true
t.references :person, foreign_key: true, index: true

t.timestamps
end
end
end

To ensure that an order belongs to exactly one resource (group/person) at anytime, we need to do something. That’s why this way is called Exclusive Arc.

So, technically, there are two ways to enforce the exclusive constraint.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
class Order < ApplicationRecord
belongs_to :group, optional: true
belongs_to :person, optional: true

validate :can_be_only_one

private

def can_be_only_one
return unless existing_group + existing_person != 1

error.add(:base, "can only have one owner")
end

def existing_group
group.present? ? 1 : 0
end

def existing_person
person.present? ? 1 : 0
end
end

class Person < ApplicationRecord
has_many :orders
end

class Group < ApplicationRecord
has_many :orders
end

# Seed
person = Person.create(name: "John Doe")
group = Group.create(group_name: "Jungles")

order1 = Order.create(order_ref: "Or001", person: person)
order2 = Order.create(order_ref: "Or002", group: group)
order3 = Order.create(order_ref: "Or003", group: group)

# Rails console
Person.first.orders
=>[#<Order id: 1, order_ref: "Or001", group_id: nil, person_id: 1, created_at: "2020-04-21 09:32:32", updated_at: "2020-04-21 09:32:32">]

Group.first.orders
=>[#<Order id: 2, order_ref: "Or002", group_id: 1, person_id: nil, created_at: "2020-04-21 09:32:32", updated_at: "2020-04-21 09:32:32">,
#<Order id: 3, order_ref: "Or003", group_id: 1, person_id: nil, created_at: "2020-04-21 09:32:32", updated_at: "2020-04-21 09:32:32">]

# Validates
order = Order.new(order_ref: "Or004", group_id: Group.first.id, person_id: Person.first.id)
order.valid? # false
order.errors
=> #<ActiveModel::Errors:0x00007f90f4351998 @base=#<Order id: nil, order_ref: "Or004", group_id: 1, person_id: 1, created_at: nil, updated_at: nil>,
# @messages={:base=>["can only have one owner"]}, @details={:base=>[{:error=>"can only have one owner"}]}>

NOTE: I’m using PostgreSQL for this way

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
class AddCheckConstraintToOrder < ActiveRecord::Migration[6.0]
def change
execute <<-SQL
ALTER TABLE orders ADD CONSTRAINT order_owner_check CHECK (
(
(group_id is not null)::integer +
(person_id is not null)::integer
) = 1
);
SQL
end
end

class Person < ApplicationRecord
has_many :orders
end

class Group < ApplicationRecord
has_many :orders
end

class Order < ApplicationRecord
belongs_to :group, optional: true
belongs_to :person, optional: true
end

# Seed
person = Person.create(name: "John Doe")
group = Group.create(group_name: "Jungles")

order1 = Order.create(order_ref: "Or001", person: person)
order2 = Order.create(order_ref: "Or002", group: group)
order3 = Order.create(order_ref: "Or003", group: group)

# Rails console
Person.first.orders
=>[#<Order id: 1, order_ref: "Or001", group_id: nil, person_id: 1, created_at: "2020-04-21 09:53:47", updated_at: "2020-04-21 09:53:47">]

Group.first.orders
=>[#<Order id: 2, order_ref: "Or002", group_id: 1, person_id: nil, created_at: "2020-04-21 09:53:47", updated_at: "2020-04-21 09:53:47">,
#<Order id: 3, order_ref: "Or003", group_id: 1, person_id: nil, created_at: "2020-04-21 09:53:47", updated_at: "2020-04-21 09:53:47">]

# Check constraint
order = Order.new(order_ref: "Or004", group_id: Group.first.id, person_id: Person.first.id)
order.save!
(0.4ms) BEGIN
Order Create (1.9ms) INSERT INTO "orders" ("order_ref", "group_id", "person_id", "created_at", "updated_at") VALUES ($1, $2, $3, $4, $5) RETURNING "id" [["order_ref", "Or004"], ["group_id", 1], ["person_id", 1], ["created_at", "2020-04-21 09:56:36.563553"], ["updated_at", "2020-04-21 09:56:36.563553"]]
(0.3ms) ROLLBACK
Traceback (most recent call last):
1: from (irb):10
ActiveRecord::StatementInvalid (PG::CheckViolation: ERROR: new row for relation "orders" violates check constraint "order_owner_check")
DETAIL: Failing row contains (5, Or004, 1, 1, 2020-04-21 09:56:36.563553, 2020-04-21 09:56:36.563553).

Pros

  • Ensure that an order belongs to exactly one resource (group/person) at any type.
  • An order can’t be orphaned.

Cons

  • Take a little extra work to setup
  • It can get a little cumbersome if you have more than three resources.

Conclusion

Due to lack of data integrity guarantees, the Type Column should not be used. The only advantage it has is an ORM such as ActiveRecord may take it very easy to use

The Join tables approach is an improvement over polymorphic joins, but it requires creating extra table for each relationship.

The Reverse Belongs-To models approach has few critical cons.

So, the final approach, the Exclusive Arc. Data integrity is maintained via the check constraint.

However, there are few concerns:

  • Multiple null fields. But in the case of PostgreSQL, null values are almost free.
  • Adding a new table requires adding a column to the exclusive belongs-to table. If this was a large, heavily used table there might be an issue with how long the table would be locked. With PostgreSQL, this is not a problem. Nullable fields can be added quickly regardless of table size. The updated check constraint can also be added without blocking concurrent usage.

In conclusion, I suggest using an Exclusive Arc approach to represent a polymorphic association.

REFERENCE: