Are we being lazy with memoization?
Before reaching for `||=` is there an opportunity to refactor?
Sometimes we need to memoize/cache expensive operations. A typical case is to avoid multiple database calls. A common Ruby technique is to use the lazy evaluation operator ||=
def customer
@customer ||= Customer.find(@customer_id)
end
If @customer
is nil
then the query of the right of ||=
called.
But I have issues with this
It doesn't work when nil
is returned
If the query returns nil
, then @customer
is nil
. The next time ||=
is evaluated a database query will be run again making another unnecessary call to the database.
To avoid this we can check if @customer
is defined:
def customer
return @customer if defined?(@customer)
@customer = Customer.find(@customer_id)
end
However this doesn't look idiomatic. Using meta programming always feel obscure to me. We don't need to be so clever, there must be a simpler way.
We are hiding state mutation
It feels odd to mutate an object' s state during its lifecycle. It makes the object harder to reason about as the mental model changes during its life cycle.
A better way: Create required state in constructor
Now call me old fashioned but if an object always needs a certain element in its state, then we should set that state when we construct the object.
So let's make the query at initialization.
def initialize(customer_id)
@customer_id = customer_id
@customer = Customer.find(customer_id)
end
I believe It's helpful for future readers (this includes you) to know the entire state of the object from reading the constructor. We are also guaranteed that it's only run once.
The best way: Dependancy injection
IMO an even better solution is the pass the dependancy into the object that needs it.
Here the query is made at a higher level and the results passed in.
class AccountData
def initialize(customer)
@customer = customer
end
....
customer = Customer.find(customer_id)
account_data = AccountData.new(customer)
AccountData
is decoupled from the database query. It doesn't need to care how to obtain the customer
or wether it needs caching.
To do this we need to think about the architecture outside of the object that is using the dependancy. Yes while this may be more work this is a good thing.
Opportunities to rethink existing design and rework them are always good opportunities.
Another benefit: Dependancy injection will improve your tests
Say we have a method in the public interface called account_name
that uses the customer.
def account_name
"#{customer.first_name}, #{customer.last_name}"
end
With customer
database query made internal to object
def customer
@customer ||= Customer.find(@customer_id)
end
I know of two options to test account_name
.
Adding to database in test setup
it 'returns customer first name plus last name' do
customer = Factorybot.create(:customer, first_name: 'Lewis', last_name: 'Jones') this
account_data = AccountData.new(customer.id)
expect(account_data.account_name).to eq('Lewis Jones')
end
Downside of this adds a database write to the test slowing the test.
Stubbing the database query
For performance my previous company disallowed calling the database in unit test so unfortunately this was common.
it 'returns customer first name plus last name' do
customer = instance_double('Customer', first_name: 'Lewis', last_name: 'Jones')
allow(Customer).to_receive(:find).and_return(customer)
account_data = AccountData.new('dummy_id')
expect(account_data.account_name).to eq('Lewis Jones')
end
Problems with stubbing with allow
:
- We are stubbing an internal mechanism of the object, the test is coupled to implementation
- We are missing test coverage as we don't check the exact customer id passed to the query. We could add another test for this or use
with
but I often see that missing with this style of testing - Now this may be personal but I really dislike
allow
. If we need to change something that is not under our explicit control we should refactor so it is.
Testing with dependancy injection
When we are injecting the query into the object like:
class AccountData
def initialize(customer)
@customer = customer
end
....
We can take advantage of dependancy injection in our tests:
it 'returns customer first name plus last name' do
customer = instance_double('Customer', first_name: 'Lewis', last_name: 'Jones')
account_data = AccountData.new(customer)
expect(account_data.account_name).to eq('Lewis Jones')
end
Why is this test beautiful:
- It's super easy for a human to parse
- How the input affects the output is obvious
- We don't need to call the database
What if we don't memoize in every flow?
Say for example we have a database query that is only called in certain code flows:
class AccountData
def initailize(customer_id, anonymize)
@customer_id = customer_id
@anonymize = anonymize
end
def data
{
customer: customer_data
}
end
def customer_data
return { name: 'anonymised' } if @anonymise
customer
end
def customer
@customer ||= Customer.find(@customer_id)
end
We only need to query the database if @anonymise
is false.
So rather than memorisation is there an opportunity to refactor?
We could encapsulate the customer returning logic in a query
class CustomerQuery
NullCustomer = Struct.new(:name)
def self.call(id: id, anonymise: false)
return NullCustomer.new('anonymised') if anonymise
Customer.find(id)
end
end
When refactoring I took the opportunity to use the null object pattern to always return an object that behaves as Customer
does.
We can now inject the query into our original object and remove the branching logic.
class AccountData
def initialize(customer)
@customer = customer
end
def data
{
customer: { name: @customer.name }
}
end
...
customer = CustomerQuery.call(customer_id, anonymise)
account_data = AccountData.new(customer)
Removing branching code has made it much easer for AccountData
.
But what if you can't refactor out ||=
Caveat: As with all examples on blog posts this is a very contrived example. It's works as I have written it this way. Real life examples may not be so easy to tame.
If you've considered a refactor to remove ||=
and it's not feasible then by all means use it. However please don't use it as the easy choice. Consider what the pay off is. If 80% of the time you're using the opportunity to refactor and improve your code I think that's good.
Summary
- Think before reaching for
||=
is there a better alternative? - If an expensive operation (db query, calculation, http request) is done on every code flow then construct the object with the state
- Dependancy injection is an elegant solution that will improve tests
- If you have to memoize then do but please make it the last resort