Are we being lazy with memoization?

Before reaching for `||=` is there an opportunity to refactor?

Sometimes we need to memoize/cache expensive operations. A typical case is to avoid multiple database calls. A common Ruby technique is to use the lazy evaluation operator ||=

  def customer
    @customer ||= Customer.find(@customer_id)
  end

If @customer is nil then the query of the right of ||= called.

But I have issues with this

It doesn't work when nil is returned

If the query returns nil , then @customer is nil. The next time ||= is evaluated a database query will be run again making another unnecessary call to the database.

To avoid this we can check if @customer is defined:

  def customer
    return @customer if defined?(@customer)
    @customer = Customer.find(@customer_id)
  end

However this doesn't look idiomatic. Using meta programming always feel obscure to me. We don't need to be so clever, there must be a simpler way.

We are hiding state mutation

It feels odd to mutate an object' s state during its lifecycle. It makes the object harder to reason about as the mental model changes during its life cycle.

A better way: Create required state in constructor

Now call me old fashioned but if an object always needs a certain element in its state, then we should set that state when we construct the object.

So let's make the query at initialization.

  def initialize(customer_id)
    @customer_id = customer_id
    @customer = Customer.find(customer_id)
  end

I believe It's helpful for future readers (this includes you) to know the entire state of the object from reading the constructor. We are also guaranteed that it's only run once.

The best way: Dependancy injection

IMO an even better solution is the pass the dependancy into the object that needs it.

Here the query is made at a higher level and the results passed in.

class AccountData
  def initialize(customer)
    @customer = customer
  end
....

customer = Customer.find(customer_id)
account_data = AccountData.new(customer)

AccountData is decoupled from the database query. It doesn't need to care how to obtain the customer or wether it needs caching.

To do this we need to think about the architecture outside of the object that is using the dependancy. Yes while this may be more work this is a good thing.

Opportunities to rethink existing design and rework them are always good opportunities.

Another benefit: Dependancy injection will improve your tests

Say we have a method in the public interface called account_name that uses the customer.

def account_name
  "#{customer.first_name}, #{customer.last_name}"
end

With customer database query made internal to object

def customer
  @customer ||= Customer.find(@customer_id)
end

I know of two options to test account_name.

Adding to database in test setup

it 'returns customer first name plus last name' do
  customer = Factorybot.create(:customer, first_name: 'Lewis', last_name: 'Jones')  this
  account_data = AccountData.new(customer.id)
  expect(account_data.account_name).to eq('Lewis Jones')
end

Downside of this adds a database write to the test slowing the test.

Stubbing the database query

For performance my previous company disallowed calling the database in unit test so unfortunately this was common.

it 'returns customer first name plus last name' do
  customer = instance_double('Customer', first_name: 'Lewis', last_name: 'Jones')
  allow(Customer).to_receive(:find).and_return(customer)
  account_data = AccountData.new('dummy_id')
  expect(account_data.account_name).to eq('Lewis Jones')
end

Problems with stubbing with allow:

  • We are stubbing an internal mechanism of the object, the test is coupled to implementation
  • We are missing test coverage as we don't check the exact customer id passed to the query. We could add another test for this or use with but I often see that missing with this style of testing
  • Now this may be personal but I really dislike allow. If we need to change something that is not under our explicit control we should refactor so it is.

Testing with dependancy injection

When we are injecting the query into the object like:

class AccountData
  def initialize(customer)
    @customer = customer
  end
....

We can take advantage of dependancy injection in our tests:

it 'returns customer first name plus last name' do
  customer = instance_double('Customer', first_name: 'Lewis', last_name: 'Jones')
  account_data = AccountData.new(customer)
  expect(account_data.account_name).to eq('Lewis Jones')
end

Why is this test beautiful:

  • It's super easy for a human to parse
  • How the input affects the output is obvious
  • We don't need to call the database

What if we don't memoize in every flow?

Say for example we have a database query that is only called in certain code flows:

class AccountData
  def initailize(customer_id, anonymize)
    @customer_id = customer_id
    @anonymize = anonymize
  end 

  def data
    {
      customer: customer_data
    }
  end

  def customer_data
    return { name: 'anonymised' } if @anonymise
    customer
  end

  def customer
    @customer ||= Customer.find(@customer_id)
  end

We only need to query the database if @anonymise is false.

So rather than memorisation is there an opportunity to refactor?

We could encapsulate the customer returning logic in a query

class CustomerQuery
  NullCustomer = Struct.new(:name)

  def self.call(id: id, anonymise: false)
    return NullCustomer.new('anonymised') if anonymise
    Customer.find(id)
  end
end

When refactoring I took the opportunity to use the null object pattern to always return an object that behaves as Customer does.

We can now inject the query into our original object and remove the branching logic.

class AccountData
  def initialize(customer)
    @customer = customer
  end

  def data
    {
      customer: { name: @customer.name }
    }
  end
...
customer = CustomerQuery.call(customer_id, anonymise)
account_data = AccountData.new(customer)

Removing branching code has made it much easer for AccountData.

But what if you can't refactor out ||=

Caveat: As with all examples on blog posts this is a very contrived example. It's works as I have written it this way. Real life examples may not be so easy to tame.

If you've considered a refactor to remove ||= and it's not feasible then by all means use it. However please don't use it as the easy choice. Consider what the pay off is. If 80% of the time you're using the opportunity to refactor and improve your code I think that's good.

Summary

  • Think before reaching for ||= is there a better alternative?
  • If an expensive operation (db query, calculation, http request) is done on every code flow then construct the object with the state
  • Dependancy injection is an elegant solution that will improve tests
  • If you have to memoize then do but please make it the last resort