Don’t use DAO, use Repository

LMAX Exchange

Data Access Object (DAO) is a commonly used pattern to persist domain objects into a database. The most common form of a DAO pattern is a class that contains CRUD methods for a particular domain entity type.

Assume that I have a domain entity class “Account”:

package com.thinkinginobjects.domainobject;

public class Account {

	private String userName;
	private String firstName;
	private String lastName;
	private String email;
	private int age;

	public boolean hasUseName(String desiredUserName) {
		return this.userName.equals(desiredUserName);
	}

	public boolean ageBetween(int minAge, int maxAge) {
		return age >= minAge && age <= maxAge;
	}
}

Following the common DAO approach, I create a DAO interface:

package com.thinkinginobjects.dao;

import com.thinkinginobjects.domainobject.Account;

public interface AccountDAO {

	Account get(String userName);
	void create(Account account);
	void update(Account account);
	void delete(String userName);

}

The AccountDAO interface may have multiple implementations, which use some kind of O/R mapper or execute plain SQL queries.

The pattern has these advantages:

  • It separates the domain logic that uses it from any particular persistence mechanism or API.
  • The interface method signatures are independent of the content of the Account class. When you add a telephone number field to Account, you don’t need to change the AccountDAO interface or its callers.

However, the pattern leaves many questions unanswered. What if I need to query a list of accounts having a specific last name? Am I allowed to add a method that updates only the email field of an account? What if I switch to a long id instead of userName? What exactly is a DAO responsible for?

The problem with the DAO pattern is that its responsibility is not well defined. Many people think of it as a gateway to the database and add methods to it whenever they find a new way they’d like to talk to the database. Hence it is not uncommon to see a DAO become bloated like the one below.

package com.thinkinginobjects.dao;

import java.util.List;
import com.thinkinginobjects.domainobject.Account;

public interface BloatAccountDAO {

	Account get(String userName);
	void create(Account account);
	void update(Account account);
	void delete(String userName);

	List<Account> getAccountByLastName(String lastName);
	List<Account> getAccountByAgeRange(int minAge, int maxAge);
	void updateEmailAddress(String userName, String newEmailAddress);
	void updateFullName(String userName, String firstName, String lastName);

}

In BloatAccountDAO, I added two query methods to look up Accounts with different parameters. If I had more fields and more use cases that query accounts differently, I would end up writing even more query methods. The consequences are:

  1. Mocking the DAO interface becomes harder in unit tests. I need to implement more methods of the DAO even when my particular test scenario uses only one of them.
  2. The DAO interface becomes more coupled to the fields of the Account object. I have to change the interface and all its implementations if I change the type of a field stored in Account.
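To make the first consequence concrete, here is a rough sketch of a hand-written stub for a test that only needs get(). The Account class is a cut-down stand-in with an assumed constructor, the interface is repeated (with generic return types) so the snippet compiles on its own, and GetOnlyAccountDAOStub is a hypothetical name.

```java
import java.util.List;

// Cut-down stand-in for the article's Account; the constructor is an assumption.
class Account {
    private final String userName;

    Account(String userName) {
        this.userName = userName;
    }

    boolean hasUseName(String desiredUserName) {
        return userName.equals(desiredUserName);
    }
}

// Same eight methods as BloatAccountDAO, repeated so the sketch compiles on its own.
interface BloatAccountDAO {
    Account get(String userName);
    void create(Account account);
    void update(Account account);
    void delete(String userName);
    List<Account> getAccountByLastName(String lastName);
    List<Account> getAccountByAgeRange(int minAge, int maxAge);
    void updateEmailAddress(String userName, String newEmailAddress);
    void updateFullName(String userName, String firstName, String lastName);
}

// Hypothetical stub for a test scenario that only ever calls get().
class GetOnlyAccountDAOStub implements BloatAccountDAO {
    private final Account canned;

    GetOnlyAccountDAOStub(Account canned) {
        this.canned = canned;
    }

    @Override
    public Account get(String userName) {
        return canned.hasUseName(userName) ? canned : null;
    }

    // The remaining seven methods exist only to satisfy the compiler.
    @Override public void create(Account account) { throw new UnsupportedOperationException(); }
    @Override public void update(Account account) { throw new UnsupportedOperationException(); }
    @Override public void delete(String userName) { throw new UnsupportedOperationException(); }
    @Override public List<Account> getAccountByLastName(String lastName) { throw new UnsupportedOperationException(); }
    @Override public List<Account> getAccountByAgeRange(int minAge, int maxAge) { throw new UnsupportedOperationException(); }
    @Override public void updateEmailAddress(String userName, String newEmailAddress) { throw new UnsupportedOperationException(); }
    @Override public void updateFullName(String userName, String firstName, String lastName) { throw new UnsupportedOperationException(); }
}
```

One method carries the behaviour the test cares about; every further method added to the DAO adds another line of filler to each stub like this.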

To make things even worse, I added two additional update methods to the DAO as well. They are the direct result of two new use cases that update different subsets of the fields of an account. They look like harmless optimisations, and they fit into the AccountDAO interface if I naively treat the interface as a gateway to the persistence store. Again, the DAO pattern and its class name “AccountDAO” are too loosely defined to stop me from doing this.

I end up with a fat DAO interface, and I am sure it will only encourage my colleagues to add even more methods to it in the future. One year later I will have a DAO class with 20+ methods, and I can only blame myself for having chosen this weakly defined pattern.

Repository Pattern:

A better pattern is Repository. Eric Evans gave it a precise description in his book [DDD], “A Repository represents all objects of a certain type as a conceptual set. It acts like a collection, except with more elaborate querying capability.”

I go back and design an AccountRepository following this pattern.

package com.thinkinginobjects.repository;

import java.util.List;
import com.thinkinginobjects.domainobject.Account;

public interface AccountRepository {

	void addAccount(Account account);
	void removeAccount(Account account);
	void updateAccount(Account account); // Think of it as a replace operation on a set

	List<Account> query(AccountSpecification specification);

}

The “add” and “update” methods look identical to the create and update methods of my original AccountDAO. The “remove” method differs from the DAO’s delete method by taking an Account object rather than the userName (the Account’s identifier). If you think of the Repository as a Collection, this change makes a lot of sense. You avoid exposing the type of the Account’s identity in the Repository interface. This makes my life easier if I’d later like to use long values to identify accounts.

If you ever wonder about the contracts of the add/remove/update methods, just think about the Collection metaphor. If you are ever tempted to add another update method to the Repository, ask whether it would make sense to add an extra update method to a Collection.

The “query” method is special however. I wouldn’t expect to see a query method in a Collection class. What does it do?

The Repository differs from a Collection when we consider its querying ability. With an in-memory collection, it is simple to iterate through it and find the objects I am interested in. A repository deals with a large set of objects that are typically not in memory when the query is performed. It is not feasible to load every Account from the database if all I want is the Account with a particular user name. Instead, I pass a criterion to the Repository and let it find the object or objects that satisfy the criterion in its own way. The Repository may generate a SQL query if it is backed by a database table, or it may simply iterate through its elements if it is backed by an in-memory collection.

One common implementation of a criterion is the Specification pattern. A specification is a simple predicate that takes a domain object and returns a boolean.

package com.thinkinginobjects.repository;

import com.thinkinginobjects.domainobject.Account;

public interface AccountSpecification {

	boolean specified(Account account);

}

Therefore, I can create one implementation for each different way I’d like to query AccountRepository.
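As an illustration, an in-memory repository could answer query() by running the predicate over everything it holds. This is only a sketch: the Account stand-in has an assumed constructor and a hypothetical hasSameUserName helper, InMemoryAccountRepository is a made-up name, and the two interfaces are repeated so the snippet compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Cut-down stand-in for the article's Account; the constructor and
// hasSameUserName are assumptions added for this sketch.
class Account {
    private final String userName;
    private final int age;

    Account(String userName, int age) {
        this.userName = userName;
        this.age = age;
    }

    boolean hasUseName(String desiredUserName) {
        return userName.equals(desiredUserName);
    }

    boolean ageBetween(int minAge, int maxAge) {
        return age >= minAge && age <= maxAge;
    }

    boolean hasSameUserName(Account other) {
        return userName.equals(other.userName);
    }
}

interface AccountSpecification {
    boolean specified(Account account);
}

interface AccountRepository {
    void addAccount(Account account);
    void removeAccount(Account account);
    void updateAccount(Account account);
    List<Account> query(AccountSpecification specification);
}

// Hypothetical collection-backed implementation.
class InMemoryAccountRepository implements AccountRepository {
    private final List<Account> accounts = new ArrayList<>();

    @Override
    public void addAccount(Account account) {
        accounts.add(account);
    }

    @Override
    public void removeAccount(Account account) {
        accounts.remove(account);
    }

    @Override
    public void updateAccount(Account account) {
        // Replace semantics: drop any stored account with the same user name, then store this one.
        accounts.removeIf(stored -> stored.hasSameUserName(account));
        accounts.add(account);
    }

    @Override
    public List<Account> query(AccountSpecification specification) {
        // Apply the predicate to every element and collect the matches.
        List<Account> matches = new ArrayList<>();
        for (Account account : accounts) {
            if (specification.specified(account)) {
                matches.add(account);
            }
        }
        return matches;
    }
}
```

Because AccountSpecification has a single method, a caller can also pass a lambda, for example repository.query(a -> a.ageBetween(18, 30)).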

The standard Specification works well with an in-memory Repository, but it is too inefficient for a database-backed repository: evaluating the predicate would mean loading every row from the database first.

To work with a SQL-backed AccountRepository implementation, my specifications need to implement the SqlSpecification interface as well.

package com.thinkinginobjects.repository;

public interface SqlSpecification {

	String toSqlClauses();

}

A plain SQL-backed repository can take advantage of this interface and use the partial SQL clauses it produces to perform the database query. If I use a Hibernate-backed repository, I may use a HibernateSpecification interface instead, which generates a Hibernate Criterion when invoked.
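A SQL-backed repository might splice the clause into its select statement roughly as follows. This is a minimal sketch: AccountSqlQueries and selectMatching are hypothetical names, and the table name "account" is an assumption, since the article does not show the schema.

```java
// Repeated from the article so the sketch compiles on its own.
interface SqlSpecification {
    String toSqlClauses();
}

// Hypothetical helper that a SQL-backed repository could call inside query().
final class AccountSqlQueries {

    private AccountSqlQueries() {
    }

    static String selectMatching(SqlSpecification specification) {
        // The table name "account" is assumed for illustration.
        return "SELECT * FROM account WHERE " + specification.toSqlClauses();
    }
}
```

Concatenating clauses like this is only safe when the specification builds them from numbers or otherwise trusted values. A specification that carries user-supplied strings would more likely need to hand bind parameters to a PreparedStatement.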

The SQL and Hibernate-backed repositories do not use the “specified” method; however, I have found it very beneficial to implement it in all cases. That way I can use the same implementation classes with a stub AccountRepository for testing, and also with a caching implementation of the repository that sits in front of the real one.

We can even go a step further and compose Specifications with ConjunctionSpecification and DisjunctionSpecification to perform more complicated queries. However, I feel that is outside the scope of this article. You can find more details and examples in Evans’s book [DDD] if you are interested.

package com.thinkinginobjects.specification;

import org.hibernate.criterion.Criterion;
import org.hibernate.criterion.Restrictions;
import com.thinkinginobjects.domainobject.Account;
import com.thinkinginobjects.repository.AccountSpecification;
import com.thinkinginobjects.repository.HibernateSpecification;

public class AccountSpecificationByUserName implements AccountSpecification, HibernateSpecification {

	private String desiredUserName;

	public AccountSpecificationByUserName(String desiredUserName) {
		super();
		this.desiredUserName = desiredUserName;
	}

	@Override
	public boolean specified(Account account) {
		return account.hasUseName(desiredUserName);
	}

	@Override
	public Criterion toCriteria() {
		return Restrictions.eq("userName", desiredUserName);
	}

}

package com.thinkinginobjects.specification;

import com.thinkinginobjects.domainobject.Account;
import com.thinkinginobjects.repository.AccountSpecification;
import com.thinkinginobjects.repository.SqlSpecification;

public class AccountSpecificationByAgeRange implements AccountSpecification, SqlSpecification{

	private int minAge;
	private int maxAge;

	public AccountSpecificationByAgeRange(int minAge, int maxAge) {
		super();
		this.minAge = minAge;
		this.maxAge = maxAge;
	}

	@Override
	public boolean specified(Account account) {
		return account.ageBetween(minAge, maxAge);
	}

	@Override
	public String toSqlClauses() {
		return String.format("age between %d and %d", minAge, maxAge);
	}

}
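For the in-memory side, the ConjunctionSpecification and DisjunctionSpecification mentioned earlier could be sketched as below. Only the two class names come from the text; the two-argument constructors and the cut-down Account stand-in are assumptions, and the SQL or Hibernate counterparts are left out.

```java
// Cut-down stand-in for the article's Account; the constructor is an assumption.
class Account {
    private final String userName;
    private final int age;

    Account(String userName, int age) {
        this.userName = userName;
        this.age = age;
    }

    boolean hasUseName(String desiredUserName) {
        return userName.equals(desiredUserName);
    }

    boolean ageBetween(int minAge, int maxAge) {
        return age >= minAge && age <= maxAge;
    }
}

interface AccountSpecification {
    boolean specified(Account account);
}

// Satisfied only when both wrapped specifications are satisfied.
class ConjunctionSpecification implements AccountSpecification {
    private final AccountSpecification left;
    private final AccountSpecification right;

    ConjunctionSpecification(AccountSpecification left, AccountSpecification right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public boolean specified(Account account) {
        return left.specified(account) && right.specified(account);
    }
}

// Satisfied when at least one wrapped specification is satisfied.
class DisjunctionSpecification implements AccountSpecification {
    private final AccountSpecification left;
    private final AccountSpecification right;

    DisjunctionSpecification(AccountSpecification left, AccountSpecification right) {
        this.left = left;
        this.right = right;
    }

    @Override
    public boolean specified(Account account) {
        return left.specified(account) || right.specified(account);
    }
}
```

Since a composite is itself an AccountSpecification, composites can be nested, and the repository interface does not need a new method for each combination of criteria.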

Conclusion:

The DAO pattern offers only a loosely defined contract. It is prone to misuse and bloated implementations. The Repository pattern uses the metaphor of a Collection. This metaphor gives the pattern a tight contract and makes it easier for your colleagues to understand.

References:

[DDD] Eric Evans. Domain-Driven Design: Tackling Complexity in the Heart of Software.
