RavenDB and the Repository pattern

Written on the topic of Databases.

I recently had a short email exchange with Ayende Rahien and he suggested something I hadn't considered before: not using a Repository pattern.

Background

Allow me to elaborate. Before trying RavenDB, I frequently dealt with data APIs that necessitated the use of the Repository pattern (or at least some pattern of abstraction).

Reasons for using layers of abstraction boil down to:

I was so used to dealing with these issues that trying to shoehorn RavenDB into a repository just seemed natural. I didn't give it much thought until I spoke to Ayende.

Why you should not use the Repository pattern with RavenDB

I wanted to know how this approach would work in real life. I branched my website's local Git repository and removed all the repositories along with the infrastructure that supported them. Now I get what Ayende meant.

Here is an example. This was the old API:

public class BlogPost
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public DateTimeOffset PublishedOn { get; set; }
}

public interface IRepository<T>
{
    T GetById(int id);
}

public interface IBlogPostRepository : IRepository<BlogPost>
{
    IList<BlogPost> GetRecentBlogPosts();
}

public abstract class Repository<T>
{
    protected readonly IDocumentSession Session;

    public Repository(IDocumentSession session)
    {
        Session = session;
    }

    public virtual T GetById(int id)
    {
        return Session.Load<T>(id);
    }
}

public class BlogPostRepository : Repository<BlogPost>, IBlogPostRepository
{
    public BlogPostRepository(IDocumentSession session) : base(session)
    {
    }

    public IList<BlogPost> GetRecentBlogPosts()
    {
        var blogPosts = Session.Query<BlogPost>()
                               .OrderByDescending(bp => bp.PublishedOn)
                               .ToList();

        return blogPosts;
    }
}

public class BlogController : Controller
{
    private readonly IBlogPostRepository _repository;

    public BlogController(IBlogPostRepository repository)
    {
        _repository = repository;
    }

    public ActionResult ViewBlogPost(int id)
    {
        var blogPost = _repository.GetById(id);

        return View(blogPost);
    }

    public ActionResult ViewRecentBlogPosts()
    {
        var blogPosts = _repository.GetRecentBlogPosts();

        return View(blogPosts);
    }
}

Wow, that's a lot of code just to do a couple of simple queries. This is the new API:

public class BlogPost
{
    public int Id { get; set; }
    public string Title { get; set; }
    public string Content { get; set; }
    public DateTimeOffset PublishedOn { get; set; }
}

public class BlogController : Controller
{
    private readonly IDocumentSession _session;

    public BlogController(IDocumentSession session)
    {
        _session = session;
    }

    public ActionResult ViewBlogPost(int id)
    {
        var blogPost = _session.Load<BlogPost>(id);

        return View(blogPost);
    }

    public ActionResult ViewRecentBlogPosts()
    {
        var blogPosts = _session.Query<BlogPost>()
                                .OrderByDescending(bp => bp.PublishedOn)
                                .ToList();

        return View(blogPosts);
    }
}

Can you spot the difference? The repository-free approach brings with it a number of advantages:

When you take the above points into consideration, do you really want to use an abstraction?

But what about…

By now you may be thinking: "Hold on a sec. My application is different. I really need that extra layer". I have picked out three common concerns people have about this.

What if later I decide to switch to a relational database?

Relational databases and document databases have very different modelling requirements. You will have to not only rewrite the data access portion of your code, but also adjust internal repository APIs to handle the new reality.

Also, remember that this is a strategic decision and doesn't happen overnight (if it does where you work, I feel sorry for you).

What if I later decide to switch to another document database?

Switching to another database is no simple task, even if it belongs to the same family of databases. Expect to be dealing with a different API, different usage patterns and different optimisations. The Repository pattern doesn't protect you from that. You still have to rewrite code. You still can't use the advanced features because you are shackled by a layer of abstraction.

Won't this lead to a lot of code duplication?

It won't, provided you use RavenDB correctly. I have seen plenty of examples where people initialise a new DocumentSession for every CRUD operation. Don't do that. Session lifetime management is an infrastructure concern and should be handled at a different level. Initialise your session once, at the beginning of the HTTP request. Close your session at the end of the HTTP request. Reuse it across all operations in between. This way your code simply performs a CRUD operation and there is nothing else to think about. This also allows RavenDB to optimise writes (via batching) and makes unit testing simpler.
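To make the session-per-request idea concrete, here is a minimal sketch. It is not RavenDB code: IStoreLike and ISessionLike are stand-ins I made up so the example compiles on its own, mirroring only the open, SaveChanges and Dispose steps you would perform on the real document store and session. RequestScope, FakeStore and FakeSession are likewise hypothetical names.

```csharp
using System;

// Stand-in for the session: only the two calls this sketch needs.
// (Assumption: in a real application this would be RavenDB's IDocumentSession.)
public interface ISessionLike : IDisposable
{
    void SaveChanges();
}

// Stand-in for the document store that hands out sessions.
public interface IStoreLike
{
    ISessionLike OpenSession();
}

// Hypothetical helper: opens one session at the start of a request,
// shares it with everything the request does, saves once at the end.
public sealed class RequestScope
{
    private readonly IStoreLike _store;

    public RequestScope(IStoreLike store)
    {
        _store = store;
    }

    public void Run(Action<ISessionLike> handleRequest)
    {
        // 'using' disposes the session even if the handler throws;
        // in that case SaveChanges is never reached, so nothing is written.
        using (var session = _store.OpenSession())
        {
            handleRequest(session);
            session.SaveChanges(); // one save for the whole request
        }
    }
}

// In-memory fakes so the sketch can run without a database.
public sealed class FakeSession : ISessionLike
{
    public int SaveCount { get; private set; }
    public bool Disposed { get; private set; }

    public void SaveChanges() { SaveCount++; }
    public void Dispose() { Disposed = true; }
}

public sealed class FakeStore : IStoreLike
{
    public int OpenCount { get; private set; }
    public FakeSession Last { get; private set; }

    public ISessionLike OpenSession()
    {
        OpenCount++;
        Last = new FakeSession();
        return Last;
    }
}

public static class Program
{
    public static void Main()
    {
        var store = new FakeStore();
        var scope = new RequestScope(store);

        // Every load, query and store in the request would use this one session.
        scope.Run(session => { });

        Console.WriteLine("sessions opened: " + store.OpenCount);
        Console.WriteLine("saves: " + store.Last.SaveCount);
    }
}
```

In an ASP.NET MVC application the same three steps would more likely hang off the request pipeline than a helper like this, for example in a base controller's OnActionExecuting and OnActionExecuted overrides or in your IoC container's per-request scope. Either way, the controller actions themselves never open or close a session.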

Sometimes it is ok to use abstractions

I am not saying you should never ever use layers of abstraction with RavenDB. If you have good reasons, then by all means go ahead. I just want you to consider next time whether the need to use abstractions outweighs the advantages of using RavenDB API directly.