The Repository Pattern isn’t an Anti-Pattern; You’re just doing it wrong.

Photo by Rob Schreckhise

If you already hate the Repository Pattern, I get it. You had a bad experience with it. Implementing Repos over and over again has no benefits. I 100% agree. I although have not have had the same experience, as I do not recommend that approach. I focus on a zero rework implementation. People don’t give Repositories the benefit of a doubt, as their previous implementations of it had problems. Here I’ll suggest a method of having generic repositories, with plug-in customization, and expanding past the IRepository<T> interface without having to implement the class Repository.

My baseline problem is, did you consider this method before discounting the pattern? I know there is a lot of hate for the pattern. The reason why I’m even making this post is how divergent my implementation was from what the “Gurus” are considering as a bad implementation. If I have had a limited development overhead, and have successfully used the pattern, where a “Guru” had a failure, who’s wrong?

When I say success, I mean 250 developers, 500 million dollar project, 5+ million lines of code, 100k concurrent users. Then though this pattern was more acceptable, and that is why where my debate lies. I posed the question to myself today, should I still be recommending the Repository Pattern for new development? (Got lucky finally a green-field) So, I started reading arguments on various reason not to.

After reading a half dozen blogs, I didn’t have a conclusion but arrived at is everyone just doing it poorly? So let’s talk about common problems in implementations of the Repository Pattern and how we can solve them. At the end, let me know if you still think it is an Anti-Pattern.

Common Mistakes

If you ask 10 developers to design a Repository Pattern from scratch, you would get 10 different patterns. That is the biggest problem with the Repository Pattern. Its such a simple concept, but developers create artificial constraints in their design. Lets go through some common mistakes.

Jon Smith pointed me to MSDN’s implementation of the Repository and Unit Of Work Pattern after reading this post. I personally feel lucky to never have been constrained by that mess. This is probably where most bad practices come from. Their samples definitely are problematic and only in their side notes do they briefly address the issues.

Mistake #1 – Not using generics

Hold on if you already know the problem with using a simple approach to generics. Mistake #3 – Extensibility Points addresses those issues.

So if you were to go into an interview and were asked what is the Repository Pattern, the answer CRUD is acceptable. So at a developer’s first pass at creating a Repository it could look like this.

public interface IPersonRepository
{
    void InsertPerson(Person person);
    void UpdatePerson(Person person);
    void DeletePerson(Person person);
    Person GetById(int personId);
}

Then you look at the database, and it’s 200 tables and you think, “I’m smart I’ll make it generic!”

public interface IRepository<TEntity> where TEntity : class 
{
    void Insert(TEntity entity);
    void Update(TEntity entity);
    void Delete(TEntity entity);

    //Cannot do this anymore since keys can be different per table.
    //TEntity GetById(int id);

    //So you do this
    TEntity Find(params object[] keys);

    //Might as well add this as well.
    IQueryable<TEntity> Query { get; }
}

Mistake #2 – Not using generics enough

But for some reason people stop at the interface with their generics and end up writing this next.

public class PersonRepository : IRepository<Person>
{
    //Hard coded
}

We although can move the generic into the implementation, like this:

public /*sealed*/ class EntityRepository<TEntity> : IRepository<TEntity>
        where TEntity : class 
{
    protected /*private*/ TDbContext Context { get; private set; }

    protected /*private*/ DbSet<TEntity> Set => Context.Set<TEntity>();

    public EntityRepository(TDbContext dbContext)
    {
        Context = dbContext;
    }

    public IQueryable<TEntity> Query => Set;

    public TEntity Find(params object[] keys)
    {
        return Set.Find(keys);
    }

    public void Insert(TEntity entity)
    {
        var entry = Context.Entry(entity);
        if (entry.State == EntityState.Detached)
            Set.Add(entity);
    }

    public void Delete(TEntity entity)
    {
        var entry = Context.Entry(entity);
        if (entry.State == EntityState.Detached)
            Set.Attach(entity);
        Set.Remove(entity);
    }

    public void Update(TEntity entity)
    {
        var entry = Context.Entry(entity);
        if (entry.State == EntityState.Detached)
            Set.Attach(entity);
        entry.State = EntityState.Modified;
    }
}

Note: Do not make this class abstract, even consider making it sealed.

Bonus: You can put IRepository in Framework.Data.dll and EntityRepository in Framework.Data.EF6.dll or Framework.Data.EFCore.dll. This will make it so classes that use IRepository do not have to have a reference to EF.

Looks great. We can use a Container to resolve IRepository<Person> to EntityRepository<MyContext, Person> and life is good. Development can do CRUD on Repositories, and don’t need to explicitly create them.

Side Note: The first time I showed the above class to a Senior Architect seven years ago, he said to me: “Is this some type of dictionary? I expect with two generics it would be a dictionary.” It is interesting that people don’t quite get the full use of generics.

Mistake #3 – Extensibility Points

All is good in the world, until day two of development and someone needs to do something custom. So this happens:

public class EntityRepository<TEntity> : IRepository<TEntity>
        where TEntity : class 
{
    // Sections removed for brevity

    protected virtual void Inserting(TEntity entity)
    {
        //Template method
    }

    protected virtual void Inserted(TEntity entity)
    {
        //Template method
    }
 
    public void Insert(TEntity entity)
    {
        Inserting(entity);
        var entry = Context.Entry(entity);
        if (entry.State == EntityState.Detached)
            Set.Add(entity);
        Inserted(entity);
    }  
}

public interface IPersonRepository : IRepository<Person>
{
    IEnumerable<Person> GetPeopleByName(string firstName, string lastName);
}

public class PersonRepository : EntityRepository<MyContext, Person>, IPersonRepository
{
    protected override void Inserting(TEntity entity)
    {
       if(!CurrentUser.IsAdmin)
       {
           throw new SecurityException();
       }
    }

    public IEnumerable<Person> GetPeopleByName(string firstName, string lastName)
    {
        return Query.Where(p => p.FirstName == firstname && p.LastName == lastName);
    }
}

Now we are kind of back to where we started. Custom interfaces resolving to custom classes, but now we have a little bit of reusable code.

First, don’t create custom interfaces per table. There is no need, use extension methods. Hopefully, you’re going to use this for something more useful than the example though.

public static class PersonRepositoryExtensions
{
    public static IQueryable<Person> GetPeopleByName(this IRepository<Person> repository, 
                string firstName, string lastName)
    {
        return repository.Query.Where(p => p.FirstName == firstname && p.LastName == lastName);
    }
}

You can even take this a bit further if required and apply a convention.

public interface INamedIndividual
{
    string FirstName { get; set; }
    string LastName { get; set; }
}

//Apply the interface to entity via partials
public partial Person : INamedIndividual
{
}

public partial Employee : INamedIndividual 
{
}

public static class IndividualRepositoryExtensions
{
    public static IQueryable<TIndividual> GetByName<TIndividual>
            (this IRepository<TIndividual> repository, string firstName, string lastName) 
                where TIndividual : INamedIndividual
    {
        return repository.Query.Where(p => p.FirstName == firstname && p.LastName == lastName);
    }
}

Now we can add queries to specific generics based off of class or interface. We do not need to create custom interfaces or inherited classes, and it works quite nice.

Side Note: Did you know you could create a constrained generic extension method? I don’t blame you if you didn’t. They work surprisingly well; even with intelli-sense.

Going back the the template method pattern above, did you notice the bug? Attached entities could bypass that check. This is where to me things get a bit more complicated. What I do is a Rules Service, in light of a better name. I’ll talk about that soon.

Mistake #4 – Not sharing the context

If every Repository you create, you create a new DbContext for that Repository, the scope of work you can do is limited by the Repository. You need to come up with a mechanism for multiple Repositories to use the same context. The Unit Of Work pattern can be used to allow this. Imagine the following example.

using(var unitOfWork = _unitOfWorkManager.Start()) //
{
    //var personRepo = locator.GetService<IRepository<Person>>();
    //var employeeRepo = locator.GetService<IRepository<Employee>>();
    //Or constructor injection with stateless Repositories

    _personRepo.Insert(...);
    _employeeRepo.Insert(...);

    unitOfWork.Save();
}

Sharing the context between the two Repositories means one payload of data will be sent over to the database. Additionally you could join between the two queries on the repository and execute one query in SQL. Not having a shared context limits what you can do, and causes performance issues.

Adding Unit Of Work Pattern

Notice I also never defined a Save or Submit method. The Unit Of Work Pattern defines the beginning and end of a transaction, and to me is also responsible for defining where Save actually occurs. So we can define UnitOfWork as such.

Core level abstractions

First in my Unit Of Work I want to be able to event back when it is disposed. Pretty much we need to notify the UnitOfWorkManager the scope has ended to remove the UnitOfWork.

public interface IDisposing : IDisposable
{
    event EventHandler Disposing;
}

public abstract class Disposable : IDisposing
{
    protected virtual void Dispose(bool disposing)
    {
    }

    public void Dispose()
    {
        Disposing?.Invoke(this, new EventArgs());
        Dispose(true);
        GC.SuppressFinalize(this);
    }

    public event EventHandler Disposing;
}

Unit of Work Interface

The Unit Of Work interface is simple itself. It can Submit and Dispose. In the past I’ve had Submit on the Repositories, which really didn’t make sense. The Repositories all shared the context, so saving one would save all. It turned into a mess where developers thought they had to save multiple times.

public interface IUnitOfWork : IDisposing
{
    void Submit();

    bool HasEnded { get; }
}

Unit Of Work Manager

The next thing we have is a UnitOfWorkManager

public static class UnitOfWorkManager : IUnitOfWork
{
    [ThreadStatic] \\[AsyncLocal]
    private static IUnitOfWork _unitOfWork;

    private Func<IUnitOfWork> _unitOfWorkFunc;

    public UnitOfWorkManager(Func<IUnitOfWork> unitOfWorkFunc)
    {
        _unitOfWorkFunc = unitOfWorkFunc;
    }

    public IUnitOfWork CurrentUnitOfWork => _unitOfWork;

    public IUnitOfWork Begin()
    {
        if (_unitOfWork != null)
        {
            return _unitOfWork;
        }
       
        var unitOfWork = _unitOfWorkFunc();

        if (unitOfWork == null)
        {
            throw new InvalidOperationException("Could not resolve Unit Of Work.");
        }

        unitOfWork.Disposing += UnitOfWorkDisposing;
        _unitOfWork = unitOfWork;
        return _unitOfWork;        
    }

    private void UnitOfWorkDisposing(object sender, EventArgs e)
    {
        if (sender != _unitOfWork)
        {
            throw new InvalidOperationException("Sender for ending unit of work mis-matched.");
        }
        _unitOfWork = null;
    }
}

You can use a container as well to have a scoped instance of Unit Of Work, that is fine. This example is managing the scope of it for you. This is important as it will use only one instance of a context between all the repositories used in your Unit Of Work.

The Thread Static \ Async Local instance, will keep all calls in the scope using the same Unit Of Work which uses the same DbContext. The scope is specifically where you want to use it, and isn’t too large or small.

Entity Unit Of Work

The next thing we need to do, is make a specific Unit Of Work for EF. Here are going to implement Submit and Dispose. One additional method is needed, which is to store and create the context that is being used.

public class EntityUnitOfWork : Disposable, IUnitOfWork
{
    private DbContext _context;

    private Func<DbContext> _contextFunc;

    public bool HasEnded { get; private set; }

    public EntityUnitOfWork(Func<DbContext> contextFunc)
    {
        _contextFunc = contextFunc;
    }

    public TDbContext GetContext<TDbContext>() where TDbContext : DbContext
    {
        if (HasEnded)
        {
            throw new ObjectDisposedException("Unit of Work has been disposed.");
        }

        return (TDbContext)(_context ??= _contextFunc());
    }

    public void Submit()
    {
        if (HasEnded)
        {
            throw new ObjectDisposedException("Unit of Work has been disposed.");
        }

        if(_context != null)
        {
            RulesService.ApplyInsertRules(_context.Changes(EntityState.Added));
            RulesService.ApplyDeleteRules(_context.Changes(EntityState.Modified));
            RulesService.ApplyUpdateRules(_context.Changes(EntityState.Deleted));

            _context.SaveChanges();
        }
    }

    protected override void Dispose(bool disposing)
    {
        if (!HasEnded)
        {
            _context?.Dispose();
            HasEnded = true;
        }
        base.Dispose(disposing);
    }
}

public static class DbContextExtensions
{
    public static IEnumerable<object> Changes(this DbContext context, EntityState state)
    {
        return context.ChangeTracker.Entries().Where(x => x.State == state).Select(x => x.Entity);
    }
}

I added an extension method for the DbContext to get the changes, and also send those changes to a RulesService. This is where we will be applying any custom rules per entity type or convention. The ChangeTracker right before Save Changes is the only way to find all changes. The CRUD Repository methods can have attached properties on the entities and the logic could be skipped.

One thing to note is, if lets say a Delete Rule Inserts an entity, the insert rule could be skipped. If this is an important factor for you, you might want to re-work this method. You could deep clone the changeset and run of the rules until the changeset doesn’t have and differences.

Additionally, for simplicity I didn’t account for having multiple DbContexts. One could add logic to allow for these feature. I have in the past but it was never used.

Updating the Repository

One more thing we would need to do to make this work, and that is to update our EntityRepository to use the shared context from the Unit Of Work. The UnitOfWorkManager although just uses a IUnitOfWork and doesn’t know about the context. So some casting is required to get the specific type. You could make this a bit more elaborate of course, but this would work just fine.

public class EntityRepository<TEntity> : IRepository<TEntity>
        where TEntity : class 
{   
    private DbContext _context = ((EntityUnitOfWork)unitOfWorkManagerFunc.CurrentUnitOfWork).GetContext();

    protected TDbContext Context => _context;

    protected DbSet<TEntity> Set => Context.Set<TEntity>();

    private Func<IUnitOfWorkManager> _unitOfWorkManagerFunc;

    public EntityRepository(Func<IUnitOfWorkManager> unitOfWorkManagerFunc)
    {
        _unitOfWorkManagerFunc = unitOfWorkManagerFunc;
    }

    //Rest is removed for brevity
}

Rules Outside of the Repository Pattern

Its pretty apparent that there is a need for rules to be applied for Repositories. Under my outline though, there really isn’t a way to do that. That is based off of design. The problem is if there was a Repository for every table that is too many, if there is a Repository for every concern it doesn’t make sense. Making a hierarchy based off of different rules doesn’t work.

public class AuditableRepository<TDbContext, TEntity> : EntityRepository<TDbContext, TEntity>
{

}

public class SoftDeleteRepository<TDbContext, TEntity> : EntityRepository<TDbContext, TEntity>
{

}

public class AuditableAndSoftDeleteRepository<TDbContext, TEntity> : SoftDeleteRepository?...

So the next logical step would be to centralize that logic, and you could have something like this:

public class PersonRepository : EntityRepository<MyContext, Person>
{
    protected override void Updated(Person entity)
    {
        Auditor.Apply(entity);
    }

    protected override void Deleted(Person entity)
    {
        SoftDeleter.Apply(entity);
    }
}

A paradigm like this would separate the logic out, and allow for tables to use multiple rules. It although is just one step away from a Custom Repository free solution.

My solution has been a Rules Service, this can be a Event Aggregator, a Functional Paradigm or something custom. It’s up to you on how to define this. One recommendation is to account for variance, so rules can be applied based off of convention. Here is a very simple example.

public static class RulesService
{
    private static readonly List<Action<object>> DeleteRules = new List<Action<object>>();

    public static void AddDeleteRule(Action<object> deleteRule)
    {
        DeleteRules.Add(deleteRule);
    }

    public static void AddDeleteRule<TEntity>(Action<TEntity> deleteRule)
    {
        DeleteRules.Add(x =>
        {
            if (x is TEntity entity)
            {
                deleteRule(entity);
            }
        } );
    }

    public static void ApplyDeleteRules(IEnumerable<object> deleting)
    {
        foreach (var entity in deleting)
        {
            foreach (var rule in DeleteRules)
            {
                rule(entity);
            }
        }
    }
}

This is shown for simplicity. The ones I have designed in the past have been a bit more complicated. Using Expression Trees to compile custom statements per type to squeeze that last bits of performance out.

One additional place we can apply the Rules Service is in the Repository for queries.

public class EntityRepository<TEntity> : IRepository<TEntity>
        where TEntity : class 
{
    // Sections removed for brevity
    public IQueryable<TEntity> Query => Set.ApplyReadRules();

    public TEntity Find(params object[] keys)
    {
        var entity = Set.Find(keys);
        return RulesService.CanRead(entity) ? entity : null;
    }
}

Note: Read Rules should be Expressions so they can be applied to queries and compiled for Finds.

Where does this leave us?

Looking at the Repository Pattern as outline. There is zero development outside of Framework level reusable components for each repository. Creating a repository is as simple as configuring your container, or adding the entity with an auto-config container.

Our custom behavior can still be defined as extensions which also can be applied conventionally instead of just off of the entity.

Our custom rules are defined outside the scope of the Repositories, and can be added in based off of how we setup our Rules Service. Again this can work off of convention and have zero development cost when adding a new entity.

Testing

Personally I find using Moq on every unit test becomes a bit overwhelming. I also find in-memory test databases limited. I often find creating a Testing or a TestHelper API that allows data setup and assertions can simplify testing greatly. For example this is pretty simple:

using(var mock = DataMocking.Start())
{
    var person = new Person 
    {
        FirstName = "Ada",
        LastName = "Lovelace"
    };
    mock.AddTestData(person);
    
    Business.DoOperation();

    CollectionAssert.AreEquivalent(new [] { person }, mock.DeletedItems<Person>());
}

Here DataMocking.Start() is re-configuring the container used by the framework. Instead of returning a EntityUnitOfWork and a EntityRepository it will return a MockUnitOfWork and a MockRepository. The returned mock object is a cache of all the used data objects which the MockRepository will use. Disposing of the mock, clears out the data used for the test.

Conclusion

Photo by Kelly Sikkema

The design laid out here doesn’t have the problems that usually are stated. This leaves me thinking, is Repositories really a bad pattern or were there just a ton of bad implementations of it? What are your thoughts now that I outlined more around how to implement this is a different way? Is this still wrong? Maybe, but personally I’m having a hard time separating naysayers from valid opinions.

Just be clear this is a thought exercise for me. I am not leaning one way or the other. It’s just hard for me to accept other people’s failures as a reason, when I’ve been successful with minor tweaks.

Additional Content

In Part 2 I address Typical Anti-Repository Arguments.

In Part 3 I address Retrospective and Clarification.

Please check it out.

12 thoughts on “The Repository Pattern isn’t an Anti-Pattern; You’re just doing it wrong.

  1. Hi. Nice article.
    After you finish all the implementation, have you notice that your end result resembles quite a lot with the IDbSet for repository and the DbContext for the IUnitOfWork?

    If so, don’t you think you are adding just another layer of abstraction to only hide your ORM of choice? I don’t think that was the Repository pattern intention.
    Chances are that if you change your ORM that you’ll need to re-do your queries. Maybe not if you change from relational to relational db, but from relational to non relational databases for sure.

    Like

    1. Everyone seems to like this thought process and they are really attached to it. Abstracting out a layer doesn’t solely have to be because of future replacement of that layer. Abstracting out a layer can be for structure, enhancements, testing, simplifing, and other reasons.

      The design here many can confuse with previous approaches they have seen. This approach is nothing like you’re describing.

      The key aspects are this: The repository is implemented just once as a generic, and exposes IQueryable. I updated to post to even recommend sealing it. Custom queries are just extensions to the IRepository off of IQueryable. An event aggregator is adding any CRUD level operations. Unit of work is a start save and end only item. DI is used to resolve repositories.

      This means custom queries can be applied to anything you can turn into an IQueryable or IEnumerable.AsQueryable() of the same type. I can write an expression tree to convert an xml into an IQueryable, you can probably force it onto something else. Of course paradigm shifting changes would cause havoc anywhere, but this is at least compatible with anything that can support IQueryables.

      Your event aggregator logic can also be applied to a new framework if needed.

      Structuring and centralizing code is the main benifit to me, this of course if you have reusable code that is data layer specific. You can do this different ways with different patterns, and that is fine. I never meant this as a recommendation, but as a thought exercise.

      Also please note, in part two I recommend exposing your underlying ORM if needed. Why not? Cause you have a pattern and there are zero exceptions? That is silly. If you abstract over anything you lose some of its power. If you don’t abstract at all you couple to it. If you abstract everything you’re also coupled. There is some middle ground that should make sense.

      Like

    2. That’s exactly what I was thinking. all that code… oh, if there was something that could already have it… oh, wait.. EntityFramework!

      Like

      1. Then most likely you didn’t understand the article. I made an attempt to clarify things in Part 3, because people are having a hard time separating their past implementations to the outlined one here.

        Like

  2. Great Post in defense of the oft-maligned repository pattern!

    A couple of questions:
    What would the benefit of including the DBContext as a generic type parameter for the EntityFramework implementation as opposed to just having the constructor accept a DBContext base type? Is it to enforce an invariant constraint on the specific DBContext subtype or is there another reason? I ask because I’ve always just gone with a constructor parameter accepting a DBContext?

    What are your thoughts on using the Specification pattern as an abstraction for queryability as opposed to exposing the more EF bound IQueryable interface.

    Like

    1. You know I was debating the DbContext thing. Please note this was just an exercise and not a recommendation. I updated my conclusion in Part 2, if you didn’t see it. I’m upset problems were raised about this pattern and not many people suggested solutions before hopping to the next thing. I truly feel bad for anyone stuck in this paradigm as they are not getting solutions to their problems.

      As far as DbContext, initially 8 years ago when I wrote my 5th repository pattern I put it there, never removed it. The reason then was to ensure via container configuration the right DbContext was paired to the right entity. Back then you had edmxs that would crash if you had too many tables in them, and I feared the need for multiple DbContexts in one solution… Monolithics are bad. I considered removing it this time, but didn’t care to test for side effects. It probably would be better without it now as autoconfig on the container would be easier.

      In the way laid out here, the UnitOfWork is holding onto the DbContext, and is providing a shared instance to the Repositories automatically. When you call UnitOfWorkManager.Start it return the IDisposable UnitOfWork, until that unit of work is disposed of all repositories will ask that one instance for the DbContext. This automatically shares the DbContext between the Repositories. Look at mistake 4, those 2 repositories would share the same context. So what you’d actually have to do is have UnitOfWork take in the DbContext not the Repositories.

      As for the Specification Pattern, I never used it specifically so I wouldn’t be the right person to ask. Right now though, I’m in my head working down a path of IQueryable extension methods.

      Like

  3. Amen, brother! I was just reading a new book on DDD on .NET, and I came across a comment that the Repository pattern is considered an anti-pattern, and I did a double-take, “Say what!”. I have used this pattern since 2013 with excellent results. Not least, as you point out that I can teach a new developer how to be PRODUCTIVE from day one with my architectures using this pattern. Keep up the blogging!

    Like

    1. I’m sorry I was being a bit lazy and didn’t want to focus on DI for this. So, I just said locator to delay that piece. Part 3 might give you a better idea.

      For a framework level item, my initial thoughts was a global services locator instance wouldn’t be horrible if you could have auto config. After thinking about it a bit more I’d might restructure it.

      Like

Leave a reply to brianbu Cancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.