After working once again on a codebase where Entity Framework Core was used through the repository and unit of work patterns, I decided to write an eye-opener post for the next (and maybe even current and previous) generations about what Entity Framework has to offer in the light of these two patterns. In many cases we don’t have to move away from the database context approach but can stick with it even more than we first planned. Here’s how many useless repositories and units of work are born, and here’s also how to avoid them and go with the implementations offered by Entity Framework Core.
Typical repository implementation on Entity Framework Core
It was one of those days when a project with legacy code landed on my table, and sure enough there was a load of repository interfaces and classes. At first sight the code looked all okay – well organized, pretty clean, no mess at all. But the repositories warned me – there are landmines under the carpet.
Let’s take one repository from the code and dissect it a little bit. First let’s see how things were wired up.
Usually there’s a generic interface and an abstract base class for repositories.
public interface IRepository<T> where T : Entity
{
    T Get(Guid id);
    IList<T> List();
    IList<T> List(Expression<Func<T, bool>> expression);
    void Insert(T entity);
    void Update(T entity);
    void Delete(T entity);
}

public abstract class BaseRepository<T> : IRepository<T> where T : Entity
{
    private readonly MyDbContext _dataContext;

    public BaseRepository(MyDbContext dataContext)
    {
        _dataContext = dataContext;
    }

    public void Delete(T entity)
    {
        _dataContext.Set<T>().Remove(entity);
        _dataContext.SaveChanges();
    }

    public T Get(Guid id)
    {
        return _dataContext.Set<T>().Find(id);
    }

    public void Insert(T entity)
    {
        _dataContext.Set<T>().Add(entity);
        _dataContext.SaveChanges();
    }

    public IList<T> List()
    {
        return _dataContext.Set<T>().ToList();
    }

    public IList<T> List(Expression<Func<T, bool>> expression)
    {
        return _dataContext.Set<T>().Where(expression).ToList();
    }

    public void Update(T entity)
    {
        _dataContext.Entry<T>(entity).State = EntityState.Modified;
        _dataContext.SaveChanges();
    }
}
Every repository has its own interface. It’s used with dependency injection so it’s easier to change the implementation if that is (ever) needed.
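public interface IInvoiceRepository : IRepository<Invoice>
{
    List<Invoice> GetOverDueInvoice();
    List<Invoice> GetCreditInvoices();
}

public class InvoiceRepository : BaseRepository<Invoice>, IInvoiceRepository
{
    // The base class has no parameterless constructor, so pass the context on
    public InvoiceRepository(MyDbContext dataContext) : base(dataContext)
    {
    }

    public List<Invoice> GetCreditInvoices()
    {
        // return list of credit invoices
        throw new NotImplementedException();
    }

    public List<Invoice> GetOverDueInvoice()
    {
        // return list of over-due invoices
        throw new NotImplementedException();
    }
}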
Those familiar with the repository pattern will probably find it all okay. Nice and clean generalization, code reuse, dependency injection etc. – just like good code architecture should be.
Finding a landmine under the carpet
Now let’s take a good look at how operations with repositories work. Let’s take the update and delete methods.
public void Delete(T entity)
{
    _dataContext.Set<T>().Remove(entity);
    _dataContext.SaveChanges();
}

public void Update(T entity)
{
    _dataContext.Entry<T>(entity).State = EntityState.Modified;
    _dataContext.SaveChanges();
}
What’s wrong here, and why do I want to whine about this code? Well... the SaveChanges() method sends all pending changes tracked by the context to the database. Not only changes related to the given entity are saved, but also changes to all other entities. If we have a service class that uses this repository and implements a more complex use case where update and delete are called multiple times, we are screwed. Every call saves on its own, so if one update or delete fails there’s no way to roll back the changes already written to the database.
Unit of Work to help?
We can move the SaveChanges() call out to a unit of work container. This way we can make the changes we need and save them when we commit the unit of work.
public class UnitOfWork
{
    private readonly MyDbContext _dataContext;

    public UnitOfWork(MyDbContext dataContext)
    {
        _dataContext = dataContext;
    }

    public void Save()
    {
        _dataContext.SaveChanges();
    }
}
This is how we can use our unit of work in an imaginary service class.
public class InvoiceService
{
    private readonly IInvoiceRepository _invoiceRepository;
    private readonly UnitOfWork _unitOfWork;

    public InvoiceService(IInvoiceRepository invoiceRepository, UnitOfWork unitOfWork)
    {
        _invoiceRepository = invoiceRepository;
        _unitOfWork = unitOfWork;
    }

    public void CreditInvoice(Guid id)
    {
        var invoice = _invoiceRepository.Get(id);

        // create the credit invoice from the original invoice
        // (assumes Invoice has a public parameterless constructor)
        var creditInvoice = new Invoice();

        _invoiceRepository.Insert(creditInvoice);
        _unitOfWork.Save();
    }
}
But still things don’t get better. We now have an additional instance to inject into service classes, and our repositories are just dummy wrappers around the Entity Framework database context.
Disposing custom repositories
After thinking hard about whether there’s any reason to keep the unit of work and repositories in the code, I found that it’s not worth it. The DbContext class provides us with a mix of unit of work and repositories. We don’t need repository classes that are just containers for the DbSet<Something> properties of DbContext. We also don’t need a new class to wrap the call to SaveChanges().
Even better – the custom database context is a class written by us. We extend the DbContext of EF Core, which means we have pretty much a free hand in building up the custom context. It can work as a unit of work too. And if we need, it can work as a one-shot repository too.
public class LasteDbContext : DbContext
{
    public LasteDbContext(DbContextOptions<LasteDbContext> options)
        : base(options)
    {
    }

    public DbSet<Comment> Comments { get; set; }
    public DbSet<Invoice> Invoices { get; set; }
    public DbSet<InvoiceLine> InvoiceLines { get; set; }

    private IDbContextTransaction _transaction;

    public void BeginTransaction()
    {
        _transaction = Database.BeginTransaction();
    }

    public void Commit()
    {
        try
        {
            SaveChanges();
            _transaction.Commit();
        }
        finally
        {
            _transaction.Dispose();
        }
    }

    public void Rollback()
    {
        _transaction.Rollback();
        _transaction.Dispose();
    }
}
One can point out now that we are operating on a concrete type and that’s not good for dependency injection. Okay, let’s apply an interface and call it IDataContext.
public interface IDataContext
{
    DbSet<Comment> Comments { get; set; }
    DbSet<Invoice> Invoices { get; set; }
    DbSet<InvoiceLine> InvoiceLines { get; set; }

    void BeginTransaction();
    void Commit();
    void Rollback();
}
So, no more blasphemies of technical design. We can inject an instance of IDataContext into our service classes, commands or controllers and use just this one class with a simple interface.
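As a rough sketch of how calling code might drive the three transaction methods, here is a small helper that is not part of the original code. ITransactionalContext, TransactionRunner and RecordingContext are hypothetical names: the interface repeats only the transaction members of IDataContext so the example compiles without EF Core, and RecordingContext is an in-memory stand-in that merely records calls.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical slice of IDataContext: only the three transaction methods.
public interface ITransactionalContext
{
    void BeginTransaction();
    void Commit();
    void Rollback();
}

public static class TransactionRunner
{
    // Runs 'work' inside one transaction.
    // If 'work' throws, the transaction is rolled back and the exception rethrown.
    // Commit() is called outside the try block on purpose: the Commit() shown in
    // the article already disposes the transaction in its finally block, so this
    // sketch does not call Rollback() after a failed commit.
    public static void Run(ITransactionalContext context, Action work)
    {
        context.BeginTransaction();
        try
        {
            work();
        }
        catch
        {
            context.Rollback();
            throw;
        }
        context.Commit();
    }
}

// In-memory stand-in (an assumption for illustration) that records the calls it receives.
public sealed class RecordingContext : ITransactionalContext
{
    public List<string> Calls { get; } = new List<string>();

    public void BeginTransaction() => Calls.Add("begin");
    public void Commit() => Calls.Add("commit");
    public void Rollback() => Calls.Add("rollback");
}
```

With a real context, the work delegate would add, change and remove entities through the DbSet properties, and nothing would be committed unless the whole delegate completed.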
It’s not hard to use IDataContext with other object-relational mappers like NHibernate. My blog post NHibernate on ASP.NET Core demonstrates how to imitate DbContext-like mapper interface for NHibernate.
But what about custom queries?
One benefit of repository classes was keeping custom querying methods there. It was an ideal place for them, as those methods were implemented close to where data is handled. Not a perfect approach if there are many queries, but still something.
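public class InvoiceRepository : BaseRepository<Invoice>, IInvoiceRepository
{
    // The base class has no parameterless constructor, so pass the context on
    public InvoiceRepository(MyDbContext dataContext) : base(dataContext)
    {
    }

    public List<Invoice> GetCreditInvoices()
    {
        // return list of credit invoices
        throw new NotImplementedException();
    }

    public List<Invoice> GetOverDueInvoice()
    {
        // return list of over-due invoices
        throw new NotImplementedException();
    }
}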
Adding such methods to the custom database context will make it grow huge. So huge that we need some better solution. We have some options to consider:
- Query classes – classes where stored queries are implemented. The downside is the same as with repositories – the more querying methods you have, the bigger the class grows. At some point you still need a better solution. Of course, in C# we can always go with partial classes and group queries into separate files.
- Query object – a pattern introduced by Martin Fowler in his famous Patterns of Enterprise Application Architecture book. A query object is an object hierarchy for building a query that is transformed to SQL (or whatever the back-end data store uses). It works if we don’t have too many conditions to consider.
- Extension methods – it’s also possible to have a static class per entity type where queries are defined as extension methods to their corresponding DbSet. It can also be something general like my Entity Framework paging example.
- Applying the specification pattern – I have seen some solutions where there were specifications for including child objects and collections, and specifications for where and order-by clauses. It’s not a good solution in the long run, as queries tend to grow over time and we easily end up in specification hell.
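To make the extension method option a bit more concrete, here is a minimal sketch. The Invoice class below is a hypothetical, trimmed-down stand-in (the article does not show the real entity), and the rule that a credit invoice has a negative total is only an assumption for illustration. The methods extend IQueryable<Invoice>, which DbSet<Invoice> implements, so the same code also runs against a plain in-memory list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, trimmed-down entity; the real Invoice class is not shown in the article.
public class Invoice
{
    public Guid Id { get; set; }
    public DateTime DueDate { get; set; }
    public bool IsPaid { get; set; }
    public decimal Total { get; set; }
}

public static class InvoiceQueries
{
    // Unpaid invoices whose due date is earlier than 'today'.
    public static IQueryable<Invoice> OverDue(this IQueryable<Invoice> invoices, DateTime today)
    {
        return invoices.Where(i => !i.IsPaid && i.DueDate < today);
    }

    // Assumption for this sketch: a credit invoice is one with a negative total.
    public static IQueryable<Invoice> Credit(this IQueryable<Invoice> invoices)
    {
        return invoices.Where(i => i.Total < 0);
    }
}
```

Because each method returns IQueryable<Invoice>, calls can be chained, for example context.Invoices.Credit().OverDue(DateTime.Today).ToList(), and the query is only executed when it is enumerated.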
In practice we are often in a race with growing complexity in the database, growing amounts of data and growing database hosting costs. Sooner or later we want to optimize queries to ask only for the data we need. It means our LINQ queries get more complex because of calls to the Include() method. Sometimes we need change tracking on an EF Core query and sometimes we don’t.
I have seen it happening. In one project, after all the small optimizations and tweaks to LINQ queries, we had some considerable wins in database performance but we ended up with lengthy queries. Most queries were long and ugly enough that we didn’t want to pile them together in one class. As most of the queries were model specific, we added factory methods to models. For some models we created a factory class so as not to bloat the model itself with details about how to build it. In both cases the queries were hosted directly in the factories because there was no point in adding an additional layer to host it all. We also had shared queries implemented as extension methods in the data access layer. But we had not a single custom repository or unit of work class. Still we managed to have a nice and understandable code structure.
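The project code is not shown here, so the following is only a guess at what a model with a factory method could look like. All types (Invoice, InvoiceLine, InvoiceSummaryModel) and their properties are hypothetical. The point of the sketch is that the query building the model lives next to the model and takes an IQueryable<Invoice>, which could be a DbSet<Invoice> or an in-memory list.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entities, reduced to what the sketch needs.
public class InvoiceLine
{
    public decimal Amount { get; set; }
}

public class Invoice
{
    public Guid Id { get; set; }
    public string Number { get; set; }
    public List<InvoiceLine> Lines { get; set; } = new List<InvoiceLine>();
}

// Hypothetical model that carries only the data one use case needs.
public class InvoiceSummaryModel
{
    public Guid Id { get; set; }
    public string Number { get; set; }
    public decimal Total { get; set; }

    // Factory method: the query that builds the model is hosted in the model itself.
    // Returns null when no invoice with the given id exists.
    public static InvoiceSummaryModel Create(IQueryable<Invoice> invoices, Guid id)
    {
        return invoices
            .Where(i => i.Id == id)
            .Select(i => new InvoiceSummaryModel
            {
                Id = i.Id,
                Number = i.Number,
                Total = i.Lines.Sum(l => l.Amount)
            })
            .SingleOrDefault();
    }
}
```

Projecting straight into the model asks only for the columns the model uses. When the query grows too long for the model class, the same method can move to a separate factory class, as described above.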
Wrapping up
It’s tempting to create unit of work and repository interfaces and classes for Entity Framework Core because everybody is doing so. Still, we have database specific implementations of both available in the DbContext class.
- Unit of Work – use database transactions so that queries don’t accidentally modify data before the transaction is committed, and be careful with the SaveChanges() method of the database context.
- CRUD – use the DbSet<Something> properties of the database context. These sets have all the methods needed to modify and query the data in the database.
- Stored queries – general ones can be part of the database context and specialized ones can live in query classes, factories or extension methods.
There’s no actual need to implement your own unit of work and repositories if they just wrap DbContext functionality without adding any new value. Simple idea – use what you already have.
Comments
There are some counterpoints to consider: https://brianbu.com/2019/09/25/the-repository-pattern-isnt-an-anti-pattern-youre-just-doing-it-wrong/
Thanks for reference, George.
What I see there is a classic repository implementation that works as a wrapper around the EF Core database context. The unit of work is a wrapper too, and it doesn't add any value besides wrapping one method call in a class. It doesn't seem to use database transactions, which is weird for a system of this size. It's even harder to understand why they built a unit of work pool. There's no need for it. Any DI/IoC container can do it.
All in all - I still don't see why custom implementations of repository and unit of work are necessary.
I didn't implement the repository pattern in my project and I regret it, because now I want to migrate to Dapper and it would be much easier if I had implemented it before.
I like to wrap the DbContext so I only expose the methods I'm interested in. I would then add a member to IUnitOfWork, something like IRepository<T> Repository<T>() where T : Entity. This means you don't have to inject every repository in the constructor.
The only solution is CQRS with E/S. Don't use ORM for DDD but only for the read part. While people may argue CQRS/ES is necessary only for the complex scenarios it is actually the only way to tackle problems like above. Then on the read side, you may use IQueryables directly.
It still doesn't give a final answer on the read side. What if we have to support a mapper that doesn't have LINQ support? In this case we still need some classes where we can put queries. Just take the example by Aron above.
Then you simply don't use LINQ. On the read side you can use whatever you want, just keep it restricted to read-only.
A typical registration process would look like:
//Write side
public interface IAuthenticationService
{
Task Register(string userName, string password);
}
//ReadSide
public interface IReadOnlyUserRepo
{
Task CheckIfUserExists(string userName);
}
You first call CheckIfUserExists and then call the registration. On the off chance that the same user name conflicts (a one-in-a-billion chance), you apologize and undo. It is also possible to make this 100% non-conflicting by creating an Aggregate which keeps the books on user names.
BTW above Tasks should be Task of bool, but your page clears generics for XSS protection.
Sorry for angle brackets mess. There's one security plugin that prevents tons of bad things to happen but it has some side effects I have to take care of.
So there’s already a comment here that illustrates this, but I don’t think this really covers the actual reason for another abstraction between EF and your application code - decoupling the two. For a lot of projects, it really might not make sense to bother with the additional complexity of wrapping EF with your own repository and unit of work. But if you might want/need to move away from EF some day, that extra layer ensures your application code isn’t even aware of the underlying DbContext and DbSets.
Well, if you plan to replace ORM right from start or if you want to support different ORM-s right from start then you need custom unit of work and repositories. I have found that replacing of ORM is not anything common if it doesn't happen in very early stages of development. The later you want to replace ORM the more work you have. Teams usually first try to get most out of current ORM before changing over to another ORM. Been there, done that.
I don't have a lot of experience, but I have already seen a lot of people replacing EF with EF Core. OK, I did it myself and it wasn't that complicated, since Microsoft did its best to facilitate this migration. The point is that you shouldn't be so coupled to and dependent on a specific framework if you want to keep maintaining and updating your software for a long time.
"Stored queries – general ones can be part of database context and specialized ones can live in query classes, factories or extension methods"
Do you have a code example showing how to create query classes and factories, please? I'm just starting a project and like your approach, but I have some very well optimised queries which I need to run with raw SQL. I would usually use a repo.
Here's my experiment to get queries injected into DbContext: https://gunnarpeipman.com/ef-core-dbcontext-repository/
Hi Mister. https://gunnarpeipman.com/ef-core-repository-unit-of-work/ please github link send me.))
For this post I don't have samples at Github.
This article, like so many on this exact same topic, uses examples that perfectly match the problem it is trying to solve. Yes, in the very specific example you give, repos are redundant. However, there is a huge set of other problems in real world apps that prove that (properly done) repos are a good idea. But first note that the "use query pattern", "use CQS", etc. arguments, while generally correct, are simply trying to justify the answer. We could just as easily say "don't use an ORM" and win the same argument. Therefore these "use something else" answers aren't actual answers.
Also as an aside, a generic repo is bad. It serves no purpose over an ORM. However this is an implementation detail and needs to be excluded from any architectural discussions of repos. Within a formal repo you would use the ORM directly.
Let's start with the easy one. Your persistence model doesn't match your domain model. Yes many ORMs support transformations on the persistence data but not necessarily enough to resolve this in the ORM directly. As an example you have a lookup table that is normalized beyond belief such that string columns are normalized into other tables (e.g. Category is in CategoryLookup, Topic is in TopicLookup, etc). In the domain model these are just strings but not in the persistence model. So you need to normalize. That would be difficult to do in a DbSet directly. But since the repo is part of the domain layer it has to be. It doesn't make sense to expose the persistence model from the repo and then normalize it while inside a domain service.
Another case is for things that DbSet cannot do like execute sprocs (even in EF Core). You can execute a sproc on the context but you cannot get from the DbSet to the DbContext directly (especially in older EF versions). So then you end up with entity-specific objects on your context. That breaks using the repo as the parameter to services. You could argue an extension method would work but again you cannot get from DbSet to DbContext easily, if at all.
And yet another case is limiting the operations. With a DbSet you can CRUD anything from any domain service. What if you have read only tables that should never be written to in an app? You cannot prevent it via DbSet. Maybe a table supports CRU but not D, cannot stop that either. Therefore the rules have to be baked into each service. To be fair using repos can have this same issue but repos are closer to the DB wire so having a better understanding of the DB model makes sense there.
Repos shine when you want to provide focused access to the entities in the persistence layer. Using a repo interface always makes sense to me even if the actual implementation is simply a pre-built wrapper around the ORM. And once you start using repos then you need to be consistent otherwise some code will use DbContext/DbSet directly while others the repo. This is bad all around.
So ultimately, like all the other articles I see on this topic, the author is complaining about implementation details rather than the pattern itself but the "don't use a repo" argument is simply opening the door for misuse and, even worse, harder code to maintain. Ultimately I think the query objects with perhaps CQS is the best solution but for existing code that is hard to switch to.
To be fair, I would rather avoid exposing DbSet from my IdbContext abstraction. Wrapping EF code in my own UoW still looks like the way to go. Moreover, when properly applying DDD and CQRS, you don't absolutely need a generic repository. Most of the times you end up writing only a FindById() and a Write() method. Maybe you might add few others, but still, those are very tied to the domain.
Still, very good article, thanks for sharing!
I also don't like redundant layers, but repository is a good place for caching
Don't throw the baby out with the bathwater. I agree with UOW being redundant with EF, since your UOW class essentially becomes a wrapper for the DbContext, but not so much repositories and services. Your alternatives to repositories are not ideal either. Repositories are clean, efficient and consistent for creating your data layer. All they need to work with is your bare DbContext, and they allow for a nice layering with your business logic (services).
For example extension methods pollute the de-coupling between your UI and other layers and it may not be easy to switch out your data layer in the future without having to do a bunch of refactoring.
First you need to know what the responsibility of the repository pattern is. Its responsibility is to reconstitute a whole tree of entities (not a single entity). It means it doesn't return just an Invoice entity; it returns the Invoice entity with its Invoice Item entities and the other entities related to the Invoice. A tree of entities makes sense from the business point of view, because why would you want to use an Invoice entity without its Items?
A generic repository or DbContext extension methods create the Table Gateway pattern. The Table Gateway pattern and the Repository pattern have different uses. Table Gateway is useful for applications with little or no business logic. It goes together with the Table Module pattern in the business layer. The Repository pattern is useful for applications with very complex business logic, because it supports a domain model that can handle such logic. Using Table Gateway with a domain model is very hard (not impossible) and leads to a number of bugs.
There is a second reason to use the repository pattern. Table Gateway organizes CRUD operations. Typical methods for this pattern are Create, Insert, Update, Delete and Get. When you need to create a whole tree (e.g. an Invoice with Addresses) you need two Table Gateways, Invoice and Address. Composing the two entities (Invoice and Address) then happens not in the data layer but in the application or business layer. The problem with this approach is that composing may not be easy, so you end up returning IQueryable from the Table Gateway. Now the data layer is no longer responsible for returning the entity, because the more complex query is composed in another layer.
I work a lot with stored procedures. I see complex select operations in stored procedures. The problem with this approach is that you cannot tell what data is returned by the select operation. Similarly, when you compose a complex query with a huge number of entities or conditions outside the data layer, you cannot tell what data is returned, and the query is not reusable in another use case.
Unlike a table gateway, a repository can work with complex queries because the operation is enclosed in the data layer: the repository returns a tree instead of a single entity. This tree has a name thanks to the repository method: GetNotApprovedInvoices, GetApprovedInvoices, GetCancelledInvoices, ... These names come from the business description. This naming is known as Supple Design, and it removes a "layer" between the programmer and the business owner: because there are business terms in the code, the programmer doesn't have to translate business terms into programming language and back (when there is a bug or reverse engineering to do).
In conclusion, use the table gateway pattern (also known as a generic repository, or implemented as extension methods of DbSet or DbContext) when your application has little business logic. When your logic is more complex, use the repository pattern, which can work with an entity tree. Use these two patterns with the right patterns in the business layer. In a large application you can split the application into a query part and a command part (CQS/CQRS), where each part has its own model and mapping profile.
Thanks for this helpful content