Optimizing Entity Framework Using View-Backed Entities

I was profiling a Web application built on Entity Framework 6 and MVC 5, using the excellent Glimpse. I found that a page with three lists of five entities each was causing over a hundred query executions, eventually loading a huge object graph with hundreds of entities. I could eliminate the round trips using Include(), but that still left me loading way too much data when all I needed was aggregate/summary data.

The problem was that the aggregates I needed were complex and involved calculated properties, some of which were based on aggregates of navigation collection properties: a parent had sums of its children’s properties, which in turn had sums of their children’s properties, and in some cases parents had properties that were calculated partly based on aggregates of children’s properties. You can see how this quickly spun out of control.

My requirements were that the solution had to perform better while returning the same data, and still let me use standard Entity Framework, code first, with migrations. My solution was to calculate this data on the database side, using entities backed by views that did the joining, grouping, and aggregation. I also found a neat trick for backward-compatible view releases:

IF NOT EXISTS (SELECT Table_Name FROM INFORMATION_SCHEMA.VIEWS WHERE Table_Name = 'MyView')
EXEC sp_executesql N'create view [dbo].[MyView] as select test = 1'
GO
ALTER VIEW [dbo].[MyView] AS
SELECT ...

It’s effectively upsert for views – it’s safe to run whether or not the view already exists, doesn’t ever drop the view if it does exist (leaving no period where a missing view might cause an error), and it doesn’t require keeping separate create and alter scripts in sync when changes are made.
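Because the script is idempotent, it drops straight into a code-first migration. Here’s a minimal sketch of how that might look (the migration and view names are placeholders); note that GO is a client-side batch separator rather than T-SQL, so each batch gets its own Sql() call:

using System.Data.Entity.Migrations;

public partial class AddMyView : DbMigration
{
    public override void Up()
    {
        // Batch 1: create a stub view only if it doesn't already exist.
        Sql(@"IF NOT EXISTS (SELECT Table_Name FROM INFORMATION_SCHEMA.VIEWS WHERE Table_Name = 'MyView')
              EXEC sp_executesql N'create view [dbo].[MyView] as select test = 1'");

        // Batch 2: alter the view (now guaranteed to exist) to its real definition.
        Sql(@"ALTER VIEW [dbo].[MyView] AS
              SELECT ... -- real view body goes here");
    }

    public override void Down()
    {
        Sql("DROP VIEW [dbo].[MyView]");
    }
}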

I then created the entities that would represent the views, using unit tests to ensure that the properties now calculated on the server matched the expected values that the original, app-calculated properties had produced. Creating entities backed by views is fairly straightforward; they behave just like tables, but obviously can’t be modified – I made the property setters protected to enforce this at compile time. Because the view includes a row for every “real” entity, any query against the entity type can be cast to the view-backed type and it will pull full statistics (there is no possibility of an entity existing in the base table but not in the view).
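As a rough sketch (the names here are illustrative, not from the real app), a view-backed entity is just a POCO whose setters are protected so application code can’t assign to them:

// Hypothetical view-backed entity: EF materializes it like any table-backed type,
// but the protected setters keep application code from modifying it.
public class ProjectStatistics
{
    public int Id { get; protected set; }

    // Aggregates the view computes from child and grandchild rows.
    public decimal TotalHours { get; protected set; }
    public decimal TotalCost { get; protected set; }
    public int TaskCount { get; protected set; }
}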

Next I had to create a one-to-one association between the now-bare entity type and the view type holding the aggregate statistics. The only ID I had for the view was the ID of the raw entity it was connected to. This turned out to be easier said than done – Entity Framework expects that, in a one-to-one relationship, it will be managing the ID at one end of the relationship; in my case, the IDs at both ends were DB-generated, even though they were guaranteed to match (since the ID in the view was pulled directly from the ID in the entity table).
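For context, the kind of shared-primary-key mapping I was attempting looked roughly like this (illustrative names again, inside the DbContext), and it’s exactly where EF’s assumption about owning the dependent’s key gets in the way:

// Sketch of the abandoned one-to-one mapping attempt; Project.Statistics is the
// navigation property to the view-backed type.
protected override void OnModelCreating(DbModelBuilder modelBuilder)
{
    modelBuilder.Entity<Project>()
        .HasRequired(p => p.Statistics)   // every Project has a matching row in the view
        .WithRequiredPrincipal();         // Project is the principal; EF wants to control the
                                          // dependent's key, but the view's Id is DB-generated too
}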

I ended up abandoning the one-to-one mapping idea after a couple of days’ struggle, instead opting to map the statistics objects as subclasses of the real types in a table-per-type structure. This wound up being relatively easy to accomplish – I added a Table attribute to the subtype, giving the name of the view, and it was off to the races. I then went through the LINQ queries, views, and unit tests, updating references to the statistics. The unit and integration tests proved very helpful in validating the output of the views and gave me confidence in the changes.
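Boiled down, the table-per-type arrangement looks something like this (names are still illustrative; the view just has to expose the base table’s key plus the aggregate columns):

using System.ComponentModel.DataAnnotations.Schema;
using System.Data.Entity;
using System.Linq;

[Table("Projects")]
public class Project
{
    public int Id { get; protected set; }
    public string Name { get; protected set; }
}

// Table-per-type: the subclass maps to the view, which returns Id plus the aggregates.
[Table("ProjectStatisticsView")]
public class ProjectStatistics : Project
{
    public decimal TotalHours { get; protected set; }
    public int TaskCount { get; protected set; }
}

public class StatsContext : DbContext
{
    public DbSet<Project> Projects { get; set; }
}

// Usage: OfType<>() re-targets an ordinary query at the view-backed subtype,
// joining the base table to the view and pulling the full statistics.
// var stats = context.Projects.OfType<ProjectStatistics>()
//                             .Where(p => p.TaskCount > 0)
//                             .ToList();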

I then ran my benchmarks again and found that pages that had required over a hundred queries to generate now used only ten to twenty, and rendered in half to a third of the time – a 100 to 200 percent improvement, using views designed purely to mimic the existing functionality. I hadn’t even begun optimizing them for performance yet!

After benchmarking, it looks even better (times are in milliseconds, min/avg/max):
Page                                          EF + LINQ      EF + Views
3 lists of 5 entities (3 types)               360/785/1675   60/105/675
2 lists of 6 entities (1 type)                325/790/1935   90/140/740
1 entity’s details + 1 list of 50 entities    465/975/2685   90/140/650

These tests were conducted by running Apache JMeter on my own machine against the application running on Windows Azure, across a sampling of 500 requests per page per run. That’s a phenomenal 450 to 650 percent improvement across the board on the most intensive pages in the application, and has them all responding to 100% of requests in under 1 second. The performance gap will only widen as data sets grow; using views will make the scaling much more linear.

I’m very pleased with the performance improvement I’ve gotten. Calculating fields on the app side works for prototyping, but it just can’t meet the efficiency requirements of a production application. View-backed entities came to the rescue in a big way. Give it a try!

You’re Being Held Hostage and You May Not Even Know It

To me, net neutrality isn’t about fair business practices between businesses. That’s certainly part of it, but it’s not the crux of the issue. To me, net neutrality is about consumer protection.

Your broadband provider would like to charge companies – particularly content companies – extra in order to bring you their content. Setting aside the utterly delirious reasoning behind this for the moment, let’s think about this from the consumer’s perspective. You’re paying your ISP to provide you access to the internet – the whole thing. When you sign up for service, you’re signing up for just that: access to the internet. Period. What your ISP fails to disclose, at least in any useful detail, is how they intend to shape that access.

For your $40, $50, $60 or more each month, you might get high-speed access to some things, and not to others. You don’t get to know which is which ahead of time, or even after you sign up – the last thing your ISP wants is for you to be well-informed about your purchase in this regard. They’ll do whatever they can to convince you that your service is plain, simple, high-speed access to the whole internet.

Then, in negotiations behind closed doors, they’re using you as a hostage to extort money from the businesses you’re a customer of. Take Netflix as an example: you pay your ISP for internet service. Netflix also has an ISP, or several, that they pay for internet service. Those ISPs have what are called “peering arrangements” that determine who, if anyone, pays, and how much, when traffic travels between their networks on behalf of their customers. This is part and parcel of what you and Netflix pay for your service. You pay Netflix a monthly fee to receive Netflix service, which you access using your ISP. Netflix uses some part of that monthly fee to pay for their own internet service.

Your ISP has gone to Netflix and said “hey, if you want to deliver high-definition video to your customers who are also my customers, you have to pay me extra; otherwise my customers, who are also your customers, will receive a sub-par experience, and they might cancel their Netflix account.” They’re using you as a bargaining chip, without your knowledge or consent, in order to demand money they never earned to begin with; everyone involved is already paying their fair share for their connection to the global network, and for the interconnections between parts of that global network.

To me, when a company I do business with uses me, and degrades my experience of their product, without my knowledge or consent, that’s fraud from a consumer standpoint. Whatever Netflix might think about the deal, whether Netflix is right or wrong in the matter, doesn’t enter into it; I’m paying for broadband so that I can watch Netflix movies, I’m paying for Netflix so that I can watch movies over my broadband connection, and my ISP is going behind my back and threatening to make my experience worse if Netflix doesn’t do what they want. Nobody asked me how I feel about it.

Of course, they could give full disclosure to their customers (though they never would), and it wouldn’t matter a whole lot, because your options as a broadband consumer are extremely limited; in the majority of cases, the only viable option is cable, and where there is competition, it comes from exactly one place: the phone company. The cable companies and phone companies are alike in their use of their customers as hostages in negotiations.

What about fiber broadband? It’s a red herring – it’s provided by the phone company anyway. Calling fiber “competition” is like saying Coke in cans competes with Coke in bottles – it’s all Coke, and whichever one you buy, your money goes into Coke’s pocket.

What about wireless? Wireless will never, ever be able to compete with wired service, due to simple physics. The bandwidth just isn’t there, the spectrum isn’t there, there’s noise to contend with, and usage caps make wireless broadband a non-starter for many cases, especially streaming HD video. Besides, the majority of truly high-speed wireless service is provided by the phone companies anyway; see the previous paragraph.

Why aren’t they regulated? The FCC is trying, in its own way, but there’s little traction; the cable and telephone companies have the government in their collective pockets with millions of dollars of lobbying money, and We The People haven’t convinced Our Government that we care enough for them to even consider turning down that money.

In the United States, we pay many, many times what people pay in much of the developed world, and we get many, many times less for what we spend. On top of that, our ISPs are using us as bargaining chips, threatening to make our already overpriced, underpowered service even worse if the companies we actually chose in a competitive market – unlike our ISPs – don’t pay up. This is absolutely preposterous, it’s bordering on consumer fraud, and you should be angry about it. You should be angry enough to write your congressman, your senator, the president, the FCC, and your ISP (not that the last will do you much good, but it can’t hurt).

Some excellent places to find more information: