Thursday, July 24, 2014

Escaping '&' (ampersand) in razor view engine


Recently I ran into a really annoying problem with the ASP.NET Razor view engine -
I was generating some URLs on the server side, and trying to print them inside HTML tag attributes like 'href' or 'src'.

The problem was that all the ampersands ('&') were being encoded to '&amp;'.
First thing I tried to do was print it out using the Html 'Raw' helper method, like this :
<a href="@Html.Raw(myUrl)">Some Link</a>


This didn't work... :/
The weird thing about this was that when I searched the internet and found questions on stackoverflow, some people wrote that Html.Raw() worked for them and some said it didn't.

After a little more research (mostly based on some trial & error), I realized that Razor will always encode strings inserted into attribute values. This is done for security reasons. The proper workaround is to simply put the whole tag inside the 'Raw()' method, like this:
@Html.Raw("<a href='" + myUrl + "'>Some Link</a>")


This basically tells razor - "I know what I'm doing, just let me do it my way!" :)

Sunday, July 13, 2014

Saving prices as decimal in mongodb


When working with prices in C#, you should always work with the 'decimal' type.
Working with the 'double' type can lead to a variety of rounding errors when doing calculations, since it's a binary floating-point type intended more for mathematical/scientific work.

(I don't want to go into details about what problems this can cause exactly, but you can read more about it here :
http://stackoverflow.com/questions/2129804/rounding-double-values-in-c-sharp
http://stackoverflow.com/questions/15330988/double-vs-decimal-rounding-in-c-sharp
http://stackoverflow.com/questions/693372/what-is-the-best-data-type-to-use-for-money-in-c
http://pagehalffull.wordpress.com/2012/10/30/rounding-doubles-in-c/ )

I am currently working on a project that involves commerce and prices, so naturally I used 'decimal' for all price types.
Then I headed to my db, which in my case is MongoDB, and the problem arose.
MongoDB doesn't support 'decimal'!! It only supports the 'double' type.

Since I'd rather avoid saving it as a double for the reasons stated above, I had to think of a better solution.
I decided to save all the prices in the db as Int32, storing the prices in 'cents'.

This means I just need to multiply the values by 100 when inserting into the db, and divide by 100 when retrieving. As long as prices only go down to whole cents, this should never cause any rounding problems, and it's pretty much straightforward. I don't even need to worry about sorting, or any other query for that matter.

But... I don't want ugly code doing all these conversions from cents to dollars in every place...

I'm using the official C# MongoDB driver (https://github.com/mongodb/mongo-csharp-driver), which gives me the ability to write a custom serializer for a specific field.
This is a great solution, since it's the lowest level part of the code that deals with the db, and that means all my entities will be using 'decimal' everywhere.

This is the code for the serializer :
using System;
using MongoDB.Bson.IO;
using MongoDB.Bson.Serialization;
using MongoDB.Bson.Serialization.Options;

// Serializes a 'decimal' price as an Int32 amount of cents in the db
// (written against the 1.x "legacy" C# driver API)
public class MongoDbMoneyFieldSerializer : IBsonSerializer
{
    public object Deserialize(BsonReader bsonReader, Type nominalType, IBsonSerializationOptions options)
    {
        // read the cents from the db and convert back to a decimal amount
        var dbData = bsonReader.ReadInt32();
        return (decimal)dbData / (decimal)100;
    }

    public object Deserialize(BsonReader bsonReader, Type nominalType, Type actualType, IBsonSerializationOptions options)
    {
        var dbData = bsonReader.ReadInt32();
        return (decimal)dbData / (decimal)100;
    }

    public IBsonSerializationOptions GetDefaultSerializationOptions()
    {
        return new DocumentSerializationOptions();
    }

    public void Serialize(BsonWriter bsonWriter, Type nominalType, object value, IBsonSerializationOptions options)
    {
        // convert the decimal amount to whole cents and write it as an Int32
        var realValue = (decimal)value;
        bsonWriter.WriteInt32(Convert.ToInt32(realValue * 100));
    }
}


And then all you need to do is add the custom serializer to the fields which are prices, like this:
public class Product
{
    public string Title{ get; set; }
    public string Description { get; set; }

    [BsonSerializer(typeof(MongoDbMoneyFieldSerializer))]
    public decimal Price { get; set; }

    [BsonSerializer(typeof(MongoDbMoneyFieldSerializer))]
    public decimal MemberPrice { get; set; }

    public int Quantity { get; set; }
}

That's all there is to it.

Monday, June 23, 2014

Drastically improving 'First Byte' and 'Page Load' (for SEO)


Improving your 'first byte' speed, and your 'page load' time in general, can be crucial for SEO. Google likes pages that render faster to the user, and, in some cases, will prioritize them higher than other pages in search results.

If you're not familiar with this, then here are some articles on the subject :
http://googlewebmastercentral.blogspot.co.il/2010/04/using-site-speed-in-web-search-ranking.html
http://blog.kissmetrics.com/speed-is-a-killer/
http://www.quicksprout.com/2012/12/10/how-load-time-affects-google-rankings/

Improving your site's performance can be a daunting task. There are probably many easy wins that will improve the speed by a little, but you will quickly realize that better results take much longer to achieve. Some improvements can take days, weeks and even months of infrastructure changes.

But why should your SEO suffer from this ?? Why not be a step ahead of Google ??
Your site doesn't really need to be fast for you to get good SEO scores, you just need Google to think your site is fast!

But how do you do that ?
Google will scan your site once every few days/weeks and cache the results for indexing. So let's beat Google at its own game.
Why don't we crawl our site first, cache the results (even to plain text files), and when Google comes around, just serve it the static pages we cached without any server-side calculations.

You can easily build a crawler using Selenium, PhantomJS, ZombieJS or pure Node.js. You don't even need to implement all the logic of a regular crawler, since you already know your site's URL structure.

For a real world example :
If your site is a big commerce site, then you know the structure of all your product pages. They're probably something like this :
http://www.YourCommerceSite.com/product/Product-Name/:Product-ID:

You can invoke this endpoint for each of the different product IDs in your db.
Then you can save them all to text files named like this :
Product_:Product-ID:.txt
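Here's a minimal sketch of what such a crawler could look like in Node.js. The URL pattern is the one from above, the product IDs are hard-coded just for the example (in practice they would come from your db), and the 'cache' folder is assumed to exist:

// Minimal crawl-and-cache sketch (Node.js)
var http = require('http');
var fs = require('fs');

var productIds = [101, 102, 103]; // in practice, read these from your db

var cacheProductPage = function(productId) {
    var url = 'http://www.YourCommerceSite.com/product/Product-Name/' + productId;
    http.get(url, function(res) {
        res.setEncoding('utf8');
        var html = '';
        res.on('data', function(chunk) { html += chunk; });
        res.on('end', function() {
            // one cached file per product, e.g. Product_101.txt
            fs.writeFile('cache/Product_' + productId + '.txt', html, function(err) {
                if (err) console.error('failed caching product ' + productId, err);
            });
        });
    });
};

productIds.forEach(cacheProductPage);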

When Googlebot comes around (which you can easily detect by its 'User-Agent' header) and requests a product page, quickly serve it the cached product page you stored on disk.
This might be stale by a few hours/days (depending on how frequently you decide to scan), but it will still be good enough for indexing in Google (since Google's indexing isn't realtime anyway) and should be super fast!
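The serving side is basically just a user-agent check. Here's a rough sketch of it as an Express handler - Express and the 'cache' folder layout are just assumptions for the example, the same check works in any server stack:

var express = require('express');
var fs = require('fs');
var app = express();

app.get('/product/:name/:id', function(req, res, next) {
    var userAgent = req.headers['user-agent'] || '';
    var cachedFile = 'cache/Product_' + req.params.id + '.txt';

    // Googlebot identifies itself in the 'User-Agent' header
    if (userAgent.indexOf('Googlebot') !== -1 && fs.existsSync(cachedFile)) {
        // serve the pre-rendered page straight from disk - no server calculations
        res.send(fs.readFileSync(cachedFile, 'utf8'));
        return;
    }

    next(); // everyone else gets the regular, dynamically rendered page
});

app.listen(80);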

Saturday, April 12, 2014

Debugging and solving the 'Forced Synchronous Layout' problem


If you're using the Chrome Developer Tools to profile your website's performance, you might have noticed that Chrome warns you about doing 'forced layouts'.
This looks something like this :
In this screenshot, I marked all the warning signs Chrome tries to give you so you can spot this problem.

So, what does this mean ?
When the browser constructs a model of the page in memory, it builds 2 trees. One is the DOM structure itself, and the other (often called the render tree) represents the way the elements should be rendered on the screen.
This tree needs to always stay updated, so when you change an element's CSS properties, for example, the browser might need to update these trees in memory to make sure that the next time you request a CSS property, the browser will have up-to-date information.

Why should you care about this ?
Updating both these trees in memory may take some time. Although they are in memory, most pages these days have quite a big DOM, so the trees will be pretty big. It also depends on which element you change, since changing different elements might mean updating only part of the tree, or the whole tree, in different cases.

Can we avoid this ?
The browser can realize that you're trying to update many elements at once, and will optimize itself so that a whole tree update won't happen after each update, but only when the browser knows it needs relevant data. In order for this to work correctly, we need to help it out a little.
A very simple example of this scenario might be setting and then reading the width of 2 different elements, one after the other, like so :
var a = document.getElementById('element-a');
var b = document.getElementById('element-b');

a.style.width = '100px';
var aWidth = a.clientWidth;

b.style.width = '200px';
var bWidth = b.clientWidth;

In this simple example, the browser will update the whole layout twice. This is because after setting the first element's width, you are asking to retrieve an element's width. When retrieving the CSS property, the browser knows it needs updated data, so it goes and updates the whole DOM tree in memory. Only then will it continue to the next line, which will soon after cause another update for the same reasons.

This can simply be fixed by changing around the order of the code, like so :
var a = document.getElementById('element-a');
var b = document.getElementById('element-b');

a.style.width = '100px';
b.style.width = '200px';

var aWidth = a.clientWidth;
var bWidth = b.clientWidth;

Now, the browser will set both widths one after the other without updating the tree. Only when asking for the width on the 7th line will it update the DOM tree in memory, and it will keep it updated for line number 8 as well. We easily saved one update.


Is this a 'real' problem ?
There are a few blogs out there talking about this problem, and they all seem to use textbook examples of it. When I first read about this, I too thought it was a little far-fetched and not really practical.
Recently though, I actually ran into this on a site I'm working on...

Looking at the profiling timeline, I recognized the same pattern (a bunch of rows alternating between 'Layout' and 'Recalculate Style').
Clicking on the marker showed that this was actually taking around 300ms.

I could see that the evaluation of the script was taking ~70ms, which I could handle, but over 200ms was being wasted on... what?!

Luckily, when clicking on the script in that dialog, it displays a JS stacktrace of the problematic call. This was really helpful, and directed me exactly to the spot.

It turned out I had a piece of code that was looping over a list of elements, checking each element's height, and setting the container's height according to the aggregated height. The height was being read and written in each loop iteration, causing a performance hit.

The problematic code looked something like this :
var appendItemToContainer = function(item) {
   // reads the container's current height and immediately writes a new one -
   // a forced layout on every call
   container.style.height = (container.clientHeight + item.clientHeight) + 'px';
};

for (var i = 0; i < containerItems.length; i++) {
   var item = containerItems[i];
   appendItemToContainer(item);
}

You can see that the 'for' loop has a call to the method 'appendItemToContainer', which sets the container's height according to its previous height - which means getting and setting layout data in the same line.

I fixed this by looping over all the items in the container and summing up their heights, and only then setting the container's height once. This saves all the intermediate DOM tree updates, and leaves only the one that is necessary.

The fixed code looked something like this :
// collect the height of all elements
var totalHeight = 0;
for (var i=0; i<containerItems.length; i++) {
   totalHeight += containerItems[i].clientHeight;
}

// set the container's height once
container.style.height = totalHeight + 'px';

After fixing the code, I saw that the time spent was actually much less now -

As you can see, I managed to save a little over 150ms, which is great for such a simple fix!!


Friday, February 21, 2014

Chrome developer tools profiling flame charts

I just recently, and totally coincidentally, found out that the Chrome developer tools can generate flame charts while profiling JS code!
Lately it seems like generating flame charts from profiling data has become popular in languages like Ruby, Python and PHP, so I'm excited to see that Chrome has this option for JS code as well.

The default view for profiling data in the dev tools is the 'tree view', but you can easily change it to 'flame chart' by selecting it on the drop down in the bottom part of the window.

Like here :


Then you will be able to see the profiling results in a way that is sometimes easier to look at.
You can use the mouse scroll wheel to zoom in on a specific area of the flame chart, and see what's going on there.

In case you're not familiar with reading flame charts, then here's a simple explanation -
  • Each colored line is a method call
  • The method calls above one another represent the call stack
  • The width of the lines represents how long each call was

And here you can see an example of a flame chart, where I marked a few sections that the flame chart points out for us - non-optimized TryCatchBlocks. In this case it's convenient to view this in a flame chart, because you can see nicely how many method calls each try/catch block is surrounding.


Wednesday, February 19, 2014

Preloading resources - the right way (for me)


Looking through my 'client side performance glasses' when browsing the web, I see that many sites spend too much time downloading resources, mostly on the homepage, but sometimes the main bulk is on subsequent pages as well.

Starting to optimize
When trying to optimize your page, you might think that it's most important that your landing page is the fastest, since it defines your users' first impression. So what do you do ? You probably cut down on all the JS and CSS resources you can and leave only what's definitely required for your landing page. You combine and minify those, and then you're left with one file each. You might even be putting the JS at the end of the body so it doesn't block the browser from rendering the page, and you're set!

But there's still a problem
Now, your users go on to the next page, probably an inner page of your site, and this one is filled with much more content. On this page you use some jQuery plugins and other frameworks that you found useful and that probably saved you hours of JavaScript coding, but your users are paying the price...

My suggestion
I ran into this same exact problem a few times in the past, and the best way I found of solving this was to preload the resources on the homepage. I can do this after 'page load' so it doesn't block the homepage from rendering, and while the user is looking at the homepage, a little extra time is spent in the background downloading resources they'll probably need on the next pages they browse.

How do we do this ?
Well, there are several techniques, but before choosing the right one, let's take a look at the requirements/constraints we have -
  • We want to download js/css files in a non-blocking way
  • Trigger the download ourselves so we can defer it to after 'page load' (there's a short sketch of this part right after this list)
  • Download the resources in a way that won't execute them (css and js) (This is really important and the reason we can't just dynamically create a '<script/>' tag and append it to the '<head/>' tag!)
  • Make sure they stay in the browser's cache (this is the whole point!)
  • Work with resources that are stored on secure servers (https). This is important since I would like it to preload resources from my secured registration/login page too if I can.
  • Work with resources on a different domain. This is very important since all of my resources are hosted on an external CDN server with a different subdomain.
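Whichever technique you go with, the 'defer it to after page load' part (the second constraint above) looks roughly the same. Here's a minimal sketch of it - 'preloadFile()' is just a placeholder for whichever of the techniques below you end up choosing, and the file list and delay are made up for the example:

// kick off the preloading only after the current page has fully loaded,
// so we don't compete with resources the page actually needs right now
var schedulePreload = function(files) {
    var onLoad = function() {
        // wait a bit longer so we don't interfere with post-load work either
        setTimeout(function() {
            for (var i = 0; i < files.length; i++) {
                preloadFile(files[i]); // placeholder - one of the techniques below
            }
        }, 1000);
    };

    if (window.addEventListener) {
        window.addEventListener('load', onLoad);
    } else {
        window.attachEvent('onload', onLoad); // old IE
    }
};

schedulePreload([
    'http://cdn.example.com/inner-pages.js',
    'http://cdn.example.com/inner-pages.css'
]);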

The different techniques are (I have tested all of these, and these are my notes)
1. Creating an iframe and appending the script/stylesheet file inside it
var iframe = document.createElement('iframe');
iframe.setAttribute("width", "0");
iframe.setAttribute("height", "0");
iframe.setAttribute("frameborder", "0");
iframe.setAttribute("name", "preload");
iframe.id = "preload";
iframe.src = "about:blank";
document.body.appendChild(iframe);

// gymnastics to get reference to the iframe document
iframe = document.all ? document.all.preload.contentWindow : window.frames.preload;
var doc = iframe.document;
doc.open();
doc.writeln("");
doc.close();

var iFrameAddFile = function(filename) {
    // even JS files are added via a <link> tag inside the iframe -
    // they get downloaded and cached, but never executed
    var css = doc.createElement('link');
    css.type = 'text/css';
    css.rel = 'stylesheet';
    css.href = filename;
    doc.body.appendChild(css);
}
    
iFrameAddFile('http://ourFileName.js');
This works on Chrome and FF but on some versions of IE it wouldn't cache the secure resources (https).
So, close, but no cigar here (at least not fully).

2. Creating a javascript Image object
new Image().src = 'http://myResourceFile.js';
This only works properly on Chrome. On Firefox and IE it would either not download the secure resources, or download them but without caching.

3. Building an <object/> tag with file in data attribute
var isIE = /MSIE/.test(navigator.userAgent); // naive check, but good enough here

var createObjectTag = function(filename) {
    var o = document.createElement('object');
    o.data = filename;

    // IE stuff, otherwise 0x0 is OK
    if (isIE) {
        o.width = 1;
        o.height = 1;
        o.style.visibility = "hidden";
        o.type = "text/plain";
    }
    else {
        o.width  = 0;
        o.height = 0;
    }

    document.body.appendChild(o);
}
   
createObjectTag('http://myResourceFile.js');
This worked nicely on Chrome and FF, but not on some versions of IE.

4. XMLHttpRequest a.k.a. ajax
var ajaxRequest = function(filename) {
    var xhr = new XMLHttpRequest();
    xhr.open('GET', filename);
    xhr.send('');
}

ajaxRequest('http://myResourceFile.js');
This technique won't work with files on a different domain, so I immediately dropped this.

5. Creating a 'prefetch' tag
var prefetchTag = function(filename) {
    var link = document.createElement('link');
    link.href = filename;
    link.rel = "prefetch";
    document.getElementsByTagName('head')[0].appendChild(link);
}

prefetchTag('http://myResourceFile.js');


6. 'script' tag with invalid 'type' attribute
// creates a script tag with an invalid type, like 'script/cache'
// I realized this technique is used by LabJS for some browsers
var invalidScript = function(filename) {
    var s = document.createElement('script');
    s.src = filename;
    s.type = 'script/cache';
    document.getElementsByTagName('head')[0].appendChild(s);
}

invalidScript('http://myJsResource.js');
This barely worked properly in any browser. It would download the resources, but wouldn't cache them for the next request.


Conclusion
So, first I must say that, given all the constraints I have, this is more complicated than I thought it would be at first.
Some of the techniques worked well in all of the browsers for non-secured resources (non-SSL), but only in some browsers for secured resources. In my specific case I just decided to go with one of those, accepting that some users will not have the resources from SSL pages cached (these are a minority in my case).
But, I guess that given your circumstances, you might choose a different technique. I had quite a few constraints that I'm sure not everyone has.
Another thing worth mentioning is that I didn't test Safari with any of the techniques. Again, this was less interesting for me in my case.
I also haven't thought about solving this problem on mobile devices yet. Since mobile bandwidth is usually much slower too, I might tackle this problem differently for mobile devices...

Friday, December 20, 2013

Prebrowsing - Not all that...

Six weeks ago, Steve Souders, an amazing performance expert, published a post called "Prebrowsing".
In the post he talks about some really simple techniques you can use to make your site quicker. These techniques rely on the fact that you know the next page that the user will browse to, so you 'hint' to the browser, and the browser will start downloading needed resources earlier. This will make the next page navigation appear much quicker to the user.

There are three ways presented to do this - they all use the 'link' tag, with a different 'rel' value.
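Just for reference, the three hints boil down to something like this. I'm creating them from script here (which is equivalent to writing the <link> tags in the page's head), and the URLs are placeholders:

var addHint = function(rel, href) {
    var link = document.createElement('link');
    link.rel = rel;
    link.href = href;
    document.getElementsByTagName('head')[0].appendChild(link);
};

addHint('dns-prefetch', '//cdn.example.com');             // resolve a domain early
addHint('prefetch', 'https://example.com/js/signin.js');  // download a resource needed on the next page
addHint('prerender', 'https://example.com/signin');       // render the whole next page in the background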

The first technique is 'dns-prefetch'. This is really easy to add, and can improve performance on your site. Don't expect a major improvement though - the DNS resolution itself usually doesn't take more than 150ms (from my experience).
I wrote about this too, in this blog post: Prefetching dns lookups

The other two techniques shown are 'prefetch' and 'prerender'.
Since these are really easy to add, I immediately added them to my site once I read about them.
A little information about the site I'm working on : The anonymous homepage doesn't have SSL. From this page, most users sign in or register. Both of these actions redirect the user to an SSL-secured page. Since the protocol on those pages is https, the browser doesn't gain from the resources it has already cached, because as far as the cache is concerned they're different URLs (and they should be). This causes the user to wait a long time on these pages just to have their browser download the same resources again, this time over a secure connection.

So I thought it would be perfect to have the browser prerender (or prefetch) the sign-in page or the register page. I have a WebPageTest script that measures the performance of this page after the user has been at the anonymous homepage. This test improved by a LOT. This was great! It was only a day later that I realized that the anonymous homepage itself was now much slower... :/
I guess this is because while the browser takes up some of its resources to prerender the next page, it affects the performance of the current page. Looking at multiple tests of the same page, I couldn't detect any single point of failure, except that each resource on the page was taking just a little longer to download. Another annoyance is that you can't even see what's happening with the prerendered page in utilities like WebPageTest, so you just see the effect on the current page.

After reading a little more on the subject, I found more cons to this technique. First, it's still not supported in all browsers, not even FF or Opera. Another thing is that Chrome can only prerender one page across all processes. This means I can't do this for 2 pages, and I don't know how the browser will react if another open site also requested to prerender some pages. You also can't see the progress the browser makes on prerendering the page, and what happens if the user browses to the next page before the prerendered page has finished ? Will some of the resources be cached already ? I don't know, and honestly I don't think it's worth testing yet to see how all browsers act in these scenarios.
I think we need to wait a little longer for these two techniques to mature a bit...

What is the best solution ?
Well, like with every performance improvement - I don't believe there is a 'best solution', as there are no 'silver bullets'.
However, the best solution for the site I'm working on, so far, is to preload the resources we know the user will need ourselves. This means we use JavaScript to have the browser download resources the user will need throughout the site on the first page they land on, so on subsequent pages, the user's browser will have much less to download.

What are the pros with this technique ?
1. I have much more control over it - This means I can detect which browser the user has, and use the appropriate technique so it will work for all users (see the small sketch after this list).
2. I can trigger it after the 'page load' event. This way I know it won't block or slow down any other work the client is doing for the current page.
3. I can do this for as many resources as I need. CSS, JS, images and even fonts if I want to. Basically anything goes.
4. Downloading resources doesn't limit me to guessing the one page that the user will be heading to after this one. On most sites there are many common resources used among different pages, so this gives me a bigger win.
5. I don't care about other tabs the user has open that aren't my site. :)
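To give an idea of what point 1 means in practice, here's a rough sketch of picking a technique per browser. The 'preloadWith...' helpers are placeholders for the actual techniques (which I'll go into in the next post), the browser detection is deliberately naive, and the mapping itself is just an illustration, not a recommendation:

// pick a preloading technique based on the user's browser
// (preloadWithImage / preloadWithIframe / preloadWithObjectTag are
//  placeholders for the actual preloading techniques)
var preloadFile = function(filename) {
    var ua = navigator.userAgent;

    if (ua.indexOf('Chrome') !== -1) {
        preloadWithImage(filename);      // e.g. new Image().src = filename;
    } else if (ua.indexOf('MSIE') !== -1) {
        preloadWithIframe(filename);
    } else {
        preloadWithObjectTag(filename);
    }
};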

Of course the drawback with this is that, as opposed to the 'prerender' technique, the browser will still have to download the HTML, parse & execute the JS/CSS files and finally render the page.

Unfortunately, doing this correctly isn't that easy. I will write about how to do this in detail in the next post (I promise!).

I want to sum up for now so this post won't be too long -
In conclusion, I would say that there are many techniques out there and they fit different scenarios. Don't implement any technique just because it's easy or because someone else told you it works. Some of them might not be a good fit for your site and some might even cause damage. Steve Souders' blog is great and an amazing fountain of information on performance. I learned the hard way that each performance improvement I make needs to be properly analyzed and tested before I implement it.


Some great resources on the subject :
- Prebrowsing by Steve Souders
- High performance networking in Google Chrome by Ilya Grigorik
- Controlling DNS prefetching by MDN