[DebuggerStepThrough]: 2013

Friday, December 20, 2013

Prebrowsing - Not all that...

Six weeks ago, Steve Souders, an amazing performance expert published a post called "Prebrowsing".
In the post he talks about some really simple techniques you can use to make your site quicker. These techniques rely on the fact that you know the next page that the user will browse to, so you 'hint' to the browser, and the browser will start downloading needed resources earlier. This will make the next page navigation appear much quicker to the user.

There are three ways presented to do this - They all use the 'link' tag, with a different 'rel' value.

The first technique is 'dns-prefetch'. This is really easy to add, and can improve performance on your site. Don't expect a major improvement though, the dns resolution itself usually doesn't take more than 150ms (from my experience).
I wrote about this to, in this blog post: Prefetching dns lookups

The second two techniques shown are 'prefetch' and 'prerender'.
Since these are really easy to add, once I read about this, I immediately added this to my site.
A little information about the site I'm working on : The anonymous homepage doesn't have SSL. From this page, most users sign-in or register. Both of these actions redirects the user to a SSL secured page. Since the protocol on these pages are https, the browser doesn't gain from the cached resources it has already since it thinks it's a different domain (and should). This causes the user to wait a long time on these pages just to have it's client download the same resources again but from a secured connection this time.

So I thought it would be perfect to have the browser prerender (or prefetch) the sign-in page or the register page. I have a WebPageTest that runs a script measuring the performance of this page, after the user was at the anonymous homepage. This test improved by a LOT. This was great! It was only a day after that I realized that the anonymous homepage itself was much slower... :/
I guess this is because while the browser takes up some of it's resources to prerender the next page, it affects the performance of the current page. Looking at multiple tests of the same page I couldn't detect any point of failure except that each resource on the page was taking just a little longer to download. Another annoyance is that you can't even see what's happening with the prerendered page on utilities like WebPageTest, so you just see the effect on the current page.

After reading a little more on the subject I found more cons to this technique. First, it's still not supported in all browsers, not even FF or Opera. Another thing is that Chrome can only prerender one page across all processes. This means I can't do this for 2 pages and I don't know how the browser will react if another site that is opened also requested to prerender some pages. You also won't see the progress the browser makes on prerendering the page, and what happens if the user browses to the next page before the prerendered page finished ? Will some of the resources be cached already ? I don't know, and honestly I don't think it's worth testing yet to see how all browsers act on these scenarios.
I think we need to wait a little longer with these two techniques for them to mature a bit...

What is the best solution ?
Well, like every performance improvement - I don't believe there is a 'best solution' as there are no 'silver bullets'.
However, the best solution for the site I'm working on so far, is to preload the resources we know the user will need ourselves. This means we use javascript to have the browser download resources we know the user will need throughout the site on the first page they land, so on subsequent pages, the user's client will have much less to download.

What are the pros with this technique ?
1. I have much more control over it - This means I can detect which browser the user has, and use the appropriate technique so it will work for all users.
2. I can trigger it after the 'page load' event. This way I know it won't block or slow down any other work the client is doing for the current page.
3. I can do this for as many resources I need. css, js, images and even fonts if I want to. Basically anything goes.
4. Downloading resources doesn't limit me to guessing the one page that the user will be heading after this one. On most sites there are many common resources used among different pages, so this gives me a bigger win.
5. I don't care about other tabs the user has open that aren't my site. :)

Of course the drawback with this is that opposed to the 'prerender' technique, the browser will still have to download the html, parse & execute the js/css files and finally render the page.

Unfortunately, doing this correctly isn't that easy. I will write about how to do this in detail in the next post (I promise!).

I want to sum up for now so this post won't be too long -
In conclusion I would say that there are many techniques out there and many of them fit different scenarios. Don't implement any technique just because it's easy and because someone else told you it works. Some of them might not be a good fit for your site and some might even cause damage. Steve Souder's blog is great and an amazing fountain of information on performance. I learned the hard way that each performance improvement I make needs to be properly analyzed and tested before implementing.

Some great resources on the subject :
- Prebrowsing by Steve Souders
- High performance networking in Google Chrome by Ilya Grigorik
- Controlling DNS prefetching by MDN

Monday, November 11, 2013

Some jQuery getters are setters as well

A couple of days ago I ran into an interesting characteristic of jQuery -
Some methods which are 'getters' are also 'setters' behind the scenes.

I know this sounds weird, and you might even be wondering why the hell this matters... Just keep reading and I hope you'll understand... :)

If you call the element dimension methods in jquery (which are height(), innerHeight(), outerHeight(), width(), innerWidth() & outerWidth() ) you'll probably be expecting it to just check the javascript object properties using simple javascript and return the result.
The reality of this is that sometimes it needs to do more complicated work in the background...

The problem :
If you have an object which is defined as 'display:none', calling 'element.clientHeight' in javascript, which should return the object's height will return '0'. This is because a 'hidden' object using 'display:none' isn't rendered on the screen and therefore the client never knows how much space it visually actually takes, leading it to think it's dimensions are 0x0 (which is right in some sense).

How jquery solves the problem for you :
When asking jquery what the height of a 'display:none' element is (by calling $(element).height() ), it's more clever than that.
It can identify that the element is defined as 'display:none', and takes some steps to get the actual height of the element :
- It copies all the element's styles to a temporary object
- Defines the object as position:absolute
- Defines the object as visibility:hidden
- Removes 'display:none' from the element. After this, the browser is forced to 'render' the object, although it doesn't actually display it on the screen because it is still defined as 'visibility:hidden'.
- Now the jquery knows what the actual height of your element is
- Swaps back the original styles and returns the value.

Okay, so now that you know this, why should you even care ?
The step that jquery changes the styles of your element without you knowing, which forces the browser to 'render' the element in the background can take time. Not a lot of time, but still take some time. Probably a few milliseconds. Doing this once wouldn't matter to anyone, but doing this many times, lets say in a loop, might cause performance issues.

Real life!
I recently found a performance issue on our site that was caused by this exact reason. The 'outerHeight()' method was being called in a loop many times, and fixing this caused an improvement of ~200ms. (Why saving 200ms can save save millions of dollars!)

I will soon write a fully detailed post about how I discovered this performance issue, how I tracked it down, and how I fixed it.

Always a good tip!
Learn how your libraries are working under the hood. This will give you great power and a great understanding of how to efficiently use them.

Saturday, November 9, 2013

Humanity wasted 14,526 years watching Gangnam Style

"Humanity wasted 14,526 years watching Gangnam Style"...
This was the title of a link I posted on Hacker News about a week ago which linked to a website I created with a friend of mine (Gil Cohen) - http://www.WastedHumanity.com

It seems like I managed to really annoy some people, and some even claim to hate me!
(The whole discussion can be seen here : https://news.ycombinator.com/item?id=6663474)

Well, I just wanted to say about this whole thing a few words -
The whole idea of this site was only a joke. Just me and my friend sitting in the living room one boring Friday watching some silly YouTube videos ourselves when I started thinking about how many times these videos were watched. It amazed me, so I started calculating it in my head. The programmer I am wouldn't allow me to just calculate this data manually so I started building a site that would do it for me. When we saw the numbers we were amazed and started joking about the things we could have done instead of this 'wasted time'...

I didn't mean to laugh about how you decide to spend your time or make fun of anyone in anyway. I myself 'waste' a lot of time on YouTube, sometimes on silly videos while doing nothing, and sometimes countless hours listening to music as I work. I added at least a few views myself to each one of the videos seen on the site, and many more not on the site. I don't see that time as 'wasted'.

I also know the calculation isn't a bit accurate, and that each (or at least most) of the facts on that site wasn't accomplished by one person so in reality it took much more than written.

So, sorry if I hurt you. I know I made a lot of people laugh in the process, so it was totally worth it! :)

Monday, October 21, 2013

A tale of asp.net, IIS 7.5, chunked responses and keep-alive

A while ago I posted about chunked responses - what they are and the importance of them. It turns out that we (where I work) were getting it all wrong.

We implemented chunked responses (or at least thought so) quite a while ago, and it WAS working, in the beginning, but all of a sudden stopped.

How did I come to realize this ?
While analyzing waterfall charts of our site, which I've been doing regularly for quite a while now, I realized that the response doesn't look chunked.
It's not trivial realizing this from a waterfall chart, but if you look closely and you're familiar with your site's performance you should notice this. Since the first chunk we send the client is just the html 'head' tag, this requires almost no processing so it can be sent to the client immediately, and it immediately causes the browser to start downloading resources that are requested in the 'head' tag. If a response is chunked, in the waterfall, you should see the resources starting to be downloaded before the client even finishes downloading the html response from the site.

A proper chunked response should look like this :

v
If you look closely you will realize that the response took long to download, which doesn't match the internet connection we chose for this test, which means the download didn't actually take that long, but the server sent part of the response, processed more of it, and then sent the rest.

Here's an image of a response that isn't chunked :

You can see that the client only starts downloading the resources required in the 'head' after the whole page is downloaded. We could've saved some precious time here, and have our server work parallel to the client that is downloading resources from our CDN.

What happened ?
Like I said, once this used to work and now it doesn't. We looked back at what was done lately and realized that we switched load balancers recently. Since we weren't sending the chunks properly, the new load balancer doesn't know how to deal with this and therefore just passes the request on without chunks to the client.
In order to investigate this properly, I started working directly with the IIS server...

What was happening ?
I looked at the response with Fiddler and WireShark and realized the response was coming in chunks, but not 'properly'. This means the 'Transfer-Encoding' header wasn't set, and the chunks weren't being received in the correct format. The response was just being streamed, and each part we had we passed on to the client. Before switching load balancer, it was being passed like this to the client, and luckily most clients were dealing with this gracefully. :)

So why weren't our chunks being formatted properly ?
When using asp.net, mvc, and IIS 7.5 you shouldn't have to worry about the format of the chunks. All you need to do is call 'HttpContext.Response.Flush()' and the response should be formatted correctly for you. For some reason this wasn't happening...
Since we're not using the classic Microsoft MVC framework, but something we custom built here, I started digging into our framework. I realized it had nothing to do with the framework, and was more low level in Microsoft's web assemblies, so I started digging deeper into Microsoft's code.

Using dotPeek, I looked into the code of 'Response.Flush()'...
This is what I saw :

As you can see, the code for the IIS 6 worker is exposed, but when using IIS7 and above it goes to some unmanaged dll, and that's where I stopped going down that path.

I started looking for other headers that might interfere, and started searching the internet for help... Couldn't find anything on the internet that was useful (which is why I'm writing this...), so I just dug into our settings.
All of a sudden I realized my IIS settings had the 'Enable HTTP keep-alive' setting disabled. This was adding the header 'Connection: close' which was interfering with this.

I read the whole HTTP 1.1 spec about the 'Transfer-Encoding' and 'Connection' headers and there is no reference to any connection between the two. Whether it makes sense or not, It seems like IIS 7.5 (I'm guessing IIS 7 too, although I didn't test it) doesn't format the chunks properly, nor add the 'Transfer-Encoding' header if you don't have the 'Connection' header set to 'keep-alive'.

Jesus! @Microsoft - Couldn't you state that somewhere, in some documentation, or at least as an error message or a warning to the output when running into those colliding settings?!!

Well, what does this all mean ?
The 'Connection' header indicates to the client what type of connection it's dealing with. If the connection is set to 'Close' it indicates that the connection is not persistent and will be closed immediately when it's done sending. When specifying 'keep-alive' this means the connection will stay opened, and the client might need to close it.
In the case of a chunked response, you should indicate the last chunk by sending a chunk with size '0', telling the client it's the end, and they should close the connection. This should be tested properly to make sure you're not leaving connections hanging and just wasting precious resources on your servers.
(btw - by not specifying a connection type, the default will be 'Keep-Alive').

If you want to take extra precaution, and I suggest you do, you can add the 'Keep-Alive' header which indicates that the connection will be closed after a certain amount of time of inactivity.

Whatever you do, make sure to run proper tests under stress/load to make sure your servers are managing their resources correctly.

Additional helpful resources :
- 'Keep-Alive' header protocol
- HTTP/1.1 headers spec

Thursday, September 19, 2013

Prefetching dns lookups

Since I've been working hard on latency and client side performance at my company, I've been analyzing several pages a day of our site and other big sites on the web, using mainly WebPageTest, looking for ways to optimize their performance. Viewing hundreds of waterfall charts, your eyes tend to get used to looking at the same kind of patterns and the same kind of requests.

The DNS resolution, or 'DNS lookup' phase in the request was something I always thought should just be ignored. I mean, it pissed the hell out of me that it was there, but I honestly thought that there was nothing I can do about it...

A while ago I thought about simply inserting the IP addresses of our CDN domains and other sub-domains we might have directly in the code to solve this. This is bad for 2 main reasons:
1. If your IP changes for some reason it forces you to change your code accordingly. (maybe not a scenario that should happen often or even at all, but still might)
2. (and this is much more important!) When using a CDN service like akamai, the dns lookup will give you different results according to where you are in the world. Since they have servers strategically placed in different geographical locations, a user from the USA will probably get a different IP than a user from Europe or Asia.

Well, recently that all changed - I realized that you can direct the browser to prefetch the dns lookup at the beginning of the request, so that when the browser runs into the new domain it won't have to lookup up the dns again.

To do this, all you need to add is this tag at the beginning of your page :

Doing this on the domain you're currently on has no effect since the browser already did the dns lookup, but it can help when you know that in your page source you have calls to multiple sub-domains (for cdn's), calls to 3rd party libraries or ajax calls you make to other domains. Even if you know of a call that will happen on the next page the user lands on, you should still prefetch the dns lookup since the browser caches the results for a couple of minutes at least, and this should have no effect on the current page performance.

The most common response I get when telling people about this, or reading about this on the internet is that the DNS lookup alone doesn't take that long. From my tests, I can say that the average DNS lookup time is under 100ms, although usually above 20ms, and sometimes it passes the 100ms. Even though this isn't the common case, you can still make sure time is saved for those 'unlucky' users.
...and besides, this is one of the easiest performance wins you have - It requires almost no work to implement!

Just while writing this article I happened to test facebook.com, and check out how long the DNS lookup took on those 3 last requests!

(You can view the full results of this test here : http://www.webpagetest.org/result/130919_17_B60/1/details/) Yep, you better believe your eyes - The DNS lookup on those last requests seemed to take 2 seconds!!
Now, I don't know why they took 2 seconds in that case, and I bet this is really rare, but it still happens sometimes, you can't argue with that.
But hey, If they would've requested to prefetch that last domain, it would still take that long! That's right, but it would've started much earlier, and could've still save hundreds of valuable milliseconds.

So, my suggestions to you is, lets say you have 4 sub-domains for CDN's and you know you're going to call facebook's api at some point, you should put something like this in the head tag of your source :

This will tell the browser to immediately start the dns fetching so that when the browser reaches those domains it will have the ip stored in the cache already.

If you want to see what it looks like when you're prefetching the dns lookup properly, take a look at these WebPageTest results from amazon : http://www.webpagetest.org/result/130919_ZQ_J6V/1/details/
You can clearly see that the dns lookup part of the request on some of the domains happen a lot before the browser reaches the actual resource on the timeline, and when it does, it doesn't need to wait for the dns lookup.
As usual, great work amazon! :)

Some more resources on the subject :
- MDN - Controlling DNS prefetching
- Chromium Blog - DNS prefetching
- Performance Calendar - Speed up your site using DNS prefetching

Wednesday, September 4, 2013

All about http chunked responses

A short background on HTTP and the 'Content-Length' header :
When sending requests over HTTP (hence, 'the web'), we send an HTTP request which consists of two main parts - the header of the request and the body. The header defines various details of the request body (e.g.: encoding type, cookies, request method, etc.). One of these details is the 'Content-Length' specifying the size of the body. If you're building a website and aren't specifying this explicitly then chances are the framework you're using is doing this for you. Once you send the response to the client, the framework measures the size of the response and adds it to this header.

In a normal request, looking at the headers with FireBug or Chrome developer tools, it should look like this (looking at google.com) :

So, what is a 'chunked response' ?
A 'chunked' response means that instead of processing the whole page, generating all of the html and sending it to the client, we can split the html into 'chunks' and send one after the other, without telling the browser how big the response will be ahead of time.

Why would anyone want to do this ?
Well, some pages on the site can take a long time to process. While the server is working hard to generate the output, the user sees a white screen and the browser is pretty much hopeless during this time with nothing to do and just displays a boring white screen to the user.
The work the server is doing might be to generate a specific part of the content on the page, and we might have a lot ready that we can already give the client to work with. If you have scripts & stylesheets in the <head/> of your page, you can send the first chunk with the 'head' tag html content to the user's machine, then the browser will have something to work with, meaning it will start downloading the scripts and resources it needs and during this time, and your servers can continue crunching numbers to generate the content to be displayed.
You are actually gaining parallelism by sending the client this first chunk without waiting for the rest of the page to be ready!

Taking this further, you can split the page into several chunks. In practice, you can send one chunk with the 'head' of the page. The browser can then start downloading scripts and stylesheets, while your server is processing lets say the categories from your db to display in your header menu/navigation. Then you can send this as a chunk to the browser so it will have something to start rendering on the screen, and your server can continue processing the rest of the page.

Even if the user only sees part of the content, and it isn't enough to work with, the user still gets a 'sense' of better performance - something we call 'perceived performance' which has almost the same impact.

Many big sites are doing this, since this will most definitely improve the client side performance of your site. Even if it's only by a few milliseconds, in the ecommerce world we know that time is money!

How does this work ?
Since the response is chunked, you cannot send the 'Content-Length' response header because you don't necessarily know how long the response will be. Usually you won't know how big the response will be, and even if you do, the browser doesn't care at this point.
So, to notify the browser about the chunked response, you need to omit the 'Content-Length' header, and add the header 'Transfer-Encoding: chunked'. Giving this information to the browser, the browser will now expect to receive the chunks in a very specific format.
At the beginning of each chunk you need to add the length of the current chunk in hexadecimal format, followed by '\r\n' and then the chunk itself, followed by another '\r\n'.

FireBug and Chrome dev tools both combine the chunks for you, so you won't be able to see them as they are really received by the browser. In order to see this properly you will need to use a more low level tool like Fiddler.

This is how the raw response of amazon.com looks like using fiddler :

Note : I marked the required 'Transfer-Encoding: chunked' header, and the first line with the size of the chunk. In this case the first chunk is 0xd7c bytes long, which in human-readable format is 3452 bytes.
Also, it's interesting to note that you cannot really read the first chunk since it's encoded via gzip (which is also automatically decoded when using browser dev tools). When using fiddler, you can see the message at the top telling you this, and you can click it and have it decoded, but then the chunks are removed and you'll see the whole html output.

How can we achieve this with asp.net ?
When you want to flush the content of your site, all you need to do in the middle of a view is call 'HttpContext.Current.Response.Flush()'.
It's that easy! Without you having to worry about it, the .net framework will take care of the details and send the response to the browser in the correct format.

Some things that might interfere with this working properly :
- You might have to configure 'Response.BufferOutput = false;' at the beginning of your request so the output won't be buffered and will be flushed as you call it.
- If you specifically add the 'Content-Length' header yourself then this won't work.

For more helpful resources on chunked responses :
Wikipedia, and the spec details : http://en.wikipedia.org/wiki/Chunked_transfer_encoding
How to write chunked responses in .net (but not asp.net) - http://blogs.msdn.com/b/asiatech/archive/2011/04/26/how-to-write-chunked-transfer-encoding-web-response.aspx
Implementing chunked with IHttpListener - http://www.differentpla.net/content/2012/07/streaming-http-responses-net

Wednesday, August 14, 2013

My talk on Latency & Client Side Performance

As an engineer in the 'Core team' of the company, we are responsible for making the site as available as we can, while having great performance and standing heavy load. We set high goals, and we're working hard to achieve them.

Up until a while ago, we were focusing mainly on server side performance - Looking at graphs under various load and stress tests, and seeing how the servers perform, each time making more and more improvements in the code.

A few weeks ago we started putting a lot of focus on latency and client side performance. I have taken control in this area and am following the results and creating tasks that will improve the performance every day.

Since I've been reading a lot about it lately, and working on it a lot, I decided to create a presentation on the subject to teach others some lessons learned from the short time I've been at it...

Here are the slides : http://slid.es/gillyb/latency

There are many details you'll be missing by just looking at the slides, but if this interests you than you should take a look anyway. The last slide also has many of the references from which I took the information for the presentation. I strongly recommend reading them. They are all interesting! :)

I might add some future posts about specific client side performance tips and go much more into details.
I'm also thinking about presenting this at some meetup that will be open to the public... :)

Saturday, July 27, 2013

Improving website latency by converting images to WebP format

A couple of years ago Google published a new image format called WebP (*.webp). This format is supposed to be much smaller in size, without losing from the quality (or at least no noticeable quality). You can convert jpeg images to webp without noticing the difference, with a smaller image file size and even support transparency.
According to Ilya Grigorik (performance engineer at google) - you can save 25%-35% on jpeg and png formats, and 60%+ on png files with transparency! (http://www.igvita.com/2013/05/01/deploying-webp-via-accept-content-negotiation/)

Why should we care about this ?
Your web site latency is super important! If you don't measure it by now, then you really need to start. In commerce sites it's already been proven that better latency directly equals more revenue (Amazon makes 1% more in revenue by saving 100ms).

How is this new image format related to latency ?
If your site has many images, then your average user is probably spending a fair amount of time downloading those images. Think of a site like pinterest which is mostly contrived of user based images, then the user is downloading many new images with each page view.
While on a PC at your home, with a DSL connection this might not seem like a lot, but we all know that a big percentage of our users are using mobile devices, with 3G internet connection, which is much slower and they suffer from much longer download times.

What are our options ?
Just converting all our images to WebP is clearly not an option. Why ? Well, some people in the world have special needs. In this case i'm referring to people with outdated browsers (We all know who they are!).
BUT, we can still let some of our users enjoy the benefit of a faster site, and this includes many mobile users as well!

We will need to make some changes to our site in order for us to support this, so lets see what we can do -
(Technical details on implementation at the end)

Option #1 - Server side detection :
When our server gets the request, we can detect if the user's browser supports webp, and if so reply with an html source that has '*.webp' image files in it.
This option comes with a major downside - You will no longer be able to cache the images on the server (via OutputCaching or a CDN like Akamai) since different users can get a different source code for the same exact page.

Option #2 - Server side detection of image request :
This means we can always request the same file name, like 'myImage.png'. Add code to detect if this client can support webp then just send it the same file but in webp format.
This option has a similar downside - Now we can cache the html output, but when sending the image files to the user we must mark them as 'non-cacheable' too since the contents can vary depending on the user's browser.

Option #3 - Client side detection :
Many big sites defer the downloading of images on the client only until the document is ready. This is also a trick to improve latency - It means the client will download all the resources they need, the browser will render everything, and only then start downloading the images. Again, for image intensive sites this is crucial, since it allows the user to start interacting with the site before waiting for the downloading of many images that might not be relevant at the moment.
This is done by inserting a client side script that will detect if the browser supports webp format. If so, you can change the image requests to request the *.webp version of the image.
The downside to this option is that you can only use it if the browser supports the webp format.
(btw - you can decide to go extreme with this and always download the webp version, and if the client doesn't support it, there are js decoders that will allow you to convert the image on the client. This seems a little extreme to me, and you probably will be spending a lot of time decoding in js anyway).

The gritty details -

How can we detect if our browser supports webp ?
Don't worry, there's no need here for looking up which browsers support webp and testing against a list. Browsers that support webp format should claim they do when requesting images. We can see this done by Chrome (in the newer versions) :

You can see in the request headers 'Accept: image/webp'

How do we do this on the client ?
In javascript we don't have access to the request headers, so we need to get creative.
There is a trick that can be done by actually rendering an image on the client, using base64 to store the image in the code, and then detect if the browser loaded the image successfully.
This will do the trick :

$("")
    .attr('src', 'data:image/webp;base64,UklGRh4AAABXRUJQVlA4TBEAAAAvAQAAAAfQ//73v/+BiOh/AAA=')
    .on("load", function() {
        // the images should have these dimensions
        if (this.width === 2 || this.height === 1) {
            alert('webp format supported');
        }
        else {
            alert('webp format not supported');
        }
    }).on("error", function() {
        alert('webp format not supported');
    });

How do we convert our images to webp format ?
We can do it manually using Google's converter - https://developers.google.com/speed/webp/docs/cwebp
Doing it programatically depends on what language you're using.
There's a wrapper for C# - http://webp.codeplex.com/
(and there are more for other languages, but not all - I'm actually looking for a java wrapper, and couldn't find one yet)

So, should I run ahead and do this ?
All this good does come with a price, as all good things do... :)
There might be side affects you didn't think of yet. Some of them being the fact that if a user sends a link to an image that ends with webp and the user that receives this is using a browser that doesn't support it, then they won't be able to open the image.
More what, even if the user does use a new browser (e.i.: a new version of Chrome) and they save a webp file to disk, they probably won't be able to open it on their computer.
These are problems that facebook ran into, and eventually retreated from the idea of using webp. You can read all about that here.

Which browsers did you say support this ?
According to www.caniuse.com - Chrome has obviously been supporting it for a while. Opera also supports it, and FireFox is supposed to start supporting this really soon as well. The most important news is that Android browsers, Chrome for Android and Opera mobile all support this which means many of your mobile users can gain from this change.

If you're still reading and want more information -
- Ilya Grigorik explains how to implement this using your CDN and NginX
- An excellent presentation on web image optimization by Guy Podjarny

Sunday, February 17, 2013

Getting started with nodejs - building an MVC site

A couple of weeks ago I started getting into nodejs. At first I was quite skeptic, I don't even recall why, but after playing with it for just a couple hours I started loving it. Seriously, It's so simple to use, and it seems like the nodejs eco-system is growing really fast. I'm not going to go into what nodejs is or how it works, so if you don't know, you should start by reading this.

What I am going to show here, is a really quick and simple tutorial as to how to get started building a website on nodejs using the MVC design pattern. I'll go over the quick installation of nodejs and walk through getting the very basic mvc wireframe website up and running.
(Since I've been a .net developer for quite a while, I might be comparing some of the terms used to the terminology .net programmers are familiar with)

Installing nodejs
First, download and install nodejs.
On ubuntu, this would be :

sudo apt-get install nodejs

(If your ubuntu version is lower than 12.10 than you need to add the official PPA. Read this)

Now, you need to install the npm (nodejs package manager) :

sudo apt-get install nodejs npm

This will help us install packages built for nodejs. (exactly like 'Nuget' for Visual Studio users).

Starting our website
I've looked up quite a few mvc frameworks for nodejs, and I would say that the best one, by far, is expressjs. It's really easy to use and it's being actively updated.

Create a directory for your website, navigate their in the terminal, and type

sudo npm install express

Now we need to tell nodejs how to configure our application, where are the controllers/models/views, and what port to listen to...

Create a file called index.js in the website directory you created -

First things first :

var express = require('express');
app = express();

This defines 'app' as our expressjs web application, and gives us all the cool functionality that comes with the expressjs framework.

After that we need to configure our application :

app.configure(function() {
    app.set('view engine', 'jade');
    app.set('views', __dirname + '/views');
    
    app.use(express.logger());
    app.use(express.bodyParser());
    app.use(express.cookieParser());

    app.use(express.static(__dirname + '/scripts'));
    app.use(express.static(__dirname + '/css'));
    app.use(express.static(__dirname + '/img'));

    app.use(app.router);
});

The first two lines tells express we're going to use the 'jade' view engine to render our views. (This is like 'razor' but a little different, for people coming from the .net mvc). You can read about how the view engine works over here. The next 3 lines tell express to use certain middleware. ('middleware' are like 'filters' in the asp.net mvc world) middleware intercept each request, and can do what it wants, including manipulating the request. Basically, each middleware is a method being called with the request object, response object, and 'next' object respectively.
The 'next' object is a function that calls the next middleware in line.
All the middleware are called in the same order that they are defined. The middlewares I use here are basic middlewares that comes with the expressjs framework, and just make our life much easier (by parsing the request object, the cookies to our request/response objects and logging each request for us).

The final 3 lines of code, tell expressjs which directories have static files in them. This means that each request to a filename that exists in one of these files will be served as static content.
Note : if we put a file called 'main.css' in the '/css' folder, we request it by going to http://ourdomain.com/main.css and NOT by going to http://ourdomain.com/css/main.css. (This got me confused a little at first...)

After all that, we need to add our models and controllers...

require('./models');
require('./controllers');

The nodejs default when requiring a directory is to look for the file 'index.js' in that directory, so what I did is create an index.js file in each of those directories, and inside it just add a couple of 'require()' calls to specific files in that directory.

For models you can create javascript object however you like. On the projects I'm working on, I started using mongoose - which is like an ORM for mongodb. It's really simple to use, but I won't go into it for now...

Finally, in our init.js file, we need to tell our app to listen to a certain port -

app.listen(8888);

Controllers
Defining controllers is really easy with express - Each 'action' is a method, defined by GET or POST, the url (which can include dynamic parameters in it), and the function to call. A typical controller looks like this :

app.get('/about', function(request, response) {
    // just render the view called 'about'
    // this requires us to have a file called 'about.jade' in our 'views' folder we defined
    response.render('about');
});

app.get('/user/:userId', function(request, response) {
    // userId is a parameter in the url request
    response.writeHead(200); // return 200 HTTP OK status
    response.end('You are looking for user ' + request.route.params['userId']);
});

app.post('/user/delete/:userId', function(request, response) {
    // just a POST url sample
    // going to this url in the browser won't return anything..

    // do some work...
    response.render('user-deleted'); // again, render our jade view file
});

So, that's the end of this. It's really basic I know, but I hope it will help you get started... :)
The main idea of this post was to show just how easy it is to get started with nodejs.
I think I will be posting a lot more about nodejs in the near future! :)

Have fun! :)

Thursday, January 31, 2013

Problems with Google Analytics

Most of the Google utilities I use are great - they usually have an intuitive design that make them frictionless and have most of the features someone needs. The features they have usually work as expected too, which isn't trivial with some other competing utilities.

Lately I've been using Google Analytics and the truth is, I don't like what I see... :(

The most annoying part of using Google Analytics is that there's no way of testing it!
It would seem like a trivial feature to me, but apparently not to the people at Google. Maybe most people don't have this problem, since you set up the analytics reports when first designing the website, and then the testing process is done on your production environment which could be really easy, and if you have no stats, then you obviously have nothing to ruin.
When I was trying to make some of the most minor changes to the way we report some of the stuff on the website I work on at my job, the first thing that interested me was how I was going to test the changes.

When you have many users in production, there's no chance you'll notice the change you made when you login. Even if you would, I could by accidentally affect other analytics and I was obviously afraid to do so. So, I set up a test account, and tried reporting to the test account from my local machine. This didn't work since Google makes sure you make the request from the domain you registered in the GS (Google Analytics) account, which is great! After looking into this a little, I found out that I can tell GA to ignore the domain that the request is coming from so that this will work. From their documentation, this feature was meant to be for using multiple subdomains, but it works for reporting from any domain. Since this helped my cause, and I'm not afraid of others causing harm to my test account, I won't go into why this is a bad idea, and can be harmful to some other sites using this... :/
After doing all that, I came to realize that the analytics aren't reported in real time, which is also logical, since an analytics system usually needs to deal with large amounts of data, and it takes time to handle the load. (Not only it's not real-time, but it's pretty far from being almost real-time as well) BUT, this doesn't mean there shouldn't be a way around this for testing, like an option I could turn on, just so the reports effect will be seen in real time, even if it's limited for a really small number, just for testing!

In case someone reading this ran into the same problem - The configuration setting I used like this :

_gaq.push(['_setDomainName', 'none']);

By the way - From my experience with the Adobe Omniture utility, they have a great 'debugging' utility that you can use a bookmarklet. It opens on any site, and shows you the live reports going out, which is a GREAT tool for testing, and should've been implemented by Google in the same way.

Another issue I had (and frankly, still have) with GA, is that some of their documentation isn't full... For example : There are some pages (like 'Page Timings') where you can view the stats of different pages, and the average. You can sort this list by 'page title' or some other parameters. The problem is that when you have many pages that are the same, but with dynamic content (meaning all the 'page titles' are different), you might want to group them by a 'user defined variable' that you report on that page.
Great! You have this option. ...BUT, in the documentation, the way you report a 'User defined variable' is by using the '_setVar' method. It continues on by stating that the '_setVar' method is soon to be deprecated and they don't recommend using it. Instead you should '_setCustomVar'. The problem here is that 'Custom Var' and 'User Defined Variable' aren't the same, and in some pages you can view one and in some the latter. There is no documentation anymore for the '_setVar' method, so I searched various blogs about people writing about this in the past, and found the way to use it, but it works in a different way, and I couldn't find a way to define it's lifespan (per session/page/user/etc.) like you can do with '_setCustomVar'.

Long story short... It seems like they have quite some work to do on this before it's perfect, or close to being perfect, and I'm not 100% I'll be using this again as a full site solution for web page analytics.

Friday, January 18, 2013

It's time for yet another change...

I started out programming ~13 years ago, when I was only 12 years old. The first language I learned was VB. I was building some small winforms utilities for myself. Then (about a year after) I started learning php and went into web development, building some dynamic websites (mostly php & MySQL) for the next 6-7 years...

Once I drafted to the military (mandatory in Israel), I did some programming for the Air Force. It was then when I got into the .net world, and was doing most of the programming in C#. I got used to it really fast, and learned to like it, so after the military (~3.5 years), I went on to work for ICC (the Israeli VISA company) as a .net programmer for 2 and a half years. After that I immediately got a job for Sears Israel, mostly developing in C# but also doing some java.

The past couple of years I learned a lot of new technologies and new programming paradigms, but it was mostly using the Microsoft platform. I didn't get to work with other programming languages, not to mention working on Linux platforms...

So... I think it's time for yet another change... :)
I'm not talking about quiting my job (note to my employer: no need to worry yet), but I'm talking about learning some new languages and technologies on my own. I'm not kidding anyone here - I've been programming since a young age and have been loving every moment of it. I do it a lot at home and a lot in my spare time just for fun!
I recently decided that I will dedicate my spare time to learning some new technologies and new languages. I'm sure, even this won't pay off immediately at work, the knowledge I will be gaining will still be priceless and will definitely help me in the future, possible even at my current work place.

First step - A couple of weeks ago I installed Linux on my home computer. I went for the Ubuntu distribution, since I heard it's the most user-friendly, and this way I could hit the ground running.
It's only been a couple of weeks, and I already know how to do everything I need from the terminal. Manage my files, install new packages and even work with vim. Seriously, if I didn't need a web browser, I would be installing the server version. Now, I started copying all my old files I want to backup from my Windows installation, and planning on removing Windows all together.

Next steps - I decided I need to learn new technologies as well. Recently reading about nodejs a lot got me really excited and so I started learning it. I also figured it would be easy for me to get into, since after years of web development I can say I know a fair amount of javascript.
I'm already working on a small project just for myself which is coming out really cool (might blog about this soon). I'm working with the expressjs mvc framework, jade view engines, stylus and even implementing OpenId in javascript!

I'm always thinking about how I can use this information to leverage some utilities we have at work... :) Every programming language has its strengths and it's all about choosing the right tool for the job.

What's next ? Well, I think after I finish my project in nodejs I'll go onto learning python which has also been interesting me for a quite a while now...
I already read a lot about it, and now I just need to get my hands dirty a bit too.

In conclusion I will say that learning new technologies is always a good idea, can always be a lot of fun, will always give you new tools to deal with everyday problems and put you ahead of many other programmers that are 'stuck with the same technology for years'. I already feel I have quite the experience - programming in VB, php, C#, javascript, using MySQL, Oracle, SQL Server, (and many frameworks within each language) and having at least a couple of years of experience with each of these. I really believe this puts me ahead of most, especially when it comes to solving new problems - I have a better point of view then others and a lot of experience to lean on.

Monday, January 14, 2013

Some good to know C# attributes

Just felt like writing about a couple of C# framework attributes that I happened to use lately, and not enough programmers know about (in my opinion, and this usually surprises me).

The first (and the name of my blog) :
DebuggerStepThroughAttribute
This attribute can be used to tell the debugger not to "step-into" a piece of code, and instead, it will skip over it. The best use for this that I can think of is when using an IoC/AOP framework and you have method interceptors, debugging can be a big pain in the ass... You keep on going through each interceptor, on every method call!
All you need to do is add [DebuggerStepThrough] above your interceptor, and the code won't be debugged.
e.g.:

[DebuggerStepThrough]
public class DontDebugInterceptor : IInterceptor
{
    // do something...
}

DebuggerDisplayAttribute
This attribute is used to tell the debugger what to display when in the 'Watch' window or when hovering over the variable in debug mode. Most people know that if you override the 'ToString()' method than you get the same effect, but sometimes this just isn't possible, since you might need the 'ToString()' method for something else.
All you need to do is add [DebuggerDisplay('Some string representation')] to the field/property/class you want to modify. You can also evaluate code inside the string given to the attribute constructor, just by wrapping it with curly braces.
e.g.:

[DebuggerDisplay('This class is : {OutputClass()}')]
public class MyClass
{
    private string OutputClass()
    {
        return "Whatever you want here...";
    }
}