Applications Architect @ National Geographic Education, Depressed Optimist and other oxymorons

Cache Invalidation Strategies With Varnish Cache


Phil Karlton once said, “There are only two hard things in Computer Science: cache invalidation and naming things.” This article is about the harder of these two: cache invalidation. It’s directed at readers who already work with Varnish Cache. If you are new to Varnish, you’ll find background information in “Speed Up Your Mobile Website With Varnish.”

10 microseconds (or 250 milliseconds): That’s the difference between delivering a cache hit and delivering a cache miss. How often you get the latter depends on the efficiency of the cache, which is measured by the “hit rate.” The hit rate depends on two factors: the volume of traffic and the average time to live (TTL), which is a number indicating how long the cache is allowed to keep an object. As system administrators and developers, we can’t do much about the traffic, but we can influence the TTL.
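To make the link between traffic, TTL and hit rate concrete, here is a rough back-of-the-envelope model in Python. It is only a sketch under a simplifying assumption that is ours, not Varnish’s: requests for a single object arrive at a steady rate, and every TTL window costs exactly one miss. The two latency figures are the ones quoted above; the function names are made up for this illustration.

```python
HIT_SECONDS = 10e-6    # cache hit latency quoted above (10 microseconds)
MISS_SECONDS = 250e-3  # cache miss latency quoted above (250 milliseconds)


def estimated_hit_rate(requests_per_second, ttl_seconds):
    """Rough hit rate for one object: one miss per TTL window, the rest hits."""
    requests_per_window = requests_per_second * ttl_seconds
    if requests_per_window <= 1:
        return 0.0  # the object expires before it is requested again
    return 1.0 - 1.0 / requests_per_window


def average_latency(hit_rate):
    """Mean response time in seconds for a given hit rate."""
    return hit_rate * HIT_SECONDS + (1.0 - hit_rate) * MISS_SECONDS


if __name__ == "__main__":
    # Same traffic (2 requests per second), three different TTLs.
    for ttl in (10, 60, 3600):
        rate = estimated_hit_rate(requests_per_second=2, ttl_seconds=ttl)
        print(f"TTL {ttl:>5}s  hit rate {rate:.4f}  "
              f"avg latency {average_latency(rate) * 1000:.3f} ms")
```

With the traffic held constant, raising the TTL is the only lever in this model, which is why the rest of the article concentrates on making long TTLs safe.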

However, to have a high TTL, we need to be able to invalidate objects from the cache so that we avoid serving stale content. With Varnish Cache, there are myriad ways to do this. We’ll explore the most common ways and how to deploy them.

Varnish does a whole lot of other stuff as well, but its caching services are most popular. Caches speed up Web services by serving cached static content. When Varnish Cache is delivering a cache hit, it usually just dumps a chunk of memory into a socket. Varnish Cache is so fast that, on modern hardware, we actually measure response time in microseconds!


Caching isn’t always as simple as we think; a few gotchas and problems may take quite some of our time to master.

Varnish Cache is one among a wide variety of products by Varnish Software.

When using a cache, you need to know when to evict content from it. If you have no way to evict content, then you have to rely on the cache to time out each object after a predetermined amount of time. This works, but it is hardly the best solution. The best way is to let Varnish Cache keep the object in memory (more or less) forever and then tell Varnish when to refresh it. Let’s go into detail on how to achieve this.

HTTP Purging

HTTP Purging is the most straightforward of these methods. Instead of sending a GET /url to Varnish, you would send PURGE /url. Varnish would then discard that object from the cache. Add an access control list to Varnish so that not just anyone can purge objects from your cache; other than that, though, you’re home free.

acl purge {
    "localhost";
    "192.168.55.0"/24;
}

sub vcl_recv {
    # allow PURGE from localhost and 192.168.55...
    if (req.request == "PURGE") {
        if (!client.ip ~ purge) {
            error 405 "Not allowed.";
        }
        return (lookup);
    }
}

sub vcl_hit {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        purge;
        error 200 "Purged.";
    }
}
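On the application side, issuing the purge is just an HTTP request with a different method. The following is a minimal sketch using Python’s standard http.client module. The function name and its parameters are our own invention; the site argument is there because Varnish normally includes the Host header when it looks up an object, so we assume the purge should carry the same Host as the original request.

```python
import http.client


def purge(host, port, path, site, timeout=2.0):
    """Send `PURGE <path>` to one Varnish server and return the HTTP status.

    host/port address the Varnish instance itself; site is the Host header
    the object was cached under (an assumption about your setup).
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("PURGE", path, headers={"Host": site})
        response = conn.getresponse()
        response.read()  # drain the body before closing the connection
        return response.status
    finally:
        conn.close()
```

With the VCL above, a call such as purge("127.0.0.1", 6081, "/news/article-1", site="www.example.com") should come back with 200 from an allowed address and 405 from anywhere else. The port and host names here are placeholders.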

Shortcomings of Purging

HTTP purging falls short when a piece of content has a complex relationship to the URLs it appears on. A news article, for instance, might show up on a number of URLs. The article might have a desktop view and a mobile view, and it might show up on a section page and on the front page. Therefore, you would have to either get the content management system to keep track of all of these manifestations or let Varnish do it for you. To let Varnish do it, you would use bans, which we’ll get into now.

Bans

A ban is a feature specific to Varnish and one that is frequently misunderstood. It enables you to ban Varnish from serving certain content in memory, forcing Varnish to fetch new versions of these pages.

An interesting aspect is how you specify which pages to ban. Varnish has a language that provides quite a bit of flexibility. You could tell Varnish to ban by giving the ban command in the command-line interface, typically connecting to it with varnishadm. You could also do it through the Varnish configuration language (VCL), which provides a simple way to implement HTTP-based banning.

Let’s start with an example. Suppose we need to purge our website of all images.

> ban obj.http.content-type ~ "^image/"

The result is that every object in memory whose Content-Type response header matches the regular expression ^image/ is invalidated immediately.

Here’s what happens in Varnish. First, the ban command puts the ban on the “ban list.” When this command is on the ban list, every cache hit that serves an object older than the ban itself will start to look at the ban list and compare the object to the bans on the list. If the object matches, then Varnish kills it and fetches a newer one. If the object doesn’t match, then Varnish will make a note of it so that it does not check again.
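The mechanism described above can be modelled in a few lines of Python. This is a toy model to illustrate the bookkeeping, not how Varnish is implemented internally: the class names, the logical clock and the last_ban_checked field are all made up for this sketch, and bans here can only test one stored response header.

```python
import re
from dataclasses import dataclass, field
from itertools import count

_clock = count(1)  # logical clock: a larger number means "created later"


@dataclass
class CachedObject:
    url: str
    headers: dict
    created: int = field(default_factory=lambda: next(_clock))
    last_ban_checked: int = 0  # newest ban this object has already passed


@dataclass
class Ban:
    header: str
    pattern: str
    created: int = field(default_factory=lambda: next(_clock))

    def matches(self, obj):
        return re.search(self.pattern, obj.headers.get(self.header, "")) is not None


class BanCache:
    def __init__(self):
        self.objects = {}
        self.ban_list = []  # oldest ban first

    def store(self, url, headers):
        self.objects[url] = CachedObject(url, headers)

    def ban(self, header, pattern):
        self.ban_list.append(Ban(header, pattern))

    def lookup(self, url):
        """Return the cached object, or None if it is missing or banned."""
        obj = self.objects.get(url)
        if obj is None:
            return None
        for ban in self.ban_list:
            # Skip bans older than the object and bans it has already passed.
            if ban.created <= max(obj.created, obj.last_ban_checked):
                continue
            if ban.matches(obj):
                del self.objects[url]  # banned: the caller has to refetch
                return None
            obj.last_ban_checked = ban.created  # make a note, don't recheck
        return obj
```

Note that nothing happens at the moment the ban is issued; the work is done lazily on the next lookup, and objects stored after the ban are never tested against it.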

Let’s build on our example. Now, we’ll only ban images that are placed somewhere in the /feature URL. Note the logical “and” operator, &&.

> ban obj.http.content-type ~ "^image/" && req.url ~ "^/feature"

You’ll notice that it says obj.http.content-type and req.url. In the first part of the ban, we refer to an attribute of an object stored in Varnish. In the latter, we refer to a part of a request for an object. This might be a bit unconventional, but you can actually use attributes on the request to invalidate objects in cache. Now, req.url isn’t normally stored in the object, so referring to the request is the only thing we can do here. You could use this to do crazy things, like ban everything being requested by a particular client’s IP address, or ban everything being requested by the Chromium browser. As these requests hit Varnish, objects are invalidated and refreshed from the originating server.

Issuing bans that depend on the request opens up some interesting possibilities. However, there is one downside to the process: A very long list of bans could slow down content delivery.

There is a worker thread assigned to the task of shortening the list of bans, “the ban lurker”. The ban lurker tries to match a ban against applicable objects. When a ban has been matched against all objects older than itself, it is discarded.

As the ban lurker iterates through the bans, it doesn’t have an HTTP request that it is trying to serve. So, any bans that rely on data from the request cannot be tested by the ban lurker. To keep ban performance up, then, we would recommend not using request data in the bans. If you need to ban something that is typically in the request, like the URL, you can copy the data from the request to the object in vcl_fetch, like this:

set beresp.http.x-url = req.url; 

Now, you’ll be able to use bans on obj.http.x-url. Remember that beresp turns into obj once the object is stored in the cache.
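If your application builds ban expressions as strings, a small guard can keep request-dependent bans from slipping in. The sketch below is deliberately naive and the helper names are hypothetical: it simply looks for req. in the expression (so a regular expression that happens to contain that text would be misjudged), and it assumes the URL has been copied into the x-url header as shown above.

```python
import re


def is_lurker_friendly(ban_expression):
    """True if the ban tests no req.* fields, so no request is needed."""
    return re.search(r"\breq\.", ban_expression) is None


def use_stored_url(ban_expression, header="x-url"):
    """Rewrite req.url to the copy kept on the object (obj.http.x-url)."""
    return re.sub(r"\breq\.url\b", f"obj.http.{header}", ban_expression)


if __name__ == "__main__":
    ban = 'obj.http.content-type ~ "^image/" && req.url ~ "^/feature"'
    print(is_lurker_friendly(ban))
    print(use_stored_url(ban))
```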

Tagging Content For Bans

Bans are often a lot more effective when you give Varnish a bit of help. If every object carries an X-Article-id header, then you can ban by article ID and don’t need to know all of the URLs at which the article appears.

For pages that depend on several objects, you could have the content management system add an X-depends-on header. Here, you’d list the objects that should trigger an update of the current document. To take our news website again, you might use this to list all articles mentioned on the front page:

X-depends-on: 3483 4376 32095 28372 

Naturally, then, if one of the articles changes, you would issue a ban, like this:

ban obj.http.x-depends-on ~ "\D4376\D"
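One thing to watch with this pattern: \D has to consume a non-digit character on each side, so an ID that happens to be the first or last entry in the header has nothing to match against. The Python check below illustrates this. It assumes that \D, ^ and $ behave the same way in Python’s re module as in the PCRE engine Varnish uses, which we believe holds for this simple case. The anchored variant is our suggestion, not part of the original example.

```python
import re

HEADER = "3483 4376 32095 28372"  # the X-depends-on value from the example

original = re.compile(r"\D4376\D")           # pattern used in the ban above
anchored = re.compile(r"(^|\D)4376(\D|$)")   # also accepts start/end of header


def depends_on(pattern, header):
    """True if the header lists the article ID according to the pattern."""
    return pattern.search(header) is not None


if __name__ == "__main__":
    for header in (HEADER, "4376 3483", "3483 4376", "14376 43760"):
        print(f"{header!r:28} original={depends_on(original, header)} "
              f"anchored={depends_on(anchored, header)}")
```

Alternatively, the content management system could always pad the header with a leading and trailing space, in which case the pattern from the article works unchanged.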

This is potentially very powerful. Imagine making the database issue these invalidation requests through triggers, thus eliminating the need to change the middleware layer. Neat, eh?

Graceful Cache Invalidations

Imagine purging something from Varnish and then the origin server that was supposed to replace the content suddenly crashes. You’ve just thrown away your only workable copy of the content. What have you done?! Turns out that quite a few content management systems crash on a regular basis.

Ideally, we would want to put the object in a third state — to invalidate it on the condition that we’re able to get some new content. This third state exists in Varnish: It is called “grace,” and it is used with TTL-based invalidations. After an object expires, it is kept in memory in case the back-end server crashes. If Varnish can’t talk to the back end, then it checks to see whether any graced objects match, and it serves those instead.

One Varnish module (or VMOD), named softpurge, allows you to invalidate an object by putting it into the grace state. Using it is simple. Just replace the PURGE VCL with the VCL that uses the softpurge VMOD.

import softpurge;

sub vcl_hit {
    if (req.request == "PURGE") {
        softpurge.softpurge();
        error 200 "Successful softpurge";
    }
}

sub vcl_miss {
    if (req.request == "PURGE") {
        softpurge.softpurge();
        error 200 "Successful softpurge";
    }
}
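The difference between a regular purge and a soft purge is easiest to see side by side. The following Python toy model is a sketch of the idea only: the class and method names are invented, time does not actually pass (TTL and grace are just numbers), and the TTL and grace values given to fresh content are arbitrary.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    body: str
    ttl: float    # seconds of freshness left
    grace: float  # extra seconds a stale copy may be served if needed


class GraceCache:
    def __init__(self):
        self.entries = {}

    def purge(self, url):
        """Hard purge: the copy is gone for good."""
        self.entries.pop(url, None)

    def softpurge(self, url):
        """Soft purge: expire the copy now, but keep it around for grace."""
        if url in self.entries:
            self.entries[url].ttl = 0

    def get(self, url, fetch):
        """Serve fresh copies from cache; otherwise try the back end first."""
        entry = self.entries.get(url)
        if entry and entry.ttl > 0:
            return entry.body
        try:
            body = fetch(url)
        except ConnectionError:
            if entry and entry.grace > 0:
                return entry.body  # back end is down: serve the graced copy
            raise
        self.entries[url] = Entry(body, ttl=120, grace=3600)  # arbitrary values
        return body
```

After a soft purge, a healthy back end supplies new content on the next request, while a crashed one leaves the old copy in service. After a hard purge, a crashed back end leaves you with nothing to serve.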

Distributing Cache Invalidation Events

All of the methods listed above describe the process of invalidating content on a single cache server. Most serious configurations would have more than one Varnish server. If you have two, which should give enough oomph for most websites, then you would want to issue one invalidation event for each server. However, if you have 20 or 30 Varnish servers, then you really wouldn’t want to bog down the application by having it loop through a huge list of servers.

Instead, you would want a single API endpoint to which you can send your purges, having it distribute the invalidation event to all of your Varnish servers. For reference, here is a very simple invalidation service written in shell script. It listens on port 2000 and purges each URL it receives from three different servers (alfa, beta and gamma) using cURL.

nc -l 2000 | while read -r url
do
    # send the PURGE to every cache; -m 2 gives up after two seconds
    for srv in "alfa" "beta" "gamma"
    do
        curl -m 2 -x "$srv" -X PURGE "$url"
    done
done

It might not be suitable for production because the error handling leaves something to be desired!
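As a sketch of what slightly better error handling could look like, here is the same fan-out idea in Python. Everything in it is hypothetical: send stands for whatever function actually delivers the invalidation to one server (an HTTP PURGE, a ban over the CLI, and so on) and is passed in by the caller, and the function names are ours.

```python
from concurrent.futures import ThreadPoolExecutor


def fan_out(url, servers, send, max_workers=8):
    """Call send(server, url) for every server; return {server: result}.

    A result is whatever send() returned, or the exception it raised, so
    one unreachable cache does not stop the others from being invalidated.
    """
    servers = list(servers)

    def attempt(server):
        try:
            return send(server, url)
        except Exception as exc:  # report the failure instead of aborting
            return exc

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(attempt, servers))
    return dict(zip(servers, results))


def failed(results):
    """Servers whose invalidation raised, so the caller can retry them."""
    return sorted(s for s, r in results.items() if isinstance(r, Exception))
```

The servers are contacted in parallel, so one slow cache costs at most its own timeout, and the caller gets back a list of servers to retry or alert on rather than silent failures.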

Cache invalidation is almost as important as caching. Therefore, having a sound strategy for invalidating the content is crucial to maintaining high performance and having a high cache-hit ratio. If you maintain a high hit rate, then you’ll need fewer servers and will have happier users and probably less downtime. With this, you’re hopefully more comfortable using tools like these to get stale content out of your cache.

(al, ml, il)

The post Cache Invalidation Strategies With Varnish Cache appeared first on Smashing Magazine.


Karen McGrane on Content: The Alternative is Nothing


The history of technology innovation is the history of disruption. New technologies become available and disrupt the market for more-established, higher-end products.

We’re witnessing one of the latest waves of technological disruption, as mobile devices put access to the internet in the hands of people who previously never had that power. Always-available connectivity through PCs and broadband connections has already transformed the lives of people who have it. Mobile internet will do the same for an even larger population worldwide.

Despite examples from countless industries where disruption has taken place, it’s easy to pretend that it won’t happen to the web. Today’s mobile internet is janky. It’s slow. It’s hard to navigate. It offers only a paltry subset of what’s available on the desktop. It’s hard to imagine anyone truly preferring it.

Clayton Christensen, author of The Innovator’s Dilemma, argues that lower quality and less-than-adequate performance is, in fact, at the heart of what makes disruptive innovation happen:

In industry after industry, Christensen discovered, the new technologies that had brought the big, established companies to their knees weren’t better or more advanced—they were actually worse. The new products were low-end, dumb, shoddy, and in almost every way inferior. But the new products were usually cheaper and easier to use, and so people or companies who were not rich or sophisticated enough for the old ones started buying the new ones, and there were so many more of the regular people than there were of the rich, sophisticated people that the companies making the new products prospered. Christensen called these low-end products “disruptive technologies,” because, rather than sustaining technological progress toward better performance, they disrupted it.
Larissa MacFarquhar, The New Yorker

Disruptive technologies aren’t competitive at the start

In terms of quality, disruptive technologies don’t compete. They often have a less-polished design or are crafted from lower-quality materials; equivalent functionality (like bandwidth or memory) costs more than in earlier products; and they don’t perform as well on key metrics.

People often point at the failings of the mobile internet as rationale for why it won’t overtake the desktop web. “No one will ever want to do that on mobile” gets used to justify short-sighted decisions. Truth is, we can’t predict all the ways that people will want to use mobile in the future. Jason Grigsby, co-author of Head First Mobile Web (with Lyza Danger Gardner), says, “We can’t predict future behavior from a current experience that sucks.”

Disruption happens from the low end

Disruptive technologies take off because they create a new market for a product. People who previously could not afford a particular technology get access to it, in a form that (at least at the start) is less powerful and of lower quality. These people aren’t comparing between the more established technology and the new one. They have no other alternative.

McKinsey estimates that the mobile internet could bring billions of people online:

However, the full potential of the mobile Internet is yet to be realized; over the coming decade, this technology could fuel significant transformation and disruption, not least from the possibility that the mobile Internet could bring two billion to three billion more people into the connected world and the global economy.
Disruptive technologies: Advances that will transform life, business, and the global economy

Disruptive technologies eventually improve

Over time, the quality of low-end technology improves. As more and more people buy into a cheaper, less-capable technology, more attention and focus goes toward refining it. Eventually, it overtakes its larger, more capable predecessor.

This is the challenge we face in mobile right now. Mobile won’t always be a secondary device or a limited, on-the-go use case. Mobile will be the internet. Comparing its shortcomings to what the desktop web does well is missing the point. Mobile will be better than the desktop—but it will succeed on what it does uniquely well.

McKinsey estimates the astonishing potential economic upside of the mobile internet:

We estimate that for the applications we have sized, the mobile Internet could generate annual economic impact of $3.7 trillion to $10.8 trillion globally by 2025. This value would come from three main sources: improved delivery of services, productivity increases in select work categories, and the value from Internet use for the new Internet users who are likely to be added in 2025, assuming that they will use wireless access either all or part of the time.
Disruptive technologies: Advances that will transform life, business, and the global economy

Today, the mobile internet provides a lousy experience. For billions of people coming online across the world, it will be their first (and only) way to access the web. The history of disruptive innovation shows that it’s okay if the mobile internet provides a less-than-adequate experience today. Most mobile internet users won’t be comparing between the desktop web and the mobile web. For these people, the alternative is nothing.

Tomorrow, the mobile internet will provide a better experience. It’s up to us to make it happen.


Unheap: A Tidy Repository of jQuery Plugins


A nice-to-look-at, easy-to-use reference library of jQuery plugins.

@candicodeit

One Less JPG


Before you go worrying about how to minify every last library or shave tests out of Modernizr, try and see if you can remove just one photo from your design. It will make a bigger difference.

@rupl
1 public comment
acdha
4284 days ago
Topical for me: I've been using SASS / Compass to finally sprite more resources and inline a few images. So far I'm at ~8 fewer requests and a net 1KB increase in CSS size pre-gzip.

One other interesting option: progressive JPEG compresses significantly better than baseline JPEG for all but small images (see e.g. http://calendar.perfplanet.com/2012/progressive-jpegs-a-new-best-practice/); convert one large image and you've saved more transfer than a fair amount of code.

The major caveat to all of this advice: CSS and JS are blocking, unlike images, so unless you take care to package and load them efficiently the difference is more like 3-4 JPGs.
Washington, DC

Flawed Survey Tries To Diss Open Source, Fails


Two surveys surfaced last week that paint widely divergent pictures of enterprise adoption of open source. But based on the continued rise of open source in the enterprise, only one is likely correct.

The first comes from Univa, a data center automation company that also offers an open-source version of its Grid Engine product. Univa found that while 76% of enterprises surveyed are using open source, a full 75% experience problems running it in mission-critical workloads.

Given that so many enterprises apparently struggle to use open source successfully, one might wonder why so many persist in doing so. Back in 2008, Gartner found that 85% of enterprises were using open source, but even that high number is surely underreporting actual adoption of open source because, according to Forrester, "developers adopt open source products tactically without the explicit approval of their managers."

Conflicted Much?

Fortunately, Univa doesn't leave us to guess how to resolve this seeming conflict between mass adoption and poor quality. While open source is rarely mentioned on its website, the one page that gets a lot of open source mentions presents a highly conflicted view on open source, like the following customer testimonials:

"...we were finally able to switch our focus away from a malfunctioning [open source] Grid Engine."

"If I went to another company that was using purely an open-source Grid Engine, I would take Univa with me to assure this kind of flexibility and security. I know Univa has my back."

And this product pitch:

"Univa Grid Engine is the next generation product that open source Grid Engine users have been waiting for."

These sorts of statements would be a great way to bash one's competition, but in this case Univa's marketing is designed to bash itself. Or rather, the open-source project upon which it is based. This message carries through in its survey, which found that 64% of enterprises will pay for better quality, which translates to stability (25%) and enterprise-grade support (22%).

"That open-source product we give away? It's not very good! You should pay us instead of using our open-source software" seems to be the message.

Different Survey, Very Different Results

It's a very different message conveyed by the results of Black Duck Software and North Bridge Venture Partners' 2013 Future of Open Source survey. While vendor support was a top-three consideration in 2012 for adopting open source, in 2013 it falls to number 11, well behind competitive functionality, solid security, and better TCO as reasons to use open source.

In fact, this survey finds that "Better Quality Software," which was the fifth-placed reason for using open source in 2011, is now the top reason:

So open source goes from quality nightmare for 75% of enterprises in Univa's survey to quality king in Black Duck's survey. What gives?

Reading Between The Lines

Well, vendor motivations may help to sway the kinds of questions asked, and the recipients of the survey itself. I'm not suggesting that either company set out to skew results, but neither data sample is likely purely random.

Still, I'm more inclined to give credence to Black Duck's results, despite it being an open-source management and consulting firm. After all, open source is driving the top-three trends in enterprise computing: Big Data, cloud, and mobile. If enterprises were struggling to make open source work, they wouldn't be using so much of it, and in such business-critical areas.

Which is not to suggest that open source has "won" and all proprietary software is doomed. Indeed, according to a recent Barclays survey of IT executives, a mix of proprietary and open-source software will likely persist for some time:

But let's not kid ourselves: the days of open source failing because of a lack of enterprise support or insufficient quality are well behind us. There is no shortage of quality companies providing support for leading edge open-source software. And there is no shortage of exceptional enterprise-grade open-source software.

The proof? Open source is being adopted in droves. That's really the only number that matters in figuring out whether open source provides high-quality software.

Image courtesy of Shutterstock.

1 public comment
acdha
4284 days ago
There's a major gap here: what's the success rate for non-open-source software? I'd be surprised if the initial numbers weren't comparable, particularly if you compared TCO for OSS + developer time to enterprise software + consultant time. Most businesses run into problems because they want significant customizations, which usually applies to either approach.
Washington, DC
petrilli
4284 days ago
Depends, do you mean the failures people own up to? When you spend $15M on Siebel or SAP, how much more money are you going to throw down the drain chasing the implementation faerie? People will call SugarCRM a "failure" if they can't get it perfect in an afternoon. They'll dump $45M into Siebel and call it a "success".
acdha
4284 days ago
That was rather where I was going with TCO: it amazes me how many people will balk at, say, paying for a month or two of staff time configuring something but spend 10x the cost and get worse results paying Oracle/Siebel/SAP consultants for incredibly invasive hacks which have to be repeated on every upgrade

April 07, 2013


Last day at Skeptech! Come see me!
chachra
4288 days ago
San Francisco, CA
7 public comments
normd
4297 days ago
This can be applied to just about anything and everything.
iambyteman
4301 days ago
This is probably true and we should all keep it in mind when talking politics and religion.
Baltimore, MD
tedgould
4302 days ago
OSS in a nutshell.
Texas, USA
MourningStar888
4303 days ago
Accuracy!
Fredonia, NY
sheppy
4303 days ago
I've made this same observation. It's especially common in politics.
Maryville, Tennessee, USA
adamgurri
4303 days ago
The random shape theory of political discourse
New York, NY