April 14, 2008

Persistent Storage for Amazon EC2

Last night, Amazon announced that they're adding a persistent storage capability to their EC2 service. To review, EC2 provides the ability to create virtual servers on the fly. These servers are a bit ephemeral, however. They can fail at any time and don't provide any persistent, local storage of their own. If an EC2 instance fails, you have to completely restart it, losing any data it may have been working on. Amazon's S3 service is persistent storage, but it is not designed to be accessed as local storage by EC2 instances. The newly announced persistent storage capability is designed to solve this issue. It's like an on-demand S.A.N., but with more flexibility. One of the really nice things about it is the ability to checkpoint a persistent volume to S3. This is great for database backups, among other things. No performance numbers have been published yet, but those who have been using it say the performance is good. This makes Amazon Web Services even more interesting, because it's now easier to run a normal MySQL instance without having to do something like running some kind of replication just to deal with the non-persistent local storage. And it scales up.

See Werner Vogels' announcement of the persistent storage service, and RightScale's analysis of it, for more information.

March 07, 2008

Laptop Bag Recommendations?

Dear Lazyweb,

I'm in need of a new laptop bag, something on the smallish side. It needs to fit my MacBook Air and its power adapter, my Kindle and its power adapter, my Bose headphones, possibly a small mouse, and a couple of cords. So, not much. Any recommendations? I'd like to see it in person before I buy it, and I'd like to get it this weekend, so that eliminates mail-order places like WaterField.

December 14, 2007

Database Developments (new post on Startupping)

I just wrote a new post over on Startupping about two items related to databases and Internet services. I talk about SSDs and the launch of Amazon's new SimpleDB, which I think is a very big deal.

November 12, 2007

Keyboard of Choice

As someone who has done a considerable amount of typing in his lifetime, I've become a bit of a keyboard connoisseur. My first computer was one of the original IBM PCs and one of the great things about that computer was the keyboard. It was this heavy, steel beast of a thing. The keys were well spaced and had the perfect amount of tactile feedback and travel. Maybe it was because I learned to type on that keyboard, but I haven't found a better keyboard since. Luckily, a company called PC Keyboard purchased the technology and designs for this keyboard and is selling new versions. I just ordered my third 'Customizer 104/105' keyboard from PC Keyboard for a new iMac. If you're unsatisfied with your current 'board, I recommend checking one of these keyboards out.

June 22, 2005

AlwaysOn Open Media 100

I'm flattered to be included in the AO/Technorati Open Media 100 in the Toolsmiths category. Thanks to everyone involved!

October 28, 2004

HP and Sleepycat

Previously I had mentioned how we were having problems with Seagate drives. On the flip side, I'd like to point out two exceptional companies, HP and Sleepycat Software.

In production, we use several HP Procurve switches. These are devices that connect the various machines in the Bloglines cluster. Like some other aspects of our cluster, we've bought several of these off eBay, which, even several years after the bubble burst, is still a good source for cheap computer hardware. Recently, a newly purchased used HP switch died on us. There were two exceptional things about this. First, the switch continued to operate, but in a reduced way. Specifically, we couldn't access the management functions of the switch. So even though the switch had 'lost its brain', it continued to do the basic functions of keeping the network going. This highlights exceptional design on HP's part. The second exceptional thing about this event was HP's support. They overnighted us a replacement switch, no questions asked, no receipt needed. The switches have lifetime warranties, which apparently apply even to secondary owners. Amazing. They made true believers out of us. At least for our networking gear, we're HP purchasers for life.

The second company I want to mention is Sleepycat Software. Sleepycat makes the database software that powers large parts of Bloglines. Sleepycat perhaps isn't as well known as MySQL or Postgres, but their software is very fast and bulletproof, and their support is top notch. They deserve more attention. Some people dismiss their database because it doesn't have a SQL query engine. SQL is fine for ad-hoc queries. But I can guarantee that not a single query that we run on our databases is ad-hoc, by definition. We have a defined set of database APIs, and we want them to run as fast and reliably as possible. So why take the 10x or more performance hit of a SQL engine or risk the bugs inherent in a more complicated system? With Sleepycat, you get a fully ACID compliant database system, with hot and cold backup capabilities, as well as a full replication system. And it's open source.
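To make the 'defined set of database APIs' idea concrete, here is a minimal sketch in Python of a fixed API layered over a plain key-value store. Everything in it is an assumption made for illustration: the `SubscriptionStore` class, the `subs:` key layout, and the JSON encoding are hypothetical, not the Bloglines schema and not Sleepycat's API. A plain dict stands in for the store; any mapping of bytes keys to bytes values would work the same way.

```python
import json


class SubscriptionStore:
    """A fixed set of lookups over a key-value store; no query language involved.

    Hypothetical example: the key layout and JSON values are illustrative only.
    """

    def __init__(self, kv):
        # kv is any mapping from bytes keys to bytes values (a dict here).
        self._kv = kv

    @staticmethod
    def _key(user_id):
        # One record per user, addressed directly by key.
        return f"subs:{user_id}".encode()

    def get_subscriptions(self, user_id):
        """Return the user's feed URLs, or an empty list if none are stored."""
        raw = self._kv.get(self._key(user_id))
        return json.loads(raw) if raw is not None else []

    def add_subscription(self, user_id, feed_url):
        """Add a feed URL for the user; adding the same URL twice is a no-op."""
        subs = self.get_subscriptions(user_id)
        if feed_url not in subs:
            subs.append(feed_url)
            self._kv[self._key(user_id)] = json.dumps(subs).encode()


store = SubscriptionStore({})
store.add_subscription(7, "http://example.com/rss")
print(store.get_subscriptions(7))
```

The point of the sketch is that every access path is known ahead of time, so each call is a direct key lookup rather than a query that has to be parsed and planned.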

I don't want to start sounding like a commercial, but I thought both of these companies deserved to be highlighted based on my experiences.

September 24, 2004

Seagate Has A Problem

At Bloglines, we have 3 classes of machines in our cluster. We've got web boxes, which are pretty lightweight. We've got storage class machines, which as you can guess have big drives and medium speed processors. And we have database class machines, which have fast processors, fast disk, and lots of ECC memory.

Fast disk, in general, means some form of SCSI. The database machines use Ultra SCSI drives, specifically Seagate Cheetah Ultra320s in a RAID configuration. Unfortunately, we've experienced something like a 40% failure rate on these drives. Because of the RAIDs, this hasn't resulted in any loss of data or downtime, but it's still unacceptable.

The drives have a 5 year warranty, so we've been shipping them back to Seagate. In return, we receive 'repaired' drives from Seagate. Recently, one of those repaired drives failed within one minute when installed in a machine. My suspicion is that part of the problem is that Seagate isn't doing much of a job to fix drives that are sent back for repair.

Speaking of which, when sending a drive back to Seagate for replacement, you can call them up and ask for the 'advance replacement option'. This means that they send out a 'new' drive before they receive your old drive. This speeds up the replacement process. Before today, we were able to get a customer support rep on the phone directly and specify the advance replacement option immediately. But now, apparently Seagate is outsourcing their first-tier customer support, so now when you call them up, they ask for your details and then say someone will be in touch within 24 hours. Which, if calling on a Friday, probably means Monday.

We'll never purchase Seagate Ultra SCSI drives again. The risk is too high.

April 01, 2004

GMail

So I guess that Google's new Gmail web-mail service isn't a hoax after all. Kudos to them for the publicity stunt of announcing on April Fools' Day.

More importantly, it sounds like they've got the right idea about storage, giving each user 1 gigabyte of storage. I think this is absolutely the correct thing to do. Economically, it doesn't cost Google much to provide this (storage approaches free over time, and most people won't use up that gig, at least not immediately). And it really ties the user to Google's service. If I've got a gigabyte of old email on Google, that's a very strong incentive to continue to use the service.

It will be interesting to see how Yahoo and MSN/Hotmail respond to this. They've both made a business out of charging extra for more than a very small amount of storage.

I've said it before, but I'll say it again, modified slightly. When designing a service, assume hardware is free. Assume processing power and storage are infinite. Because they approach that over time, and limiting them does your service more harm than good. In addition, I think at this point you can also assume that bandwidth is free. That certainly wasn't the case in the mid-1990s. But there's now a glut, you can get very good deals on bandwidth these days, and it's only getting better.

March 20, 2004

Technorati

I attended the Future Salon presentation by Dave Sifry on Technorati last night. This was the first time I had met Dave, and he struck me as one of those instantly likeable types. It was a good presentation and I have a lot of respect for what they're doing at Technorati. Even though there is some functionality overlap between Bloglines and Technorati, which will probably increase over time, I think we serve different audiences. Anyways, Dave had some very nice things to say about Bloglines, and I was quite flattered.

March 09, 2004

Better To Be Lucky Than Good?

During the recent move of Bloglines from Equinix in San Jose to AT&T, we retired a couple of machines and added several new machines. Yesterday, we were reconfiguring what used to be one of the primary database machines at the old co-lo. While swapping out drives, I noticed that the wire for the speaker was completely melted. Looking more closely, the speaker had shorted against the metal chassis. That couldn't have been good for things. And now, for whatever reason, that machine is completely flaky and crashes every hour or so. While it was functioning as one of the database machines at the old co-lo, it crashed a grand total of once in about a year of heavy operation.

This was one of the machines I originally bought used off eBay. I got it cheap and it worked really well for a year, so no complaints. eBay is still a good source for cheap gear, although it seems like there's less good stuff (computer-wise) these days than, say, a year ago.

February 16, 2004

Web Programming Theory

Warning, this is a nerdy post. More so than normal, even.

As I was visiting a site this morning that was having problems and spitting out random PHP error messages, I was reminded that I wanted to write up something about web programming. No, I'm not going to name the site, but they should know better. And this is not a post bashing PHP, because, like many things, PHP can be used for good as well as evil.

During one of our web redesigns at ONElist, a couple of our very smart engineers developed a new templating system, called CS/HDF, named after the two types of files that are involved. Later, Dave Jeske and Brandon Long wrote an open-source version of CS/HDF, called Clearsilver. It's an excellent implementation, and it's what we use for Bloglines.

I won't go into explaining how Clearsilver works; the web site's got some pretty good docs in that regard. But I want to mention the design philosophy of the system. Basically, Clearsilver forces the developer to separate application logic from the presentation layer. What does that mean? Before Clearsilver is given the chance to build a web page, all variables required for building that web page are populated. Once Clearsilver takes over, no additional data is input, no database calls are made, nothing. By the time that Clearsilver is processing the page, the application knows whether it was able to fetch all the data and whether there were any errors in processing it.

How is this good? A couple of reasons. First, since the display layer is already extracted, it's very easy to redesign the look of or internationalize the web site. No need to worry about messing up any application logic while editing templates. Also, there will never be any mysterious error messages showing up in the middle of the page whenever something fails. By the time the template is processed you know for certain whether you have all the information you need to display the page. That makes it easier to craft error dialogs. Ever been surfing a web site, and you see half the page come up, and then wait 30 seconds for the rest of the page to complete? That happens when the data required for the web page is fetched after the page has started to display. That makes it much more difficult to craft good error dialogs. If I want the look of the page to completely change if there's an error, I can't do that if I find out about the error after half the page has already been sent to the client web browser.
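The fetch-everything-first approach can be sketched in a few lines. This is not Clearsilver's API, just a rough illustration in Python using the standard library's `string.Template`; the function names, the page templates, and the `load_feeds` callback are all hypothetical.

```python
import html
from string import Template

# Hypothetical templates; a real site would load these from files.
PAGE = Template("<h1>$title</h1>\n<ul>\n$items\n</ul>")
ERROR_PAGE = Template("<h1>Something went wrong</h1>\n<p>$message</p>")


def fetch_page_data(load_feeds):
    """Phase 1: gather every variable the template needs, or fail as a whole."""
    feeds = load_feeds()  # may raise; nothing has been sent to the client yet
    items = "\n".join(f"<li>{html.escape(name)}</li>" for name in feeds)
    return {"title": "My Feeds", "items": items}


def render(load_feeds):
    """Phase 2: pure templating with no I/O.

    Because all data was fetched up front, an error can swap in a completely
    different page instead of appearing halfway through a partial one.
    """
    try:
        data = fetch_page_data(load_feeds)
    except Exception as exc:
        return ERROR_PAGE.substitute(message=html.escape(str(exc)))
    return PAGE.substitute(data)


print(render(lambda: ["Slashdot", "Boing Boing"]))
```

Nothing here enforces the discipline the way a dedicated templating system does; the split between the two functions is a convention you have to keep up yourself.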

Again, I'm not bashing PHP. I believe you can program in this style using PHP as well, but it's not enforced. The language is not important (although Clearsilver is very good and deserves more recognition). The separation of application logic from display is what matters.

February 10, 2004

Technorati?

I'm confused by something I'm seeing with Technorati. If I search on www.bloglines.com, it says that there are '1106 inbound blogs'. But when I go to the Top 100 listing, Bloglines doesn't show up. If I'm reading things correctly, we should appear at about number 64, right after Gawker. But we're nowhere. Strange. Maybe I'm misunderstanding something?

January 25, 2004

Thoughts on Orkut

Everyone else is talking about Orkut so I figured I'd add my two cents. I played with it a bit this morning. The interface is ok, but I'm surprised they didn't go for more of a 'Google-like' look. I think that would have served them better. I was very surprised to see them using Microsoft ASP. I'm not terribly familiar with Microsoft's web hosting solutions, so I won't comment on the technical merits. But it's really surprising that Google would be willing to put up with the cost of a Microsoft-based service. Imagine all the licenses needed as the service scales up. Google believes in throwing tons of small servers at a problem; imagine if each required a paid copy of Windows.

As of this evening, it looks like they've taken the service down due to scaling problems. I did notice a few slow page loads this morning. I don't wish scaling problems on anyone. Success is good, but not being able to handle the traffic is an experience you never want to have to go through a second time. A database/registration driven system, like Orkut or Bloglines, is a very different animal to architect than a search service like Google. Fun to do, but very challenging.

November 23, 2003

My New Phone

So, like apparently everyone else, I got a Treo 600 yesterday. I only upgrade phones every 4-5 years, so this is a big deal (my last phone was a Nokia 6120). I'm with AT&T Wireless, and finding and purchasing the phone was very difficult. Most stores in the SF Bay Area haven't gotten them yet, or the few that have received a shipment sold out quickly. I got the last one from the Santa Clara store.

The form factor is great, the keyboard is very workable, and the sound quality is a huge improvement over my last phone. The speakerphone even seems to work ok. I love the fact that I can ssh into machines at work from this little device. You wouldn't want to code an app with it, but it's a good safety net.

The downside is that, as it's a GSM phone, the coverage is almost non-existent in my house. I have found a couple of places where it works. Luckily one of those places is upstairs within hearing distance of our bedroom, so midnight pages won't go unheard. I have 30 days to decide whether there is enough coverage at home to keep the phone. I really want to keep this thing, but the lack of coverage might be a deal-breaker.