Archive for the ‘Projects’ Category

I dream of pervasive virtualization…

Friday, January 2nd, 2009

I dream of a day where virtualization is pervasive.

Instead of thinking about services in terms of servers, CPUs or directly mapped resources, I should be able to add virtual machines in terms of guaranteed throughput rate over a whole grid.  Scaling out should be as easy as adding a blade or racking another server.

At the low level, I should have the option of running N+N redundancy.  That is, the VM should run in lockstep across multiple machines - so if it is running on 2 vcpus, 4 in total would be used.  This would allow for any node to fail.  And the VM should be an aggregate of the low level hardware - e.g. a VM grid across 4 8-core servers should scale near-linearly when a single OS instance is running 32 processes.

Current solutions only attempt to do some of the tasks above, and most fail miserably.  IBM mainframes have been doing it for ages.

If I had the time, I know I could build software to do this better than anyone else.  All the puzzle pieces are there, especially the tough ones like hypervisors and Infiniband.  This could have been done at least 3 years ago.  I bet it will take the industry 3-4 years yet to get anywhere close.

This is a real virtual datacenter.

How to upgrade to ext4 in place

Wednesday, December 24th, 2008

Here’s how you upgrade to ext4.  The process is pretty easy, but requires an fsck which means unmounting or rebooting if the file system is in use.

Make sure you are using at least e2fsprogs 1.41.3 and kernel 2.6.28 (or a vendor kernel with the latest ext4 patches applied)!  Also, it’s probably a good idea to have proper backups (really!).  ext4 has just been declared stable, but what that really means is that the battle hardening has just begun.  I’ve converted several heavily used systems without fault so far though, so it’s probably good enough for your desktop.

WARNING: DON’T CONVERT YOUR /boot PARTITION. Right now, there is no stable version of grub with ext4 support.  Even if there were, it really wouldn’t gain you anything  :-) .

Run tune2fs, e.g.:

tune2fs -I 256 -O sparse_super,filetype,resize_inode,dir_index,ext_attr,has_journal,\
extents,huge_file,flex_bg,uninit_bg,dir_nlink,extra_isize /dev/sd[x][n]

Those are the default options for an ext4 file system if you were to create it with mkfs.ext4 (e2fsprogs 1.41.3 - see /etc/mke2fs.conf).  I’m getting pretty damn good performance with this!  The ‘-I 256’ option sets 256-byte inodes, which most recent ext3 FSs use already. If this is the case, and you get a message telling you so, remove this option.  Note that extents will make the FS backwards INCOMPATIBLE with ext3.
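To check the inode size before deciding whether to keep ‘-I 256’, one option is to read the "Inode size" field that ‘tune2fs -l’ prints.  Here is a minimal sketch; the helper names inode_size and inode_size_note are made up for illustration, and they only parse text on stdin, so they can be tried without touching a device:

```shell
# Hypothetical helper: pull the inode size (in bytes) out of
# `tune2fs -l` output supplied on stdin.
inode_size() {
    awk -F: '/^Inode size/ { gsub(/[ \t]+/, "", $2); print $2 }'
}

# Hypothetical helper: turn that size into a hint about "-I 256".
# Assumes the size is 128 or 256, the common ext3 values.
inode_size_note() {
    size=$(inode_size)
    if [ "$size" = "256" ]; then
        echo "inodes are already 256 bytes; drop -I 256"
    else
        echo "inodes are ${size} bytes; keep -I 256"
    fi
}
```

Typical use would be ‘tune2fs -l /dev/sd[x][n] | inode_size_note’ with your own device filled in.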

Next, edit /etc/fstab, e.g.:

/dev/vg/home /home ext4 defaults 0 0

Either unmount and mount or reboot your system.  tune2fs marks the fs as dirty, so it needs a fsck to complete the conversion; if one does not run automatically, run ‘e2fsck -f’ on the unmounted device yourself.
NOTICE: on distros that use an initrd, you may need to regenerate it or you won’t be able to mount your root file system.  In Fedora (replace the kernel version with your own):

cd /boot
mv initrd-2.6.27.7-134.fc10.i686.img initrd-2.6.27.7-134.fc10.i686.img.old
mkinitrd initrd-2.6.27.7-134.fc10.i686.img 2.6.27.7-134.fc10.i686
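If you would rather not type the kernel version by hand, the same two steps can be generated from it.  This is only a sketch: initrd_cmds is a made-up helper, it prints the commands instead of running them, and it assumes the Fedora-style naming of initrd-<version>.img in /boot:

```shell
# Hypothetical helper: print (not run) the commands that back up and
# regenerate the initrd for the kernel version given as $1.
initrd_cmds() {
    kver=$1
    img="initrd-${kver}.img"
    echo "mv /boot/${img} /boot/${img}.old"
    echo "mkinitrd /boot/${img} ${kver}"
}
```

Calling it as ‘initrd_cmds "$(uname -r)"’ prints the commands for the running kernel so you can review them before pasting.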

That’s all there is to it.  Stay tuned for future ext4 developments like online defragmentation.

Also, ext{2,3,4} reserve 5% of space for root in case the drive fills up.  On large modern drives, this can be excessive (e.g., 50GB on a 1TB disk).  Consider running ‘tune2fs -m 1 /dev/sd[x][n]’ to reduce this to 1%.
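The arithmetic behind that figure is easy to check for your own disk.  A small sketch, where reserved_gb is a made-up helper and 1TB is treated as 1000GB:

```shell
# Hypothetical helper: reserved space in GB for a disk of $1 GB
# with a reserved-blocks percentage of $2 (integer arithmetic).
reserved_gb() {
    echo $(( $1 * $2 / 100 ))
}

reserved_gb 1000 5   # default 5% on a 1TB disk
reserved_gb 1000 1   # after tune2fs -m 1
```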

For more information and tweaking:

  1. Documentation/filesystems/ext4.txt from the latest kernel sources
  2. http://ext4.wiki.kernel.org/index.php/Main_Page
  3. man tune2fs
  4. http://e2fsprogs.sourceforge.net/

Retrocomputing for Fun and Profit

Sunday, December 7th, 2008
  1. Buy Old Computers
  2. ???
  3. Profit

What is retrocomputing?

I define retrocomputing [wikipedia] as the collecting and use of old computers.  Why might one do this?  Well, for one, enterprises cycle out machines fairly frequently.  Systems that are two to five years old are often sent out to scrappers in droves despite still being plenty useful.  Systems that were top of the line for large companies often have more than enough power for small and medium sized ones, at pennies on the dollar compared to new hardware.  These machines are likely complete overkill for home use, but nonetheless are very useful for fun and learning.

IBM mainframe ops in the 1980s

Why?!

A lot of what I know about computers has been learned on old machines.  Hooking up a couple of servers and desktops and trying to make something useful out of them is a great exercise for the aspiring system administrator.  With open source software, it can all be done freely and easily.

Yes, you can run Linux, BSD, and Solaris from the comfort of your Windows desktop in a virtual machine (weak sauce…).  Yet it is a very different experience to cluster several high technology servers together, tether them to a Fibre Channel storage array and have them share a single distributed file system.  The knowledge of setup, installation, and troubleshooting I’ve gained from mock scenarios like this goes beyond that of anyone else I’ve ever met.  Breaking things here usually means digging deep and fixing them.  If you were to screw something up at work like some of the things I’ve gotten into, it would probably cost you your job.

BENCHNET - where I rip into computers that cost as much as a house and my "production" rack

Retrocomputing is also fun.  I am personally into old IBM hardware, though old UNIX workstations of all sorts are interesting to me.  You can see my collection of IBM PS/2 and RS/6000 knowledge here: http://ps-2.kev009.com:8081/.  There is a particular thrill to booting up a machine that cost between $20,000 and $50,000 10 years ago.  Knowing that these same machine models were used to design the Boeing 777, composed the famous Deep Blue machine, and were used in the largest automotive and shipbuilding firms, not to mention some of the most important spacecraft to date, also brings a sense of power and nostalgia.  In some ways it’s similar to having a classic car, but different.  Maybe if that classic car was a big ass bulldozer, tank, jet or some other well engineered piece of equipment :-P.

Some old systems I had at one time or another.  Left to right: IBM PS/2e (first "green" environmental pc), RS/6000 43p (7043-140), Apple PowerMac 7100/80, RS/6000 7006-42W, RS/6000 7012-397, HP Visualize c360 (PARISC)


Nostalgia is one of the biggest things I get out of using particularly old hardware.  I missed the mainframe days, the minicomputer days, the PC and DOS days, the Apple II days (well, actually I used these a bit at a very young age), and to a degree the early Windows days.  Just like a history class, studying these old machines gives me insight as to why things are done the way they are today.  It gives me appreciation for modern systems and makes me write clean and well optimized code.  The old computer games that captivated me as a child (Sim City, Sim Tower, Sim Ant, Sim Farm, Gizmos and Gadgets, The Incredible Machine, Oregon Trail etc.) implanted a high degree of logic and understanding at a young age and it is heartwarming to revisit these.  I grew up a Mac user as well, so seeing what I was (or: was not :>) missing on PCs is also interesting.

Old MIPS UNIX server booting and logging in

Some of the benefits of retrocomputing:

  • Enterprise class hardware
  • Cheap, possibly even free
  • Different design philosophies - not everything is x86 - a lot of this gear is quite different.  For example, UNIX workstations integrated most of what we enjoy on our PCs years before it became available to consumers.  SGI machines were doing A/V and 3D in the early 90s.  IBM midrange AS/400s have an advanced integrated database, programming languages, and environment that make PCs look like a joke for business programming.  WinFS, Object Storage Devices, etc. are just now being talked about for PCs.  The channel philosophy from mainframes is still pretty new to PC servers (Fibre Channel), not to mention virtualization.
  • If you break it, you can fix it and learn from it or toss it
  • The engineering and craftsmanship in some of these systems is downright astonishing
  • Old computers are works of art: they give you a window into the technology and culture of times past
  • You should never trust a computer you can’t lift

It is interesting that we as humans produce such elaborate machines, only to discard them as scrap a few years later.  It is humbling and shows you the incredible progress we are making.

How?

eBay is your friend, but also look for local scrapyards or businesses doing overhauls.

If you are faint of heart, plenty of good abandonware sites exist for games and operating systems that can be run on emulators or VMs.  Check out this IBM mainframe emulator, Hercules.  Some of the original IBM OSes are public domain.

If you don’t want old PCs and big iron overtaking your house, there is plenty of good material on YouTube as well.  The Computer Museum is a good start.  Some of the consoles, offices, and outfits are hilarious.

Old SGI tech demo - pretty impressive!

Building NAS, Part 2

Sunday, April 15th, 2007

Building your own Network Attached Storage server

In an earlier post, I discussed what was necessary to cable 1U servers up with power to run SATA drives. Next, you’ll need some drives and a controller if you have not already purchased them.

Controllers

The SATA controller you select is one of the most important pieces of the NAS box. It is my recommendation you get a high quality card, even if you are just doing RAID-1 (Mirroring). There are basically three types of controllers: standard, fakeraid, and hardware RAID. A standard controller is just a host-bus adapter. Any advanced features like striping and mirroring will need to be performed by your OS. Fakeraid includes a RAID BIOS, but all processing is left to a software driver. Hardware RAID is a complete subsystem with a dedicated CPU.

Standard controllers and fakeraid cards are generally cheap and maybe even integrated on your motherboard. However, these are rarely a good choice for servers. Aside from requiring the host to perform a lot more work, these can also cause a lot of trouble down the road. For instance, if a hard disk fails with software RAID, the system may not be able to reboot without intervention. Fakeraid may present the disks to the machine as one logical disk to avoid this, but these cards are the most evil. They tend to use closed drivers and proprietary disk formats, so avoid their RAID features like the plague. The cards themselves are fine; just use them like a standard controller with the software RAID of your operating system. That leaves us with hardware RAID controllers.

Hardware RAID controllers are the only logical choice for a server. Hardware RAID cards offload all RAID processing to a subsystem on the card. These cards have an integrated CPU to perform parity calculations for RAID-5, and usually a large amount of RAM to act as cache. The RAM can even be used as write cache, but it is important to have a battery ON THE CARD in case of a power failure or crash to avoid potentially much greater data loss. Hardware RAID controllers present their RAID volumes as logical disks to the computer and operating system, so faults are transparent to the host. They also tend to allow hot addition of storage. Finally, since the card is handling all the I/O, bandwidth requirements are heavily reduced. If I am writing to a hardware RAID-1 device, data only needs to be sent across the PCI bus once as opposed to software RAID which will need to write the data to each disk. This is a very important consideration for older machines that have limited bus bandwidth and large arrays.
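To put rough numbers on the bus bandwidth point, here is a small sketch.  raid1_bus_mb is a made-up helper and the model is deliberately simple: hardware RAID-1 sends the data across the bus once, software RAID-1 sends one copy per disk in the mirror:

```shell
# Hypothetical helper: MB crossing the host bus for a RAID-1 write.
# Args: MODE (hw|sw), DATA_MB, NUM_DISKS in the mirror.
raid1_bus_mb() {
    case $1 in
        hw) echo "$2" ;;             # the card duplicates the data itself
        sw) echo $(( $2 * $3 )) ;;   # the host sends one copy per disk
        *)  echo "usage: raid1_bus_mb hw|sw DATA_MB NUM_DISKS" >&2
            return 1 ;;
    esac
}

raid1_bus_mb hw 100 2   # 100MB write, two-disk hardware mirror
raid1_bus_mb sw 100 2   # same write with software RAID
```

With two disks the software case doubles the bus traffic, which matters most on an old, shared PCI bus.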

All things considered, I went with a 3Ware 9500S. Cost: $100 on eBay.

3Ware 9500S

3Ware has good drivers in the Linux kernel, and has been manufacturing SATA RAID controllers for some time. Other decent manufacturers are Adaptec and LSI Logic. Be sure to check for OS compatibility before purchasing a card.

Hard Drives

Just because you are getting the price point of SATA, you should not disregard the quality of drives you are purchasing. Although most standard consumer drives will work fine, there is a much more attractive option. Seagate and others offer what they call “nearline” drives which are basically the consumer drives with a RAID friendly firmware and continuous duty cycle. What’s best is that these usually only have a $10 or so premium over their consumer counterparts. When it comes to selecting a manufacturer, take a look at the warranty and technology integrated in the drives. Seagate and Hitachi are both good choices. Take a look at reviews too, especially StorageReview.

When most people purchase a hard disk, they make their choice simply on size. For a server (and even desktop!), you should also consider the spindle speed and size of the drive cache. 7,200RPM is enough for most servers, but 10,000RPM will deliver far greater performance under concurrent and random access. Of course, these usually come at a heavy price premium and lesser capacity.

Here, I went with a pair of nearline Seagate Barracuda ES drives. The 320GB model was ample for my need, but these go all the way up to 750GB. They use perpendicular magnetic recording which increases density and speed, and feature a large 16MB cache. My initial testing shows nearly 80MB/s throughput! That level of speed has traditionally only been available on expensive SCSI disks. Cost: $100/drive.

Seagate Barracuda ES 320GB

Networking

Equally important to a NAS server is ample networking bandwidth.  This is largely dependent on the scale of services, but a decent gigabit NIC should get you near disk performance.  I picked up an Intel Pro/1000 MT Server Adapter, which features various offloading schemes to free the host’s CPUs from networking tasks.  If your needs are greater, consider aggregating several gigabit ports together.  Cost: $20.
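As a rough sanity check on matching network to disk speed, the sketch below converts link speed to a theoretical MB/s ceiling and estimates how many aggregated ports a target throughput needs.  Both helper names are made up, and protocol overhead is ignored, so real numbers will be lower:

```shell
# Hypothetical helper: theoretical ceiling of a link in MB/s, given
# its speed in Mbit/s (8 bits per byte, overhead ignored).
link_mb_per_s() {
    echo $(( $1 / 8 ))
}

# Hypothetical helper: aggregated ports of $2 Mbit/s needed to carry
# a target of $1 MB/s, rounded up.
ports_needed() {
    per_port=$(link_mb_per_s "$2")
    echo $(( ($1 + per_port - 1) / per_port ))
}

link_mb_per_s 1000      # one gigabit port
ports_needed 80 1000    # a single drive at about 80MB/s
ports_needed 160 1000   # two such drives striped
```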

Final Thoughts

The total cost for storage hardware comes to about $320.

Cabling the drives up and configuring the array is a pretty straightforward task. If you chose software RAID, there are plenty of guides on the internet to assist you in setting up md and device-mapper. Try http://linas.org/linux/raid.html.
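For the software RAID route, creating an md array usually comes down to a single mdadm command.  The sketch below only prints that command so you can review it first; md_create_cmd is a made-up helper, and the device names in the example are placeholders, not a recommendation:

```shell
# Hypothetical helper: print (not run) an mdadm command that would
# create a software RAID array.  Args: MD_DEVICE LEVEL MEMBER...
md_create_cmd() {
    md=$1
    level=$2
    shift 2
    # $# is now the number of member devices, $* their names
    echo "mdadm --create ${md} --level=${level} --raid-devices=$# $*"
}

# Placeholder devices: a two-disk mirror
md_create_cmd /dev/md0 1 /dev/sdb1 /dev/sdc1
```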

When it comes to RAID levels, the choice depends on the number of drives you have and the level of protection you need. RAID-1 is a great choice for most applications in that you get true redundancy and greater read performance, but at the cost of half the physical capacity. RAID-5 is also commonly used: it needs at least three drives, costs you the capacity of one drive, and allows for the failure of a single drive. With the size of modern hard disks, RAID-5 is less attractive than it once was and also suffers from heavy write performance loss due to parity calculations. Try http://www.acnc.com/04_01_00.html for the lowdown on various RAID levels.
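The capacity trade-off can be worked out quickly.  A small sketch, where raid_usable_gb is a made-up helper that covers only the two levels discussed here and assumes identical drives:

```shell
# Hypothetical helper: usable capacity in GB.
# Args: LEVEL (1|5), NUM_DRIVES, DRIVE_GB
raid_usable_gb() {
    case $1 in
        1) echo "$3" ;;   # mirror: capacity of a single drive
        5) [ "$2" -ge 3 ] || { echo "RAID-5 needs at least 3 drives" >&2; return 1; }
           echo $(( ($2 - 1) * $3 )) ;;   # one drive's worth goes to parity
        *) echo "unsupported level: $1" >&2
           return 1 ;;
    esac
}

raid_usable_gb 1 2 320   # two 320GB drives mirrored
raid_usable_gb 5 3 320   # three 320GB drives in RAID-5
```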

My next post will cover file system selection.

IBM NetVista 2800 Hacking

Thursday, March 29th, 2007

Similar to the x330 SATA mod, I have some NetVista 2800s that I need 5v/12v accessory power from. I’m doing some experimentation with the Asterisk open source PBX and wanted to interface with the house phone lines. I purchased a Digium TDM400P for this task.

Digium TDM400P

This is a modular card, with up to 4 lines. Any combination of FXO or FXS modules can be used. FXS modules to interface with analog telephones require 5v/12v power. The NetVista 2800 has a 4 pin power header, labeled “Hardfile Power” at J11.

NetVista J11 hardfile power

The connector we need this time is a Molex 43025-0400. Here’s the pin out:

NetVista 2800 Power Connector

I also found an IBM part for the xSeries 300. Note that this is not wired correctly for either the x330 or the NetVista 2800. It will require reworking for both applications. FRU 24P0622:

xSeries 300 Hard Drive Power

And that brings me to my final point. I found a really cool product today that would be useful for both the NetVista and 1U servers.

UpgradeWare HD25-I

It takes power from the PCI bus to power a 2.5″ notebook hard disk. There are a few different models with different SATA and PATA connection combos depending on your needs. The fact that this product uses a PCI slot is its greatest advantage and disadvantage. PCI slots are usually scarce on machines where this would be useful. Nonetheless, at around $25 it is basically the same price as the IDE adapter you would otherwise need to use a notebook drive in a desktop machine.

IBM xSeries 330 (x330) SATA Retrofit

Wednesday, March 28th, 2007

This post documents the process of putting SATA hard disks into older 1U servers. At first, I thought it would be quick and easy, but I will share some of the pitfalls I experienced along the way.

I acquired a pair of x330 servers this summer, basically missing only the CPUs and heatsinks. CPUs were not difficult to obtain, but it did take some trial and error to get correctly oriented passive heatsinks. Before putting these into service, I took the opportunity to bring some much needed larger NAS online for the home network by adding some large SATA drives.

x330 hard disks - old and new

After buying a good set of drives and a controller, I started running into some snags. The most obvious was the lack of standard 5v/12v accessory power. The x330 has a 24-pin header that provides power to the SCSI backplane. Using a multimeter, I was able to locate the necessary contacts. If your server lacks Molex power connectors as well, you will need to trace these down. My best advice is to triple check your readings and use an older drive to test the connection before you hook up your new drives. Nothing like all that work, followed by a puff of the magic black smoke flying out of your new drives :). The right eight pins are the ones we want.

x330 J8 Power Header

To make a nice cable, you will need a Molex 43025-2400 connector and 4 to 8 pins. While I waited for mine to come in, I just pressed a power cable wire by wire into the connections to make contact.

x330 Power Wiring

The next potential point of struggle is getting the drives mounted in the trays. The x330 trays have a plastic SCSI SCA pass through on the end. These were held on with a pair of security Torx screws, so you may need to get creative if you do not have the correct bit. The drives fit nicely at this point, but the SATA power connection was blocked by a metal corner of the tray. A quick bit of work with a file fixed this.

x330 Tray Mod

Pull the old hotswap backplane and cable up the new drives. At this point you should power on and see how it goes. Hopefully you wired the power cable correctly.

Now your server is capable of accepting inexpensive, high capacity SATA drives! The basic process should be the same for any 1U server lacking power connectors, though you may need to dig deeper to get to the 5v/12v source (power supply is a surefire bet).