The disk array

From RedBrick Wiki

The Disk Array, or as it was pretentiously titled and labeled by its constructor, JoltStorage Array Model 618, is what it sounds like: an array of Jolts. Dear Mother of crap no, I mean an array of disks, but almost as bad.

The array was first talked about around February/March 2000. Following several admin fuckups with disk space, and with Enigma, our main machine at the time, running at close to 90% usage on /home, it was decided that we needed more disk space, quick.

Following this impetus, the acquisition of the array went into the deep murky dark depths of the RedBrick Committee. The details are sketchy, and the individuals involved reluctant to come out with details. What is known is that some number of months later some people were consulted on a likely setup, myself and Grimnar among others. Over the summer, as a new committee came into force, Phil took charge of getting an array, and many dealings were had with the SPC. Eventually, after many months, the cash was secured.

Around about now, some 6 months after getting the array was first discussed, enigma's disk started to fill. It would do this about twice a week, the end result being that people's mail was bouncing and the system was pretty much unusable. "When is the array coming?", we would ask, only to be told that there were difficulties getting it ordered and delivered. Trouble with online hardware sites and so on.

In the meantime, Elmer loaned RedBrick a SUN SCSI disk he found in a skip (we think) as a temporary solution. It was mounted as /its-nah-far-me-its-far-me-ma (Bobb) on enigma, which helped things a little, though it did once physically fall down the back of the monolith, bringing enigma down with it.

Around December it was finally ordered, on David Murphy's credit card, I believe. It was delivered direct to Dave, who assembled and tested it (according to him), and in February it reached RedBrick, a good year after it had first been discussed.

Now this was no ordinary disk array. This was a piece of utter shit. Whoever had designed and bought it had approximately zero clue. The RAID controller was a bottom-of-the-line Mylex that would barely get us away with RAID5, and there was no cable that was suitable or even long enough to connect it to Enigma, oops. But worse, the disk array was built in a server chassis, easily 20 times the volume it needed to be. This meant that an extra-long, custom-made SCSI cable was needed, that it would waste a ridiculous amount of space, and that it would be a pain to move around. Minor gripes are that it wasn't rack-mountable, and that the drives/backplane were not SCA or hot-swappable.

We had no way to power this array, and we're not sure how Dave did his testing. We had a 300W ATX PSU and 6 disks but no way to tell the PSU to power on. Numerous suggestions were made, and admins knew that pins 2 and 4 could be soldered together and it would work, but instead it was decided to use an ATX motherboard, which was handy because, the way it worked out, there was a spare.

On the plus side, it did have 6 nice IBM drives, each with its own cooling enclosure. The use of a server chassis is overkill but extremely convenient for a) fitting a PSU in and b) fitting disks in 5 1/4" drive bays. Custom external drive enclosures could well have cost more. So there it was, proudly titled "JoltStorage Array Model 618", and there it sat, for 2 months.

Two months later, in late March, it was decided that an attempt would be made to bring the array online. We have no idea how much clue the people involved actually had, but they sure fucked up big time. Dave Murphy came out specially for the occasion; I'm sure he was a great help. Anyway, whatever they did, enigma's system disk got trashed, and it turned out they never even had the right cables or connectors to plug the array in in any case. So Enigma ended up going home with bobb, who cannibalised one of the array disks and restored enigma from its other system disk. Enigma was gone for over a week in what was the largest incident of downtime in the last 3 years.

About a month later, SUN decided, after work by the committee, to donate an E450 to us. Anything remotely array-related was forgotten about and all efforts were put towards the E450. As late as June, when the first in-person admin meeting (dubbed "The Architecture meeting") took place, there was still the intention to install the array. Work on the E450 took priority though, and the array slid and slid. Amazingly, Prodigy (the E450) came online in October during Clubs and Socs Week, though the move was a little sudden.

With 2 committees now having failed to get the array online, the new committee resolved to do it. Plans were put into action, and a cable and brackets to fit the array were ordered from a crowd in Cork. Again, there were difficulties ordering and getting them, but moridin managed to secure them by December.

In January, whilst testing the array, Mark somehow managed to set the PSU to 110V and plug it in. BANG, PSUs don't like that. That wasn't the only problem: the SCSI cable wasn't long enough anyway.

To consolidate hardware, the admins decided to upgrade Enigma following an unrelated CPU failure which had happened 6 months previously. A new PSU for the array was ordered, and over the course of a long weekend (to coincide with a CSD outage) in RFC technologies we swapped the motherboard in enigma, put the old one in the array, got it up and running and made sure it worked, as well as working on Prodigy and Nanny at the same time.

The shitty Mylex controller gave us some problems and was a temperamental bitch, but it fixed itself after a while, or it could have been enigma. More problematic was the cable. Because of the ridiculous stupidity of whoever designed the array, the large chassis meant that we couldn't get a cable of the length needed. We actually had to turn the chassis upside down, put it right up to enigma and go through main parts of the normally closed chassis enclosure just to test it.

Needless to say, this was not good enough. After a few weeks of searching for a supplier of SCSI cables, cns found an excellent supplier of custom-made cables based in the UK. Two new cables were ordered and they arrived in late July 2002.

In August two attempts were made to connect the array to enigma for testing; it didn't like it either time. After some intense RTFMing on my part, the problem was traced to a jumper in the first instance and then to a mysterious enigma-has-to-be-rebooted-twice effect, which pixies discovered (he has psychic admin-fu, respect his powers).

Lastly, in order to initialise the array without its missing disk (the one that was now in enigma itself), a spare 18GB drive was needed. Cns posted asking for a loan from any brickies, and John Bolger kindly turned up with one. Unfortunately this disk was SCA, but cns came to the rescue by forking out for an SCA <-> HBA adapter, and the array was initialised.

Finally, in September 2002, after 3 committees and 27 months, the array was brought online on enigma, following some last-minute trouble from the ol' bitch:

colmmacc@enigma (~) $ df -hi
Filesystem    Size   Used  Avail Capacity iused   ifree  %iused  Mounted on
/dev/da1s1a   248M   204M    25M    89%    2953   60533     5%   /
/dev/da1s1d   6.6G   4.4G   1.7G    73%  242459 1495523    14%   /old-disk
/dev/da1s1e   496M    20M   437M     4%     334  126640     0%   /tmp
/dev/da1s1h   7.8G   5.7G   1.4G    80%  285978 1745636    14%   /usr
/dev/da1s1f   992M   667M   246M    73%   30244  223706    12%   /var
/dev/da1s1g   248M    53K   228M     0%     108   63378     0%   /var/tmp
procfs        4.0K   4.0K     0B   100%      47    4069     1%   /proc
/dev/ccd0c     16G    14G   914M    94%  209858 3972412     5%   /home
/dev/da3s1a    98M    42M    48M    46%    1304   11494    10%   /enigma
/dev/da3s1h   3.0G    42K   2.7G     0%      13  391665     0%   /enigma/local
/dev/da3s1e   126M   2.0K   116M     0%       1   16253     0%   /enigma/tmp
/dev/da3s1g   5.9G   879M   4.6G    16%  110165  664873    14%   /enigma/usr
/dev/da3s2e   4.9G    50K   4.5G     0%       2  645884     0%   /enigma/home
/dev/da3s2f    67G   1.3M    62G     0%     256 17630526     0%   /enigma/var
/dev/da3s1f   252M   4.0K   232M     0%       2   32508     0%   /enigma/var/tmp

And so it was, online, amen.

touches wood

--Colmmacc

We're currently debating purchasing a new disk array. Because admins are masochists.

--lil_cain

Originally from the Encyclopedia