WWDC2003 Session 626

Transcript

I'm really glad to be here today to talk to you about Xserve RAID and deploying it. My name is Alex Grossman; I'm a director of hardware storage for Apple, and today I think we're going to have a little bit of fun.

Let me start by saying that there's a high-IQ IT manager who's actually out there, and there are a couple of things that he's discovered. The first thing regarding storage is that today we're seeing storage needs grow a hundred percent yearly. That's not only because everybody's downloading from the iTunes Music Store; it's really because people are actually building more content, and a little bit because they're downloading from the iTunes Music Store. The other thing we're seeing is that managing storage is becoming a full-time job. It used to be that you deployed a server and you had a hard drive in the server, and maybe you had another hard drive for storage, and you kept deploying servers and every one had storage, and as you managed the server you managed the storage. But as things move to more network-attached storage and more direct-attached storage, and also storage area networking, storage management becomes a full-time job. And then the other thing is that people are starting to classify storage, so there are things coming up called storage classes. We'll talk a little bit about what storage classes are; what they attempt to do is reduce the cost and complexity in the IT world. And of course configuration planning: that's the one thing that's key to efficient deployments, because today a lot of people deploy storage in a haphazard way. It's not on purpose; it's just that storage needs are growing so fast, how do we keep up? And of course storage cost. Well, Apple with Xserve RAID has really reduced the cost of storage, and we've gotten a lot of heat from our competitors on that.

So what are we really doing about that? Well, the first thing is that Xserve RAID is the highest density in a 3U space. In fact,
I have about 17 terabytes of storage up here on the stage with me, and it's a little warm, but it's powerful; it feels really good. And of course fast deployment and easy-to-use tools make storage installation and management a lot easier. So if you're going to deploy something, you want to deploy it fast. One thing you have to remember about storage is that you deploy it once; you don't touch it every day. Most of you will be in Word, you'll be in Excel; if you're coding, you're going to use the same tools every day. But storage you set up once, and the only time you go back to it is when you have a problem or when you want to add more. So if you have tools that are easy to use, it just makes that faster. And of course Xserve RAID has been built to have the maximum versatility and flexibility, to be able to get you over this storage class issue and be used in many different areas. And then of course, when it comes to configuring and planning, there's just no substitute for hard work; you've just got to do it. Now, there's a methodology you should follow, but you just have to do it. And then of course Xserve RAID is inexpensive.

So what we're going to try to go through today is some really simple stuff. We're going to introduce you to Xserve RAID, if you haven't seen the product, and give you an idea of why Apple did it. It's kind of interesting: Apple had not really been in the high-availability storage business until this year. In fact, Xserve RAID only launched four months ago, and if you look at most other vendors of all kinds of hardware, including servers, they've been in the RAID business for many years. Then we're going to go through a little bit of speeds and feeds, not a lot, but just a little bit about the architecture, because it's completely different from anything that anyone else has done. And then we're going to talk about some of the applications in IT and some of its storage classing, talk about planning considerations, and then of course deployment and
configurations. And then I'm going to give you a sneak peek at RAID Admin, the newest, latest version 1.1, which isn't out yet, so you get an idea of what we're doing to progress. RAID Admin is our management tool for Xserve RAID, and you'll get an idea of how we're progressing that along.

So the first thing is an intro. What is Xserve RAID? Well, there are a lot of them here; these are the big boxes. It's a 3U-high box: that means it's five and a quarter inches tall and nineteen inches wide. Fully loaded, it weighs about a hundred and ten pounds, so the first thing to remember when deploying one is don't do it yourself; it's the buddy system. And never put them in the top of the rack. That's always a good thing, and if you do, don't stand under it. It's massive in size, but it's also massive in capacity: it boasts today the highest capacity in a 3U system. Of course that changes every day; hard drives get faster and bigger, and of course Apple is always on the cutting edge with that. We did that with Xserve: we started with 60-gig hard drives and now we have 180-gig hard drives, the same as with Xserve RAID.

And then the other thing is that Xserve RAID is a high-availability storage system that uses an advanced architecture that is unlike what a lot of other people have done. In fact, we really think it's a precursor to what we'll see from competitors in the future. It uses ATA hard drives and Fibre Channel for the connection to the host. It's something you're hearing more and more about, but Apple was really the first tier-one vendor to pioneer it and put it into a high-availability state. And then of course there's Apple's ease of use, with remote monitoring and management similar to what we've done with Xserve. In every generation we think we're getting a little better, and I'm going to give you a sneak preview of our next generation of remote monitoring and management. And then of course industry-leading value: we're talking two and a half terabytes of storage
for around 10 grand. It's not too bad.

Now, we designed Xserve RAID for non-stop operation. There are a lot of definitions of what that means; basically, it's a high-availability, fault-tolerant architecture. That means if you have a power supply fail in the system, there's a redundant one to take over. If you have a cooling unit fail in the system, there's redundant cooling. You can build protected RAID sets, so if you have a hard drive fail, there's another hard drive to take over for it. And we even have hot sparing of hard drives, so that if one fails, another one will take over, rebuild, and completely protect you. And that's why we designed the system.

And then of course we wanted to make it complete. The thing we realized is that Fibre Channel storage was pretty expensive, and it wasn't just the storage systems, it was the infrastructure. So we lowered the cost not only of the storage but also of the infrastructure. We built a Fibre Channel PCI card that we sell at about a third of the price of what others sell it for, and we can't figure out why they charge so much. And then of course we include everything you need to complete it and work with an Xserve, a G4, or a G5. And it's not just for the rack, because not everybody has a rack installation, and although I think everyone should run an Xserve, not everybody's going to run an Xserve. There are other deployment areas where Xserve RAID is great, such as the video area. So you can take a product like the Extravert from Xtreme Mac that takes the Xserve RAID and puts it on its side: two and a half terabytes on your desktop or under your desk. It's not terrible.

Well, let's talk about speeds and feeds. I was going to give you all the speeds and feeds, and I decided, you know what, let's really look at the numbers and see what they mean, because speeds and feeds are useless if you can't figure out how they work in your application. So we have fourteen independent drive channels. What does that mean? We say that all the
time. There are fourteen hard drives in the Xserve RAID (I'll save you the counting), and there are fourteen independent drive channels. Why do we do it? Well, first of all, we didn't want any inter-drive dependencies. Why? Because any time you have a dependency between one drive and another, you limit the reliability of the system. If you've used SCSI before (most RAID systems are SCSI or Fibre Channel), all the drives have dependencies. A SCSI bus is a dependent bus: the drives all share a common bus, and if something happens on one side of the bus, guess what, it can affect the other side of the bus. Well, here every drive is on its own bus; there are fourteen independent ones. People who do very high-end servers and storage systems, and people who work in the video world, learned a long time ago that you put independent buses in for performance. It's the reason they have separate, independent PCI buses on very high-performance computer systems.

And also the bandwidth: there's 1,400 megabytes per second of internal bandwidth. How do we get there? That's a speed-and-feed number. Well, if there are fourteen independent channels and each channel is 100 megabytes a second, this is first-grade math: 100 times 14 is 1,400 megs a second, and that's a burst rate.

And individual spin-up and spin-down control. Who cares? Well, hard drives take a lot of power. You can imagine that it's costing us a lot to run 17 terabytes here. Hard drives take a lot of power, and to make it easier to deploy them in a rack, you want to limit the amount of power surge you have, or what they call the break current. You don't want to blow a breaker every time you turn on a rack of Xserve RAIDs, one big rack of them, or two or three. So what we do is individually spin the drives up. That also makes it a lot easier on your UPS, because if you ever have an outage and the UPS has to kick in, and we have to actually spin the drives down and bring them back up, the UPS doesn't take a big load, so your run time is longer.

And a passive data
path through the midplane. What the heck is that? First of all, what's a midplane? Well, computer systems have backplanes. What's cool about the backplane? Everything plugs into it. On the RAID system, the backplane is in the middle, and because it's in the middle we call it a midplane. And what plugs into it? Well, everything: the hard drives plug into it, the power supplies plug into it, the RAID controllers plug into it, the cooling units plug into it. It's all in the middle. But what's nice about it is that there's an independent path from every hard drive to the RAID controller going through the midplane, and it's passive; it's like a connector. On other RAID systems there's actually logic on the midplane. Is that good? Well, it's good until the logic fails, and then you have a single point of failure. So we tried to eliminate single points of failure for reliability.

And how about a dedicated 64-bit, 66-megahertz PCI bus to the drives? I talked about 1,400 megabytes a second of internal bandwidth. Well, I lied. When we get to the RAID controllers, each RAID controller has 533 megabytes a second of internal bandwidth, so it's really about a thousand megabytes a second of internal bandwidth within the system. And because we have an independent coprocessing system in each one of the RAID controllers, events that happen on the PCI bus from the drives don't affect the performance of the system.

Now, the other area that we talked about is dual independent RAID controllers. What are they? Essentially it means that each controller only has to manage seven drives. We have two of them and fourteen drives, so each one manages seven drives. It's less work for the RAID controller; they're happy, they get to work 9:00 to 5:00. But what that really means is that it leads to higher throughput. The other thing we do is put real cache on the RAID controller. When we say there's up to 512 megabytes of cache in each RAID controller, that
means you have a usable gigabyte of cache in the system. Now what do I mean by that? Well, this is an advantage of having independent RAID controllers: because they're independent, there's no cache coherency between them, and we don't have to copy the data. So if someone says, "I have a gigabyte of cache, but it's coherent," that means they really have 512 megabytes of cache. There's good and bad in that. The good is that the cache is copied somewhere; the bad is that you can't use all the cache you need for performance, and cache is really important to performance in some areas. I'll talk about that.

And then we also have a dynamic cache scheme. What does that mean? Dynamic: it changes; we can change it. I'll show you how we do that. And we also have performance tuning that's a combination of automatic and manual. Now, doing it automatically doesn't mean that you guys don't know what you're doing, but because we can do it automatically, we can react instantly; you don't have that reaction time. We can look at the data coming in and change the parameters of cache, prefetch, and other really important aspects of the RAID system on the fly, just by looking back at the operating system and saying, "Hey, what kind of data are you sending me?" So it's really important.

And then what I think is the most important is that we have a management coprocessor; in fact we have dual redundant management coprocessors. So we offload all those tasks, all that ease-of-use management, all those events that come up, the RAID setup and the event logging, to a completely separate processor, so the RAID controller doesn't have to worry about it. Yet it knows about it, and it can query it to find out what's going on. That's really important, and it's a little different from what most people do.

And then of course throughput. This is the real speeds and feeds. Throughput is a direct function of the complexity of the data being delivered. What does that mean?
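As an aside on the speeds and feeds quoted so far, the arithmetic can be checked with a short sketch. This is only an illustration in Python: the variable names are made up for the example, the figures are the ones given in the talk, and the halving of mirrored cache assumes a coherent design keeps one full copy of each controller's cache on the other controller.

```python
# Figures quoted in the talk; all names here are illustrative only.
DRIVE_CHANNELS = 14
CHANNEL_BURST_MB_S = 100         # burst rate per independent ATA channel
CONTROLLERS = 2
CACHE_MB_PER_CONTROLLER = 512    # "up to 512 megabytes" per controller

# Peak rate of a 64-bit PCI bus at a nominal 66 2/3 MHz clock:
# 8 bytes per transfer * (200 / 3) million transfers per second.
pci_peak_mb_s = round((64 // 8) * 200 / 3)

drive_side_burst_mb_s = DRIVE_CHANNELS * CHANNEL_BURST_MB_S
controller_side_mb_s = CONTROLLERS * pci_peak_mb_s

# Independent controllers: every megabyte of cache holds distinct data.
independent_cache_mb = CONTROLLERS * CACHE_MB_PER_CONTROLLER
# Assumption: a coherent (mirrored) design stores each entry twice,
# so only half of the installed cache is usable.
mirrored_cache_mb = independent_cache_mb // 2

print(drive_side_burst_mb_s)   # drive-side burst bandwidth, MB/s
print(pci_peak_mb_s)           # per-controller PCI peak, MB/s
print(controller_side_mb_s)    # both controllers together, MB/s
print(independent_cache_mb)    # usable cache with independent controllers, MB
print(mirrored_cache_mb)       # usable cache if the same memory were mirrored, MB
```

The controller-side total is the "about a thousand megabytes" figure from the talk, and it is lower than the 1,400 MB/s drive-side burst number, which is why the PCI bus rather than the drive channels sets the ceiling.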
That means that a spreadsheet that's a thousand by a thousand lines is more intense than a Word file? No, that's not what it means. It basically means that the amount of data and the number of users will determine what the throughput is. With more users you're moving the heads and you're using the cache differently than you would as a single user working with an enormous file. So the algorithms that we build take the data complexity into consideration, and we actually throttle the performance based on that.

And then the other thing that we do, which is a little unique as well, has to do with parity RAID. Parity RAID would be protected RAID like RAID 3 and RAID 5, as opposed to non-parity RAID like RAID 1, which is mirroring. What we actually do is multiple XOR functions on the fly: not one, then wait, then write the data. We do multiple ones on the fly and we use the cache to store those, so we get a higher level of performance.

And then the other thing that I think is really important is that we can use Apple RAID, which is built into our operating system. This is a nice thing about being close to the operating system: we can use Apple RAID to build compound RAID sets. What are they? If we take a RAID 5 and a RAID 5 and put them together and stripe them, we'll build a RAID 50, and we can get the maximum performance out of that because we can tune it in the operating system. So it's really exciting.

So this is the architecture of Xserve RAID. It doesn't look like the G5 stuff you saw the other day, right? It looks a little different. Basically what you have here is 14 hard drives on the bottom. I think we've all seen some of this; in fact, this is beautifully represented in the technical overview, which is on apple.com, and I advise anybody who can't sleep at night to download that 30-page document. It's really good. But there are 14 hard drives on the bottom, and you see that passive midplane
connecting to what looks like two boxes with seven little chips and a couple of big chips on them. Each one of those boxes is a RAID controller, and the seven chips you see on each side are actually controllers for the independent ATA hard drives. So each hard drive is dumb. The hard drives should be dumb; they need to deliver what the controllers tell them. The controllers handle all the logic that needs to come off the hard drives; all the data that comes off the hard drives is handled by the controllers, completely independently of each other. So across this PCI bus, the RAID engine can actually address all the hard drives at exactly the same time. It's pretty exciting, because what it does is increase the throughput.

The other thing, as I mentioned, is the RAID engine. That's where all the XOR functions happen, that's where all your RAID functions happen, that's where your error correction happens. There are two of them; they're completely independent and they're very fast. And that little thing on the side there, that little DIMM, is a cache DIMM, and that actually allows us to dynamically change the cache. You'll notice this other little chip on there called the control center. The control center is our independent coprocessor, and you'll notice that it talks to the control center on the other RAID controller, so that each one of them can hold information about the RAID set of the other RAID controller in case we have a failure of one of the control centers. And then in the middle there are two fans, or cooling units, two power supplies, and redundant battery backup. So it's a pretty clean design; in fact, a lot cleaner and a lot more passive an architecture than what others have done.

So how about deploying it? I'm going to talk about the applications, but the first thing you should see is how you deploy it physically. I mentioned they need to be rack-mounted, or they need to have an enclosure that holds them. What's really important in the IT world today is performance and reliability; that's
optimum, right? That's what you have to have. So how do you achieve that? The first thing I can tell you is spend the money on a good rack. Why? Well, let me tell you why. First, with 14 hard drives spinning at 7,200 rpm, you want to make sure that you don't get what we call rotational vibration. What does that mean? We have desktops, they have hard drives, no problem. Well, this was told to me a long time ago and it's still true: a hard drive spinning with the head floating above it is like flying a 747 six feet off the ground at 600 miles an hour. It's easy to do until there's a mountain, a car, something in the way. So you can imagine that the heads actually fly above the disk, and they're very close: about one-tenth the thickness of a human hair, using what we call magnetoresistive heads on the drives. If you have a lot of hard drives in there together and they get vibration, what do you think happens to the heads? Here's where I do my little wiggle: they wiggle around, and if they actually hit the platters, that's a bad thing. The more vibration you have, the more chance the heads are going to wiggle around. So if my data is on the disk and the heads wiggle around, what's my chance of getting the data on the first rotation of the disk? It's pretty poor. The less vibration I have, the better chance I have of getting the data on my first rotation.

So what I want to do is buy good racks and rack up my systems where I'm not going to get a lot of vibration. Don't put them on your kitchen table, don't sit them in the back room, don't just lay them on their side. If you're going to put them on their side, put them in an enclosure that'll hold them. So that's the first thing to remember: how to rack them. And if anybody went to the Xserve session, you'll have learned that the Xserve RAID needs a four-post rack. You can't hang it from the front. I mentioned 110
pounds; if you hang it from the front, don't call me unless you're going to send me your insurance policy. You can't hang it from the middle. Don't sit it on top. This is the number one rack thing, and I used to do this in my younger IT years: you just rack the bottom component and stack everything above it until the screws shear off in the middle of the night, at 3:00 in the morning. OK, you don't do that with Xserve RAID; you want to keep it good.

So let's talk about the applications. A lot of things are changing in the storage market. The first thing is the changing of the requirements. This is what I call the "used-to-bes". It used to be just about data protection. Everybody mirrored: you got your hard drive, you mirrored it, it's all about data protection. I've got my data, nothing to worry about. Do I ever back up? No, probably not. It used to be too expensive: RAID, that's really expensive, you've got to buy all those hard drives. It used to be complicated. Anybody ever install a RAID system other than one of the Apple ones? It's really hard; it takes a while. And it used to be that you had to follow the traditional views, and most of you probably still follow those views.

So what are they? These are the views that the server companies love to sell you. Server-based storage is best: it costs less, it's easier to deploy. There's the thing we call captive and non-captive storage. Captive storage meant that the server company actually sold the storage along with the servers, and in the Apple world there was no captive storage. Before Xserve RAID and before Xserve, it was really hard to get a lot of storage on an Apple platform, so if you wanted it, you were stuck with FireWire drives or buying from some third-party company and hoping for the best. But the other storage companies and server companies made it their business to sell storage in the servers, because it was easier to deploy and you don't have to worry about setting it up. And also,
my server vendor knows what's best for me. Well, I still believe that one, because we're the server vendor now. Throughput is driven by cost: if it costs more, it has to be faster. Ferraris cost more than Volkswagens; they have to be faster, unless you throw them off a cliff. If it costs more, it has to be better; how can you build it for less? If it costs more, it must be better supported. Those are traditional views. Throw them out the window. Technology changes everything, and that's what's happened.

So this is the traditional view of storage; this is the way most people deploy it. You have internal server storage, usually using software RAID. It's a good way to start: the throughput and the availability are low, and the cost is low. Now, there is an exception to that, and that's Xserve: the cost is still low, but the throughput's phenomenal. Then there's internal server-based hardware RAID. I would argue with this a little bit and say that usually the throughput's slower; in the Xserve case, it's much lower than our software RAID. And then there's external SCSI hardware RAID storage, the staple of the industry: it's easy, it's expensive. And then there's external SAN-attached Fibre Channel storage, storage area networking, the idea that I can centralize all my storage into one area. These are the traditional views; this is the way people look at it.

So what's changing it? Something called storage classing. What is storage classing? Well, servers are moving to a scale-out model: we have more servers. People talk about server consolidation: I want to get rid of all these servers I have and do one. That works until you're asked to deliver more services, and as soon as you deliver more services, you buy more servers and you're trying to scale them out, because you're not buying these monster servers anymore and deploying everything on them. Although I do; the Xserve that I have runs everything, because it's fast. But you're not doing that; you're buying more and more servers and you're scaling them out.
The other thing is that the density of servers today has gotten greater. What you can do with Xserve today you just couldn't do five years ago; with a large number of servers, you had to have a lot of space. And the other thing is that the management of all these servers and all these storage devices is out of control, especially if you bought servers from a lot of different people: you manage them all differently, with a lot of different operating systems. It's really kind of crazy. And then direct-attached storage was the only alternative. Now I'll bet you that a lot of people out here have installed network-attached storage of some kind, at least some of it, and probably even storage area networking, and those numbers are increasing every day. That's one of the changes.

Now, the other thing about storage classes is that people spend a lot of money on storage. They always have. It's been an interesting thing: I buy a server for X dollars, and I spend three times as much on storage. That was always the rule. And they always had to buy the same storage. They always bought really expensive SCSI storage, if that's what they bought, because they had to use it everywhere, because it was the same and it was there. And they had to keep buying it: every time they added a terabyte, or a gigabyte in the old days, they paid X dollars for it. Storage classes are changing that. People are realizing that you can deploy different types of storage for different needs. And also this incredible thing happened. Anybody who's older, like me, will remember something called HSM, hierarchical storage management. They renamed it nearline storage. What's old is new again, and it's really getting a lot of play, and I'll talk about that.

So as you deploy, you look at these traditional views, and something's changed, and this is what's changed. You're going to see this from other vendors, but Apple was the first one to put the shot across the bow on this one: external ATA-based
hardware RAID storage that's fast and affordable. And guess what: it has the high availability that you need in business-critical applications as well, and you get that for free. So when you look at this picture you say, how can something that costs as little as internal storage, based on what people do with SCSI, deliver performance equivalent to what you get with external Fibre Channel-based systems that cost five times as much? Well, it can happen, and I'll show you why in a couple of minutes.

So here's what it's really all about. IT has to look at doing more with less. I don't think there's anyone in the room here who's had a boss who said, "You know what, let's deliver fewer services this year. Let me give you tons of money; go buy whatever you want, and only work five days a week and six hours a day." I see a bunch of nods; everybody's doing that, right? OK, it's not happening like that. It's like: this department needs this, this department needs that, I need to deploy it now, it needs to be faster, it needs to be better, I'm bringing ten people online, I need you to work Sunday to install that server, and while you're at it, put the storage in the top of the rack as far up as you can, because there's no room to buy a new rack. That's what's happening.

And then there's backup. How do you back up 17 terabytes of storage, which is what I have on this stage? How long does that take? Can you do it in the eight-hour window when no people are working at your company? Say yes, right. So people are looking at different ways of doing it. They're saying, you know what, as we bring more storage online, we're looking at new ways. So things like nearline disk-to-disk backup. Have we all heard that term, disk-to-disk backup? Why is that? Because it's fast, right? I can back it up, I have another copy, and at my leisure I can have those slow tape drives take it offline. Because I can tell you right now, you can back up all 17 terabytes that are on this stage in one hour for about a
million dollars; that's about how many tape drives it takes. But for all 17 terabytes that I have up here, to do it nearline, to get another 17 terabytes from the Apple Store would cost you about $60,000. See the difference? Wonder why people are buying nearline? So that's what's really happening.

OK, I know this will work. So I mentioned storage classes. There are three classes of storage that most people identify. The first class is high availability, highly redundant, and very expensive. Where do you want to use this? Well, some people used to call this mission-critical. I tend to call it mission-critical too, but everything you do is mission-critical. If you're doing code and you lose that code, that was mission-critical to you, especially if you had deadlines next week. If you're deploying a web server and you lose all that content, that was mission-critical. But where do you really see this used? My favorite example is the ATM. When I put my card in, I want to make sure that it has my balance (or that it lies and gives me more money), but that it has my balance in there. So that's mission-critical; I want the people who run the banks to spend a lot of money on storage. But the transactions that I did the day before are not as important to me, because it's my balance I care about. So that becomes class two storage. It's still protected, but it's business-critical, not mission-critical. That's like running a website: that's business-critical stuff. That's your email: it's business-critical stuff. It's highly redundant and it's available, but you can accept some downtime as long as you can save the data and the data's still there. And then there's RAID 0, RAID 1, JBOD: low-cost and risky. You've got two hard drives, you mirror, and you run your whole business on it. It's a good thing.

So this is the way storage classes get deployed. In the enterprise apps (I use that big E word up there), database and financial, there's what they call online storage in class one.
There's also what they call nearline storage. What the heck is nearline? Nearline means it's not as fast and it's not as redundant, but it's there and safe, and if it does have a little bit of downtime, I can recover from it quickly and I know my data is still there. And then when you look at business apps, for the most part you start to see that they're using different classes of storage. You'd never deploy an enterprise app on nearline alone; you'd always have some online. But you would find a business app running on nearline, and you'll find e-commerce and web apps running on nearline.

So this whole classing of storage is starting to do something to the world. Why? Well, imagine I have to buy Brand X storage that I've used for all my mission-critical applications, and it costs (and this is not unrealistic) $50,000 a terabyte. And since I've deployed five terabytes of that in my organization, I give it to everyone; everyone's home directory runs on that storage. That costs a lot of money. That's crazy, when I can buy two and a half terabytes of Xserve RAID and let all my people have ten megabytes of iTunes Music Store; all my people can have enough room to do their work, I don't have to bug them, and I can put their home directories out there, and it's safe and it's secure. But guess what: it's not the transactions that run the company. So that's what this classing of storage is really doing.

So how do you plan for this? We're going to talk about the real fun stuff in a minute, but how do you plan? Well, the first thing is that your plan has to take into consideration the capacity, the throughput, and the availability, and also the cost and value. When you're doing capacity planning, my rule is buy more than you need, because if you think you bought enough today, start working on that budget for next year, because you're going to buy more. And make sure you have a solution that scales. How about throughput? You really have to look at this: what's the throughput you need?
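The home-directory example above can be put into numbers with a small sketch. It is only an illustration in Python, with hypothetical names; it takes the $50,000-per-terabyte Brand X figure as captioned and uses the rounded figure of about $10,000 for two and a half terabytes of Xserve RAID quoted earlier in the session, so the results are ballpark values, not pricing.

```python
def cost_per_tb(total_cost_usd: float, capacity_tb: float) -> float:
    """Return the cost in dollars per terabyte of capacity."""
    return total_cost_usd / capacity_tb


# Brand X: figures as stated in the talk's example.
BRAND_X_USD_PER_TB = 50_000
BRAND_X_DEPLOYED_TB = 5
brand_x_total_usd = BRAND_X_USD_PER_TB * BRAND_X_DEPLOYED_TB

# Xserve RAID: "around 10 grand" for 2.5 TB (rounded, so an approximation).
xserve_raid_usd_per_tb = cost_per_tb(10_000, 2.5)

# How many times more each Brand X terabyte costs.
price_ratio = BRAND_X_USD_PER_TB / xserve_raid_usd_per_tb

print(brand_x_total_usd)        # total spent on the five Brand X terabytes
print(xserve_raid_usd_per_tb)   # approximate Xserve RAID cost per terabyte
print(price_ratio)              # Brand X cost relative to Xserve RAID
```

The point of the comparison is the ratio rather than the exact dollar amounts: home directories that do not carry the company's transactions do not need the most expensive class of storage.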
Again, if you get throughput for free, buy the highest throughput you can. But it's all about configuration, and it's all about the availability: mission-critical, no downtime; business-critical, manageable downtime; archive and nearline. So you have to take all those things into consideration, build a big chart, and decide what you really want, because there are times when you want to buy storage that's even beyond what Xserve RAID can deploy. So here's what people usually use to determine their storage. The first metric that you always get, and the financial people always say this one: what's the cost per gigabyte? Currently IDC and Gartner say, if you believe them, it's $30 a gigabyte for high-availability storage. Xserve RAID is $4.36, so chalk that up for value. What are the hidden costs that you get with other companies? Well, the hidden costs are things you have to look at. What's the service and support cost? What's the additional software you have to buy, both to set up and to manage the system? What are the costs of cables? What's the cost of power, because it draws a lot of power? Those are all the hidden costs that you have. And also, what about expansion? When I expand, does it cost me as much to expand as it cost me to buy in, or does it cost more, or does it cost less? And will the company be there in a couple of years? Those are all storage planning tools. How about deployments? Here are the scenarios we always see. An Apple server-based deployment: a single server, a single RAID; I think we could all do that one. A single server, multiple RAIDs; I'd like everyone to do that one. Multiple servers, single RAID, or multiple servers, multiple RAIDs. So let's look at the cases. How about a mixed-OS platform? Is it possible to do that? With Xserve RAID you could have multiple servers, multiple RAIDs, single server, single RAID; all that's possible. How about a virtualized mixed platform? What is that? Why would you want to virtualize? We're going to talk about
that. So let's build some scenarios. You ready? I built a couple already here. So here's three typical scenarios that I built. The first one: I took Mac OS X Server on an Xserve and an Xserve RAID. It was a pretty easy and simple thing to do. The next scenario: I took that same Xserve RAID and I attached two Xserves to it. I took another scenario in the middle, kind of a funny one: I have this server from a company called IBM and one from a company called Dell, with an Xserve, and I have two RAIDs, and I'm sharing the storage across all those platforms. And scenario four is kind of interesting; it's what I call virtualized storage. I have a third-party product called a Chaparral radar PS, or a provisioning server. There's the word you want to remember: provisioning. I have a couple of Xserves and a couple of Xserve RAIDs, and I basically added additional capabilities to Xserve RAID and kept that low-cost storage deployment, but added true enterprise-class performance, reliability, and scalability to the Xserve RAID without having to buy some very expensive storage that you could use to do this. So let me show you how they work. First thing, let's look at an Xserve RAID, if you've seen one other than here. It's pretty standard, clean in the front. I mentioned that there's 14 hard drives. People always say we went a little overboard with lights. I say we need more: drive activity and health on each hard drive. What is health? We actually measure the drives: are they good, are they in a pre-fail condition (that means, are they gonna fail), or have they failed? So the light goes green, amber, and red. Red is dead, green is good, and amber means that we've used a technology called SMART to actually look at the drives and say, what is the health of that drive? If it's gonna fail, we're gonna pre-fail it, and we're gonna warn you and send you an email and say, hey, this drive is gonna fail. If you had a hot spare in the system, we're gonna rebuild it onto the hot spare, so you have
full availability. It's an indicator on the front; it gives you a good idea. There's Fibre Channel link indicators, just like on Xserve. We want to make sure that the links that you have to your server are easily available. Why do we put those on the front? There's lights on the back, right? Well, this comes from the classic example of ease of use. Anybody have a rack? Anybody ever install a rack? Okay, a few people here. A lot of people, when you have a rack, you really want to make it really clean in the back and run all the cables down the side; it's gonna be beautiful. It doesn't happen, guys. It looks like a spaghetti jungle in there. And so there's always that possibility that you could come along and unplug a cable by mistake, and if you don't have an indicator on the front to show you, you may not notice that until you walk to your desk and get an email or page. So it can happen, and it can happen really easily. So we put a lot of indicators on the front, including link indicators for Fibre Channel, because I will tell you, when you disconnect the Fibre Channel cable, you don't get called by our admin utility first; you get called by all those people who just lost their connection to their data, and usually those pages are the bad ones. So we put a lot on the front. You'll notice, just like Xserve, we have a system identifier button and an enclosure lock, and we added something else called an alarm silencer, because I can set up my Xserve RAID so that if I have a failure, and it's in a deskside configuration, or under the desk, or on the desk, or in the CEO's office, I can actually have it beep at me beyond the visual indicators. And you notice this one didn't beep when I pulled that power cord out, because I shut it off. That's the fun thing you can do. So let's look at the back of it. This is really how you deploy it, right? Pretty clean Apple design. Big fans and power supplies, they're redundant; two redundant cooling modules in the middle; two independent RAID controllers; two backup battery
modules; a power indicator; an alarm mute button on the back as well; and a system identifier. So you look at it and you say, pretty easy. There's nothing you can't get to. If I go back and look at the front, I can get to all the hard drives; if I look at the back, I can get to all the components. So the design of Xserve RAID is a high-availability design, meaning that the only thing that I can't change in the field is that midplane. But remember, that was a passive data flow, so the only way to break that is to physically bend the connectors (bad thing, guys) or hurt the metal, maybe. Look at the bezel on this: a little hard to hurt, right? It's aluminum. So you're really not going to be able to hurt anything, and you're gonna be able to deploy this thing anywhere you need it in an easy manner. Most of the connections you're gonna look at are going to be the Fibre Channel connection, the Ethernet connection, and the serial connection. Why? Well, that's what we're going to talk about. So here's scenario one. I call this one standard operating procedure. You take an Xserve RAID, you take an Xserve, you put our Fibre Channel card in the Xserve (it has two ports on it), you take the two cables that come with the card, you plug them into the Xserve RAID, and you're done. It's easy: two and a half terabytes ready to go. The Xserve RAID comes out of the box configured for two RAID 5 sets, so you have about 2.1 terabytes of protected RAID available to you. You go to Disk Utility, just like you would with any other hard drives, and either build one RAID 50 set by striping the two together, or leave two RAID 5 sets. They're on the desktop, you serve them to your clients, you're done. Pretty easy. You tell your boss that it took you two days. Software that comes with it: Apple RAID Admin. What is RAID Admin? It's a utility of Apple's design to make it easy to deploy these systems. So you can install them using this utility, and you can manage them using this utility. What's different about it?
It's web-based, it's Java-based. So read that: Java-based, platform-independent. You can use it anywhere in the world. And you know what's cool about it? You can deploy a RAID system literally in a minute, and I'm going to show you that. And then the other thing about this installation: you just have to have a strong back, because you've got to lift the RAID. Okay, how about this one, scenario two: two Xserves and an Xserve RAID. Why do we do this? Well, what we did is we centralized the storage. In this case I took my Xserve RAID (I didn't use a switch in this case; I just went directly from the Xserve RAID to the two Xserves), I took centralized, protected, high-availability storage and delivered over a terabyte to each Xserve. Now what's really cool about this is that the storage is more redundant than the server in this case, and if I ever had a problem I can move that storage over to another server. So I have a highly redundant configuration, and it's inexpensive. Now, about scenario three; that's the one I built over here. Actually, some nice people here built it for me. We took an Xserve, we took an IBM 330 server, a Dell 1550 server, and an Xserve RAID. We took Apple RAID Admin, and I configured it with four RAID sets, three RAID 5s and a RAID 1. Each server has a RAID 5 set, and the Xserve has an additional RAID 1 boot. So, contrary to popular belief, you can actually boot your Xserve off the Xserve RAID. It just works, and I'll show you exactly how we did that. What I did is I built a RAID set (this took me about 10 seconds): I took three hard drives and I built a 360 gig RAID set, fully protected, and I gave it to my Xserve. Then I built another RAID set and I gave it to the Dell dude, sorry. And then, because someone said I had to keep that Dell and that IBM 330, and my budget for buying Xserves to replace those pigs took a little while, I also gave a 360 gig, or 540 gig, RAID set to my IBM 330. And then the last two hard drives that I had, I made a boot that was
mirrored for my Xserve. Now I had two drives left over, and I made those hot spares, so that all my RAID sets are globally protected against a failure of any drive. It's pretty cool. So scenario four was virtualized, and I would go through this whole thing, but this is really complicated, so I'm just gonna hit the top lines here. But this is something that a lot of people are really looking for. This competes at the highest level with storage systems that cost, well, hundreds of thousands of dollars; in the same capacity, with some storage systems you'd have to spend hundreds of thousands of dollars to get this capability. So we took two Xserves, we took two Xserve RAIDs, we took a Vixel Fibre Channel switch, a 16-port Vixel Fibre Channel switch (unfortunately I didn't have enough room to put it in the front; it's in the back), and I took a third-party product, a provisioning server from a good company called Chaparral, called a radar PS. I took Apple RAID Admin and Chaparral LUN provisioning software, and I built a completely virtualized storage system. We call this a SAN; you might have heard the term. And so what did it do? Well, the radar PS sees 4.32 terabytes, and the radar PS allows me to build 1,024 LUNs. What's a LUN? Think of it basically as a partition, so that if I had a lot of users (maybe not 1,024, but if I had a hundred users and they were all graphic artists) I could break that storage up among them equally, and I probably would leave a pool of that storage unused, or possibly just dedicate it to something that I'm not using today. And I could deliver that to up to 126 servers or 126 users. In this scenario, where I have the two servers, I've given a two-terabyte LUN to each one of these servers. So let's take that scenario, and let's decide that one server handles my graphics department and one server handles my video department. My graphics guys are slacking off; they're using a terabyte of storage. My video guys, on the other
hand, they've been downloading QuickTime flicks for a long time here, and they really are maxed out. I'm running at 2.16 terabytes; I'm telling them every week, throw your stuff away. What can I do? With radar PS and with Xserve RAID, I can go ahead and re-partition or re-engineer that software so that I can re-provision it, so that now my video people can have, let's say, three terabytes and my graphic artists only get one terabyte. And I can do that dynamically, essentially on the fly. With a restart of the Xserve, they can have a complete change in what they need. So that's really high-level capability that's very difficult to get with other systems. Now, that's what we call dynamic growth of LUNs. The other thing we can do here is things like snapshots. I mentioned backup, right? If that graphics department is moving a terabyte a day, I probably don't have a tape drive to move a terabyte a day, but I can snapshot that, or make a point-in-time copy of that data, so that someone can get it off there at their leisure. And that's all available with Xserve RAID, using low-cost storage with a provisioning server; again, the provisioning server we have here is the provisioning server from Chaparral. Now, managing LUNs and performance tuning with RAID Admin. I'm going to give you a quick look, a sneak preview, of a new version of RAID Admin that's coming out. Let me show you how this works. Can I have demo two up, please? Okay, so this is RAID Admin, our little piece of software to manage the RAID system. The first thing you're gonna notice is that all my RAID systems here happen to be on my network, and they're on a subnet. Now, to make it easy, because I didn't want to confuse myself, I took some of them off the subnet because I didn't want to see them. But you'll notice that RAID Admin uses Rendezvous, and so I can just grab one of those systems. I happen to know the password, so I'm not gonna tell. Everybody gets public here, anybody
who's on my subnet here. And I know the password, so I'm gonna log in, and it's gonna go out there and hit one of my RAID systems and give you all the information about it. Now remember: pretty cool Cocoa app, right? It's Java, guys. Swing Java. So the things you can do with this stuff are just incredible. First thing you'll notice, I get all the information about the array down here. If anybody sat in the Xserve session, they saw something called Server Monitor, a similar application. We try to make the applications have a similar look and feel. And you'll notice I can scroll this up and down and use it as I need to; you'll see why in a second. Here I get all my information about my RAID systems: how long they've been up, how long they've been running. I can look at the arrays and the drives. I'm showing arrays here. I happen to have a RAID 5 set here that's a terabyte. It's called RAID 1, or, it's RAID 5, but it's actually array 1. And then on this side I don't have an array, but if I want to see what drives I have available, I can click through those drives and I can see that each one of them is about 172 gigabytes. I also get an interesting view when I click here, and I notice that my drive cache is disabled on this particular RAID set. Hmm, how did I do that? I can look at the components. I mentioned that the systems have different components. This one has a power supply, and guess what, it's okay; I've got a little green light. Cooling modules: there's my speed. Wow, those things are running. Guess what: we actually time them so they run the same RPM. And my controllers: I can look at those, and you'll notice that my batteries aren't installed, so I cheaped out and didn't buy batteries. Rule one: always buy backup batteries. How about Fibre Channel? I have a Fibre Channel connection. I can look; I've
links. I can see this little worldwide name down here; it's only 48 digits, really simple and easy to remember. And my network: why do I need network? Because we do what's called out-of-band management. What's connecting my RAID Admin system to my RAID systems here is Ethernet. I'm not using the bandwidth on Fibre Channel; I'm doing this in what's called out of band, and so it's really a simple connection. But you'll notice I have a link that's down, because I wanted to get an amber light in here to show you what it looks like. I have a link that's down; it's a warning. It's telling me, Alex, you only connected one of those Ethernet lines, so if your switch or router on your Ethernet goes down, you're not gonna be able to see your RAID system when you're on vacation in Fiji next week. So you want to have them both on there. You'll notice I also have some speed and configuration on here, and then something brand-new in RAID Admin 1.1 is an event log, so I can actually save the events and know what I did, because, believe it or not, we get busy. It's one of those funny things that happens. So this is basically a system. Let's do something funny here: let's create a RAID system. Is that cool? Everybody's used to Keychain, right? So I have that in my Java app here. Let's see if I remember my password. Not gonna tell you; it's Alex. You notice the first thing that happens here is it shows me, with this little lock, that I'm Keychain-enabled. Now this is really where RAID Admin varies from every other setup utility that you've seen. If anybody's looked on the web you've seen this screen, but this is really where it varies. I'm an IT person, but I manage a lot of different things; I don't just manage my RAID systems every day. If I do, I've got a pretty cushy job if they're Xserve RAIDs. But I don't remember what the difference between RAID 3 and RAID 5 is, so I just look up here and it tells me, and I can build a RAID set knowing I need three or more drives. How do I do it? This is really hard. So I pick a drive and
I select them. Oh, my finger's getting tired. Oh, that's a lot of work. I select my drives, and then we have this little thing that's called RAID now. What does that mean? It's background initialization. I can click that, and that allows me to actually use the RAID set instantly. So what I'm going to do is build the RAID set, and it's gonna take a long time; it's actually gonna take 24 hours to build a RAID set, because what I'm gonna do is a surface analysis on every single drive in the system to make sure there are no failures on those drives, because you don't want to build a RAID set on bad drives. And you'd be surprised: if you ever see other RAID systems and they build a RAID set in like five minutes or ten seconds, you go, oh, they did a lot of checking on those drives, didn't they? They made sure those were perfect. They just looked at the signature and wrote it on there. We're gonna do a complete surface analysis of the entire drive, back and forth, reads and writes, and we're gonna spare out any bad areas. We're gonna take care of it all, but in the meantime you'll still be able to use it. And then I want high performance, and note what it says here: it says that if I turn on drive cache I should have batteries or a UPS. That's a warning, guys, because you can lose data if you don't. And then I say create, and I'm done. I just built a RAID 5 RAID set on the fly in a matter of seconds. That's it; it's easy. So what else is really cool about RAID Admin? How about settings? In fact, what I'm gonna do here is grab another one that has some different stuff on it so I can see some different settings. This is actually the system I'm using right now in this configuration with the provisioning server. This one has some cool settings on it, so I'm gonna go in and take a look at the settings, and what you're gonna notice first is the system information. So this is the radar PS rack, that's this rack over here, and this is RAID 3. I can
synchronize my time. I can change my management passwords. Remember I mentioned the audible system alerts? I can turn them on or turn them off. I can restart automatically when there's a power failure. Anybody like to get called in the middle of the night when there's a power failure, because the RAID went down and you've got to go push the button back on? Forget that; no more doing that. I mentioned the network settings: I can change from manual to DHCP. Fibre Channel: if you have an existing Fibre Channel network, this is the most important thing you're ever gonna play with, or never play with, because in most networks you have to set the speeds. We've gone through the topology and speed settings and made them automatic, and so now you can basically just run with it, and we'll go and investigate your network and configure the Xserve RAID for your network. And here's where I was mentioning provisioning and LUN settings. So how do I serve different systems? Well, what I did here is I basically took two worldwide port names (remember, they're only 48 digits, easy-to-remember stuff; did you remember those? Was that the right one? I'm not sure). Okay, those are the two ports that are on my Fibre Channel card that's in my provisioning server, and I've given a RAID 5 set with drives one through seven to this port, and I've given another RAID set to this port, and I've enabled LUN masking. That means everything else that's on the same Fibre Channel network can't see it; only those two ports can. Now, this software from other people can cost a lot of money; it can cost tens of thousands of dollars. But with Xserve RAID it's included, and it's easy and simple to configure. We think it's easier. How about performance? Anybody ever had a RAID system, and you read these things, 3,000 megabytes a second, 40 million I/Os per second, and then you put it on and you get like 2 megabytes a second, and you find that your USB dongle is faster? Part of that is
marketing, but for the most part it's because with most RAID systems you have to be a genius to tune them. I mentioned there were automatic and manual settings for our RAID systems. So the first thing we do is all the automatic stuff for you. We tune it based on what you tell us: you tell us you have 512 megs of cache in there, you tell us that you have battery backup, you tell us you're gonna do RAID 5, and we tune it. But we also give you other settings. I mentioned drive cache. I can turn the drive cache on for the drives. If I didn't have a UPS and I'm willing to take a performance hit, I don't want to turn drive cache on, because I'm scared that if I have a power failure I could lose some data. Now, in Mac OS X we do of course have journaling, which protects you against these things, but there's still the possibility that you could lose a lot of data, because there's 8 megabytes of cache per hard drive, times 14, and that's bigger than your spreadsheet or my spreadsheet. So we may want to turn it on or turn it off. I always suggest hooking up to a UPS and running with drive cache on, because the performance can be a hundred percent better. How about write cache? What is write cache? This is kind of a misnomer, enable and disable, because what we actually have is two levels of write cache. One is write-through: that means we don't really cache anything at the RAID controller; everything that comes through, we just write it through, and so the data flows quickly to the drives. The computer in this case, let's call it an Xserve, as soon as it sees that the data is at the drives, is free to send more data, so it gets this write complete. If we do something called write-back cache, which is what we're enabling here, we actually tell the system, the Xserve in this case, that we've written the data to the drives when we really haven't; we've written it into a cache buffer. And that's why we have those battery backups, to back up that cache in case for some reason we
lost power in that case. But that can also increase your performance, because imagine you have a lot of people hitting the RAID system at once. What do you expect it to do? You expect the Xserve to send a lot of commands to the Xserve RAID at once, and the only way you can accept those commands is to start caching them up. And you also have to remember that the data that's coming off the hard drives can come either in order or out of order to the RAID controller, so what we want to do is cache all that data up so we can get as much data delivered to the Xserve at once. We want to open the floodgates, so this is important. Now, I mentioned that we had dynamic prefetch in the system. So why do I have a prefetch setting for read, and what is prefetch? Well, essentially, let's take the alphabet, A through Z. If I tell you that I want to look at A and B, you're pretty much gonna guess that the next thing I want to look at is C and D. Or, if you're like me, I'll probably go look at Z next. But in a streaming application it's a pretty good guess, so I'm gonna prefetch a lot of data ahead, because I'm gonna anticipate what you're gonna get next. I said that we're dynamic, so what we do is look at the data that the driver is giving us from our Xserve, from the OS, and we say, what kind of data is it? If it's streaming data, I open my prefetch window and I start grabbing a lot of data, because I know if I'm streaming a QuickTime movie, the rest of the QuickTime movie is what you're likely gonna want to see. It's the data you want. But you notice here I have three settings for that. Why, if I can just dynamically change this? It's the size of the dynamic changes we can make. If I set it to 128, you give me, the RAID controller, a lot of range; I can go anywhere I want. If I set it to 1, I can only do a little prefetch. Why would I need to change it? Why would you be smarter than me? Well,
the reason is you may not want to watch that whole QuickTime movie. You might be scrubbing a timeline. You might have a hundred users and everybody wants to share a little of that data. So you know your data patterns better than I do. So what's cool about this? It's dynamic. You run a test based on your data patterns, using your users; you hammer it, you hit it a lot, and you don't see the performance you want. What do you do? It's really hard here: you go over and you push the button and run the test again. So I'll give you some general rules for this. Anybody hear of I/Os per second? Small files are I/Os per second; large files are usually megabytes per second. They usually work against each other. If you have large files, video content, put the prefetch as high as you can. If you have tons of users and they do small files, turn the prefetch down; you'll help the system. And in all cases keep write cache enabled. So that's basically a real quick tour of RAID Admin. Now let me show you something else. Can I go to demo three, please? So this is, I don't know what kind of system this is. Can't remember. One of these things, it's called a PC. So this is actually RAID Admin running on Windows. Now, this isn't something we support, but I mentioned RAID Admin is Java. I'll even show you something that I didn't show you on the Mac, which I could have: contextual menus. RAID Admin actually has contextual menus as a Java app, so it's pretty cool. All the capabilities of RAID Admin on the Mac are available on other platforms, such as Windows, such as Sun, anything that runs Java 1.3.1 or better. So, although it's not necessarily a recommendation, you can actually manage your system from a Windows system. So when you're stuck in Fiji on vacation and you forgot to bring your PowerBook with you, you can still manage through RAID Admin. Okay, great. Can we go back to slides, please? Cool. So here's a typical LUN management configuration. One of the things I like to talk about
is how we can actually do basic SAN functionality in Xserve RAID with RAID Admin, without really having to spend any money on SAN software pieces and parts, and still do what's true to a SAN, which is consolidating your storage and deploying it to many servers. I showed you how we had three servers deployed here with four RAID sets, so let me show you how it actually would look if we break it up across four Xserves. In this case I'd build four RAID sets using that LUN masking I showed, and now I get the advantage of global hot sparing, using that tool, by just putting in the worldwide port name of each one of those Xserves and addressing those to each one of the RAID sets. Simple. You can deploy that in seconds. In this case I used a Fibre Channel switch to allow me to go from two to four. Now, the beauty of this is that not every computer can see every RAID set; only the Xserve that is attached to its RAID set can see its RAID set. The other data is protected away, and I can easily change who sees what data. This is a really simple deployment. So let me wrap it up. There's five hints I'll give you for deployment. First thing to remember is it's easy as one, two, three, almost. The first one is read the manual. What? There's two manuals for Xserve RAID. One's a hardware manual; it's 91 pages and it's excellent. The second one is a software manual for RAID Admin. Do you really need to read it? I would read it. Plan ahead: this is the biggest problem that we all have with deploying RAID. We either buy too much or buy too little, but we never deploy it right. That goes along with my next one: buy too much, because you're gonna ask for more later. Okay, that's the little marketing pitch. Then number four: read the manual again. And then number five: go deploy it. So it's really simple. If you want to get more information, you can contact myself or Skip Levens, and of course there's reference libraries for Xserve,
Xserve RAID, a lot of information online for that. And I'd like to open it up to any questions that anybody has, if we can do that. Oh, I did want to talk about one more thing, two more things actually. In the deploying Xserve session yesterday we were talking about ways of making it easy to deploy multiple servers, and Doug Brooks mentioned something called the Introvert, which is a little device here. I call it the little buddy for the Xserve RAID. It allows us to take a cartridge, or what we call a drive carrier, from an Xserve or an Xserve RAID and bring it up on a G4, a G5, or even another Xserve, really easily, because it makes it hot-pluggable using FireWire on the system. There's two really cool things you can do. One is, I mentioned mirroring on the Xserve RAID. Let's say I have seven drives on one side of my Xserve RAID and I do a mirror of one of those drives, and I don't tell it that I want to just mirror one drive to the next drive: it'll automatically mirror all seven drives. Why would anybody ever use that? Well, how about if I have to make an image and deploy it to multiple Xserves? I have seven boot drives, and if I want to check them, I take my Extrovert (my Introvert, actually) and I mount it up on my G5 or G4 and I can actually see what I've got. So I can have seven copies easily. So that's my little tips and tricks thing for Xserve RAID. Thank you.
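
The capacity figures quoted in the session (a 360 gig set from three drives, a 540 gig set, about 2.1 terabytes from two RAID 5 sets, 4.32 terabytes across two units) are consistent with ordinary RAID arithmetic. The following is a minimal sketch of that arithmetic, not anything taken from RAID Admin. It assumes 180 GB drive modules, a size inferred from "two and a half terabytes" across 14 drives rather than stated by the speaker, and the function names are hypothetical.

```python
"""Usable-capacity arithmetic for the RAID sets mentioned in the session."""

# Assumption: 180 GB per drive module (inferred from 2.5 TB / 14 drives,
# not stated in the talk).
DRIVE_GB = 180


def raid5_usable_gb(drives: int, drive_gb: int = DRIVE_GB) -> int:
    """RAID 5 spends one drive's worth of space on parity."""
    if drives < 3:
        raise ValueError("RAID 5 needs three or more drives")
    return (drives - 1) * drive_gb


def raid1_usable_gb(drive_gb: int = DRIVE_GB) -> int:
    """A two-drive mirror stores one drive's worth of data."""
    return drive_gb


if __name__ == "__main__":
    print(raid5_usable_gb(3))      # three-drive set: 360
    print(raid5_usable_gb(4))      # four-drive set: 540
    print(2 * raid5_usable_gb(7))  # one unit, two seven-drive sets: 2160
    print(4 * raid5_usable_gb(7))  # two units behind the provisioning server: 4320
    print(raid1_usable_gb())       # mirrored boot pair: 180
```

Under the same assumption, the scenario three layout also accounts for all 14 drives: 3 + 3 + 4 in RAID 5 sets, 2 in the mirrored boot set, and 2 hot spares.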