WWDC2003 Session 626

Transcript

Kind: captions
Language: en
I'm really glad to be here today to talk
to you about Xserve RAID and deploying
it my name is Alex Grossman I'm a
director of hardware storage for Apple
and today I think we're gonna have a
little bit of fun let me start with
saying that there's an AI Qi IT manager
that's actually out there there's a
couple things that it's covered the
first thing regarding storage is that
today we're seeing storage needs grow a
hundred percent yearly and that's not
only because everybody's downloading
from the iTunes Music Store but it's
really because people are actually
building more content and a little bit
because they're downloading from the
iTunes Music Store the other thing we're
seeing is that managing storage is
becoming a full-time job used to be
where you deployed a server and you had
a hard drive in the server and maybe you
had another hard drive for storage and
you kept deploying servers and everyone
had storage and as you manage the server
you manage the storage but as things
move to more network-attached storage
and more direct-attached storage and
also storage area networking storage
management becomes a full-time job and
then the other thing is that people are
starting to classify storage so there's
things coming up called storage classes
and we'll talk a little bit about what
storage classes are and what they're
doing is they attempt to
reduce the cost and complexity in the IT
world and of course configuration
planning that's the one thing that's key
to efficient deployments because today a
lot of people deploy storage and they do
it in a haphazard way and it's not
purposely it's just that storage needs
are growing so fast how do we keep up
and of course storage cost well Apple
with Xserve RAID has really reduced the
cost of storage and we've gotten a lot
of heat from our competitors on
that so what are we really doing about
that well the first thing is Xserve
RAID is the highest density in a 3U space
in fact I have about 17 terabytes of
storage up here on the stage with me and
it's a little warm but it's powerful it
feels really good and of course fast
deployment and easy-to-use tools make
storage installation and management a
lot easier so if you're gonna deploy
something you want to deploy fast and
one thing you have to remember about
storage is that you deploy it once
you don't touch it every day I mean most
of you you know you'll be in Word you'll
be in Excel you know if you're
coding you're going to use tools every day
the same tools but storage you set it up
once and the only time you go back to it
is when you have a problem or when you
want to add more so if you have tools
that are easy to use it just makes that
faster and of course Xserve RAID has
been built to have the maximum
versatility and flexibility to be able
to get you over this storage class issue
and use it in many different areas and
then of course when it comes to
configuring and planning there's just no
substitute for hard work you just got to
do it now there's there's methodology
you should follow but you just have to
do it and then of course Xserve RAID it's
inexpensive so what we're gonna try to
go through today is some really simple
stuff we're gonna introduce you to
Xserve RAID if you haven't seen the product
get an idea why Apple did it I mean it's
kind of interesting Apple has not really
been in the high availability storage
business until this year in fact
Xserve RAID didn't launch it only launched
four months ago and if you look at most
other vendors of all kinds of hardware
including servers they've been in the RAID
business for many years then we're gonna
go through a little bit of speeds and
feeds not a lot but just a little bit
about the architecture because it's
completely different than anything that
anyone else has done and then we're
gonna talk about some of the
applications in IT some of the storage
classing talk about planning
considerations and then of course
deployment and configurations and then
I'm gonna give you a sneak peek at RAID
Admin the newest latest version 1.1
which isn't out yet and so you get an
idea of what we're doing to progress and
RAID Admin is our management tool for
Xserve RAID you'll get an idea of how
we're progressing that along so the
first thing is an intro so what is
Xserve RAID well there's a lot of them
here these are the big boxes so they're
a 3U high box that means they're five
and a quarter inches tall they're
nineteen inches wide fully loaded they
weigh about a hundred and ten pounds so
the first thing you remember when
deploying one is don't do it yourself
it's the buddy system and never put them
in the top of the rack it's always a
good thing and if you do don't stand
under it and it's massive in size but
it's also massive in capacity so it
boasts today the highest capacity in a
3U system and of course that changes
every day hard drives get faster and
bigger and of course Apple is always on
the cutting edge with that we did that
with Xserve we started with 60 gig
hard drives and now we have hundred and
eighty gig
hard drives same as with Xserve RAID and
then the other thing is that Xserve
RAID is a high availability storage
system that uses an advanced
architecture that is unlike what a lot
of other people have done in fact we
really think it's a precursor to what
we're seeing from competitors in the
future it uses ATA hard drives and a Fibre
Channel back end to the host connection
it's something you're hearing more and
more about but Apple was really the
first Tier one vendor to pioneer it and
put it in to a high availability state
and then of course it's Apple's ease of
use remote monitoring and management
similar to what we've done with Xserve
and in every generation we think we're
getting a little better and I'm gonna
give you a sneak preview of our next
generation of remote monitoring and
management and then of course
industry-leading value and we're talking
two and a half terabytes of storage for
around 10 grand it's not too bad now we
designed Xserve RAID for non-stop operation
a lot of definitions of what that means
basically it's a high availability fault
tolerant architecture that means if you
have a power supply fail in the system
there's a redundant one to take over
that means if you have a cooling unit
fail in the system there's redundant
cooling you can build protected RAID
sets if you have a hard drive fail
there's another hard drive to take over
for it and then we even have hot sparing
of hard drives so that if one fails
another one will take over rebuild and
completely protect you and that's why we
design the system and then of course we
wanted to make it complete the thing we
realized is that Fibre Channel storage
was pretty expensive and it wasn't just
the storage systems it was the
infrastructure so we lowered the cost
not only of the storage but we lowered the
cost of the infrastructure we built a
Fibre Channel PCI card that we sell at
about a third of the price of what
others sell it for and we can't figure
out why they charge so much and then of
course we include everything you need to
complete it and to work with an Xserve
a G4 or G5 and it's not just for the
rack because not everybody has a rack
installation and not everybody although
I think everyone should run an Xserve
not everybody's gonna run an Xserve and
there are other deployment areas where
Xserve RAID is great such as in the
video area so you can take a product
like the extravert from extreme Mac that
takes the Xserve RAID and puts it on its
side two and a half terabytes on your
desktop
or under your desk it's not terrible
well let's talk about speeds and feeds
well I was going to give you all the
speeds and feeds and I decided you know
what let's really look at the numbers
and see what they mean because speeds
and feeds are useless if you really
can't figure out how they work in your
application so we have fourteen
independent drive channels what does
that mean we say that all the time there's
fourteen hard drives in the Xserve
RAID I'll save you the counting there's
fourteen independent drive channels why
do we do it well first of all we didn't
want any inter-drive dependencies why
because anytime you have a dependency
between one drive and another you limit
the reliability of the system so if you've
used SCSI before most RAID systems are
SCSI or Fibre Channel all the drives
have dependencies a SCSI bus is a
dependent bus they all share a common
bus if something happens on one side of
the bus guess what it can affect the
other side of the bus
well since every drive's on its own bus
there's fourteen independent ones people
who do very high-end servers and storage
systems and people who work in the video
world have learned a long time ago that
you put independent buses in for
performance it's the reason they have
different independent PCI buses on very
high performance computer systems and
also the bandwidth there's 1,400
megabytes per second internal bandwidth
how do we get there that's a speed and
feed number well if there's fourteen
independent channels and each channel is
100 megabytes a second this is first
grade math 100 times 14 is 1400 megs a
second and that's a burst rate and
individual spin-up and spin-down control
who cares well hard drives take a lot of
power you can imagine that it's costing
us a lot to run 17 terabytes here but
hard drives take a lot of power and to
make it easier to deploy those in a rack
you want to limit the amount of power
surge you have or what they call the
break current so you don't want to blow
a breaker every time you turn on a rack of
Xserve RAIDs one big rack of them or two
or three so what we do is we
individually spin the drives up that
also makes it a lot easier on your UPS
because this way if you
ever have an outage or the UPS has to kick
in and we have to actually spin the
drives down and bring them back up the
UPS doesn't take a big load so your time
is longer
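The effect of spinning drives up one at a time, as described above, can be sketched with a little arithmetic. This is only an illustration: the per-drive current figures are assumed placeholder values chosen to show the shape of the calculation, not Xserve RAID or drive specifications, and the function names are hypothetical.

```python
# Illustrative only: every current value below is an assumed placeholder,
# not a measured Xserve RAID or hard drive specification.
DRIVES = 14           # drive count stated in the talk
SURGE_AMPS = 2.0      # assumed per-drive draw while spinning up
RUNNING_AMPS = 0.5    # assumed per-drive draw once at speed


def peak_current_simultaneous(drives: int = DRIVES) -> float:
    """Peak draw when every drive spins up at the same moment."""
    return drives * SURGE_AMPS


def peak_current_staggered(drives: int = DRIVES) -> float:
    """Peak draw when drives spin up one at a time.

    The worst moment is the last spin-up: one drive is surging while
    all the others are already running at their steady-state draw.
    """
    return SURGE_AMPS + (drives - 1) * RUNNING_AMPS


if __name__ == "__main__":
    print(f"simultaneous: {peak_current_simultaneous():.1f} A")
    print(f"staggered:    {peak_current_staggered():.1f} A")
```

With these assumed numbers the staggered peak is well under a third of the simultaneous peak, which is the reason given in the talk for sparing both the breaker and the UPS.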
and a passive data path through the
midplane what the heck is that first of
all what's a midplane well in raid
systems unlike computer systems computer
systems have backplanes what's cool
about the backplane everything plugs
into it well on the raid system in the
middle of it there's a backplane and
because it's in the middle we call it a
mid plane and what plugs into it
well everything the hard drives plug
into it the power supplies plug into it
the raid controllers plug into it the
cooling units plug into it it's all in
the middle but what's nice about it is
there's an independent path from every
hard drive to the RAID controller going
through the midplane
but it's passive it's like a connector
and on other raid systems there's
actually logic on the midplane
is that good well it's good until the
logic fails and then you have a single
point of failure so we tried to
eliminate single points of failure for
reliability and how about a
dedicated 64-bit 66 megahertz PCI bus to
the drives so I talked about 1,400
megabytes a second internal bandwidth
well I lied when we get to the raid
controllers because each RAID controller
has 533 megabytes a second of internal
bandwidth so it's really about a
thousand megabytes of internal bandwidth
within the system and because we
have an independent coprocessing system
in each one of the raid controllers
events that happen on the PCI bus from
the drive don't affect any of the
performance of the system now the other
area that we talked about are dual
independent raid controllers so what are
they well essentially it means that each
controller only has to manage seven
drives so we have two of them we have
fourteen drives each one manages seven
drives it's less work for the raid
controller they're happy they get to
work 9:00 to 5:00 but what that really
means is that it leads to higher
throughput the other thing we do is we
put real cache on the RAID controller
when we say there's up to 512
megabytes of cache in each RAID
controller that means you have a useable
gigabyte of cache in the system now what
do I mean by that well this is an
advantage to have independent raid
controllers because they're independent
there's no cache coherency between them
we don't have to copy the data so if
someone says I have a gigabyte of cache
but it's coherent that means they really
have 512 megabytes of cache and there's
good and bad in that the good in that is
that the cache is copied somewhere the
bad in that is that your performance
won't allow you to use all that cache you
need and cache is really important to
performance in some areas and I'll talk
about that and then we also have a
dynamic cache scheme what does that mean
dynamic it changes we can change it I'll
show you how we do that and we also have
performance tuning that's a combination
of automatic and manual now that doesn't
mean that because we do automatic that
you know you guys don't know what you're
doing but because we can do automatic we
can react instantly you don't have that
reaction time we can look at the data
coming in and we can change the
parameters of cache prefetch and really
important aspects of the raid system
on the fly by just looking back at the
operating system and saying hey what kind
of data are you sending me so it's
really important and then I think
what's the most important is
that we have a management coprocessor in
fact we have dual redundant management
coprocessors so we offload all those
tasks of all that ease of use management
all those events that come up the raid
setup and the event logging we
offload that to a completely separate
processor so the raid controller doesn't
have to worry about it yet it knows
about it and it can query it to find out
what's going on so that's really
important that's that's a little
different than what most people do and
then of course throughput so this is the
real speeds and feeds throughput is a
direct function of the complexity of
data being delivered what does that mean
that means that a spreadsheet
that's a thousand by a thousand lines is
more intense than a Word file no that's
not what it means
it basically means that the amount of
data and the number of users will
determine what the throughput is more
users you're moving the heads you're
using a cache differently than you would
as a single user doing an enormous file
so when we look at that the data
complexity the algorithms that we build
take in consideration the data
complexity and we actually throttle the
performance based on that and then the
other thing that we do and a little
unique as well is that there's a
functionality in order to do parity RAID
parity
would be protected RAID like RAID 3 and
RAID 5 as opposed to non-parity RAID
like RAID 1 which is mirroring but what we
actually do is we do multiple XOR
functions on the fly not one and wait
and write the data we do multiple ones
on the fly and we use the
cache to store those so we get a higher
level of performance and then the other
thing that I think is really important
is that we can use Apple RAID that's
built in our operating system this is a
nice thing about being close to the
operating system we can use Apple RAID
to build compound RAID sets what are
they so that means if we take a RAID 5
and a RAID 5 and put them together and
stripe them
we'll build a RAID 50 and we can get the
maximum performance out of that because
in the operating system we can tune that
and we're able to so it's really
exciting so this is the architecture of
Xserve RAID doesn't look like the G5
stuff you saw the other day right looks
a little different so basically what you
have here is you have 14 hard drives on
the bottom and I think we've all we've
seen some of this in fact this is this
is beautifully represented in the
technical overview which is on
apple.com and I advise anybody who can't
sleep at night to download that 30 page
document it's really good but there's 14
hard drives on the bottom you see that
passive midplane connecting to what
looks like two boxes with seven little
chips and a couple big chips on there
and each one of those boxes is a RAID
controller and the seven chips you see
on each side are actually controllers
for the independent ATA hard drives so
each hard drive is dumb the hard drive
should be dumb they need to deliver what
the controllers tell them
so the controllers handle all the logic
that needs to come off the hard drives
all the data that comes off the hard
drives is handled by the controllers
completely independently of each other
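The bandwidth figures quoted earlier in the talk can be checked with a few lines of arithmetic. The sketch below uses only numbers from the talk (14 channels at 100 megabytes a second, two controllers, a 64-bit 66 megahertz PCI bus) plus one assumption: that "66 megahertz" means the nominal 66.67 MHz PCI clock, which is what yields the quoted 533 megabytes a second. The function names are hypothetical.

```python
DRIVE_CHANNELS = 14
CHANNEL_BURST_MB_S = 100     # per-channel burst rate quoted in the talk
CONTROLLERS = 2
PCI_BUS_WIDTH_BITS = 64
PCI_CLOCK_MHZ = 66.67        # assumption: nominal clock behind "66 megahertz"


def drive_side_burst_mb_s() -> float:
    """Aggregate burst bandwidth of all independent drive channels."""
    return DRIVE_CHANNELS * CHANNEL_BURST_MB_S


def pci_bus_peak_mb_s() -> float:
    """Theoretical peak of one PCI bus: bytes per transfer times clock rate."""
    return PCI_BUS_WIDTH_BITS / 8 * PCI_CLOCK_MHZ


def controller_side_peak_mb_s() -> float:
    """Combined peak across both controllers, each with its own PCI bus."""
    return CONTROLLERS * pci_bus_peak_mb_s()


if __name__ == "__main__":
    print(f"drive channels:  {drive_side_burst_mb_s():.0f} MB/s burst")
    print(f"one PCI bus:     {pci_bus_peak_mb_s():.0f} MB/s peak")
    print(f"two controllers: {controller_side_peak_mb_s():.0f} MB/s peak")
```

The controller-side total comes out a little over a thousand megabytes a second, which is why the 1,400 figure is described as a burst rate at the drive channels rather than what the system sustains end to end.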
so across this PCI bus from that raid
engine it can actually address all the
hard drives at exactly the same time it's
pretty exciting because what it does is
increases the throughput the other thing
as I mentioned the raid engine that's
where all the XOR functions that's where
all your raid functions happen that's
where your error correction happens and
there's two of them they're completely
independent and they're very fast and
that little thing on the side there that
little DIMM is a cache DIMM so that
actually allows us to dynamically change
the cache you notice this other little
chip on there called the control center
and the control center is our
independent coprocessor and you'll
notice that that one talks to the
control center on the other RAID
controller so that each one of them can
contain information about the RAID set
of the other RAID controller if we had a
failure of one of the control centers
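The XOR function that the RAID engine performs for parity RAID can be shown in miniature. This is a generic sketch of XOR parity as used by RAID 3 and RAID 5, not the controller's actual implementation; the function names and sample blocks are hypothetical, and real controllers work on fixed-size stripes in hardware.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks; this is the parity block."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must be the same size")
    # zip(*blocks) walks the blocks column by column, one byte from each.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild_missing(surviving: list[bytes], parity: bytes) -> bytes:
    """Recover the one lost block by XORing the survivors with the parity."""
    return xor_blocks([*surviving, parity])


if __name__ == "__main__":
    # Six data blocks plus one parity block, as on a seven-drive set.
    data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF"]
    parity = xor_blocks(data)
    survivors = data[:2] + data[3:]          # pretend the third drive failed
    print(rebuild_missing(survivors, parity) == data[2])
```

Because XOR is its own inverse, the same routine computes parity on write and reconstructs a failed drive's data during a rebuild.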
and then in the middle there's two fans
or cooling units two power supplies and
redundant battery backup so a pretty
clean design in fact a lot cleaner and a
lot more passive of an architecture than
what others have done so how about
deploying it well I'm talking about the
applications the first thing you should
see is how do you deploy it physically
so I mentioned they need to be rack
mounted or they need to have an
enclosure that holds them what's really
important in the IT world today is
performance and reliability
that's optimum right that's what you
have to have so how do you how do you
achieve that the first thing I can tell
you is spend the money on a good rack
why well let me tell you why first you
want to make sure that with 14 spinning
hard drives that spin at 7,200 rpm that
you don't achieve what we call
rotational vibration what does that mean
we have we have desktops they have hard
drives no problem well a hard drive in
fact this was told to me a long time ago
and it's still true
a hard drive spinning with the head
floating above it is like flying a 747
six feet off the ground at 600 miles an
hour it's easy to do until there's a
mountain a car you know something
something's in the way so you can
imagine that the heads actually fly
above the disk and they're very close to
about one-tenth the thickness of a human
hair using what we call magneto
resistive heads on the drives if you
have a lot of hard drives in there
together and they get vibration what do
you think happens to the heads here's
what I do my little wiggle they wiggle
around and if they actually hit the
platters that's a bad thing and the more
vibration you have the more chance
you're gonna have the heads are gonna
wiggle around so if my data is actually
on the disk and I have the heads wiggle
around what's my chance of getting the
data on the first rotation of the disk
it's pretty poor so the less vibration I
actually have the better chance I'm
going to get of getting the data
on my first rotation so what I want to
do is I want to buy good racks and I
want to rack up my systems where I'm not
gonna get a lot of vibration so don't
put them on your kitchen table
you know don't sit them in the back room
don't just lay them on their side if
you're gonna put them on their side put
them in an enclosure that'll hold them so
that's the first thing you remember how
to rack them and if anybody went to the
Xserve session you learned that the
Xserve RAID needs a four-post rack you
can't hang it from the front
I mentioned 110 pounds if you hang it
from the front don't call me unless
you're gonna send me your insurance
policy you can't hang it from the middle
don't sit it on top this is the number
one rack thing I'm like this I used to
do this in my younger I T years you just
rack the bottom component and stack
everything above it until the screws
shear off you know in the middle of the
night at 3:00 in the morning okay you
don't do that with Xserve RAID you want
to keep it you want to keep it good so
let's talk about the applications so a
lot of things are changing in the storage
market the first thing is the changing of
the requirements this is what I call the
used-to-bes it used to just be about
data protection
everybody mirrored you got your hard
drive you mirrored it it's all about data
protection I got my data nothing to
worry about
do I ever back up no probably not
it used to be too expensive RAID that's
really expensive gotta buy all those
hard drives used to be complicated
anybody ever install a RAID system other
than one of the Apple ones it's really
hard it's really hard it takes a while
and it used to be that you had to follow
the traditional views and most of you
probably still follow those views so
what are they these are the views that
you get this is what the server
companies love to sell you server-based
storage is best it costs less it's
easier to deploy so there's the thing
what we call captive and non-captive
storage and in the Apple world there was
no captive storage that meant that the
server company actually sold this
storage along with the servers and
before Xserve RAID and before Xserve it
was really hard to get a lot of storage
in an Apple platform so if you
wanted it you were stuck with FireWire
drives or buying from some third-party
company and hoping for the best
but the other storage companies and
server companies they made it their
business to sell storage in the servers
because it was easier to deploy you
don't have to worry about setting it up
and also my server vendor knows what's
best for me well I still believe that
one because where the sir
now throughput is driven by cost if it
costs more it has to be faster Ferraris
cost more than Volkswagens they have to
be faster unless you throw them off a
cliff if it costs more it has to be
better how can you build it for less if
it costs more it must be better
supported those are traditional views
throw them out the window technology
changes everything and that's what's
happened so this is the traditional view
of storage this is the way most people
deploy it you have internal server
usually using software RAID it's a good
way to start the throughput and the
availability are low and the cost is low
now there is an exception to that and
that's X sir
the throughputs phenomenal the cost is
still low but the throughputs phenomenal
internal server based Hardware rate now
I would argue with this a little bit and
say that usually the throughput slower
in the excerpt case it's much lower than
our software it and then there's
external scuzzy hardware RAID storage
the staple of the industry it's easy
it's expensive and then there's external
SAN-attached Fibre Channel storage area
networking the idea that I can
centralize all my storage into one area
these are traditional views these are
the way people look at it
so what's changing it something called
storage classing what is storage classing
well servers are moving to a scale out
model we have more servers people talk
about server consolidation I want to get
rid of all these servers I have and do
one that works until you're asked to
deliver more services and as soon as you
deliver more services you buy more
servers and you're trying to scale them
out because you're not buying these
monster servers anymore and deploying
everything on them although I do but
you're not doing that I mean
the Xserve that I have runs everything
because it's fast but you're buying
more and more servers and you're
scaling them out the other thing is that
the density of servers today has gotten
greater what you can do in an Xserve
today you just couldn't do five years
ago with a large number of servers you
had to have a lot and the other thing is
that the management of having all these
servers and all these storage devices is
out of control
because especially if you bought servers
from a lot of different people you
manage them all differently and a lot of
different operating systems it's really
kind of crazy
and then direct-attached storage was the
only alternate now I'll bet you that a
lot of people out here have installed
network-attached storage of some kind at
least at least some of it and probably
even storage area networking and those
numbers are increasing every day that's
one of the changes now the other thing
about storage classes is that people
spend a lot of money on storage they
always have it's been an interesting
thing I buy a server for X dollars I
spend three times as much on storage
that was always the rule and they always
had to buy the same storage they always
bought really expensive SCSI storage
if that's what they bought because they
had to use it everywhere because it was
the same and it was there and they had to
keep buying it every time they did a
terabyte they paid X dollars for it
or a gigabyte in the old days they paid
X dollars storage classes are changing
that they're realizing that you can you
can deploy different types of storage
for different needs and also this
incredible thing happened anybody who's
who's older like me they'll remember
something called HSM hierarchical
storage management they renamed it
Nearline storage what's old is new again
and it's really getting a lot of play and
I'll talk about that
so as you deploy you look at these
traditional views and something's
changed and this is what's changed and
you're gonna see this from other vendors
happening but Apple was the first one to
put the shot across the bow at this one
external ATA-based hardware RAID storage
that's fast and affordable and guess
what it has the high availability needs
that you use in business critical
applications as well and you get that
for free and so when you look at this
picture you say how can something that
costs as little as internal storage
based on what people will do with SCSI
deliver the performance equivalent to
what you get with external Fibre Channel
based systems that cost five times as
much well it can happen
and I'll show you why in a couple
minutes so here's what it's really all
about IT has to look at doing more with
less I don't think there's anyone in the
room here who's had a boss who's
said you know what let's deliver
less services this year let me give you
tons of money go buy whatever you want
and only work five days a week and six
hours a day I see a bunch of nods
everybody's doing that right okay it's
not happening like that it's like this
department needs this this department
needs that I need to deploy it now it
needs to be faster it needs to be better
I'm bringing ten people online I need
you to work Sunday to install that
server and while you're at it put the
storage in the top of the rack as far as
you can because there's no room to buy a
new rack that's happening and then
there's backup so how do you backup 17
terabytes of storage what I have on this
stage how long does that take
can you do it in the eight hour window
that no people are working at your
company say yes right so people are
looking at different ways of doing it
they're saying you know what what we can
do is as we bring more storage online
we're looking at new ways so things like
nearline disk-to-disk backup have we all
heard that term disk-to-disk backup
why is that because it's fast right
and then I can back it up I have another
copy and at my leisure I can have those
slow tape drives take it offline because
I can tell you right now you can backup
all 17 terabytes that are on this stage
in one hour for about a million dollars
it's about how many tape drives it it
takes but all 17 terabytes that I have
up here to do it nearline to get
another 17 terabytes with Apple storage
would cost you about $60,000 see the
difference wonder why people are buying
Nearline so that's what's really
happening okay I know this will work
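The backup-window argument above comes down to sustained throughput. The sketch below works out what rate is needed to copy 17 terabytes in one hour versus an eight-hour window, and the ratio of the two costs quoted in the talk. It assumes decimal units (1 TB = 10^12 bytes, 1 MB = 10^6 bytes) and ignores any overhead; the function names are hypothetical.

```python
TB = 10**12   # assumption: decimal terabytes
MB = 10**6    # assumption: decimal megabytes

CAPACITY_TB = 17              # storage on stage, per the talk
TAPE_COST_USD = 1_000_000     # "about a million dollars" for a one-hour tape backup
NEARLINE_COST_USD = 60_000    # "about $60,000" for a second 17 terabytes on disk


def required_throughput_mb_s(capacity_tb: float, window_hours: float) -> float:
    """Sustained rate needed to copy capacity_tb within window_hours."""
    return capacity_tb * TB / (window_hours * 3600) / MB


def cost_ratio(expensive_usd: float, cheap_usd: float) -> float:
    """How many times more the expensive option costs."""
    return expensive_usd / cheap_usd


if __name__ == "__main__":
    for hours in (1, 8):
        rate = required_throughput_mb_s(CAPACITY_TB, hours)
        print(f"{hours} h window: {rate:,.0f} MB/s sustained")
    print(f"tape vs nearline cost: {cost_ratio(TAPE_COST_USD, NEARLINE_COST_USD):.1f}x")
```

Even the eight-hour window needs several hundred megabytes a second sustained, which is why a fast disk copy first and a slower tape pass afterwards is attractive.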
okay so I mentioned storage classes so
there's three classes of storage most
people identify the first class is high
availability highly redundant and very
expensive where do you want to use this
well some people used to call this
mission-critical
I tend to call it mission-critical but
everything you do is mission-critical
if you're doing code and you lose that
code that was mission-critical to you
especially if you had deadlines next
week if you're deploying a web server
and you lose all that content that was
mission-critical but where do you really
see this used I used my favorite example
is the ATM machine when I put my card in
I want to make sure that it has my
balance or lies and gives me more money
but it has my balance that's in there so
that's mission-critical I want the
people who run the banks to spend a lot
of money on storage but you know the
transactions that I did the day before
they're not as important to me because
it's my balance I care about so I want
to make sure that that becomes class two
storage it's still protected but it's
business critical it's not mission
critical and that's like running a
website that's business critical stuff
that's your email it's business critical
stuff it's highly redundant and it's
available but you can accept some
downtime as long as you can save the
data and the data still there and then
there's RAID 0 RAID 1 JBOD low-cost and
risky you got two hard drives you
mirror and you run your whole
business on it it's a good thing so this
is the way storage classes get deployed
and so in the enterprise apps I use that
Big E word up there and database
financial there's what they call online
storage in class one there's also what
they call Nearline storage what the heck
is nearline nearline means it's not
as fast it's not as redundant but it's
there and safe and if it does have a
little bit of downtime I can recover
from it quickly and I know my data is
still there
and then when you look at business apps
for the most part you start to see that
they're using different classes of
storage
you'd never deploy an enterprise app on
Nearline alone you'd always have some
online but you would find a business app
running on nearline and you'll find
ecommerce and web apps running on
Nearline so this whole class thing of
storage is starting to do
something to the world
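The three storage classes described above can be written down as a small lookup. This is a sketch of the idea, not a scheme defined in the talk: the type and field names are hypothetical, the relative cost label for class two is an assumption (the talk only calls class one very expensive and class three low-cost), and the selection rule is simply "use the cheapest class whose downtime the workload can accept".

```python
from dataclasses import dataclass
from enum import Enum


class Downtime(Enum):
    NONE = 0         # mission critical: no downtime
    MANAGEABLE = 1   # business critical: some downtime if the data survives
    TOLERATED = 2    # low-cost and risky


@dataclass(frozen=True)
class StorageClass:
    tier: int
    label: str
    downtime: Downtime
    relative_cost: str


CLASSES = (
    StorageClass(1, "mission critical", Downtime.NONE, "very expensive"),
    StorageClass(2, "business critical", Downtime.MANAGEABLE, "moderate"),  # cost label assumed
    StorageClass(3, "RAID 0 / RAID 1 / JBOD", Downtime.TOLERATED, "low"),
)


def cheapest_class_for(acceptable: Downtime) -> StorageClass:
    """Return the highest-numbered tier whose downtime fits the workload."""
    candidates = [c for c in CLASSES if c.downtime.value <= acceptable.value]
    return max(candidates, key=lambda c: c.tier)


if __name__ == "__main__":
    # Home directories can accept some downtime as long as the data is safe.
    print(cheapest_class_for(Downtime.MANAGEABLE).label)
```

The point of the lookup is the one made in the talk: not every workload has to sit on class one storage.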
why well imagine if I have to buy Brand
X storage that I've used for all my
mission-critical applications and it
costs and this is not unrealistic
$50,000 a terabyte and since I've
deployed five terabytes of that in my
organization I give it to everyone
everyone's home directory runs on that
storage that cost a lot of money that's
crazy
when I can buy two and a half terabytes
of Xserve RAID and I can let all my
people have ten megabytes of iTunes
Music Store I can have all my people
have enough room to do their work I
don't have to bug them and I can put
their home directories out there and
it's safe and it's secure
but guess what it's not the transactions
that run the company so that's what this
classing of storage is really doing so
how do you plan for this because we're
gonna talk about the real fun stuff in a
minute but how do you plan for well the
first thing is you have to plan
take into consideration the capacity the
throughput and availability and also the
cost and value so when you're doing
capacity planning my rule is buy more
than you need because if you think you
bought enough today start working on
that budget for next year because you're
gonna buy more and make sure you have a
solution that scales
how about throughput you really have to
look at this what is the throughput you
need again if you get throughput for
free buy the highest throughput you can
but it's all about configuration and
it's all about the availability
mission-critical no downtime business
critical manageable downtime archived
and near line so you have to take all
those things in consideration and build
a big chart and decide what you really
want because there are times when you
want to buy storage that's even beyond
what Xserve RAID can deploy so
here's what people usually
use to determine their storage
the first metric that you always get
this is the financial people always say
this one what's the cost per gigabyte so
currently IDC and Gartner they say if
you believe them they say it's $30 a
gigabyte for high availability storage
Xserve RAID is four dollars and 36 cents
so chalk one up for value what are
the hidden costs that you get with other
companies
well the hidden costs are things you
have to look at what's the service and
support cost what's the additional
software you have to buy
both to set up and manage the
system what's the cost of cables what's
the cost of power because it draws a
lot of power those are all the hidden
costs that you have and also what about
expansion when I expand does it cost me
as much to expand as it cost me to buy
in or does it cost more does it cost less
and will the company be there in a
couple years those are all storage planning
tools
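The cost-per-gigabyte comparison above can be worked through as a short calculation. This is a minimal sketch that uses only the two per-gigabyte figures quoted in the talk; the 2,500 GB capacity is an assumption standing in for "two and a half terabytes", the function name is hypothetical, and none of the hidden costs (support, software, cables, power, expansion) are modeled.

```python
# Per-gigabyte prices as quoted in the talk.
HIGH_AVAILABILITY_PER_GB = 30.00   # IDC/Gartner figure cited by the speaker
XSERVE_RAID_PER_GB = 4.36          # Xserve RAID figure cited by the speaker


def purchase_cost(capacity_gb: float, dollars_per_gb: float) -> float:
    """Raw purchase cost only; hidden costs are deliberately left out."""
    return capacity_gb * dollars_per_gb


# Assumption: 2,500 GB stands in for "two and a half terabytes".
capacity_gb = 2500

brand_x = purchase_cost(capacity_gb, HIGH_AVAILABILITY_PER_GB)
xserve_raid = purchase_cost(capacity_gb, XSERVE_RAID_PER_GB)

print(f"high availability storage: ${brand_x:,.2f}")
print(f"Xserve RAID:               ${xserve_raid:,.2f}")
print(f"ratio:                     {brand_x / xserve_raid:.1f}x")
```

The same function can be reused per storage class once each class has its own price, which is one way to start the "big chart" the planning discussion asks for.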
how about deployments so here's the
scenarios we always see an Apple server
based deployment a single server a single
raid I think we could all do that one a
single server multiple raids I'd like
everyone to do that one multiple servers
single raid or multiple servers multiple
raid so let's look at the cases how
about a mixed OS platform is it possible
to do that with Xserve RAID you could
have multiple servers multiple raids
single server single raids all that's
possible how about a virtualized mixed
platform what is that why would you want
to virtualize we're going to talk about
that so let's build some scenarios you
ready I built a couple already here so
here's three typical scenarios that I
built the first one is I took Mac OS X
Server on an Xserve and an Xserve RAID it
was a pretty easy and simple thing to do
the next scenario I did is I took that
same Xserve RAID and I attached two
Xserves to it I took another scenario
in the middle kind of a funny one I have
this server from a company called IBM
and one from a company called Dell with
an Xserve and I have two raids and I'm
sharing the storage across all those
platforms and in scenario 4 it's kind of
interesting what I call virtualized
storage so I have a third party product
called a chaparral radar PS or a
provisioning server there's the word you
want to remember provisioning I have a
couple Xserves and a couple Xserve RAIDs
and I basically added additional
capabilities to Xserve RAID and kept
that low cost storage deployment but added
true enterprise class performance and
reliability and scalability to the
Xserve RAID without having to buy some
very expensive storage that you
could use to do this so let me show you
how they work
first thing is let's look at an Xserve RAID
if you've seen one other than here it's
pretty standard clean in the front I
mentioned that there's 14 hard drives
people always say we went a little
overboard with lights I say we need more
drive activity and health activity on
each drive each hard drive what is
health we actually measure the drives
for are they good are they in a pre fail
condition that means are they gonna fail
or have they failed so the light goes
green amber and red red is dead Green is
good amber means that we've used a
technology called SMART to actually look
at the drives and say what is the health
of that drive if it's gonna fail we're
gonna pre fail it and we're gonna warn
you and send you an email and say hey
this drive is gonna fail and if
you had a hot spare in the system we're
gonna rebuild it to a hot spare so you have full
availability it's an indicator on the
front gives you a good idea there's
fiber channel link indicators just like
on Xserve we want to make sure that
the links that you have to your server
are easily available why do we put those
on the front there's a lights on the
back right well this this comes from the
classic example of ease of use anybody
have a rack anybody ever install a rack
okay a few people here a lot of people
when you have a rack you really want to
make it really clean in the back and run
all the cables down the side it's gonna
be beautiful it doesn't happen guys
looks like a spaghetti jungle in there
and so there's a chance there's always
that possibility that you could come
along and unplug a cable by mistake and
if you don't have an indicator on the
front to show you you may not notice
that until you walk to your desk and get
an email or page so it can happen it
can happen really easy so we put a lot
of indicators on the front including
link indicators for fiber channel
because I will tell you when you
disconnect the fiber channel cable you
don't get called by our admin utility
first you get called by all those people
who just lost their their connection for
their data and usually those pages of
the bad ones so we put a lot on the
front you'll notice just like XAR we
have a system identify our button an
enclosure lock and we added something
else called an alarm silencer because I
can set up my xserve raid that if I have
a failure and it's in a deskside
configuration
or under the desk or on the desk or in
the CEOs office I can actually have it
beep at me beyond the visual indicators
and you notice this one didn't beep when
I pulled that power cord out because I
shut it off that's the fun thing you can
do so let's look at the back of it this
is really how you deploy it right pretty
clean Apple design two big fans and power
supplies they're redundant two redundant
cooling modules in the middle two
independent raid controllers two backup
battery modules a power a power
indicator an alarm mute button on the
back as well and a system identifier and
so you look at it you say pretty easy
there's nothing you can't get to if I go
back and look at the front I can get to
all the hard drives if I look at the
back I can get to all the components so
the design of xserve raid is a high
availability design meaning that the
only thing that I can't change in the
field is that midplane
but remember that was a passive data
flow so the only way to break that is to
physically bend the connectors bad thing
guys or you're gonna hurt the metal
maybe look at the bezel on this little
hard to hurt right it's aluminum so
you're really not going to be able to
hurt anything and you're gonna be able
to deploy this thing anywhere you need
it in an easy manner so that most of the
connections you're gonna look at are
going to be the fiber channel connection
the Ethernet connection and the serial
connection why well that's what we're going
to talk about
so here's scenario one I call this one
standard operating procedure you take an
Xserve RAID you take an Xserve you
put our fibre channel card in the Xserve
it has two ports on it
you take the two cables that come with
the card you plug them in the Xserve
raid and you're done
it's easy two and a half terabytes ready
to go the Xserve RAID comes out of the
box it's configured for two raid 5 sets
so you have about 2.1 terabytes
available to you protected raid you go
to Disk Utility just like you would any
other hard drives either build one you
know one raid 50 set by striping the two
together or two raid 5 sets they're on
the desktop you serve them to your
clients you're done pretty easy
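The "about 2.1 terabytes" figure follows from how raid 5 spends one drive's worth of space per set on parity. A minimal sketch of that arithmetic, assuming 180 GB drives (2.5 TB raw across 14 drives) and seven drives per set; both numbers are inferred rather than stated at this point in the talk, and the function name is hypothetical.

```python
def raid5_usable_gb(drive_count: int, drive_gb: float) -> float:
    """Usable capacity of one raid 5 set: one drive's worth goes to parity."""
    if drive_count < 3:
        raise ValueError("raid 5 needs three or more drives")
    return (drive_count - 1) * drive_gb


DRIVE_GB = 180        # assumption: 2.5 TB raw spread over 14 drives
DRIVES_PER_SET = 7    # assumption: 14 drives split into two equal sets

per_set_gb = raid5_usable_gb(DRIVES_PER_SET, DRIVE_GB)
total_gb = 2 * per_set_gb

print(f"each raid 5 set: {per_set_gb} GB usable")
print(f"both sets:       {total_gb} GB usable")
```

Striping the two sets together as raid 50 presents them as one volume but does not change the usable total.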
you tell your boss that took you two
days software that comes with it Apple
raid admin what is raid admin it's a
utility of Apple's design to make it easy
to deploy these systems so you can
install them using this utility you can
manage them using this utility what's
different about it what's different
about it it's web-based it's Java based
so read that Java based platform
independent based you can use it
anywhere in the world and you know
what's cool about it you can deploy a
raid system literally in a minute and
I'm going to show you that and then the
other thing about this installation
is you just have to have a strong back
because you've got to lift the raid ok how
about this one scenario 2 two Xserves
and an Xserve RAID what do we do here why
do we do this well what we did is we
centralize the storage so that's what I
did in this case I took my Xserve RAID
I didn't use a switch in this case I
just went directly from the Xserve RAID
to the two Xserves I took centralized
protected storage high availability
storage and delivered over a terabyte to
each Xserve now what's really cool about
this is that that storage is more
redundant than the server in this case
and if I ever had a problem I can move
that storage over to another server so I
have a highly redundant configuration
and it's inexpensive how about scenario
3 that's the one I built over here
actually some nice people here built it
for me
we took an Xserve we took an IBM 330
server a Dell 1550 server an Xserve
RAID we took Apple RAID Admin and I
configured it with 4 raid sets 3 raid
5's and a raid 1 and each server has a
raid 5 set and the Xserve has an
additional raid 1 boot so contrary to
popular belief you can actually boot
your Xserve off the Xserve RAID it
just works and i'll show you exactly how
we did that so what I did is I built a
raid set this took me about 10 seconds
and I took three hard drives and I built
a 360 gig raid set fully protected and
I gave it to my Xserve
then I built another raid set and I gave
it to the Dell dude sorry
and then because someone said I had to
keep that Dell and that IBM 330 and my
budget for buying Xserves to replace those
pigs took a little while I also gave a
360 gig or 540 gig raid set to my IBM
330 and then the last two hard drives
that I had I made a boot that was
mirrored for my Xserve now I had two
drives left over and I made those hot
spares so that all my raid sets are
globally protected against a failure of
any drive it's pretty cool
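The scenario 3 layout can be checked as a simple drive budget. This is a sketch under stated assumptions: 180 GB drives, a three-drive set for the Dell (its size is not given in the talk), and the four-drive, 540 gig option for the IBM. The `RaidSet` class and its field names are hypothetical.

```python
from dataclasses import dataclass

DRIVE_GB = 180      # assumption: 2.5 TB raw spread over 14 drives
TOTAL_DRIVES = 14   # drive count stated in the talk


@dataclass
class RaidSet:
    host: str
    level: int    # 1 (mirror) or 5 (striping with parity)
    drives: int

    def usable_gb(self) -> int:
        if self.level == 5 and self.drives >= 3:
            return (self.drives - 1) * DRIVE_GB   # one drive's worth of parity
        if self.level == 1 and self.drives == 2:
            return DRIVE_GB                       # two-drive mirror
        raise ValueError(f"unsupported layout: {self}")


layout = [
    RaidSet("Xserve data", 5, 3),   # 360 gig set from the talk
    RaidSet("Dell", 5, 3),          # assumption: same size as the Xserve set
    RaidSet("IBM", 5, 4),           # the 540 gig option mentioned in the talk
    RaidSet("Xserve boot", 1, 2),   # mirrored boot volume
]
HOT_SPARES = 2

drives_used = sum(rs.drives for rs in layout) + HOT_SPARES
for rs in layout:
    print(f"{rs.host:12s} raid {rs.level}  {rs.drives} drives  {rs.usable_gb()} GB")
print(f"drives used: {drives_used} of {TOTAL_DRIVES}")
```

With these assumptions the four sets plus two global hot spares account for all 14 drive bays.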
so scenario four was virtualized and I
would go through this whole thing but
this is really complicated so I'm just
gonna hit the top lines of here but this
is something that a lot of people are
really looking for this competes at the
highest level with storage systems that
cost well hundreds of thousands of
dollars in the same capacity with some
storage systems you'd have to spend
hundreds of thousands of dollars to get
this capability so we took two Xserves
we took two Xserve RAIDs we took a
Vixel fiber channel switch a 16 port
Vixel fiber channel switch and
fortunately I didn't have enough room to
put it in the front it's in the back
I took a third party product a
provisioning server from a good company
called Chaparral called a raid RPS I
took Apple RAID Admin and Chaparral LUN
provisioning software and I built a
completely virtualized storage system we
call this a SAN you might have heard
that term for the technology and so what did
it do well the radar PS sees four point
three two terabytes and the radar PS
allows me to build 1,024 LUNs
what's a LUN a LUN is basically think of
it as a partition so that if I had a lot
of users maybe not 1,024 but if I had a
hundred users and they were all graphic
artists I could break that storage up to
them equally and I probably would leave
a pool of that storage unused or
possibly just dedicate
it for something that I'm not using
today and I could deliver that up to 126
servers or 126 users and in this scenario
where I have the two servers I've given
a two terabyte LUN to each one of these
servers so let's take that let's take
that scenario let's decide that one
server handles my graphics department
and one server handles my video
department my graphic guys are slacking
off they're using a terabyte of storage
my video guys on the other hand they've
been downloading QuickTime flicks for a
long time here and there they really are
maxed out I'm running at two point one
six terabytes I'm telling them every
week throw your stuff away what can I do
with radar PS and with Xserve RAID I can
go ahead and re-partition or re-engineer
that software so that I can re provision
it so that now my video people can have
let's say three terabytes and my graphic
artists only get one terabyte and I can
do that dynamically essentially on the
fly with a restart of the Xserve they
can have a complete change in what they
need and so that's really high high
level capabilities that are very
difficult to get with other systems now
the other thing that's what we call
dynamic growth of LUNs the other thing
we can do here is we can do things like
snapshots
so I mentioned backup right if that
graphics department is moving a terabyte
a day I probably don't have a tape drive
to move a terabyte a day but I can
snapshot that or make a point in time
copy of that data so that someone can
get it off there at their leisure and
that's all available with Xserve raid
using low-cost storage with a
provisioning server and again the
provisioning server we have here is the
provisioning server from chaparral now
managing LUNs and performance tuning
with RAID Admin so I'm going to give you
a quick look at a sneak preview of a new
version of RAID Admin that's coming out
let me show you how this works so can I
have a demo two up please okay
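The re-provisioning example from scenario 4 (two LUNs of about 2.16 terabytes each, resized to roughly three terabytes for video and one for graphics) comes down to bookkeeping against a fixed pool. The sketch below models only that bookkeeping; `LunPool` and its methods are hypothetical names, not the provisioning server's interface, and it says nothing about what happens to data on a LUN that shrinks.

```python
class LunPool:
    """Toy model of a storage pool carved into named LUNs (sizes in GB)."""

    def __init__(self, capacity_gb: int):
        self.capacity_gb = capacity_gb
        self.luns = {}   # LUN name -> size in GB

    def free_gb(self) -> int:
        return self.capacity_gb - sum(self.luns.values())

    def provision(self, name: str, size_gb: int) -> None:
        """Create a LUN or resize an existing one, refusing to overcommit."""
        growth = size_gb - self.luns.get(name, 0)
        if growth > self.free_gb():
            raise ValueError(f"not enough free space to give {name} {size_gb} GB")
        self.luns[name] = size_gb


# 4.32 TB seen by the provisioning server, split evenly at first.
pool = LunPool(4320)
pool.provision("graphics", 2160)
pool.provision("video", 2160)

# Shrink graphics first so the pool has room, then grow video.
pool.provision("graphics", 1000)
pool.provision("video", 3000)

print(pool.luns, "free:", pool.free_gb(), "GB")
```

The order matters in this model: growing video before shrinking graphics would be rejected because the pool starts fully allocated.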
so this is RAID Admin our little piece
of software to manage the raid system so
the first thing you're gonna notice is
that all my raid systems here happen to
be on my network and they're on a subnet
now to make it easy because I didn't
want to confuse myself I took some of
them off the subnet because I didn't
want to see them but you'll notice that
RAID Admin uses Rendezvous and so I can
just grab one of those systems and I
happen to know the password so I'm not
gonna tell everybody gets public here
anybody who's on my subnet here and I
know the password and I'm gonna log in
and it's gonna go out there and it's
gonna hit one of my my raid systems and
give you all the information about it
now remember pretty cool cocoa app right
it's Java guys swing Java so the things
you can do with this stuff is just
incredible first thing you'll notice I
get all the information about the
arrays down here and if anybody sat
in the Xserve session they saw something called
Server Monitor very similar application
try to make the applications have a
similar look and feel and you'll notice
I can scroll this up and down and use it
as I need to you'll see why in a second
here I get all my information about my
raid systems how long they've been up
how long they've been running I can look
at the arrays and the drives I'm
showing arrays here I happen to have
a raid 5 set here that's a terabyte
you'll see it called array 1 it's raid 5
but it's actually array 1 and then on
this side I don't have an array but if I
want to see what drives I have available
I can click through those drives and I
can see that each one of them is about
172 gigabytes
I also get an interesting view when I
click here and I notice that my drive
cache is disabled on this particular
raid set hmmm how did I do that I can
look at the components so I mentioned
that the system's have different
components this one has a power supply
and guess what it's ok I've got a little
green light cooling modules there's my
speed
Wow those things are running guess what
we actually timed them so they run the
same sis rpm and my controllers I can
look at those and you'll notice that my
batteries aren't installed so I cheaped
out and didn't buy batteries rule 1
always buy backup batteries how about
fiber channel I have a connection on
fiber channel I can look I've got
two links I can see this little
worldwide name down here it's only 48
digits it's really simple and easy to
remember and my network so why do I need
network because we do what's called
out-of-band management what's connecting
my raid admin system to my raid systems
here is Ethernet I'm not using the
bandwidth on fibre channel I'm doing
this in what's called out of band and so it's
really a simple connection but you'll
notice I have a link that's down because
I wanted to get an amber light in here
to show you what it looks like I have a
link that's down it's a warning it's
warning me telling me Alex you only
connected one of those Ethernet lines
so if your switch or router on your
Ethernet goes down you're not gonna be
able to see your raid system when you're
on vacation in Fiji next week so you
want to have them both on there you'll
notice I also have some speed and
configuration on here and then something
brand-new in RAID Admin 1.1 is an event
log so I can actually save the events
and know what I did because believe it
or not we get busy it's one of those
funny things that happens so this is
basically a system let's do something
funny here let's create a raid system is
that cool
so everybody's used to keychain right so
I have that in my Java app here let's
let's see if I remember my password not
gonna tell you it's Alex you notice the
first thing that happens here is it
shows me with this little lock right now
that I'm keychain enabled now this is
really where RAID Admin varies from every
other set up utility that you've seen
now if anybody's looked on the web
you've seen this screen but this is
really where it varies I'm an IT person
but I manage a lot of different things I
don't just manage my raid systems every
day if I do I got a pretty cushy job if
they're Xserve RAIDs but I don't
remember what the difference between
raid 3 and raid 5 is so I just look up
here and it tells me and I can build a
raid set knowing I need three or more
drives how do I do it this is really
hard so I pick a drive and I select them
oh my fingers getting tired oh there's a
lot of work I select my drives and then
we have this little thing is called raid
now what does that mean it's background
initialization so I can click that and
that allows me to actually use the raid
set instantly so what I'm going to do is
the raid set and it's gonna take a long
time it's actually gonna take 24 hours
to build a raid set because what I'm
gonna do is I'm gonna do a surface
analysis on every single drive in the
system to make sure there's no failures
of those drives because you don't want
to build a raid set on bad drives and
you'd be surprised if you ever see other
raid systems and they build a raid set
in like five minutes or ten seconds you
go oh they did a lot of checking on
those drives didn't they they made sure
those were perfect they just looked at
the signature and wrote it on there
we're gonna do a complete surface
analysis of the entire drive back and
forth reads and writes and we're gonna
spare out any bad areas we're gonna take
care of it all but in the meantime
you'll still be able to use it and then
I want high-performance and notice what it
says here it says that if I turn on
drive cache
I should have batteries or UPS that's a
warning guys cuz you can lose data if
you don't and then I say create I'm done
I just built a raid 5 raid
set on the fly in a matter of seconds
that's it it's easy so what else is
really cool about RAID Admin how about
settings so the one thing I can do on
a system in fact what I'm gonna do here
is I'm gonna grab another one that has
some different stuff on it so I can see
some different settings this is actually
the system I'm using right now in the in
this configuration with the provisioning
server
this one has some cool settings on it so
I'm gonna go in and take a look at the
settings and what you're gonna notice
first is the system information so this
is the radar PS rack that's this rack
over here and this is raid 3 I can
synchronize my time I can change my
management passwords remember I
mentioned about the audible system
alerts I can turn them on or turn them
off I can restart automatically when
there's a power failure anybody like to
get called in the middle of the night
when there's a power failure because the
raid went down you got to go push the
button back on forget that no more doing
that I mentioned the network settings I
can change from manual to DHCP fibre
channel if you have an existing fiber
channel network this is the most
important thing you're ever gonna play
with or never play with because in most
networks you have to set the speeds
we've gone through the topology and
speed setting and made
automatic and so now you can go ahead
and just basically run with it and we'll
go and investigate your network and
we'll configure the Xserve RAID for
your network and here's where I was
mentioning provisioning and LUN settings
so how do I serve different systems well
what I did here is I basically took two
world wide port names remember there are
only forty eight digits easy to remember
stuff did you remember those was that
the right one I'm not sure okay those
are the two ports that are on my fibre
channel card that's in my provisioning
server and I've given a raid five set
with one through seven drives to this
port and I've given another raid set to
this port and I've enabled LUN masking
that means everything else
that's on the same fibre channel
network can't see it only those two
ports now this software from other
people can cost a lot of money it can
cost tens of thousands of dollars but
with Xserve RAID it's included it's easy
and it's simple to configure we think
it's easier how about performance
anybody ever had a raid system and you
know you read these things 3,000
megabytes a second 40 million I/Os per
second and then you put it on you get
like 2 megabytes a second and you know
you found that your USB dongle is faster
part of that is part of that is
marketing but for the most part it's
because most raid systems you have to be
a genius to tune them so I mentioned
there were there were automatic and
there were manual settings for our raid
systems so first thing we do is we do
all the automatic stuff for you we tune
it based on the way you tell us you tell
us you have 512 megs of cache in there you
tell us that you have battery backup you
tell us you're gonna do raid 5 and we
tune it but we also give you other
settings
I mentioned drive cache I can turn the
drive cache on to the drives so if I
didn't have a UPS and I'm willing to
take a performance hit I don't want to
turn drive cache on because I'm scared
that if I have a power failure
I could lose some data now we do have in
Mac OS X we do of course have
journaling that
protects you against these things but
there's still the possibility that you
could lose a lot of data because there's
8 megabytes of cache per hard drive times
that by 14 and that's bigger than your
spreadsheet or my spreadsheet so we
may want to turn it on or turn it off
I always suggest hooking to a UPS and
running drive cache on because the
performance can be a hundred percent
better
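The drive cache warning is easy to quantify from the numbers given: 8 megabytes of cache on each of 14 drives. A minimal sketch of that worst case, assuming every drive's cache is full of unwritten data when power is lost; the function name is hypothetical.

```python
def worst_case_unwritten_mb(drive_count: int, cache_mb_per_drive: int,
                            drive_cache_enabled: bool) -> int:
    """Upper bound on data sitting only in drive caches when power is lost."""
    if not drive_cache_enabled:
        return 0   # nothing is held in the drive caches
    return drive_count * cache_mb_per_drive


# Figures from the talk: 14 drives, 8 MB of cache per drive.
exposure = worst_case_unwritten_mb(14, 8, drive_cache_enabled=True)
print(f"up to {exposure} MB at risk without a UPS")
```

This is the amount a UPS protects when drive cache is left on for performance.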
how about write cache what is write
cache this is kind of a misnomer enable
and disable because what we actually do
is we have two levels of write cache one
is write through that means we don't
really cache anything at the raid
controller everything that comes through
we just write it through it and so the
data flows quickly to the drives so the
computer in this case let's call it an
Xserve as soon as it sees that the
data is at the drives it's free to send
more data so it gets this write complete
if we do something called write back
cache which is what we're enabling here
we actually tell the system the Xserve
in this case that we've written the data
to the drives when we really haven't
we've written it into a cache buffer and
that's why we have those battery backups
to back up that cache in case for some
reason we lost power at that case but
that could also increase your
performance because imagine if you want
to have a lot of people hitting the raid
system at once what do you what do you
expect it to do you expect the Xserve to
send a lot of commands to the Xserve
RAID at once and the only way you can
accept those commands is to start
caching them up and also you have to
remember that the data that's coming off
the hard drives can come either in order
or out of order to the RAID controller
and so what we want to do is want to
cache all that data up so we can get as
much data delivered to the Xserve at
once we want to open the floodgates so
this is important now I mentioned that
we had dynamic prefetch in the system so
why do I have a prefetch setting for
read and what is prefetch well
essentially what prefetch is is that if
I let's take let's take the alphabet A
through Z if I tell you that I want to
look at a and B your
much gonna guess that the next thing I
want to look at is C and D so in a
streaming application or if you're like me
I'd probably go look at Z next but if
you're in a streaming application it's a
pretty good guess that I'm gonna
prefetch a lot of data ahead because I'm
gonna anticipate what you're gonna get
next so I said that we're dynamic so
what we do is we look at the data that
the driver is giving us from our Xserve
from the OS and we say what kind of data
is it if it's streaming data what I do
is I open my prefetch window and I start
grabbing a lot of data because I know if
I'm streaming a QuickTime movie the rest
of the QuickTime movie is what you're
likely gonna want to see it's the data
you want but you notice here I have
three settings for that why if I can just
dynamically change this it's the
size of the dynamic changes we can
make so if I set it to 128 you give me
the raid controller a lot of range
I can go anywhere I want if I set it to
1 I can only do a little prefetch why
why would I need to change it why would
you be smarter than me well the reason
is you may not want to watch that whole
QuickTime movie you might be scrubbing a
timeline you might have a hundred users
and everybody wants to share a little of
that data so you know your data patterns
better than I do
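The idea of a dynamic prefetch window with a user-set ceiling can be illustrated with a toy read-ahead policy. This is only a sketch of the general technique, not the controller's actual algorithm: the doubling rule, the reset on a non-sequential read, and the names `PrefetchWindow` and `max_blocks` are all assumptions, and the unit of the 1 and 128 settings is not specified in the talk.

```python
from typing import Optional


class PrefetchWindow:
    """Toy adaptive read-ahead: grow on sequential reads, reset on random ones."""

    def __init__(self, max_blocks: int):
        if max_blocks < 1:
            raise ValueError("max_blocks must be at least 1")
        self.max_blocks = max_blocks           # ceiling chosen by the admin
        self.window = 1                        # current read-ahead size
        self._last_block: Optional[int] = None

    def on_read(self, block: int) -> int:
        """Record a read and return how many blocks to prefetch next."""
        sequential = (self._last_block is not None
                      and block == self._last_block + 1)
        if sequential:
            # Looks like streaming: widen the window, up to the ceiling.
            self.window = min(self.window * 2, self.max_blocks)
        else:
            # Looks random (scrubbing, many small users): start over.
            self.window = 1
        self._last_block = block
        return self.window


streaming = PrefetchWindow(max_blocks=128)
stream_sizes = [streaming.on_read(b) for b in range(10)]

scrubbing = PrefetchWindow(max_blocks=128)
scrub_sizes = [scrubbing.on_read(b) for b in (5, 90, 3, 41)]

print("streaming:", stream_sizes)
print("scrubbing:", scrub_sizes)
```

A high ceiling lets a sequential stream reach a large window quickly, while a ceiling of 1 keeps the controller from reading ahead for workloads that jump around.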
so what's cool about this it's dynamic
you run a test based on your data
patterns using your users you
hammer it you hit it a lot you don't see
the performance you want what do you do
it's really hard here you go over and
you push the button run the test again
so I'll give you some general rules for
this anybody hear of I/Os per second
small files are I/Os per second large
files are usually megabytes they usually
work against each other if you have
large files video content put the
prefetch as high as you can if you have
tons of users lots of users they do
small files turn the prefetch down
you'll help the system and in all cases
keep write cache enabled so that's
basically a real quick tour of RAID
Admin now let me show you something else
can I go to demo three please
so this is I don't know what kind of
system this is this is can't remember
one of these things it's called a PC so
this is actually RAID Admin running on
Windows now this isn't something we
support but I mentioned RAID Admin is Java
well I'll even show you something
that I didn't show you on the Mac which
I could have shown you is contextual
menus raid admin actually has contextual
menus as a Java app so it's pretty cool
all the capabilities of raid admin on
the Mac are available on other platforms
such as Windows such as Sun anything
that runs Java 1.3.1 or better so you can
actually manage your system although not
necessarily a recommendation you can
actually manage your system from a
Windows system so when you're stuck in
Fiji on vacation and you you forgot to
bring your PowerBook with you you can
still manage through RAID Admin ok
great can we go back to slides please
cool so here's a typical LUN management
configuration one thing I like
to talk about is how we actually can do
basic SAN functionality in Xserve RAID
with RAID Admin without really having to
spend any money on SAN software pieces
and parts and still do what's true to a
SAN which is consolidating your storage
and deploying it to many servers so I
showed you how we had four servers
basically deployed here or three servers
deploying four raid sets so let me show
you how it actually would look we break
it up into four Xserves in this case I'd
build 4 raid sets using that LUN masking
you saw and now I get the advantage of
global hot sparing using that tool by
just putting the worldwide port name of
each one of those Xserves and
addressing those to each one of the raid
sets simple so we can deploy that you
can do it in seconds in this case I used
a fiber channel switch to allow me to go
from 2 to 4 now the beauty of this is
that not every computer can see every
raid set only the computer or only the
Xserve in this case
that is attached to its raid set can see
its raid set the other data is
protected away and easily I can change
who sees what data this is a really
simple deployment so let me wrap it up
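LUN masking as described here is an allow list: each raid set is addressed to specific worldwide port names, and any other port on the same fabric cannot see it. The sketch below models that lookup only. The `LunMask` class is hypothetical, and the port names are made-up placeholders, not values from the demo.

```python
class LunMask:
    """Toy allow list mapping raid sets to the port names that may see them."""

    def __init__(self):
        self._allowed = {}   # raid set name -> set of worldwide port names

    def grant(self, raid_set: str, wwpn: str) -> None:
        self._allowed.setdefault(raid_set, set()).add(wwpn.lower())

    def visible_to(self, wwpn: str) -> list:
        """Raid sets a given port can see; everything else is masked."""
        port = wwpn.lower()
        return sorted(rs for rs, ports in self._allowed.items() if port in ports)


# Made-up port names for four Xserves; real values come from each
# server's fibre channel card.
xserve_ports = {
    "xserve1": "10:00:00:00:00:00:00:01",
    "xserve2": "10:00:00:00:00:00:00:02",
    "xserve3": "10:00:00:00:00:00:00:03",
    "xserve4": "10:00:00:00:00:00:00:04",
}

mask = LunMask()
for index, (host, wwpn) in enumerate(xserve_ports.items(), start=1):
    mask.grant(f"raid set {index}", wwpn)   # one raid set per Xserve

for host, wwpn in xserve_ports.items():
    print(host, "sees", mask.visible_to(wwpn))
print("unknown port sees", mask.visible_to("10:00:00:00:00:00:00:99"))
```

Changing who sees what data is then a matter of moving a port name from one raid set's list to another.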
there's five hints I'll give you for
deployment first thing to remember is
it's easy as one two three
almost the first one is read the manual
what there's two manuals for Xserve
RAID one's a hardware manual it's 91
pages and it's excellent the second one
is a software manual for raid admin do
you really need to read it I would read
it plan ahead this is the biggest
problem that we all have with
deploying raid is we either buy too
much or buy too little but we never
deploy it right so that goes along with
my next one buy too much because you're
gonna ask for more later okay that's the
little marketing pitch okay then number
four read the manual again okay and then
number five go deploy it so it's really
simple if you want to get more
information you can contact myself or
skip Levin's
and of course there's reference
libraries for Xserve Xserve raid a lot
of information online for that and i'd
like to open it up to any questions that
anybody has if we can do that oh I did
want to talk about one more thing two
more things actually in the
deploying Xserve
session yesterday we were talking about
ways of making it easy to deploy
multiple servers and Doug Brooks
mentioned something called the introvert
which is a little device here I call it
the little buddy for the Xserve
RAID and it allows us to take a
cartridge or what we call a drive
carrier from an Xserve or an Xserve
RAID and bring it up onto a g4 g5 or even
another Xserve really easily
because it makes it hot pluggable using
firewire on the system there's two
really cool things you can do one is I
mentioned mirroring on the X Server aid
so if I mirror let's say I have
seven drives on one side of my xserve
raid and I do a mirror of one of those
drives and I don't tell it that I want
to just mirror one drive to the next
drive it'll automatically mirror all
seven drives why would anybody ever use
that well how about if I have to make an
image and deploy to multiple Xserves I
have seven boot drives and if I want to
check them I take my extrovert am i
introvert actually and I mount it up on
my g5 g4 and I can actually see what I
got so I can have seven copies easily so
that's my little tips and tricks thing
for Xserve RAID