WWDC2004 Session 319

Transcript

Kind: captions
Language: en
Good morning. I am, of course, up here to talk about programming for the Mac OS X 64-bit ABI, or more correctly, to talk a little bit about Apple's direction with respect to 64-bit computing, and then hand things over to our esteemed engineering team, who are going to give you a lot more detail about this.

First things first: why 64-bit? The key thing that we see is access to more memory than you can imagine, for now. That's always an embarrassing remark, of course; 640K used to be more memory than some people could imagine, and 16 exabytes may seem like that someday as well. I hope I'm not introducing the 128-bit ABI session. We see a number of reasons for this. In some cases you're going to see increased performance for memory-intensive tasks. Now, I will say that, all other things equal, your program is likely going to run slower in 64-bit, but all other things are not always equal here. One of the great things is that you have access to a lot more physical memory and a lot more virtual memory, and we think there are some benefits to that. Obviously one of those is that physical memory is considerably faster than disk. The other is that you can be lazier programmers in some instances, because it's going to be a lot easier to access this large amount of memory around the system.

Our 64-bit goals are quite simple: we want to provide the key benefits of 64-bit computing for the people who need it the most, while making sure that we don't break any of the 12,000 native applications that we have today. We talk a lot about this internally as a goal. We want to make sure the people in scientific computing, high-performance computing, and media have access to the benefits of 64-bit, but we don't want educational software vendors with very simple titles for children to have to redo everything because we've decided to move into the 64-bit world. So our objective is one that really stresses compatibility and performance for the native applications on the system; 64-bit is an addition. You'll learn more today about why we're able to do that in our 64-bit architecture, but I think it's a unique advantage that Apple has with Mac OS X.

We started the 64-bit journey already with Panther: the system had access to greater than 4 gigabytes of physical memory in a machine, but all of the applications that you could write were 32-bit, all of the user tasks were 32-bit, and that's the address space you had as developers. All applications, however, could actually use 64-bit hardware math functions; there were 64-bit registers. You did have to recompile your binaries to take specific advantage of the PowerPC G5 architecture, but those were features you could get. Once again, this is something that distinguishes Apple's 64-bit story from some of the other platforms: many of the performance benefits of 64-bit computing are actually available to 32-bit applications today on Mac OS X, if you're willing to do the recompile and optimization for the PowerPC G5. But with Tiger we want to take it to another level, and we're adding 64-bit addressing for user tasks. We're going to focus our 64-bit support initially on the applications that we think most likely, and most urgently, need to benefit from 64-bit addressing, and we think these are going to be scientific applications, rendering and computational engines, and server applications. The infrastructure we're providing to do that is based on a 64-bit libSystem and 64-bit capabilities in the compiler. So that concludes, briefly and mercifully, the marketing portion of this morning's presentation, and I'm very pleased to hand this over to Nick Kledzik, who's going to talk more about 64-bit architectures.
Thank you. One of the things mentioned a moment ago was that 64-bit computation is already available to 32-bit applications on Panther today, and I want to start off by going into detail about what that means. When you think about 32-bit versus 64-bit computing, there are actually four aspects to it. First, you need a processor with 64-bit registers. Second, you need to be able to load and store those 64-bit registers to main memory in single instructions. Third, you need a 64-bit ALU, which is the computation unit of the CPU, so that you can do 64-bit multiplies and shifts and so forth. And lastly, if you want to use those 64-bit registers to access memory, you need the kernel to have set up a 64-bit address space for you. Now, the first three of those four points are already available in Panther today. You can take your 32-bit apps and take advantage of the full 64-bit registers; you can do 64-bit loads and stores, and use them to move memory faster; and you can take advantage of the 64-bit ALU if your application happens to be intensive in 64-bit integer computation. You can do all those things without having a 64-bit address space and get a performance boost. Another way of looking at this is that all we're introducing in Tiger is 64-bit address spaces.

Next I'd like to talk a bit about how 64-bit was implemented in PowerPC, because it is a lot different from how some other microprocessors have added 64-bit-ness to their line. PowerPC is kind of unique in that it started out as a 64-bit architecture, over 15 years ago, and only now are we finally seeing the fruition of the full 64-bit design. Another interesting thing about it is that there is no mode. On some other processors you have a 32-bit mode and a 64-bit mode, and when you're in those different modes you have different registers and different instructions; it's almost as if two processors were merged into one, and there's some big switch as to which one you're using. One of the downfalls of that is that in an implementation of a processor with that model, the implementers have to decide how to microcode each of those instructions, and they tend to make the 64-bit one faster at the cost of the 32-bit one. PowerPC does not have that problem: there is no mode on PowerPC between 64-bit and 32-bit; it is completely a software convention whether you're doing 64-bit or 32-bit. I want to go into some more detail on this, because once you get this, the rest of the 64-bit talk and how Apple is rolling out 64 bits will make a lot more sense.
If you've played around with the PowerPC instruction set, you'll know, this being a RISC processor, that there are basically two categories of instructions: load and store instructions, and everything else. Load and store instructions are the only instructions that can access main memory; all they do is move data between memory and registers. All other instructions operate on registers, for instance add r3 and r4 and store the result back into r3. If you look at those instructions and what they do to registers, you'll notice there's no size designation on them. There's no "add byte" or "add word"; there's simply the add instruction, which works on the entire register. The way that works is that if you want to do 32-bit math, or you're in a 32-bit program, you only ever load the low 32 bits of the register, you do your addition or whatever, and you only ever look at the low 32 bits of the result. What that allows is that a program written to 32-bit conventions has no concept of whether or not there are actually any higher bits. What this means is that when the PowerPC came out, even though it was a 64-bit architecture, the original silicon only supported the low 32 bits of the registers, and all the software written only looked at the low 32 bits, because that's all there was; it all followed that convention. When the G5 came out, there were suddenly more bits there, but the programs only load the low 32 bits, and it doesn't really matter what's in the high bits; they can be essentially garbage. The programs still do the same add instruction, and in the results they only look at the low 32 bits. So what this means is that there's no mode with PowerPC; it's only conventions that distinguish 64-bit from 32-bit processes. Another way of looking at this is that there's no performance penalty for running as a 32-bit process; in fact, as we'll learn later, there's a slight performance gain for being 32-bit.

Now, those of you who have done assembly programming also know that there's something called condition codes. So you've got this mental model of how you can run with 32-bit conventions on a 64-bit processor, but you may ask, what about the condition codes? Condition codes are things like the carry bit and the zero bit; they are set as a side effect of some instructions. If you do an add and there was a carry out of the ALU, the carry bit is set, or if you do an add and the result was zero, the Z bit is set. There is one tiny bit of mode-ness in the PowerPC, and that is that there's a mode for how the condition codes are set as a result of an instruction: whether the processor should look at the full 64 bits or just the low 32 bits. When the kernel starts a process, it looks at the code in the process, and basically a flag in the header says whether this piece of PowerPC code is using 32-bit conventions or 64-bit conventions. If it's using 32-bit conventions, the kernel sets a little flag in the PowerPC that says: when you set condition codes for this thing, because it's using 32-bit conventions, it's only looking at the low 32 bits, so the condition codes should only look at the low 32 bits as well.
One other thing the kernel does differently when launching a 32-bit versus a 64-bit process is that for a 32-bit process it also tells the MMU, the memory management unit, to ignore the high 32 bits of addresses, because they're potentially garbage, and to only look at the low 32 bits. That gives a 32-bit process a four-gig address space, and a 64-bit process the full 16 exabytes.
So what are some of the trade-offs of compiling for 64 bits? Because now you have a choice of whether to compile your same PowerPC instructions using 32-bit conventions or 64-bit conventions. Since we all know what the 32-bit conventions mean today, let's talk about the trade-offs of changing to 64-bit conventions. Well, there's the obvious advantage that you now get a huge address space, and if you have lots of data and you need that address space, then this is the advantage you want to go for. Another advantage is if you are using 64-bit computations, 64-bit integer computations, but you don't need the full address space. One of the limitations of using the 32-bit calling conventions is that none of the functions know about the high 32 bits of registers. What that means is that when you compile today and you tell our GCC compiler that you're building for the G5, that you want to optimize for the G5, then in any leaf function the compiler will basically make use of the full 64 bits within that function. The compiler cannot do that when it crosses function boundaries; that is, if the function calls another function, the compiler has to worry that the other function may trash the upper bits of registers, so therefore it doesn't use 64-bit conventions within that function. Once you switch to 64-bit conventions, you know that it's safe to use all the bits of the registers. So there is a small category of applications that don't need a lot of data, don't need a large address space, but because they use a lot of 64-bit arithmetic can take advantage of compiling for 64-bit mode.

So what's the disadvantage of compiling for 64-bit mode? As I said before, the instructions are exactly the same on PowerPC no matter which way you compile, but the difference is that pointers are bigger: in 64-bit they are 64 bits. What that means is that every data structure you have that contains a pointer is now bigger, and overall that means the data in your application is bigger. When the data in the application is bigger, you need to take up more address space, which means you need more pages of RAM to run your process. On one hand you can put more RAM in the machine, and that solves that problem, but there are also the L1 and L2 caches in the processor, and they try to cache the most recently used data from the entire address space. If the data set for your app is larger because your pointers are larger, the chances of your data being in the cache are slightly less, so there will be a small decrease in performance for compiling for 64-bit. Therefore it only makes sense to compile for 64 bits if you actually find that you're running into the limits of the four-gig address space. Another disadvantage is that, since this is all conventions, any libraries you depend on also have to be available with the 64-bit conventions, otherwise you can't call them. So you have to wait until everything below you has been converted to 64 bits before you can convert.
Now, that explains how the PowerPC works and how there's no 64-bit mode. To explain how we're actually going to roll out 64-bit, let me first explain what we're not going to do. There's not going to be a 64-bit Mac OS X and a 32-bit Mac OS X; there's only going to be Mac OS X. When you happen to run on a G5, the kernel will recognize it and allow programs that are marked as using 64-bit conventions to be launched, and they'll be set up with 64-bit address spaces. For Tiger, all Apple is committing to at this point is that libSystem will be available to 64-bit programs. libSystem is the standard C library and most of the POSIX functionality, which means command-line applications and applications with no UI will be able to convert to 64 bits if they so choose. Over time we will be rolling out more libraries, and one of the things we want to hear back from you is which libraries we should do first: which are most important to you, you being the people first converting to 64-bit.

Now I want to go into a little more detail. You've got the big picture of how 64-bit processing works on PowerPC; how are we actually going to roll this out? The most interesting thing is the last point here: we're going to have a single kernel, and that single kernel is going to be a 32-bit kernel. We can do this because of the PowerPC architecture; it's just conventions, so a 32-bit kernel can launch a 64-bit process. The single kernel has a number of advantages. First of all, it means we can produce one disc that can boot on any machine. Second, it means all the existing kernel extensions and device drivers, which are all written to 32-bit conventions, will still run within it.

Some of you may have heard the term LP64 for data models. That is the convention we have chosen to adopt for the 64-bit calling conventions on Mac OS X. Now I want to go through a little bit of the history of where these acronyms came from. Let's go back in time to the early 90s, with the pioneers of 64-bit computing: Cray and Alpha.
Once they got the hardware done, they started looking at the C language and asked, how big should an int and a long be? Well, they were gung-ho for 64-bit, so they said, we're going to make an int and a long both 64 bits, as well as pointers. As that rolled out to more and more programmers and more and more programs, people said, dang, this is hard to use, because I've got this file format with 32-bit values in it, or I've got this network packet with 32-bit values in it, and it's really hard to get at, because there's no 32-bit type in the C language anymore. So when the next generation of 64-bit processors came out, for Solaris and SGI and eventually Linux and so on, they all looked back at the early pioneers and said, we're not going to do that; we're going to use LP64. In LP64, longs and pointers are 64 bits, but integers remain 32-bit. That way the basic types in C, char, short, int, and long, give you 8-bit, 16-bit, 32-bit, and 64-bit types. That was a much easier programming model, and basically, if you go searching for any 64-bit-clean software out there today, you'll see that all of it is written to the LP64 model. Now, the next thing that happened is that once all these acronyms like LP64 came out, people said, well, what do we call the old model? So the old model got renamed ILP32, in which integers, longs, and pointers are all 32-bit. Well, then some of the 32-bit processors got jealous of the 64-bit computation available and said, we want a 64-bit type in the C language too. So a bunch of compiler vendors added extensions, and eventually it got standardized and ratified in C99 as long long. Then there was the problem of how big long long is in LP64 and ILP32; well, just to make it easy, they said long long is 64 bits across all compilers.
You also may have heard recently that Windows has announced 64-bit Windows. Now, they decided to take a different path: after looking through their source code, they decided that they had too many cases where they had hard-coded long to be 32 bits. So rather than adopting the industry-standard LP64, they came up with a new model they call P64, in which only pointers change in size to be 64 bits and all the other integer types remain the same. So those of you who are investigating 64 bits and have code you need to compile both for Mac OS X, or the Unix world, and the Windows world have a bit of a conundrum. It turns out it's not that bad: if all you do is avoid using the raw long type, you're fine, because int is the same between LP64 and P64, and long longs are the same. And as you'll learn later in the talk, using the raw types is where the problems begin anyway.
Now, the ABI. We had a chance to reexamine the PowerPC ABI for 64-bit, because we needed to come up with a new 64-bit convention, and there was no reason we had to be compatible in any way with the old 32-bit conventions. So we did as much analysis as we could on how the old calling conventions worked, how parameters are passed in registers and so forth, because we wanted to come up with the optimal convention for 64-bit. What we decided was that the original ABI was pretty darn good and hard to improve on, but we found a few edge cases where we could improve, so we did that for 64-bit. The first is that when you pass structs that contain floats, the old convention was not very efficient; we've improved that for 64-bit. The second is that we made returns of structs a little bit more efficient. And the last thing is that we decided to dedicate r13, which was previously a non-volatile register, to be owned by the OS, and in fact owned by the threading package, so r13 will be unique per thread. This allows faster pthread access, and in particular faster thread-local storage, on 64-bit.

Next, we needed to update the file format we use. Those of you who have actually looked at the details of the Mach-O file format will have seen that it uses 32-bit offsets everywhere in the file. We decided that in the long term, people writing 64-bit code do so because they're going to have large amounts of data and large amounts of code, and the 32-bit offsets, a four-gig limitation on the file size, might be an issue. So we're going to be enhancing the Mach-O file format to allow files larger than four gigs, but the key point of all this is that it's going to be completely transparent to you.

The next thing is, some of you may be thinking, well, if there's no mode on the PowerPC, I could be clever and write some function that happens to work both for 64-bit callers and 32-bit callers.
And yes, you can come up with some trivial examples where that works, but as soon as you do anything interesting, it breaks down. So we're recommending, and we're adding, no support for mixing 32-bit and 64-bit code in the same process. The way we do that is that our tools will mark all code, even though it's just PowerPC instructions, with whether it's using 32-bit or 64-bit calling conventions. How we're going to do this is something we call fat libraries, or fat binaries. Some of you may have seen different incarnations of this concept of fatness before. I want to contrast this with some other OSes when they introduced 64-bit. Say, for instance, you put all your libraries in /usr/lib, as some OSes do. When 64-bit came out, they had two libraries with the same names, and they couldn't put them in the same directory, so they came up with a new directory, and basically they kept all the 64-bit and all the 32-bit binaries in separate directories, and that's how they kept track of them. We're doing something different: we're leveraging the fat technology, and this is how it works. On the right-hand side there you see a normal Mach-O file. It starts with a small header, which marks in this case that the contents are 32-bit PowerPC, and then it has the text and data needed for that file. We also allow you to create fat files, and our tools will do this automatically, or you can use the lipo tool to pack these things together. All a fat file is, is a table of contents at the beginning that says, here are all the sub-files and their architectures in this file, and then they're appended one after another. This allows you to ship one file that has both a 32-bit and a 64-bit implementation in it, either a library or a main executable. If the user takes that and runs it on a 32-bit system, the 32-bit version of that sub-file will run, and it'll be limited to the four-gig address space. If they take that and run it on a Tiger-or-later machine, on a G5-or-greater machine, the OS will automatically pick the 64-bit version of the file and run that.
I just want to have a little fun here and talk about what a 64-bit address space really means. It's pretty easy to say, but how big is it? So imagine, if you will, that you took the tip of your pen and made a little dot; let's call that one bit. Right next to it you pack around seven other dots to make a byte, a few millimeters on a side. Now, if you extended that and tried to draw four billion of these little bytes, how big a surface area would that be? Well, it turns out to be roughly the surface of the roadway and sidewalks of the Golden Gate Bridge, including the approaches. So we've spent our professional careers in the 32-bit world basically flying around an area the size of the Golden Gate Bridge. So what does 64 bits mean at that same scale? Well, 64 bits is basically the surface of the earth. It's not quite: it's actually twice the surface area of all the landmasses on planet earth. So basically, you can get lost in 64 bits. It's big.

Now, how are we going to divvy up the 64-bit address space? The first thing to remember is that the kernel is going to set up, depending on which calling conventions you use, whether you have a 64-bit or a 32-bit address space. All the existing binaries will load in the 32-bit address space and still have the same restrictions they've always had; a 64-bit binary can load anywhere in that 64-bit address space and have access to the entire 64 bits. Now, one thing we're contemplating doing: some of you may know that we currently have a thing called the zero page, where the first 4K of a 32-bit process is mapped to be neither readable nor writable, and that catches a lot of null-pointer accesses and simple offsets off of null pointers. We're considering doing the same thing for the 64-bit address space, but instead actually mapping illegally the first four gigs and the last four gigs of the entire address space. What that's going to catch is all those subtle programming errors where you've truncated the top bits of your pointer. If you start off in low memory and malloc your way up, you may never actually run into that problem until you're out in the field somewhere; by permanently taking out the two end chunks, you can immediately catch the software problem.
So let's get down to how you actually compile for 64 bits. In Xcode, in the inspector, there's now a new attribute here; I hope this is big enough to read: it's Architectures. That's new in the preview of Xcode you have. In the Architectures field you can type ppc64; that is our token to denote that you're using the 64-bit calling conventions for PowerPC. If you're using GCC, you can say -arch ppc64 on the command line.

Now I'll talk a bit about what is actually in the 64-bit support in the preview you received this week, versus what we're going to have by the time Tiger is final. So, first of all, the kernel currently does not support the full 64-bit address space; it only supports basically two million times the four-gig address space. You'll still have a hard time filling that up. The second is that the only compiler we have in the preview is the C compiler; by the time we ship the final, we'll also have the C++ compiler available. Some of you may ask, what about Objective-C? That's actually easy to do in the compiler. The problem is, as I said earlier, we're only committing to libSystem being available, and our Objective-C runtime relies heavily on the Foundation framework. We haven't committed yet to when the Foundation framework will be available using 64-bit conventions, so we're not going to turn on that compiler until that's done. Next thing is, GDB can actually debug 64-bit programs already with this preview. The assembler and the static linker can create them, but what they create are static versions of executables; static executables cannot load any libraries, so just one image is loaded, and that's it. By Tiger final we're no longer going to support these static executables; we're only going to support dynamic executables. For this release, the file format we're using is the standard Mach-O file format, which means it's limited to 32 bits, and which means that your 64-bit processes will load in the low four gigs; we haven't done the trick yet of mapping that out. By the time Tiger is final, we'll have an updated file format which will let you load your executable anywhere in the 64-bit address space. Lastly, because of the difference between static and dynamic, Xcode is only going to support building standalone static ppc64 executables, not fat; in the final you'll be able to build fat binaries. And of course, the whole point of this preview is for you to evaluate what we've created, start playing with it, and give us feedback. Once Tiger is final, the programs you make on the final Tiger you can ship, and Apple will support.

Let me summarize here. The G5 is a unique 64-bit processor: there is no mode, all the instructions are exactly the same, and the only difference between 32-bit and 64-bit executables is the conventions they use. Again, because once you switch to 64-bit your data structures are bigger, you have a slight performance decrease, so for that reason the only reason you should convert to using the 64-bit calling conventions is if you actually need more than four gigs of address space. We're using the architecture part of our fat build to enable a mixture of both 32-bit and 64-bit calling conventions of PowerPC code, and we're calling the new architecture ppc64. And lastly, we're only committing to shipping libSystem as available for 64-bit programs, so you need to work around that; and again, the programs you build on this preview will not run on Tiger final; it's purely for evaluation. Next I'd like to bring up Jeff, who's going to give you a short demo of 64-bit.
Thanks, Nick. What I wanted to show is how to build a 64-bit service application using Xcode. For those of you who were at the tools overview session on Monday afternoon, you saw the Celestia app. What was going on behind the scenes there was that we had this 32-bit GUI application with a 64-bit service application in the background, and that application was actually mapping six and a half gigabytes of terrain data, which I think we left out of that little statistic. So it really was making use of the 64-bit address space. I'm going to show that actual service application, how you would build it in Xcode, then play some games and step through the debugger. So let me launch Xcode. I'm going to do some cut and paste to speed this up a little bit, but you'd be doing the same thing by typing. So I'm going to create a new project, just a standard C tool, and call it TerraMapper, a suitably similar name. Give it a second; I think my disk spun down while the machine was resting. So, to do this quicker, instead of actually typing in the text, I'm going to add the source file to the project and copy it in, and delete that little template main that came with the project.

OK, so the first thing you need to make sure of when you're building a 64-bit app is to open up the project inspector, and as Nick mentioned, there's an architectures setting; by default we build for 32-bit conventions, so I'm just going to change this to ppc64 and then close that. And so now what I want to do is open up the source file, and I'm going to set a breakpoint at the start. Actually, before I do that, I'm going to do something a little tricky to speed this demo up: I'm going to pre-allocate three gigabytes of my address space, and count the right number of zeros here. This is actually going to speed things up some, so we don't actually have to read four gigabytes of data off the disk for this demo. Let me set a breakpoint, then we'll build it, and actually let's go into the debugger. So, as Nick mentioned, GDB and the Xcode UI are all 64-bit aware right now. This actually is not going to be very interesting, so I'm going to step a couple of steps.

OK, so now let's do something a little more clever. I've just pre-allocated three gigabytes of address space. Where we're going to start mapping the data, I want to set a breakpoint there, and now I actually have to drop to the GDB console in Xcode, because I want to set a condition on that breakpoint, which is actually breakpoint two. So what I'm going to do is set a condition so that the breakpoint is only going to stop when my pointer value gets above 4 gigabytes. Again, I have to make sure I type enough zeros, because it's more than I'm used to typing. So, and now I can continue running. No, the demo gods have not blessed me; let me try that one more time. Sorry. Oh, give me one second; I'll quit and relaunch it in case there was some stale data there. Remember, this is preview software. Ah, I remember what's wrong: I forgot to give it some command-line arguments. I actually need to go here and tell it where the data is; my fault, not the software's fault. So I'm going to actually cut and paste a whole bunch of stuff here. OK. All right. OK, the condition: I'm just being paranoid with the ULL here, to make sure GDB knows that I'm typing a 64-bit constant. So OK, so now we're actually reading, reading, reading; it's loading about a gigabyte, since I pre-allocated three gigabytes. And let's see, we actually have a pretty big value for p right now; that's way up there in the address space. And actually you can use all the features of Xcode: you can actually dereference it, you can bring up the memory viewer, which is here, oh, I can't type. So we actually can look at that memory; not very interesting, because this is actually mapped but hasn't been faulted in yet, but the tools are all ready for 64 bits, and as we move more frameworks along the line and as we make things dynamic, I think you'll be able to explore and give us feedback. Right now, that's all I actually had to show; not very interesting, since the GUI is not 64 bits, and you already saw that. But with that I'd like to introduce Stan Shebs, who's going to talk a little bit about some pitfalls with 64 bits.
[Applause]
hello everybody so we're going to go
into twist out and go in a little more
detail into the pitfalls and what
actually happens when you try and do
64-bit programming Jeff very narrowly
skated several errors and where he's
actually quite lucky to God demo God's
didn't ultimately smile on him you know
because he actually got 64 bit numbers
back when he put 64 bit numbers in so
the kinds of things that can happen is
that the the source code of you know
will need changes because integers
remain 32 bits and there's a number of
practices the long-standing practices
that no longer work for instance
integers cannot hold pointers that seems
fairly obvious but in fact a lot of code
will casually assign pointers to
integers expecting to get the pointer back
later on somehow that kind of practice
won't work even something as innocuous
as using a %d in a printf will
not actually show you the entire number
and that can be very confusing if you
don't use gdb and you try using printf
for debugging the other thing we have
to note is that casting doesn't actually
solve the problem there's ways to get
tricked by sign extensions there's ways
to get tricked by function calls so
we'll start out by recalling Diogenes
Diogenes if you remember was a
philosopher of ancient Greece and
one of his shticks was to wander around
with a lantern looking for honest people
and never finding any so since we're in
the modern age we have a flashlight LED
flashlight and we're going to be looking
for honest programmers so so our first
question how many programmers have
assigned a pointer to an integer wow we
have a lot of honest programmers that's
very encouraging but I notice not
everybody raised their hand so perhaps we
have some Java people in here
so the key thing to know about
assigning pointers to integers is it
will lose data it will just simply drop
off the top half of the pointer and
it'll just be gone and this happens at the
instruction set level there's
no way to recover from it now
you can assign to a long or a long long
both of those are perfectly okay so with
the code example we have here we do the
malloc we'll assume the malloc came back
with a big pointer value we assign it
and GCC is helpful and it does warn you
that the assignment is making an
integer from a pointer but it doesn't
tell you that you're losing data it just
warns you that you're doing this
without a cast we'll come back to casts in
a moment so if you look at the value of
the integer variable at that point it's just
the lower half of the pointer now if you
do the same thing to a long variable you
get the entire value you can assign that
back to a pointer later on that works
now printf how many people yep okay
how many people use the correct kind of
printf directive for their longs and
long longs so most people will just
habitually tend to use %d and %d has
the fatal flaw that it will only show you
an integer sized thing printf is not a magic
function you hand it all the arguments
and it decides what to pull off the
stack that you pass to it based on what
the printf directives tell it if the
directive says pull four bytes it pulls
four bytes and leaves the next four
bytes for the next thing that it's asked
to print so you can get some interesting
behaviors and it'd be very confusing
because if you use printf as your window
into what's happening inside the program
and printf is not telling you what's
really happening okay you can have a
situation where the program is more or
less working correctly but printf says
it has to be failing okay so you've got to
watch out for that and so the directives
to use they've always been
there in C
they've been around you can use a
%ld for a decimal printout or
you can use %lx for the
hexadecimal print for a long and for a
long long it's always
been available to do %lld and
%llx and then we also have %p for
pointers the standard actually does not
define what %p does in our case it puts
a 0x on the front and prints it out
in hexadecimal but that's actually not a
cross-platform expectation %p may do
something different on a Linux or a
solaris or what have you now how many
people use casting yes that's good so
casting unfortunately is not a magical
process that somehow makes the
conversion work all it does is tell the
compiler that you actually intended to
assign one to the other so as our
example here shows we can assign the
pointer variable to an integer variable
and voila it whacks the top off again
except this time the compiler hasn't
actually said anything it says hey they
put an int cast in there this
programmer must know what they're doing so
it doesn't say anything and again you
can do the same thing with a long cast
the long cast will do the right thing so
so basically the bottom line is that all
those casts you thought were going
to fix the 64-bit problem actually
aren't doing you a bit of good now sign
extension is a little bit complicated
here the problem is that an unsigned
64-bit number may actually look like a
signed 32-bit number and it's a little
bit messy to set it up but I did check
this out in code so if you run back to
the lab I'm reasonably certain that
if you type all this in you'll get more
or less the same result however we're
destroying the slides after this talk so
you won't actually you'll have to work
from memory
so let's say to make an example let's
take and assign 9 bajillion to an
unsigned integer variable so this is
just above 2 gigabytes so if this was a
signed value it would actually come out
as a negative number but we're being clean
in our code we've made it an unsigned
integer we can take that we can assign
it to an unsigned long and the right
thing happens but then if we
take that and we assign it to a signed
integer we end up with a value that's
less than zero okay and the other
thing that's interesting about that is if
you also take that less than zero value
and then assign it to a signed long it's
still less than zero right so now we've
taken our nice large positive number and
turned it into a large negative number now
the juicy part about this is then you go
and do another assignment you assign it
back to an unsigned long okay it now
comes out as an extremely large unsigned
number and so if you say we're expecting
this to be say a number of iterations
you know two billion iterations you
could say that's sort of plausible however
large that number is that's an awful lot
of iterations your machine will spend
quite a bit of time getting to the end
of that so that's kind of a mysterious
looking number but what it really is
it's the nine bajillion you had
originally but it has FFFs glued on the
front essentially it's just been
sign extended in two's complement so
that's what that value really comes from
but after a set of transformations like
this it's not obvious so remember
where the number really comes
from and what you want to do is when
you're sitting in the debugger and the
numbers aren't making any sense look at
them in hexadecimal and in a lot of cases
you'll see that in fact there's a sign
extension that's gone on and the number's
turned into a large unsigned value
now another way to have bad stuff
happen is through function calls now
most of you have probably done
prototypes is that true has everybody
done prototypes for all their functions
okay there we go hey
everybody does prototypes for all their
functions they add prototypes when the
system doesn't provide the prototypes
and fewer hands go up there so
this time around for 64-bit the
prototypes really really matter because
the rules of C are that if you don't
have a prototype it'll fall back to its
defaults which are to pass doubles for
floats which is usually okay but to pass
ints for the integer arguments in
particular int not long and the
compiler may or may not say something
about this so we have here an example
of a function called fun which is
declared as a function but
doesn't have a proper prototype we can
take a long value and assign it to a
long variable and that works and we can
pass that into the function but what
happens is that the function calling
process truncates the integer again in a
fashion that should be familiar by now and
if you look at the value of the variable
inside the function it's chopped off
again now if you pass the whole constant
it'll do the same thing it'll still chop
it off in that case the compiler will
give you a warning that it is chopping it
off which is a small consolation ok so
we've got all these ways of losing by
getting the values cut up in ways that
you don't want so how do you do it right
I mean what do you do you often have a
real situation for instance you have to
send something out over the wire you
have to send something to a 32-bit
process and you need to preserve both
halves of your big pointer there's a
couple different ways to do it and I
won't try and recommend a single way
because it really depends on your
situation your application one way that
works reasonably well is to use a union
in this case we have a union that exists
only for the purpose of
splitting up a big value and we have two
fields in the union a long and an array of
two ints we can take a long variable
assign it to the union and then if you
look at the two halves of the union they
come out as the two halves of the long
value now
the downside the hazard of this is it's
endian dependent okay so if you're just
transmitting from one processor
to the same kind of processor this will
work or if you're doing it within a single
program but if you're splitting in half
and sending the two halves over the wire to
say an x86 machine chances are the two
halves will go out wrong and so you
need to be aware of which half is
which because again in traditional
fashion you send it out over the wire and it
comes out wrong you send a two over the
wire and the x86 receives it
as 2 times 4 gigabytes or 8
gigabytes and if for instance
that's in the header of a packet and
it's expecting 8 gigabytes of data when
you only sent two the x86 machine is
going to be waiting for a very long time
on the plus side you can then say that's
a Windows bug and everybody will believe
you so the other way to do it that's
more reliable if you have to pay
attention to endianness is to write out
the cutting up operation manually it's
written out here and I've tested this
one too so again you know do it from
memory on a lab machine and see if it
really works and come tell me if I got
it wrong the game here is we want to
mask off the low and
high halves of the long number
now to mask off the low half you can
AND it with lots of Fs the trick here is
you need to AND it with lots of Fs with
lots of leading zeros okay if I just
said 0x and then 8 Fs that would get
sign extended to 0x and 16 Fs and the
ampersand then would yield a big
number and then it would be cut off to
get the wrong half sorry no we'd actually
get the right half but we'd get it for
the wrong reasons
so I recommend doing it this way I
don't have a slide actually for this but
you can actually get into trouble if you
don't add the Ls onto the ends
of your constants so I recommend you do
that everywhere now that you're
working with 64-bit programming to get
the high half of the number since we're
working in a signed regime the correct
thing to do is a shift you can
shift down by 32 and in this case we
don't have to put an L on the 32 because
it's just a shift and that will make the
sign extension come out correctly if you have a
negative number that you're cutting up
and if we printf it we see that
it's cut up in the two halves correctly
so what kind of assistance do we
actually give you to write compatible
code ok so we have all these problems
all these different ways to lose you
know what do you do so at the
language level we give you some
standard types and macros some of these
have been around for a long time we have
size_t we have intptr_t which is an
integer type that is large enough to
hold a pointer and that'll have the
correct size whether you're doing either
32 or 64-bit programming we have
uintptr_t a corresponding type for unsigned
there's also int32_t and int64_t
types that you can use on the UNIX
side of the world there's additional
types that are specific to UNIX not
standard C types we have things like
caddr_t which is the size of a core
address which is just a euphemism for a
pointer into main memory so that's going
to be a 64-bit type in the 64-bit world
pid_t for process IDs off_t is for file
offsets and so forth if you
look in /usr/include you'll see
there's quite a few of these headers in
fact if you compare the Panther headers
to the Tiger headers you'll see a whole
bunch of changes in those places where
Panther only had 32-bit headers so the
obvious implication is beware of
compiling stuff on Panther because
even if Panther didn't give
you any complaints the headers are
different on Tiger and so you may see
things on Tiger that you didn't get out of
Panther another thing that's actually
very common is that programs that
have been ported to 32-bit say
they were on UNIX systems already
will have their own local definitions
for types and there's you know any
number of different conventions that
people use they'll have macros all
uppercase they'll have funny names
you'll see all kinds of different things
out there one source I'll recommend
is the GNU software itself GNU
has been ported everywhere in the
universe it's been ported as native tools
cross compiler tools all that kind of
stuff so every combination of 64 and 32
bits you can think of has actually had to
have been handled and there are a set of
definitions in there that have been
proven over time to work well for this
kind of thing one of the special
things that has come up and several people
have had to solve this is what to do
about these printf directives okay
it turns out there are no standard
macros for these although it seems like a
really good idea so you'll see some
programs that'll actually define macros
for the printf directives so I have an
example here if you want %lld to
do the right thing and you don't want to
say pass an integer to %lld
because then that will grab the four
bytes of your integer and the four bytes
of the next data item passed
to printf we want to encapsulate
this somehow and so you can do something
with what amounts to using string
concatenation which is a capability of
C you have part of your string you
have the macro with the directive in it
and then you have the rest of the string
and that way you can get something so
that the compiler won't give you
warnings about the printf directives not
matching up with the data types
so further API changes we
give you an LP64 macro double
underscores on the front and back
because it's something predefined by
the compiler and the value of __LP64__
is one for 64-bit compilation and zero
for 32-bit compilation we also give you
a __ppc64__ that's defined when you're
compiling for 64-bit PPC and it's not
defined for 32-bit PPC and the
existing macro __ppc__
is not defined when you're doing 64-bit
PowerPC compilation we had a little
debate on that and we looked at the uses
in practice and they were pretty much
either-or either you're doing
32-bit or you're doing 64-bit and if you
turned on __ppc__ and __ppc64__ at the same time
it would confuse a lot of headers so we
decided to make them mutually exclusive in
practice you should almost never use
__ppc64__ directly because that's going to
wire in an architecture dependency
unless you're actually literally writing
PowerPC code slipped into the middle of
C code or something like that you
probably want to use __LP64__ instead or
else if at all possible write your code to
be 32/64 independent one of the
things you'll see is again if you look
at the tiger headers you'll see we've
made a bunch of API changes where we had
to choose whether a value or an argument
was a long or an integer and so I just
thumbed through and found this little
bit out of a header file whose name I
forgot to write down so I don't remember
which one it is but the functions getattrlist
and such in the old Panther
headers the argument for them said
unsigned long and that would have an
unexpected size consequence in 64-bit so now
it's unsigned int however the
programs that have their own
declarations of
the same system function which happens
would be inconsistent if they
continued to say unsigned long and you
had unsigned int in the system header so
those are mostly conditionalized on LP
64 and this way we get backwards
compatibility with Panther code that may refer
to the same prototype we have one little
API change for assembly language which
is a new directive to allocate an
8-byte object and it's just called .quad
it's not the greatest name in the world
but it's consistent with other
assemblers that do this so that's why we
chose it with the .quad here I'm just
feeding it a large constant but it also
works to feed it a relocation that's not
very interesting right now but when the
full Mach-O the 64-bit Mach-O file format
is available you may end up wanting to
use this in assembly code so I've
alluded to warnings a number of times
now this time you really have to pay
attention to the warnings if you're
getting a warning and it's telling you
about loss of precision or casting
integers to pointers you know this time
around it's not just for show you really
are going to lose data and bad things
will happen one of the things you can do
is to add additional compiler options
just to be on the safe side in Xcode you
can ask for other warning flags
which is the equivalent of -Wall
for GCC users and that will turn on lots
of additional warnings and people are
often annoyed they say well -Wall
turns on too many warnings but actually
in doing 64-bit programming -Wall
is actually not all it doesn't even list
all the bad things that can happen to
your code so we have an additional
option -Wconversion which gives
additional warnings about conversions
that might possibly lose data precision
and in 64-bit if it says you might lose
data precision you probably are losing
data
so we recommend using -Wconversion you
can also ask for -Wrequire-prototypes
and there's not an Xcode flag
for it that I could see so send in a
radar for that and what it does is it
actually insists that all your functions
have prototypes so if you have a
forgotten piece of code that was always
quietly taking integers and assuming
everything was okay -Wrequire-prototypes
will flag them for you and
say hey these need prototypes let me take a
moment to talk about what we do in tools
and utilities we have an extended set of
APIs and a few new APIs to handle
tools that want to manipulate 64-bit
processes but don't necessarily want to
be 64-bit themselves an obvious example
is gdb when you run gdb in Xcode it's
actually a 32-bit program still but it
is manipulating a 64-bit process which
is the program you're debugging so the
way we do that is we have things like a
type vm_address_t in the system
headers and it's set to a
64-bit type when ppc64 is enabled we
also have extended APIs such as vm_read
which will take a 64-bit address and
will read data out of the 64-bit
process so the OS guys have been slaving
away hard on this over the past few
weeks to get all that to work now
device drivers are running in a 32-bit
environment we have a 32-bit kernel and
this is partly for efficiency and partly
so as to have a single kernel that
runs on all types of systems and we can
do this actually because the
representation of memory as shown on
the previous slide the representation in
memory is as data structures so 32-bit
compiled code can
actually manipulate the memory going
into a 64-bit task as for specific names of
functions we have a prepare method that
establishes I/O mappings and we have
specific routines both for use with DMA
and parallel I/O situations actually no
it's probably not parallel I/O that
just shows my age but I'll let
it go with that programmed I/O
thank you yes I like that
you know parallel that's like
8-bit microcomputers which is probably
not what they meant in I/O Kit land ok
so yeah so programmed I/O we have read
byte and write byte methods so at least
at least for now and there is a
possibility we may have to do something
with 64-bit device drivers and you
have OS guys over on that side of the room
that you can buttonhole in the Q&A
period to ask about that now if you're
actually doing i/o from a 64-bit process the
same POSIX APIs work as always ok
they've been essentially
compiled to take 64-bit addresses and
all that stuff has been done and again
if people have questions about
IOKitLib and IOUserClient plugins
which are not available let's bring it
up in the Q&A session so I'd like to pop
back up a little bit and talk about the
design issues that you might want to
think about one of the classic uses
of 64-bit applications that has become
prevalent in recent years is to use
them for servers and servers are
actually a very interesting use
because they follow a classic model
now what we've seen with Internet type
servers as the Internet's become popular
is as they need to handle large numbers
of clients in some cases maybe thousands
of clients simultaneously and it's
very convenient to actually be able
to have a very
large address space because then what you
can do is have say one thread per client
and have access to a single large shared
data space so for instance if you're
serving out images you load every last
one of your images into memory so that
it's readily available you can serve it
out to clients as they ask for them and
this can actually be a very effective
approach for things like databases where
you can lock on on individual data
elements and you can have a single
server managing all of those rather than
trying to do something with multiple
server processes managing shared files
so that's so the internet server is a
really interesting area in which
to do 64-bit programming we can
generalize that a little bit and talk
about compute engines in general and the
terror vision demos you saw yesterday as
fee Peters did there's actually a
classic example of that we have a 32 bit
GUI front end and we use inter
process communication in one form or
another going back to the compute engine
which is handling the very large address
space what this does is essentially it
can shift the burden from your
application code to the system okay a
lot of programs actually already have
mechanisms to handle large amounts of
data and what they'll do is they'll
manually page in data as it's needed and
page it out and page in different data
and with the 64-bit capability you
actually wouldn't have to do that
anymore you could just allocate large
amounts of memory and suck it in and use
it and never have to worry about running
out and if you have a piece of data
that's not being used at the moment you
can essentially rely on the virtual
memory system to page it out for you so
in a sense what it's doing is it's
replacing your code with system code
and that can be a great advantage
because now it's not you having to write
all this stuff and play computer
scientist and read about vm memory
algorithms you can let the friendly
experts at Apple take care of that for
you now it may be that you
know something about your application's
memory usage that will actually be more
efficient than what the generic OS can do and
you basically have to reevaluate that
for your own application do you have a
usage pattern maybe you have something
that's always first in first out and
that doesn't necessarily mesh with a last in
first out vm system and that's
just going to depend on your
application if you already have a memory
management algorithm that you know is
more efficient than anybody else in the
world can do then you probably want to
stick with it and maybe even stick with
using a 32-bit program and continue to
exploit your algorithm but it's the kind
of thing you want to actually stop
for a moment take a look at and say
you know is this the best algorithm or
can I leverage apple's vm system
and get something better ok so what
64-bit does for us is that it really
opens up a lot of new vistas and I've
talked about a couple very specific
things but you know they're just there
are things we already know about there's
actually all kinds of opportunities that
we really don't know about I was
actually talking with one of our guys
last night and thinking about that
surface of the earth analogy and in fact
if you think about it every point on the
surface of the earth is identifiable
with a 64-bit number so you could write
an application for instance in which
instead of maintaining say a linked list
of locations stored as 64-bit values you
record data about a point on the earth
at that memory address ok so essentially
you're storing data on the earth you use
the entire 64-bit address space the
point under my feet you know is 4589
bajillion and you record you know the
color of the carpet at that location and
you just record it directly at that
memory location and as you read and
write that data to the vm system will
handle the paging of that blob so it's
kind of exotic right you know
that's kind of silly you know who
would do that but the thing is that's
actually something you can do that
you couldn't do previously
so you know maybe it's a good idea maybe
it's not so there's a real bottom line
in all this is that that these kinds of
applications are actually only limited
by your imagination and in the final Tiger
we're going to give you the full 64-bit
address space and most other systems
don't give you that much they give you
somewhat less than that we gave you
the whole thing and we're actually very
excited to see what you'll think of to
do with it in the future so that's my
part and I'd like to bring up Matt
Formica to do the wrap up and
do the Q&A session and thank you very
much so you've seen a lot of information
about 64 bit today there is further
documentation information available for
you the main thing right now is the
64-bit transition guide that's been
written it provides kind of the state of
the world right now in Tiger with the
preview DVD that you have so that's the
best place to get information about what
we're doing we're now going to bring up
our QA panel we have a bunch of
engineers who are going to come up you
can certainly talk with us and ask
questions this week and beyond this week
feel free to send me an email I'm the
developer tools evangelist here at Apple
my email is M formica at apple com I'd
love to communicate with you via email
about the 64-bit tool set on Mac OS X