Transcript
[ Music ]
>> Good afternoon.
[ Applause ]
Hello, everyone.
And welcome to our session on
introducing ARKit 3.
My name is Andreas.
I'm an engineer on the ARKit
team.
And I couldn't be more excited
to be here today to tell you all
about the third major release of
ARKit.
When we introduced ARKit in
2017, we made iOS the largest AR
platform in the world, bringing
AR to hundreds of millions of
iOS devices.
And this is really significant
for you because that's the wide
audience that you can reach with
the apps and the games you
write.
Our mission has been from the
beginning to make it really
simple for you to write your
first augmented reality app even
if you're a new developer.
But we also want to give you the
tools at hand that you need to
create really advanced and
sophisticated experiences.
So if you look at the App Store
today, we can see that you did
an amazing job.
You built great apps and great
games.
So let's just have a look at a
few of them now.
Bringing your game idea into AR can make it
so much more engaging and physical, like Angry Birds.
You can now play it in AR, actually walk around
those structures, identify the best spot to shoot,
and then take the slingshot right into your own
hand and fire off those angry birds.
ARKit also works really well for
large scale use cases.
iScape is a tool for outdoor
landscaping.
You can place bushes and trees and watch your
next garden remodeling project come to life in AR,
right in your own garden or backyard.
Last year, with ARKit 2, we
introduced USDZ, a new 3D file format for
exchanging content, made for AR.
Wayfair uses this to place
virtual furniture right in your
home.
By leveraging ARKit's advanced scene understanding
features like environment texturing, these objects
blend really nicely with your living room.
And Lego is making use of
ARKit's 3D object detection
capabilities.
It finds your physical Lego set
and enhances it in AR.
And thanks to ARKit's multi-user
support, you can even play
together with your friends.
So these are just a few examples
of what you have created.
ARKit takes care of all the technical details and
does the heavy lifting to make augmented reality
work, so that you can really focus on creating
the great experiences around them.
Let's do a quick recap of the
three main pillars of
functionality that ARKit
provides to you.
First, there is tracking.
Tracking determines where your
device is with regard to the
environment so that virtual
content can be positioned
accurately and updated correctly
on top of the camera image in
real time.
This creates then the illusion
that virtual content is actually
placed in the real world.
ARKit also provides you with different tracking
technologies such as World Tracking, face tracking,
or image tracking.
On top of tracking, we have
scene understanding.
With scene understanding, you
can identify surfaces, images,
and 3D objects in the scene and
attach virtual content right on
top of them.
Scene understanding also learns
about the lighting and even
textures in the environment to
help make your content look
really realistic.
And finally, rendering.
It brings your 3D content to
life.
We've been supporting different
renderers like SceneKit,
SpriteKit, and Metal.
And now, this year also
RealityKit, designed from the
ground up with augmented reality
in mind.
So with this year's release of ARKit, we're making
a huge leap forward.
Not only are your experiences
going to look even better and
feel more natural, you will also
be able to create entirely new
experiences for use cases that
haven't been possible before.
Thanks to many new features
we're bringing with ARKit 3,
like people occlusion, motion
capture, collaborative sessions,
simultaneous use of the front
and back camera, tracking
multiple faces, and many more.
We've got a lot to cover so
let's dive right in and we're
starting with people occlusion.
Let's have a look at this scene
here.
So in order to create a
convincing AR experience, it's
important to position virtual
content accurately and also to
match the real world lighting.
So let's bring in a virtual
espresso machine here and put it
right on the table.
But wait, when people are in the
frame, as in this example, it
can easily break the illusion,
because you would expect the person in the front
to actually cover the espresso machine.
So with ARKit 3 and people
occlusion, you can now solve
that problem.
[ Applause ]
Thank you.
So let's have a look at how this
is done.
Virtual content by default is
rendered on top of the camera
image.
As you can see here, for a pure tabletop
experience, this is
fine, but if any people in the
frame are in front of that
object, the augmentation doesn't
look correct anymore.
So what ARKit 3 now does for you is, thanks to
machine learning, recognize people present in the
frame and then create a separate layer that has
only the pixels containing these people.
We call that segmentation.
Then we can render that layer on
top of everything else.
Let's have a look at the
composite image.
It's looking much better now.
But if you look closely, it's still not entirely
correct.
The person in the front is now occluding the
espresso machine.
But if you zoom in here, you can see that the
person in the back is also rendered on top of the
virtual object, although she is actually standing
behind the table.
So the virtual model in that
case should occlude her, not the
other way around.
Now, that happened because we
did not take the distance of
people from the camera into
account.
So ARKit 3 uses advanced machine learning to do an
additional depth estimation step.
This estimates how far the segmented people are
away from the camera, so we can now correct the
rendering order and make sure to render people in
front only if they are actually closer to the
camera.
And thanks to the power of Apple
Neural Engine, we are able to do
this on every frame in real
time.
So let's have a look at the
composite image now.
You see that people are
occluding the virtual content
just as you would expect
resulting in a convincing AR
experience.
[ Applause ]
That is great.
Thank you.
So people occlusion enables
virtual content to be rendered
behind people.
It also works for multiple people in the scene,
and it even works if people are only partially
visible: in the example before, the woman behind
the table was not visible with her full body, but
it still works.
Now, this is really significant, because it not
only makes your experiences look way more
realistic than before, it also means that you can
now create experiences that haven't been possible
before.
For example, think of a
multiplayer game where you have
people in the frame together
with your virtual content.
People occlusion is integrated right into ARView
and ARSCNView as well.
And thanks to depth estimation,
we can provide you with an
approximation of the distance of
the people detected with regard
to the camera.
We're using Apple Neural Engine
to do that work.
So people occlusion will work on
devices with an A12 processor or
later.
So let's have a look at how to turn this on in
the API.
We have a new property on ARConfiguration called
frameSemantics.
This will give you different kinds of semantic
information about what's in the current frame.
You can also check if a certain
semantic is available on the
specific configuration or device
with an additional method on the
ARConfiguration.
Specifically for people occlusion, there are two
options available that you can use.
One option is person segmentation.
This will provide you just with the segmentation
of people, rendered on top of the camera image.
That's the best choice if you know that people
will always be standing in front and your virtual
content will always be behind those people.
For example, a green screen use case, except that
you don't need a green screen anymore.
And the other option is person
segmentation with depth.
This will provide you with
additional depth estimation of
how far those people are away
from the camera.
That's the best choice if people
might be visible together with
virtual content either behind or
in front of that content.
If you do your own rendering using Metal, or for
advanced use cases, you can also directly access
the pixel buffers with the segmentation and the
estimated depth data on every ARFrame.
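[Code sketch: a minimal Swift illustration of the
frame semantics API described above, assuming an
existing ARSession named session; not taken from
the session's sample code.]

    import ARKit

    func enablePeopleOcclusion(on session: ARSession) {
        // Check device support before requesting the frame semantic.
        guard ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) else {
            print("People occlusion is not supported on this device.")
            return
        }
        let configuration = ARWorldTrackingConfiguration()
        configuration.frameSemantics.insert(.personSegmentationWithDepth)
        session.run(configuration)
    }

    // For custom Metal renderers, the per-frame buffers are exposed on ARFrame.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        let segmentation = frame.segmentationBuffer   // CVPixelBuffer? containing the people mask
        let depth = frame.estimatedDepthData          // CVPixelBuffer? with per-person depth estimates
        // Feed both buffers into your own compositing pass here.
        _ = (segmentation, depth)
    }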
So now, let me show you how
people occlusion works in a live
demo.
[ Applause ]
So here in Xcode, I have a small
sample project using the new
RealityKit API.
And let me quickly walk you
through what it does.
So in our viewDidLoad method,
we're creating an AnchorEntity
looking for horizontal planes
and we're adding this anchor
entity to the scene.
Then, we're retrieving a URL of a model to load,
and load it using ModelEntity's asynchronous model
loading API.
We add the entity as a child of
our anchor, and also install
gestures so that I will be able
to drag the object around on the
plane.
So what this does, thanks to RealityKit, is
automatically set up a World Tracking
configuration, because we know that we need World
Tracking for plane estimation.
And then, as soon as a plane is
detected, the content will
automatically be placed.
Now, this is not using people
occlusion yet.
But I already have a stub to turn this on.
It's called togglePeopleOcclusion, and what I want
to implement is a method that lets me switch
people occlusion on and off when the user taps the
screen.
So let's go ahead and implement
it now.
So the first thing I'm going to
do is actually check if my World
Tracking configuration supports
the person segmentation with
depth frame semantics.
It's recommended that you always do that, because
if the code is run on a device that does not have
the Apple Neural Engine, this frame semantic is
not supported, and we want to gracefully handle
that case.
So if that's the case, we would
just display a message to the
user that people occlusion is
not available on that device.
And let's actually move on and
implement the toggle.
I'm going to do a switch
statement on the frameSemantics
property of our configuration
here.
And if
personSegmentationWithDepth is
part of the frameSemantics, then
we remove it and tell the user
that people occlusion is now
turned off.
And I just need to implement the
other case as well.
If we don't have person segmentation enabled, then
we insert it into the frameSemantics property and
display a message that people occlusion is now
turned on.
Now, I need to rerun the updated
configuration on my session.
So let me retrieve the session
from the ARView and call run
with a configuration that I've
just updated here.
So now, the implementation of my
togglePeopleOcclusion method is
finished, now I need to make
sure that it's actually called
when the user taps the screen.
I already installed a tap
gesture recognizer and here in
my onTap method, I just call
togglePeopleOcclusion.
And that's all I need to do.
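[Code sketch: roughly what the toggle implemented
in the demo could look like; arView and the way
messages are shown are assumptions, not the exact
sample code.]

    func togglePeopleOcclusion() {
        guard let configuration = arView.session.configuration as? ARWorldTrackingConfiguration else { return }
        guard ARWorldTrackingConfiguration.supportsFrameSemantics(.personSegmentationWithDepth) else {
            print("People occlusion is not available on this device.")
            return
        }
        if configuration.frameSemantics.contains(.personSegmentationWithDepth) {
            configuration.frameSemantics.remove(.personSegmentationWithDepth)
            print("People occlusion is now turned off.")
        } else {
            configuration.frameSemantics.insert(.personSegmentationWithDepth)
            print("People occlusion is now turned on.")
        }
        // Rerun the updated configuration on the session.
        arView.session.run(configuration)
    }

    @objc func onTap(_ recognizer: UITapGestureRecognizer) {
        togglePeopleOcclusion()
    }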
And now, let me go ahead and
build that code and run it on my
device here.
And we already see that the
plane was detected and the
content placed.
I can also move it around,
thanks to the gestures that I've
added.
You see that RealityKit has
added a nice grounding shadow.
And now, let's actually have a
look at people occlusion.
Right now, it's still turned
off.
So if I bring in my hand now,
you see that the content is
always rendered on top.
This is the behavior that you
know from ARKit 2.
Now, let me turn it on and bring
in my hand again.
And you'll see that now
[applause] the virtual object is
actually covered.
[ Applause ]
And that's people occlusion in ARKit 3.
[Applause] Thank you.
So, let's talk about another
exciting new feature of ARKit 3
which is motion capture.
With motion capture, you can
track the body of a person which
can then for example be mapped
to a virtual character in real
time.
Now, this could only be done with an external
setup and special equipment before.
Now, with ARKit 3, it just takes a few lines of
code and works right on your iPad or iPhone.
So, motion capture lets you track a human body
both in 2D and in 3D.
And it provides you with a
skeleton representation of that
person.
This, for example, enables you to drive a virtual
character.
And this is made possible by
advanced machine learning
algorithms running on Apple
Neural Engine.
So it's available on devices
with an A12 or later processor.
Let's look at 2D body detection
first.
How to turn this on?
We have a new frame semantics
option called bodyDetection.
This is supported on the World
Tracking configuration and on
image and orientation tracking
configurations as well.
So you'll simply add this to
your frame semantics and call
run on the session.
Now, let's have a look at the
data we will be getting back.
Every ARFrame delivers an object
of type ARBody2D in the
detectedBody property if a
person was detected.
This object contains a 2D
skeleton.
And the skeleton will provide
you with all the joint landmarks
in normalized image space.
They are being returned in a
flat hierarchy in an array
because this is most efficient
for processing.
But you will also be getting a
skeleton definition.
And thanks to the skeleton
definition, you have all the
information how to interpret the
skeleton data.
In particular, it contains
information about the hierarchy
of joints, for example, the fact
that the hand joint is a child
of the elbow joint.
And you will also be provided
with names for joints that can
then be used for easier access.
So let's have a look at what this looks like.
Here's a person we detected in
our frame.
And that's the 2D skeleton
provided by ARKit.
As mentioned before, important
joints are named to make it easy
for you to find out the position
of a particular one you're
interested in, for example, the
head or the right hand.
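[Code sketch: the 2D body detection flow described
above, assuming an ARSession named session and an
ARSessionDelegate; a minimal illustration rather
than sample code.]

    import ARKit

    // Add body detection to a world tracking configuration.
    let configuration = ARWorldTrackingConfiguration()
    configuration.frameSemantics.insert(.bodyDetection)
    session.run(configuration)

    // Read the detected body on every frame.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        guard let body = frame.detectedBody else { return }
        let skeleton = body.skeleton                 // ARSkeleton2D
        let landmarks = skeleton.jointLandmarks      // joints in normalized image space
        let head = skeleton.landmark(for: .head)     // look up a named joint
        _ = (landmarks, head)
    }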
So this was 2D.
Now, let's have a look at 3D
motion capture.
3D motion capture lets you track a human body in
3D space, and it provides you with a
three-dimensional skeleton representation.
It also provides you with scale estimation to let
you determine the size of the person that is being
tracked.
And the 3D skeleton is anchored
in world coordinates.
Let's see how to use this in the API.
We are introducing a new
configuration called
ARBodyTrackingConfiguration.
So this lets you use 3D body
tracking but it also provides
the 2D body detection that we
have seen before.
So that frame semantic is turned on by default in
this configuration.
In addition, this configuration
also tracks the device's
position and orientation and
provides selected World Tracking
features such as plane
estimation or image detection.
So with that, you have even more
possibilities in what you can do
using body tracking in your AR
app.
In order to set it up, you
simply create the body tracking
configuration and run it on your
session.
Note that we also have API to
check if that configuration is
supported on the current device.
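[Code sketch: setting up body tracking as
described, assuming an existing ARSession named
session.]

    import ARKit

    func runBodyTracking(on session: ARSession) {
        // Check support; body tracking requires an A12 or later device.
        guard ARBodyTrackingConfiguration.isSupported else {
            print("Body tracking is not supported on this device.")
            return
        }
        let configuration = ARBodyTrackingConfiguration()
        // The bodyDetection frame semantic is enabled by default here.
        session.run(configuration)
    }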
So, now, when ARKit is running
and detects a person, it will
add a new type of anchor, an
ARBodyAnchor.
This will be provided to you in the session's
didAdd anchors callback, just like any other
anchor types that you know.
And just like any anchor, it
also has a transform providing
you with a position and
orientation of the detected
person in world coordinates.
In addition, you will be getting
the scale factor and a reference
to the 3D skeleton.
Let's have a look at what this looks like.
You see already that it's much
more detailed than the 2D
skeleton.
So the yellow joints are the ones that will be
delivered with the motion capture data.
The white ones are leaf joints that are
additionally available in the skeleton.
These are not actively tracked, so their transform
is static with regard to the tracked parent.
But of course, you can directly access each of
those joints and retrieve their world coordinates.
Again, we also have labels for the most important
ones, and an API to query them by name, so that
you can easily find out about a particular joint
you're interested in.
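[Code sketch: reading the 3D skeleton from an
ARBodyAnchor in the session delegate; an
illustration, not sample code.]

    func session(_ session: ARSession, didAdd anchors: [ARAnchor]) {
        for case let bodyAnchor as ARBodyAnchor in anchors {
            let bodyTransform = bodyAnchor.transform       // person's root joint in world coordinates
            let scale = bodyAnchor.estimatedScaleFactor    // estimated size of the tracked person
            let skeleton = bodyAnchor.skeleton             // ARSkeleton3D
            // Query a named joint; its transform is relative to the skeleton's root.
            if let rightHand = skeleton.modelTransform(for: .rightHand) {
                let rightHandInWorld = bodyTransform * rightHand
                _ = rightHandInWorld
            }
            _ = scale
        }
    }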
Now, I'm sure that you can come
up with a lot of great use cases
for this new API, but I want to
talk about one particular use
case that might be interesting
for many of you, which is
animating 3D characters.
By using ARKit in combination
with RealityKit, you can drive a
model based on the 3D skeleton
pose and it's really simple to
do.
All you need is a rigged mesh.
You can find an example for that
in one of our sample apps that
you can download on the session
home page, but of course, you're
also free to make your own in a
content creation tool of your
choice.
Let's see how easy it is to do
that in code.
It's built right into the RealityKit API.
And the major class we will be
using is the BodyTrackedEntity.
So here is a code example using
RealityKit API.
The first thing you're going to
do is create an AnchorEntity of
type body and add this anchor to
the scene.
Next, you're going to load the
model.
In our case, it's called robot.
We're using the asynchronous
loading API for that.
And in the completion handler,
you will be getting the
BodyTrackedEntity that we now
just need to add as a child to
our bodyAnchor.
And that's already all you need
to do.
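[Code sketch: the character setup just described,
using RealityKit's asynchronous loading; arView
and the "robot" asset name are assumptions based
on the talk.]

    import RealityKit
    import Combine

    let bodyAnchor = AnchorEntity(.body)
    arView.scene.addAnchor(bodyAnchor)

    var cancellable: AnyCancellable?
    cancellable = Entity.loadBodyTrackedAsync(named: "robot").sink(
        receiveCompletion: { completion in
            if case let .failure(error) = completion {
                print("Couldn't load the character: \(error)")
            }
            cancellable = nil
        },
        receiveValue: { (character: BodyTrackedEntity) in
            // Once ARKit adds the body anchor, the skeleton pose drives this entity.
            bodyAnchor.addChild(character)
            cancellable = nil
        })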
So as soon as ARKit now adds the
AR body anchor to the session,
the 3D pose of the skeleton will
automatically be applied to the
virtual model in real time.
And that's how simple it is to
do motion capture with ARKit 3.
[Applause] Thank you.
So, now, let's talk about
simultaneous front and back
camera.
ARKit lets you do World Tracking with the
back-facing camera and face tracking with the
TrueDepth camera system on the front.
Now, one really highly requested
feature from you was to enable
user experiences with the front
and the back camera together.
Now, with ARKit 3, you can do
that.
So with this new feature, you
can build AR experiences using
both cameras at the same time.
And what this means for you is,
that you can now build two new
types of use cases.
First, you can create World
Tracking experiences.
So, using the back-facing
camera, but also benefit from
face data captured with the
front camera.
And you can create Face Tracking
experiences that make use of the
full device orientation and
position in 6 degrees of
freedom.
All of this is supported on A12
devices and later.
Let's see an example.
So here we're running World
Tracking with plane estimation
but we also placed a face mesh
on top of the plane and are
updating it in real time with
the facial expressions captured
through the front camera.
So let's see how to use the concurrent front and
back camera in the API.
First, let's create a World
Tracking configuration.
Now, the configuration that I
choose determines which camera
stream is actually displayed on
the screen.
So in that case, it would be the
back-facing camera.
Now I'm turning on the new
userFaceTrackingEnabled property
and run the session.
This will cause me to receive face anchors.
And I can then use any information from those
anchors, like the face mesh, blend shapes, or the
anchor's transform itself.
Now, note, since we are working with world
coordinates here, the user's face transform will
be placed behind the camera,
means that in order to visualize
the face, you would need to
translate it to a location
somewhere in front of the
camera.
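[Code sketch: world tracking plus face input from
the front camera, assuming an existing ARSession
named session.]

    import ARKit

    let configuration = ARWorldTrackingConfiguration()
    if ARWorldTrackingConfiguration.supportsUserFaceTracking {
        configuration.userFaceTrackingEnabled = true
    }
    session.run(configuration)

    // Face anchors now arrive alongside the world tracking data.
    func session(_ session: ARSession, didUpdate anchors: [ARAnchor]) {
        for case let faceAnchor as ARFaceAnchor in anchors {
            _ = faceAnchor.geometry      // the face mesh
            _ = faceAnchor.blendShapes   // the facial expressions
            _ = faceAnchor.transform     // sits behind the camera in world coordinates
        }
    }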
Now, let's also look at the face
tracking configuration.
You create your face tracking configuration just
as you always would and set isWorldTrackingEnabled
to true.
And then, after you run the configuration, you can
access in every frame, for example in the
session's didUpdate frame callback, the transform
of the current camera position.
And you can then use that for
whatever use case you have in
mind.
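[Code sketch: face tracking with full device pose,
assuming an existing ARSession named session.]

    import ARKit

    let configuration = ARFaceTrackingConfiguration()
    if ARFaceTrackingConfiguration.supportsWorldTracking {
        configuration.isWorldTrackingEnabled = true
    }
    session.run(configuration)

    // The camera transform now carries the device's position and orientation in 6DOF.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        let cameraTransform = frame.camera.transform
        _ = cameraTransform
    }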
And that's simultaneous front
and back camera in ARKit 3.
We think that you will be able
to enable many great new use
cases with this new API.
[ Applause ]
Thank you.
[ Applause ]
And now, let me hand it over to Thomas, who will
tell you all about collaborative sessions.
[ Applause ]
>> Thank you.
Thank you, Andreas.
Good afternoon, everyone.
My name is Thomas and I'm part
of the ARKit team.
So let's talk about
collaborative sessions.
In ARKit 2, you could create
multiuser experiences with the
ability to save and load world
maps.
You had to save a map on one
device and send it to another
one in order for your users to
jump into the same experience
again.
It was a one-time map sharing experience, and
after that point, the users wouldn't share the
same information anymore.
Well, with collaborative
sessions in ARKit 3, we now
continuously share your mapping
information across the network.
This allows you to create ad hoc
multiuser experiences where your
users are more easily accessing
the same session.
Additionally, we also share ARAnchors across all
devices.
All those anchors are identifiable by the session
ID of the device they come from.
Note that at this point, all the coordinate
systems are independent from each other, even
though we still share the information under the
hood.
Let me show you how this works.
So in this video here, we can
see two users.
Pay attention to the colors.
One user will be showing feature
points in green and another user
will be showing feature points
in red.
As they move around the environment, they're
starting to map the environment and add more
feature points to it.
At this point, looking at the internal
representation of their maps, they don't know
about each other's maps.
As they move around, they gather more feature
points.
When they gather enough matching feature points in
the scene (you can see the internal maps, and pay
attention to the colors and the point where they
finally match), those internal maps merge into
each other and form only one map, which means that
the users now share each other's scene
understanding.
As they move around, they map even more
information.
And they continue sharing that under the hood.
Additionally, ARKit 3 provides you with
ARParticipantAnchors that allow you to understand
in real time where another user is in your
environment.
This is really handy if you want to display, for
example, an icon or something to represent that
user.
As mentioned earlier, ARKit 3 also shares
ARAnchors under the hood, meaning that if you add
an anchor on one device, it will automatically
show up on the other devices.
Let's now have a look at how it
works in code.
As Andreas mentioned earlier,
ARKit is really well integrated
with RealityKit.
If you want to enable the
collaborative session with
RealityKit, it's pretty simple.
You first need to setup your
Multipeer Connectivity session.
Multipeer Connectivity framework
is an Apple framework that
allows you for discovery and
peer-to-peer connections.
Then, you need to pass this Multipeer Connectivity
session to the ARView scene's synchronization
service.
And finally, as with every ARKit experience, you
have to set up your ARWorldTrackingConfiguration,
set the isCollaborationEnabled
flag to true and run that
configuration on the session.
And that's it.
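[Code sketch: the RealityKit setup just described;
mcSession is an assumed, already connected
MCSession and arView an existing ARView.]

    import ARKit
    import RealityKit
    import MultipeerConnectivity

    // Hand the Multipeer Connectivity session to RealityKit's synchronization service.
    arView.scene.synchronizationService =
        try? MultipeerConnectivityService(session: mcSession)

    // Enable collaboration and run the configuration.
    let configuration = ARWorldTrackingConfiguration()
    configuration.isCollaborationEnabled = true
    arView.session.run(configuration)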
So what's going to happen then?
When you set the isCollaborationEnabled flag to
true and run that configuration on the session,
ARKit will essentially start calling a new method
on the ARSessionDelegate for you to transfer that
data.
In the RealityKit use case, we take care of it,
but if you're using ARKit with another renderer,
then you'll have to send that data across the
network yourself.
This data is called AR
collaboration data.
ARKit can create at any point in
time an AR collaboration data
package that you then have to
forward again to other users.
This is not limited to only two
users.
You can have a large amount of
users in that session.
During that process, ARKit will keep generating
additional AR collaboration data that you will
have to broadcast to the other devices as well.
Let's see how that works in
code.
So you first need to set up your Multipeer
Connectivity session, as in this example, or you
can also use any network framework of your choice,
and make sure that your devices are sharing the
same session.
When they do, you create the
ARWorldTrackingConfiguration, set the
isCollaborationEnabled flag to true, and then run
the configuration.
At this point, you will then have a new method
available on your session delegate where you will
be receiving collaboration data.
Upon receiving that data, you need to make sure to
broadcast it on the network to the other users
that are also in this collaborative session.
Upon the reception of that data on the other
devices, you need to update your AR session so
that it knows about this new data.
And that's it.
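[Code sketch: forwarding collaboration data
yourself when you're not using RealityKit's
synchronization service; the broadcastToPeers and
receivedFromPeer helpers and the session property
are assumptions.]

    func session(_ session: ARSession, didOutputCollaborationData data: ARSession.CollaborationData) {
        // Serialize the data and broadcast it to every peer in the session.
        if let encoded = try? NSKeyedArchiver.archivedData(withRootObject: data,
                                                           requiringSecureCoding: true) {
            broadcastToPeers(encoded)   // hypothetical networking helper
        }
    }

    func receivedFromPeer(_ encoded: Data) {
        // On the receiving device, hand the data back to the AR session.
        if let data = try? NSKeyedUnarchiver.unarchivedObject(ofClass: ARSession.CollaborationData.self,
                                                              from: encoded) {
            session.update(with: data)
        }
    }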
The collaborative session data automatically
exchanges all the user-created ARAnchors.
Each anchor is identifiable by a session ID so
that you can tell which device or which AR session
the anchor is coming from.
As mentioned earlier, the
ARParticipantAnchor represents
in real time the participant
position which can be very handy
in some of your use cases.
So this is how you create
collaborative sessions.
[ Applause ]
Let's now talk about coaching.
When you create an AR experience, coaching is
really important.
You really want to guide your users, whether they
are new or returning users, into your AR
experience.
It's not a trivial process.
And sometimes, it's hard to guide the user into
that new experience.
Throughout that process, you
have to react to certain
tracking events.
Sometimes the tracking gets
limited because the user moves
way too fast.
So far, we've been providing you with Human
Interface Guidelines that helped you build those
onboarding experiences.
Well, this year, we're embedding that in a UIView,
and we call it the AR Coaching View.
This is a built-in overlay that
you can directly embed in your
AR applications.
It guides your users to a really
good tracking experience.
It provides a consistent design
throughout your applications so
that your users are very
familiar with it.
You actually may have seen that design before; we
have it in AR Quick Look and in Measure.
This new UI overlay automatically activates and
deactivates based on the different tracking
events.
And you can also set specific coaching goals.
Let's have a look at some of
those overlays.
So in the AR Coaching View, we have multiple
overlays.
The onboarding UI gives your users the ability to
understand what you're looking for, in this case
surfaces.
Most of the time, your experience requires a
surface to place content onto.
So if you enable plane detection on your
configuration, then this overlay will
automatically show up.
Secondly, we have another overlay that gives the
user the ability to understand that they have to
move around a little bit more to gather additional
features so that tracking can work best.
And then finally, we have another overlay which
helps your users relocalize against the
environment in case you lost tracking, for
example, or if the app went into the background.
Let's look at one example.
So, in this example, we're asking the user to move
the device around to find a new plane, and as soon
as the user is moving around and gathering more
features, the content can be placed and the view
deactivates automatically.
So you don't have to do
anything.
Everything is handled
automatically.
[ Applause ]
Let's have a look at how you can
set that up.
Again, this is really easy.
As it's a simple UIView, you set it up as a child
of another view; ideally, you set it as a child of
the ARView.
Then, you need to connect the session to the
coaching view so that the coaching view knows what
events to react to.
Or you can connect the sessionProvider outlet of
the coaching view to the session provider itself
if you're using a storyboard, for example.
Optionally, you can set a delegate if you want to
react to certain events that the view is giving
you.
And finally, you can also
provide a set of specific
coaching goals if you want to
disable certain functionalities.
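[Code sketch: embedding the coaching view as
described, assuming an existing ARView named
arView and a view controller acting as the
delegate.]

    import ARKit

    let coachingOverlay = ARCoachingOverlayView()
    coachingOverlay.frame = arView.bounds
    coachingOverlay.autoresizingMask = [.flexibleWidth, .flexibleHeight]
    arView.addSubview(coachingOverlay)

    coachingOverlay.session = arView.session   // or wire up the sessionProvider outlet in a storyboard
    coachingOverlay.delegate = self            // optional, to react to activation events
    coachingOverlay.goal = .horizontalPlane    // the coaching goal for this experience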
Let's look at some of those
delegates.
So we have three new methods on the
ARCoachingOverlayViewDelegate.
Two of them let you react to activation and
deactivation.
So you can choose if you want to keep coaching
enabled throughout the experience or decide, for
example, that once a user has seen it, they don't
need it anymore.
Additionally, you can react to relocalization
abort requests: by default, the coaching view will
give your users a UI button where they can abort
relocalization and restart the session or reset
the tracking, for example.
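[Code sketch: the three delegate callbacks
mentioned above; ViewController is an assumed
class name.]

    extension ViewController: ARCoachingOverlayViewDelegate {
        func coachingOverlayViewWillActivate(_ coachingOverlayView: ARCoachingOverlayView) {
            // Hide UI that would compete with the coaching overlay.
        }

        func coachingOverlayViewDidDeactivate(_ coachingOverlayView: ARCoachingOverlayView) {
            // Bring your UI back once coaching is done.
        }

        func coachingOverlayViewDidRequestSessionReset(_ coachingOverlayView: ARCoachingOverlayView) {
            // The user asked to start over; rerun the session with a fresh configuration.
        }
    }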
So this new view is really handy
for your applications so that
you can make sure that you've
got that consistent design and
help your users.
Let's now talk about Face
Tracking.
In ARKit 1, we enabled Face Tracking with the
ability to track one face.
Well, in ARKit 3, we now have the ability to do
multi-face tracking, up to three faces
concurrently.
Additionally, we can also make sure to recognize a
person when they leave the frame and come back
again, giving you the same face anchor ID again.
[ Applause ]
So multi-Face Tracking tracks up
to three faces concurrently and
provides you with a persistent
face anchor ID so that you can
make sure to recognize one user
throughout the session.
If you restart a new session,
then this ID disappears and a
new one comes up.
To enable this, this is really
easy.
We have two new properties on
the ARFaceTrackingConfiguration.
The first one allows you to query how many faces
can be tracked concurrently in one session on that
specific device.
And the other one allows you to set the number of
faces that you want to track concurrently.
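[Code sketch: enabling multi-face tracking,
assuming an existing ARSession named session.]

    let configuration = ARFaceTrackingConfiguration()
    // Query how many faces this device can track, then opt in to that many.
    let supported = ARFaceTrackingConfiguration.supportedNumberOfTrackedFaces
    configuration.maximumNumberOfTrackedFaces = supported
    session.run(configuration)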
And that's Multi-Face Tracking.
[ Applause ]
Let's now talk about a new tracking configuration
that we call ARPositionalTrackingConfiguration.
So this new tracking configuration is intended for
tracking-only use cases.
You may often have a use case where you don't
really need the camera backdrop to be rendered,
for example.
Well, this is made for that use case.
It achieves low power consumption by lowering the
capture frame rate and also the camera resolution,
while still keeping your rendering rate at 60
hertz.
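[Code sketch: a tracking-only session, assuming an
existing ARSession named session.]

    if ARPositionalTrackingConfiguration.isSupported {
        let configuration = ARPositionalTrackingConfiguration()
        session.run(configuration)
    }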
Next, let's talk about some of
the scene understanding
improvements we've made this
year.
Image detection and image tracking have been
around for some time now.
This year, we can now detect up to 100 images at
the same time.
We also provide you with the ability to detect the
scale of a printed image, for example.
Oftentimes, when your application requires a user
to use an image to place content and scale that
content accordingly, the image might be printed at
a different size, for example, or on a different
paper size.
With this automatic scale estimation, you can now
detect the physical size and adjust the scale
accordingly.
We also have the ability to
query at run time the quality of
an image that you're passing to
ARKit when you want to create a
new AR reference image.
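[Code sketch: automatic scale estimation and the
runtime quality check, assuming an asset catalog
group named "Gallery", a CGImage named cgImage,
and an ARSession named session.]

    let configuration = ARWorldTrackingConfiguration()
    if let referenceImages = ARReferenceImage.referenceImages(inGroupNamed: "Gallery", bundle: nil) {
        configuration.detectionImages = referenceImages
        configuration.automaticImageScaleEstimationEnabled = true
    }
    session.run(configuration)

    // Validate an image created at run time before using it for detection.
    let reference = ARReferenceImage(cgImage, orientation: .up, physicalWidth: 0.2)
    reference.validate { error in
        if let error = error {
            print("This image is a poor detection candidate: \(error)")
        }
    }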
We've also made improvements to
our object detection algorithms.
With machine learning, we can enhance the object
detection algorithms and provide you with faster
recognition that is also more robust across
different environments.
Oftentimes, you had to scan a specific object in
one environment, and it wouldn't work as well in
another one.
Now, this is a bit more flexible.
Finally, another area of scene
understanding which is really
important is plane estimation.
Oftentimes, you need plane
estimation to place content.
Well, with machine learning, we're actually making
plane estimation even more accurate, more robust,
and faster at detecting planes.
Let's have a look at an example.
With machine learning, not only can we extend
those planes on the ground where features are
detected, with the floor actually going even
further, but we also have the ability to detect
walls on the side even when no feature points are
present.
And this is thanks to machine learning.
[ Applause ]
As you saw in the previous video, we had a couple
of classifications on the planes.
Well, this is done with machine learning as well.
And last year, we introduced
five different classifications,
wall, floor, ceiling, table, and
seat.
Well, this year, we're adding
two additional ones.
We are adding the ability to
detect doors and windows.
As mentioned earlier, plane estimation is really
important to place content in the world.
This is actually ideal for object placement.
You always want your objects to be placed on a
surface.
Well, this year with the new raycasting API, you
can now place your content more easily and more
precisely, and it's even more flexible.
It supports any kind of surface alignment, so
you're not always bound to vertical and horizontal
anymore.
But you can also track your raycast over time,
meaning that as you move your device around and
ARKit detects more information about the
environment, it can keep your object accurately
placed on top of the physical surface as those
planes evolve.
Let's see how you can enable
that in ARKit.
Well, it starts with creating a raycast query.
A raycast query has three
parameters.
The first one decides from where
you want to perform that
raycast.
In this example, we're doing
this from the screenCenter.
Then you need to tell it what kind of target you
want to allow in order to place that content, or
to get the transforms back.
And additionally, you need to tell it which
alignment you want.
It can be horizontal, vertical, or any.
Then you need to pass that query
on to the trackedRaycast method
on your AR session.
This method has a callback that allows you to
react to the new transforms and results given to
you by that raycast, so that you can adjust your
content or your anchors accordingly.
And then finally, when you are
done with this raycast, you can
just stop it.
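[Code sketch: a tracked raycast from the screen
center; arView, placedObject, trackedRaycast, and
screenCenter are assumed properties of the
surrounding view controller.]

    func placeObject(at screenCenter: CGPoint) {
        guard let query = arView.makeRaycastQuery(from: screenCenter,
                                                  allowing: .estimatedPlane,
                                                  alignment: .any) else { return }
        // The update handler fires whenever ARKit refines the surface.
        trackedRaycast = arView.session.trackedRaycast(query) { [weak self] results in
            guard let result = results.first else { return }
            self?.placedObject.transform.matrix = result.worldTransform
        }
    }

    func finishPlacement() {
        // When the placement is final, stop updating.
        trackedRaycast?.stopTracking()
        trackedRaycast = nil
    }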
And those are some of the raycasting improvements
that we've made this year.
Let's move on with some of the visual coherence
enhancements we've made.
So this year, we have this new ARView that allows
you to activate and deactivate different render
options on demand.
They can also be automatically activated and
deactivated based on your device's capabilities.
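[Code sketch: ARView's render options are
expressed as "disable" flags, so the effects are
on by default where the device supports them;
arView is an assumption.]

    arView.renderOptions.insert(.disableMotionBlur)     // switch motion blur off
    arView.renderOptions.remove(.disableDepthOfField)   // make sure depth of field stays on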
Let's look at an example of
this.
You've probably seen that video before.
Look at how the quadrocopter actually moves
around, how the objects are moving as well on the
surface, and how real all of this looks.
When everything disappears, you don't even realize
that those objects were virtual.
Let's look at some of those
visual coherence enhancements
we've made.
Let's look again at the
beginning of that video and
let's pause for a second when
those balls are rolling on the
table.
Here you can see the depth of
field effect which is a new
feature of RealityKit.
Your AR experiences are designed for small and big
rooms.
The camera on your iOS device always adjusts its
focus to the environment, and the depth of field
feature adjusts the focus on the virtual content
so that it matches the physical one and the object
blends perfectly into the environment.
Additionally, when you move the camera quickly or
when an object moves quickly, you can see that
blurriness occurring in the camera image.
Most of the time, when you have a regular renderer
and no motion blur effect, your virtual content
stands out and doesn't really blend nicely into
the environment.
Well, thanks to the VIO camera motion and the
sensor parameters, we can synthesize that motion
blur and apply it to the virtual object so that it
blends perfectly into your environment.
This is available on ARSCNView and on ARView as
well.
Let's look again at this example
where everything looks really
good.
Two additional APIs that we're
making available for visual
coherence enhancement this year
are HDR environment textures and
camera grain.
When you place your content, you
really want that content to
reflect the real world.
Well, with high dynamic range, you can capture,
even in bright light environments, those higher
highlights that make your content more vibrant.
And with ARKit this year, you can actually request
those HDR environment textures so that your
content looks even better.
Additionally, we also have a
camera grain API.
You've probably noticed, when you have an AR
experience in a very low light environment, how
your virtual content really stands out compared to
the camera image.
Every camera produces some grain, and especially
in low light, this grain can be a bit heavier.
Well, with this new camera grain API, we can make
sure to apply the same grain pattern to your
virtual content so that it blends in nicely and
doesn't stand out.
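[Code sketch: requesting HDR environment textures
and reading the camera grain data; session and
sceneView are assumptions.]

    let configuration = ARWorldTrackingConfiguration()
    configuration.environmentTexturing = .automatic
    configuration.wantsHDREnvironmentTextures = true
    session.run(configuration)

    // With ARSCNView, camera grain can simply be switched on.
    sceneView.rendersCameraGrain = true

    // Custom Metal renderers can sample the grain texture from every frame.
    func session(_ session: ARSession, didUpdate frame: ARFrame) {
        let grainTexture = frame.cameraGrainTexture   // MTLTexture? with the grain pattern
        let intensity = frame.cameraGrainIntensity    // how strongly to apply it
        _ = (grainTexture, intensity)
    }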
So those are some of the visual coherence
enhancements for this year.
But we didn't stop there.
I think there's one feature that
a lot of you have been
requesting.
When you develop an AR experience, you often have
to go to a certain location in order to prototype
or to test your experience.
And most of the time, you go there, you come back
to your desk, you develop your experience, and
then you want to go back there again.
Well, this year with the Reality Composer app, you
can now record an experience, or a sequence.
Meaning that you can go to your favorite place,
the place where your experience will be happening,
and capture the environment.
ARKit will make sure to save the
sensor data alongside the video
stream into a movie file
container so that you can take
it with you and put it in Xcode.
At that point, the Xcode scheme settings have a
new additional field that allows you to select
that file.
When that file is selected and you press Run on
the device that is attached to Xcode, you can
replay that experience at your desk.
This is ideal for prototyping, and even better for
tweaking your AR configuration with different
parameters and seeing what the experience looks
like.
You can even react to certain
tracking--
[ Applause ]
So this is great.
I think you've got a lot of
different tools this year in
ARKit 3 where you can enhance
your multiuser experiences with
collaborative sessions,
multiple-face tracking.
You can improve the realism of
all your AR apps with
RealityKit's new coherence
effect, and also the new visual
effects on the
ARWorldTrackingConfiguration.
You can enable new use cases with the new motion
capture, and also the simultaneous front and back
camera.
And, of course, there are lots
of improvements under the hood
on existing features.
As an example, with object
detection and machine learning.
And last but not least, the record and replay
workflow will, I think, make your design and your
experience prototyping even better than before.
So I'm really looking forward to you downloading
our samples from the website.
We also have a couple of labs,
one tomorrow and one on
Thursday, where I hope you will
be coming and asking us the
questions you have or just to
chat.
And then we also have two in-depth sessions, one
of which is about bringing people into AR, which
will talk more about people occlusion and motion
capture.
And the second one is about
collaborative AR experiences.
I hope you enjoy the rest of the
conference.
Have a great day.
Bye.
[ Applause ]