WWDC2016 Session 511

Transcript

>> Hi and welcome
to Session 511,
AVCapturePhotoOutput,
Beyond the Basics.
This is a chalk talk addendum to Session 501, Advances in iOS Photography.
I'm Brad Ford.
I'm an engineer on the core
media capture team at Apple.
In Session 501, we focused on AV Foundation's camera capture APIs,
specifically the
AVCapturePhotoOutput,
which is a new interface
for taking photos in iOS 10.
This output supports capturing Live Photos, RAW + DNG, wide color content, and preview or thumbnail images.
If you haven't watched
Session 501 yet,
I recommend pausing here and
watching Session 501 first.
You'll get a lot more
out of this addendum.
In this session, we'll move
beyond the AVCapturePhotoOutput
basics and discuss two important
topics we didn't have time
for in Session 501.
Namely, Scene Monitoring,
and Resource Preparation
and Reclamation.
Lastly, we'll spend a few
minutes on an unrelated
but still very important topic,
Camera Privacy Policy
Changes in iOS 10.
By way of minimal review,
the new AVCapturePhotoOutput
has an improved interface
that addresses some
of AVCaptureStillImageOutput's
design challenges.
AVCapturePhotoOutput uses a functional programming model.
There are clear delineations
between mutable and
immutable data.
It uses a separate object to encapsulate per-photo settings, called AVCapturePhotoSettings.
You pass it when making
a photo capture request.
It uses a delegate
style interface
for tracking the progress
of photo capture requests.
This is called the AVCapturePhotoCaptureDelegate protocol.
All callbacks in the delegate
protocol return an instance
of
AVCaptureResolvedPhotoSettings.
This is an immutable object
in which all photo
settings have been resolved.
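To make that flow concrete, here's a minimal sketch, assuming a hypothetical photoOutput already attached to a running AVCaptureSession, with self adopting the delegate protocol; the callback uses the iOS 10-era Swift 3 signature.

    // Request a capture: one immutable settings object per request.
    let settings = AVCapturePhotoSettings()   // default constructor: JPEG output
    settings.flashMode = .auto
    photoOutput.capturePhoto(with: settings, delegate: self)

    // One of the AVCapturePhotoCaptureDelegate callbacks; every callback
    // receives the immutable AVCaptureResolvedPhotoSettings for the request.
    func capture(_ captureOutput: AVCapturePhotoOutput,
                 didFinishProcessingPhotoSampleBuffer photoSampleBuffer: CMSampleBuffer?,
                 previewPhotoSampleBuffer: CMSampleBuffer?,
                 resolvedSettings: AVCaptureResolvedPhotoSettings,
                 bracketSettings: AVCaptureBracketedStillImageSettings?,
                 error: Error?) {
        // resolvedSettings.isFlashEnabled reports whether the flash actually fired.
    }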
AVCapturePhotoOutput also
supports Scene Monitoring using
a subset of these capture
objects I just talked about.
Scene monitoring
allows you to present UI
that informs the user which scene-dependent features are currently active.
In this screenshot of Apple's Camera app, the user is clearly in a low-light situation.
The flash iconography at the
bottom of the screen indicates
that the user is
in auto flash mode,
meaning the flash should only be used if the situation requires it.
Apple's Camera app is a client
of AVCapturePhotoOutput,
which performs Scene Monitoring
to drive the yellow flash-active badge that you see in the top middle.
The presence of the yellow
flash badge shows the user
that if they take a picture
now, the flash is going to fire.
AVCapturePhotoOutput
offers Scene Monitoring
for two kinds of scenes.
The first is the flash.
All of Apple's current iPhone models, as well as the 9.7-inch iPad Pro, have a True Tone flash to illuminate dark scenes for the rear-facing iSight camera, and a Retina Flash that turns your Retina display into a True Tone flash, illuminating it at up to three times normal brightness in order to brighten up selfies in low light.
The second type of supported
Scene Monitoring is Still
Image Stabilization.
Still Image Stabilization is a multi-image fusion capture that blends differently exposed images to reduce blur in low-light situations.
It might not be totally obvious why Still Image Stabilization is a low-light feature; it's not that your hands shake more in the dark. It's just that the camera needs to expose longer to gather the same number of photons, requiring the shooter to be very, very steady.
Still Image Stabilization
counters this problem
by capturing multiple images
at different exposures
and then fusing them together
to reduce noise and
motion artifacts.
So at first glance, flash worthiness and Still Image Stabilization worthiness would seem like orthogonal features, but they're actually closely related. And this causes some API ambiguity.
Looking at this graph, we see
the applicable light ranges
for Flash Capture with
and without Still
Image Stabilization.
I've shortened Still
Image Stabilization
to SIS for brevity.
The blue bar represents the light levels at which the photo output will use the flash if you've opted in for SIS.
The green bar represents
the applicable light levels
for flash if you've
opted out of SIS.
Note that with SIS on,
the photo output can do
without the flash
in darker scenes.
This is because SIS lowers the noise in the image to the point that the flash is not needed.
If your current scene's light level is, say, here, the answer to the question "Is this a flash scene?" is a resounding yes.
But if the light level is here, the answer depends on whether you're interested in using Still Image Stabilization, and the inverse is true as well. So what to do?
The AVCapturePhotoOutput
doesn't know what kind
of capture you want
until you request it.
But if you're using
Scene Monitoring,
it needs to run continuously.
Is the current scene a SIS
scene or a flash scene?
In AVCapturePhotoOutput we've addressed this ambiguity with a specific API for Scene Monitoring called photoSettingsForSceneMonitoring.
And we've provided two key
value observable properties
that can asynchronously
inform you
when scene suitability
changes with respect
to Still Image Stabilization
or flash.
You create an
AVCapturePhotoSettings instance
specifically for Scene
Monitoring and specify
which features you'd
like AVCapturePhotoOutput
to consider.
Here I've set the flash mode to auto, indicating that I'm interested in using the flash feature when it's available, and I've also set isAutoStillImageStabilizationEnabled to true, so SIS should be considered too.
SIS tends to give better image
quality results than flash,
so if a scene falls into the
overlapping range between SIS
and flash, the photo output reports that it's an SIS scene.
Next, I assign this object to the photoSettingsForSceneMonitoring property.
This property can be set
at any time including
before you start the
AVCaptureSession running.
To be informed of
changes to flash
and Still Image Stabilization
Scene worthiness,
I add key value observers for the aforementioned isFlashScene and isStillImageStabilizationScene properties.
And I'm called back as
scene worthiness changes
for those two properties.
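Put together, the whole setup might look like this sketch, assuming a hypothetical photoOutput property and self acting as the key value observer:

    // Describe the features Scene Monitoring should consider.
    let monitoringSettings = AVCapturePhotoSettings()
    monitoringSettings.flashMode = .auto                            // consider flash
    monitoringSettings.isAutoStillImageStabilizationEnabled = true  // consider SIS
    photoOutput.photoSettingsForSceneMonitoring = monitoringSettings

    // Key value observe the two scene-worthiness properties.
    photoOutput.addObserver(self, forKeyPath: #keyPath(AVCapturePhotoOutput.isFlashScene),
                            options: .new, context: nil)
    photoOutput.addObserver(self, forKeyPath: #keyPath(AVCapturePhotoOutput.isStillImageStabilizationScene),
                            options: .new, context: nil)

    override func observeValue(forKeyPath keyPath: String?, of object: Any?,
                               change: [NSKeyValueChangeKey : Any]?,
                               context: UnsafeMutableRawPointer?) {
        if keyPath == #keyPath(AVCapturePhotoOutput.isFlashScene) {
            // Update your flash badge UI to match photoOutput.isFlashScene.
        }
    }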
Now let's talk about
Scene Monitoring defaults.
photoSettingsForSceneMonitoring
is a nullable property,
and its default value is nil,
meaning no scenes
are being monitored.
If you query isStillImageStabilizationScene or isFlashScene without first configuring photo settings for Scene Monitoring, they will answer false forever and ever.
Once you do configure photo
settings for Scene Monitoring,
you can query or key value
observe the two isScene
properties and get
appropriate answers.
Be aware, though, that if your photo settings for Scene Monitoring contain a flash mode of off, isFlashScene will always report false. Ditto for isAutoStillImageStabilizationEnabled.
My recommendations for
Scene Monitoring are simple.
If your app doesn't display
any UI indicating what kind
of scene the user is
seeing, you don't need
to enable Scene Monitoring.
But if you do, monitor
what you intend to capture.
For example, if you intend to capture using auto flash but not SIS, then monitor with flash mode set to auto and auto SIS off.
Doing otherwise will
likely confuse your user,
as your UI might report
that it's not a flash scene
while the flash actually does
fire when taking a picture.
That wraps up Scene Monitoring.
On to our next Beyond
the Basics topic,
Resource Preparation
and Reclamation.
To understand the need for
on-demand resource preparation,
let's look at AVCaptureSession's
normal flow of data.
When you call AVCaptureSession
startRunning,
data begins flowing from all your AVCaptureInputs to the various AVCaptureOutputs.
Most outputs receive and handle
this data in a streaming manner,
such as the VideoPreviewLayer,
which continuously displays
input data to the screen.
Or VideoDataOutput
which pushes buffers
to your app via delegate
callback.
Streaming outputs such as these
require a disruptive capture
render pipeline rebuild if you
change their configuration.
You have to configure
them for one type
of output before you
call startRunning.
AVCapturePhotoOutput
is different,
since it only receives data from
its input on an as-needed basis.
When you request a photo by calling capturePhoto(with:delegate:), the photo output delivers just one result or set of results.
Unlike the streaming outputs,
the photo output has
a lot of downtime.
It's perfectly positioned to
prepare or reclaim resources
on demand without causing a
disruptive reconfiguration
of the render pipeline.
It has the luxury of preparing
while no one's watching.
Resource preparation
isn't free, of course.
And AVCapturePhotoOutput's
feature set is extensive.
Taking an uncompressed 4:2:0 photo in the native format of the AVCaptureDevice requires some minimal resources.
Processed output such as BGRA or JPEG requires additional resources,
since there's a format
conversion involved.
Flash captures require their
own set of hardware resources
for delivering the
pre-flash sequence
and strobe synchronized result.
Still Image Stabilization
requires multiple buffers
for fusion.
RAW capture requires very large buffers.
RAW + JPEG requires
a combination
of resources big and small.
And bracketed capture
requires multiple buffers
to return multiple
images to the client.
And of course, many of
these features can be mixed
and matched, requiring
a superset of resources.
With so many capture features
available, it's difficult
for the AVCapturePhotoOutput
to guess how many
resources to prepare upfront.
And both over-preparing and
under-preparing are bad.
We liken over-preparing to baking a birthday cake every day of the year, just in case it's your birthday.
It's a lot of effort for us.
A lot of material invested.
A lot of uneaten cake
gets thrown away.
Video preview might come
up slower each time.
Memory consumption might
be needlessly high.
Under-preparing is just
as bad, if not worse.
If we're not ready to capture a photo with your requested feature set, we might miss the shot while allocating resources on demand.
Fortunately, we've
provided a solution.
AVCapturePhotoOutput allows you
to tell it in advance what kinds
of captures you're
interested in.
You do this by calling
setPreparedPhotoSettingsArray,
passing an array of
AVCapturePhotoSettings
with each one representing a
different type of capture you'd
like it to prepare for.
You can optionally pass a
completion handler to be called
when preparation is complete.
The photo output also
provides a read only
preparedPhotoSettingsArray
property
so you can query the settings
array that you last set.
The
setPreparedPhotoSettingsArray
function can do several things.
It prepares resources for
all the types of capture
in your array of settings.
Also, it reclaims unneeded
resources if there are any.
And by passing an empty array,
you can reclaim everything.
And by passing an empty array,
you can reclaim everything.
It calls you back when all
resources are prepared.
And it returns an error if
resources couldn't be prepared.
This is all delivered via
the completion callback.
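As a sketch, preparing for two kinds of captures might look like this, again assuming a hypothetical photoOutput attached to a session:

    // One settings object per kind of capture you want to be ready for.
    let jpegSettings = AVCapturePhotoSettings()       // JPEG with auto SIS
    let flashSettings = AVCapturePhotoSettings()
    flashSettings.flashMode = .auto                   // flash captures too

    photoOutput.setPreparedPhotoSettingsArray([jpegSettings, flashSettings]) { prepared, error in
        if let error = error {
            print("Preparation failed: \(error)")     // resources couldn't be allocated
        } else if prepared {
            // All resources are ready; captures of these types will be fast.
        }
    }

    // Passing an empty array reclaims all prepared resources:
    // photoOutput.setPreparedPhotoSettingsArray([], completionHandler: nil)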
preparedPhotoSettingsArray's default value is a default-constructed AVCapturePhotoSettings, which has JPEG set as the output format and auto Still Image Stabilization enabled.
preparedPhotoSettingsArray
is a sticky property.
It persists across
AVCaptureSession start
or stopRunning, begin
or commitConfiguration,
and you can set it and forget it
if you always take the same
kinds of captures in your app.
Another nice feature of setPreparedPhotoSettingsArray is
that it participates
in AVCaptureSession
begin/commitConfiguration
deferred work semantics.
That is, if you call beginConfiguration, then change your session's topology by adding or removing inputs or outputs, then set a new preparedPhotoSettingsArray, and then commit the configuration, the preparation won't occur until commitConfiguration is called.
You can atomically change
your session configuration
and prepare your photo output
for the new configuration
simultaneously.
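Here's what that atomic change might look like, with session, oldOutput, newOutput, and newSettings all hypothetical:

    session.beginConfiguration()
    session.removeOutput(oldOutput)
    session.addOutput(newOutput)
    // Deferred along with the topology change above...
    newOutput.setPreparedPhotoSettingsArray([newSettings], completionHandler: nil)
    session.commitConfiguration()   // ...and applied here, atomically.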
You can prepare before running
your AVCaptureSession to ensure
that your app is ready
to capture photos as soon
as video preview starts running.
If you call
setPreparedPhotoSettingsArray
when the session is stopped,
it doesn't call your completion
handler back right away.
Instead, the handler is called
when preparation completes,
which is after you call
session startRunning.
If your session is stopped
and you prepare with one set
of settings and then you change
your mind and call it again
with another set of settings,
your first completion handler fires immediately with prepared set to false.
This is effectively
a cancellation
of the first preparation.
We have three simple
recommendations
on how you should
use our prepare APIs.
Firstly, prepare.
You can always issue a capture
request without preparing first,
but if the photo output isn't
prepared for precisely the type
of capture you want,
you might get
that first image back slowly.
Second, prepare before calling
startRunning on your session.
Knowing the kinds of
captures you're interested
in lets the session allocate
just the right amount
for you during startup.
Third, re-prepare only
when your UI changes.
You don't need to re-prepare
every time you capture a photo,
just when you change the types
of capture you'll be performing,
like when your user toggles RAW
Capture or Bracketed Capture
on or off in your app.
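For instance, a RAW toggle handler might re-prepare like this sketch; the handler name is hypothetical, and availableRawPhotoPixelFormatTypes is empty on devices without RAW support:

    func rawToggleChanged(isOn: Bool) {
        var settingsToPrepare = [AVCapturePhotoSettings()]   // always prepare for JPEG
        if isOn, let rawType = photoOutput.availableRawPhotoPixelFormatTypes.first {
            // In iOS 10's Swift 3 interface this array is [NSNumber];
            // there, pass rawType.uint32Value to the initializer.
            settingsToPrepare.append(AVCapturePhotoSettings(rawPixelFormatType: rawType))
        }
        photoOutput.setPreparedPhotoSettingsArray(settingsToPrepare, completionHandler: nil)
    }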
Not all AVCapturePhotoOutput
features qualify
for on-demand resource
preparation.
The first of these is
isHighResolutionCaptureEnabled.
Some camera formats allow you
to capture a high resolution
still image that is bigger
than the format's sustainable
streaming resolution.
For instance, the front camera's
photo format on iPhone 6s
and 6s Plus supports
5 megapixel stills
but can only stream
at 1280 by 960.
When the camera is
configured with this format,
it can either deliver
1280 by 960 stills
or 5 megapixel stills depending
on whether your photo settings
specify high resolution capture.
But the camera must be configured for 5 megapixel stills upfront, so AVCapturePhotoOutput requires you to opt in to the feature before you start running by setting isHighResolutionCaptureEnabled to true.
Once you've opted in,
you can take stills with
or without high res
capture enabled
without causing an
expensive graph rebuild.
Similarly, Live Photo capture involves delivering a movie asset as well as a still image. The movie contains samples from the past: the 1.5 seconds before your capture request.
So the capture render pipeline
must be configured upfront
to do this special
kind of capture.
Lastly, live photos
can be intelligently
and automatically
trimmed at capture time
if large purposeful
motion is detected,
such as dropping your arm down to put the device in your pocket.
If you wish to capture full-duration, untrimmed Live Photos, you must opt out of auto trimming before calling startRunning on your AVCaptureSession.
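A sketch of those upfront opt-ins, again assuming hypothetical photoOutput and session objects:

    // These must be set before startRunning(); they reconfigure the pipeline.
    photoOutput.isHighResolutionCaptureEnabled = true      // allow e.g. 5 MP stills
    if photoOutput.isLivePhotoCaptureSupported {
        photoOutput.isLivePhotoCaptureEnabled = true
        photoOutput.isLivePhotoAutoTrimmingEnabled = false // keep full-duration movies
    }
    session.startRunning()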
Our last topic of the day is
Camera Privacy Policy Changes
in iOS 10.
Let's review Apple's Privacy
Policy with respect to media.
Photos and videos on a user's
iOS device are personal,
private and sensitive data.
Use of the camera or microphone
is a privileged allowance
that must be granted
explicitly by the user.
So beginning in iOS 7,
users were notified the first
time an app used the camera
or microphone and given an
opportunity to disallow it.
This is a very good thing.
Transparency and trust are well
worth the one-time annoyance
of tapping okay.
In iOS 10, we're requiring
apps to go one step further
in transparency by informing
the user why they want
to access sensitive data.
Sometimes your UI makes it
obvious, but sometimes not.
Your reason string should
remove all ambiguity.
For instance, here AVCam is
telling the user it wants
to use the camera to
take photos and video.
That's a pretty explicit
statement
about what it will
use the camera for.
Likewise, apps linked against
iOS 10 must provide a reason
string for using the microphone.
And lastly, the Photos Library.
You should be clear in your
reason string with respect
to the Photos Library.
Are you using it for
reading or writing or both?
In the latest version of
Xcode you'll find a list
of possible privacy description
keys, not just for camera,
mic and photos, but for
access to all sensitive data.
In order to use any
of these services,
you must provide
a reason string.
If you don't, your app
will not be granted access
to the desired service.
The three specific keys
you should be concerned
about for Capture are
NSCameraUsageDescription,
NSMicrophoneUsageDescription,
and
NSPhotoLibraryUsageDescription.
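The reason strings themselves go in your app's Info.plist under those keys. At runtime you can also ask for camera access explicitly, as in this sketch using the iOS 10-era Swift 3 API:

    // Info.plist carries the reason strings, for example:
    //   NSCameraUsageDescription: "to take photos and video"
    // The system shows them the first time your app uses the camera or mic.
    AVCaptureDevice.requestAccess(forMediaType: AVMediaTypeVideo) { granted in
        if granted {
            // Safe to configure and start the AVCaptureSession.
        } else {
            // Access denied; capture will deliver no video data.
        }
    }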
Let's summarize what we've learned.
AVCapturePhotoOutput
allows fine control
of scene monitoring behavior.
It also allows on-demand
resource allocation
and reclamation.
And Capture clients must provide
a reason for camera, mic,
and photos use as of iOS 10.
For more information, visit the URL for Session 501, Advances in iOS Photography.
And if you're still at the show,
we invite you to visit all three
of these related
sessions that have to do
with photography,
RAW, and Wide Color.
Thanks for watching and
happy photo capture.
Enjoy the rest of the show.