Math Function   Min Accuracy - ULP values
1.0 / x         Correctly rounded
x / y           Correctly rounded
acos(x)         <= 1 ulp
acosh(x)        <= 1 ulp
asin(x)         <= 1 ulp
asinh(x)        <= 1 ulp
atan(x)         <= 1 ulp
atanh(x)        <= 1 ulp
atan2(y, x)     <= 1 ulp
cos(x)          <= 1 ulp
cosh(x)         <= 1 ulp
cospi(x)        <= 1 ulp
exp(x)          <= 1 ulp
exp2(x)         <= 1 ulp
exp10(x)        <= 1 ulp
fabs            0 ulp
fdim            Correctly rounded
floor           Correctly rounded
fma             Correctly rounded
fmax            0 ulp
fmin            0 ulp
fmod            0 ulp
fract           Correctly rounded
frexp           0 ulp
ilogb           0 ulp
ldexp           Correctly rounded
log(x)          <= 1 ulp
log2(x)         <= 1 ulp
2017-9-12 | Copyright © 2017 Apple Inc. All Rights Reserved.
Math Function   Min Accuracy - ULP values
log10(x)        <= 1 ulp
modf            0 ulp
pow(x, y)       <= 2 ulp
powr(x, y)      <= 2 ulp
rint            Correctly rounded
round(x)        Correctly rounded
rsqrt           Correctly rounded
sin(x)          <= 1 ulp
sinh(x)         <= 1 ulp
sincos(x)       ULP values as defined for sin(x) and cos(x)
sinpi(x)        <= 1 ulp
sqrt(x)         Correctly rounded
tan(x)          <= 1 ulp
tanh(x)         <= 1 ulp
tanpi(x)        <= 1 ulp
trunc           Correctly rounded
NOTE: Even though the precision of individual math operations and functions is specified in
Tables 36, 37, and 38, the Metal compiler, in fast math mode, may re-associate floating-point
operations in ways that dramatically change floating-point results. Re-association may change
or ignore the sign of zero, allow optimizations to assume the arguments and result are not NaN
or +/-INF, and inhibit or create underflow or overflow. Fast math mode thus cannot be used by
code that relies on rounding behavior such as (x + 2^52) - 2^52, or on ordered floating-point
comparisons.
The ULP is defined as follows:
If x is a real number that lies between two finite consecutive floating-point numbers a and b,
without being equal to one of them, then ulp(x) = |b − a|, otherwise ulp(x) is the distance
between the two non-equal finite floating-point numbers nearest x. Moreover, ulp(NaN) is NaN.
7.5 Edge Case Behavior in Flush To Zero Mode
If denormals are flushed to zero, then a function may return one of four results:
1. Any conforming result for non-flush-to-zero mode.
2. If the result given by (1) is a subnormal before rounding, it may be flushed to zero.
3. Any non-flushed conforming result for the function if one or more of its subnormal
operands are flushed to zero.
4. If the result of (3) is a subnormal before rounding, the result may be flushed to zero.
In each of the above cases, if an operand or result is flushed to zero, the sign of the zero is
undefined.
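As an illustration of cases (3) and (4), the following C++ sketch models flush-to-zero behavior for a multiplication on a host CPU. The helpers flush_subnormal and mul_ftz_model are hypothetical names, not Metal functions. The sketch makes two simplifying assumptions: it tests the already-rounded result rather than the result before rounding, and it always returns +0.0 for a flushed value even though the sign of the zero is undefined.

```cpp
#include <cmath>

// Hypothetical helper: replaces a subnormal value with zero.
// Assumption: the sketch picks +0.0f; the specification leaves the sign undefined.
float flush_subnormal(float v) {
    return std::fpclassify(v) == FP_SUBNORMAL ? 0.0f : v;
}

// Hypothetical model of one permitted flush-to-zero outcome for a * b:
// subnormal operands are flushed first (case 3), then a subnormal
// product is flushed as well (case 4).
float mul_ftz_model(float a, float b) {
    const float product = flush_subnormal(a) * flush_subnormal(b);
    return flush_subnormal(product);
}
```

For example, multiplying the smallest normal float (2^-126) by 0.5 gives a subnormal product, so the model returns zero. A conforming implementation could equally return the subnormal value itself (case 1).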
7.6 Conversion Rules for Floating-Point and Integer Types
The round to zero rounding mode is used for conversions from a floating-point type to an
integer type. The round to nearest even or round to zero rounding mode is used for conversions
from a floating-point or integer type to a floating-point type.
The conversions from half to float are lossless. Conversions from float to half round the
mantissa using the round to nearest even rounding mode. Denormalized numbers for the half data
type which may be generated when converting a float to a half may not be flushed to zero.
When converting a floating-point type to an integer type, if the floating-point value is NaN, the
integer result is 0.
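The float-to-integer rules above can be sketched on a host CPU as follows. The function name float_to_int_rtz is hypothetical. The clamping of out-of-range inputs is an assumption added only because a plain C++ cast of such values is undefined behavior; the text above does not state what Metal does in that case.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical helper mirroring the rules above: round toward zero, NaN maps to 0.
int32_t float_to_int_rtz(float f) {
    if (std::isnan(f)) {
        return 0;  // NaN converts to 0
    }
    // Assumption (not from the text): saturate values outside the int32_t range.
    if (f >= 2147483648.0f) {
        return std::numeric_limits<int32_t>::max();
    }
    if (f <= -2147483648.0f) {
        return std::numeric_limits<int32_t>::min();
    }
    return static_cast<int32_t>(f);  // a C++ cast truncates toward zero
}
```

With this sketch, 2.9 converts to 2 and -2.9 converts to -2, since truncation discards the fraction regardless of sign.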
7.7 Texture Addressing and Conversion Rules
The texture coordinates specified to the sample, sample_compare, gather, gather_compare, read
and write functions cannot be INF or NaN. In addition, the texture coordinate must refer to a
region inside the texture for the texture read and write functions.
In the sections that follow, we discuss conversion rules that are applied when reading and
writing textures in a graphics or kernel function. When a multisample resolve operation is
performed, the conversion rules described in this section do not apply.
7.7.1 Conversion Rules for Normalized Integer Pixel Data Types
In this section we discuss converting normalized integer pixel data types to floating-point
values and vice-versa.
7.7.1.1 Converting Normalized Integer Pixel Data Types to Floating-Point Values
For textures that have 8-bit, 10-bit or 16-bit normalized unsigned integer pixel values, the
texture sample and read functions convert the pixel values from an 8-bit, 10-bit or 16-bit
unsigned integer to a normalized single or half-precision floating-point value in the range
[0.0 … 1.0].
For textures that have 8-bit or 16-bit normalized signed integer pixel values, the texture sample
and read functions convert the pixel values from an 8-bit or 16-bit signed integer to a
normalized single or half-precision floating-point value in the range [-1.0 … 1.0].
These conversions are performed as listed in the second column of Table 39. The precision of
the conversion rules is guaranteed to be <= 1.5 ulp except for the cases described in the third
column.
Table 39 Rules for Conversion to a Normalized Float Value
Convert from                        Conversion Rule to Normalized Float   Corner Cases
1-bit normalized unsigned integer   float(c)                              0 must convert to 0.0
                                                                          1 must convert to 1.0
2-bit normalized unsigned integer   float(c) / 3.0                        0 must convert to 0.0
                                                                          3 must convert to 1.0
4-bit normalized unsigned integer   float(c) / 15.0                       0 must convert to 0.0
                                                                          15 must convert to 1.0
5-bit normalized unsigned integer   float(c) / 31.0                       0 must convert to 0.0
                                                                          31 must convert to 1.0
6-bit normalized unsigned integer   float(c) / 63.0                       0 must convert to 0.0
                                                                          63 must convert to 1.0
8-bit normalized unsigned integer   float(c) / 255.0                      0 must convert to 0.0
                                                                          255 must convert to 1.0
10-bit normalized unsigned integer  float(c) / 1023.0                     0 must convert to 0.0
                                                                          1023 must convert to 1.0
16-bit normalized unsigned integer  float(c) / 65535.0                    0 must convert to 0.0
                                                                          65535 must convert to 1.0
8-bit normalized signed integer     max(-1.0, float(c) / 127.0)           -128 and -127 must convert to -1.0
                                                                          0 must convert to 0.0
                                                                          127 must convert to 1.0
16-bit normalized signed integer    max(-1.0, float(c) / 32767.0)         -32768 and -32767 must convert to -1.0
                                                                          0 must convert to 0.0
                                                                          32767 must convert to 1.0
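As a small illustration of the 8-bit rows of Table 39, the following C++ sketch applies the listed rules on a host CPU. The function names unorm8_to_float and snorm8_to_float are hypothetical; the same pattern applies to the other bit widths with the divisor changed.

```cpp
#include <algorithm>
#include <cstdint>

// 8-bit normalized unsigned integer: float(c) / 255.0
float unorm8_to_float(uint8_t c) {
    return static_cast<float>(c) / 255.0f;
}

// 8-bit normalized signed integer: max(-1.0, float(c) / 127.0)
// The max() makes both -128 and -127 convert to -1.0.
float snorm8_to_float(int8_t c) {
    return std::max(-1.0f, static_cast<float>(c) / 127.0f);
}
```

The corner cases in the third column hold for this sketch: 0 maps to 0.0, the largest code maps to 1.0, and the two most negative signed codes both map to -1.0.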
7.7.1.2 Converting Floating-Point Values to Normalized Integer Pixel Data Types
For textures that have 8-bit, 10-bit or 16-bit normalized unsigned integer pixel values, the
texture write functions convert the single or half-precision floating-point pixel value to an
8-bit, 10-bit or 16-bit unsigned integer.
For textures that have 8-bit or 16-bit normalized signed integer pixel values, the texture write
functions convert the single or half-precision floating-point pixel value to an 8-bit or 16-bit
signed integer.
NaN values are converted to zero.
Conversions from floating-point values to normalized integer values are performed as listed in
Table 40.
Table 40 Rules for Conversion from Floating-Point to a Normalized Integer Value
Convert to                          Conversion Rule to Normalized Integer
1-bit normalized unsigned integer   x = min(max(f, 0.0), 1.0)
                                    i_{0:0} = int_RTNE(x)
2-bit normalized unsigned integer   x = min(max(f * 3.0, 0.0), 3.0)
                                    i_{1:0} = int_RTNE(x)
4-bit normalized unsigned integer   x = min(max(f * 15.0, 0.0), 15.0)
                                    i_{3:0} = int_RTNE(x)
5-bit normalized unsigned integer   x = min(max(f * 31.0, 0.0), 31.0)
                                    i_{4:0} = int_RTNE(x)
6-bit normalized unsigned integer   x = min(max(f * 63.0, 0.0), 63.0)
                                    i_{5:0} = int_RTNE(x)
8-bit normalized unsigned integer   x = min(max(f * 255.0, 0.0), 255.0)
                                    i_{7:0} = int_RTNE(x)
10-bit normalized unsigned integer  x = min(max(f * 1023.0, 0.0), 1023.0)
                                    i_{9:0} = int_RTNE(x)
16-bit normalized unsigned integer  x = min(max(f * 65535.0, 0.0), 65535.0)
                                    i_{15:0} = int_RTNE(x)
8-bit normalized signed integer     x = min(max(f * 127.0, -127.0), 127.0)
                                    i_{7:0} = int_RTNE(x)
16-bit normalized signed integer    x = min(max(f * 32767.0, -32767.0), 32767.0)
                                    i_{15:0} = int_RTNE(x)
In Metal Shading Language 2.0, the following restriction has been removed:
The GPU may choose to approximate the rounding mode used in the conversions from floating-point
to integer value described in the table above. If a rounding mode other than round to nearest
even is used, the absolute error of the implementation-dependent rounding mode vs. the result
produced by the round to nearest even rounding mode must be <= 0.6.
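The 8-bit rows of Table 40 can be sketched in C++ as follows. The function names float_to_unorm8 and float_to_snorm8 are hypothetical. The sketch relies on std::nearbyint, which rounds to nearest even as long as the host's floating-point rounding mode is left at its default; NaN is mapped to zero explicitly, as stated earlier in this section.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// 8-bit normalized unsigned integer: x = min(max(f * 255.0, 0.0), 255.0), then int_RTNE(x).
uint8_t float_to_unorm8(float f) {
    if (std::isnan(f)) {
        return 0;  // NaN values are converted to zero
    }
    const float x = std::min(std::max(f * 255.0f, 0.0f), 255.0f);
    // Assumes the default rounding mode (round to nearest, ties to even).
    return static_cast<uint8_t>(std::nearbyint(x));
}

// 8-bit normalized signed integer: x = min(max(f * 127.0, -127.0), 127.0), then int_RTNE(x).
int8_t float_to_snorm8(float f) {
    if (std::isnan(f)) {
        return 0;
    }
    const float x = std::min(std::max(f * 127.0f, -127.0f), 127.0f);
    return static_cast<int8_t>(std::nearbyint(x));
}
```

Round to nearest even shows up at exact ties: an input of 0.5 scales to 127.5, which rounds to the even neighbor 128 rather than to 127.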
7.7.2 Conversion Rules for Half-Precision Floating-Point Pixel Data Type
For textures that have half-precision floating-point pixel color values, the conversions from
half to float are lossless. Conversions from float to half round the mantissa using the round to
nearest even rounding mode. Denormalized numbers for the half data type which may be generated
when converting a float to a half may not be flushed to zero. A float NaN may be converted to an
appropriate NaN or be flushed to zero in the half type. A float INF must be converted to an
appropriate INF in the half type.
7.7.3 Conversion Rules for Single-Precision Floating-Point Pixel Data Type
The following rules apply for reading and writing textures that have single-precision floating-
point pixel color values.
• NaNs may be converted to a NaN value or be flushed to zero.
• INFs must be preserved.
• Denorms may be flushed to zero.
• All other values must be preserved.
7.7.4 Conversion Rules for 11-bit and 10-bit Floating-Point Pixel Data Type
The floating-point formats use 5 bits for the exponent, 5 bits of mantissa for the 10-bit
floating-point types and 6 bits of mantissa for the 11-bit floating-point types, with an
additional hidden bit for both types. There is no sign bit. The 10-bit and 11-bit floating-point
types preserve denorms.
These floating-point formats use the following rules:
• If exponent = 0 and mantissa = 0, the floating-point value is 0.0.
• If exponent = 31 and mantissa != 0, the resulting floating-point value is a NaN.
• If exponent = 31 and mantissa = 0, the resulting floating-point value is positive infinity.
• If 0 < exponent < 31, the floating-point value is 2 ^ (exponent - 15) * (1 + mantissa / N).
• If exponent = 0 and mantissa != 0, the floating-point value is a denormal value given as
2 ^ (exponent - 14) * (mantissa / N).
N is 32 if the mantissa is 5 bits and 64 if the mantissa is 6 bits.
Conversion of an 11-bit or 10-bit floating-point pixel data type to a half or single precision
floating-point value is lossless. Conversion of a half or single precision floating-point value
to an 11-bit or 10-bit floating-point value must be accurate to <= 0.5 ULP. Any operation that
would result in a value less than zero for these floating-point types is clamped to zero.
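The decoding rules listed in this section can be sketched in C++ as below. The function name decode_small_float is hypothetical, and the bit layout is an assumption: the sketch takes the mantissa from the low bits and the 5-bit exponent from the bits directly above it, which the text does not spell out.

```cpp
#include <cmath>
#include <cstdint>
#include <limits>

// Hypothetical decoder for the unsigned 11-bit (mantissa_bits = 6) and
// 10-bit (mantissa_bits = 5) floating-point formats.
// Assumed layout: [exponent : 5 bits][mantissa : mantissa_bits bits].
float decode_small_float(uint32_t bits, int mantissa_bits) {
    const uint32_t n = 1u << mantissa_bits;                     // N: 64 or 32
    const uint32_t mantissa = bits & (n - 1u);
    const uint32_t exponent = (bits >> mantissa_bits) & 0x1Fu;
    const float fraction = static_cast<float>(mantissa) / static_cast<float>(n);

    if (exponent == 31u) {
        // mantissa != 0 is a NaN, mantissa == 0 is positive infinity
        return mantissa != 0u ? std::numeric_limits<float>::quiet_NaN()
                              : std::numeric_limits<float>::infinity();
    }
    if (exponent == 0u) {
        // zero (mantissa == 0) or denormal: 2^-14 * (mantissa / N)
        return std::ldexp(fraction, -14);
    }
    // normal value: 2^(exponent - 15) * (1 + mantissa / N)
    return std::ldexp(1.0f + fraction, static_cast<int>(exponent) - 15);
}
```

Because every representable value has at most 6 mantissa bits and a 5-bit exponent, each one fits exactly in a float, which is consistent with the lossless conversion stated above.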
7.7.5 Conversion Rules for 9-bit Floating-Point Pixel Data Type with a 5-bit Exponent
The RGB9E5_SharedExponent shared exponent floating-point format uses 5 bits for the exponent
and 9 bits for the mantissa. There is no sign bit.
Conversion from this format to a half or single precision floating-point value is lossless and is
computed as 2 ^ (shared exponent – 15) * (mantissa/512) for each color channel.
Conversion from a half or single precision floating-point RGB color value to this format is
performed as follows, where N is the number of mantissa bits per component (9), B is the
exponent bias (15) and E_max is the maximum allowed biased exponent value (31).
• Components r, g and b are first clamped (in the process, mapping NaN to zero) as follows:
    r_c = max(0, min(sharedexp_max, r))
    g_c = max(0, min(sharedexp_max, g))
    b_c = max(0, min(sharedexp_max, b))
  where sharedexp_max = ((2^N - 1) / 2^N) * 2^(E_max - B)
• The largest clamped component, max_c, is determined:
    max_c = max(r_c, g_c, b_c)
• A preliminary shared exponent exp_p is computed:
    exp_p = max(-B - 1, floor(log2(max_c))) + 1 + B
• A refined shared exponent exp_s is computed:
    max_s = floor((max_c / 2^(exp_p - B - N)) + 0.5f)
    exp_s = exp_p,     if 0 <= max_s < 2^N, and
          = exp_p + 1, if max_s = 2^N.
• Finally, three integer values in the range 0 to 2^N - 1 are computed:
    r_s = floor((r_c / 2^(exp_s - B - N)) + 0.5f)
    g_s = floor((g_c / 2^(exp_s - B - N)) + 0.5f)
    b_s = floor((b_c / 2^(exp_s - B - N)) + 0.5f)
Conversion of half or single precision floating-point color values to the RGB9E5 shared
exponent floating-point value must be accurate to <= 0.5 ULP.
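The encoding steps of this section can be sketched in C++ as follows. The struct Rgb9e5 and the function encode_rgb9e5 are hypothetical names, and the sketch returns the four fields unpacked rather than assuming a bit layout. Two choices are assumptions of the sketch: std::ilogb stands in for floor(log2(max_c)), to which it is equal for positive finite inputs, and the final step divides by 2^(exp_s - B - N), that is, it uses the refined exponent so that every component stays within 0 to 2^N - 1.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Hypothetical container for the unpacked result.
struct Rgb9e5 {
    uint32_t r, g, b;  // 9-bit mantissas, each in 0 .. 2^N - 1
    uint32_t exp;      // 5-bit shared biased exponent
};

Rgb9e5 encode_rgb9e5(float r, float g, float b) {
    const int N = 9;      // mantissa bits per component
    const int B = 15;     // exponent bias
    const int Emax = 31;  // maximum biased exponent
    const float sharedexp_max =
        (static_cast<float>((1 << N) - 1) / static_cast<float>(1 << N)) *
        std::ldexp(1.0f, Emax - B);

    // Clamp to [0, sharedexp_max], mapping NaN to zero.
    auto clamp_component = [sharedexp_max](float v) -> float {
        if (std::isnan(v)) return 0.0f;
        return std::max(0.0f, std::min(sharedexp_max, v));
    };
    const float rc = clamp_component(r);
    const float gc = clamp_component(g);
    const float bc = clamp_component(b);
    const float maxc = std::max({rc, gc, bc});

    // floor(log2(0)) is -infinity, so max(-B - 1, ...) selects -B - 1 for a zero input.
    const int floor_log2 = (maxc > 0.0f) ? std::ilogb(maxc) : (-B - 1);
    const int exp_p = std::max(-B - 1, floor_log2) + 1 + B;

    // Refine the exponent if rounding pushes the largest mantissa up to 2^N.
    const int max_s =
        static_cast<int>(std::floor(maxc / std::ldexp(1.0f, exp_p - B - N) + 0.5f));
    const int exp_s = (max_s == (1 << N)) ? exp_p + 1 : exp_p;

    const float scale = std::ldexp(1.0f, exp_s - B - N);
    auto quantize = [scale](float c) -> uint32_t {
        return static_cast<uint32_t>(std::floor(c / scale + 0.5f));
    };
    return Rgb9e5{quantize(rc), quantize(gc), quantize(bc), static_cast<uint32_t>(exp_s)};
}
```

For the input (1.0, 1.0, 1.0) the sketch yields mantissas of 256 and a shared exponent of 16, which decodes back to 2 ^ (16 - 15) * (256 / 512) = 1.0 for each channel.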
7.7.6 Conversion Rules for Signed and Unsigned Integer Pixel Data Types
For textures that have 8-bit or 16-bit signed or unsigned integer pixel values, the texture sample
and read functions return a signed or unsigned 32-bit integer pixel value. The conversions
described in this section must be correctly saturated.
Writes to these integer textures perform one of the conversions listed in Table 41.
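The saturating conversions that Table 41 names can be pictured with the C++ sketch below. The saturate_to_* functions are hypothetical host-side stand-ins written for illustration; they are not the convert_*_saturate functions themselves, and they assume that "saturate" means clamping to the range of the destination type.

```cpp
#include <algorithm>
#include <cstdint>

// 32-bit signed -> 8-bit signed, clamped to [-128, 127].
int8_t saturate_to_char(int32_t val) {
    return static_cast<int8_t>(std::min<int32_t>(std::max<int32_t>(val, INT8_MIN), INT8_MAX));
}

// 32-bit signed -> 16-bit signed, clamped to [-32768, 32767].
int16_t saturate_to_short(int32_t val) {
    return static_cast<int16_t>(std::min<int32_t>(std::max<int32_t>(val, INT16_MIN), INT16_MAX));
}

// 32-bit unsigned -> 8-bit unsigned, clamped to [0, 255].
uint8_t saturate_to_uchar(uint32_t val) {
    return static_cast<uint8_t>(std::min<uint32_t>(val, UINT8_MAX));
}

// 32-bit unsigned -> 16-bit unsigned, clamped to [0, 65535].
uint16_t saturate_to_ushort(uint32_t val) {
    return static_cast<uint16_t>(std::min<uint32_t>(val, UINT16_MAX));
}
```

Under this reading, writing 300 to an 8-bit signed texture stores 127 and writing -300 stores -128, instead of wrapping around.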
Table 41 Rules for Conversion Between Integer Pixel Data Types
Convert from              Convert to                Conversion Rule
32-bit signed integer     8-bit signed integer      result = convert_char_saturate(val)
32-bit signed integer     16-bit signed integer     result = convert_short_saturate(val)
32-bit unsigned integer   8-bit unsigned integer    result = convert_uchar_saturate(val)
32-bit unsigned integer   16-bit unsigned integer   result = convert_ushort_saturate(val)
7.7.7 Conversion Rules for sRGBA and sBGRA Textures
Conversion from sRGB space to linear space is automatically done when sampling from an sRGB
texture. The conversion from sRGB to linear RGB is performed before the filter specified in the
sampler is applied. If the texture has an alpha channel, the alpha data is stored in linear
color space.
Conversion from linear to sRGB space is automatically done when writing to an sRGB texture. If
the texture has an alpha channel, the alpha data is stored in linear color space.
The following is the rule for converting a normalized 8-bit unsigned integer sRGB color value
(call it c) to a floating-point linear RGB color value.
    if (c <= 0.04045)
        result = c / 12.92;
    else
        result = powr((c + 0.055) / 1.055, 2.4);
The precision of the above conversion must ensure that the delta between the resulting
infinitely precise floating-point value, when result is converted back to an un-normalized sRGB
value but without rounding to an 8-bit unsigned integer value (call it r), and the original sRGB
8-bit unsigned integer color value (call it r_orig) is <= 0.5, i.e.
    fabs(r - r_orig) <= 0.5
The following are the rules for converting a linear RGB floating-point color value (call it c)
to a normalized 8-bit unsigned integer sRGB value.
    if (isnan(c)) c = 0.0;
    if (c > 1.0)
        c = 1.0;
    else if (c < 0.0)
        c = 0.0;
    else if (c < 0.0031308)
        c = 12.92 * c;
    else
        c = 1.055 * powr(c, 1.0/2.4) - 0.055;
    // Convert to integer scale.
    c = c * 255.0;
    // Convert to integer: add 0.5, drop the decimal fraction, and convert the
    // remaining floating-point (integral) value directly to an integer.
    c = c + 0.5;
The precision of the above conversion should be such that
fabs(reference result – integer result) < 1.0.
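Both sRGB conversion rules of this section can be tried on a host CPU with the C++ sketch below. The function names srgb8_to_linear and linear_to_srgb8 are hypothetical, and std::pow is used in place of Metal's powr, which is an assumption that is safe here because the base is never negative.

```cpp
#include <cmath>
#include <cstdint>

// sRGB (8-bit unsigned normalized) -> linear float.
float srgb8_to_linear(uint8_t value) {
    const float c = static_cast<float>(value) / 255.0f;  // normalize to [0, 1]
    if (c <= 0.04045f) {
        return c / 12.92f;
    }
    return std::pow((c + 0.055f) / 1.055f, 2.4f);
}

// Linear float -> sRGB (8-bit unsigned normalized).
uint8_t linear_to_srgb8(float c) {
    if (std::isnan(c)) c = 0.0f;
    if (c > 1.0f) {
        c = 1.0f;
    } else if (c < 0.0f) {
        c = 0.0f;
    } else if (c < 0.0031308f) {
        c = 12.92f * c;
    } else {
        c = 1.055f * std::pow(c, 1.0f / 2.4f) - 0.055f;
    }
    // Scale to the integer range, add 0.5, and drop the fraction.
    return static_cast<uint8_t>(c * 255.0f + 0.5f);
}
```

A simple way to exercise the sketch is a round trip: converting an 8-bit sRGB code to linear and back should return the original code, which is what the precision requirements above are meant to guarantee.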
Apple Inc.
Copyright © 2017 Apple Inc.
All rights reserved.
No part of this publication may be
reproduced, stored in a retrieval system,
or transmitted, in any form or by any
means, mechanical, electronic,
photocopying, recording, or otherwise,
without prior written permission of
Apple Inc., with the following
exceptions: Any person is hereby
authorized to store documentation on a
single computer or device for personal
use only and to print copies of
documentation for personal use
provided that the documentation
contains Apple’s copyright notice.
No licenses, express or implied, are
granted with respect to any of the
technology described in this document.
Apple retains all intellectual property
rights associated with the technology
described in this document. This
document is intended to assist
application developers to develop
applications only for Apple-branded
products.
Apple Inc.
1 Infinite Loop
Cupertino, CA 95014
408-996-1010
Apple is a trademark of Apple Inc.,
registered in the U.S. and other
countries.
APPLE MAKES NO WARRANTY OR
REPRESENTATION, EITHER EXPRESS
OR IMPLIED, WITH RESPECT TO THIS
DOCUMENT, ITS QUALITY,
ACCURACY, MERCHANTABILITY, OR
FITNESS FOR A PARTICULAR
PURPOSE. AS A RESULT, THIS
DOCUMENT IS PROVIDED “AS IS,”
AND YOU, THE READER, ARE
ASSUMING THE ENTIRE RISK AS TO
ITS QUALITY AND ACCURACY.
IN NO EVENT WILL APPLE BE LIABLE
FOR DIRECT, INDIRECT, SPECIAL,
INCIDENTAL, OR CONSEQUENTIAL
DAMAGES RESULTING FROM ANY
DEFECT, ERROR OR INACCURACY IN
THIS DOCUMENT, even if advised of
the possibility of such damages.
Some jurisdictions do not allow the
exclusion of implied warranties or
liability, so the above exclusion may
not apply to you.