Table of Contents
A variety of compile-time analyses require reasoning about types that are not yet known. Principal among these are generic method applicability testing (§18.5.1) and generic method invocation type inference (§18.5.2). In general, we refer to the process of reasoning about unknown types as type inference.
At a high level, type inference can be decomposed into three processes:
Reduction takes a compatibility assertion about an expression or type, called a constraint formula, and reduces it to a set of bounds on inference variables. Often, a constraint formula reduces to other constraint formulas, which must be recursively reduced. A procedure is followed to identify these additional constraint formulas and, ultimately, to express via a bound set the conditions under which the choices for inferred types would render each constraint formula true.
Incorporation maintains a set of inference variable bounds, ensuring that these are consistent as new bounds are added. Because the bounds on one variable can sometimes impact the possible choices for another variable, this process propagates bounds between such interdependent variables.
Resolution examines the bounds on an inference variable and determines an instantiation that is compatible with those bounds. It also decides the order in which interdependent inference variables are to be resolved.
These processes interact closely: reduction can trigger incorporation; incorporation may lead to further reduction; and resolution may cause further incorporation.
§18.1 more precisely defines the concepts used as intermediate results and the notation used to express them.
§18.2 describes reduction in detail.
§18.3 describes incorporation in detail.
§18.4 describes resolution in detail.
§18.5 defines how these inference tools are used to solve certain compile-time analysis problems.
In comparison to the Java SE 7 Edition of The Java® Language Specification, important changes to inference include:
Adding support for lambda expressions and method references as method invocation arguments.
Generalizing to define inference in terms of poly expressions, which may not have well-defined types until after inference is complete. This has the notable effect of improving inference for nested generic method and diamond constructor invocations.
Describing how inference is used to handle wildcard-parameterized functional interface target types and most specific method analysis.
Clarifying the distinction between invocation applicability testing (which involves only the invocation arguments) and invocation type inference (which incorporates a target type).
Delaying resolution of all inference variables, even those with lower bounds, until invocation type inference, in order to get better results.
Improving inference behavior for interdependent (or self-dependent) variables.
Eliminating bugs and potential sources of confusion. This revision more carefully and precisely handles the distinction between specific conversion contexts and subtyping, and describes reduction by paralleling the corresponding non-inference relations. Where there are intentional departures from the non-inference relations, these are explicitly identified as such.
Laying a foundation for future evolution: enhancements to or new applications of inference will be easier to integrate into the specification.
This section defines inference variables, constraint formulas, and bounds, as the terms will be used throughout this chapter. It also presents notation.
Inference variables are meta-variables for types - that is, they are special names that allow abstract reasoning about types. To distinguish them from type variables, inference variables are represented with Greek letters, principally α.
The term "type" is used loosely in this chapter to include type-like syntax that contains inference variables. The term proper type excludes such "types" that mention inference variables. Assertions that involve inference variables are assertions about every proper type that can be produced by replacing each inference variable with a proper type.
Constraint formulas are assertions of compatibility or subtyping that may involve inference variables. The formulas may take one of the following forms:
‹Expression → T›: An expression is compatible in a loose invocation context with type T (§5.3).
‹S → T›: A type S is compatible in a loose invocation context with type T (§5.3).
‹S <:
T›: A reference type S is a
subtype of a reference type T
(§4.10).
‹S <=
T›: A type argument S is
contained by a type argument T
(§4.5.1).
‹S = T›: A type S is the same as a type T (§4.3.4), or a type argument S is the same as type argument T.
‹LambdaExpression →_{throws} T›: The checked
exceptions thrown by the body of the LambdaExpression are
declared by the throws
clause of the function type derived
from T.
‹MethodReference →_{throws} T›: The checked
exceptions thrown by the referenced method are declared by the
throws
clause of the function type derived from T.
Examples of constraint formulas:
From Collections.singleton("hi")
,
we have the constraint formula ‹"hi"
→ α›. Through reduction, this will become the
constraint formula: ‹String
<:
α›.
From Arrays.asList(1, 2.0)
,
we have the constraint formulas ‹1
→
α› and ‹2.0
→
α›. Through reduction, these will become the
constraint formulas ‹int
→ α› and
‹double
→ α›, and then ‹Integer
<:
α› and ‹Double
<:
α›.
From the target type of the constructor
invocation List<Thread> lt = new
ArrayList
, we have the constraint formula
‹<>
()ArrayList
<
α>
→ List
<
Thread
>
›. Through
reduction, this will become the constraint formula
‹α <=
Thread
›, and then
‹α = Thread
›.
During the inference process, a set of bounds on inference variables is maintained. A bound has one of the following forms:
S = T, where at least one of S or T is an inference variable: S is the same as T.
S <:
T, where at least one of S or T is an
inference variable: S is a subtype of T.
G<
α_{1}, ..., α_{n}>
=
capture(G<
A_{1}, ..., A_{n}>
): The variables
α_{1}, ..., α_{n} represent the result of capture
conversion (§5.1.10) applied to
G<
A_{1}, ..., A_{n}>
(where A_{1}, ..., A_{n} may be
types or wildcards and may mention inference variables).
throws
α: The inference variable α appears in a
throws
clause.
A bound is satisfied by an inference variable substitution if, after applying the substitution, the assertion is true. The bound false can never be satisfied.
Some
bounds relate an inference variable to a proper type. Let T be a
proper type. Given a bound of the form α = T or T = α,
we say T is an instantiation of α.
Similarly, given a bound of the form α <:
T, we say T
is a proper upper bound of α, and given a
bound of the form T <:
α, we say T is
a proper lower bound of α.
Other
bounds relate two inference variables, or an inference variable to a
type that contains inference variables. Such bounds, of the form S =
T or S <:
T, are called
dependencies.
A bound
of the form G<
α_{1}, ..., α_{n}>
=
capture(G<
A_{1}, ..., A_{n}>
) indicates that α_{1},
..., α_{n} are placeholders for the results of capture
conversion. This is necessary because capture conversion can only be
performed on a proper type, and the inference variables in A_{1}, ...,
A_{n} may not yet be resolved.
A bound
of the form throws
α is purely informational: it directs
resolution to optimize the instantiation of α so that, if
possible, it is not a checked exception type.
An important intermediate result of inference is a bound set. It is sometimes convenient to refer to an empty bound set with the symbol true; this is merely out of convenience, and the two are interchangeable.
Examples of bound sets:
{ α = String
} contains a single bound,
instantiating α as String
.
{ Integer
<:
α, Double
<:
α, α <:
Object
} describes two
proper lower bounds and one proper upper bound for
α.
{ α
<:
Iterable<?>
, β <:
Object
, α
<:
List
<
β>
}
describes a proper upper bound for each of α and β,
along with a dependency between them.
{ } contains no bounds nor dependencies, and can be referred to as true.
{ false } expresses the fact that no satisfactory instantiation exists.
When inference begins, a bound set is typically generated from a list of type parameter declarations P_{1}, ..., P_{p} and associated inference variables α_{1}, ..., α_{p}. Such a bound set is generated as follows. For each l (1 ≤ l ≤ p):
If
P_{l} has no TypeBound, the
bound α_{l} <:
Object
appears
in the set.
Otherwise, for each type T delimited by &
in
the TypeBound, the bound α_{l} <:
T[
P_{1}:=α_{1}, ..., P_{p}:=α_{p}]
appears
in the set; if this results in no proper upper bounds for
α_{l} (only dependencies), then the bound α_{l} <:
Object
also appears in the set.
Reduction is the process by which a set of constraint formulas (§18.1.2) is simplified to produce a bound set (§18.1.3).
Each constraint formula is considered in turn. The rules in this section specify how the formula is reduced to one or both of:
Reduction completes when no further constraint formulas remain to be reduced.
The results of a reduction step are always soundness-preserving: if an inference variable instantiation satisfies the reduced constraints and bounds, it will also satisfy the original constraint. On the other hand, reduction is not completeness-preserving: there may exist inference variable instantiations that satisfy the original constraint but do not satisfy a reduced constraint or bound. This is due to inherent limitations of the algorithm, along with a desire to avoid undue complexity. One effect is that there are expressions for which type argument inference fails to find a solution, but that can be well-typed if the programmer explicitly inserts appropriate types.
A constraint formula of the form ‹Expression → T› is reduced as follows:
If T is a proper type, the constraint reduces to true if the expression is compatible in a loose invocation context with T (§5.3), and false otherwise.
Otherwise, if the expression is a standalone expression (§15.2) of type S, the constraint reduces to ‹S → T›.
Otherwise, the expression is a poly expression (§15.2). The result depends on the form of the expression:
If the expression is a parenthesized expression of the form
(
Expression' )
, the constraint reduces to
‹Expression' → T›.
If the expression is a class instance creation expression or a method invocation expression, the constraint reduces to the bound set B_{3} which would be used to determine the expression's compatibility with target type T, as defined in §18.5.2.1. (For a class instance creation expression, the corresponding "method" used for inference is defined in §15.9.3.)
This bound set may contain new inference variables, as well as dependencies between these new variables and the inference variables in T.
If the expression is a conditional expression of the form
e_{1}
?
e_{2}
:
e_{3}
, the constraint reduces to two
constraint formulas, ‹e_{2}
→ T› and
‹e_{3}
→ T›.
If the expression is a lambda expression or a method reference expression, the result is specified below.
By treating nested generic method invocations as poly expressions, we improve the behavior of inference for nested invocations. For example, the following is illegal in Java SE 7 but legal in Java SE 8:
ProcessBuilder b = new ProcessBuilder(Collections.emptyList()); // ProcessBuilder's constructor expects a List<String>
When both the outer and the nested invocation require inference, the problem is more difficult. For example:
List<String> ls = new ArrayList<>(Collections.emptyList());
Our approach is to "lift" the bounds inferred for
the nested invocation (simply { α <:
Object
} in the
case of emptyList
) into the outer inference process
(in this case, trying to infer β where the constructor is for
type ArrayList
<
β>
). We also
infer dependencies between the nested inference variables and the
outer inference variables (the constraint
‹List
<
α>
→ Collection
<
β>
›
would reduce to the dependency α = β). In this way,
resolution of the inference variables in the nested invocation can
wait until additional information can be inferred from the outer
invocation (based on the assignment target, β = String
).
A constraint formula of the form ‹LambdaExpression → T›, where T mentions at least one inference variable, is reduced as follows:
If T is not a functional interface type (§9.8), the constraint reduces to false.
Otherwise, let T' be the ground target type derived from
T, as specified in §15.27.3. If
§18.5.3 is used to derive a functional
interface type which is parameterized, then the test that
F<
A'_{1}, ..., A'_{m}>
is a subtype of
F<
A_{1}, ..., A_{m}>
is not performed (instead, it
is asserted with a constraint formula below). Let the target
function type for the lambda expression be the function type of
T'. Then:
If no valid function type can be found, the constraint reduces to false.
Otherwise, the congruence of LambdaExpression with the target function type is asserted as follows:
If the number of lambda parameters differs from the number of parameter types of the function type, the constraint reduces to false.
If the lambda expression is implicitly typed and one or more of the function type's parameter types is not a proper type, the constraint reduces to false.
This condition never arises in practice, due to the handling of implicitly typed lambda expressions in §18.5.1 and the substitution applied to the target type in §18.5.2.2.
If the function type's result is void
and the lambda
body is neither a statement expression nor a
void-compatible block, the constraint reduces to
false.
If the function type's result is not void
and the
lambda body is a block that is not value-compatible, the
constraint reduces to false.
Otherwise, the constraint reduces to all of the following constraint formulas:
If the lambda parameters have explicitly declared types
F_{1}, ..., F_{n} and the function type has parameter types
G_{1}, ..., G_{n}, then (i) for all i (1 ≤ i
≤ n), ‹F_{i} = G_{i}›, and (ii)
‹T' <:
T›.
If the function type's return type is a (non-void
) type
R, assume the lambda's parameter types are the same as the
function type's parameter types. Then:
If R is a proper type, and if the lambda body or some result expression in the lambda body is not compatible in an assignment context with R, then false.
Otherwise, if R is not a proper type, then where the
lambda body has the form Expression, the constraint
‹Expression → R›; or where the
lambda body is a block with result expressions e_{1}
,
..., e_{m}
, for all i (1 ≤ i ≤ m),
‹e_{i}
→ R›.
The key piece of information to derive from a compatibility constraint involving a lambda expression is the set of bounds on inference variables appearing in the target function type's return type. This is crucial, because functional interfaces are often generic, and many methods operating on these types are generic, too.
In the simplest case, a lambda expression may simply provide a lower bound for an inference variable:
<T> List<T> makeThree(Factory<T> factory) { ... }
String s = makeThree(() ->
"abc").get(2);
In more complex cases, a result expression may be a poly expression - perhaps even another lambda expression - and so the inference variable might be passed through multiple constraint formulas with different target types before a bound is produced.
Most of the work described in this section precedes assertions about the result expressions; its purpose is to derive the lambda expression's function type, and to check for expressions that are clearly disqualified from compatibility.
We do not attempt to produce
bounds on inference variables that appear in the target function
type's throws
clause. This is because exception containment is not
part of compatibility (§15.27.3) - in particular,
it must not influence method applicability
(§18.5.1). However, we do
get bounds on these variables later, because invocation type inference
(§18.5.2.2) produces exception containment
constraint formulas (§18.2.5).
Note that if the target type is an inference variable, or if the target type's parameter types contain inference variables, we produce false. During invocation type inference (§18.5.2.2), extra substitutions are performed in order to instantiate these inference variables, thus avoiding this scenario. (In other words, reduction will, in practice, never be "invoked" with a target type of one of these forms.)
Finally, note that the result expressions of a
lambda expression are required by §15.27.3 to be
compatible in an assignment context with the target type's return
type, R. If R is a proper type, such as Byte
derived
from Function
, then
assignability is easy enough to test, and reduction does so above. If
R is not a proper type, such as α derived from
<
α,Byte
>
Function
, then we
make the simplifying assumption above that loose invocation
compatibility will be sufficient. The difference between assignment
compatibility and loose invocation compatibility is that only
assignment allows narrowing of constant expressions, such
as <
String,α>
Byte b = 100;
. Consequently, our simplifying
assumption is not completeness-preserving: given target return type
α and an integer literal result expression
100
, it is conceivable that α could be
instantiated to Byte
, but reduction will not in fact produce such a
bound.
A constraint formula of the form ‹MethodReference → T›, where T mentions at least one inference variable, is reduced as follows:
If T is not a functional interface type, or if T is a functional interface type that does not have a function type (§9.9), the constraint reduces to false.
Otherwise, if there does not exist a potentially applicable method for the method reference when targeting T, the constraint reduces to false.
Otherwise, if the method reference is exact (§15.13.1), then let P_{1}, ..., P_{n} be the parameter types of the function type of T, and let F_{1}, ..., F_{k} be the parameter types of the potentially applicable method. The constraint reduces to a new set of constraints, as follows:
In the special case where n = k+1, the parameter
of type P_{1} is to act as the target reference of the
invocation. The method reference expression necessarily has
the form ReferenceType ::
[TypeArguments] Identifier. The constraint
reduces to ‹P_{1} <:
ReferenceType› and,
for all i (2 ≤ i ≤ n), ‹P_{i}
→ F_{i-1}›.
In all other cases, n = k, and the constraint reduces to, for all i (1 ≤ i ≤ n), ‹P_{i} → F_{i}›.
If the function type's result is not void
, let R be its
return type. Then, if the result of the potentially
applicable compile-time declaration is void
, the
constraint reduces to false. Otherwise, the constraint
reduces to ‹R' → R›, where R' is the
result of applying capture conversion
(§5.1.10) to the return type of the
potentially applicable compile-time declaration.
Otherwise, the method reference is inexact, and:
If one or more of the function type's parameter types is not a proper type, the constraint reduces to false.
This condition never arises in practice, due to the handling of inexact method references in §18.5.1 and the substitution applied to the target type in §18.5.2.2.
Otherwise, a search for a compile-time declaration is performed, as specified in §15.13.1. If there is no compile-time declaration for the method reference, the constraint reduces to false. Otherwise, there is a compile-time declaration, and: (let R be the result of the function type)
Otherwise, if the method reference expression elides TypeArguments, and the compile-time declaration is a generic method, and the return type of the compile-time declaration mentions at least one of the method's type parameters, then:
If R mentions one of the type parameters of the function type, the constraint reduces to false.
In this case, a constraint in terms of R might lead an inference variable to be bound by an out-of-scope type variable. Since instantiating an inference variable with an out-of-scope type variable is nonsensical, we prefer to avoid the situation by giving up immediately whenever the possibility arises. This simplification is not completeness-preserving.
If R does not mention one of the type parameters of the function type, then the constraint reduces to the bound set B_{3} which would be used to determine the method reference's compatibility when targeting the return type of the function type, as defined in §18.5.2.1. B_{3} may contain new inference variables, as well as dependencies between these new variables and the inference variables in T.
The strategy used to determine a return type for a generic referenced method follows the pattern used earlier in this section for generic method invocations. This may involve "lifting" bounds into the outer context and inferring dependencies between the two sets of inference variables.
Otherwise, let R' be the result of applying capture
conversion (§5.1.10) to the return
type of the invocation type (§15.12.2.6)
of the compile-time declaration.
If R' is void
, the constraint reduces to false;
otherwise, the constraint reduces to ‹R' →
R›.
A constraint formula of the form ‹S → T› is reduced as follows:
If S and T are proper types, the constraint reduces to true if S is compatible in a loose invocation context with T (§5.3), and false otherwise.
Otherwise, if S is a primitive type, let S' be the result of applying boxing conversion (§5.1.7) to S. Then the constraint reduces to ‹S' → T›.
Otherwise, if T is a primitive type, let T' be the result of applying boxing conversion (§5.1.7) to T. Then the constraint reduces to ‹S = T'›.
Otherwise, if T is a parameterized type of the form
G<
T_{1}, ..., T_{n}>
, and there exists no type of
the form G<
...>
that is a supertype of S, but
the raw type G is a supertype of S, then the constraint
reduces to true.
Otherwise, if T is an array type of the form G<
T_{1},
..., T_{n}>
[]
^{k}, and
there exists no type of the form
G<
...>
[]
^{k}
that is a supertype of S, but the raw type
G[]
^{k} is a supertype of S,
then the constraint reduces to true. (The notation
[]
^{k} indicates an array type
of k dimensions.)
The fourth and fifth cases are implicit uses of unchecked conversion (§5.1.9). These, along with any use of unchecked conversion in the first case, may result in compile-time unchecked warnings, and may influence a method's invocation type (§15.12.2.6).
Boxing T to T' is not
completeness-preserving; for example, if T were long
, S might be
instantiated to Integer
, which is not a subtype of Long
but could
be unboxed and then widened to long
. We avoid this problem in most
cases by giving special treatment to inference-variable return types
that we know are already constrained to be certain boxed primitive
types; see §18.5.2.1.
Similarly, the treatment of unchecked conversion sacrifices completeness in cases in which T is not a parameterized type (for example, if T is an inference variable). It is not usually clear in such situations whether the unchecked conversion is necessary or not. Since unchecked conversions introduce unchecked warnings, inference prefers to avoid them unless it is clearly necessary.
A
constraint formula of the form ‹S <:
T› is
reduced as follows:
If S and T are proper types, the constraint reduces to true if S is a subtype of T (§4.10), and false otherwise.
Otherwise, if S is the null type, the constraint reduces to true.
Otherwise, if T is the null type, the constraint reduces to false.
Otherwise, if S is an inference variable, α, the
constraint reduces to the bound α <:
T.
Otherwise, if T is an inference variable, α, the
constraint reduces to the bound S <:
α.
Otherwise, the constraint is reduced according to the form of T:
If T is a parameterized class or interface type, or an
inner class type of a parameterized class or interface type
(directly or indirectly), let A_{1}, ..., A_{n} be the type
arguments of T. Among the supertypes of S, a
corresponding class or interface type is identified, with
type arguments B_{1}, ..., B_{n}. If no such type exists, the
constraint reduces to false. Otherwise, the constraint
reduces to the following new constraints: for all i (1
≤ i ≤ n), ‹B_{i} <=
A_{i}›.
If T is any other class or interface type, then the constraint reduces to true if T is among the supertypes of S, and false otherwise.
If T is an array type, T'[]
, then among the
supertypes of S that are array types, a most specific type
is identified, S'[]
(this may be S
itself). If no such array type exists, the constraint
reduces to false. Otherwise:
If T is an intersection type, I_{1} &
... &
I_{n},
the constraint reduces to the following new constraints: for
all i (1 ≤ i ≤ n), ‹S
<:
I_{i}›.
A
constraint formula of the form ‹S <=
T›, where
S and T are type arguments (§4.5.1), is
reduced as follows:
A constraint formula of the form ‹S = T›, where S and T are types, is reduced as follows:
If S and T are proper types, the constraint reduces to true if S is the same as T (§4.3.4), and false otherwise.
Otherwise, if S or T is the null type, the constraint reduces to false.
Otherwise, if S is an inference variable, α, and T is not a primitive type, the constraint reduces to the bound α = T.
Otherwise, if T is an inference variable, α, and S is not a primitive type, the constraint reduces to the bound S = α.
Otherwise, if S and T are class or interface types with the same erasure, where S has type arguments B_{1}, ..., B_{n} and T has type arguments A_{1}, ..., A_{n}, the constraint reduces to the following new constraints: for all i (1 ≤ i ≤ n), ‹B_{i} = A_{i}›.
Otherwise, if S and T are array types, S'[]
and T'[]
, the constraint reduces to
‹S' = T'›.
Otherwise, if S and T are intersection types, a correspondence between the elements of S and the elements of T is established. An element of S, S_{i}, corresponds to an element of T, T_{j}, if S_{i} and T_{j} are either the same type, or both parameterizations of the same generic class or interface, or both array types.
If each element of S corresponds to exactly one element of T, and vice versa, then the constraint reduces to the following new constraints: for each element S_{i} of S and the corresponding element T_{j} of T, ‹S_{i} = T_{j}›. If not, the constraint reduces to false.
This rule does not accommodate inference variables appearing directly as elements of an intersection type (rather than nested in a parameterized type). Due to the restrictions on type parameter declarations (§4.4), such intersection types do not arise in practice.
A constraint formula of the form ‹S = T›, where S and T are type arguments (§4.5.1), is reduced as follows:
If S and T are types, the constraint is reduced as described above.
If S has the form ?
and T has the form ?
, the
constraint reduces to true.
If S has the form ?
and T has the form ?
extends
T', the constraint reduces to ‹Object
=
T'›.
If S has the form ?
extends
S' and T has the
form ?
, the constraint reduces to ‹S' =
Object
›.
If S has the form ?
extends
S' and T has the
form ?
extends
T', the constraint reduces to
‹S' = T'›.
If S has the form ?
super
S' and T has the form
?
super
T', the constraint reduces to
‹S' = T'›.
A constraint formula of the form ‹LambdaExpression →_{throws} T› is reduced as follows:
If T is not a functional interface type (§9.8), the constraint reduces to false.
Otherwise, let the target function type for the lambda expression be determined as specified in §15.27.3. If no valid function type can be found, the constraint reduces to false.
Otherwise, if the lambda expression is implicitly typed, and one or more of the function type's parameter types is not a proper type, the constraint reduces to false.
This condition never arises in practice, due to the substitution applied to the target type in §18.5.2.2.
Otherwise, if the function type's return type is neither void
nor a proper type, the constraint reduces to false.
This condition never arises in practice, due to the substitution applied to the target type in §18.5.2.2.
Otherwise, let E_{1}, ..., E_{n} be the types in the function
type's throws
clause that are not proper
types. If the lambda expression is implicitly typed, let its
parameter types be the function type's parameter types. If the
lambda body is a poly expression or a block containing a poly
result expression, let the targeted return type be the function
type's return type. Let X_{1}, ..., X_{m} be the checked exception
types that the lambda body can throw
(§11.2). Then there are two cases:
If n = 0
(the function type's throws
clause
consists only of proper types), then if there exists some
i (1 ≤ i ≤ m) such that X_{i} is not a
subtype of any proper type in the throws
clause, the
constraint reduces to false; otherwise, the constraint
reduces to true.
If n > 0
, the constraint reduces to a set of
subtyping constraints: for all i (1 ≤ i ≤
m), if X_{i} is not a subtype of any proper type in the
throws
clause, then the constraints include, for all
j (1 ≤ j ≤ n), ‹X_{i} <:
E_{j}›. In addition, for all j (1 ≤ j
≤ n), the constraint reduces to the bound throws
E_{j}.
A constraint formula of the form ‹MethodReference →_{throws} T› is reduced as follows:
If T is not a functional interface type, or if T is a functional interface type but does not have a function type (§9.9), the constraint reduces to false.
Otherwise, let the target function type for the method reference expression be the function type of T. If the method reference is inexact (§15.13.1) and one or more of the function type's parameter types is not a proper type, the constraint reduces to false.
Otherwise, if the method reference is inexact and the function
type's result is neither void
nor a proper type, the
constraint reduces to false.
Otherwise, let E_{1}, ..., E_{n} be the types in the function
type's throws
clause that are not proper
types. Let X_{1}, ..., X_{m} be the checked exceptions in the
throws
clause of the invocation type of the method reference's
compile-time declaration (§15.13.2) (as
derived from the function type's parameter types and return
type). Then there are two cases:
If n = 0
(the function type's throws
clause
consists only of proper types), then if there exists some
i (1 ≤ i ≤ m) such that X_{i} is not a
subtype of any proper type in the throws
clause, the
constraint reduces to false; otherwise, the constraint
reduces to true.
If n > 0
, the constraint reduces to a set of
subtyping constraints: for all i (1 ≤ i ≤
m), if X_{i} is not a subtype of any proper type in the
throws
clause, then the constraints include, for all
j (1 ≤ j ≤ n), ‹X_{i} <:
E_{j}›. In addition, for all j (1 ≤ j
≤ n), the constraint reduces to the bound throws
E_{j}.
Constraints on checked exceptions are handled separately from constraints on return types, because return type compatibility influences applicability of methods (§18.5.1), while exceptions only influence the invocation type after overload resolution is complete (§18.5.2). This could be simplified by including exception compatibility in the definition of lambda expression compatibility (§15.27.3), but this would lead to possibly surprising cases in which exceptions that can be thrown by an explicitly typed lambda body change overload resolution.
The exceptions thrown by a lambda body cannot be
determined until (i) the parameter types of the lambda are known, and
(ii) the target type of result expressions in the body is known. (The
second requirement is to account for generic method invocations in
which, for example, the same type parameter appears in the return type
and the throws
clause.) Hence, we require both of these, as derived
from the target type T, to be proper types.
One consequence is that lambda expressions returned from other lambda expressions cannot generate constraints from their thrown exceptions. These constraints can only be generated from top-level lambda expressions.
Note that the handling of the case in which more
than one inference variable appears in a function type's throws
clause is not completeness-preserving. Either variable may, on its
own, satisfy the constraint that each checked exception be declared,
but we cannot be sure which one is intended. So, for predictability,
we constrain them both.
As bound sets are generated and grown during inference, it is possible that new bounds can be inferred based on the assertions of the original bounds. The process of incorporation identifies these new bounds and adds them to the bound set.
Incorporation can happen in two scenarios. One scenario is that the bound set contains complementary pairs of bounds; this implies new constraint formulas, as specified in §18.3.1. The other scenario is that the bound set contains a bound involving capture conversion; this implies new bounds and may imply new constraint formulas, as specified in §18.3.2. In both scenarios, any new constraint formulas are reduced, and any new bounds are added to the bound set. This may trigger further incorporation; ultimately, the set will reach a fixed point and no further bounds can be inferred.
If incorporation of a bound set has reached a fixed point, and the set does not contain the bound false, then the bound set has the following properties:
For each combination of a proper lower bound L
and a proper
upper bound U of an inference variable, L
<:
U.
If every inference variable mentioned by a bound has an instantiation, the bound is satisfied by the corresponding substitution.
Given a dependency α = β, every bound of α matches a bound of β, and vice versa.
Given a dependency α <:
β, every lower bound
of α is a lower bound of β, and every upper bound of
β is an upper bound of α.
The assertion that incorporation reaches a fixed point oversimplifies the matter slightly. Building on the work of Kennedy and Pierce, On Decidability of Nominal Subtyping with Variance, this property can be proven by making the argument that the set of types that may appear in the bound set is finite. The argument relies on two assumptions:
New capture variables are not generated when reducing subtyping constraints (§18.2.3).
Expansive inheritance paths are not pursued.
This specification does not currently guarantee these properties (it is imprecise about the handling of wildcards when reducing subtyping constraints, and does not detect expansive inheritance paths), but may do so in a future version. (This is not a new problem: the Java subtyping algorithm is also at risk of non-termination.)
(In this section, S and T are inference variables or types, and U is a proper type. For conciseness, a bound of the form α = T may also match a bound of the form T = α.)
When a bound set contains a pair of bounds that match one of the following rules, a new constraint formula is implied:
When a bound set contains a pair of bounds α <:
S and
α <:
T, and there exists a supertype of S of the form
G<
S_{1}, ..., S_{n}>
and a supertype of T of the form
G<
T_{1}, ..., T_{n}>
(for some generic class or
interface, G), then for all i (1 ≤ i ≤ n), if
S_{i} and T_{i} are types (not wildcards), the constraint formula
‹S_{i} = T_{i}› is implied.
When a
bound set contains a bound of the form G<
α_{1}, ...,
α_{n}>
= capture(G<
A_{1}, ..., A_{n}>
), new
bounds are implied and new constraint formulas may be implied, as
follows.
Let P_{1},
..., P_{n} represent the type parameters of G and let B_{1}, ..., B_{n}
represent the bounds of these type parameters. Let θ represent
the substitution [
P_{1}:=α_{1}, ..., P_{n}:=α_{n}]
.
Let R be a type that is not an inference
variable (but is not necessarily a proper type).
A set of bounds on α_{1}, ..., α_{n} is implied, generated from the declared bounds of P_{1}, ..., P_{n} as specified in §18.1.3.
In addition, for all i (1 ≤ i ≤ n):
Given a bound set that does not contain the bound false, a subset of the inference variables mentioned by the bound set may be resolved. This means that a satisfactory instantiation may be added to the set for each inference variable, until all the requested variables have instantiations.
Dependencies in the bound set may require that the variables be resolved in a particular order, or that additional variables be resolved. Dependencies are specified as follows:
Given a bound of one of the following forms, where T is either an inference variable β or a type that mentions β:
If α appears on the left-hand side of another bound of the
form G<
..., α, ...>
=
capture(G<
...>
), then β depends on the
resolution of α. Otherwise, α depends on the
resolution of β.
An inference variable α appearing on the left-hand side of
a bound of the form G<
..., α, ...>
=
capture(G<
...>
) depends on the resolution of
every other inference variable mentioned in this bound (on both
sides of the = sign).
An inference variable α depends on the resolution of an inference variable β if there exists an inference variable γ such that α depends on the resolution of γ and γ depends on the resolution of β.
An inference variable α depends on the resolution of itself.
Given a set of inference variables to resolve, let V be the union of this set and all variables upon which the resolution of at least one variable in this set depends.
If every variable in V has an instantiation, then resolution succeeds and this procedure terminates.
Otherwise, let { α_{1}, ..., α_{n} } be a non-empty subset of uninstantiated variables in V such that (i) for all i (1 ≤ i ≤ n), if α_{i} depends on the resolution of a variable β, then either β has an instantiation or there is some j such that β = α_{j}; and (ii) there exists no non-empty proper subset of { α_{1}, ..., α_{n} } with this property. Resolution proceeds by generating an instantiation for each of α_{1}, ..., α_{n} based on the bounds in the bound set:
If the bound set does not contain a bound of the form
G<
..., α_{i}, ...>
=
capture(G<
...>
) for all i (1 ≤ i
≤ n), then a candidate instantiation T_{i} is defined
for each α_{i}:
If α_{i} has one or more proper
lower bounds, L_{1}
, ..., L_{k}
, then T_{i} = lub(L_{1}
, ...,
L_{k}
) (§4.10.4).
Otherwise, if the bound set contains throws
α_{i}, and
each proper upper bound of α_{i} is a supertype of
RuntimeException
, then T_{i} = RuntimeException
.
Otherwise, where α_{i} has proper upper bounds U_{1}, ..., U_{k}, T_{i} = glb(U_{1}, ..., U_{k}) (§5.1.10).
The bounds α_{1} = T_{1}, ..., α_{n} = T_{n} are incorporated with the current bound set.
If the result does not contain the bound false, then the result becomes the new bound set, and resolution proceeds by selecting a new set of variables to instantiate (if necessary), as described above.
Otherwise, the result contains the bound false, so a second attempt is made to instantiate { α_{1}, ..., α_{n} } by performing the step below.
If the bound set contains a bound of the form G<
...,
α_{i}, ...>
= capture(G<
...>
) for some
i (1 ≤ i ≤ n), or;
If the bound set produced in the step above contains the bound false;
then let Y_{1}, ..., Y_{n} be fresh type variables whose bounds are as follows:
For all i (1 ≤ i ≤ n), if α_{i}
has one or more proper lower bounds
L_{1}
, ..., L_{k}
, then let the lower bound of Y_{i} be
lub(L_{1}
, ..., L_{k}
); if not, then Y_{i} has no lower
bound.
For all i (1 ≤ i ≤ n), where α_{i}
has upper bounds U_{1}, ..., U_{k}, let the upper bound of
Y_{i} be glb(U_{1} θ, ..., U_{k} θ), where θ is
the substitution [
α_{1}:=Y_{1}, ..., α_{n}:=Y_{n}]
.
If the type variables Y_{1}, ..., Y_{n} do not have well-formed bounds (that is, a lower bound is not a subtype of an upper bound, or an intersection type is inconsistent), then resolution fails.
Otherwise, for all i (1 ≤ i ≤ n), all
bounds of the form G<
..., α_{i}, ...>
=
capture(G<
...>
) are removed from the current
bound set, and the bounds α_{1} = Y_{1}, ..., α_{n} = Y_{n}
are incorporated.
If the result does not contain the bound false, then the result becomes the new bound set, and resolution proceeds by selecting a new set of variables to instantiate (if necessary), as described above.
Otherwise, the result contains the bound false, and resolution fails.
The first method of instantiating an inference variable derives the instantiation from that variable's bounds. Sometimes, however, complex dependencies mean that the result is not within the variable's bounds. In that case, a different method of instantiation is performed, analogous to capture conversion (§5.1.10): fresh type variables are introduced, with bounds derived from the bounds of the inference variables. Note that the lower bounds of these "capture" variables are computed using only proper types: this is important in order to avoid attempts to perform typing computations on uninstantiated type variables.
Using the inference processes defined above, the following analyses are performed at compile time.
Given a
method invocation that provides no explicit type arguments, the
process to determine whether a potentially applicable generic method
m
is applicable is as follows:
Where P_{1}, ..., P_{p} (p ≥ 1) are the type parameters
of m
, let α_{1}, ..., α_{p} be inference variables, and
let θ be the substitution [
P_{1}:=α_{1}, ...,
P_{p}:=α_{p}]
.
An initial bound set, B_{0}, is generated from the declared bounds of P_{1}, ..., P_{p}, as described in §18.1.3.
For all i (1 ≤ i ≤ p), if P_{i} appears in
the throws
clause of m
, then the bound throws
α_{i} is
implied. These bounds, if any, are incorporated with B_{0} to
produce a new bound set, B_{1}.
A set of constraint formulas, C, is generated as follows.
Let F_{1}, ..., F_{n} be the formal parameter types of m
, and
let e_{1}
, ..., e_{k}
be the actual argument expressions of the
invocation. Then:
To test for applicability by strict invocation:
If k ≠ n, or if there exists an i (1
≤ i ≤ n) such that e_{i}
is pertinent to
applicability (§15.12.2.2) and either
(i) e_{i}
is a standalone expression of a primitive type but
F_{i} is a reference type, or (ii) F_{i} is a primitive type
but e_{i}
is not a standalone expression of a primitive type;
then the method is not applicable and there is no need to
proceed with inference.
Otherwise, C includes, for all i (1 ≤ i
≤ k) where e_{i}
is pertinent to applicability,
‹e_{i}
→ F_{i} θ›.
To test for applicability by loose invocation:
If k ≠ n, the method is not applicable and there is no need to proceed with inference.
Otherwise, C includes, for all i (1 ≤ i
≤ k) where e_{i}
is pertinent to applicability,
‹e_{i}
→ F_{i} θ›.
To test for applicability by variable arity invocation:
Let F'_{1}, ..., F'_{k} be the first k variable
arity parameter types of m
(§15.12.2.4). C includes, for all
i (1 ≤ i ≤ k) where e_{i}
is pertinent
to applicability, ‹e_{i}
→ F'_{i}
θ›.
C is reduced (§18.2) and the resulting bounds are incorporated with B_{1} to produce a new bound set, B_{2}.
Finally, the method m
is applicable if B_{2} does not contain
the bound false and resolution of all the inference
variables in B_{2} succeeds (§18.4).
Consider the following method invocation and assignment:
List<Number> ln = Arrays.asList(1, 2.0);
A most specific applicable method for the invocation must be identified as described in §15.12. The only potentially applicable method (§15.12.2.1) is declared as follows:
public static <T> List<T> asList(T... a)
Trivially (because of its arity), this method is neither applicable by strict invocation (§15.12.2.2) nor applicable by loose invocation (§15.12.2.3). But since there are no other candidates, in a third phase the method is checked for applicability by variable arity invocation.
The initial bound set, B, is a trivial upper bound for a single inference variable, α:
{ α <:
Object
}
The initial constraint formula set is as follows:
{ ‹1
→ α›,
‹2.0
→ α› }
These are reduced to a new bound set, B_{1}:
{ α <:
Object
,
Integer
<:
α,
Double
<:
α }
Then, to test whether the method is applicable, we attempt to resolve these bounds. We succeed, producing the rather complex instantiation
α = Number & Comparable<?
extends Number & Comparable<?>>
We have thus demonstrated that the method is applicable; since no other candidates exist, it is the most specific applicable method. Still, the type of the method invocation, and its compatibility with the target type in the assignment, is not determined until further inference can occur, as described in the next section.
Given a method invocation expression that provides no explicit type
arguments, and a corresponding most specific applicable generic method
m
, the process to infer the invocation type
(§15.12.2.6) of the chosen method may require
resolving additional constraints, both to assert compatibility with a
target type and to assert validity of the method invocation's argument
expressions.
It is important to note that multiple "rounds" of inference are involved in finding the type of a method invocation. This is necessary, for example, to allow a target type to influence the type of the invocation without allowing it to influence the choice of an applicable method. The first round (§18.5.1) produces a bound set and tests that a resolution exists, but does not commit to that resolution. Subsequent rounds reduce additional constraints until a final resolution step determines the "real" type of the expression.
If the method invocation expression is a poly expression (§15.12), its compatibility with a target type T is determined as follows.
If the method invocation expression appears in a strict invocation context and T is a primitive type, the expression is not compatible with T.
Let B_{2} be the bound set produced by reduction in order to
demonstrate that m
is applicable in §18.5.1.
(While it was necessary in §18.5.1 to demonstrate that the inference variables in B_{2} could be resolved, in order to establish applicability, the instantiations produced by this resolution step are not considered part of B_{2}.)
Let B_{3} be the bound set derived from B_{2} as follows.
Let R be the return type of m
,
and let θ be the substitution [
P_{1}:=α_{1}, ...,
P_{p}:=α_{p}]
defined in §18.5.1
to replace the type parameters of m
with inference variables,
and let T be the invocation's target type. Then:
If unchecked conversion was necessary for the method to be applicable during constraint set reduction in §18.5.1, the constraint formula ‹|R| → T› is reduced and incorporated with B_{2}.
Otherwise, if R θ is a parameterized type,
G<
A_{1}, ..., A_{n}>
, and one of A_{1}, ...,
A_{n} is a wildcard, then, for fresh inference variables
β_{1}, ..., β_{n}, the constraint formula
‹G<
β_{1}, ..., β_{n}>
→
T› is reduced and incorporated, along with the bound
G<
β_{1}, ..., β_{n}>
=
capture(G<
A_{1}, ..., A_{n}>
), with
B_{2}.
Otherwise, if R θ is an inference variable α, and one of the following is true:
T is a reference type, but is not a
wildcard-parameterized type, and either (i) B_{2} contains
a bound of one of the forms α = S or S
<:
α, where S is a wildcard-parameterized
type, or (ii) B_{2} contains two bounds of the forms S_{1}
<:
α and S_{2} <:
α, where S_{1}
and S_{2} have supertypes that are two different
parameterizations of the same generic class or
interface.
T is a parameterization of a generic class or
interface, G, and B_{2} contains a bound of one of the
forms α = S or S <:
α, where
there exists no type of the form G<
...>
that is a supertype of S, but the raw type
|G<
...>
| is a supertype of S.
T is a primitive type, and one of the primitive wrapper classes mentioned in §5.1.7 is an instantiation, upper bound, or lower bound for α in B_{2}.
then α is resolved in B_{2}, and where the capture of the resulting instantiation of α is U, the constraint formula ‹U → T› is reduced and incorporated with B_{2}.
Otherwise, the constraint formula ‹R θ → T› is reduced and incorporated with B_{2}.
The method invocation expression is compatible with T if B_{3} does not contain the bound false and resolution of all the inference variables in B_{3} succeeds (§18.4).
Consider the example from the previous section:
List<
Number>
ln = Arrays.asList(1, 2.0);
The most specific applicable method was identified as:
public static<
T>
List<
T>
asList(T... a)
In order to complete type-checking of the method
invocation, we must determine whether it is compatible with its target
type, List
.
<
Number>
The bound set used to demonstrate applicability in the previous section, B_{2}, was:
{ α <:
Object
,
Integer
<:
α,
Double
<:
α }
The new constraint formula set is as follows:
{ ‹List
→ <
α>
List
› }
<
Number>
This compatibility constraint produces an equality bound for α, which is included in the new bound set, B_{3}:
{ α <:
Object
,
Integer
<:
α,
Double
<:
α,
α = Number
}
These bounds are trivially resolved:
α = Number
Finally, we perform a substitution on the declared
return type of asList
to determine that the method
invocation has type List
;
clearly, this is compatible with the target type.
<
Number>
This inference strategy is different than the Java
SE 7 Edition of The Java® Language Specification, which would have instantiated α based
on its lower bounds (before even considering the invocation's target
type), as we did in the previous section. This would result in a type
error, since the resulting type is not a subtype
of List
.
<
Number>
Under various special circumstances, based on the bounds appearing in B_{2}, we eagerly resolve an inference variable that appears as the return type of the invocation. This is to avoid unfortunate situations in which the usual constraint, ‹R θ → T›, is not completeness-preserving. It is, unfortunately, possible that by eagerly resolving the variable, we are unable to make use of bounds that would be inferred later. It is also possible that, in some cases, bounds that will later be inferred from the invocation arguments (such as implicitly typed lambda expressions) would have caused a different outcome if they had been present in B_{2}. Despite these limitations, the strategy allows for reasonable outcomes in typical use cases, and is backwards compatible with the algorithm in the Java SE 7 Edition of The Java® Language Specification.
The invocation type for the chosen method is determined after considering additional constraints that may be implied by the argument expressions of the method invocation expression, as follows:
If the method invocation expression is a poly expression, let B_{3} be the bound set generated in §18.5.2.1 to demonstrate compatibility with the actual target type of the method invocation.
If the method invocation expression is not a poly expression,
let B_{3} be the same as the bound set produced by reduction in
order to demonstrate that m
is applicable in
§18.5.1.
(While it was necessary in §18.5.1 and §18.5.2.1 to demonstrate that the inference variables in the bound set could be resolved, the instantiations produced by these resolution steps are not considered part of B_{3}.)
A set of constraint formulas, C, is generated as follows.
Let e_{1}
, ..., e_{k}
be the actual argument expressions of the
method invocation expression.
If m
is applicable by strict or loose invocation, let F_{1},
..., F_{k} be the formal parameter types of m
; if m
is
applicable by variable arity invocation, let F_{1}, ..., F_{k} the
first k variable arity parameter types of m
(§15.12.2.4).
Let θ be the substitution [
P_{1}:=α_{1}, ...,
P_{p}:=α_{p}]
defined in §18.5.1
to replace the type parameters of m
with inference
variables.
If e_{i}
is not pertinent to applicability, C contains
‹e_{i}
→ F_{i} θ›.
Additional constraints may be included, depending on the
form of e_{i}
:
If e_{i}
is a LambdaExpression,
C contains ‹LambdaExpression →_{throws} F_{i} θ›,
and the lambda body is searched for additional constraints:
For a block lambda body, the search is applied recursively to each result expression.
For a poly class instance creation expression or a poly method invocation expression , C contains all the constraint formulas that would appear in the set C generated by §18.5.2 when inferring the poly expression's invocation type.
For a parenthesized expression, the search is applied recursively to the contained expression.
For a conditional expression, the search is applied recursively to the second and third operands.
For a lambda expression, the search is applied recursively to the lambda body.
If e_{i}
is a MethodReference,
C contains ‹MethodReference →_{throws} F_{i} θ›.
If e_{i}
is a poly class instance creation expression or
a poly method invocation expression, C contains all
the constraint formulas that would appear in the set C
generated by §18.5.2 when inferring
the poly expression's invocation type.
If e_{i}
is a parenthesized expression,
these rules are applied recursively to the contained expression.
If e_{i}
is a conditional expression,
these rules are applied recursively to the second and third operands.
While C is not empty, the following process is repeated, starting with the bound set B_{3} and accumulating new bounds into a "current" bound set, ultimately producing a new bound set, B_{4}:
A subset of constraints is selected in C, satisfying the property that, for each constraint, no input variable can influence an output variable of another constraint in C. The terms input variable and output variable are defined below. An inference variable α can influence an inference variable β if α depends on the resolution of β (§18.4), or vice versa; or if there exists a third inference variable γ such that α can influence γ and γ can influence β.
If this subset is empty, then there is a cycle (or cycles) in the graph of dependencies between constraints. In this case, the constraints in C that participate in a dependency cycle (or cycles) and do not depend on any constraints outside of the cycle (or cycles) are considered. A single constraint is selected from these considered constraints, as follows:
If any of the considered constraints have the form ‹Expression → T›, then the selected constraint is the considered constraint of this form that contains the expression to the left (§3.5) of the expression of every other considered constraint of this form.
If no considered constraint has the form ‹Expression → T›, then the selected constraint is the considered constraint that contains the expression to the left of the expression of every other considered constraint.
The input variables α_{1}, ..., α_{m} of all the selected constraint(s) are resolved.
Where T_{1}, ..., T_{m} are the instantiations of α_{1},
..., α_{m}, the substitution [
α_{1}:=T_{1}, ...,
α_{m}:=T_{m}]
is applied to every constraint.
The constraint(s) resulting from substitution are reduced and incorporated with the current bound set.
Finally, if B_{4} does not contain the bound false, the inference variables in B_{4} are resolved.
If resolution succeeds with instantiations T_{1}, ..., T_{p} for
inference variables α_{1}, ..., α_{p}, let θ' be
the substitution [
P_{1}:=T_{1}, ..., P_{p}:=T_{p}]
. Then:
If unchecked conversion was necessary for the method to be
applicable during constraint set reduction in
§18.5.1, then the parameter types of
the invocation type of m
are obtained by applying
θ' to the parameter types of m
's type, and the
return type and thrown types of the invocation type of m
are given by the erasure of the return type and thrown types
of m
's type.
If unchecked conversion was not necessary for the method to
be applicable, then the invocation type of m
is obtained
by applying θ' to the type of m
.
If B_{4} contains the bound false, or if resolution fails, then a compile-time error occurs.
The process of reducing additional argument constraints may require carefully ordering constraint formulas of the forms ‹Expression → T›, ‹LambdaExpression →_{throws} T›, and ‹MethodReference →_{throws} T›. To facilitate this ordering, the input variables of these constraints are defined as follows:
If T is an inference variable, it is the (only) input variable.
If T is a functional interface type, and a function type
can be derived from T (§15.27.3),
then the input variables include (i) if the lambda expression
is implicitly typed, the inference variables mentioned by
the function type's parameter types; and (ii) if the function
type's return type, R, is not void
, then for each result
expression e
in the lambda body (or for the body itself if
it is an expression), the input variables of ‹e
→ R›.
For ‹LambdaExpression →_{throws} T›:
If T is an inference variable, it is the (only) input variable.
If T is a functional interface type, and a function type can be derived, as described in §15.27.3, the input variables include (i) if the lambda expression is implicitly typed, the inference variables mentioned by the function type's parameter types; and (ii) the inference variables mentioned by the function type's return type.
If T is an inference variable, it is the (only) input variable.
If T is a functional interface type with a function type, and if the method reference is inexact (§15.13.1), the input variables are the inference variables mentioned by the function type's parameter types.
For ‹MethodReference →_{throws} T›:
If T is an inference variable, it is the (only) input variable.
If T is a functional interface type with a function type, and if the method reference is inexact (§15.13.1), the input variables are the inference variables mentioned by the function type's parameter types and the function type's return type.
For ‹Expression → T›, if Expression is a parenthesized expression:
Where the contained expression of Expression is Expression', the input variables are the input variables of ‹Expression' → T›.
For ‹ConditionalExpression → T›:
Where the conditional expression has the form e_{1}
?
e_{2}
:
e_{3}
, the input variables are the input variables of
‹e_{2}
→ T› and ‹e_{3}
→
T›.
For all other constraint formulas, there are no input variables.
The output variables of these constraints are all inference variables mentioned by the type on the right-hand side of the constraint, T, that are not input variables.
Where a lambda expression with explicit parameter types P_{1}, ...,
P_{n} targets a functional interface type F<
A_{1}, ...,
A_{m}>
with at least one wildcard type argument, then a
parameterization of F may be derived as the ground target type of
the lambda expression as follows.
Let Q_{1}, ..., Q_{k} be the parameter types of the function type of the
type F<
α_{1}, ..., α_{m}>
, where α_{1}, ...,
α_{m} are fresh inference variables.
If n ≠ k, no valid parameterization exists. Otherwise, a set of constraint formulas is formed with, for all i (1 ≤ i ≤ n), ‹P_{i} = Q_{i}›. This constraint formula set is reduced to form the bound set B.
If B contains the bound false, no valid parameterization
exists. Otherwise, a new parameterization of the functional interface
type, F<
A'_{1}, ..., A'_{m}>
, is constructed as
follows, for 1 ≤ i ≤ m:
If B contains an instantiation (§18.1.3) for α_{i}, T, then A'_{i} = T.
If F<
A'_{1}, ..., A'_{m}>
is not a well-formed
type (that is, the type arguments are not within their bounds), or if
F<
A'_{1}, ..., A'_{m}>
is not a subtype of
F<
A_{1}, ..., A_{m}>
, no valid parameterization
exists. Otherwise, the inferred parameterization is either
F<
A'_{1}, ..., A'_{m}>
, if all the type
arguments are types, or the non-wildcard parameterization
(§9.9) of F<
A'_{1}, ...,
A'_{m}>
, if one or more type arguments are still
wildcards.
In order to determine the function type of a wildcard-parameterized functional interface, we have to "instantiate" the wildcard type arguments with specific types. The "default" approach is to simply replace the wildcards with their bounds, as described in §9.8, but this produces spurious errors in cases where a lambda expression has explicit parameter types that do not correspond to the wildcard bounds. For example:
Predicate<? super Integer> p = (Number n) ->
n.equals(23);
The lambda expression is
a Predicate<Number>
, which is a subtype
of Predicate<? super Integer>
but
not Predicate<Integer>
. The analysis in this
section is used to infer that Number
is an
appropriate choice for the type argument
to Predicate
.
That said, the analysis here, while described in terms of general type inference, is intentionally quite simple. The only constraints are equality constraints, which means that reduction amounts to simple pattern matching. A more powerful strategy might also infer constraints from the body of the lambda expression. But, given possible interactions with inference for surrounding and/or nested generic method invocations, this would introduce a lot of extra complexity.
When testing that one applicable method is more specific than another (§15.12.2.5), where the second method is generic, it is necessary to test whether some instantiation of the second method's type parameters can be inferred to make the first method more specific than the second.
Let m_{1}
be the first method and m_{2}
be the second method. Where m_{2}
has type parameters P_{1}, ..., P_{p}, let α_{1}, ..., α_{p} be
inference variables, and let θ be the substitution
[
P_{1}:=α_{1}, ..., P_{p}:=α_{p}]
.
Let e_{1}
, ..., e_{k}
be the argument expressions of the corresponding
invocation. Then:
If m_{1}
and m_{2}
are applicable by strict or loose invocation
(§15.12.2.2,
§15.12.2.3), then let S_{1}, ..., S_{k} be
the formal parameter types of m_{1}
, and let T_{1}, ..., T_{k} be
the result of θ applied to the formal parameter types of
m_{2}
.
If m_{1}
and m_{2}
are applicable by variable arity invocation
(§15.12.2.4), then let S_{1}, ..., S_{k} be
the first k variable arity parameter types of m_{1}
, and let
T_{1}, ..., T_{k} be the result of θ applied to the first
k variable arity parameter types of m_{2}
.
Note that no substitution is applied to S_{1}, ...,
S_{k}; even if m_{1}
is generic, the type parameters of m_{1}
are treated
as type variables, not inference variables.
The process to determine if m_{1}
is more specific than m_{2}
is as
follows:
First, an initial bound set, B, is generated from the declared bounds of P_{1}, ..., P_{p}, as specified in §18.1.3.
Second, for all i (1 ≤ i ≤ k), a set of constraint formulas or bounds is generated.
If T_{i} is a proper type, the result is true if S_{i} is
more specific than T_{i} for e_{i}
(§15.12.2.5), and false
otherwise. (Note that S_{i} is always a proper type.)
Otherwise, if S_{i} and T_{i} are not both functional interface
types, the constraint formula ‹S_{i} <:
T_{i}›
is generated.
Otherwise, if the interface of S_{i} is a superinterface or a
subinterface of the interface of T_{i} (or, where S_{i} or T_{i} is
an intersection type, some interface of S_{i} is a superinterface
or a subinterface of some interface of T_{i}), the constraint
formula ‹S_{i} <:
T_{i}› is generated.
Otherwise, let MT_{S} be the function type of the capture of S_{i}, let MT_{S}' be the function type of S_{i} (without capture), and let MT_{T} be the function type of T_{i}. If MT_{S} and MT_{T} have a different number of formal parameters or type parameters, or if MT_{S} and MT_{S}' do not have the same type parameters (§8.4.4), the result is false. Otherwise, the following constraint formulas or bounds are generated from the type parameters, formal parameter types, and return types of MT_{S} and MT_{T}:
Let A_{1}, ..., A_{n} be the type parameters of MT_{S}, and let B_{1}, ..., B_{n} be the type parameters of MT_{T}.
Let θ' be the substitution [
B_{1}:=A_{1}, ...,
B_{n}:=A_{n}]
. Then, for all j (1 ≤ j
≤ n):
If the bound A_{j} mentions one of A_{1}, ..., A_{n}, and the bound of B_{j} is not a proper type, then producing an equality constraint would raise the possibility of an inference variable being bounded by an out-of-scope type variable. Since instantiating an inference variable with an out-of-scope type variable is nonsensical, we prefer to avoid the situation by giving up immediately whenever the possibility arises. This simplification is not completeness-preserving. (The same comment applies to the treatment of formal parameter types and return types below.)
Let U_{1}, ..., U_{k} be the formal parameter types of MT_{S}, and let V_{1}, ..., V_{k} be the formal parameter types of MT_{T}. Then, for all j (1 ≤ j ≤ k):
If U_{j} mentions one of A_{1}, ..., A_{n}, and V_{j} is not a proper type, false.
Otherwise,
‹V_{j} θ' <:
U_{j}›, and,
where U_{1}', ..., U_{k}' are the formal parameter types of MT_{S}',
and A_{1}', ..., A_{n}' are the type parameters of MT_{S}',
‹V_{j}[
B_{1}:=A_{1}', ..., B_{n}:=A_{n}']
= U_{j}'›
Let R_{S} be the return type of MT_{S}, and let R_{T} be the return type of MT_{T}. Then:
If R_{S} mentions one of A_{1}, ..., A_{n}, and R_{T} is not a proper type, false.
Otherwise, if e_{i}
is an explicitly typed lambda expression:
Otherwise, if R_{S} and R_{T} are functional interface
types, and e_{i}
has at least one result expression,
then for each result expression in e_{i}
, this entire
second step is repeated to infer constraints under
which R_{S} is more specific than R_{T} θ' for
the given result expression.
Otherwise, if R_{S} is a primitive type and R_{T} is not,
and e_{i}
has at least one result expression,
and each result expression of e_{i}
is a standalone
expression (§15.2) of a primitive
type, true.
Otherwise, if R_{T} is a primitive type and R_{S} is not,
and e_{i}
has at least one result expression,
and each result expression of e_{i}
is either a
standalone expression of a reference type or a poly
expression, true.
Otherwise, if e_{i}
is a parenthesized expression, these
rules for constraints derived from R_{S} and R_{T} are
applied recursively for the contained expression.
Otherwise, if e_{i}
is a conditional expression, these
rules for constraints derived from R_{S} and R_{T} are
applied recursively for each of the second and third operands.
Third, if m_{2}
is applicable by variable arity invocation and
has k+1 parameters, then where S_{k+1} is the k+1'th
variable arity parameter type of m_{1}
and T_{k+1} is the
result of θ applied to the k+1'th variable arity
parameter type of m_{2}
, the constraint ‹S_{k+1}
<:
T_{k+1}› is generated.
Fourth, the generated bounds and constraint formulas are reduced and incorporated with B to produce a bound set B'.
If B' does not contain the bound false, and
resolution of all the inference variables in B' succeeds,
then m_{1}
is more specific than m_{2}
.