A Route Pattern describes a pattern that will match a certain set of HTTP request paths. The pattern is matched against the path component of the request URI.
/objects/:object/:id?
This route pattern will match the following paths:
This section is non-normative.
The syntax of Route Patterns is similar to and inspired by the pattern routing syntax found in a number of web frameworks, including:
Route Patterns emerge out of a desire to create a formal definition of the ad-hoc pattern syntax that these and similar frameworks have popularised.
A goal of Route Patterns is to ensure that it is not possible to define an ambiguous suite of Route Patterns, i.e. that for any given request path at most one Route Pattern can be chosen to match the path. As a consequence, the Route Pattern syntax may be considered less flexible and expressive than the ad-hoc syntaxes that the above web frameworks define.
This is a conscious design trade-off. In the ad-hoc syntaxes, any ambiguity is resolved by the order in which patterns are declared: the first declared pattern is tested first, the second declared pattern is tested second, and so on. Developers can order the pattern declarations to ensure more specific patterns are tested before less specific patterns. This requires one central code location where routes are declared, and careful ordering of the patterns to avoid errors. These requirements may not scale to larger applications where many developers define route patterns and may not be fully aware of conflicting or overlapping route patterns, or to applications where route patterns need to be defined in many different locations (e.g. in a pluggable architecture).
The Route Pattern syntax is also somewhat similar to the URI Template syntax, but the applications of URI Templates and Route Patterns differ. URI Templates focus on forming concrete URIs from a template, whereas Route Patterns focus on decomposing the path portion of a URI into its component parts.
A Route Pattern is a string of printable Unicode characters that contains zero or more embedded variable expressions. An expression MAY be a Named Parameter, delimited by a leading colon (‘:’) and a trailing slash (‘/’) or the end of the string, or it MAY be a Glob Parameter, indicated by the wildcard character (‘*’). A pattern that contains one or more Named Parameters is termed a Named Pattern. A pattern that contains a Glob Parameter is termed a Glob Pattern. A pattern MUST NOT contain a mixture of Named Parameters and Glob Parameters. A pattern lacking any variable expressions is termed a Literal Pattern.
Route-Pattern = named-pattern / glob-pattern / literal-pattern
The slash (‘/’) character delimits the pattern into Path Segments. A Path Separator MUST NOT be followed by another Path Separator. The leading Path Separator in a Route Pattern is implied and may be omitted.
The set of reserved characters is those defined by RFC 3986 Section 2.2.
reserved = gen-delims / sub-delims
gen-delims = ":" / "/" / "?" / "#" / "[" / "]" / "@"
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
The characters outside of expressions and path separators in a Route Pattern are termed Literal Values. They MAY contain any printable Unicode character except the Reserved Characters.
The start of a Named Parameter is indicated by the colon character (‘:’). The end of a Named Parameter is indicated by a Path Separator or the end of the string. A Named Parameter MAY be suffixed with a Modifier. A given parameter name MUST appear only once in each Route Pattern. A Route Pattern MAY have zero or more Named Parameters.
named-expression-pattern = *(literal / path-separator / named-expression )
valid-name = [a-zA-Z0-9] / '-' / '_'
char = [a-zA-Z]
name = char valid-name*
param-decl = name ('*' / '?' )
named-expression = ':' param-decl path-separator / ':' param-decl <eos>
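As a rough illustration of the grammar above, the following Python sketch parses a single named expression into its name and Modifier. The function name `parse_named_expression` is hypothetical. The sketch assumes the Modifier is optional, as the prose states, and it does not handle Compound Named Parameters.

```python
import re

# name = char valid-name*, optionally followed by one Modifier ('*' or '?').
# Treating the Modifier as optional is an assumption taken from the prose.
PARAM_DECL = re.compile(r"([A-Za-z][A-Za-z0-9_-]*)([*?]?)")


def parse_named_expression(expression):
    """Return (name, modifier) for text such as ':id?', or None if invalid.

    A single trailing Path Separator, if present, is ignored.
    """
    if not expression.startswith(":"):
        return None
    body = expression[1:]
    if body.endswith("/"):
        body = body[:-1]
    match = PARAM_DECL.fullmatch(body)
    if match is None:
        return None
    name, modifier = match.groups()
    return name, (modifier or None)
```

For example, `parse_named_expression(":id?")` returns `("id", "?")`, while `parse_named_expression(":9lives")` returns `None` because a name must start with a letter.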
A Modifier modifies the matching behavior of a Named Parameter. Only a single Named Parameter in a Route Pattern MAY contain a Modifier and it MUST be the last Named Parameter in the pattern. A Modifier is suffixed to the end of a named parameter expression.
The Eager Modifier is indicated by the asterisk character (‘*’) and instructs the matcher to eagerly consume all remaining characters bound to the Named Parameter, including Path Separator characters, up to the end of the string.
/foo/:all-children*
This pattern will match the following paths:
The Eager Modifier MUST match at least one character, so the above pattern will not match the following path:
The Optional Modifier is indicated by the question mark character (‘?’) and instructs the matcher that the Named Parameter will match zero or more characters until the end of the string is reached.
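The effect of the two Modifiers can be sketched by translating a trailing Named Parameter into a regular expression. This is only an illustration under stated assumptions: the names `PARAMETER_BODY`, `compile_trailing_parameter` and `capture` are hypothetical, the pattern is reduced to a literal prefix followed by one parameter, and the Optional Modifier is assumed not to consume Path Separators.

```python
import re

# Regular-expression bodies for a Named Parameter; None means no Modifier.
PARAMETER_BODY = {
    None: "[^/]+",  # one or more characters, stopping at a Path Separator
    "*": ".+",      # Eager: one or more characters, Path Separators included
    "?": "[^/]*",   # Optional: zero or more characters (assumed not to cross '/')
}


def compile_trailing_parameter(literal_prefix, modifier=None):
    """Compile a matcher for literal_prefix followed by one Named Parameter."""
    body = PARAMETER_BODY[modifier]
    return re.compile(re.escape(literal_prefix) + "(" + body + ")")


def capture(matcher, path):
    """Return the text bound to the parameter, or None when there is no match."""
    match = matcher.fullmatch(path)
    return None if match is None else match.group(1)
```

With `eager = compile_trailing_parameter("/foo/", "*")`, the call `capture(eager, "/foo/a/b/c")` returns `"a/b/c"`, and `capture(eager, "/foo/")` returns `None` because the Eager Modifier needs at least one character. With the `"?"` Modifier, the same prefix matches `"/foo/"` and binds the empty string.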
A Compound Named Parameter is a Named Parameter where the matching text in the request path is decomposed into named components. Each component is delimited by the comma character (‘,’).
A Compound Named Parameter MAY have an Optional Modifier, but MUST NOT have an Eager Modifier.
A Glob Parameter is denoted by the wildcard Modifier (the ‘*’ character). The wildcard Modifier MUST appear at the end of the pattern and MUST be preceded by the path separator. Only a single Glob Parameter is permitted in a pattern. A Glob Parameter MUST NOT occur in the same pattern as a Named Parameter.
glob-pattern = *(literal / path-separator) path-separator '*'
A Glob Parameter matches zero or more characters until the end of the string.
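A minimal sketch of Glob Pattern matching follows. The function name `match_glob` and the `/static/*` pattern in the usage note are hypothetical, and for brevity the literal prefix is compared as-is rather than in URL-encoded form.

```python
def match_glob(pattern, path):
    """Return the text bound to the Glob Parameter, or None when there is no match."""
    # The leading Path Separator is implied in both the pattern and the path.
    if not pattern.startswith("/"):
        pattern = "/" + pattern
    if not path.startswith("/"):
        path = "/" + path
    # The wildcard must be the final character, preceded by a Path Separator,
    # and must be the only wildcard in the pattern.
    if not pattern.endswith("/*") or "*" in pattern[:-1]:
        raise ValueError("not a Glob Pattern: " + pattern)
    prefix = pattern[:-1]  # everything up to and including the final '/'
    if not path.startswith(prefix):
        return None
    return path[len(prefix):]  # zero or more characters, to end of string
```

For example, `match_glob("/static/*", "/static/css/site.css")` returns `"css/site.css"`, and `match_glob("/static/*", "/static/")` returns the empty string.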
A Route Pattern is composed of the following tokens:
A Route Pattern is matched against the URL-encoded form of a request path by matching each token against its corresponding segment of the request path. The tokens are matched in left-to-right order: the first token matches the left-most segment of the request path, the second token matches the next left-most segment, and so on.
The rules for matching each token type are defined below:
Each path separator token MUST match exactly one ‘/’ character in the request path. A Path Separator MUST NOT match the URL Encoded form of the ‘/’ character, i.e. it MUST NOT match the following octets: %2F or the following octets: %2f. Since the leading Path Separator in a Route Pattern is optional, the leading Path Separator in a request path is also optional and MAY be omitted.
Each literal value token MUST match the exact same characters in the request path. Each literal value MUST be URL encoded and compared to the URL encoded request path.
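The two rules above can be sketched in Python as follows. The helper names `split_segments` and `literal_matches` are hypothetical. The sketch uses `urllib.parse.quote` for URL encoding, which emits upper-case hexadecimal digits, so a request path that uses lower-case digits in its percent-encoded octets would not compare equal here; that is a simplification of this sketch.

```python
from urllib.parse import quote


def split_segments(request_path):
    """Split an encoded request path on raw '/' characters only.

    Percent-encoded slashes (%2F or %2f) are left inside their segment,
    so they are never treated as Path Separators.
    """
    if request_path.startswith("/"):
        request_path = request_path[1:]  # the leading Path Separator is optional
    return request_path.split("/")


def literal_matches(literal, encoded_segment):
    """Compare a Literal Value with one encoded segment of the request path."""
    return quote(literal, safe="") == encoded_segment
```

For example, `split_segments("/a%2Fb/c")` returns `["a%2Fb", "c"]`, and `literal_matches("café", "caf%C3%A9")` is true.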
A Named Parameter token matches one or more characters up until the next occurrence of a Path Separator or end of string.
If a Named Parameter has an Optional Modifier then it will match zero or more characters up until the end of string.
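A minimal sketch of token-by-token matching for patterns made of Literal Values and simple Named Parameters is shown below. The function name `match_named_pattern` is hypothetical. The sketch leaves out Eager Modifiers, Compound Named Parameters and Glob Parameters, compares literals without URL encoding, and reads the Path Separator rule literally, so every separator in the pattern must be present in the request path; these are assumptions of the sketch.

```python
def match_named_pattern(pattern, path):
    """Return a dict of parameter values, or None when the path does not match."""

    def segments(text):
        # The leading Path Separator is optional in both pattern and path.
        return (text[1:] if text.startswith("/") else text).split("/")

    tokens = segments(pattern)
    parts = segments(path)
    # Each Path Separator token must match exactly one '/' in the path.
    if len(tokens) != len(parts):
        return None
    params = {}
    for token, part in zip(tokens, parts):
        if token.startswith(":"):
            optional = token.endswith("?")
            name = token[1:-1] if optional else token[1:]
            if part == "" and not optional:
                return None  # a Named Parameter needs one or more characters
            params[name] = part
        elif token != part:
            return None  # a Literal Value must match exactly
    return params
```

For example, `match_named_pattern("/objects/:object/:id?", "/objects/widget/42")` returns `{"object": "widget", "id": "42"}`, where `widget` and `42` are made-up values.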
A Compound Named Parameter token matches one or more characters up until the next occurrence of a Path Separator or end of string, wherein the matched characters are further delimited by the comma (‘,’) character. If the Compound Named Parameter has N components, then there MUST be at most N-1 commas in the matched text. If there are more than N-1 comma characters then there MUST be no match. Trailing comma characters MAY be omitted in the matched request path.
Component values in the request path that must contain the comma character MUST use the percent-encoded form of the comma character (‘%2C’).
If a Compound Named Parameter has an Optional Modifier then it will match zero or more characters up until the end of string.
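The comma rules above can be sketched as follows. The function name `match_compound` is hypothetical, as is the choice to percent-decode each component after splitting; the specification text here does not say in which form component values are reported.

```python
from urllib.parse import unquote


def match_compound(names, segment, optional=False):
    """Bind the comma-delimited text of one path segment to component names.

    names is the list of N component names; segment is the URL-encoded text
    between two Path Separators. Returns a dict, or None when there is no match.
    """
    if segment == "" and not optional:
        return None  # without the Optional Modifier, one or more characters
    parts = segment.split(",")  # an encoded comma (%2C) is not split here
    if len(parts) > len(names):
        return None  # more than N-1 commas: no match
    # Trailing commas may be omitted, so pad the missing components.
    parts += [""] * (len(names) - len(parts))
    # Decoding after the split is an assumption of this sketch.
    return {name: unquote(part) for name, part in zip(names, parts)}
```

For example, `match_compound(["order_id", "item_id"], "17,4")` returns `{"order_id": "17", "item_id": "4"}`, and `match_compound(["title", "author"], "Rock%2C%20Paper,Smith")` binds `title` to `"Rock, Paper"`. The values are made up for illustration.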
The pattern /line-items/:order_id,item_id/detail will match the following paths:
Trailing comma separators MAY be omitted so the following path will also be matched:
If a component value contains the comma character, it must be percent-encoded in the request path. For example, given the pattern /books/:title,author, then:
A collection of Route Patterns is termed a Route Pattern Set. A Route Pattern Set MUST be unambiguous, meaning that for a given request path it is possible to choose at most one Route Pattern from the set to match the request path.
Route Patterns MUST be ordered within the Route Pattern Set from most specific pattern to least specific pattern. Matching of a request path against a Route Pattern Set MUST proceed from the most specific pattern to least specific pattern. Matching MUST stop at the first matching Route Pattern encountered.
Equivalent or overlapping Route Patterns MUST NOT occur in the same Route Pattern Set.
Named Patterns are equivalent if the only difference between the patterns is the names assigned to parameters.
The following two patterns are not permitted in the same Route Pattern Set because the only difference is the name assigned to the Named Parameter:
Both Named Patterns will match the exact same set of request paths, which causes ambiguity about which one should be chosen to match a given request path.
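One way to detect equivalent Named Patterns is to erase the parameter names and compare what remains. The sketch below is an illustration only; the names `erase_names` and `are_equivalent` are hypothetical, and the patterns in the usage note are made up.

```python
import re

NAME = r"[A-Za-z][A-Za-z0-9_-]*"
# A Named Parameter, or a Compound Named Parameter with comma-separated names.
PARAMETER = re.compile(":" + NAME + "(?:," + NAME + ")*")


def erase_names(pattern):
    """Drop parameter names, keeping the colon, the commas and any Modifier."""
    if not pattern.startswith("/"):
        pattern = "/" + pattern  # the leading Path Separator is implied
    return PARAMETER.sub(lambda m: ":" + "," * m.group(0).count(","), pattern)


def are_equivalent(first, second):
    """True when the two patterns differ only in the names of their parameters."""
    return erase_names(first) == erase_names(second)
```

For example, `are_equivalent("/users/:id", "/users/:user_id")` is true, so the two patterns could not appear in the same Route Pattern Set.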
Overlapping Patterns are Route Patterns where for a subset of request paths, more than one Route Pattern matches, and the Token Precedence Ordering described below does not help resolve which Route Pattern should be chosen.
A Route Pattern Set MUST NOT contain two or more Named Patterns that differ only in the use of a Modifier.
The following three patterns are not permitted in the same Route Pattern Set because the only difference is the modifier assigned to the Named Parameter:
A Glob Pattern MUST NOT overlap with a Literal Pattern in the same Route Pattern Set.
The following Literal and Glob Pattern overlap, because the Glob Pattern will also match the same request path as the Literal Pattern:
Patterns MUST be ordered in reverse lexicographical ordering. As a consequence, the longest path sharing a common prefix will be matched first.
The different token types are assigned a precedence order from most specific to least specific, which enables a deterministic sort order to be determined for a Route Pattern Set.
Literal Values and Path Separators have the highest precedence as they require an exact match. Literal values are ordered in reverse lexicographical order, so that longer literal tokens are tested before shorter tokens.
A Compound Named Parameter has second highest precedence, as the requirement to match the comma characters within the matching value makes it more specific than a Named Parameter.
An Optional Compound Named Parameter has third highest precedence. It is less specific than a Compound Named Parameter because it MAY match an empty string.
A Named Parameter has fourth highest precedence, matching one or more characters until the next Path Separator or end of string.
An Optional Named Parameter has fifth highest precedence, matching zero or more characters, not including the Path Separator, until the end of string.
Given the following Route Pattern Set:
The expected ordering of these Route Patterns from most specific to least specific is:
This section is non-normative.
One means by which the specified Route Pattern Ordering can be implemented is to convert each pattern to a canonical string representation and then order the canonical strings in reverse lexicographical order. To accomplish this, each different parameter token in the pattern is replaced with a single low value character as shown in the following list, with the lowest precedence pattern getting the lowest value character, and the highest precedence getting the highest value character.
By applying this table to the patterns in the previous example we can see that the canonical string for each pattern is:
Because the substitute characters used fall in the Reserved Character Set, they will never overlap with any literal tokens, and thus never result in any ambiguous overlap between patterns.
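The canonical-string approach can be sketched as below. The substitute characters in `SUBSTITUTES` are assumptions made for this sketch and are not taken from this specification; they were picked from the reserved set so that a higher-precedence token gets a higher code point. The function names and the sample patterns are likewise hypothetical. The sketch covers only Literal Values and the four parameter kinds whose precedence is described above, and it assumes literal characters sort above the substitutes (true for letters, digits, ‘-’, ‘.’, ‘_’ and ‘~’).

```python
# Hypothetical substitute characters: ')' > '(' > "'" > '&' by code point.
SUBSTITUTES = {
    ("compound", False): ")",  # Compound Named Parameter
    ("compound", True): "(",   # Optional Compound Named Parameter
    ("named", False): "'",     # Named Parameter
    ("named", True): "&",      # Optional Named Parameter
}


def canonical_form(pattern):
    """Replace each parameter token with its single substitute character."""
    body = pattern[1:] if pattern.startswith("/") else pattern
    tokens = []
    for segment in body.split("/"):
        if segment.endswith("*"):
            raise ValueError("Eager and Glob tokens are not covered by this sketch")
        if segment.startswith(":"):
            kind = "compound" if "," in segment else "named"
            tokens.append(SUBSTITUTES[(kind, segment.endswith("?"))])
        else:
            tokens.append(segment)  # Literal Values are kept unchanged
    return "/" + "/".join(tokens)


def order_route_patterns(patterns):
    """Order patterns from most specific to least specific."""
    return sorted(patterns, key=canonical_form, reverse=True)
```

Under these assumptions, `order_route_patterns(["/users/:id", "/posts/:slug?", "/users/new", "/users/:first,last", "/users"])` returns `["/users/new", "/users/:first,last", "/users/:id", "/users", "/posts/:slug?"]`.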