In a Ruby on Rails monolith serving 10M+ daily requests for an e-commerce marketplace, the router sits on the critical path for every request. A small regression in route matching can increase P99 latency, cause incorrect controller dispatch (user-facing 404s), or accidentally expose admin endpoints if precedence is wrong.
You’re asked to explain, at an implementation level, how a Rails-style router could work.
GET /users/42/orders/9 when you have static routes (e.g., /users/new), dynamic segments (e.g., /users/:id), and wildcards (e.g., /assets/*path)? What precedence rules prevent /users/new from being captured by /users/:id?:id) and wildcard captures extracted efficiently and safely?Discuss internal representation (e.g., tries/DFAs/compiled regex), trade-offs (memory vs speed), complexity, and common pitfalls (route shadowing, ambiguous matches, and pathological patterns). You don’t need to write Ruby, but your explanation should be concrete enough to implement in any language.
A robust router must define deterministic precedence to avoid ambiguous dispatch. Typically, routes with more static segments win over dynamic segments, and wildcards are lowest priority; ties can be broken by declaration order.
routes = [
"/users/new", # static
"/users/:id", # dynamic
"/assets/*path" # wildcard
]
# /users/new must match the static route, not :id
Instead of matching entire strings, split paths into segments and match per segment. This makes it easier to reason about precedence and enables trie-based indexing for fast lookups.
def split_path(p: str) -> list[str]:
return [seg for seg in p.split('/') if seg]
# "/users/42/orders/9" -> ["users", "42", "orders", "9"]
A trie stores routes by segments, allowing O(L) matching where L is number of segments, rather than O(R) over all routes. Nodes can have separate edges for static literals, a single dynamic edge (":id"), and a wildcard edge ("*path").
class Node:
def __init__(self):
self.static = {} # literal -> Node
self.dynamic = None # (param_name, Node)
self.wildcard = None # (param_name, Node)
self.handler = None
Compiling routes into regexes can be simple but may devolve into linear scans unless grouped/indexed. A trie is predictable and fast for typical REST paths, while regex shines for complex constraints but can introduce backtracking risks.
Key pitfalls include route shadowing (a generic dynamic route hiding a specific one), ambiguous matches, and unsafe wildcard captures. Correct routers also normalize paths (trailing slashes, URL decoding rules) consistently before matching.