In a nutshell, content negotiation takes an abstract resource URL like http://example.org/2005/chart and maps it to a file on the filesystem, based on the available files and their MIME types and on the MIME types in the requestor’s Accept: header. Given that URL, an Accept: header of image/svg+xml; q=1, image/*; q=0.5, and /www/example.org/2005/chart.svg among the available files, the server would see that there is an image/svg+xml file, which matches the highest preference, and return it along with a Vary: Accept header.
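The selection step can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function names (parse_accept, quality, negotiate) are made up for this post, media-type parameters other than q are ignored, and the chart.png variant in the usage note is hypothetical.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs."""
    ranges = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_range = fields[0].lower()
        if not media_range:
            continue
        q = 1.0  # a range without an explicit q-value counts as q=1
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value.strip())
                except ValueError:
                    q = 0.0  # treat a malformed q-value as "not acceptable"
        ranges.append((media_range, q))
    return ranges


def quality(mime_type, ranges):
    """Return the q-value of the most specific range matching mime_type."""
    main_type = mime_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == mime_type:
            specificity = 2
        elif media_range == main_type + "/*":
            specificity = 1
        elif media_range == "*/*":
            specificity = 0
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


def negotiate(variants, accept_header):
    """Pick the path whose MIME type the client prefers, or None.

    variants maps a file path to its MIME type.
    """
    ranges = parse_accept(accept_header)
    best_path, best_q = None, 0.0
    for path, mime_type in variants.items():
        q = quality(mime_type, ranges)
        if q > best_q:
            best_path, best_q = path, q
    return best_path
```

With chart.svg mapped to image/svg+xml and a hypothetical chart.png mapped to image/png, the Accept: header from the example selects chart.svg, because the exact match carries q=1 while image/* only gives the PNG q=0.5. The response would then carry Vary: Accept so that caches know the choice depended on that header.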
The efficiency problems come from needing to know the available files and their MIME types. At best, an expensive scan for available files happens on the first hit and is cached for subsequent hits. However, cache consistency is a difficult problem, and many of the solutions are as inefficient as no caching at all. Very recent Linux kernels support the inotify mechanism, which could monitor directories efficiently and keep the cache consistent, but it’s not a generally portable solution.
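One portable alternative, not inotify and not necessarily what a real server should do, is to validate a cached listing against the directory's modification time, at the cost of one stat() per hit. The sketch below assumes that approach; cached_listing and _listing_cache are names invented here, and the scheme can miss a change that lands within the filesystem's timestamp granularity.

```python
import os

# directory path -> (directory mtime in ns, sorted list of entry names)
_listing_cache = {}


def cached_listing(directory):
    """Return the names in directory, rescanning only if its mtime changed."""
    mtime = os.stat(directory).st_mtime_ns  # one stat() call per hit
    cached = _listing_cache.get(directory)
    if cached is not None and cached[0] == mtime:
        return cached[1]  # cache hit: no directory scan
    names = sorted(os.listdir(directory))  # cache miss: full linear scan
    _listing_cache[directory] = (mtime, names)
    return names
```

A directory's mtime changes when entries are added, removed, or renamed, which is exactly what affects the set of candidates; edits to a file's contents do not invalidate the listing, and do not need to.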
The simplest implementation would take the URL and check whether it’s immediately satisfiable; this has the same efficiency as normal serving without content negotiation. If the file is not found, the server must perform a directory listing (one open call, some read calls). This gets expensive for huge directories (over 1000 files or so, though the expense depends on the type of filesystem). Candidates are selected, their MIME types mapped, and one is chosen according to the criteria in the HTTP spec. Unless there are extremely many alternatives or an absurdly large Accept: header, this isn’t computationally intensive: on the order of O(m * n), where m is the number of candidate files and n is the number of entries in the Accept: header.
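The lookup described above might look roughly like this in Python. It is a sketch under assumptions: find_variants is a hypothetical name, candidates are taken to be files named after the last URL segment plus an extension, MIME types come from the standard mimetypes module, and there is no protection against path traversal.

```python
import mimetypes
import os


def find_variants(docroot, url_path):
    """Return {file path: MIME type} for files that could satisfy url_path."""
    target = os.path.join(docroot, url_path.lstrip("/"))

    # Fast path: the URL names a real file, so no negotiation is needed.
    if os.path.isfile(target):
        mime_type, _ = mimetypes.guess_type(target)
        return {target: mime_type or "application/octet-stream"}

    directory, base = os.path.split(target)
    variants = {}
    if not base:
        return variants  # URL ended in "/": nothing to negotiate on
    try:
        entries = os.listdir(directory)  # the linear scan of the directory
    except OSError:
        return variants  # missing or unreadable directory
    for name in entries:
        if not name.startswith(base + "."):
            continue  # not a variant of the requested resource
        mime_type, _ = mimetypes.guess_type(name)
        if mime_type is not None:
            variants[os.path.join(directory, name)] = mime_type
    return variants
```

The fast path costs one stat(), the same as ordinary static serving; only a miss pays for the directory scan, which is where large directories hurt.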
However, to send Content-Length: headers, at least one stat() call must be made, and handling dangling symbolic links requires a stat() for every file under consideration (though since dangling links are an edge case, this could be implemented as a fallback rather than as normal operation).
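The fallback idea can be sketched as follows, assuming the candidates have already been ranked best-first. pick_servable is a hypothetical helper: it stats only the preferred file in the common case and moves on to the next candidate when the stat fails, as it does for a dangling link.

```python
import os
import stat


def pick_servable(ranked_paths):
    """Return (path, size) for the first ranked candidate that can be served.

    os.stat() follows symbolic links, so a dangling link raises an OSError
    and the loop falls through to the next-best candidate.
    """
    for path in ranked_paths:
        try:
            info = os.stat(path)
        except OSError:
            continue  # dangling link or vanished file: try the next one
        if stat.S_ISREG(info.st_mode):
            return path, info.st_size  # st_size supplies Content-Length
    return None
```

In the normal case this is a single stat() call, which is needed for Content-Length: anyway; the extra calls happen only when the preferred candidate turns out to be unusable.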
The biggest issues are unusually large directories, where a linear scan of the listing can take a long time, and, if caching is performed, how to keep the cache consistent and still gain from it.
Thoughts are always welcome. I’ll probably implement this in Lighttpd at some point.