CNP content selectors (draft)

Overview

This is a draft of the select CNP parameter, which selects a subset of the content in a response body.

Syntax

The parameter would have the following form:

select={selector}:{query}

Here, {selector} is a fixed string that determines how the query is parsed, and {query} is the query that the chosen selector runs against the content.

The selector itself could also be provided in the query part of a CNP URL:

cnp://{host}/{path}?{selector}:{query}

Or, perhaps, as an attribute-value pair in it:

cnp://{host}/{path}?select={selector}:{query}
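
As a rough illustration of the {selector}:{query} form, the value could be split as in the sketch below. The function name parse_select is hypothetical, and splitting at the first colon is an assumption: the draft does not say whether a query may itself contain colons.

```python
from typing import Tuple


def parse_select(value: str) -> Tuple[str, str]:
    """Split a select value into (selector, query).

    Assumption: the first ':' separates the selector name from the query,
    so any later colons are kept as part of the query.
    """
    selector, sep, query = value.partition(":")
    if not sep or not selector:
        # No colon at all, or an empty selector name.
        raise ValueError(f"malformed select value: {value!r}")
    return selector, query
```

With this sketch, "info:" yields the selector "info" with an empty query, and "byte:10-20" yields the selector "byte" with the query "10-20".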

Functionality

The result of a selector is one of the following:

  • an error with reason=not_supported, for an unsupported selector;

  • an error with reason=invalid, for a query string that is invalid for the specified selector;

  • the entire requested file without changes, as if no select parameter had been provided in the request;

  • the contents of the requested path filtered through the requested selector query, with the select response parameter set to the same value as the request parameter.

The last case is the one where the selector was actually used. The returned content should generally be based on the requested path.

Only one selector can be provided in a request. To select multiple subsets of a page, use multiple requests (perhaps in combination with batch requests).
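
The possible outcomes described in this section could be modeled on the server side roughly as follows. This is only a sketch under several assumptions: the names apply_select and InvalidQuery, the (intent, parameters, body) triple, the "ok" and "error" intent strings, and the convention that a handler returns None to fall back to the whole file are all hypothetical and not defined by this draft.

```python
from typing import Callable, Dict, Optional, Tuple


class InvalidQuery(Exception):
    """Raised by a handler when the query is not valid for its selector."""


# A handler receives the full content and the query string. It returns the
# selected bytes, or None to decline (assumed convention for "whole file").
Handler = Callable[[bytes, str], Optional[bytes]]
Response = Tuple[str, Dict[str, str], bytes]


def apply_select(content: bytes, select: str,
                 handlers: Dict[str, Handler]) -> Response:
    """Map one select value onto one of the four outcomes."""
    selector, _, query = select.partition(":")
    handler = handlers.get(selector)
    if handler is None:
        return "error", {"reason": "not_supported"}, b""
    try:
        selected = handler(content, query)
    except InvalidQuery:
        return "error", {"reason": "invalid"}, b""
    if selected is None:
        # Selector ignored: answer as if no select parameter was sent,
        # so no select response parameter is set.
        return "ok", {}, content
    # Selector actually used: echo the request value back unchanged.
    return "ok", {"select": select}, selected
```

A client can tell the last two cases apart by checking whether the select parameter is present in the response.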

An if_modified parameter may only be set on a request for a cached page when all of the host, path and selector are equivalent to the cached ones.
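
One way a client might enforce this rule is to make the selector part of its cache key, as sketched below. CacheKey and may_send_if_modified are hypothetical names, and "equivalent" is approximated here by plain string equality, which may be stricter than what the protocol intends.

```python
from typing import NamedTuple, Optional


class CacheKey(NamedTuple):
    host: str
    path: str
    select: Optional[str]  # None when the page was fetched without a selector


def may_send_if_modified(cached: CacheKey, request: CacheKey) -> bool:
    """Allow if_modified only when host, path and selector all match.

    Assumption: equivalence is tested by exact string comparison.
    """
    return cached == request
```

Under this sketch, a page cached with select=byte:0-100 would not justify an if_modified on a request for the same path with no selector or with a different byte range.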

Selectors

The following selectors would be defined as the standard ones:

byte

select=byte:{from}-{to}
select=byte:{from}-
select=byte:-{to}
select=byte:-

The byte selector selects a subset of the content bytes. The {from} value represents the start byte; it defaults to 0 when absent. The {to} value represents the end byte; it defaults to the end of file when absent or larger than the file length. If {to} is less than {from}, it should be treated as if they were equal (thus resulting in an empty response).

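The byte selector is mostly equivalent to the HTTP Range: bytes={from}-{to} header with only one byte range. One difference is that HTTP byte ranges include the end byte, whereas the empty result for equal {from} and {to} implies that {to} is exclusive here.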
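
The defaulting and clamping rules above can be sketched as follows. The function name select_bytes is hypothetical. The sketch treats {to} as exclusive, which is inferred from the rule that equal {from} and {to} give an empty response, and it raises ValueError for malformed queries, which a server would presumably report as reason=invalid.

```python
def select_bytes(content: bytes, query: str) -> bytes:
    """Apply a byte selector query such as '10-20', '10-', '-20' or '-'."""
    start_text, sep, end_text = query.partition("-")
    if not sep:
        raise ValueError("byte query must contain '-'")
    for part in (start_text, end_text):
        # Each side is either absent or a plain ASCII decimal number.
        if part and not (part.isascii() and part.isdigit()):
            raise ValueError(f"not a byte offset: {part!r}")

    start = int(start_text) if start_text else 0           # {from} defaults to 0
    end = int(end_text) if end_text else len(content)      # {to} defaults to end of file
    end = min(end, len(content))                           # clamp {to} to the file length
    end = max(end, start)                                  # {to} < {from}: treat as equal
    return content[start:end]                              # assumed: {to} is exclusive
```

For a ten-byte file containing 0123456789, the query 2-5 selects 234 under these assumptions, 7- selects 789, and 5-2 selects nothing.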

cnm

select=cnm:{selector}

The cnm selector selects the content based on CNM selectors.

For example, with the following document:

title
	Foo
content
	text
		Bar
	section Baz
		section Quux
			text
				Qwe
		text
			Asd
		list
			text
				Zxc
			section 123
				text
					Test

The following selectors can be applied; each select= line is followed by the content it returns:

select=cnm:/Baz/Quux
content
	section Baz
		section Quux
			text
				Qwe
select=cnm:!#Baz
content
	section Baz
		section Quux
		text
			Asd
		list
			text
				Zxc
			section 123
select=cnm:!
title
	Foo
content
	text
		Bar
	section Baz
select=cnm:!/
content
	text
		Bar
	section Baz

info

select=info:

The info selector selects the information about the provided path. It takes no query string; providing one results in an error with reason=invalid.

The response body (not header) should contain the CNP header that would have been used to answer this request without the selector. That header line should include all information, including the length parameter, if applicable.

The info selector replaces HTTP HEAD requests. This way, a CNP response can be understood on its own without requiring the request as context. If the info header had been provided as the response header, the length parameter would not match the actual body length (of zero) and would have to be handled specially.
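
A minimal sketch of this behavior, assuming the server can already produce the header line it would have sent without the selector. The function name select_info, the parameter dictionary, and the UTF-8 encoding of the header text are assumptions, and the header text in the usage line is made up rather than an exact CNP header.

```python
from typing import Dict, Tuple


def select_info(query: str, plain_header: str) -> Tuple[Dict[str, str], bytes]:
    """Build the response for select=info:.

    plain_header is the header line that would have answered the same
    request without the selector; it is moved into the response body.
    """
    if query:
        # The info selector takes no query string (reason=invalid).
        raise ValueError("info selector takes no query string")
    body = plain_header.encode("utf-8")
    # The length parameter describes the actual body: the header text itself.
    params = {"select": "info:", "length": str(len(body))}
    return params, body


# Usage with a made-up header text (format assumed, not taken from the draft):
params, body = select_info("", "ok length=5120 type=text/plain")
```

Here the length reported inside the body still describes the full file, while the length in the response parameters describes the info body, so neither needs special handling.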

Other selectors

Other selectors can also be defined later.

Some ideas:

  • options: similar to an HTTP OPTIONS request, but lists the supported features for the specified path (e.g. available selectors).

  • time: select a timespan in e.g. video or audio content (could replace byte range requests for seeking streaming videos).

  • css or xpath: select parts of HTML documents.

  • line: select a line (or line range) in a text document or source code.

  • language: choose the language of the document (content negotiation); a possible problem is that it can't be combined with other selectors, since only one selector is allowed per request.

  • type: content type negotiation, useful for APIs.