Remove URN support #1930

Open
wants to merge 12 commits into
base: master
9 changes: 0 additions & 9 deletions doc/Programming-Guide/03_MajorComponents.dox
@@ -329,13 +329,4 @@ TODO: get RFCs linked from ietf
we have made almost all of the cachemgr information available
via SNMP.

\section URNSupport URN Support
\par
We are experimenting with URN support in Squid version 1.2.
Note, we're not talking full-blown generic URN's here. This
is primarily targeted toward using URN's as an smart way
of handling lists of mirror sites. For more details, please
see (http://squid.nlanr.net/Squid/urn-support.html) URN Support in Squid
.

*/
1 change: 0 additions & 1 deletion doc/debug-sections.txt
@@ -98,7 +98,6 @@ section 49 SNMP Interface
section 49 SNMP support
section 50 Log file handling
section 51 Filedescriptor Functions
section 52 URN Parsing
section 53 AS Number handling
section 53 Radix Tree data structure implementation
section 54 Interprocess Communication
9 changes: 9 additions & 0 deletions doc/release-notes/release-7.sgml.in
@@ -34,6 +34,7 @@ The Squid-@SQUID_RELEASE@ change history can be <url url="https://github.com/squ
<item>Removed purge tool
<item>Remove deprecated languages
<item>Remove Ident protocol support
<item>Remove URN protocol support
</itemize>

<p>Most user-facing changes are reflected in squid.conf (see further below).
@@ -123,6 +124,14 @@ in the position of what used to be a %ui record field.
<p>If necessary, an external ACL helper can be written to perform Ident transactions
and deliver the user identity to Squid through the **user=** annotation.

<sect1>Removed URN protocol support

<p>Squid URN resolution code has been neglected for a very long time and caused
multiple security vulnerabilities. This feature was rarely used (if at all).
Contributor

Come on be honest. All attempts to update the code were vetoed by you. Otherwise this code would very much have been updated by at least four authors in the past 10 years.

Contributor Author

> Come on be honest. All attempts to update the code were vetoed by you. Otherwise this code would very much have been updated by at least four authors in the past 10 years.

I hope that any vetoes were correct, but this discussion and implications of dishonesty feel out of scope: PR text describes code state. It does not speculate about the reasons that led to that code state.


<p>If necessary, a similar feature can be implemented externally, using
url_rewrite_program helpers or adaptation services.
Contributor

That would be a bug. You have removed the ability for Squid to correctly:
a) parse any request-target with "urn:" scheme, and
b) send a valid URN anywhere (including helpers, ICAP, eCAP, and even cache.log - receive unknown:// at most)

A squid lacking "foo:" support should reject all "foo:" URLs on initial parse/validate. Failure to do that re-opens one of those security vulnerabilities I closed off by fixing the URN NID validation.

Contributor Author

> You have removed the ability for Squid to ... send a valid URN anywhere (including helpers, ICAP, eCAP...)

Agreed. I have adjusted PR description and release notes (commit 14914f4) to precondition external support on enhancing Squid to handle unknown (to Squid) URI schemes (which should not be limited to URN scheme, of course). Fortunately, we do not need to debate the details of that hypothetical enhancement -- folks implementing it should initiate that debate outside this PR.
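
For context on the NID validation mentioned in this thread: the grammar quoted in the removed parseUrn() comment (RFC 8141 section 2) requires a NID of 2 to 32 characters, each alphanumeric or "-", with an alphanumeric character at both ends. A minimal standalone sketch of that rule follows. It is an illustration only, not Squid code: `isAlphanum`, `isValidNid`, and `splitUrnBody` are hypothetical names, and NSS characters are left unvalidated, much like the TODO in the removed function.

```cpp
#include <string>

// Hypothetical helpers for illustration; none of these names exist in Squid.

// alphanum = ALPHA / DIGIT (ASCII only)
inline bool isAlphanum(const char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');
}

// NID = alphanum 0*30(ldh) alphanum, where ldh = alphanum / "-"
inline bool isValidNid(const std::string &nid)
{
    if (nid.size() < 2 || nid.size() > 32)
        return false; // too short or too long
    if (!isAlphanum(nid[0]) || !isAlphanum(nid[nid.size() - 1]))
        return false; // must start and end with an alphanumeric character
    for (std::string::size_type i = 0; i < nid.size(); ++i) {
        if (!isAlphanum(nid[i]) && nid[i] != '-')
            return false; // only letters, digits, and '-' are allowed
    }
    return true;
}

// Splits the text that follows "urn:" into NID and NSS at the first ':'.
// Returns false (leaving outputs untouched) when the NID is missing or invalid.
inline bool splitUrnBody(const std::string &body, std::string &nid, std::string &nss)
{
    const std::string::size_type colon = body.find(':');
    if (colon == std::string::npos)
        return false; // no NID delimiter
    const std::string candidate = body.substr(0, colon);
    if (!isValidNid(candidate))
        return false;
    nid = candidate;
    nss = body.substr(colon + 1); // assumption: NSS is not validated in this sketch
    return true;
}
```

With these helpers, `splitUrnBody("isbn:0451450523", nid, nss)` succeeds and yields the NID `isbn`, while a one-character NID such as `a:x` is rejected.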


<sect>Changes to squid.conf since Squid-@SQUID_RELEASE_OLD@
<p>
This section gives an account of those changes in three categories:
1 change: 0 additions & 1 deletion errors/template.am
@@ -48,6 +48,5 @@ ERROR_TEMPLATES = \
templates/ERR_TOO_BIG \
templates/ERR_UNSUP_HTTPVERSION \
templates/ERR_UNSUP_REQ \
templates/ERR_URN_RESOLVE \
Contributor Author

This PR does not update PO/POT files based on an earlier recommendation. If those files should be updated to reflect ERR_URN_RESOLVE removal, please let me know, and we will update them (you can even preview most of those changes in earlier branch commit c00c79a that was later reverted).

Contributor

These templates are used to generate the "langpack" releases which get installed for use by much older versions of Squid. The template file needs to be retained until no supported version of Squid tries to load it on startup.

This is also why the ERR_ESI remains.

Contributor

> The template file needs to be retained

Restored at 0f42fbd.

templates/ERR_WRITE_ERROR \
templates/ERR_ZERO_SIZE_OBJECT
38 changes: 0 additions & 38 deletions errors/templates/ERR_URN_RESOLVE

This file was deleted.

20 changes: 2 additions & 18 deletions src/FwdState.cc
@@ -55,7 +55,6 @@
#include "ssl/PeekingPeerConnector.h"
#include "Store.h"
#include "StoreClient.h"
#include "urn.h"
#if USE_OPENSSL
#include "ssl/cert_validate_message.h"
#include "ssl/Config.h"
@@ -388,19 +387,8 @@ FwdState::Start(const Comm::ConnectionPointer &clientConn, StoreEntry *entry, Ht
return;
}

-switch (request->url.getScheme()) {
-
-case AnyP::PROTO_URN:
-urnStart(request, entry, al);
-return;
-
-default:
-FwdState::Pointer fwd = new FwdState(clientConn, entry, request, al);
-fwd->start(fwd);
-return;
-}
-
-/* NOTREACHED */
+FwdState::Pointer fwd = new FwdState(clientConn, entry, request, al);
+fwd->start(fwd);
}

void
@@ -1272,10 +1260,6 @@ FwdState::dispatch()
Ftp::StartGateway(this);
break;

case AnyP::PROTO_URN:
fatal_dump("Should never get here");
break;

case AnyP::PROTO_WHOIS:
whoisStart(this);
break;
2 changes: 1 addition & 1 deletion src/HttpRequest.cc
@@ -812,7 +812,7 @@ HttpRequest::manager(const CbcPointer<ConnStateData> &aMgr, const AccessLogEntry
char *
HttpRequest::canonicalCleanUrl() const
{
-return urlCanonicalCleanWithoutRequest(effectiveRequestUri(), method, url.getScheme());
+return urlCanonicalCleanWithoutRequest(effectiveRequestUri(), method);
}

/// a helper for handling PortCfg cases of FindListeningPortAddress()
8 changes: 0 additions & 8 deletions src/Makefile.am
@@ -455,8 +455,6 @@ squid_SOURCES = \
tunnel.cc \
tunnel.h \
typedefs.h \
urn.cc \
urn.h \
wccp.cc \
wccp.h \
wccp2.cc \
@@ -1944,8 +1942,6 @@ tests_testHttpRange_SOURCES = \
tools.h \
tests/stub_tunnel.cc \
tunnel.h \
urn.cc \
urn.h \
tests/stub_wccp2.cc \
wccp2.h \
wordlist.cc \
@@ -2331,8 +2327,6 @@ tests_testHttpRequest_SOURCES = \
tools.h \
tests/stub_tunnel.cc \
tunnel.h \
urn.cc \
urn.h \
tests/stub_wccp2.cc \
wccp2.h \
wordlist.cc \
@@ -2626,8 +2620,6 @@ tests_testCacheManager_SOURCES = \
tools.h \
tests/stub_tunnel.cc \
tunnel.h \
urn.cc \
urn.h \
tests/stub_wccp2.cc \
wccp2.h \
wordlist.cc \
1 change: 0 additions & 1 deletion src/adaptation/ecap/Host.cc
@@ -56,7 +56,6 @@ Adaptation::Ecap::Host::Host()
libecap::protocolHttps.assignHostId(AnyP::PROTO_HTTPS);
libecap::protocolFtp.assignHostId(AnyP::PROTO_FTP);
libecap::protocolWais.assignHostId(AnyP::PROTO_WAIS);
libecap::protocolUrn.assignHostId(AnyP::PROTO_URN);
Contributor

Thus rendering eCAP unable to meet the release notes claimed capability of performing Trivial-HTTP Resolver gateway.
("AnyP::PROTO_UNKNOWN" are passed as static string "unknown", not as the received scheme image)

Contributor Author

Host application IDs being configured by this code are an optimization that speeds up string comparison for common cases. eCAP code should function correctly without that optimization. If it does not, it is an out-of-scope bug (in eCAP adapter or host application code).
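
The "IDs are only an optimization" argument above can be illustrated with a small sketch. Assume a protocol name that carries its textual image plus an optional host-assigned integer ID. Comparison uses the IDs when both sides have one and falls back to comparing text otherwise, so the result stays correct when an ID is never assigned. `ProtocolName` and `sameProtocol` are hypothetical stand-ins, not the libecap or Squid API.

```cpp
#include <string>

// Hypothetical stand-in for a protocol name; not the libecap::Name class.
struct ProtocolName {
    std::string image; // textual form, e.g. "http"
    int hostId;        // assumption: 0 means "no host ID assigned"
};

// Compares two names, using the integer IDs as a shortcut when available.
inline bool sameProtocol(const ProtocolName &a, const ProtocolName &b)
{
    // Fast path: both names carry host-assigned IDs, so compare integers.
    if (a.hostId != 0 && b.hostId != 0)
        return a.hostId == b.hostId;
    // Slow path: at least one side has no ID; compare the text instead.
    return a.image == b.image;
}
```

Under this model, dropping an ID assignment for one protocol only moves its comparisons from the fast path to the slow path. Whether the real eCAP code behaves this way is exactly what the two reviewers disagree about.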

libecap::protocolWhois.assignHostId(AnyP::PROTO_WHOIS);
protocolIcp.assignHostId(AnyP::PROTO_ICP);
#if USE_HTCP
2 changes: 0 additions & 2 deletions src/adaptation/ecap/MessageRep.cc
@@ -149,8 +149,6 @@ Adaptation::Ecap::FirstLineRep::protocol() const
return libecap::protocolWais;
case AnyP::PROTO_WHOIS:
return libecap::protocolWhois;
case AnyP::PROTO_URN:
return libecap::protocolUrn;
case AnyP::PROTO_ICP:
return protocolIcp;
#if USE_HTCP
1 change: 0 additions & 1 deletion src/anyp/ProtocolType.h
@@ -32,7 +32,6 @@ typedef enum {
#if USE_HTCP
PROTO_HTCP,
#endif
PROTO_URN,
PROTO_WHOIS,
PROTO_ICY,
PROTO_TLS,
85 changes: 14 additions & 71 deletions src/anyp/Uri.cc
@@ -325,11 +325,6 @@ AnyP::Uri::parse(const HttpRequestMethod& method, const SBuf &rawUrl)
if (scheme == AnyP::PROTO_NONE)
return false; // invalid scheme

if (scheme == AnyP::PROTO_URN) {
parseUrn(tok); // throws on any error
return true;
}

// URLs then have "//"
static const SBuf doubleSlash("//");
if (!tok.skip(doubleSlash))
@@ -531,48 +526,6 @@ AnyP::Uri::parse(const HttpRequestMethod& method, const SBuf &rawUrl)
}
}

/**
* Governed by RFC 8141 section 2:
*
* assigned-name = "urn" ":" NID ":" NSS
* NID = (alphanum) 0*30(ldh) (alphanum)
* ldh = alphanum / "-"
* NSS = pchar *(pchar / "/")
*
* RFC 3986 Appendix D.2 defines (as deprecated):
*
* alphanum = ALPHA / DIGIT
*
* Notice that NID is exactly 2-32 characters in length.
*/
void
AnyP::Uri::parseUrn(Parser::Tokenizer &tok)
{
static const auto nidChars = CharacterSet("NID","-") + CharacterSet::ALPHA + CharacterSet::DIGIT;
static const auto alphanum = (CharacterSet::ALPHA + CharacterSet::DIGIT).rename("alphanum");
SBuf nid;
if (!tok.prefix(nid, nidChars, 32))
throw TextException("NID not found", Here());

if (!tok.skip(':'))
throw TextException("NID too long or missing ':' delimiter", Here());

if (nid.length() < 2)
throw TextException("NID too short", Here());

if (!alphanum[*nid.begin()])
throw TextException("NID prefix is not alphanumeric", Here());

if (!alphanum[*nid.rbegin()])
throw TextException("NID suffix is not alphanumeric", Here());

setScheme(AnyP::PROTO_URN, nullptr);
host(nid.c_str());
// TODO validate path characters
path(tok.remaining());
debugs(23, 3, "Split URI into proto=urn, nid=" << nid << ", " << Raw("path",path().rawContent(),path().length()));
}

/// Extracts and returns a (suspected but only partially validated) uri-host
/// IPv6address, IPv4address, or reg-name component. This function uses (and
/// quotes) RFC 3986, Section 3.2.2 syntax rules.
@@ -695,23 +648,18 @@ AnyP::Uri::absolute() const

absolute_.append(getScheme().image());
absolute_.append(":",1);
if (getScheme() != AnyP::PROTO_URN) {
absolute_.append("//", 2);
const bool allowUserInfo = getScheme() == AnyP::PROTO_FTP ||
getScheme() == AnyP::PROTO_UNKNOWN;

if (allowUserInfo && !userInfo().isEmpty()) {
static const CharacterSet uiChars = CharacterSet(UserInfoChars())
.remove('%')
.rename("userinfo-reserved");
absolute_.append(Encode(userInfo(), uiChars));
absolute_.append("@", 1);
}
absolute_.append(authority());
} else {
absolute_.append(host());
absolute_.append(":", 1);
absolute_.append("//", 2);
const bool allowUserInfo = getScheme() == AnyP::PROTO_FTP ||
getScheme() == AnyP::PROTO_UNKNOWN;
Contributor

This needs to exclude URI with image() of "urn:" (which is now part of AnyP::PROTO_UNKNOWN) or we open a security vulnerability for sensitive data exfiltration.

Contributor

Also, these changes are what render ICAP and helpers unable to meet the release notes claimed capability of performing Trivial-HTTP Resolver gateway.

Contributor Author

@rousskov Nov 26, 2024

> This needs to exclude URI with image() of "urn:" (which is now part of AnyP::PROTO_UNKNOWN) or we open a security vulnerability for sensitive data exfiltration.

PR code treats all unknown (to Squid) URI schemes the same. This code had received unknown non-URN schemes prior to PR changes. Thus, the "we open a vulnerability" assertion is false: Either that vulnerability existed before these changes, or these changes do not open it.

> Also, these changes are what render ICAP and helpers unable to meet the release notes claimed capability of performing Trivial-HTTP Resolver gateway.

That problem was flagged and addressed in another change request. If necessary, let's continue this part of the discussion there.
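
The rule under discussion in this thread (userinfo is emitted only for FTP and for schemes unknown to Squid, and every unknown scheme is treated alike) can be sketched in isolation. This is a simplified model of the post-change logic shown in the hunk, not the actual `AnyP::Uri::absolute()` code: `Scheme`, `UriParts`, `allowsUserInfo`, and `absoluteForm` are hypothetical names, and the %-encoding of userinfo done by the real code is omitted.

```cpp
#include <string>

// Hypothetical, reduced model of a parsed URI; not Squid's AnyP::Uri.
enum Scheme { SchemeHttp, SchemeFtp, SchemeUnknown };

struct UriParts {
    Scheme scheme;
    std::string schemeImage; // scheme text as received, e.g. "http"
    std::string userInfo;    // may be empty
    std::string authority;   // host[:port]
    std::string path;
};

// Mirrors the allowUserInfo condition in the hunk: FTP or unknown scheme.
inline bool allowsUserInfo(const Scheme s)
{
    return s == SchemeFtp || s == SchemeUnknown;
}

// Builds scheme "://" [userinfo "@"] authority path.
inline std::string absoluteForm(const UriParts &u)
{
    std::string out = u.schemeImage + "://";
    if (allowsUserInfo(u.scheme) && !u.userInfo.empty())
        out += u.userInfo + "@"; // assumption: real code %-encodes userinfo first
    out += u.authority;
    out += u.path;
    return out;
}
```

In this model an HTTP URI loses its userinfo, while FTP and any unknown scheme keep it. That is why the reply argues the behavior for unknown schemes is unchanged by the PR.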


if (allowUserInfo && !userInfo().isEmpty()) {
static const CharacterSet uiChars = CharacterSet(UserInfoChars())
.remove('%')
.rename("userinfo-reserved");
absolute_.append(Encode(userInfo(), uiChars));
absolute_.append("@", 1);
}
absolute_.append(authority());
absolute_.append(path()); // TODO: Encode each URI subcomponent in path_ as needed.
}

@@ -723,15 +671,15 @@ AnyP::Uri::absolute() const
* and never copy the query-string part in the first place
*/
char *
-urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &method, const AnyP::UriScheme &scheme)
+urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &method)
{
LOCAL_ARRAY(char, buf, MAX_URL);

snprintf(buf, sizeof(buf), SQUIDSBUFPH, SQUIDSBUFPRINT(url));
buf[sizeof(buf)-1] = '\0';

-// URN, CONNECT method, and non-stripped URIs can go straight out
-if (Config.onoff.strip_query_terms && !(method == Http::METHOD_CONNECT || scheme == AnyP::PROTO_URN)) {
+// CONNECT method and non-stripped URIs can go straight out
+if (Config.onoff.strip_query_terms && method != Http::METHOD_CONNECT) {
// strip anything AFTER a question-mark
// leaving the '?' in place
if (auto t = strchr(buf, '?')) {
@@ -814,10 +762,6 @@ urlIsRelative(const char *url)
void
AnyP::Uri::addRelativePath(const char *relUrl)
{
// URN cannot be merged
if (getScheme() == AnyP::PROTO_URN)
return;

// TODO: Handle . and .. segment normalization

const auto lastSlashPos = path_.rfind('/');
@@ -962,7 +906,6 @@ urlCheckRequest(const HttpRequest * r)
/* does method match the protocol? */
switch (r->url.getScheme()) {

case AnyP::PROTO_URN:
case AnyP::PROTO_HTTP:
return true;

9 changes: 2 additions & 7 deletions src/anyp/Uri.h
@@ -23,7 +23,6 @@ namespace AnyP

/**
* Represents a Uniform Resource Identifier.
* Can store both URL or URN representations.
*
* Governed by RFC 3986
*/
@@ -138,8 +137,6 @@ class Uri
SBuf &absolute() const;

private:
void parseUrn(Parser::Tokenizer&);

SBuf parseHost(Parser::Tokenizer &) const;
int parsePort(Parser::Tokenizer &) const;

@@ -192,9 +189,7 @@ operator <<(std::ostream &os, const Uri &url)
os << url.getScheme().image();
os << ":";

-// no authority section on URN
-if (url.getScheme() != PROTO_URN)
-os << "//" << url.authority();
+os << "//" << url.authority();

// path is what it is - including absent
os << url.path();
@@ -211,7 +206,7 @@ void urlInitialize(void);
/// call HttpRequest::canonicalCleanUrl() instead if you have HttpRequest
/// \returns a pointer to a local static buffer containing request URI
/// that honors strip_query_terms and %-encodes unsafe URI characters
-char *urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &, const AnyP::UriScheme &);
+char *urlCanonicalCleanWithoutRequest(const SBuf &url, const HttpRequestMethod &);
const char *urlCanonicalFakeHttps(const HttpRequest * request);
bool urlIsRelative(const char *);
char *urlRInternal(const char *host, unsigned short port, const char *dir, const char *name);
2 changes: 1 addition & 1 deletion src/anyp/UriScheme.h
@@ -25,7 +25,7 @@ using KnownPort = uint16_t;
/// validated/supported port number (if any)
using Port = std::optional<KnownPort>;

-/** This class represents a URI Scheme such as http:// https://, wais://, urn: etc.
+/** This class represents a URI Scheme such as http:// https://, wais:// etc.
* It does not represent the PROTOCOL that such schemes refer to.
*/
class UriScheme
2 changes: 1 addition & 1 deletion src/cf.data.pre
@@ -1261,7 +1261,7 @@ ENDIF
# destination TCP port (or port range) of the request [fast]
#
# Port 0 matches requests that have no explicit and no default destination
-# ports (e.g., HTTP requests with URN targets)
+# ports (e.g., HTTP requests with ICY, ICP, and HTCP targets).

acl aclname localport 3128 ... # TCP port the client connected to [fast]
# NP: for interception mode this is usually '80'
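
The port-0 rule documented in the cf.data.pre comment above (a request matches port 0 when it has neither an explicit nor a default destination port) can be sketched as a two-step lookup. This is a simplified illustration under stated assumptions, not Squid's ACL code: `defaultPortFor` and `portForAcl` are hypothetical names, the default-port table lists only three well-known schemes, and 0 is used here to mean "no port".

```cpp
#include <string>

// Hypothetical default-port table; returns 0 when the scheme has no default.
inline unsigned short defaultPortFor(const std::string &scheme)
{
    if (scheme == "http")
        return 80;
    if (scheme == "https")
        return 443;
    if (scheme == "ftp")
        return 21;
    return 0; // no default destination port known for this scheme
}

// Port value a "port" ACL would see in this sketch:
// the explicit port if present, else the scheme default, else 0.
inline unsigned short portForAcl(const std::string &scheme, const unsigned short explicitPort)
{
    if (explicitPort != 0)
        return explicitPort;
    return defaultPortFor(scheme);
}
```

In this model, a target whose scheme has no default port and whose URI names no port yields 0, which is the case the updated comment describes for ICY, ICP, and HTCP targets.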