Restrict squid.conf preprocessor space characters to SP and HT (#1858)
Here, "preprocessor space" refers to a portion of squid.conf grammar that covers preprocessor directives (e.g., `if` and `include` statements) and separation of a directive name from directive parameters.

This change replaces all known xisspace() uses in preprocessor code with code that only recognizes SP and HT as space characters. Prior to this change, Squid preprocessor also classified ASCII VT and FF characters as space, along with any other locale-specific characters classified as space by isspace(3).

Squid preprocessor still delimits input lines using LF and terminates each read line at the first CR or LF character (if any), whichever comes earlier. This behavior precludes CR and LF characters from reaching individual directive line parsers. However, directive parameter values loaded by ConfigParser when honoring "quoted filename" or parameters(filename) syntax are (still) tokenized using w_space, which includes an ASCII CR character. Thus, a "foo_CR_bar" 9-character sequence (where CR stands for a single carriage return character) is still interpreted as a single "foo_" token by the preprocessor and as two tokens ("foo_" and "_bar") by ConfigParser::strtokFile().

After preprocessing, directive parameters are (still) parsed by parsers specific to that directive type. Many of those parsers are based on ConfigParser::TokenParse(), which effectively uses the same "SP and HT only" logic for inlined directive parameters. However, some of those parsers define "space" differently (and may use non-space-like characters for token delimiters). For example, response_delay_pool still indirectly uses isspace(3) to parse integer parameters, and the following "until eol" directives still use xisspace() to skip space before their parameter value: debug_options, err_html_text, mail_program, sslcrtd_program, and sslcrtvalidator_program.

More than 100 uses of xisspace() and indirect uses of isspace(3) remain in Squid code. Most of those use cases are in configuration parsing code. Restricting all relevant use cases one-by-one may not be practical. On the other hand, restricting all configuration input lines to prohibit VT, FF, CR, and locale-specific characters classified as space by isspace(3) will break rare configs that use those characters in places like URL paths and regexes.

Due to inconsistencies highlighted above, there is no consensus regarding this change among Squid developers. This experiment may need to be reverted or further grammar changes may need to be implemented based on testing and deployment experience.

Co-authored-by: Amos Jeffries <[email protected]>
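
The tokenization difference described above can be illustrated with a minimal standalone sketch. This is not Squid code: isPreprocessorSpace(), chopAtLineEnd(), splitOnAnyOf(), preprocessorTokens(), and fileTokens() are hypothetical helpers written for this example, and the " \t\n\r" delimiter set is an assumption that stands in for a w_space-like set containing CR.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: the "SP and HT only" notion of preprocessor space.
// Unlike isspace(3), it ignores VT, FF, CR, LF, and locale-specific characters.
inline bool isPreprocessorSpace(const char c)
{
    return c == ' ' || c == '\t';
}

// Hypothetical helper: terminate a raw line at the first CR or LF, if any.
inline std::string chopAtLineEnd(const std::string &raw)
{
    const auto end = raw.find_first_of("\r\n");
    return end == std::string::npos ? raw : raw.substr(0, end);
}

// Hypothetical helper: split text into non-empty tokens on any delimiter.
inline std::vector<std::string> splitOnAnyOf(const std::string &text, const std::string &delimiters)
{
    std::vector<std::string> tokens;
    auto pos = text.find_first_not_of(delimiters);
    while (pos != std::string::npos) {
        const auto end = text.find_first_of(delimiters, pos);
        if (end == std::string::npos) {
            tokens.push_back(text.substr(pos)); // last token runs to the end
            break;
        }
        tokens.push_back(text.substr(pos, end - pos));
        pos = text.find_first_not_of(delimiters, end);
    }
    return tokens;
}

// Preprocessor-like view: chop at CR/LF first, then split on SP and HT only.
inline std::vector<std::string> preprocessorTokens(const std::string &raw)
{
    return splitOnAnyOf(chopAtLineEnd(raw), " \t");
}

// File-loading view: split on an assumed w_space-like set that includes CR.
inline std::vector<std::string> fileTokens(const std::string &raw)
{
    return splitOnAnyOf(raw, " \t\n\r");
}

// Replays the 9-character "foo_CR_bar" example from the text.
inline void checkCommitExample()
{
    const std::string input = "foo_\r_bar";
    assert(input.size() == 9);
    assert(preprocessorTokens(input).size() == 1); // just "foo_"
    assert(fileTokens(input).size() == 2);         // "foo_" and "_bar"
}
```

In this sketch, a VT character inside a line is treated as ordinary token content by preprocessorTokens(), which mirrors the post-change preprocessor behavior described above; a parser built on isspace(3) would split there instead.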