haproxy

mirror of https://github.com/haproxy/haproxy.git synced 2026-04-15 21:59:41 -04:00

Author	SHA1	Message	Date
Willy Tarreau	e158b7efb7	CLEANUP: h1: make use of the multi-byte matching functions Instead of leaving the hard-coded non-trivial operations in the H1 parsing code, let's just rely on the new intops functions that do the same and that are less prone to being accidentally touched. It was verified that the resulting code is exactly the same.	2024-04-24 16:05:38 +02:00
Willy Tarreau	b9bf16b382	BUG/MINOR: h1: fix detection of upper bytes in the URI In 1.7 with commit `5f10ea30f4` ("OPTIM: http: improve parsing performance of long URIs") we improved the URI parser's performance on platforms supporting unaligned accesses by reading 4 chars at a time in a 32-bit word. However, as reported in GH issue #2545, there's a bug in the way the top bytes are checked, as the parser will stop when all 4 of them are above 7e instead of when one of them is, so certain patterns can be accepted through if the last ones are all valid. The fix requires to negate the value but on the other hand it allows to parallelize some of the tests and fuse the masks, which could even end up slightly faster. This needs to be backported to all stable versions, but be careful, this code moved a lot over time, from proto_http.c to h1.c, to http_msg.c, to h1.c again. Better just grep for "24242424" or "21212121" in each version to find it. Big kudos to Martijn van Oosterhout (@kleptog) for spotting this problem while analyzing that piece of code, and reporting it.	2024-04-24 11:50:36 +02:00
Willy Tarreau	fadabc430f	CLEANUP: h1: remove unused function h1_measure_trailers() This one stopped being used in 2.1 when HTX became mandatory, let's drop it.	2024-01-31 15:22:12 +01:00
Willy Tarreau	0d76a284b6	BUG/MEDIUM: h1: always reject the NUL character in header values Ben Kallus kindly reported that we still hadn't blocked the NUL character from header values as clarified in RFC9110 and that, even though there's no known issure related to this, it may one day be used to construct an attack involving another component. Actually, both Christopher and I sincerely believed we had done it prior to releasing 2.9, shame on us for missing that one and thanks to Ben for the reminder! The change was applied, it was confirmed to properly reject this NUL byte from both header and trailer values, and it's still possible to force it to continue to be supported using the usual pair of unsafe "option accept-invalid-http-{request\|response}" for those who would like to keep it for whatever reason that wouldn't make sense. This was tagged medium so that distros also remember to apply it as a preventive measure. It should progressively be backported to all versions down to 2.0.	2024-01-31 15:22:12 +01:00
Christopher Faulet	e7964eac2d	BUG/MEDIUM: h1: Ignore C-L value in the H1 parser if T-E is also set In fact, during the parsing there is already a test to remove the Content-Length header if a Transfer-Encoding one is found. However, in the parser, the content-length value was still used to set the body length (the final one and the remaining one). This value is thus also used to set the extra field in the HTX message and is then used during the sending stage to announce the chunk size. So, Content-Length header value must be ignored by the H1 parser to properly reformat the message when it is sent. This patch must be backported as far as 2.6. Lower versions don"t handle this case.	2023-10-04 15:34:18 +02:00
Willy Tarreau	22731762d9	BUG/MINOR: http: skip leading zeroes in content-length values Ben Kallus also noticed that we preserve leading zeroes on content-length values. While this is totally valid, it would be safer to at least trim them before passing the value, because a bogus server written to parse using "strtol(value, NULL, 0)" could inadvertently take a leading zero as a prefix for an octal value. While there is not much that can be done to protect such servers in general (e.g. lack of check for overflows etc), at least it's quite cheap to make sure the transmitted value is normalized and not taken for an octal one. This is not really a bug, rather a missed opportunity to sanitize the input, but is marked as a bug so that we don't forget to backport it to stable branches. A combined regtest was added to h1or2_to_h1c which already validates end-to-end syntax consistency on aggregate headers.	2023-08-09 11:28:48 +02:00
Willy Tarreau	6492f1f29d	BUG/MAJOR: http: reject any empty content-length header value The content-length header parser has its dedicated function, in order to take extreme care about invalid, unparsable, or conflicting values. But there's a corner case in it, by which it stops comparing values when reaching the end of the header. This has for a side effect that an empty value or a value that ends with a comma does not deserve further analysis, and it acts as if the header was absent. While this is not necessarily a problem for the value ending with a comma as it will be cause a header folding and will disappear, it is a problem for the first isolated empty header because this one will not be recontructed when next ones are seen, and will be passed as-is to the backend server. A vulnerable HTTP/1 server hosted behind haproxy that would just use this first value as "0" and ignore the valid one would then not be protected by haproxy and could be attacked this way, taking the payload for an extra request. In field the risk depends on the server. Most commonly used servers already have safe content-length parsers, but users relying on haproxy to protect a known-vulnerable server might be at risk (and the risk of a bug even in a reputable server should never be dismissed). A configuration-based work-around consists in adding the following rule in the frontend, to explicitly reject requests featuring an empty content-length header that would have not be folded into an existing one: http-request deny if { hdr_len(content-length) 0 } The real fix consists in adjusting the parser so that it always expects a value at the beginning of the header or after a comma. It will now reject requests and responses having empty values anywhere in the C-L header. This needs to be backported to all supported versions. Note that the modification was made to functions h1_parse_cont_len_header() and http_parse_cont_len_header(). Prior to 2.8 the latter was in h2_parse_cont_len_header(). One day the two should be refused but the former is also used by Lua. The HTTP messaging reg-tests were completed to test these cases. Thanks to Ben Kallus of Dartmouth College and Narf Industries for reporting this! (this is in GH #2237).	2023-08-09 09:27:38 +02:00
Willy Tarreau	2eab6d3543	BUG/MINOR: h1: do not accept '#' as part of the URI component Seth Manesse and Paul Plasil reported that the "path" sample fetch function incorrectly accepts '#' as part of the path component. This can in some cases lead to misrouted requests for rules that would apply on the suffix: use_backend static if { path_end .png .jpg .gif .css .js } Note that this behavior can be selectively configured using "normalize-uri fragment-encode" and "normalize-uri fragment-strip". The problem is that while the RFC says that this '#' must never be emitted, as often it doesn't suggest how servers should handle it. A diminishing number of servers still do accept it and trim it silently, while others are rejecting it, as indicated in the conversation below with other implementers: https://lists.w3.org/Archives/Public/ietf-http-wg/2023JulSep/0070.html Looking at logs from publicly exposed servers, such requests appear at a rate of roughly 1 per million and only come from attacks or poorly written web crawlers incorrectly following links found on various pages. Thus it looks like the best solution to this problem is to simply reject such ambiguous requests by default, and include this in the list of controls that can be disabled using "option accept-invalid-http-request". We're already rejecting URIs containing any control char anyway, so we should also reject '#'. In the H1 parser for the H1_MSG_RQURI state, there is an accelerated parser for bytes 0x21..0x7e that has been tightened to 0x24..0x7e (it should not impact perf since 0x21..0x23 are not supposed to appear in a URI anyway). This way '#' falls through the fine-grained filter and we can add the special case for it also conditionned by a check on the proxy's option "accept-invalid-http-request", with no overhead for the vast majority of valid URIs. Here this information is available through h1m->err_pos that's set to -2 when the option is here (so we don't need to change the API to expose the proxy). Example with a trivial GET through netcat: [08/Aug/2023:16:16:52.651] frontend layer1 (#2): invalid request backend <NONE> (#-1), server <NONE> (#-1), event #0, src 127.0.0.1:50812 buffer starts at 0 (including 0 out), 16361 free, len 23, wraps at 16336, error at position 7 H1 connection flags 0x00000000, H1 stream flags 0x00000810 H1 msg state MSG_RQURI(4), H1 msg flags 0x00001400 H1 chunk len 0 bytes, H1 body len 0 bytes : 00000 GET /aa#bb HTTP/1.0\r\n 00021 \r\n This should be progressively backported to all stable versions along with the following patch: REGTESTS: http-rules: add accept-invalid-http-request for normalize-uri tests Similar fixes for h2 and h3 will come in followup patches. Thanks to Seth Manesse and Paul Plasil for reporting this problem with detailed explanations.	2023-08-08 19:56:11 +02:00
Willy Tarreau	a8598a2eb1	BUG/CRITICAL: http: properly reject empty http header field names The HTTP header parsers surprizingly accepts empty header field names, and this is a leftover from the original code that was agnostic to this. When muxes were introduced, for H2 first, the HPACK decompressor needed to feed headers lists, and since empty header names were strictly forbidden by the protocol, the lists of headers were purposely designed to be terminated by an empty header field name (a principle that is similar to H1's empty line termination). This principle was preserved and generalized to other protocols migrated to muxes (H1/FCGI/H3 etc) without anyone ever noticing that the H1 parser was still able to deliver empty header field names to this list. In addition to this it turns out that the HPACK decompressor, despite a comment in the code, may successfully decompress an empty header field name, and this mistake was propagated to the QPACK decompressor as well. The impact is that an empty header field name may be used to truncate the list of headers and thus make some headers disappear. While for H2/H3 the impact is limited as haproxy sees a request with missing headers, and headers are not used to delimit messages, in the case of HTTP/1, the impact is significant because the presence (and sometimes contents) of certain sensitive headers is detected during the parsing. Thus, some of these headers may be seen, marked as present, their value extracted, but never delivered to upper layers and obviously not forwarded to the other side either. This can have for consequence that certain important header fields such as Connection, Upgrade, Host, Content-length, Transfer-Encoding etc are possibly seen as different between what haproxy uses to parse/forward/route and what is observed in http-request rules and of course, forwarded. One direct consequence is that it is possible to exploit this property in HTTP/1 to make affected versions of haproxy forward more data than is advertised on the other side, and bypass some access controls or routing rules by crafting extraneous requests. Note, however, that responses to such requests will normally not be passed back to the client, but this can still cause some harm. This specific risk can be mostly worked around in configuration using the following rule that will rely on the bug's impact to precisely detect the inconsistency between the known body size and the one expected to be advertised to the server (the rule works from 2.0 to 2.8-dev): http-request deny if { fc_http_major 1 } !{ req.body_size 0 } !{ req.hdr(content-length) -m found } !{ req.hdr(transfer-encoding) -m found } !{ method CONNECT } This will exclusively block such carefully crafted requests delivered over HTTP/1. HTTP/2 and HTTP/3 do not need content-length, and a body that arrives without being announced with a content-length will be forwarded using transfer-encoding, hence will not cause discrepancies. In HAProxy 2.0 in legacy mode ("no option http-use-htx"), this rule will simply have no effect but will not cause trouble either. A clean solution would consist in modifying the loops iterating over these headers lists to check the header name's pointer instead of its length (since both are zero at the end of the list), but this requires to touch tens of places and it's very easy to miss one. Functions such as htx_add_header(), htx_add_trailer(), htx_add_all_headers() would be good starting points for such a possible future change. Instead the current fix focuses on blocking empty headers where they are first inserted, hence in the H1/HPACK/QPACK decoders. One benefit of the current solution (for H1) is that it allows "show errors" to report a precise diagnostic when facing such invalid HTTP/1 requests, with the exact location of the problem and the originating address: $ printf "GET / HTTP/1.1\r\nHost: localhost\r\n:empty header\r\n\r\n" \| nc 0 8001 HTTP/1.1 400 Bad request Content-length: 90 Cache-Control: no-cache Connection: close Content-Type: text/html <html><body><h1>400 Bad request</h1> Your browser sent an invalid request. </body></html> $ socat /var/run/haproxy.stat <<< "show errors" Total events captured on [10/Feb/2023:16:29:37.530] : 1 [10/Feb/2023:16:29:34.155] frontend decrypt (#2): invalid request backend <NONE> (#-1), server <NONE> (#-1), event #0, src 127.0.0.1:31092 buffer starts at 0 (including 0 out), 16334 free, len 50, wraps at 16336, error at position 33 H1 connection flags 0x00000000, H1 stream flags 0x00000810 H1 msg state MSG_HDR_NAME(17), H1 msg flags 0x00001410 H1 chunk len 0 bytes, H1 body len 0 bytes : 00000 GET / HTTP/1.1\r\n 00016 Host: localhost\r\n 00033 :empty header\r\n 00048 \r\n I want to address sincere and warm thanks for their great work to the team composed of the following security researchers who found the issue together and reported it: Bahruz Jabiyev, Anthony Gavazzi, and Engin Kirda from Northeastern University, Kaan Onarlioglu from Akamai Technologies, Adi Peleg and Harvey Tuch from Google. And kudos to Amaury Denoyelle from HAProxy Technologies for spotting that the HPACK and QPACK decoders would let this pass despite the comment explicitly saying otherwise. This fix must be backported as far as 2.0. The QPACK changes can be dropped before 2.6. In 2.0 there is also the equivalent code for legacy mode, which doesn't suffer from the list truncation, but it would better be fixed regardless. CVE-2023-25725 was assigned to this issue.	2023-02-14 08:48:54 +01:00
Christopher Faulet	e16ffb0383	BUG/MINOR: h1: Replace authority validation to conform RFC3986 Except for CONNECT method, where a normalization is performed, we expected to have an exact match between the authority and the host header value. However it was too strict. Indeed, default port must be handled and the matching must respect the RFC3986. There is already a scheme based normalizeation performed on the URI later, on the HTX message. And we cannot normalize the URI during H1 parsing to be able to report proper errors on the original raw buffer. And a systematic read-only normalization to validate the authority will consume CPU for only few requests. At the end, we decided to perform extra-checks when the exact match fails. Now, following authority/host are considered as equivalent: http: domain.com <=> domain.com:80 <=> domain.com: https: domain.com <=> domain.com:443 <=> domain.com: This patch depends on: * MINOR: h1: Consider empty port as invalid in authority for CONNECT * MINOR: http: Considere empty ports as valid default ports It is a bug regarding the RFC3986. Technically, I may be backported as far as 2.4. However, this must be discussed first. If it is backported, the commits above must be backported too.	2022-11-22 17:49:10 +01:00
Christopher Faulet	75348c2e8b	MINOR: h1: Consider empty port as invalid in authority for CONNECT For now, this change is useless because http_get_host_port() returns IST_NULL when the port is empty. But this will change. For other methods, empty ports are valid. But not for CONNECT method. To still return a 400-Bad-Request if a CONNECT is performed with an empty port, istlen() is used to test the port, instead of isttest().	2022-11-22 16:27:52 +01:00
Willy Tarreau	55d2e8577e	BUILD: h1: silence an initiialized warning with gcc-4.7 and -Os Building h1.c with gcc-4.7 -Os produces the following warning: src/h1.c: In function 'h1_headers_to_hdr_list': src/h1.c:1101:36: warning: 'ptr' may be used uninitialized in this function [-Wmaybe-uninitialized] In fact ptr may be taken from sl.rq.u.ptr which is only initialized after passing through the relevant states, but gcc doesn't know which states are visited. Adding an ALREADY_CHECKED() statement there is sufficient to shut it up and doesn't affect the emitted code. This may be backported to stable versions to make sure that builds on older distros and systems is clean.	2022-10-04 08:02:03 +02:00
Christopher Faulet	3f5fbe9407	BUG/MEDIUM: h1: Improve authority validation for CONNCET request From time to time, users complain to get 400-Bad-request responses for totally valid CONNECT requests. After analysis, it is due to the H1 parser performs an exact match between the authority and the host header value. For non-CONNECT requests, it is valid. But for CONNECT requests the authority must contain a port while it is often omitted from the host header value (for default ports). So, to be sure to not reject valid CONNECT requests, a basic authority validation is now performed during the message parsing. In addition, the host header value is normalized. It means the default port is removed if possible. This patch should solve the issue #1761. It must be backported to 2.6 and probably as far as 2.4.	2022-07-07 09:35:58 +02:00
Tim Duesterhus	7750850594	CLEANUP: Reapply ist.cocci with `--include-headers-for-types --recursive-includes` Previous uses of `ist.cocci` did not add `--include-headers-for-types` and `--recursive-includes` preventing Coccinelle seeing `struct ist` members of other structs. Reapply the patch with proper flags to further clean up the use of the ist API. The command used was: spatch -sp_file dev/coccinelle/ist.cocci -in_place --include-headers --include-headers-for-types --recursive-includes --dir src/	2022-03-21 08:30:47 +01:00
Christopher Faulet	02c893332b	BUG/MEDIUM: h1: Properly reset h1m flags when headers parsing is restarted If H1 headers are not fully received at once, the parsing is restarted a last time when all headers are finally received. When this happens, the h1m flags are sanitized to remove all value set during parsing. But some flags where erroneously preserved. Among others, H1_MF_TE_CHUNKED flag was not removed, what could lead to parsing error. To fix the bug and make things easy, a mask has been added with all flags that must be preserved. It will be more stable. This mask is used to sanitize h1m flags. This patch should fix the issue #1469. It must be backported to 2.5.	2021-12-02 09:46:29 +01:00
Tim Duesterhus	4c8f75fc31	CLEANUP: Apply ist.cocci Make use of the new rules to use `istend()`.	2021-11-08 08:05:39 +01:00
Christopher Faulet	545fbba273	MINOR: h1: Change T-E header parsing to fail if chunked encoding is found twice According to the RFC7230, "chunked" encoding must not be applied more than once to a message body. To handle this case, h1_parse_xfer_enc_header() is now responsible to fail when a parsing error is found. It also fails if the "chunked" encoding is not the last one for a request. To help the parsing, two H1 parser flags have been added: H1_MF_TE_CHUNKED and H1_MF_TE_OTHER. These flags are set, respectively, when "chunked" encoding and any other encoding are found. H1_MF_CHNK flag is used when "chunked" encoding is the last one.	2021-09-28 16:21:25 +02:00
Christopher Faulet	631c7e8665	MEDIUM: h1: Force close mode for invalid uses of T-E header Transfer-Encoding header is not supported in HTTP/1.0. However, softwares dealing with HTTP/1.0 and HTTP/1.1 messages may accept it and transfer it. When a Content-Length header is also provided, it must be ignored. Unfortunately, this may lead to vulnerabilities (request smuggling or response splitting) if an intermediary is only implementing HTTP/1.0. Because it may ignore Transfer-Encoding header and only handle Content-Length one. To avoid any security issues, when Transfer-Encoding and Content-Length headers are found in a message, the close mode is forced. The same is performed for HTTP/1.0 message with a Transfer-Encoding header only. This change is conform to what it is described in the last HTTP/1.1 draft. See also httpwg/http-core#879. Note that Content-Length header is also removed from any incoming messages if a Transfer-Encoding header is found. However it is not true (not yet) for responses generated by HAProxy.	2021-09-28 16:21:25 +02:00
Amaury Denoyelle	69294b20ac	MINOR: http: use http uri parser for authority Replace http_get_authority by the http_uri_parser API. The new function is renamed http_parse_authority. Replace duplicated scheme parsing code by http_parse_scheme invocation. A new http_uri_parser state is declared to mark the authority parsing as done.	2021-07-08 17:11:17 +02:00
Amaury Denoyelle	aad333a9fc	MEDIUM: h1: add a WebSocket key on handshake if needed Add the header Sec-Websocket-Key when generating a h1 handshake websocket without this header. This is the case when doing h2-h1 conversion. The key is randomly generated and base64 encoded. It is stored on the session side to be able to verify response key and reject it if not valid.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	c193823343	MEDIUM: h1: generate WebSocket key on response if needed Add the Sec-Websocket-Accept header on a websocket handshake response. This header may be missing if a h2 server is used with a h1 client. The response key is calculated following the rfc6455. For this, the handshake request key must be stored in the h1 session, as a new field name ws_key. Note that this is only done if the message has been prealably identified as a Websocket handshake request.	2021-01-28 16:37:14 +01:00
Amaury Denoyelle	18ee5c3eb0	MINOR: h1: reject websocket handshake if missing key If a request is identified as a WebSocket handshake, it must contains a websocket key header or else it can be reject, following the rfc6455. A new flag H1_MF_UPG_WEBSOCKET is set on such messages. For the request te be identified as a WebSocket handshake, it must contains the headers: Connection: upgrade Upgrade: websocket This commit is a compagnon of "MEDIUM: h1: generate WebSocket key on response if needed" and "MEDIUM: h1: add a WebSocket key on handshake if needed". Indeed, it ensures that a WebSocket key is added only from a http/2 side and not for a http/1 bogus peer.	2021-01-28 16:37:14 +01:00
Willy Tarreau	f278eec37a	BUILD: tree-wide: cast arguments to tolower/toupper to unsigned char NetBSD apparently uses macros for tolower/toupper and complains about the use of char for array subscripts. Let's properly cast all of them to unsigned char where they are used. This is needed to fix issue #729.	2020-07-05 21:50:02 +02:00
Ilya Shipitsin	47d17182f4	CLEANUP: assorted typo fixes in the code and comments This is 10th iteration of typo fixes	2020-06-26 11:27:28 +02:00
Willy Tarreau	f1d32c475c	REORG: include: move channel.h to haproxy/channel{,-t}.h The files were moved with no change. The callers were cleaned up a bit and a few of them had channel.h removed since not needed.	2020-06-11 10:18:58 +02:00
Willy Tarreau	5413a87ad3	REORG: include: move common/h1.h to haproxy/h1.h The file was moved as-is. There was a wrong dependency on dynbuf.h instead of buf.h which was addressed. There was no benefit to splitting this between types and functions.	2020-06-11 10:18:57 +02:00
Willy Tarreau	0017be0143	REORG: include: split common/http-hdr.h into haproxy/http-hdr{,-t}.h There's only one struct and 2 inline functions. It could have been merged into http.h but that would have added a massive dependency on the hpack parts for nothing, so better keep it this way since hpack is already freestanding and portable.	2020-06-11 10:18:57 +02:00
Willy Tarreau	4c7e4b7738	REORG: include: update all files to use haproxy/api.h or api-t.h if needed All files that were including one of the following include files have been updated to only include haproxy/api.h or haproxy/api-t.h once instead: - common/config.h - common/compat.h - common/compiler.h - common/defaults.h - common/initcall.h - common/tools.h The choice is simple: if the file only requires type definitions, it includes api-t.h, otherwise it includes the full api.h. In addition, in these files, explicit includes for inttypes.h and limits.h were dropped since these are now covered by api.h and api-t.h. No other change was performed, given that this patch is large and affects 201 files. At least one (tools.h) was already freestanding and didn't get the new one added.	2020-06-11 10:18:42 +02:00
Christopher Faulet	7032a3fd0a	BUG/MEDIUM: h1: Don't compare host and authority if only h1 headers are parsed When only request headers are parsed, the host header should not be compared to the request authority because no start-line was parsed. Thus there is no authority. Till now this bug was hidden because this parsing mode was only used for the response in the FCGI multiplexer. Since the HTTP checks refactoring, the request headers may now also be parsed without the start-line. This patch fixes the issue #610. It must be backported to 2.1.	2020-05-04 09:27:01 +02:00
Willy Tarreau	02ac950a11	CLEANUP: http/h1: rely on HA_UNALIGNED_LE instead of checking for CPU families Now that we have flags indicating the CPU's capabilities, better use them instead of missing some updates for new CPU families (ARMv8 was missing there).	2020-02-21 16:32:57 +01:00
Christopher Faulet	1703478e2d	BUG/MINOR: h1: Report the right error position when a header value is invalid During H1 messages parsing, when the parser has finished to parse a full header line, some tests are performed on its value, depending on its name, to be sure it is valid. The content-length is checked and converted in integer and the host header is also checked. If an error occurred during this step, the error position must point on the header value. But from the parser point of view, we are already on the start of the next header. Thus the effective reported position in the error capture is the beginning of the unparsed header line. It is a bit confusing when we try to figure out why a message is rejected. Now, the parser state is updated to point on the invalid value. This way, the error position really points on the right position. This patch must be backported as far as 1.9.	2020-01-06 13:58:21 +01:00
Christopher Faulet	bc7c03eba3	BUG/MINOR: h1: Don't test the host header during response parsing During the H1 message parsing, the host header is tested to be sure it matches the request's authority, if defined. When there are multiple host headers, we also take care they are all the same. Of course, these tests must only be performed on the requests. A host header in a response has no special meaning. This patch must be backported to 2.1.	2019-11-27 14:01:17 +01:00
Christopher Faulet	531b83e039	MINOR: h1: Reject requests if the authority does not match the header host As stated in the RCF7230#5.4, a client must send a field-value for the header host that is identical to the authority if the target URI includes one. So, now, by default, if the authority, when provided, does not match the value of the header host, an error is triggered. To mitigate this behavior, it is possible to set the option "accept-invalid-http-request". In that case, an http error is captured without interrupting the request parsing.	2019-10-14 22:28:50 +02:00
Christopher Faulet	497ab4f519	MINOR: h1: Reject requests with different occurrences of the header host There is no reason for a client to send several headers host. It even may be considered as a bug. However, it is totally invalid to have different values for those. So now, in such case, an error is triggered during the request parsing. In addition, when several headers host are found with the same value, only the first instance is kept and others are skipped.	2019-10-14 22:28:50 +02:00
Christopher Faulet	84f06533e1	BUG/MINOR: h1: Properly reset h1m when parsing is restarted Otherwise some processing may be performed twice. For instance, if the header "Content-Length" is parsed on the first pass, when the parsing is restarted, we skip it because we think another header with the same value was already seen. In fact, it is currently the only existing bug that can be encountered. But it is safer to reset all the h1m on restart to avoid any future bugs. This patch must be backported to 2.0 and 1.9	2019-09-04 10:30:11 +02:00
Christopher Faulet	711ed6ae4a	MAJOR: http: Remove the HTTP legacy code First of all, all legacy HTTP analyzers and all functions exclusively used by them were removed. So the most of the functions in proto_http.{c,h} were removed. Only functions to deal with the HTTP transaction have been kept. Then, http_msg and hdr_idx modules were entirely removed. And finally the structure http_msg was lightened of all its useless information about the legacy HTTP. The structure hdr_ctx was also removed because unused now, just like unused states in the enum h1_state. Note that the memory pool "hdr_idx" was removed and "http_txn" is now smaller.	2019-07-19 09:24:12 +02:00
Christopher Faulet	a51ebb7f56	MEDIUM: h1: Add an option to sanitize connection headers during parsing The flag H1_MF_CLEAN_CONN_HDR has been added to let the H1 parser sanitize connection headers. It means it will remove all "close" and "keep-alive" values during the parsing. One noticeable effect is that connection headers may be unfolded. In practice, this is not a problem because it is not frequent to have multiple values for the connection headers. If this flag is set, during the parsing The function h1_parse_next_connection_header() is called in a loop instead of h1_parse_conection_header(). No need to backport this patch	2019-04-12 22:06:53 +02:00
Christopher Faulet	68b1bbd767	BUG/MEDIUM: h1: Get the h1m state when restarting the headers parsing Since the commit `0f8fb6b7f` ("MINOR: h1: make the H1 headers block parser able to parse headers only"), when headers are not received in one time, a parsing error is returned because the local state in the function h1_headers_to_hdr_list() was not initialized with the previous one (in fact, it was not initialized at all). So now, we start the parsing of headers with the state H1_MSG_HDR_FIRST when the flag H1_MF_HDRS_ONLY is set. Otherwise, we always get it from the h1m. This patch must be backported to 1.9.	2019-01-04 16:23:03 +01:00
Willy Tarreau	0f8fb6b7f9	MINOR: h1: make the H1 headers block parser able to parse headers only Currently the H1 headers parser works for either a request or a response because it starts from the start line. It is also able to resume its processing when it was interrupted, but in this case it doesn't update the list. Make it support a new flag, H1_MF_HDRS_ONLY so that the caller can indicate it's only interested in the headers list and not the start line. This will be convenient to parse H1 trailers.	2019-01-04 10:48:03 +01:00
Willy Tarreau	afba57ae80	REORG: h1: merge types+proto into common/h1.h These two files are self-contained and do not depend on other layers, so let's remerge them together for easier manipulation.	2018-12-11 17:15:13 +01:00
Willy Tarreau	538746ad38	REORG: h1: move legacy http functions to http_msg.c Now that h1 and legacy HTTP are two distinct things, there's no need to keep the legacy HTTP parsers in h1.c since they're only used by the legacy code in proto_http.c, and h1.h doesn't need to include hdr_idx anymore. This concerns the following functions : - http_parse_reqline(); - http_parse_stsline(); - http_msg_analyzer(); - http_forward_trailers(); All of these were moved to http_msg.c.	2018-12-11 17:15:13 +01:00
Christopher Faulet	25da9e34f1	MINOR: h1: Add the flag H1_MF_NO_PHDR to not add pseudo-headers during parsing Some pseudo-headers are added during the headers parsing, mainly for the mux H2. With this flag, it is possible to not add them. This avoid some boring filtering in the mux H1.	2018-10-12 16:15:18 +02:00
Christopher Faulet	1dc2b49556	MINOR: h1: Change the union h1_sl to use indirect strings to store infos Instead of using offsets relating to the parsed buffer to store start line infos, we now use indirect strings. So now, these infos remain valid only if the origin buffer remains untouched. But it's not a real problem because this union is used during the parsing and never stored to a later use.	2018-10-12 16:14:57 +02:00
Christopher Faulet	ff08a92797	MINOR: h1: Add EOH marker during headers parsing When headers parsing ends, a pseudo header with an empty name and an empty value is added to the array of parsed headers to mark its end. It is convenient to loop on this array, but not really useful if we want remove the last header or add a new one, because we don't really know where is the last CRLF (the empty line ending the headers block). So now, instead the name of this pseudo header points on this last CRLF. Its length is still 0 and its value is still empty, so loops on the array remains unchanged.	2018-10-12 16:08:27 +02:00
Christopher Faulet	2912f87443	BUG/MEDIUM: h1: Really skip all updates when incomplete messages are parsed In h1_headers_to_hdr_list, when an incomplete message is parsed, all updates must be skipped until the end of the message is found. Then the parsing is restarted from the beginning. But not all updates were skipped, leading to invalid rewritting or segfault. No backport is needed.	2018-09-19 15:08:05 +02:00
Willy Tarreau	73373ab43a	MEDIUM: h1: deduplicate the content-length header Just like we used to do in proto_http, we now check that each and every occurrence of the content-length header field and each of its values are exactly identical, and we normalize the header to return the last value of the first header with spaces trimmed.	2018-09-14 19:04:28 +02:00
Willy Tarreau	2557f6a3e2	MEDIUM: h1: better handle transfer-encoding vs content-length The transfer-encoding header processing was a bit lenient in this part because it was made to read messages already validated by haproxy. We absolutely need to reinstate the strict processing defined in RFC7230 as is currently being done in proto_http.c. That is, transfer-encoding presence alone is enough to cancel content-length, and must be terminated by the "chunked" token, except in the response where we can fall back to the close mode if it's not last. For this we now use a specific parsing function which updates the flags and we introduce a new flag H1_MF_XFER_ENC indicating that the transfer-encoding header is present. Last, if such a header is found, we delete all content-length header fields found in the message.	2018-09-14 17:40:35 +02:00
Willy Tarreau	2ea6bb5c31	MINOR: h1: add headers to the list after controls, not before This will ease removal/skipping of duplicates such as content-length.	2018-09-14 17:40:35 +02:00
Willy Tarreau	98f5cf7a59	MINOR: h1: parse the Connection header field The new function h1_parse_connection_header() is called when facing a connection header in the generic parser, and it will set up to 3 bits in h1m->flags indicating if at least one "close", "keep-alive" or "upgrade" tokens was seen.	2018-09-13 14:52:31 +02:00
Willy Tarreau	ba5fbca33f	MINOR: h1: report in the h1m struct if the HTTP version is 1.1 or above This will be needed for the mux to know how to process the Connection header, and will save it from having to re-parse the request line since it's captured on the fly.	2018-09-13 14:34:09 +02:00

1 2

79 commits