llhttp

HTTP parser in LLVM IR

Usage no npm install needed!

<script type="module">
  import llhttp from 'https://cdn.skypack.dev/llhttp';
</script>

README

llhttp

Build Status

Port of http_parser to llparse.

Why?

Let's face it, http_parser is practically unmaintainable. Even introduction of a single new method results in a significant code churn.

This project aims to:

  • Make it maintainable
  • Verifiable
  • Improving benchmarks where possible

How?

Over time, different approaches for improving http_parser's code base were tried. However, all of them failed due to resulting significant performance degradation.

This project is a port of http_parser to TypeScript. llparse is used to generate the output C and/or bitcode artifacts, which could be compiled and linked with the embedder's program (like Node.js).

Peformance

So far llhttp outperforms http_parser:

input size bandwidth reqs/sec time
llhttp (C) 8192.00 mb 1497.88 mb/s 3020458.87 ops/sec 5.47 s
llhttp (bitcode) 8192.00 mb 1131.75 mb/s 2282171.24 ops/sec 7.24 s
http_parser 8192.00 mb 694.66 mb/s 1406180.33 req/sec 11.79 s

llhttp is faster by approximately 116%.

Maintenance

llhttp project has about 1400 lines of TypeScript code describing the parser itself and around 450 lines of C code and headers providing the helper methods. The whole http_parser is implemented in approximately 2500 lines of C, and 436 lines of headers.

All optimizations and multi-character matching in llhttp are generated automatically, and thus doesn't add any extra maintenance cost. On the contrary, most of http_parser's code is hand-optimized and unrolled. Instead describing "how" it should parse the HTTP requests/responses, a maintainer should implement the new features in http_parser cautiously, considering possible performance degradation and manually optimizing the new code.

Verification

The state machine graph is encoded explicitly in llhttp. The llparse automatically checks the graph for absence of loops and correct reporting of the input ranges (spans) like header names and values. In the future, additional checks could be performed to get even stricter verification of the llhttp.

Usage

#include "llhttp.h"

llhttp_t parser;
llhttp_settings_t settings;

/* Initialize user callbacks and settings */
llhttp_settings_init(&settings);

/* Set user callback */
settings.on_message_complete = handle_on_message_complete;

/* Initialize the parser in HTTP_BOTH mode, meaning that it will select between
 * HTTP_REQUEST and HTTP_RESPONSE parsing automatically while reading the first
 * input.
 */
llhttp_init(&parser, HTTP_BOTH, &settings);

/* Use `llhttp_set_type(&parser, HTTP_REQUEST);` to override the mode */

/* Parse request! */
const char* request = "GET / HTTP/1.1\r\n\r\n";
int request_len = strlen(request);

enum llhttp_errno err = llhttp_execute(&parser, request, request_len);
if (err == HPE_OK) {
  /* Successfully parsed! */
} else {
  fprintf(stderr, "Parse error: %s %s\n", llhttp_errno_name(err),
          parser.reason);
}

LICENSE

This software is licensed under the MIT License.

Copyright Fedor Indutny, 2018.

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.