Last week, I had to test what happens to a system when it sends HTTP requests out to third-party webhooks and those third parties fail to respond.
This came about when a customer's endpoint was responding very slowly (sometimes taking more than an hour) to HTTP webhook requests. Eventually it used up all of the threads (on all of the machines) assigned to sending out webhooks, creating a backlog of unsent notifications to other customers.
The root cause turned out to be that the Apache Commons HTTP Client used to send out HTTP requests has no default timeout, so it will happily wait forever for a response.
While we reworked the queuing logic to avoid one customer affecting another, we also needed to put a timeout in place. That was easy enough to do, and easy enough to test in isolation, but I wanted to test it in situ to make sure we’d got it right. I used https://ngrok.io to route traffic from the Web to my local machine and put together a quick Web server in Go to simulate some of the scenarios.
You can start it up with a few parameters to delay sending the first byte back to the HTTP client, then delay sending the rest of the response.
The default behaviour of the Go Web server is to buffer output into 4 KB chunks, which gets in the way of testing what happens when the server sends some data and then hangs, so I set the response to be flushed to the client after each write, as suggested in a Stack Overflow question.
There’s also a client I wrote for testing, which just writes out a full stop (period) when it receives a byte to visually confirm the HTTP server behaviour.
There’s a denial-of-service attack of the same name, https://en.wikipedia.org/wiki/Slowloris_(computer_security), which is basically the same kind of thing that happened to the webhook service my team are supporting, except in reverse: there, slow clients tie up a server's connections, whereas here a slow server tied up our client threads.
The nice photo of the Slow Loris is from https://commons.wikimedia.org/wiki/File:Nycticebus_pygmaeus_002.jpg