High-Performance Logging System

July 1997.

Conventional Web-servers log (archive) a variety of information about each hit or request. However, if a request invokes a program (using CGI, for example), the Web-server has no access to the operation of that program, and cannot log any data about it. This is a major limitation; for example, the server cannot log the number of results that matched a search query.

Logging such information must intrinsically be the responsibility of the program. However, due care must be taken not to adversely impact performance. The logging system should not reduce throughput significantly. Furthermore, failures in the logging mechanism should not affect in any way the proper functioning of the program.

I designed a logging system to satisfy these constraints. It had several components:

The program transmits the information to be logged packed into a UDP packet. Transmitting a UDP packet is very light-weight. Furthermore, a failure further down the line does not affect the program: UDP is connectionless and does not guarantee delivery, so the sender never blocks waiting for an acknowledgement.
The UDP packets are received by one or more listeners (usually on another machine) and saved in a queue, so that bursts of transmission can be accommodated. The queue is guarded by semaphores so that the multiple threads of the listeners do not corrupt the data structure.
The queue resides in shared memory, so archival processes, running in separate threads from the listeners, can retrieve the packets from the queue (again using the semaphores to ensure data integrity) and save them as they please.
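
The three components above can be sketched roughly as follows. This is an illustrative Python sketch, not the original Perl code: the names send_log, BoundedQueue, listen, archive and demo are hypothetical, the queue lives in ordinary process memory rather than in shared memory, and Python threads with threading.Semaphore stand in for the system-level semaphores. It also assumes that a listener drops a packet when the queue is full, which the description above does not specify.

```python
import socket
import threading
from collections import deque


def send_log(sock, addr, message):
    """Fire-and-forget: send one datagram and never let a failure propagate."""
    try:
        sock.sendto(message.encode("utf-8"), addr)
        return True
    except OSError:
        return False  # logging failed, but the calling program carries on


class BoundedQueue:
    """Fixed-capacity FIFO guarded by semaphores (producer/consumer pattern)."""

    def __init__(self, capacity):
        self._items = deque()
        self._mutex = threading.Semaphore(1)         # protects the deque itself
        self._free = threading.Semaphore(capacity)   # counts empty slots
        self._used = threading.Semaphore(0)          # counts filled slots

    def put(self, item, timeout=None):
        if not self._free.acquire(timeout=timeout):
            return False  # assumption: drop the packet when the queue is full
        with self._mutex:
            self._items.append(item)
        self._used.release()
        return True

    def get(self, timeout=None):
        if not self._used.acquire(timeout=timeout):
            return None  # nothing arrived within the timeout
        with self._mutex:
            item = self._items.popleft()
        self._free.release()
        return item


def listen(sock, queue, stop):
    """Listener: receive datagrams and enqueue them until `stop` is set."""
    sock.settimeout(0.1)
    while not stop.is_set():
        try:
            data, _sender = sock.recvfrom(65535)
        except socket.timeout:
            continue
        queue.put(data.decode("utf-8", errors="replace"), timeout=0.1)


def archive(queue, save, stop):
    """Archiver: pull entries off the queue and hand each one to `save`."""
    while not stop.is_set():
        item = queue.get(timeout=0.1)
        if item is not None:
            save(item)


def demo():
    """Send one log line over loopback and return what the archiver saved."""
    rx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    rx.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    queue, stop, saved = BoundedQueue(1024), threading.Event(), []
    workers = [
        threading.Thread(target=listen, args=(rx, queue, stop)),
        threading.Thread(target=archive, args=(queue, saved.append, stop)),
    ]
    for worker in workers:
        worker.start()
    tx = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    send_log(tx, rx.getsockname(), "query=perl results=17")
    stop.wait(0.5)  # give the packet time to travel through the pipeline
    stop.set()
    for worker in workers:
        worker.join()
    tx.close()
    rx.close()
    return saved
```

Calling demo() runs a listener and an archiver on the local machine and sends a single made-up log line through them. The point of the design is visible in send_log: the program that logs only ever performs one non-blocking send, and everything slower (queueing, archival) happens elsewhere.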

All these components were implemented in Perl, with the system-level operations (UDP, semaphores, and shared memory) implemented using the corresponding system-level calls. A bug in the Perl interpreter's handling of semaphores proved an obstacle; I identified and fixed the bug in the Perl source-code and recompiled Perl.

The system ran close to the native speed of the machines, and could handle up to 1000 logging requests a second. In practice, the speed at which the archival system could move the information into Oracle (about 100/s) proved to be the limiting factor; this was outside my control. However, even this lower speed was quite adequate for the purposes of Industry.Net.
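
To make the role of the queue concrete, here is a back-of-the-envelope sketch of how a backlog builds up when packets arrive faster than they are archived. The two rates are the ones quoted above. The 10-second burst length, the function names, and the assumption of constant rates are made up for illustration.

```python
def backlog_after_burst(arrival_rate, drain_rate, burst_seconds):
    """Entries left in the queue after a burst, assuming constant rates."""
    return max(0, (arrival_rate - drain_rate) * burst_seconds)


def drain_time(backlog, drain_rate):
    """Seconds needed to empty the backlog once arrivals have stopped."""
    return backlog / drain_rate


# 1000/s arriving, 100/s archived, for a hypothetical 10-second burst.
backlog = backlog_after_burst(1000, 100, 10)
print(backlog, drain_time(backlog, 100))
```

Under these assumptions the queue would need room for 9000 entries and would take 90 seconds to empty after the burst ends, which suggests sizing the queue for the longest burst expected rather than for the average load.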

Rujith de Silva 1997-05-13