Boost the Reliability of the Linux Kernel : Debugging kernel oopses

Lisong Guo 1
1 Regal - Large-Scale Distributed Systems and Applications
LIP6 - Laboratoire d'Informatique de Paris 6, Inria Paris-Rocquencourt
Abstract : When a failure occurs in the Linux kernel, the kernel emits an error report called a “kernel oops”, which summarizes the execution context of the failure. Kernel oopses describe real Linux errors, and can thus help prioritize debugging efforts and motivate the design of tools to improve the reliability of Linux code. Nevertheless, the information is only meaningful if it is representative and can be interpreted correctly. In this thesis, we study a collection of kernel oopses gathered over a period of 8 months from a repository maintained by Red Hat. We consider the overall features of the data, the degree to which the data reflects other information about Linux, and the interpretation of features that may be relevant to reliability. We find that the data correlates well with other information about Linux, but that it suffers from duplicate and missing information. We also identify some potential pitfalls in studying features such as the sources of common faults and common failing applications.

In addition, a kernel oops provides valuable first-hand information for a Linux kernel maintainer conducting postmortem debugging, since it logs the status of the Linux kernel at the time of a crash. However, debugging based only on the information in a kernel oops is difficult. To help developers with debugging, we devised a solution that derives the offending line from a kernel oops, i.e., the line of source code that causes the crash. For this, we propose a novel algorithm based on approximate sequence matching, as used in bioinformatics, that automatically pinpoints the offending line using information about nearby machine-code instructions found in a kernel oops. Our algorithm achieves 92% accuracy, compared to 26% for the traditional approach of using only the oops instruction pointer. We integrated the solution into a tool named OOPSA, which relieves developers of some of the burden of kernel oops debugging.
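To illustrate the general idea of approximate sequence matching mentioned in the abstract, the following is a minimal sketch and not the algorithm from the thesis or the OOPSA tool. It assumes that both the code snippet from an oops and the compiled function have already been reduced to lists of instruction mnemonics, and that the snippet is truncated at the faulting instruction, so the end of the best match identifies that instruction. The function name `best_match_end` and the sample instruction lists are hypothetical.

```python
def best_match_end(query, reference):
    """Find where `query` best matches inside `reference`.

    Returns (end, distance): `reference[:end]` ends with the substring
    closest to `query`, and `distance` is the edit distance to it.
    This is the classic approximate-substring dynamic programme.
    """
    # Row for the empty query: it matches anywhere at zero cost.
    prev = [0] * (len(reference) + 1)
    for i, q in enumerate(query, start=1):
        # Column 0: matching i query items against nothing costs i.
        curr = [i] + [0] * len(reference)
        for j, r in enumerate(reference, start=1):
            cost = 0 if q == r else 1
            curr[j] = min(
                prev[j - 1] + cost,  # match or substitution
                prev[j] + 1,         # query item absent from reference
                curr[j - 1] + 1,     # extra item in reference
            )
        prev = curr
    # The smallest value in the last row marks the best match end.
    end = min(range(len(prev)), key=prev.__getitem__)
    return end, prev[end]


# Hypothetical data: mnemonics of a compiled function, and a snippet
# recovered from an oops that differs in one instruction (jne vs. je).
reference = ["push", "mov", "sub", "mov", "test", "je",
             "mov", "call", "add", "pop", "ret"]
query = ["test", "jne", "mov", "call"]

end, distance = best_match_end(query, reference)
# Under the stated assumption, reference[end - 1] is the faulting instruction.
print(end, distance, reference[end - 1])
```

Tolerating a small edit distance is what lets such a match succeed even when the bytes recorded in the oops differ slightly from a locally compiled kernel, which an exact lookup by instruction pointer cannot do.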
Document type :
Theses
Cited literature [120 references]

https://tel.archives-ouvertes.fr/tel-01128792
Contributor : Abes Star
Submitted on : Tuesday, March 10, 2015 - 1:44:05 PM
Last modification on : Friday, January 8, 2021 - 5:46:03 PM
Long-term archiving on : Thursday, June 11, 2015 - 10:55:59 AM

File

2014PA066378.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-01128792, version 1

Citation

Lisong Guo. Boost the Reliability of the Linux Kernel : Debugging kernel oopses. Operating Systems [cs.OS]. Université Pierre et Marie Curie - Paris VI, 2014. English. ⟨NNT : 2014PA066378⟩. ⟨tel-01128792⟩
