
Development of the MathNat system for the automatic formalization of mathematical texts (Développement du système MathNat pour la formalisation automatique des textes mathématiques)

Abstract : There is a wide gap between the language of mathematics and its formalized versions. The term "language of mathematics" or "mathematical language" refers to the prose that mathematicians use in authoring textbooks and publications. It mainly consists of natural language, symbolic expressions and notations. It is flexible, structured and semantically well understood by mathematicians. However, it is very difficult to formalize automatically. Some of the main reasons are: the complex and rich linguistic features of natural language and its inherent ambiguity; the intermixing of natural language with symbolic mathematics, which causes problems that are unique in kind and therefore introduces further ambiguity; and the possibility of reasoning gaps, which are hard to fill using current state-of-the-art theorem provers (both automated and interactive).

One way to work around this problem is to abandon the use of the language of mathematics. Therefore, in the current state of the art of theorem proving, mathematics is formalized manually in very precise, specific and well-defined logical systems. The languages supported by these systems impose strong restrictions. For instance, these languages have non-ambiguous syntax with a limited number of possible syntactic constructions.

This enterprise divides the world of mathematics into two groups. The first group consists of the vast majority of mathematicians, who rely on the language of mathematics only. In contrast, the second group consists of a minority of mathematicians who use formal systems such as theorem provers (mostly interactive ones) in addition to the language of mathematics.

To bridge the gap between the language of mathematics and its formalized versions, we may ask the following gigantic question: Can we build a program that understands the language of mathematics used by mathematicians, and can we mechanically verify its correctness?

This problem can naturally be divided into two sub-problems, both very hard:

1. Parsing mathematical texts (mainly proofs) and translating those parse trees to a formal language after resolving linguistic issues.
2. Verification of this formal version of mathematics.

The project MathNat (Mathematics in controlled Natural language) aims at being the first step towards solving this problem, focusing mainly on the first sub-problem.

First, we develop a Controlled Language for Mathematics (CLM), which is a precisely defined subset of English with a restricted grammar and dictionary. To make CLM natural and expressive, we support some rich linguistic features such as anaphoric pronouns and references, rephrasing of a sentence in multiple ways, and the proper handling of distributive and collective readings.

Second, we automatically translate CLM to a system-independent formal description language (MathAbs), with the hope of making MathNat accessible to any proof checking system. Currently, we translate MathAbs into equivalent first-order formulas for verification.
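As a rough illustration of the distinction between distributive and collective readings mentioned in the abstract, the following minimal Python sketch translates sentences of the single form "A and B are P" into first-order formula strings. It is not CLM, MathAbs, or any part of MathNat: the one-pattern grammar, the two word lists, the `translate` function, and the output notation (with `&` standing for conjunction) are all assumptions made up for this example.

```python
import re

# Hypothetical toy lexicon, not taken from MathNat.
# Distributive predicates hold of each subject separately;
# collective predicates hold of the subjects jointly.
DISTRIBUTIVE = {"positive", "even", "prime"}
COLLECTIVE = {"equal", "distinct", "coprime"}

# The only sentence shape this sketch accepts: "A and B are P" (optional full stop).
PATTERN = re.compile(r"^(\w+) and (\w+) are (\w+)\.?$")


def translate(sentence: str) -> str:
    """Translate 'A and B are P' into a first-order formula string."""
    match = PATTERN.match(sentence.strip())
    if match is None:
        raise ValueError(f"sentence outside the toy grammar: {sentence!r}")
    a, b, pred = match.groups()
    if pred in DISTRIBUTIVE:
        # Distributive reading: the predicate is applied to each subject.
        return f"{pred}({a}) & {pred}({b})"
    if pred in COLLECTIVE:
        # Collective reading: one predicate relates both subjects.
        return f"{pred}({a}, {b})"
    raise ValueError(f"unknown predicate: {pred!r}")


print(translate("x and y are positive."))  # positive(x) & positive(y)
print(translate("x and y are equal."))     # equal(x, y)
```

The two sentences share the same surface form, yet the first yields a conjunction of two unary atoms and the second a single binary atom. A realistic controlled language would presumably need a full grammar and context handling rather than a fixed word list.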
Document type :
Theses

Cited literature [42 references]

https://tel.archives-ouvertes.fr/tel-00680095
Contributor : Abes Star
Submitted on : Saturday, March 17, 2012 - 4:13:30 PM
Last modification on : Friday, November 6, 2020 - 3:34:12 AM
Long-term archiving on : Monday, June 18, 2012 - 5:15:18 PM

File

Thesis-Humayoun1.20110116.pdf
Version validated by the jury (STAR)

Identifiers

  • HAL Id : tel-00680095, version 1

Citation

Humayoun Muhammad. Développement du système MathNat pour la formalisation automatique des textes mathématiques. Mathématiques générales [math.GM]. Université de Grenoble, 2012. Français. ⟨NNT : 2012GRENM001⟩. ⟨tel-00680095⟩

Metrics

Record views : 754

File downloads : 849