
It is necessary and possible to build (multilingual)
NL-based restricted e-commerce systems
with mixed sublanguage and content-oriented methods

Abstract : A survey of available e-commerce systems shows that none of them can handle spontaneous user requests online. Some systems avoid the hard problem of supporting a free natural language interface by simplifying the interaction style, using either form filling or controlled languages. Other systems failed to support a free natural language interface because they used inadequate NLP techniques.
The purpose of this thesis is to show that it is necessary and possible to build (multilingual) NL-based e-commerce systems with mixed sublanguage and content-oriented methods. Our hypothesis is that analyzing the sublanguage and integrating content-oriented methods increases the accuracy and robustness of the processing.
To verify this assumption, we built an experimental system as a proof of concept. The system is an SMS-based platform for selling and buying through classified ads. It allows users to post classified ads for the goods they would like to sell and to search for the goods they want, through a full natural language interface. To analyze the sublanguage, we first used a web-based corpus to build a basic system covering the Cars and Real Estate domains. This initial experimental deployment served to collect real, spontaneous SMS data, which were then used to fine-tune the system.
To enable semantic processing, a content representation language is defined to capture the meaning of a classified ad post. The semantic grammars for content extraction are coded in EnCo, a specialized language for linguistic programming that we previously used to develop the first Arabic-UNL enconverter. To make coding in EnCo more systematic and efficient, we have developed a supporting methodology.
Response generation is based on semantic matching (between “looking for” and “sell” posts) and reasoning, and it can handle “no answer” situations. Unlike other experimental systems, CATS was designed from the beginning to be a “production system”. It is currently deployed in Jordan by the largest mobile operator (Fastlink), after passing intensive testing by the operator. Testing the content extraction component on real, noisy free text shows a 90% F-measure. The average response time is around 10 to 30 seconds, measured at peak time (10 posts/minute).
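The abstract does not describe the matching procedure in detail, so the following is only a minimal sketch of how matching a “looking for” post against “sell” posts, with a fallback for the “no answer” case, might look. The `Post` structure, the attribute names, the range encoding as `(low, high)` tuples, and the relaxation strategy are all assumptions made for illustration; they are not the content representation language or the reasoning component of the thesis.

```python
from dataclasses import dataclass, field


@dataclass
class Post:
    """Hypothetical flat representation of an extracted classified ad."""
    kind: str                      # "sell" or "looking_for"
    domain: str                    # e.g. "cars" or "real_estate"
    attrs: dict = field(default_factory=dict)


def satisfies(wanted, have) -> bool:
    """A (low, high) tuple is read as an inclusive range; anything else as equality."""
    if isinstance(wanted, tuple):
        low, high = wanted
        return low <= have <= high
    return wanted == have


def count_hits(query: Post, offer: Post) -> int:
    """Number of the query's constraints that the offer satisfies."""
    return sum(
        1
        for key, wanted in query.attrs.items()
        if key in offer.attrs and satisfies(wanted, offer.attrs[key])
    )


def match(query: Post, posts: list) -> tuple:
    """Return ("exact", offers) if any offer meets every constraint,
    otherwise ("closest", offers ranked by satisfied constraints)."""
    candidates = [
        p for p in posts if p.kind == "sell" and p.domain == query.domain
    ]
    exact = [o for o in candidates if count_hits(query, o) == len(query.attrs)]
    if exact:
        return "exact", exact
    # "No answer" fallback (assumed strategy): relax the query and
    # return partially matching offers, best first.
    scored = [(count_hits(query, o), o) for o in candidates]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return "closest", [o for _, o in scored]


offers = [
    Post("sell", "cars", {"make": "toyota", "year": 2001, "price": 5000}),
    Post("sell", "cars", {"make": "honda", "year": 1999, "price": 3000}),
]
query = Post("looking_for", "cars", {"make": "toyota", "price": (0, 6000)})
status, results = match(query, offers)
```

Returning a status alongside the results lets a response generator word its reply differently when only approximate matches exist, instead of sending an empty answer.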

Cited literature [220 references]
Contributor : Daoud Daoud
Submitted on : Friday, September 22, 2006 - 2:48:01 PM
Last modification on : Friday, March 25, 2022 - 11:09:41 AM
Long-term archiving on : Tuesday, April 6, 2010 - 1:07:53 AM


  • HAL Id : tel-00097826, version 1




Daoud Daoud. It is necessary and possible to build (multilingual)
NL-based restricted e-commerce systems
with mixed sublanguage and content-oriented methods. Other [cs.OH]. Université Joseph-Fourier - Grenoble I, 2006. English. ⟨tel-00097826⟩


