Welcome to the new version of CaltechAUTHORS. Login is currently restricted to library staff. If you notice any issues, please email coda@library.caltech.edu
Published December 12, 2003 | Submitted
Report Open

A Language Processor and a Sample Language

Ayres, Ronald

Abstract

This thesis explores shared data in list structures and ambiguity in language processing. Tolerance of ambiguity is necessary to support clear and modular expression. Data sharing is necessary to support ambiguity efficiently. Data sharing is useful also in compiled programs to save memory and time. Let us define some terms. A rewrite grammar is a set of replacement rules each of which specifies that a given phrase may be replaced by another given phrase. Each replacement rule expresses a local translation. A parser finds those sequences of replacements that bring a given text to a machine handleable form. Each such sequence represents a meaning or interpretation for the given text. Tolerance of ambiguity or multiple interpretations for a given text is necessary so that subsequent processing can place further constraints upon the input text. This thesis presents a parser which efficiently, handles general-rewrite grammars. To conserve computer time and memory, only essential differences among multiple interpretations are represented and processed. If several interpretations for a given text are valid, the parser yields a meaning which represents the ambiguity as, locally as possible. Even an exponential number of distinct meanings may be represented in a polynomial amount of memory. This thesis also presents a language processing system which supports semantic processing via independent rewrite grammars. Each grammar represents a distinct aspect of the language. A given sequence of grammars becomes a sequence of passes, or process steps. Each pass derives a meaning with respect to one grammar and uses that meaning to generate phrases which will be interpreted by the next pass. Although linguistic specification is usually done with context-free grammars, features of this parser which support general-rewrite grammars are essential for the integration of passes. Not only ambiguity, but also the locality of ambiguity is preserved from one pass to the next. It is necessary to preserve locality of ambiguity in order to avoid explosive computation arising from useless action among independent sets of interpretations. I have implemented a general-purpose programming language called ICL with this system. The fact that ICL's datatypes are processed by a rewrite grammar makes it simple to implement both user-defined datatype coercions and functions known as polymorphic operators whose definitions depend on parameter datatypes. Datatype coercions and Polymorphic operators reduce the amount,of specification required in algorithms to such an extent that a user can often modify declarations and achieve optimizations and changes in concept without modifying his algorithmic specification. ICL includes a simple and safe policy about pointers so that the user can ignore their existence completely if he wishes. ICL automatically maximizes data sharing and minimizes copying by adopting a "copy on write" policy. This policy supports the illusion that each and every reference to a data structure generates a complete copy of that data structure. This same technique is used in the language processor itself to facilitate data sharing among multiple interpretations in ambiguous cases.

Additional Information

© 1978 California Institute of Technology. Series numbering on title page: 2276-tr-78

Attached Files

Submitted - postscript.pdf

Files

postscript.pdf
Files (10.8 MB)
Name Size Download all
md5:9cf96a9a4e45d014d2481fb41b2d51b7
10.8 MB Preview Download

Additional details

Created:
August 19, 2023
Modified:
January 13, 2024