Command Line Scanning Library

Written by Ian Elliott.

command_line.zip (26,079 bytes) - source code

1. Introduction

Parsing the command line and its arguments is a common requirement, but command line syntax is varied and often context dependent. These factors make it difficult to build a general facility for parsing the command line. Rather than attempting an elaborate scheme, this library provides a set of command token classes, each of which scans part of the command line text; suitably combined, they parse the entire command line. Each command token instance can be queried for the result of its scanning operation and for the value of the substring it has recognised. The library currently contains command tokens to scan switch symbols, switch values, integers, WIN32 file specifications, separators and punctuation.

2. Implementation

On creation a command token parses the next section of the command line. There are two creation routines:

make (e: STRING) is
        -- Create by scanning source and committing
    require
        Error_message_exists: e /= Void and then not e.empty


make_possible is
        -- Create by scanning source

The make routine is called when the next section of the command line must satisfy the token's syntax requirement. If the scan is successful, the recognised string is available from the command token and the section of the command line scanned by this token is committed, i.e. its parsing is finished. Later sections can then be scanned by creating other command tokens. If the scan fails, the error is reported by displaying the error message parameter on the screen. In addition, a fatal error is deemed to have occurred: any command token created afterwards performs no scanning and reports no error. This behaviour avoids deeply nested status testing, so the syntax checking code has a simpler structure.
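The effect of the committing creation routine can be modelled outside Eiffel. The following Python sketch is not the library's code: the names CommandLine, MandatoryToken, committed, messages and parse are hypothetical stand-ins, and error messages are collected in a list rather than displayed. It only illustrates why no status test is needed between consecutive tokens.

```python
class CommandLine:
    """Hypothetical stand-in for the shared parsing state."""

    def __init__(self, text):
        self.text = text
        self.committed = 0           # index of the first uncommitted character
        self.is_fatal_error = False
        self.messages = []           # collected here instead of being displayed


class MandatoryToken:
    """Models a token created with `make`: scan, then commit or fail fatally."""

    def __init__(self, line, accepts, error_message):
        self.value = ""
        self.is_actual = False
        if line.is_fatal_error:
            return                   # after a fatal error: no scan, no report
        end = line.committed
        while end < len(line.text) and accepts(line.text[end]):
            end += 1
        if end > line.committed:
            self.value = line.text[line.committed:end]
            self.is_actual = True
            line.committed = end     # a successful scan is committed at once
        else:
            line.is_fatal_error = True
            line.messages.append(error_message)


def parse(text):
    """Scan word, separator, integer in sequence without nested status tests."""
    line = CommandLine(text)
    word = MandatoryToken(line, str.isalpha, "command name expected")
    MandatoryToken(line, str.isspace, "separator expected")
    number = MandatoryToken(line, str.isdigit, "integer expected")
    return line, word.value, number.value
```

The error status is tested once, after all tokens have been created; only the first failure produces a message.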

The make_possible creation routine is called when the next token is one of a set of choices. Creating the command token makes no commitment. If the scan fails there is no fatal error and no report, and another command token may be tried on the same section of command line text. Thus at any time during parsing a prefix of the command line is committed and the remainder is not, even though some initial part of that remainder may already have been scanned. If the scan succeeds, as indicated by the boolean is_actual, the result can be committed by calling the routine commit on that command token instance:

commit is
        -- Commit
    require
        Commitable: can_commit

A command token can commit if it has successfully scanned, has yet to be committed and no part of the section of the command line text it has covered has yet been committed. After commitment only the remainder of command line text following that scanned by this command token is available for further parsing.
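The interplay between uncommitted scanning and commitment can be modelled as follows. This Python sketch is an illustration under stated assumptions, not the library's implementation: Source, PossibleToken, start, end, is_committed and first_match are hypothetical names, while is_actual, can_commit and commit mirror the features named in the text.

```python
class Source:
    """Hypothetical text source: everything before `committed` is finished."""

    def __init__(self, text):
        self.text = text
        self.committed = 0


class PossibleToken:
    """Models a token created with `make_possible`: scan without committing."""

    def __init__(self, source, accepts):
        self.source = source
        self.start = source.committed
        end = self.start
        while end < len(source.text) and accepts(source.text[end]):
            end += 1
        self.end = end
        self.is_actual = end > self.start
        self.is_committed = False

    @property
    def value(self):
        return self.source.text[self.start:self.end]

    @property
    def can_commit(self):
        # Scanned successfully, not yet committed, and no part of the covered
        # text has been committed by another token in the meantime.
        return (self.is_actual and not self.is_committed
                and self.source.committed == self.start)

    def commit(self):
        if not self.can_commit:
            raise ValueError("token cannot commit")
        self.source.committed = self.end
        self.is_committed = True


def first_match(text):
    """Try alternatives on the same uncommitted text; commit the first success."""
    source = Source(text)
    for accepts in (str.isalpha, str.isdigit):
        token = PossibleToken(source, accepts)
        if token.is_actual:
            token.commit()
            return token.value, source.committed
    return None, source.committed
```

In this model a failed alternative leaves the source untouched, so the next alternative starts from the same position.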

A single shared instance of the class COMMAND_LINE is accessed by all command tokens. It holds the error status of the parsing operation. After a successful scan by a command token further checking may be required. For example, after scanning a file specification it may be necessary to check that the file with this specification actually exists. If further checking fails this fatal error can be signalled by calling the COMMAND_LINE routine put_fatal_error with a suitable error message:

put_fatal_error (m: STRING) is
        -- Put fatal error
    require
        Not_fatally_errored: not is_fatal_error
        Message_exists: m /= Void

The routine displays the error message and sets the error status is_fatal_error to True.
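A post-scan check of this kind might look like the following Python model. It is a sketch only: the CommandLine class here is a hypothetical stand-in that collects messages in a list instead of displaying them, check_file_exists is an invented helper, and the two assertions merely mirror the preconditions quoted above.

```python
import os


class CommandLine:
    """Hypothetical stand-in holding the error status of the parse."""

    def __init__(self):
        self.is_fatal_error = False
        self.messages = []

    def put_fatal_error(self, message):
        assert not self.is_fatal_error   # mirrors Not_fatally_errored
        assert message is not None       # mirrors Message_exists
        self.messages.append(message)    # the library displays it instead
        self.is_fatal_error = True


def check_file_exists(command_line, file_spec):
    """Signal a fatal error if a successfully scanned file is missing."""
    # The guard keeps the precondition of put_fatal_error satisfied.
    if not command_line.is_fatal_error and not os.path.exists(file_spec):
        command_line.put_fatal_error("file not found: " + file_spec)
```

The caller tests is_fatal_error before calling put_fatal_error, since the precondition forbids signalling a second fatal error.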

On its creation, the instance of the class COMMAND_LINE gathers together all command line arguments and provides a single shared STRING_SOURCE instance. This is a string source handler through which all command tokens access the stream of command line characters. It provides routines to manipulate the stream: to progress through the stream, to look ahead, to rewind and to commit to the position reached so far.
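The four stream operations can be pictured with a small Python model. This is a sketch under assumptions, not the STRING_SOURCE class itself: the method names look_ahead, advance, rewind and commit are hypothetical, as is the choice to join the arguments with single spaces and to rewind to the last committed position.

```python
class StringSource:
    """Hypothetical model of a shared character stream over the command line."""

    def __init__(self, arguments):
        # Assumption: arguments are gathered by joining them with single spaces.
        self.text = " ".join(arguments)
        self.committed = 0    # everything before this index is finished
        self.position = 0     # current scan position, never before `committed`

    def at_end(self):
        return self.position >= len(self.text)

    def look_ahead(self):
        """Return the next character without consuming it ('' at the end)."""
        return "" if self.at_end() else self.text[self.position]

    def advance(self):
        """Consume and return the next character ('' at the end)."""
        ch = self.look_ahead()
        if ch:
            self.position += 1
        return ch

    def rewind(self):
        """Return to the last committed position, discarding the scan so far."""
        self.position = self.committed

    def commit(self):
        """Accept everything scanned up to the current position."""
        self.committed = self.position
```

A token that fails would call rewind so that the next alternative sees the same text; a token that succeeds and is accepted would call commit.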

3. Example

For an illustration consider a sort program whose command line syntax is:

Syntax = xsort [-d] <file_spec>

The switch "-d" specifies a descending sorting order. The default sorting order is ascending.

The command line parser has to deal with separators between items, two possible file specification formats – short WIN32 and long WIN32 – and possible unwanted extra characters on the end of the line. Taking these considerations into account a working syntax is:


where the names of tokens in the syntax are the prefixes of the names of effective COMMAND_TOKEN classes. The WORD_COMMAND_TOKEN class scans a sequence of any non-space characters. It is used to scan the first token, which is always the command (i.e. program) name. The SEPARATOR_COMMAND_TOKEN class scans any number of space characters. DASH_COMMAND_TOKEN simply scans for a "-" string. The class CODE_COMMAND_TOKEN scans a sequence of letters; the spelling of a successful token can then be checked against the expected value. SHORT_WIN_FILE_COMMAND_TOKEN and LONG_WIN_FILE_COMMAND_TOKEN check for the short form and long form of WIN32 file specifications respectively. Finally, the EMPTY_COMMAND_TOKEN can be used to check that no unwanted extra characters lurk at the end of the line.

The class XSORT in file xsort.e, whilst not actually performing any sorting, shows how to use these classes to parse this syntax. The code structure reflects the structure of the working syntax in a fashion similar to a recursive descent parser with single token read ahead.
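The shape of such a parser can be sketched in Python. This is not the code in xsort.e. It assumes the token order word, separator, optional dash plus code plus separator, file specification, empty, which is inferred from the token descriptions above. It also simplifies the file specification to a run of non-space characters, and reports errors by raising ValueError rather than through a shared error status. The names scan, must and parse_xsort are hypothetical.

```python
def scan(text, start, accepts):
    """Return the end index of the longest run from `start` that `accepts`."""
    end = start
    while end < len(text) and accepts(text[end]):
        end += 1
    return end


def parse_xsort(text):
    """Return (descending, file_spec) or raise ValueError on a syntax error."""

    def must(start, accepts, message):
        end = scan(text, start, accepts)
        if end == start:
            raise ValueError(message)
        return end

    def not_space(ch):
        return not ch.isspace()

    pos = must(0, not_space, "command name expected")        # Word
    pos = must(pos, str.isspace, "separator expected")       # Separator
    descending = False
    if text.startswith("-", pos):                            # Dash (optional)
        code_end = must(pos + 1, str.isalpha, "switch code expected")
        if text[pos + 1:code_end] != "d":                    # check the Code
            raise ValueError("unknown switch")
        descending = True
        pos = must(code_end, str.isspace, "separator expected")
    end = must(pos, not_space, "file specification expected")  # simplified file
    file_spec = text[pos:end]
    if scan(text, end, str.isspace) != len(text):            # Empty
        raise ValueError("unexpected extra characters")
    return descending, file_spec
```

The optional switch is decided by looking at the next character only, which corresponds to the single token read ahead mentioned above.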

The XSORT example has been built under Object-Tools' Visual Eiffel version 2.5. The command line scanning library makes use of the Kernel library only. In addition XSORT accesses the Pool library for the class FILE_SYSTEM.

Warning: with Visual Eiffel, long WIN32 file specifications containing runs of 2 or more spaces are not handled correctly. The ARGUMENTS kernel class of Visual Eiffel slices up long file specifications where spaces occur, so the original number of spaces is lost (this behaviour may be the same with other vendors' ARGUMENTS classes).
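The loss can be reproduced with a small Python model. It does not use or describe the Visual Eiffel ARGUMENTS class; it only assumes that the command line arrives as separate arguments split at spaces and is rebuilt by joining them with single spaces. The function name and the sample path are hypothetical.

```python
def rebuild_command_line(arguments):
    """Join separately delivered arguments with single spaces."""
    return " ".join(arguments)


# A long file specification containing a run of two spaces.
original = "xsort C:\\My  Documents\\data file.txt"

# Models the slicing at spaces: the lengths of the runs are not recorded.
arguments = original.split()

rebuilt = rebuild_command_line(arguments)
```

Here rebuilt contains "My Documents" with a single space, so it no longer names the same file as original.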

4. Further Development

The library may be extended by adding command tokens which scan dates, file specifications with wild cards, other operating systems' file formats, strings and so on as required. Further, lookahead of more than one token could be supported, as could the ability to scan files which are referenced in the command line.
