Visual Eiffel port of Halstenbach iss-match v1.0
Written by dougpardee@my-dejanews.com.
issmatve.zip (124,864 bytes) source code
OVERVIEW:
The iss-match library provides facilities for text searching and for
pattern matching.
- Text searching:
- SUBSTRING class searches for a fixed string, going either forward
or backward. It is suitable for searching through large quantities
of text. Searches can be case-sensitive or insensitive.
- Pattern matching:
- REGEXP class matches a text string against a regular expression.
Matching can be case-sensitive or insensitive. The regular expression
can be constrained to matching the entire string, to matching the
beginning or the end of the string, or can be allowed to search for a
matching sequence, either forward from the start or backward from the
end of the text string. Matching can treat new-line characters as
end-of-line or as ordinary characters.
- WILDCARD class is similar to REGEXP except that it uses a simplified
pattern syntax similar to file-name wildcards.
- MISSPELLED class is similar to REGEXP except that it searches for a
fixed text string, not a pattern. It allows a specified amount of
variation (misspelling) between the search string and the target.
- All three of these pattern matchers provide two interfaces. The full
"match_*" features allow you to query the character positions of the
matched string and (for REGEXP) of any substrings. The "matches_*"
features are queries which provide only a True/False match indication.
- REGEXP_EXTRACTOR and WILDCARD_EXTRACTOR are variants which provide
an easy way for you to do the matching and then retrieve the matching
substring(s). The extractors require a small amount of care in order
to prevent wasting memory. See the HTML page for more information.
It's not clear why no MISSPELLED_EXTRACTOR is provided, but it should
be simple enough to create.
- Text splitting:
- SPLITTER class breaks a text string into an array of substrings,
using any of the text search or pattern match objects to define
the separators.
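To give a feel for the kind of simplified, file-name-style pattern syntax
that WILDCARD is described as using, here is a minimal C sketch. It is not
the iss-match implementation: the function name wildcard_match and the
choice of '*' and '?' as the only metacharacters are assumptions made for
illustration, and the real WILDCARD syntax may differ.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not part of iss-match. Returns true when the whole
   of `text` matches `pattern`, where '*' matches any run of characters
   (including none) and '?' matches exactly one character. */
bool wildcard_match(const char *pattern, const char *text)
{
    const char *star = NULL;   /* most recent '*' seen in the pattern */
    const char *retry = NULL;  /* text position where that '*' started */

    while (*text != '\0') {
        if (*pattern == '*') {
            /* Remember the star; first try letting it match nothing. */
            star = pattern++;
            retry = text;
        } else if (*pattern == '?' || *pattern == *text) {
            /* Single-character match: advance both. */
            pattern++;
            text++;
        } else if (star != NULL) {
            /* Mismatch: let the last '*' absorb one more character. */
            pattern = star + 1;
            text = ++retry;
        } else {
            return false;
        }
    }

    /* Text is exhausted; only trailing stars may remain in the pattern. */
    while (*pattern == '*') {
        pattern++;
    }
    return *pattern == '\0';
}
```

For instance, wildcard_match("*.txt", "notes.txt") is true, while
wildcard_match("a?c", "ac") is false because '?' must consume one character.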
INSTALLATION NOTES:
Not all of the files from the original iss-match have been included.
Various build files which were irrelevant to Windows and to Visual
Eiffel have been omitted.
The supplied eif_match.lib was compiled with Microsoft Visual C++.
In addition to this lib, you will also need to link in Microsoft's
"libc.lib" to provide basic C library functions (strcpy, toupper, etc.).
If you don't have access to Microsoft's libc.lib, you'll need to
recompile the three C routines contained in the Clib directory with
whatever C compiler you have (remember that Visual Eiffel's linker
requires that the object files be in COFF format). You may need to
add #defines into eif_match.h to define register1 through register6
as being nothing or "register".
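The register1 through register6 definitions mentioned above might look
like the following sketch. The #ifndef guard and the sample function
count_chars are assumptions added for illustration; only the macro names
and the header name eif_match.h come from these notes.

```c
/* Sketch of definitions that could be added to eif_match.h.
   Each name expands to the standard "register" storage-class hint.
   To define them as nothing instead, leave the right-hand side empty,
   e.g. "#define register1". */
#ifndef register1
#define register1 register
#define register2 register
#define register3 register
#define register4 register
#define register5 register
#define register6 register
#endif

/* Hypothetical function showing how such macros are typically used
   in declarations; it is not taken from the Clib sources. */
int count_chars(const char *s)
{
    register1 const char *p = s;
    register2 int n = 0;

    while (*p != '\0') {
        p++;
        n++;
    }
    return n;
}
```

Either expansion leaves the behavior of the C code unchanged, since
"register" is only a hint to the compiler.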
Specify the eif_match.lib and libc.lib files in the "Object Files"
section of the Project/Options dialog.
There is an example program in the directory Examples, class TEST.
This should be built as a console EXE.
DETAILS OF THE VISUAL EIFFEL PORT:
This port should be fully consistent with the original when used
as client classes.
There are many changes which will affect classes which inherit
from these. Some of these changes were unavoidable, as Visual
Eiffel's implementations of ARRAY and STRING are internally
a bit different from the Halstenbach and ISE implementations.
The following changes were made:
- The REGEXP, SUBSTRING, and MISSPELLED classes needed considerable
rework because of the way that they were handling memory which was
shared between C and Eiffel. Two new classes, MEMORY_BLOCK and
ARRAY_MEMORY_BLOCK[G], were added in order to provide a unified
approach to the memory handling.
- REGEXP.match_against and REGEXP.matches_against also needed their
respective local variables "flags" changed from "like flag" (which
is of type INTEGER_REF) to type "INTEGER". This corrected a problem
where the variable was not being passed to the C code. (Presumably,
the address was being passed rather than the integer value.)
- The WILDCARD class needed a bit of modification in its STRING
handling. Visual Eiffel has a documented incompatibility in that
STRING.make(n) sets the string's "count" to the length specified
rather than to 0 as required by ELKS-95. The original WILDCARD
code also used the obsolete feature names "append" and "extend".
These have been changed to "append_string" and "append_character"
respectively, which are the names specified by ELKS-95.
- The SPLITTER class needed a minor modification because Visual
Eiffel's ARRAY class retains the "count" and computes "upper",
rather than vice-versa.
- The TEST class was modified to use standard "print" instead of
"print_line". A couple of typos were also corrected.
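The "flags" fix described above is easier to picture from the C side.
The sketch below is an illustration only: the type integer_ref, the flag
bit FLAG_IGNORE_CASE, and all three functions are hypothetical stand-ins,
not names from iss-match. It shows why a C routine that expects an integer
by value misbehaves if it is handed the address of a reference object
instead.

```c
#include <stdint.h>

#define FLAG_IGNORE_CASE 0x01  /* hypothetical option bit */

/* Stand-in for a boxed integer such as Eiffel's INTEGER_REF. */
typedef struct {
    int item;
} integer_ref;

/* Stand-in for a C matcher entry point that takes its options by value. */
int case_insensitive_requested(int flags)
{
    return (flags & FLAG_IGNORE_CASE) != 0;
}

/* Correct: the integer stored in the object is what reaches the C code. */
int flags_by_value(const integer_ref *ref)
{
    return ref->item;
}

/* Incorrect: the object's address is reinterpreted as the flag word.
   The resulting bits depend on where the object lives in memory and
   have nothing to do with the options that were requested. */
int flags_by_address(const integer_ref *ref)
{
    return (int)(intptr_t)ref;
}
```

With an integer_ref holding FLAG_IGNORE_CASE, passing the result of
flags_by_value to case_insensitive_requested gives the expected answer,
whereas the result of flags_by_address is unrelated to the stored value.
Declaring the local as a plain INTEGER corresponds to the first case.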
CREDITS:
Thanks to Halstenbach ACT GmbH for releasing iss-match 1.0 under the
Eiffel Forum Freeware License.
This Visual Eiffel port is by Doug Pardee,
released under the Eiffel Forum Freeware License.
Doug Pardee
dougpardee@my-dejanews.com
June 5, 1999