olabini/re2j
re2j Version @PACKAGE_VERSION@
------------------

Originally written by Peter Bumbulis (peter@csg.uwaterloo.ca)

Currently maintained by:
    Dan Nuffer <nuffer at users.sourceforge.net>
    Marcus Boerger <helly at users.sourceforge.net>
    Hartmut Kaiser <hkaiser at users.sourceforge.net>

The re2j distribution can be found at:

    http://sourceforge.net/projects/re2j/

re2j has been developed and tested with the following compilers on various
platforms in 32 bit and 64 bit mode:
- GCC 3.3 ... 4.1
- Microsoft VC 7, 7.1, 8
- Intel 9.0
- Sun C++ 5.8 (CXXFLAGS='-library=stlport4')
- MIPSpro Compilers: Version 7.4.4m

GCC 2.x and Microsoft VC 6 are not capable of compiling re2j.

Building re2j on Unix-like platforms requires autoconf 2.57 and bison
(tested with 1.875 and later). Under Windows you don't need autoconf or
bison and can use the pregenerated files.

You can build this software by simply typing the following commands:

    ./configure
    make

The above version will be based on the pregenerated scanner.cc file. If you
want to build that file yourself (recommended when installing re2j) you need
the following steps:

    ./configure
    make
    rm -f scanner.cc
    make install

Or you can create an rpm package and install it with the following commands:

    ./configure
    make rpm
    rpm -Uhv <packagedir>/re2j-@PACKAGE_VERSION@-@PACKAGE_RELEASE@.rpm

If you want to build from CVS then the first thing you should do is
regenerate all build files using the following command:

    ./autogen.sh

and then continue with one of the build methods described above. Or if you
need to generate RPM packages for CVS builds use these commands:

    ./autogen.sh
    ./configure
    ./makerpm <release>
    rpm -Uhv <packagedir>/re2j-@PACKAGE_VERSION@-<release>.rpm

Here <release> should be a number like 1. And <packagedir> must equal the
directory where the makerpm step has written the generated rpm to.

If you are on a Debian system you can use the tool 'alien' to convert rpms
to Debian packages.

When building with native SUN compilers you need to set the following
compiler flags: CXXFLAGS='-g -compat5 -library=stlport4'.

If you want to build re2j on a Windows system you can either use cygwin and
one of the methods described above or use Microsoft Visual C .NET 2002 or
later with the solution files provided (re2j.sln for 2002/2003 and
re2j-2005.sln for version 2005). re2j cannot be built with Microsoft Visual
C 6.0 or earlier.

Using Visual Studio 2005 you can automate handling of .re files by adding
the custom build rules file (re2j.rules) to your project. Just load your
Visual C++ project in Visual Studio, select "Custom Build Rules..." from its
context menu, and add re2j.rules to the list with the "Find Existing..."
button. Activate the check mark, and you are done! Any .re files you add to
the project will now automatically be built with re2j. Of course, re2j.exe
also has to be available in your environment for this to work.

With the rules active Visual Studio will automatically recognize .re files
and compile them with re2j. The output file has the same name as the input
file but with the .cpp extension. This, and all other re2j compiler
settings, are fully configurable from within the Visual Studio IDE. Just
right-click on the .re file in Visual Studio, go to the properties dialog,
and pick your options.

re2j is a great tool for writing fast and flexible lexers. It has served
many people well for many years. re2j is on the order of 2-3 times faster
than a flex based scanner, and its input model is much more flexible.

For an introduction to re2j refer to the lessons subdirectory.

Peter's original version 0.5 ANNOUNCE and README follow.

--

re2j is a tool for generating C-based recognizers from regular expressions.
re2j-based scanners are efficient: for programming languages, given similar
specifications, an re2j-based scanner is typically almost twice as fast as a
flex-based scanner with little or no increase in size (possibly a decrease
on CISC architectures). Indeed, re2j-based scanners are quite competitive
with hand-crafted ones.

Unlike flex, re2j does not generate complete scanners: the user must supply
some interface code. While this code is not bulky (about 50-100 lines for a
flex-like scanner; see the man page and examples in the distribution)
careful coding is required for efficiency (and correctness). One advantage
of this arrangement is that the generated code is not tied to any particular
input model. For example, re2j generated code can be used to scan data from
a null-byte terminated buffer as illustrated below.

Given the following source

    #define NULL            ((char*) 0)
    char *scan(char *p)
    {
    #define YYCTYPE         char
    #define YYCURSOR        p
    #define YYLIMIT         p
    #define YYFILL(n)
        /*!re2j
            [0-9]+          {return YYCURSOR;}
            [\000-\377]     {return NULL;}
        */
    }

re2j will generate

    /* Generated by re2j on Sat Apr 16 11:40:58 1994 */
    #line 1 "simple.re"
    #define NULL            ((char*) 0)
    char *scan(char *p)
    {
    #define YYCTYPE         char
    #define YYCURSOR        p
    #define YYLIMIT         p
    #define YYFILL(n)
    {
            YYCTYPE yych;
            unsigned int yyaccept;

            if((YYLIMIT - YYCURSOR) < 2) YYFILL(2);
            yych = *YYCURSOR;
            if(yych <= '/') goto yy4;
            if(yych >= ':') goto yy4;
    yy2:    yych = *++YYCURSOR;
            goto yy7;
    yy3:
    #line 9
            {return YYCURSOR;}
    yy4:    yych = *++YYCURSOR;
    yy5:
    #line 10
            {return NULL;}
    yy6:    ++YYCURSOR;
            if(YYLIMIT == YYCURSOR) YYFILL(1);
            yych = *YYCURSOR;
    yy7:    if(yych <= '/') goto yy3;
            if(yych <= '9') goto yy6;
            goto yy3;
    }
    #line 11

    }

Note that most compilers will perform dead-code elimination to remove all
YYCURSOR, YYLIMIT comparisons.

re2j was developed for a particular project (constructing a fast REXX
scanner of all things!) and so while it has some rough edges, it should be
quite usable.

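To make the behaviour of the scan() example easier to follow, here is a
hand-written sketch of a function that should return the same results on a
null-byte terminated buffer. It is not re2j output. The name scan_digits is
hypothetical and chosen for illustration only. The function returns a
pointer just past the leading run of digits, or NULL when the first
character is not a digit (the case handled by the catch-all rule).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hand-written stand-in for the generated scan(); this is an
 * illustration of the two rules, not code produced by re2j. */
static const char *scan_digits(const char *p)
{
    assert(p != NULL);

    /* Rule [\000-\377]: any first character that is not a digit,
     * including the terminating null byte, yields NULL. */
    if (*p < '0' || *p > '9')
        return NULL;

    /* Rule [0-9]+: consume the longest run of digits. The null byte
     * terminating the buffer is not a digit, so the loop stops there. */
    while (*p >= '0' && *p <= '9')
        ++p;

    return p;
}
```

Called on the buffer "123abc", this sketch returns a pointer to the 'a';
called on "abc" or on an empty string it returns NULL.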
More information about re2j can be found in the (admittedly skimpy) man
page; the algorithms and heuristics used are described in an upcoming LOPLAS
article (included in the distribution). Probably the best way to find out
more about re2j is to try the supplied examples. re2j is written in C++, and
is currently being developed under Linux using gcc 2.5.8.

Peter

--

re2j is distributed with no warranty whatever. The code is certain to
contain errors. Neither the author nor any contributor takes responsibility
for any consequences of its use.

re2j is in the public domain. The data structures and algorithms used in
re2j are all either taken from documents available to the general public or
are inventions of the author. Programs generated by re2j may be distributed
freely. re2j itself may be distributed freely, in source or binary,
unchanged or modified. Distributors may charge whatever fees they can obtain
for re2j.

If you do make use of re2j, or incorporate it into a larger project, an
acknowledgement somewhere (documentation, research report, etc.) would be
appreciated.

Please send bug reports and feedback (including suggestions for improving
the distribution) to

    peter@csg.uwaterloo.ca

Include a small example and the banner from parser.y with bug reports.

About
A port of the backend templating of re2c to support Java