Package antlr
Class TokenStreamRewriteEngine
java.lang.Object
antlr.TokenStreamRewriteEngine
All Implemented Interfaces: IASDebugStream, TokenStream
This token stream tracks the *entire* token stream coming from
a lexer, but does not pass on the whitespace (or whatever else
you want to discard) to the parser.
This class can then be asked for the ith token in the input stream.
Useful for dumping out the input stream exactly after doing some
augmentation or other manipulations. Tokens are indexed from 0..n-1.
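The filtering described above can be pictured with a small stand-alone sketch. It does not use the ANTLR classes: the class DiscardSketch, the token types WS and ID, and the use of plain strings as tokens are all assumptions made for illustration. Only the ideas of a discard mask and a complete, 0-based token list come from this page.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for the engine's filtering; not the ANTLR classes.
class DiscardSketch {
    static final int WS = 1, ID = 2;               // made-up token types

    final List<String> tokens = new ArrayList<>(); // every token seen, indexed 0..n-1
    final BitSet discardMask = new BitSet();       // token types hidden from the parser
    private final int[] types;
    private final String[] texts;
    private int next = 0;

    DiscardSketch(int[] types, String[] texts) {
        this.types = types;
        this.texts = texts;
    }

    void discard(int ttype) {
        discardMask.set(ttype);
    }

    /** Returns the next non-discarded token text, or null at end of input. */
    String nextToken() {
        while (next < types.length) {
            int i = next++;
            tokens.add(texts[i]);                  // recorded even if discarded
            if (!discardMask.get(types[i])) {
                return texts[i];
            }
        }
        return null;
    }

    public static void main(String[] args) {
        DiscardSketch s = new DiscardSketch(new int[] {ID, WS, ID},
                                            new String[] {"a", " ", "b"});
        s.discard(WS);
        StringBuilder seen = new StringBuilder();
        for (String t = s.nextToken(); t != null; t = s.nextToken()) {
            seen.append(t);
        }
        System.out.println(seen);                       // ab  (what the parser sees)
        System.out.println(String.join("", s.tokens));  // a b (what was recorded)
    }
}
```

The parser-side view skips the whitespace token, while the recorded list still holds it at index 1, so the original input can be reproduced exactly.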
You can insert stuff, replace, and delete chunks. Note that the
operations are done lazily--only if you convert the buffer to a
String. This is very efficient because you are not moving data around
all the time. As the buffer of tokens is converted to strings, the
toString() method(s) check to see if there is an operation at the
current index. If so, the operation is done and then normal String
rendering continues on the buffer. This is like having multiple Turing
machine instruction streams (programs) operating on a single input tape. :)
Since the operations are done lazily at toString-time, operations do not
screw up the token index values. That is, an insert operation at token
index i does not change the index values for tokens i+1..n-1.
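A minimal sketch of the lazy scheme, under stated assumptions: LazyRewriteSketch is a hypothetical class, tokens are plain strings, and the operations are kept in two maps keyed by token index. The real engine keeps an ordered instruction list instead, so this only illustrates that nothing is applied until the buffer is rendered and that token indexes stay fixed.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of lazy rewriting; not the ANTLR implementation.
class LazyRewriteSketch {
    final List<String> tokens = new ArrayList<>();
    // token index -> text to emit in front of that token
    private final Map<Integer, String> insertsBefore = new HashMap<>();
    // token index -> text emitted instead of that token ("" means delete)
    private final Map<Integer, String> replacements = new HashMap<>();

    void insertBefore(int index, String text) {
        insertsBefore.merge(index, text, String::concat);
    }

    void replace(int index, String text) {
        replacements.put(index, text);
    }

    void delete(int index) {
        replacements.put(index, "");
    }

    /** The token list is never modified, so the original text is always available. */
    String toOriginalString() {
        return String.join("", tokens);
    }

    /** Operations are consulted only here, while the buffer is being rendered. */
    @Override
    public String toString() {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            String ins = insertsBefore.get(i);
            if (ins != null) {
                buf.append(ins);
            }
            String rep = replacements.get(i);
            buf.append(rep != null ? rep : tokens.get(i));
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        LazyRewriteSketch e = new LazyRewriteSketch();
        e.tokens.addAll(List.of("x", "=", "1", ";"));
        e.insertBefore(0, "int ");   // does not shift later indexes
        e.replace(2, "42");          // index 2 is still the token "1"
        System.out.println(e);                     // int x=42;
        System.out.println(e.toOriginalString());  // x=1;
    }
}
```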
Because operations never actually alter the buffer, you may always get
the original token stream back without undoing anything. Since
the instructions are queued up, you can easily simulate transactions and
roll back any changes if there is an error just by removing instructions.
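One way to picture the transaction idea is to remember how many instructions were queued before a risky step and truncate the queue back to that point on failure. The sketch below assumes a hypothetical RollbackSketch class with insert-before instructions only; it is not the engine's code, and the variable name mark is invented here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of rollback by dropping queued instructions.
class RollbackSketch {
    static final class InsertBefore {
        final int index;
        final String text;
        InsertBefore(int index, String text) {
            this.index = index;
            this.text = text;
        }
    }

    final List<String> tokens;
    List<InsertBefore> program = new ArrayList<>();

    RollbackSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    void insertBefore(int index, String text) {
        program.add(new InsertBefore(index, text));
    }

    /** Keeps instructions 0..instructionIndex-1 and drops the rest. */
    void rollback(int instructionIndex) {
        program = new ArrayList<>(program.subList(0, instructionIndex));
    }

    String toOriginalString() {
        return String.join("", tokens);
    }

    @Override
    public String toString() {
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            for (InsertBefore op : program) {
                if (op.index == i) {
                    buf.append(op.text);
                }
            }
            buf.append(tokens.get(i));
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        RollbackSketch e = new RollbackSketch("a", "b");
        e.insertBefore(0, "<");
        int mark = e.program.size();   // start of the "transaction"
        e.insertBefore(1, "!");
        System.out.println(e);         // <a!b
        e.rollback(mark);              // undo everything queued after the mark
        System.out.println(e);         // <ab
        System.out.println(e.toOriginalString()); // ab
    }
}
```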
For example,
TokenStreamRewriteEngine rewriteEngine =
new TokenStreamRewriteEngine(lexer);
JavaRecognizer parser = new JavaRecognizer(rewriteEngine);
...
rewriteEngine.insertAfter("pass1", t, "foobar");
rewriteEngine.insertAfter("pass2", u, "start");
System.out.println(rewriteEngine.toString("pass1"));
System.out.println(rewriteEngine.toString("pass2"));
You can also have multiple "instruction streams" and get multiple
rewrites from a single pass over the input. Just name the instruction
streams and use that name again when printing the buffer. This could be
useful for generating a C file and also its header file--all from the
same buffer.
If you don't use named rewrite streams, a "default" stream is used.
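The header-and-source idea can be sketched with a map from program name to its own set of operations. Everything in this block is an assumption made for illustration: the class NamedProgramsSketch, the program names "header" and "source", and the constant value "default" (taken from the wording above, not checked against the library source). Only insertAfter is modeled.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of named instruction streams over one token buffer.
class NamedProgramsSketch {
    static final String DEFAULT_PROGRAM_NAME = "default"; // assumed value

    final List<String> tokens;
    // program name -> (token index -> text inserted after that token)
    private final Map<String, Map<Integer, String>> programs = new HashMap<>();

    NamedProgramsSketch(String... tokens) {
        this.tokens = Arrays.asList(tokens);
    }

    void insertAfter(String programName, int index, String text) {
        programs.computeIfAbsent(programName, k -> new HashMap<>())
                .merge(index, text, String::concat);
    }

    void insertAfter(int index, String text) {
        insertAfter(DEFAULT_PROGRAM_NAME, index, text);
    }

    /** Renders the buffer using only the named program's instructions. */
    String toString(String programName) {
        Map<Integer, String> ops =
                programs.getOrDefault(programName, Collections.emptyMap());
        StringBuilder buf = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            buf.append(tokens.get(i));
            String after = ops.get(i);
            if (after != null) {
                buf.append(after);
            }
        }
        return buf.toString();
    }

    @Override
    public String toString() {
        return toString(DEFAULT_PROGRAM_NAME);
    }

    public static void main(String[] args) {
        NamedProgramsSketch e = new NamedProgramsSketch("int", " ", "f", "(", ")");
        e.insertAfter("header", 4, ";");
        e.insertAfter("source", 4, " { return 0; }");
        System.out.println(e.toString("header")); // int f();
        System.out.println(e.toString("source")); // int f() { return 0; }
        System.out.println(e);                    // int f()
    }
}
```

Each program sees the same untouched token list, so the two outputs cannot interfere with each other.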
Terence Parr, parrt at antlr.org
University of San Francisco
February 2004
-
Nested Class Summary
Nested Classes (Modifier and Type, Class, Description):
(package private) static class
(package private) static class
(package private) static class: I'm going to try replacing range from x..y with (y-x)+1 ReplaceOp instructions.
(package private) static class
-
Field Summary
Fields (Modifier and Type, Field, Description):
static final String DEFAULT_PROGRAM_NAME
protected BitSet discardMask: Which (whitespace) token(s) to throw out
protected int index: track index of tokens
protected Map lastRewriteTokenIndexes: Map String (program name) -> Integer index
static final int MIN_TOKEN_INDEX
static final int PROGRAM_INIT_SIZE
protected Map programs: You may have multiple, named streams of rewrite operations.
protected TokenStream stream: Who do we suck tokens from?
protected List tokens: Track the incoming list of tokens
-
Constructor Summary
Constructors:
TokenStreamRewriteEngine(TokenStream upstream)
TokenStreamRewriteEngine(TokenStream upstream, int initialSize)
-
Method Summary
Methods (Modifier and Type, Method, Description):
protected void addToSortedRewriteList: If op.index > lastRewriteTokenIndexes, just add to the end.
protected void addToSortedRewriteList(String programName, TokenStreamRewriteEngine.RewriteOperation op): Add an instruction to the rewrite instruction list ordered by the instruction number (use a binary search for efficiency).
void delete(int index)
void delete(int from, int to)
void
void
void
void
void deleteProgram()
void deleteProgram(String programName): Reset the program so that no instructions exist
void discard(int ttype)
getEntireText: Returns the entire text input to the lexer.
int getLastRewriteTokenIndex()
protected int getLastRewriteTokenIndex(String programName)
getOffsetInfo(Token token): Returns the offset information for the token
protected List getProgram(String name)
getToken(int i)
int getTokenStreamSize()
int index()
void insertAfter(int index, String text)
void insertAfter(Token t, String text)
void insertAfter(String programName, int index, String text)
void insertAfter(String programName, Token t, String text)
void insertBefore(int index, String text)
void insertBefore(Token t, String text)
void insertBefore(String programName, int index, String text)
void insertBefore(String programName, Token t, String text)
void
void
void
void
void
void
void rollback(int instructionIndex)
void rollback: Rollback the instruction stream for a program so that the indicated instruction (via instructionIndex) is no longer in the stream.
protected void setLastRewriteTokenIndex(String programName, int i)
int size()
toDebugString(int start, int end)
toOriginalString(int start, int end)
toString()
toString(int start, int end)
-
Field Details
-
MIN_TOKEN_INDEX
public static final int MIN_TOKEN_INDEX
-
DEFAULT_PROGRAM_NAME
-
PROGRAM_INIT_SIZE
public static final int PROGRAM_INIT_SIZE
-
tokens
Track the incoming list of tokens
-
programs
You may have multiple, named streams of rewrite operations. I'm calling these things "programs." Maps String (name) -> rewrite (List)
-
lastRewriteTokenIndexes
Map String (program name) -> Integer index
-
index
protected int index
Track index of tokens
-
stream
Who do we suck tokens from?
-
discardMask
Which (whitespace) token(s) to throw out
-
-
Constructor Details
-
TokenStreamRewriteEngine
-
TokenStreamRewriteEngine
-
-
Method Details
-
nextToken
Specified by: nextToken in interface TokenStream
Throws: TokenStreamException
-
rollback
public void rollback(int instructionIndex)
-
rollback
Roll back the instruction stream for a program so that the indicated instruction (via instructionIndex) is no longer in the stream. UNTESTED!
-
deleteProgram
public void deleteProgram()
-
deleteProgram
Reset the program so that no instructions exist.
-
addToSortedRewriteList
If op.index > lastRewriteTokenIndexes, just add to the end. Otherwise, do linear
-
addToSortedRewriteList
protected void addToSortedRewriteList(String programName, TokenStreamRewriteEngine.RewriteOperation op)
Add an instruction to the rewrite instruction list ordered by the instruction number (use a binary search for efficiency). The list is ordered so that toString() can be done efficiently. When there are multiple instructions at the same index, the instructions must be ordered to ensure proper behavior. For example, a delete at index i must kill any replace operation at i. Insert-before operations must come before any replace / delete instructions. If there are multiple insert instructions for a single index, they are done in reverse insertion order so that "insert foo" then "insert bar" yields "foobar" in front rather than "barfoo". This is convenient because I can insert new InsertOp instructions at the index returned by the binary search. A ReplaceOp kills any previous replace op. Since delete is the same as replace with null text, I can check for ReplaceOp and cover DeleteOp at the same time. :)
-
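The ordering rules given for addToSortedRewriteList can be sketched as follows. This is an illustration under assumptions, not the library's code: SortedOpsSketch and Op are hypothetical names, a delete is modeled as a replace with empty text rather than null, and a new insert is placed after earlier inserts at the same index, which is this sketch's own tie-breaking choice (it produces the "foobar" result mentioned above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of keeping rewrite instructions sorted by token index.
class SortedOpsSketch {
    static final class Op {
        final int index;
        final boolean replace;   // false = insert-before; delete = replace with ""
        final String text;
        Op(int index, boolean replace, String text) {
            this.index = index;
            this.replace = replace;
            this.text = text;
        }
    }

    final List<Op> ops = new ArrayList<>();

    void add(Op op) {
        // Fast path: past the last rewritten index, so just append.
        if (ops.isEmpty() || op.index > ops.get(ops.size() - 1).index) {
            ops.add(op);
            return;
        }
        // Binary search for the first instruction at or after op.index.
        int lo = 0, hi = ops.size();
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (ops.get(mid).index < op.index) lo = mid + 1; else hi = mid;
        }
        int pos = lo;
        // Walk the instructions already queued at the same index.
        while (pos < ops.size() && ops.get(pos).index == op.index) {
            if (ops.get(pos).replace) {
                if (op.replace) ops.remove(pos); // new replace kills the old one
                break;                           // inserts stay ahead of a replace
            }
            pos++;
        }
        ops.add(pos, op);
    }

    /** Single left-to-right pass; relies on ops being sorted by index. */
    static String render(List<String> tokens, List<Op> ops) {
        StringBuilder buf = new StringBuilder();
        int k = 0;
        for (int i = 0; i < tokens.size(); i++) {
            String out = tokens.get(i);
            while (k < ops.size() && ops.get(k).index == i) {
                Op op = ops.get(k++);
                if (op.replace) out = op.text; else buf.append(op.text);
            }
            buf.append(out);
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        SortedOpsSketch s = new SortedOpsSketch();
        s.add(new Op(1, true, "X"));     // replace token 1
        s.add(new Op(1, false, "foo"));  // insert before token 1
        s.add(new Op(1, false, "bar"));
        s.add(new Op(1, true, "Y"));     // supersedes the earlier replace
        s.add(new Op(0, false, "<"));    // lands in front via the binary search
        System.out.println(render(Arrays.asList("a", "b", "c"), s.ops)); // <afoobarYc
    }
}
```

Because the list stays sorted, rendering needs only one cursor into the instruction list instead of a lookup per token.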
insertAfter
-
insertAfter
-
insertAfter
-
insertAfter
-
insertBefore
-
insertBefore
-
insertBefore
-
insertBefore
-
replace
-
replace
-
replace
-
replace
-
replace
-
replace
-
delete
public void delete(int index)
-
delete
public void delete(int from, int to)
-
delete
-
delete
-
delete
-
delete
-
discard
public void discard(int ttype)
-
getToken
-
getTokenStreamSize
public int getTokenStreamSize()
-
toOriginalString
-
toOriginalString
-
toString
-
toString
-
toString
-
toString
-
toDebugString
-
toDebugString
-
getLastRewriteTokenIndex
public int getLastRewriteTokenIndex()
-
getLastRewriteTokenIndex
-
setLastRewriteTokenIndex
-
getProgram
-
size
public int size()
-
index
public int index()
-
getEntireText
Description copied from interface: IASDebugStream
Returns the entire text input to the lexer.
Specified by: getEntireText in interface IASDebugStream
Returns: The entire text, or null if an error occurred or System.in was used.
-
getOffsetInfo
Description copied from interface: IASDebugStream
Returns the offset information for the token.
Specified by: getOffsetInfo in interface IASDebugStream
Parameters: token - the token whose information needs to be retrieved
Returns: offset info, or null
-