Introduction
Parsing text streams is a common task in many Java applications. The `StreamTokenizer` class in Java provides a flexible and efficient way to break down a stream of characters into tokens. In this blog post, we will explore the features and functionality of `StreamTokenizer` through 10 different code examples.
Example 1: Basic Usage
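import java.io.*;
public class Example1 {
    public static void main(String[] args) throws IOException {
        String input = "Hello World 123.45 true";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            // sval is null for numeric tokens; their value is stored in nval
            if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                System.out.println("Token: " + tokenizer.nval);
            } else {
                System.out.println("Token: " + tokenizer.sval);
            }
        }
    }
}
Explanation: This example demonstrates the basic usage of `StreamTokenizer` to tokenize a string. The words `Hello`, `World`, and `true` arrive in `sval`, while `123.45` is parsed as a number and arrives in `nval`, so the loop checks `ttype` before printing. It prints each token until `nextToken()` returns `TT_EOF`.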
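Besides words and numbers, a token can also be a quoted string or a single ordinary character, and `ttype` is what tells them apart. The following is a minimal sketch of a helper that labels every kind of token. The class name `TokenDump` and the method `describe` are hypothetical names introduced here for illustration, and the tokenizer is left at its default settings.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TokenDump {
    // Hypothetical helper: returns one label per token, based on ttype.
    static List<String> describe(String input) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        List<String> labels = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                switch (tokenizer.ttype) {
                    case StreamTokenizer.TT_WORD:
                        labels.add("WORD " + tokenizer.sval);
                        break;
                    case StreamTokenizer.TT_NUMBER:
                        labels.add("NUMBER " + tokenizer.nval);
                        break;
                    case '"':
                    case '\'':
                        // For quoted strings, ttype is the quote character
                        labels.add("QUOTED " + tokenizer.sval);
                        break;
                    default:
                        // Ordinary character: ttype is the character itself
                        labels.add("CHAR " + (char) tokenizer.ttype);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return labels;
    }

    public static void main(String[] args) {
        // Prints: WORD x, CHAR =, NUMBER 3.5, CHAR +, QUOTED two words (one per line)
        describe("x = 3.5 + \"two words\"").forEach(System.out::println);
    }
}
```

Returning labels as a list rather than printing inside the loop keeps the helper easy to reuse when experimenting with other inputs.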
Example 2: Customizing Token Characters
import java.io.*;
public class Example2 {
    public static void main(String[] args) throws IOException {
        String input = "Name:John, Age:25, Gender:Male";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        tokenizer.wordChars(':', ':');
        tokenizer.wordChars(',', ',');
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            System.out.println("Token: " + tokenizer.sval);
        }
    }
}
Explanation: This example adds the colon and comma to the set of word characters, so tokens can include them. With these settings the input is split only at spaces, producing `Name:John,`, `Age:25,`, and `Gender:Male`. Each key-value pair stays together as a single word, with the trailing comma still attached.
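If the goal is to separate keys from values, one alternative is to treat `:` and `,` as whitespace instead of word characters. The sketch below takes that route. `PairParser` and `parsePairs` are hypothetical names, and the code assumes that keys and values consist only of letters and digits.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.LinkedHashMap;
import java.util.Map;

class PairParser {
    // Hypothetical helper: reads alternating key and value words into a map.
    static Map<String, String> parsePairs(String input) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        // Clear the numeric flag on digits, then make them word characters,
        // so "25" is returned as text in sval instead of 25.0 in nval.
        tokenizer.ordinaryChars('0', '9');
        tokenizer.wordChars('0', '9');
        // Separators are skipped in the same way as spaces.
        tokenizer.whitespaceChars(':', ':');
        tokenizer.whitespaceChars(',', ',');

        Map<String, String> pairs = new LinkedHashMap<>();
        String key = null;
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype != StreamTokenizer.TT_WORD) {
                    continue; // ignore anything that is not a word
                }
                if (key == null) {
                    key = tokenizer.sval;            // first word of a pair
                } else {
                    pairs.put(key, tokenizer.sval);  // second word of a pair
                    key = null;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Prints {Name=John, Age=25, Gender=Male}
        System.out.println(parsePairs("Name:John, Age:25, Gender:Male"));
    }
}
```

Calling `wordChars('0', '9')` alone would not be enough here, because digits keep their numeric flag until `ordinaryChars` clears it.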
Example 3: Handling Numbers
import java.io.*;
public class Example3 {
    public static void main(String[] args) throws IOException {
        String input = "42 3.14 -7.5";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                System.out.println("Number: " + tokenizer.nval);
            }
        }
    }
}
Explanation: This example focuses on handling numeric values. Every numeric token is returned as a `double` in `nval`, so the output is `42.0`, `3.14`, and `-7.5`. The leading minus sign is parsed as part of the number.
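A common use of `TT_NUMBER` is to pick numbers out of mixed text. The sketch below adds up every numeric token and skips everything else. `NumberSum` and `sum` are hypothetical names used only for this illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;

class NumberSum {
    // Hypothetical helper: adds up all numeric tokens in the input.
    static double sum(String input) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        double total = 0;
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    total += tokenizer.nval; // words and symbols are skipped
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum("apples 3 pears 4.5")); // prints 7.5
        // Exponent notation is not understood: "1e3" is the number 1.0
        // followed by the word "e3".
        System.out.println(sum("1e3")); // prints 1.0
    }
}
```

The second call shows a limit of the built-in number parsing: it handles digits, a decimal point, and a leading minus sign, but not scientific notation.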
Example 4: Handling Quoted Strings
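import java.io.*;
public class Example4 {
    public static void main(String[] args) throws IOException {
        String input = "\"Java Programming\" 'is fun'";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        // Both characters are quote characters by default; the calls make that explicit
        tokenizer.quoteChar('"');
        tokenizer.quoteChar('\'');
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            // For a quoted string, ttype is the quote character itself, not TT_WORD
            if (tokenizer.ttype == '"' || tokenizer.ttype == '\'') {
                System.out.println("String: " + tokenizer.sval);
            }
        }
    }
}
Explanation: This example handles quoted strings enclosed in either double or single quotes and prints the body of each one without the surrounding quotes. A quoted token is reported with `ttype` set to the quote character rather than `TT_WORD`, so the loop tests for the quote characters.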
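Quoted strings also have their escape sequences decoded: `\t`, `\n`, and similar escapes inside the quotes are converted to single characters while the string is parsed. The sketch below collects only the quoted tokens so this can be observed. `QuotedStrings` and `extract` are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class QuotedStrings {
    // Hypothetical helper: returns the body of every quoted token.
    static List<String> extract(String input) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        List<String> strings = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                // ttype equals the quote character that opened the string
                if (tokenizer.ttype == '"' || tokenizer.ttype == '\'') {
                    strings.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return strings;
    }

    public static void main(String[] args) {
        // The tokenizer sees: "col1\tcol2" 'is fun' plain
        List<String> found = extract("\"col1\\tcol2\" 'is fun' plain");
        // The first string contains a real tab character between col1 and col2
        for (String s : found) {
            System.out.println("[" + s + "]");
        }
    }
}
```

The unquoted word `plain` is skipped because its `ttype` is `TT_WORD`, not a quote character.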
Example 5: Recognizing Words and Whitespace
import java.io.*;
public class Example5 {
    public static void main(String[] args) throws IOException {
        String input = "Java is amazing";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        // Both calls restate the defaults: a-z are word characters, space is whitespace
        tokenizer.wordChars('a', 'z');
        tokenizer.whitespaceChars(' ', ' ');
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            System.out.println("Token: " + tokenizer.sval);
        }
    }
}
Explanation: This example explicitly marks `a` through `z` as word characters and the space as whitespace. Both settings match the tokenizer's defaults, so the calls only make the configuration visible. The output is `Java`, `is`, and `amazing`.
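`wordChars` becomes more useful for characters that are not word characters by default. The underscore is one of them, so an identifier such as `max_value` is split at the underscore unless it is added. The sketch below compares both behaviors; `Identifiers` and `words` are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class Identifiers {
    // Hypothetical helper: returns only the word tokens of the input.
    static List<String> words(String input, boolean underscoreIsWordChar) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        if (underscoreIsWordChar) {
            tokenizer.wordChars('_', '_'); // let '_' appear inside words
        }
        List<String> result = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        String input = "max_value = total_count";
        System.out.println(words(input, false)); // prints [max, value, total, count]
        System.out.println(words(input, true));  // prints [max_value, total_count]
    }
}
```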
Example 6: Handling Comments
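import java.io.*;
public class Example6 {
    public static void main(String[] args) throws IOException {
        String input = "/* This is a comment */ 42 // Another comment";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        tokenizer.slashStarComments(true);
        tokenizer.slashSlashComments(true);
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            // The only token left is the number 42, which is stored in nval
            if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                System.out.println("Token: " + tokenizer.nval);
            } else {
                System.out.println("Token: " + tokenizer.sval);
            }
        }
    }
}
Explanation: This example enables both block and line comments, which are then skipped during tokenization. Only `42` survives, and because it is a numeric token its value is read from `nval`, so the output is `Token: 42.0`.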
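Comment handling is not limited to the C-style forms. `commentChar` marks any single character as the start of a comment that runs to the end of the line, which suits shell-style or properties-style input. The sketch below uses `#`; the class `HashComments`, the method `tokens`, and the sample input are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class HashComments {
    // Hypothetical helper: returns words and numbers, skipping '#' comments.
    static List<String> tokens(String input) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        tokenizer.commentChar('#'); // '#' starts a comment that ends at the line break
        List<String> result = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                } else if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    result.add(String.valueOf(tokenizer.nval));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Prints [port, 8080.0, host, localhost]
        System.out.println(tokens("port 8080 # default\nhost localhost"));
    }
}
```

Tokenization resumes on the line after the comment, so `host` and `localhost` are still returned.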
Example 7: Handling EOF
import java.io.*;
public class Example7 {
    public static void main(String[] args) throws IOException {
        String input = "Java is great";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            System.out.println("Token: " + tokenizer.sval);
        }
    }
}
Explanation: This example showcases how to detect the end of the stream (EOF) during tokenization: `nextToken()` returns `StreamTokenizer.TT_EOF` once the reader is exhausted, which ends the loop.
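End of line can be detected in a similar way to end of stream. With `eolIsSignificant(true)`, line breaks are reported as `TT_EOL` tokens instead of being skipped as whitespace. The sketch below uses that to count words per line; `LineCounter` and `wordsPerLine` are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineCounter {
    // Hypothetical helper: returns the number of word tokens on each line.
    static List<Integer> wordsPerLine(String input) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        tokenizer.eolIsSignificant(true); // report line ends as TT_EOL
        List<Integer> counts = new ArrayList<>();
        int current = 0;
        try {
            int type;
            while ((type = tokenizer.nextToken()) != StreamTokenizer.TT_EOF) {
                if (type == StreamTokenizer.TT_EOL) {
                    counts.add(current); // a line just ended
                    current = 0;
                } else if (type == StreamTokenizer.TT_WORD) {
                    current++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (current > 0) {
            counts.add(current); // last line had no trailing line break
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(wordsPerLine("Java is\ngreat")); // prints [2, 1]
    }
}
```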
Example 8: Ignoring Whitespace
import java.io.*;
public class Example8 {
    public static void main(String[] args) throws IOException {
        String input = " Token1 Token2 ";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        // The space is whitespace by default; this call makes the setting explicit
        tokenizer.whitespaceChars(' ', ' ');
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            System.out.println("Token: " + tokenizer.sval);
        }
    }
}
Explanation: This example demonstrates that whitespace around tokens is skipped: the input yields `Token1` and `Token2` with no leading or trailing spaces. The digits stay attached because a word that starts with a letter may continue with digits.
Example 9: Handling Ordinary Characters
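import java.io.*;
public class Example9 {
    public static void main(String[] args) throws IOException {
        String input = "Java@Programming";
        StringReader reader = new StringReader(input);
        StreamTokenizer tokenizer = new StreamTokenizer(reader);
        tokenizer.ordinaryChar('@');
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                System.out.println("Token: " + tokenizer.sval);
            } else {
                // An ordinary character is its own token; ttype holds the character
                System.out.println("Token: " + (char) tokenizer.ttype);
            }
        }
    }
}
Explanation: This example treats '@' as an ordinary character. An ordinary character is never part of a word: it is returned as a single-character token with `ttype` set to the character and `sval` left as `null`. The input is therefore split into `Java`, `@`, and `Programming`.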
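`ordinaryChar` matters most for characters that have a special meaning by default. In the default syntax `-` belongs to the number syntax and `/` starts a comment, which gets in the way when both are meant as operators. The sketch below compares the default behavior with both characters made ordinary; `OperatorTokens` and `tokens` are hypothetical names.

```java
import java.io.IOException;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class OperatorTokens {
    // Hypothetical helper: returns words and single-character tokens as strings.
    static List<String> tokens(String input, boolean operatorsOrdinary) {
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        if (operatorsOrdinary) {
            tokenizer.ordinaryChar('-'); // no longer part of numbers or words
            tokenizer.ordinaryChar('/'); // no longer starts a comment
        }
        List<String> result = new ArrayList<>();
        try {
            while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
                if (tokenizer.ttype == StreamTokenizer.TT_WORD) {
                    result.add(tokenizer.sval);
                } else if (tokenizer.ttype == StreamTokenizer.TT_NUMBER) {
                    result.add(String.valueOf(tokenizer.nval));
                } else {
                    result.add(String.valueOf((char) tokenizer.ttype));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        // Default syntax: '-' is absorbed into the word, '/' comments out the rest
        System.out.println(tokens("a-b/c", false)); // prints [a-b]
        System.out.println(tokens("a-b/c", true));  // prints [a, -, b, /, c]
    }
}
```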
Example 10: Resetting Tokenization
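import java.io.*;
public class Example10 {
    public static void main(String[] args) throws IOException {
        String input = "Token1 Token2 Token3";
        StreamTokenizer tokenizer = new StreamTokenizer(new StringReader(input));
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            System.out.println("Token: " + tokenizer.sval);
        }
        // The first reader is exhausted, so a second pass needs a new reader and tokenizer
        tokenizer = new StreamTokenizer(new StringReader(input));
        // resetSyntax() clears the syntax table: every character becomes ordinary
        tokenizer.resetSyntax();
        // Rebuild only the rules this pass needs
        tokenizer.wordChars('A', 'Z');
        tokenizer.wordChars('a', 'z');
        tokenizer.wordChars('0', '9');
        tokenizer.whitespaceChars(' ', ' ');
        while (tokenizer.nextToken() != StreamTokenizer.TT_EOF) {
            System.out.println("Reused Token: " + tokenizer.sval);
        }
    }
}
Explanation: This example shows what `resetSyntax()` actually resets: the syntax table, not the position in the input. After the call every character is ordinary, so the word and whitespace rules are rebuilt by hand. Reading the input a second time takes a new `StringReader` and a new `StreamTokenizer`, because the first reader has already been consumed.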
Conclusion
In this comprehensive guide, we covered various aspects of the `StreamTokenizer` class in Java through 10 different code examples. From basic usage to advanced customization, you now have a solid foundation for efficiently parsing text streams in your Java applications. Experiment with these examples and adapt them to your specific use cases to master the art of stream tokenization.