I'm trying to parse an input file as follows:
#*Nonmonotonic logic - context-dependent reasoning.
#@Victor W. Marek,Miroslaw Truszczynski
#t1993
#cArtificial Intelligence
#index3003478
#%3005567
#%3005568
#!abstracst
#*Wissensrepräsentation und Inferenz - eine grundlegende Einführung.
#@Wolfgang Bibel,Steffen Hölldobler,Torsten Schaub
#t1993
#cArtificial Intelligence
#index3005557
#%3005567
#!abstracts2
I'm creating the parser for this file and I'm looking for output as follows:
Nonmonotonic logic - context-dependent reasoning. Victor W. Marek,Miroslaw Truszczynski 1993 Artificial Intelligence 3003478 300557,300558
Wissensrepr?sentation und Inferenz - eine grundlegende Einf?hrung. Wolfgang Bibel,Steffen H?lldobler,Torsten Schaub 1993 Artificial Intelligence 3005557 3003478
However, there can be multiple lines starting with #%, and I could not figure out how to handle this. As a result, the output is always duplicated for records with more than one #% line. For example:
Nonmonotonic logic - context-dependent reasoning. Victor W. Marek,Miroslaw Truszczynski 1993 Artificial Intelligence 3003478 300557
Nonmonotonic logic - context-dependent reasoning. Victor W. Marek,Miroslaw Truszczynski 1993 Artificial Intelligence 3003478 300557 300558
Wissensrepr?sentation und Inferenz - eine grundlegende Einf?hrung. Wolfgang Bibel,Steffen H?lldobler,Torsten Schaub 1993 Artificial Intelligence 3005557 3003478
Below is my code. I also tried changing the last if condition, the one that handles #%, into a while loop, but that did not work either.
I'm thinking about detecting whether the line after a #% line also starts with #%, in which case it should be parsed into the same variable. However, I could not figure out the right syntax for this. I tried hasNext() and next(), but they gave a syntax error in my Java program. I'm not very strong in programming, so I'm asking for help here.
import java.util.Scanner;
import java.io.*;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.sql.SQLException;

public class Citation2 {
    String title;
    String author;
    String year;
    String conference;
    String index;
    String cite;
    String abstracts;
    String Line;

    private final Path fFilePath;
    private final static Charset ENCODING = StandardCharsets.UTF_8;

    public static void main(String[] args) throws SQLException,
            ClassNotFoundException, IOException {
        Citation2 parser = new Citation2("D:/test.txt");
        parser.processLineByLine();
    }

    public Citation2(String aFileName) {
        fFilePath = Paths.get(aFileName);
    }

    public final void processLineByLine() throws IOException, ClassNotFoundException, SQLException {
        try (Scanner scanner = new Scanner(fFilePath, ENCODING.name())) {
            while (scanner.hasNextLine()) {
                processLine(scanner.nextLine());
            }
        }
    }

    protected void processLine(String aLine) throws ClassNotFoundException, SQLException {
        if (aLine.startsWith("#*")) {
            title = aLine.substring(2);
            Line = title;
        }
        else if (aLine.startsWith("#@")) {
            author = aLine.substring(2);
            Line = Line + "\t" + author;
        }
        else if (aLine.startsWith("#t")) {
            year = aLine.substring(2);
            Line = Line + "\t" + year;
        }
        else if (aLine.startsWith("#c")) {
            conference = aLine.substring(2);
            Line = Line + "\t" + conference;
        }
        else if (aLine.startsWith("#index")) {
            index = aLine.substring(6);
            Line = Line + "\t" + index;
        }
        else if (aLine.startsWith("#%")) {
            cite = aLine.substring(2);
            Line = Line + "\t" + cite;
            // Prints once per #% line, which is why records with several #% lines appear twice
            System.out.println(Line);
        }
    }
}
I wanted to do something like this, but it has a syntax error on next.
else if (aLine.startsWith("#%")) {
    cite = aLine.substring(2);
    if (aLine.next.startsWith("#@"))
    {
        cite = "," + cite;
    }
    Line = Line + "\t" + cite;
    System.out.println(Line);
}
3 Answers
To get the next line you will need to pass the scanner along as well. Currently you are passing a string which has no idea what the next line in the file is.
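As a rough illustration of passing the scanner along, the sketch below peeks at the upcoming line with `Scanner.hasNext(String pattern)`, which tests the next whitespace-delimited token without consuming it. The class name `CitationPeek` and the helper `readCitations` are hypothetical, and the sketch assumes each `#%` line starts at the beginning of the line with no space between `#%` and the ID.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Scanner;

class CitationPeek {
    // Collects the first "#%" value plus the values of any "#%" lines that directly follow it.
    static String readCitations(String firstCiteLine, Scanner scanner) {
        List<String> cites = new ArrayList<>();
        cites.add(firstCiteLine.substring(2));
        // hasNext(pattern) looks at the next token but does not advance the scanner
        while (scanner.hasNext("#%.*")) {
            String next = scanner.nextLine().trim();
            if (next.startsWith("#%")) {      // skips blank lines that the token peek jumped over
                cites.add(next.substring(2));
            }
        }
        return String.join(",", cites);
    }

    public static void main(String[] args) {
        String input = "#index3003478\n#%3005567\n#%3005568\n#!abstracst\n";
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (line.startsWith("#%")) {
                    System.out.println(readCitations(line, scanner));
                }
            }
        }
    }
}
```

Because the peek does not consume anything, the line that follows the last `#%` line is still available to the main loop.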
You should consider using a StringBuilder; it might be more efficient if you have a large file, since you don't have to create new objects every time you concatenate. Here is an example:
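The answer's original example did not survive, so the following is only a minimal sketch of the StringBuilder idea, not the answerer's code. It assumes that a `#!` line closes each record, as in the sample input, and prints the record there instead of on every `#%` line. The names `CitationBuilderSketch` and `parse` are made up for illustration.

```java
import java.util.Scanner;

class CitationBuilderSketch {
    // Returns one tab-separated line per record; multiple "#%" values are comma-joined.
    static String parse(String input) {
        StringBuilder out = new StringBuilder();
        StringBuilder record = new StringBuilder();
        boolean hasCite = false;
        try (Scanner scanner = new Scanner(input)) {
            while (scanner.hasNextLine()) {
                String line = scanner.nextLine();
                if (line.startsWith("#*")) {
                    record.setLength(0);          // start a new record
                    hasCite = false;
                    record.append(line.substring(2));
                } else if (line.startsWith("#index")) {
                    record.append('\t').append(line.substring(6));
                } else if (line.startsWith("#@") || line.startsWith("#t") || line.startsWith("#c")) {
                    record.append('\t').append(line.substring(2));
                } else if (line.startsWith("#%")) {
                    // first citation gets a tab, later ones a comma
                    record.append(hasCite ? ',' : '\t').append(line.substring(2));
                    hasCite = true;
                } else if (line.startsWith("#!")) {
                    out.append(record).append('\n');  // assumed end of record
                }
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(parse("#*Title\n#@Author\n#t1993\n#cAI\n#index1\n#%2\n#%3\n#!abstract\n"));
    }
}
```

Emitting the record at its end, rather than when a `#%` line is seen, is what removes the duplicated output.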
A nice approach would be to create the scanner instance and pass it to processLine. Inside processLine, do as below:
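The code for this answer is missing, so what follows is a guess at its general shape rather than the original. In this sketch, `processLine` receives the scanner, keeps reading while the following lines start with `#%`, and hands the first line it could not use back to the caller. The class `PassScannerSketch`, the `run` method and the "leftover line" return value are assumptions made for illustration, and only the `#*` and `#%` fields are handled to keep it short.

```java
import java.util.Scanner;

class PassScannerSketch {
    private final StringBuilder out = new StringBuilder();
    private String record = "";

    // Handles one line and may read ahead. Returns a line that was read but not handled, or null.
    String processLine(String line, Scanner scanner) {
        if (line.startsWith("#*")) {
            record = line.substring(2);
        } else if (line.startsWith("#%")) {
            StringBuilder cites = new StringBuilder(line.substring(2));
            String leftover = null;
            while (scanner.hasNextLine()) {
                String next = scanner.nextLine();
                if (!next.startsWith("#%")) {
                    leftover = next;          // belongs to the caller, not to this citation list
                    break;
                }
                cites.append(',').append(next.substring(2));
            }
            out.append(record).append('\t').append(cites).append('\n');
            return leftover;
        }
        return null;
    }

    String run(String input) {
        try (Scanner scanner = new Scanner(input)) {
            String pending = null;
            while (pending != null || scanner.hasNextLine()) {
                String line = (pending != null) ? pending : scanner.nextLine();
                pending = processLine(line, scanner);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(new PassScannerSketch().run("#*A\n#%1\n#%2\n#*B\n#%3\n"));
    }
}
```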