/**
 * @param fileName
 * @param forEachXpath
 * @param fields
 * @return
 * @throws IOException
 * @throws ParseExceptionHuge
 * @throws EncodingExceptionHuge
 * @throws EOFExceptionHuge
 * @throws EntityExceptionHuge
 */
private static Iterator<Map<String, Object>> constructRowIteratorHuge(String fileName, String forEachXpath,
        final List<Map<String, String>> fields) throws IOException, ParseExceptionHuge {
    JulieXMLBuffer buffer = new JulieXMLBuffer();
    buffer.readFile(fileName);
    VTDGenHuge vg = new VTDGenHuge();
    vg.setDoc(buffer);
    vg.parse(true);
    VTDNavHuge vn = vg.getNav();
    return constructRowIterator(vn, forEachXpath, fields, fileName);
}

import com.ximpleware.extended.*;

public class mem_mapped_read {
    public static void main(String[] s) throws Exception {
        VTDGenHuge vg = new VTDGenHuge();
        if (vg.parseFile("test.xml", true, VTDGenHuge.MEM_MAPPED)) {
            VTDNavHuge vnh = vg.getNav();
            AutoPilotHuge aph = new AutoPilotHuge(vnh);
            aph.selectXPath("//*");
            int i = 0;
            while ((i = aph.evalXPath()) != -1) {
                System.out.println(" element name is " + vnh.toString(i));
            }
        }
    }
}

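// The example above asks for VTDGenHuge.MEM_MAPPED. As a rough illustration of
// what memory-mapped reading means on the JVM, the sketch below maps a file with
// java.nio and scans it without copying it onto the heap. It is not the
// implementation of XMLMemMappedBuffer: the class name MemMappedSketch, the
// WINDOW size, and the countOpenAngles helper are all hypothetical and exist only
// for this illustration.

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

final class MemMappedSketch {
    // Hypothetical window size: one MappedByteBuffer cannot exceed Integer.MAX_VALUE bytes,
    // so a large file is mapped piece by piece.
    private static final long WINDOW = 1L << 30;

    /** Counts '<' bytes in the file by scanning read-only mapped windows. */
    static long countOpenAngles(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            long count = 0;
            long pos = 0;
            while (pos < size) {
                long len = Math.min(size - pos, WINDOW);
                MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, pos, len);
                while (buf.hasRemaining()) {
                    if (buf.get() == '<') {
                        count++;
                    }
                }
                pos += len;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        Path tmp = Files.createTempFile("mem_mapped_sketch", ".xml");
        tmp.toFile().deleteOnExit();
        Files.write(tmp, "<a><b/></a>".getBytes(StandardCharsets.UTF_8));
        System.out.println(countOpenAngles(tmp)); // prints 3
    }
}
```

// Mapping in windows keeps each buffer under the 2 GB limit of a single
// MappedByteBuffer; how the library itself addresses large files is not shown here.
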
/**
 * This method returns the VTDNavHuge object after parsing; it also clears the
 * internal state so that VTDGenHuge can process the next file.
 * @return com.ximpleware.extended.VTDNavHuge
 */
public VTDNavHuge getNav() {
    // call VTDNavHuge constructor
    VTDNavHuge vn =
        new VTDNavHuge(
            rootIndex,
            encoding,
            ns,
            VTDDepth,
            xb,
            VTDBuffer,
            l1Buffer,
            l2Buffer,
            l3Buffer,
            docOffset,
            docLen);
    clear();
    return vn;
}

    XMLBuffer xb = new XMLBuffer();
    xb.readFile(fileName);
    this.setDoc(xb);
    this.parse(ns); // namespace awareness is set by the ns argument
    return true;
} else if (mode == MEM_MAPPED) {
    XMLMemMappedBuffer xmb = new XMLMemMappedBuffer();
    xmb.readFile(fileName);
    this.setDoc(xmb);
    this.parse(ns); // namespace awareness is set by the ns argument
    return true;

decide_encoding();
writeVTD(0, 0, TOKEN_DOCUMENT, depth);
while (true) {
    switch (parser_state) {
            break;
        case '?' :
            parser_state = process_qm_seen();
            break;
        case '!' : // three possibilities (comment, CDATA, DOCTYPE)
            parser_state = process_ex_seen();
            break;
        default :
            throw new ParseExceptionHuge(
                "Other Error: Invalid char after <" + formatLineNumber());
throw new ParseExceptionHuge(
    "Other Error: Depth exceeds MAX_DEPTH" + formatLineNumber());
throw new ParseExceptionHuge(
    "Token Length Error: Starting tag prefix or qname length too long" + formatLineNumber());
writeVTD(
    (temp_offset),
    (length2 << 10) | length1,

throw new ParseExceptionHuge("Token Length Error:"
    + " PI name too long (>0xfffff)" + formatLineNumber());
writeVTD(
    (temp_offset),
    length1,
throw new ParseExceptionHuge("Token Length Error:"
    + " PI name too long (>0xfffff)" + formatLineNumber());
writeVTD(
    (temp_offset) >> 1,
    (length1 >> 1),
if (r.skipChar('>')) {
    temp_offset = offset;
    ch = getCharAfterSe();
    if (ch == '<') {
        parser_state = STATE_LT_SEEN;
        entityIdentifier();
        parser_state = STATE_TEXT;
    } else if (ch == ']') {
throw new ParseExceptionHuge(
    "Error in text content: ]]> in text content" + formatLineNumber());
throw new ParseExceptionHuge(

&& r.skipChar('o')
&& r.skipChar('n')) {
    ch = getCharAfterS();
    if (ch == '=') {
writeVTD(
    temp_offset - 1,
    7,
    depth);
else
    writeVTD(
        (temp_offset - 2) >> 1,
        7,
throw new ParseExceptionHuge(
    "XML decl error: Invalid char" + formatLineNumber());
} else
    throw new ParseExceptionHuge(
        "XML decl error: should be version" + formatLineNumber());
ch_temp = getCharAfterS();
if (ch_temp != '\'' && ch_temp != '"')
    throw new ParseExceptionHuge(
        "XML decl error: Invalid char to start attr name" + formatLineNumber());
temp_offset = offset;

throw new ParseExceptionHuge(
    "Error in comment: Invalid Char" + formatLineNumber());
writeVTD(
    temp_offset,
    length1,
    depth);
else
    writeVTD(
        temp_offset >> 1,
        length1 >> 1,
+ formatLineNumber());

throw new ParseExceptionHuge(
    "Error in DOCTYPE: Invalid char" + formatLineNumber());
throw new ParseExceptionHuge("Token Length Error:"
    + " DTD val too long (>0xfffff)" + formatLineNumber());
writeVTD(
    temp_offset,
    length1,
throw new ParseExceptionHuge("Token Length Error:"
    + " DTD val too long (>0xfffff)" + formatLineNumber());
writeVTD(
    temp_offset >> 1,
    length1 >> 1,
    depth);
ch = getCharAfterS();
if (ch == '<') {
    parser_state = STATE_LT_SEEN;
throw new ParseExceptionHuge(
    "Other Error: Invalid char in xml" + formatLineNumber());
return parser_state;

private int process_end_doc()
    throws ParseExceptionHuge, EncodingExceptionHuge, EOFExceptionHuge {
    int parser_state;
    ch = getCharAfterS();
    /* eof exception should be thrown here for premature ending */
    if (ch == '<') {
        if (r.skipChar('?')) {
            /* processing instruction after end tag of root element */
            temp_offset = offset;
            parser_state = STATE_END_PI;
            return parser_state;
        } else if (
            r.skipChar('!')
                && r.skipChar('-')
                && r.skipChar('-')) {
            // comments allowed after the end tag of the root element
            temp_offset = offset;
            parser_state = STATE_END_COMMENT;
            return parser_state;
        }
    }
    throw new ParseExceptionHuge(
        "Other Error: XML not terminated properly" + formatLineNumber());
}

private int process_qm_seen()
    throws ParseExceptionHuge, EncodingExceptionHuge, EOFExceptionHuge {
    temp_offset = offset;
    ch = r.getChar();
    if (XMLChar.isNameStartChar(ch)) {
        //temp_offset = offset;
        if ((ch == 'x' || ch == 'X')
            && (r.skipChar('m') || r.skipChar('M'))
            && (r.skipChar('l') || r.skipChar('L'))) {
            ch = r.getChar();
            if (ch == '?' || XMLChar.isSpaceChar(ch))
                throw new ParseExceptionHuge(
                    "Error in PI: [xX][mM][lL] not a valid PI targetname" + formatLineNumber());
            offset = getPrevOffset();
        }
        return STATE_PI_TAG;
    }
    throw new ParseExceptionHuge(
        "Other Error: First char after <? invalid" + formatLineNumber());
}

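// process_qm_seen() rejects a processing instruction whose target is "xml" in any
// letter case, which matches the XML 1.0 rule that PITarget is a Name other than
// [xX][mM][lL]. The sketch below restates that check on a plain String so it can
// be tried in isolation. PiTargetSketch, readTarget and isReserved are
// hypothetical names for this illustration and are not part of VTD-XML.

```java
final class PiTargetSketch {
    /** XML 1.0 whitespace: space, tab, carriage return, line feed. */
    private static boolean isXmlSpace(char c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    /**
     * Returns the PI target from the text that follows "<?", that is,
     * everything up to the first whitespace character or '?'.
     */
    static String readTarget(String afterQm) {
        int end = 0;
        while (end < afterQm.length()) {
            char c = afterQm.charAt(end);
            if (c == '?' || isXmlSpace(c)) {
                break;
            }
            end++;
        }
        return afterQm.substring(0, end);
    }

    /** True when the target is exactly "xml" in any letter case. */
    static boolean isReserved(String target) {
        return target.equalsIgnoreCase("xml");
    }

    public static void main(String[] args) {
        System.out.println(isReserved(readTarget("XmL version=\"1.0\"?>")));     // prints true
        System.out.println(isReserved(readTarget("xml-stylesheet href=\"a\"?>"))); // prints false
    }
}
```

// Only the exact three-letter target is reserved here; a longer target such as
// xml-stylesheet passes, just as the parser method resets the offset and
// continues when the character after "xml" is neither '?' nor whitespace.
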
/**
 * parseFile with the mode defaulting to IN_MEMORY.
 * @param fileName name of the XML file to parse
 * @param ns whether parsing is namespace aware
 * @return boolean indicating whether the parseFile call succeeded
 */
public boolean parseFile(String fileName, boolean ns) {
    return parseFile(fileName, ns, IN_MEMORY);
}

/**
 * The entity aware version of getCharAfterS
 * @return int
 * @throws ParseExceptionHuge Super class for any exception during parsing.
 * @throws EncodingExceptionHuge UTF/native encoding exception.
 * @throws com.ximpleware.extended.EOFExceptionHuge End of file exception.
 */
private int getCharAfterSe()
    throws ParseExceptionHuge, EncodingExceptionHuge, EOFExceptionHuge {
    int n = 0;
    long temp; // offset saver
    while (true) {
        n = r.getChar();
        if (!XMLChar.isSpaceChar(n)) {
            if (n != '&')
                return n;
            else {
                temp = offset;
                if (!XMLChar.isSpaceChar(entityIdentifier())) {
                    offset = temp; // rewind
                    return '&';
                }
            }
        }
    }
}

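// getCharAfterSe() skips whitespace, including whitespace written as an entity,
// and rewinds when the entity turns out not to be a space. The sketch below
// mimics that control flow over a String. It makes two simplifying assumptions
// that differ from the parser: only numeric character references such as &#x20;
// or &#10; are decoded, and the end of input is signalled by -1 instead of an
// EOF exception. EntityAwareSkipSketch and its members are hypothetical names.

```java
final class EntityAwareSkipSketch {
    private final String text;
    private int offset;

    EntityAwareSkipSketch(String text) {
        this.text = text;
    }

    int offset() {
        return offset;
    }

    private static boolean isXmlSpace(int c) {
        return c == ' ' || c == '\t' || c == '\r' || c == '\n';
    }

    /**
     * Decodes a numeric character reference whose '#' sits at the current offset
     * (just past '&'). On success the offset moves past ';' and the code point is
     * returned; otherwise the offset is left alone and -1 is returned.
     */
    private int numericReference() {
        int semi = text.indexOf(';', offset);
        if (semi < 0 || offset >= text.length() || text.charAt(offset) != '#') {
            return -1;
        }
        String body = text.substring(offset + 1, semi);
        try {
            int codePoint = body.startsWith("x")
                ? Integer.parseInt(body.substring(1), 16)
                : Integer.parseInt(body, 10);
            offset = semi + 1;
            return codePoint;
        } catch (NumberFormatException e) {
            return -1;
        }
    }

    /** Returns the next character that is not whitespace, or -1 at end of input. */
    int nextNonSpace() {
        while (offset < text.length()) {
            int c = text.charAt(offset++);
            if (isXmlSpace(c)) {
                continue;
            }
            if (c != '&') {
                return c;
            }
            int saved = offset; // offset saver
            if (!isXmlSpace(numericReference())) {
                offset = saved; // rewind so the caller can process the entity itself
                return '&';
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        EntityAwareSkipSketch s = new EntityAwareSkipSketch("  &#x20;\t&#10;<a>");
        System.out.println((char) s.nextNonSpace()); // prints <
    }
}
```

// As in the method above, a '&' that does not resolve to whitespace is returned
// with the offset rewound to just after it, leaving the entity for the caller.
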
throw new ParseExceptionHuge(
    "Error in comment: Invalid char sequence to start a comment" + formatLineNumber());
case '[' :
    if (r.skipChar('C')
throw new ParseExceptionHuge(
    "Error in CDATA: Wrong place for CDATA" + formatLineNumber());
throw new ParseExceptionHuge(
    "Error in CDATA: Invalid char sequence for CDATA" + formatLineNumber());
throw new ParseExceptionHuge(
    "Error for DOCTYPE: Only DOCTYPE allowed" + formatLineNumber());
if (depth != -1)
    throw new ParseExceptionHuge(
        "Error for DOCTYPE: DTD at wrong place" + formatLineNumber());
throw new ParseExceptionHuge(
    "Error for DOCTYPE: Invalid char sequence for DOCTYPE" + formatLineNumber());
+ formatLineNumber());