/** File: FileTable.java */
import java.io.*;
/**
* This class reads tab-delimited data from files into a Table and
* can write data to files. The term Table (upper case 'T') denotes
* the FileTable data structure. The class also parses tab-delimited
* data with or without a header (1 to n lines of header) from a
* file into a Table
* (tRows,tCols,tHdrRows,tPreface,tHeader,tFields,tData).
*
* If there are multiple header lines, it uses the last header line as
* the tFields[] data. You can optionally check for duplicate fields
* and ignore them; otherwise a duplicate is a parsing error.
*
* You can optionally search the input file for a blank line after which
* the actual header appears. You can also specify a keyword in the
* header line to denote where the Table starts. You can remove blank
* lines at the end of the file to shorten the table length. The class
* can also be used to create an empty table (not associated with a file).
*
* List of Methods
* FileTable() - Constructor, set some defaults.
* FileTable(tableName) - Constructor, set some defaults.
* FileTable(tableName, rows, cols) - Constructor, set some defaults.
* clearTable() - set to empty table. Generally called from Constructor
* cloneTable() - clone the current Table. Copy data by value.
* setNbrTableHdrLines() - number of lines of the Table header.
* setHasTableHeaderFlag() - set Has Table Header Flag.
* setDuplicateFieldsFlag() - set duplicate fields Flag.
* setHasEmptyLineBeforeTableFlag() - set look for empty line(s) BEFORE table
* setRmvTrailingBlankLinesFlag() - remove trailing blank lines in the Table
* setRmvTrailingEmptyColumnsFlag() - set remove empty trailing columns flag.
* setStartTableAtColStr() - start Table at string known to be in header.
* setColNameIndexMapKeys() - set column names keys for ftIdxMap index-map.
* setUseOnlyLastHeaderLineFlag() - set use only last header line flag
* readTableRowFromFile() - random access read row r tokens from indexed file.
* readLineRowFromFile() - random access read row r data from indexed file.
* checkForBadTable() - verify that it is a well formed table
* checkForBadTableHeader() - verify that is a well formed tField Table
* parseTableFromString() - convert tab-delim string to Table structure.
* readAndParseTableIndexMap() - read tab-delim file to Table Index-Map.
* readAndParseTableAll() - read & convert tab-delim file to Table & ftIdxMap.
* readAndParseTable() - read & convert tab-delim file to Table structure.
* readAndParseTableFields() - read tab-delim file to tFields Table structure.
* readAndParseTableFieldsAndIndexMap - read tab-delim file tFields & ftIdxMap
* readAndParseTableData() - read & convert tab-delim file to Table structure.
* computeFileLineTermSize() - compute a file's line terminator size.
* cvtTableToTabDelimStr() - convert table to tab delimited string
* rmvTrailingBlankLinesFromTable() - remove Table trailing blank lines.
* deleteTableColumnByName() - delete a column from the Table.
* deleteTableColumnByColIdx() - delete a column from table by column index
* lookupFieldIdx() - lookup column index of field if exists.
* lookupRowFromColumnData() - lookup row index from column data if exists.
* lookupCol2DataFromCol1DataRow() - lookup col2 data from row w/col1 data match
* appendRowToTable() - append row of data to table.
* removeRowFromTable() - remove row of data from table.
* deleteRowsFromTable() - delete rows from the Table and resize the Table.
* deleteTrailingEmptyColumnsFromTable() - removes empty trailing columns
* mapHeaderColNames() - map tField & tHeaders old to new names in map list.
* setFieldsToTable() - set the Fields list to the Table and set tCols.
* setHeadersToTable() - set the Headers list to the Table and set tHdrRows.
* addEmptyColumnToExistingTable() - add empty column to existing Table.
* addColumnToTable() - add a new column to Table with specified field name.
* replaceColumnData() - replace data in a column in the Table.
* setColumnDataByValue() - set all data in a column in Table to a value.
* getColumnData() - lookup the column name and then return all column data.
* reorderRowsBySortIndex() - reorder the entire table rows by the sortIdx.
* reorderColumnsInTable() - reorder Table tFields & tData by column index
* joinTableToCurrentTable() - join another Table ftJoin to current Table.
* trimTableEnclWhitespace() - trim (clean) enclosing white space in Table.
* trimArrayEnclWhitespace() - trim (clean) enclosing white space in a 1D array.
* sortTableRowsByField() - sort table rows by field.
* limitMaxRows() - keep max nbr rows in front of the Table.
* limitMaxRowsSortedByField() - sort Table by field, then keep max nbr rows
*
* This code is available at the HTMLtools project on SourceForge at
* http://htmltools.sourceforge.net/
* under the "Common Public License Version 1.0"
* http://www.opensource.org/licenses/cpl1.0.php.
*
* It was derived and refactored from the open source
* MAExplorer (http://maexplorer.sourceforge.net/), and
* Open2Dprot (http://Open2Dprot.sourceforge.net/) Table modules.
*
* $Date: 2009/11/23 11:45:56 $ $Revision: 1.37 $
*
* If the table is not at the front of the file, one option is to
* find the start of the table by searching for a this.startTblColStr
* string if you know a unique string in the header row. This is used
* if startTblColStr is not blank (set by setStartTableAtColStr()).
* If it matches, we then search backwards to the front of the string
* for a '\n' which is the end of the previous row.
*
* An alternate method for finding the start of the table, if
* this.hasEmptyLineBeforeTableFlag is set, is to look for the first
* non-empty row after an empty row. That row is then assumed to be
* the start of the Table (Fields + data, or data only).
* This can be problematic if there are multiple blank lines.
*
* When the front of the table is found, then it sets
* this.firstDataRowLineNbr to that line number.
* If there is no empty row found, then it assumes there is no extra
* data before the table and it sets this.firstDataRowLineNbr to 0.
* If there is text before the Table, then it is put into this.tPreface.
*
* @param tblStr is String to parse and convert to Table
* @return true if successful, else errMsgLog has reason failed.
* @see #rmvTrailingBlankLinesFromTable
* @see #deleteTrailingEmptyColumnsFromTable
* @see #checkForBadTable
*/
public boolean parseTableFromString(String tblStr)
{ /* parseTableFromString */
if(tblStr==null)
return(false);
/* [1] Setup parser variables */
int
tDataSize= 0,
nbrRowsIncr= 100,
lineTermSize= 0,
countLines= 0, /* local line counter */
countMoreHdrLines= 0; /* used to save hdr lines before parse
* after found blank line */
long
startByte= 0, /* buffer ptr at start of a line */
endByte= 0; /* buffer ptr at end of a line */
/* [1.1] Find terminator as either \r, \n or \r\n */
String
lineSep= this.getLineSeparatorFromString(tblStr);
/* Set to the first line of tData when found */
firstDataRowLineNbr= -1;
/* [2] Read and parse the tab-delimited data file into the Table
* and get the tPreface as all data before the Header.
*/
/* Read file and parse data into Table */
/* [2.1] Get the file line separator and its size.
* It is either \r, \n or \r\n - or null if an error */
if(lineSep==null)
{ /* Handle errors */
errMsgLog += "MIF-RAPF no \\r, \\n or \\r\\n terminator in " +
"input string.\n";
lastErrMsgLog= errMsgLog;
return (false);
}
lineTermSize= lineSep.length();
/* [2.2] Init parse variables */
String
lookbackHdrs[]= new String[nbrTableHdrLines],
tokens[]= null, /* list of tokens in the inputLine */
inputLine = null; /* lines read from buffered reader */
long
lookbackStartBytePtr[]= new long[nbrTableHdrLines+1],
lookbackEndBytePtr[]= new long[nbrTableHdrLines+1];
boolean
foundDataRowFlag= false,
foundHdrRowFlag= false; /* We saw header row in file, else error */
tHdrRows= nbrTableHdrLines;
tRows= 0; /* count first row */
tCols= 0; /* there is at least 1 Row if fields*/
/* [3] Read lines from the input file */
int
fromIndex= 0,
idx= 0;
while(fromIndex!=-1 && idx!=-1)
{ /* process lines */
idx= tblStr.indexOf(lineSep, fromIndex);
if(idx==-1)
inputLine= tblStr.substring(fromIndex); /* last line has no terminator */
else
{
inputLine= tblStr.substring(fromIndex,idx);
fromIndex= idx+lineTermSize; /* advance the line pointer */
}
countLines++;
/* [3.1] Parse the Header and save header lines, backwards,
* for parsing into the tHeader and tFields.
*/
if(!foundHdrRowFlag)
{ /* save until parsed header */
/* [3.1.1] Save Preface with stuff falling out of lookback */
String prefaceLine= lookbackHdrs[tHdrRows-1];
if(prefaceLine!=null)
{ /* only push non-null lines to the tPreface */
if(tPreface==null)
tPreface= prefaceLine +"\n";
else
tPreface += prefaceLine +"\n";
}
/* [3.1.2] Push data back into the look back list */
for(int lb=(tHdrRows-2); lb>=0; lb--)
{ /* lb is backwards index */
lookbackHdrs[lb+1]= lookbackHdrs[lb];
lookbackStartBytePtr[lb+1]= lookbackStartBytePtr[lb];
lookbackEndBytePtr[lb+1]= lookbackEndBytePtr[lb];
}
/* [3.2] Compute the line byte pointers with terminator */
startByte= endByte+1;
endByte += inputLine.length() + lineTermSize;
/* [3.3] Save current line at end of look back stack. */
lookbackHdrs[0]= inputLine;
lookbackStartBytePtr[0]= startByte;
lookbackEndBytePtr[0]= endByte;
/* [3.4] Read the rest of the header lines. */
if(countMoreHdrLines>0)
{ /* save more header lines until ready to parse */
if(--countMoreHdrLines<=0)
foundHdrRowFlag= true;
else
continue; /* keep saving lines */
}
/* [3.5] Look for the header starting with specific
* string in the tFields (last row of the tHeader).
*/
if(startTblColStr!=null)
{ /* look for string known in header for start of Table */
int idxKeyword= inputLine.indexOf(startTblColStr);
if(idxKeyword!=-1)
{ /* Found the header tFields row */
foundHdrRowFlag= true;
}
}
/* [3.6] Alternately, look for empty line after which the table
* will start. This can be problematic if there are multiple
* blank lines.
*/
else if(hasEmptyLineBeforeTableFlag)
{ /* scan through the preface */
if(inputLine.length()==0)
{ /* found empty line - assume before header */
countMoreHdrLines= nbrTableHdrLines;
continue;
} /* found empty line - assume before header */
}
else if(countLines>=tHdrRows)
{ /* captured header line since no empty lines before Table */
foundHdrRowFlag= true;
}
/* [3.7] Continue scanning for the header */
if(!foundHdrRowFlag && !foundDataRowFlag)
continue; /* keep scanning */
/* [4] Found the header - save it in tFields and tHeader */
else if(foundHdrRowFlag && !foundDataRowFlag)
{ /* Process the header rows */
if(countLines==1)
tPreface= null; /* there is no preface */
/* [4.1] Save the tFields[] */
inputLine= lookbackHdrs[0];
tFields= UtilCM.cvs2Array(inputLine, "\t");
tCols= tFields.length;
/* [4.2] Now save the header rows */
tHeader= new String[tHdrRows][];
lineStartHeaderBytePtr= new long[tHdrRows];
lineEndHeaderBytePtr= new long[tHdrRows];
for(int h=0;h
* It also sets the Index-Map byte pointers
* lineStartHeaderBytePtr[], lineEndHeaderBytePtr[] for the tHeader and
* lineStartDataBytePtr[], and lineEndDataBytePtr[] for the tData.
*
* If the table is not at the front of the file, one option is to
* find the start of the table by searching for a this.startTblColStr
* string if you know a unique string in the header row. This is used
* if startTblColStr is not blank (set by setStartTableAtColStr()).
* If it matches, we then search backwards to the front of the string
* for a '\n' which is the end of the previous row.
*
* An alternate method for finding the start of the table, if
* this.hasEmptyLineBeforeTableFlag is set, is to look for the first
* non-empty row after an empty row. That row is then assumed to be
* the start of the Table (Fields + data, or data only).
* This can be problematic if there are multiple blank lines.
*
* When the front of the table is found, then it sets
* this.firstDataRowLineNbr to that line number.
* If there is no empty row found, then it assumes there is no extra
* data before the table and it sets this.firstDataRowLineNbr to 0.
* If there is text before the Table, then it is put into this.tPreface.
*
* @param inputFile is file to read and convert to table
* @return true if successful, else errMsgLog has reason failed.
* @see #readAndParseTableData
*/
public boolean readAndParseTableIndexMap(String inputFile)
{ /* readAndParseTableIndexMap */
return(readAndParseTableData(inputFile, false, false, true));
} /* readAndParseTableIndexMap */
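/* Example (a hedged sketch, not part of FileTable): the Index-Map byte
pointers described above can be illustrated with a small stand-alone
class. The names LineIndexSketch, indexLines(), readRow() and demo() are
hypothetical. The sketch assumes every line ends with a terminator of
lineTermSize bytes and that the file uses a single-byte encoding.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

final class LineIndexSketch {
    // Build {startByte, endByte} pairs, one per line. endByte is exclusive
    // and does not include the line terminator.
    static List<long[]> indexLines(Path file, int lineTermSize) throws IOException {
        List<long[]> index = new ArrayList<>();
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            long cur = raf.getFilePointer(); // 0 at the start of the file
            while (raf.readLine() != null) {
                long prev = cur;             // line starts after previous terminator
                cur = raf.getFilePointer();  // position after this line's terminator
                index.add(new long[] {prev, cur - lineTermSize});
            }
        }
        return index;
    }

    // Random-access read of one line using its index entry.
    static String readRow(Path file, long[] entry) throws IOException {
        byte[] buf = new byte[(int) (entry[1] - entry[0])];
        try (RandomAccessFile raf = new RandomAccessFile(file.toFile(), "r")) {
            raf.seek(entry[0]);
            raf.readFully(buf);
        }
        return new String(buf, StandardCharsets.ISO_8859_1);
    }

    // Writes a 3-line temporary file, indexes it, and reads back the last line.
    static String demo() {
        try {
            Path tmp = Files.createTempFile("ftidx", ".txt");
            try {
                String text = "id\tname\r\n1\tfoo\r\n2\tbar\r\n";
                Files.write(tmp, text.getBytes(StandardCharsets.ISO_8859_1));
                List<long[]> index = indexLines(tmp, 2); // "\r\n" is 2 bytes
                return readRow(tmp, index.get(2));
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

Storing only the byte pointers keeps memory use small for large files,
since a row can be fetched later with a single seek instead of keeping
all of tData in memory.
*/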
/**
* readAndParseTableAll() - read & convert tab-delim file to Table
* structure. Set up the table data structures in this instance of
* FileTable:
* If the table is not at the front of the file, one option is to
* find the start of the table by searching for a this.startTblColStr
* string if you know a unique string in the header row. This is used
* if startTblColStr is not blank (set by setStartTableAtColStr()).
* If it matches, we then search backwards to the front of the string
* for a '\n' which is the end of the previous row.
*
* An alternate method for finding the start of the table, if
* this.hasEmptyLineBeforeTableFlag is set, is to look for the first
* non-empty row after an empty row. That row is then assumed to be
* the start of the Table (Fields + data, or data only).
* This can be problematic if there are multiple blank lines.
*
* When the front of the table is found, then it sets
* this.firstDataRowLineNbr to that line number.
* If there is no empty row found, then it assumes there is no extra
* data before the table and it sets this.firstDataRowLineNbr to 0.
* If there is text before the Table, then it is put into this.tPreface.
*
* It also sets the Index-Map byte pointers
* lineStartHeaderBytePtr[], lineEndHeaderBytePtr[] for the tHeader and
* lineStartDataBytePtr[], and lineEndDataBytePtr[] for the tData.
* It creates the ftIdxMap index-Map Table as one of the elements
* of this Table.
*
* @param inputFile is file to read and convert to table
* @return true if successful, else errMsgLog has reason failed.
* @see #readAndParseTableData
*/
public boolean readAndParseTableAll(String inputFile)
{ /* readAndParseTableAll */
return(readAndParseTableData(inputFile, false, true, true));
} /* readAndParseTableAll */
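/* Example (a hedged sketch, not the FileTable implementation): the two
ways of locating the start of the Table described above, a keyword known
to be in the header row or the first non-empty line after an empty line,
can be illustrated as follows. TableStartSketch, findHeaderLine() and
preface() are hypothetical names, and the sketch assumes a single header
line.

```java
final class TableStartSketch {
    // Returns the 0-based index of the header line, or -1 if it is not found.
    static int findHeaderLine(String[] lines, String keyword,
                              boolean emptyLineBeforeTable) {
        if (lines.length == 0) {
            return -1;
        }
        if (keyword != null && !keyword.isEmpty()) {
            // Option 1: the first line containing the keyword is the header.
            for (int i = 0; i < lines.length; i++) {
                if (lines[i].contains(keyword)) {
                    return i;
                }
            }
            return -1;
        }
        if (emptyLineBeforeTable) {
            // Option 2: the first non-empty line after an empty line.
            boolean sawEmpty = false;
            for (int i = 0; i < lines.length; i++) {
                if (lines[i].isEmpty()) {
                    sawEmpty = true;
                } else if (sawEmpty) {
                    return i;
                }
            }
            // No empty line at all: assume nothing precedes the table.
            return sawEmpty ? -1 : 0;
        }
        return 0; // the table starts at the front of the input
    }

    // Everything before the header line becomes the preface (null if none).
    static String preface(String[] lines, int headerIdx) {
        if (headerIdx <= 0) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < headerIdx; i++) {
            sb.append(lines[i]).append("\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String[] lines = {"Report for lab X", "", "GeneName\tValue", "abc\t1.5"};
        System.out.println(findHeaderLine(lines, "GeneName", false)); // prints 2
        System.out.println(findHeaderLine(lines, null, true));        // prints 2
    }
}
```

The keyword search is the more robust of the two when the preface may
itself contain blank lines.
*/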
/**
* readAndParseTable() - read & convert tab-delim file to Table
* structure. Set up the table data structures in this instance of
* FileTable:
* If the table is not at the front of the file, one option is to
* find the start of the table by searching for a this.startTblColStr
* string if you know a unique string in the header row. This is used
* if startTblColStr is not blank (set by setStartTableAtColStr()).
* If it matches, we then search backwards to the front of the string
* for a '\n' which is the end of the previous row.
*
* An alternate method for finding the start of the table, if
* this.hasEmptyLineBeforeTableFlag is set, is to look for the first
* non-empty row after an empty row. That row is then assumed to be
* the start of the Table (Fields + data, or data only).
* This can be problematic if there are multiple blank lines.
*
* When the front of the table is found, then it sets
* this.firstDataRowLineNbr to that line number.
* If there is no empty row found, then it assumes there is no extra
* data before the table and it sets this.firstDataRowLineNbr to 0.
* If there is text before the Table, then it is put into this.tPreface.
*
* It also sets the Index-Map byte pointers
* lineStartHeaderBytePtr[], lineEndHeaderBytePtr[] for the tHeader and
* lineStartDataBytePtr[], and lineEndDataBytePtr[] for the tData.
* It does not create the ftIdxMap index-Map Table.
*
* @param inputFile is file to read and convert to table
* @return true if successful, else errMsgLog has reason failed.
* @see #readAndParseTableData
*/
public boolean readAndParseTable(String inputFile)
{ /* readAndParseTable */
return(readAndParseTableData(inputFile, false, true, false));
} /* readAndParseTable */
/**
* readAndParseTableFields() - read tab-delim file to tFields Table
* structure. Set up the table data structures in this instance of
* FileTable:
* (tHeader,tPreface,tRows,tCols,tFields[],tCols,tRows,tHdrNbr).
* If the table is not at the front of the file, one option is to
* find the start of the table by searching for a this.startTblColStr
* string if you know a unique string in the header row. This is used
* if startTblColStr is not blank (set by setStartTableAtColStr()).
* If it matches, we then search backwards to the front of the string
* for a '\n' which is the end of the previous row.
*
* An alternate method for finding the start of the table, if
* this.hasEmptyLineBeforeTableFlag is set, is to look for the first
* non-empty row after an empty row. That row is then assumed to be
* the start of the Table (Fields + data, or data only).
* This can be problematic if there are multiple blank lines.
*
* When the front of the table is found, then it sets
* this.firstDataRowLineNbr to that line number.
* If there is no empty row found, then it assumes there is no extra
* data before the table and it sets this.firstDataRowLineNbr to 0.
* If there is text before the Table, then it is put into this.tPreface.
*
* It also sets the Index-Map byte pointers
* lineStartHeaderBytePtr[], lineEndHeaderBytePtr[] for the tHeader.
* It does not read the tData rows, but instead closes the file and returns.
* So the index map is incomplete.
*
* @param inputFile is file to read and convert to table
* @return true if successful, else errMsgLog has reason failed.
* @see #readAndParseTableData
*/
public boolean readAndParseTableFields(String inputFile)
{ /* readAndParseTableFields */
return(readAndParseTableData(inputFile, true, false, false));
} /* readAndParseTableFields */
/**
* readAndParseTableFieldsAndIndexMap - read tab-delim file to tFields
* and ftIdxMap Table. Set up the table data structures in this instance
* of FileTable:
* (tHeader,tPreface,tRows,tCols,tFields[],tCols,tRows,tHdrNbr).
* If the table is not at the front of the file, one option is to
* find the start of the table by searching for a this.startTblColStr
* string if you know a unique string in the header row. This is used
* if startTblColStr is not blank (set by setStartTableAtColStr()).
* If it matches, we then search backwards to the front of the string
* for a '\n' which is the end of the previous row.
*
* An alternate method for finding the start of the table, if
* this.hasEmptyLineBeforeTableFlag is set, is to look for the first
* non-empty row after an empty row. That row is then assumed to be
* the start of the Table (Fields + data, or data only).
* This can be problematic if there are multiple blank lines.
*
* When the front of the table is found, then it sets
* this.firstDataRowLineNbr to that line number.
* If there is no empty row found, then it assumes there is no extra
* data before the table and it sets this.firstDataRowLineNbr to 0.
* If there is text before the Table, then it is put into this.tPreface.
*
* It also sets the Index-Map byte pointers
* lineStartHeaderBytePtr[], lineEndHeaderBytePtr[] for the tHeader.
* It creates the ftIdxMap index-Map Table as one of the elements
* of this Table.
* It does not read the tData rows, but instead closes the file and
* returns.
*
* @param inputFile is file to read and convert to table
* @return true if successful, else errMsgLog has reason failed.
* @see #readAndParseTableData
*/
public boolean readAndParseTableFieldsAndIndexMap(String inputFile)
{ /* readAndParseTableFieldsAndIndexMap */
return(readAndParseTableData(inputFile, true, false, true));
} /* readAndParseTableFieldsAndIndexMap */
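/* Example (a hedged sketch): both parsers in this class first determine
whether the input uses \r, \n or \r\n as its line terminator. The real
getLineSeparatorFromString() is not shown in this file, so the
hypothetical detectLineSeparator() below only illustrates one way such
a check could work: it classifies the first terminator it encounters.

```java
final class LineSepSketch {
    // Returns "\r\n", "\n" or "\r" for the first terminator found,
    // or null if the text contains no line terminator at all.
    static String detectLineSeparator(String text) {
        if (text == null) {
            return null;
        }
        for (int i = 0; i < text.length(); i++) {
            char ch = text.charAt(i);
            if (ch == '\n') {
                return "\n";
            }
            if (ch == '\r') {
                boolean lfFollows =
                    i + 1 < text.length() && text.charAt(i + 1) == '\n';
                return lfFollows ? "\r\n" : "\r";
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String sep = detectLineSeparator("id\tname\r\n1\tfoo\r\n");
        System.out.println(sep.length()); // prints 2
    }
}
```

The length of the returned string plays the role of lineTermSize when
computing byte pointers for each line.
*/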
/**
* readAndParseTableData() - read & convert tab-delim file to Table
* structure. Set up the table data structures in this instance of
* FileTable:
* (tHeader,tPreface,tRows,tCols,tFields[],tData[][],tCols,tRows,tHdrNbr).
*
* If the table is not at the front of the file, one option is to
* find the start of the table by searching for a this.startTblColStr
* string if you know a unique string in the header row. This is used
* if startTblColStr is not blank (set by setStartTableAtColStr()).
* If it matches, we then search backwards to the front of the string
* for a '\n' which is the end of the previous row.
*
* An alternate method for finding the start of the table, if
* this.hasEmptyLineBeforeTableFlag is set, is to look for the first
* non-empty row after an empty row. That row is then assumed to be
* the start of the Table (Fields + data, or data only).
* This can be problematic if there are multiple blank lines.
*
* When the front of the table is found, then it sets
* this.firstDataRowLineNbr to that line number.
* If there is no empty row found, then it assumes there is no extra
* data before the table and it sets this.firstDataRowLineNbr to 0.
* If there is text before the Table, then it is put into this.tPreface.
*
* It also sets the Index-Map byte pointers
* lineStartHeaderBytePtr[], lineEndHeaderBytePtr[] for the tHeader and
* lineStartDataBytePtr[], and lineEndDataBytePtr[] for the tData if
* making an Index-Map ftIdxMap.
* If makeIndexMapFlag is set it creates the ftIdxMap index-Map Table
* as one of the elements of this Table.
*
* @param inputFile is file to read and convert to table
* @param onlyReadHeaderFlag only read the tPreface, tFields and tHeader
* and then stop reading the file unless making an index map.
* @param saveTblRowDataFlag to save the tData rows, else just compute
* the Index-Map for the data rows.
* @param makeIndexMapFlag make an index map of Table in this.ftIdxMap
* @return true if successful, else errMsgLog has reason failed.
* @see #rmvTrailingBlankLinesFromTable
* @see #deleteTrailingEmptyColumnsFromTable
* @see #checkForBadTable
*/
public boolean readAndParseTableData(String inputFile,
boolean onlyReadHeaderFlag,
boolean saveTblRowDataFlag,
boolean makeIndexMapFlag)
{ /* readAndParseTableData */
if(inputFile==null)
return(false);
/* [1] Setup parser variables */
int
DBUG_NBR_LINES= -1; /* (5) [TODO] remove this DEBUG code when working fully. */
int
tDataSize= 0,
nbrRowsIncr= 100,
lineTermSize= -1, /* will compute it */
countLines= 0, /* local line counter */
countMoreHdrLines= 0; /* used to save hdr lines before parse
* after found blank line */
long
startByte= 0, /* buffer ptr at start of a line */
endByte= 0; /* buffer ptr at end of a line. */
int
nColNameIndexMap= 0, /* this.colNameIndexMap.length if used */
ftIMcols= 0, /* Nbr of ftIdxMap field indexes if index-map used */
idxColNames[]= null; /* ftIdxMap field indexes if index-map used */
/* Fix file separators if need to do it. */
inputFile= mapPathFileSeparators(inputFile);
/* [1.1] Find terminator as either \r, \n or \r\n */
String
fileLineSep= this.getLineSeparatorFromFile(inputFile);
/* Set to the first line of tData when found */
firstDataRowLineNbr= -1;
/* If making Index-Map, setup the empty ftIdxMap Table */
if(!makeIndexMapFlag)
ftIdxMap= null; /* There is no map Table until read data rows */
else
{ /* setup the empty ftIdxMap Table */
ftIdxMap= new FileTable("File-Index-Map");
ftIdxMap.setHasTableHeaderFlag(true);
ftIdxMap.setDuplicateFieldsFlag(false);
ftIdxMap.setNbrTableHdrLines(1);
ftIdxMap.setRmvTrailingBlankLinesFlag(false);
ftIdxMap.setHasEmptyLineBeforeTableFlag(false);
} /* setup the empty ftIdxMap Table */
/* [2] Read and parse the tab-delimited data file into the Table
* and get the tPreface as all data before the Header.
*/
try
{ /* Read file and parse data into Table */
RandomAccessFile rafL= new RandomAccessFile(inputFile, "r");
/* [2.1] Get the file line separator and its size.
* It is either \r, \n or \r\n - or null if an error */
if(fileLineSep==null)
{ /* Handle errors */
errMsgLog += "MIF-RAPF no \\r, \\n or \\r\\n terminator in '" +
inputFile + "'\n";
lastErrMsgLog= errMsgLog;
rafL.close(); /* do not leak the open file */
return (false);
}
/* [2.2] Init parse variables */
String
lookbackHdrs[]= new String[nbrTableHdrLines],
tokens[]= null, /* list of tokens in the inputLine */
inputLine = null; /* lines read from buffered reader */
long
lookbackStartBytePtr[]= new long[nbrTableHdrLines+1],
lookbackEndBytePtr[]= new long[nbrTableHdrLines+1];
boolean
beforeFirstDataLineFlag= true, /* set false on 1st data row */
foundDataRowFlag= false, /* set at end of the header */
foundHdrRowFlag= false; /* saw header row in file, else error */
tHdrRows= nbrTableHdrLines;
tRows= 0; /* count first row */
tCols= 0; /* there is at least 1 Row if fields*/
/* [3] Read lines from the input file using the RandomAccessFile
* readLine() fct that stops at and ignores line terminators "\n",
* "\r", or "\r\n".
*/
/* Compute the extra line terminate length we will add to each startByte
* after 1st so aligned correctly.
* Note that the lineTermSize of the file may be different from
* that of the operating system (for various reasons such as being
* edited by an editor that changes "\n" or "\r", to "\r\n").
* Therefore, we need to compute it for the current file.
*/
lineTermSize= computeFileLineTermSize(rafL, 0L);
if(lineTermSize==-1)
{
UtilCM.logMsg("Problem computing file line terminator size for "+
"file '"+inputFile+"' - aborting.\n");
rafL.close(); /* do not leak the open file */
return(false);
}
endByte= -lineTermSize; /* set for first line computations */
long curFilePtr= rafL.getFilePointer(); /* Should be 0L */
while((inputLine= rafL.readLine()) != null)
{ /* process lines */
countLines++; /* first row is defined as line 1, not 0 */
/* [3.1] Compute the line byte pointers with terminator.
* Note: startByte starts at old endByte + lineTermSize + 1.
*/
int
//offset= ((countLines==1)? 0 : lineTermSize+1), /* no offset first line */
lineLth= inputLine.length(); /* does not include terminator */
long prevFilePtr= curFilePtr; /* Save it */
curFilePtr= rafL.getFilePointer(); /* After the read */
startByte= prevFilePtr; /* starts after prev line terminator */
endByte= curFilePtr-lineTermSize; /* does not include terminator */
/* [3.1.1] DEBUG TEST if Index-Map byte seek pointers are correct.
* [TODO] remove this DEBUG code when working fully.
*/
if(false && countLines
* If reorderRemainingColsFlag is set, then sort the remaining
* columns that are used but not specified alphabetically.
*
* @param reordColName - list of column names to be reordered
* @param reordColNbr - corresponding list of new column numbers
* starting at 1 (user specified; mapped internally
* to n-1).
* @param nReorderColName - number of mappings
* @param reorderRemainingColsFlag to sort the remaining
* columns that are used but not specified
* alphabetically.
* @return true if successful; on failure returns false and the
* errMsgLog string is set with the reason it failed.
*/
public boolean reorderColumnsInTable(String reordColName[],
int reordColNbr[],
int nReorderColName,
boolean reorderRemainingColsFlag)
{ /* reorderColumnsInTable */
String
roColName,
copyReorderColName[]= new String[nReorderColName];
int
roColIdx,
copyReorderColNbr[]= new int[nReorderColName],
c;
boolean flag;
/* [1] Make sure have tFields */
if(tFields==null)
{ /* No tFields in the Table */
errMsgLog += "FT-RCIT there is no Table (no tFields) header" +
" - ignoring reorder.\n";
lastErrMsgLog= errMsgLog;
return(false);
}
if(tFields.length!=tCols)
{ /* tFields size does not match tCols */
errMsgLog += "FT-RCIT DRYROT |tFields|="+tFields.length+
" .NEQ tCols="+tCols + " - ignoring reorder.\n";
lastErrMsgLog= errMsgLog;
return(false);
}
if(nReorderColName==0 || reordColName==null)
{ /* No reorder columns list */
errMsgLog += "FT-RCIT there is no Reorder columns list" +
" - ignoring reorder.\n";
lastErrMsgLog= errMsgLog;
return(false);
}
/* work off of local copy */
for(int i=0;i
* Copyright 2008, 2009 by Peter Lemkin
* E-Mail: lemkin@users.sourceforge.net
* http://lemkingroup.com/
*
*/
public class FileTable extends FileIO
{
/** Global utilities UtilCM instance */
public UtilCM
util;
/** Set to true (default) if the table has a header. */
public boolean
hasTableHeaderRowFlag;
/** Set to true to ignore duplicate fields the table may have. */
public boolean
ignoreDuplicateFieldsFlag;
/** Set to true to look for empty line(s) BEFORE the table data */
public boolean
hasEmptyLineBeforeTableFlag;
/** Set to true if remove trailing blank lines from the table */
public boolean
rmvTrailingBlankLinesFlag;
/** Set to true to remove empty trailing columns defined as columns
* in a Table with a header as null column names.
*/
public boolean
rmvTrailingEmptyColumnsFlag;
/** Set to string known to be in header. When parsing the
* table input string, we look for this value to determine the
* line where the Table starts.
*/
public String
startTblColStr;
/** Reduce the number of header lines to 1 even if there are more
* than 1 header line.
*/
public boolean
useOnlyLastHeaderLineFlag;
/** size of input buffer */
public int
bufSize;
/** List of unmodified tField names, if not null with size
* nUnmodifiedFieldNames. This is the maximum size of tFields[],
* i.e., tCols and may be setup by other methods in other classes
* (e.g., see Convert..
*/
public String
unmodifiedFieldNames[]= null;
/** Size of unmodifiedFieldNames[], the list of unmodified tField[]
* names.
*/
public int
nUnmodifiedFieldNames= 0;
/** Line number of the first row with table data
* (Table Fields or Data if no Fields) */
public int
firstDataRowLineNbr;
/** Char index of input buffer first row with table data
* (Table Fields or Data if no Fields).
*/
public int
firstDataRowCharIdx;
/** Number of lines of the Table header "-hdr:{nbr of lines in header}"
* switches. The number of Table header lines with default of 1.
* If > 1 line, then the Table Fields searched for URL mapping are the
* last one in the header row. All header lines are bolded with TH
* rather than TD. The tokens for the last of these header lines is
* also saved in lastTblHdrRow[0:tCols] to be used by the Column to
* URL mapping...
*/
public int
nbrTableHdrLines= 1;
/** file to read or write I/O */
public String
fileName;
/** number of header rows (default is 1) */
public int
tHdrRows;
/** number of columns/row in the table. */
public int
tCols;
/** number of rows. i.e. number row Clone Id's */
public int
tRows;
/** Table name if any */
public String
tName;
/** Table preface if any. This is parsed from the initial text
* prior to the actual table if it was separated by a
* blank line or the table defined by a tField keyword.
*/
public String
tPreface;
/** names of table headers of size [0:tHdrRows-1][0:tCols-1].
* Some of the non-primary (i.e., tField[]) header rows may not have
* tCols of data (i.e., short rows). These are filled with "".
*/
public String
tHeader[][];
/** names of table fields. These are the primary table column names. */
public String
tFields[];
/** row data cell vectors [0:tRows-1][0:tCols-1] */
public String
tData[][];
/* ---- Index-maps of the data for random access of the data ---- */
/** Index-map byte pointers of the start of header lines synced
* with tHeader rows.
*/
public long
lineStartHeaderBytePtr[];
/** Index-map byte pointers of the end of header lines synced
* with tHeader rows.
*/
public long
lineEndHeaderBytePtr[];
/** Index-map byte pointers of the start of data lines synced
* with tData rows.
*/
public long
lineStartDataBytePtr[];
/** Index-map byte pointers of the end of data lines synced
* with tData rows.
*/
public long
lineEndDataBytePtr[];
/** List of column names to be used in the left side of the ftIdxMap
* Table if it is built. If it is null, do not build the index-map
* since these column names are the keys used to build the index-map
* Table.
*/
public String
colNameIndexMap[]= null;
/** FileTable index map where we save the {colNames, "StartByte", "EndByte"}.
* The colNames is a list of arbitrary names, none of which can be
* "StartByte" or "EndByte".
*/
public FileTable
ftIdxMap= null;
/** [HACK] If the table is sorted by sortTableRowsByField for numeric
* values with the useAbsValueFlag set true, and both + and - changes
* were found, it will set the needToSortTableAgainFlag so the calling
* program can resort one more time if it wants to keep the data in
* numeric order.
*/
public boolean
needToSortTableAgainFlag= false;
/** This is the (sorted) full table data BEFORE the length of the table
* was limited using limitMaxRowsXXX() methods. It is null unless
* the Table was limited in number of rows. This is useful
* if we need to get at the full Table later.
*/
public String
tDataFull[][]= null;
/**
* FileTable() - generic Constructor, set some defaults to 0.
*/
public FileTable()
{ /* FileTable */
clearTable(); /* set to empty table */
util= new UtilCM(); /* set up utility package */
} /* FileTable */
/**
* FileTable() - generic Constructor, set some defaults.
* @param tableName String name of table
* @see UtilCM
* @see #setDuplicateFieldsFlag
* @see #setNbrTableHdrLines
* @see #setRmvTrailingBlankLinesFlag
* @see #setHasEmptyLineBeforeTableFlag
*/
public FileTable(String tableName)
{ /* FileTable */
clearTable(); /* set to empty table */
util= new UtilCM(); /* set up utility package */
this.tName= tableName;
} /* FileTable */
/**
* FileTable() - Constructor to make empty table of known size
* @param tableName String name of table
* @param rows max rows in table
* @param cols max columns in table
* @see UtilCM
* @see #setDuplicateFieldsFlag
* @see #setNbrTableHdrLines
* @see #setRmvTrailingBlankLinesFlag
* @see #setHasEmptyLineBeforeTableFlag
*/
public FileTable(String tableName, int rows, int cols)
{ /* FileTable */
clearTable(); /* set to empty table */
util= new UtilCM(); /* set up utility package */
this.tName= tableName;
this.tRows= rows;
this.tCols= cols;
} /* FileTable */
/**
* clearTable() - set to empty table. Generally called from Constructor
*/
public void clearTable()
{ /* clearTable */
fileName= null;
tName= "none";
tRows= 0;
tCols= 0;
tHdrRows= 0;
tFields= null;
tHeader= null;
tData= null;
tDataFull= null;
tPreface= null;
unmodifiedFieldNames= null;
nUnmodifiedFieldNames= 0;
errMsgLog= "";
lastErrMsgLog= "";
startTblColStr= null;
firstDataRowLineNbr= 0;
firstDataRowCharIdx= 0;
lineStartHeaderBytePtr= null;
lineEndHeaderBytePtr= null;
lineStartDataBytePtr= null;
lineEndDataBytePtr= null;
/* Set the rest using set methods */
setHasTableHeaderFlag(true);
setDuplicateFieldsFlag(false);
setNbrTableHdrLines(1);
setHasEmptyLineBeforeTableFlag(false);
setRmvTrailingBlankLinesFlag(false);
setRmvTrailingEmptyColumnsFlag(false);
setStartTableAtColStr(null);
useOnlyLastHeaderLineFlag= false;
needToSortTableAgainFlag= false;
} /* clearTable */
/**
* cloneTable() - clone the current Table. Copy data by value.
* @param cloneTableName is new tName of the clone
* @return FileTable instance of clone if successful, else null.
*/
public FileTable cloneTable(String cloneTableName)
{ /* cloneTable */
FileTable ftC= new FileTable(cloneTableName);
ftC.setHasTableHeaderFlag(this.hasTableHeaderRowFlag);
ftC.setDuplicateFieldsFlag(this.ignoreDuplicateFieldsFlag);
ftC.setNbrTableHdrLines(this.nbrTableHdrLines);
ftC.setRmvTrailingBlankLinesFlag(this.rmvTrailingBlankLinesFlag);
ftC.setRmvTrailingEmptyColumnsFlag(this.rmvTrailingEmptyColumnsFlag);
ftC.setHasEmptyLineBeforeTableFlag(this.hasEmptyLineBeforeTableFlag);
ftC.setStartTableAtColStr(this.startTblColStr);
ftC.setUseOnlyLastHeaderLineFlag(this.useOnlyLastHeaderLineFlag);
ftC.setFieldsToTable(this.tFields);
ftC.setHeadersToTable(this.tHeader, this.tHdrRows, this.tCols);
ftC.tRows= this.tRows;
if(this.tData==null)
ftC.tData= null;
else
{ /* Copy the data by value */
ftC.tData= new String[tRows][];
for(int r=0;r
* This does NOT save the tData rows.
*
* (tHeader,tPreface,tRows,tCols,tFields[],tData[][],tCols,tRows,tHdrNbr).
*