Class MakeFlipTable

java.lang.Object
  extended by MakeFlipTable

public class MakeFlipTable
extends java.lang.Object

Class MakeFlipTable flips a Table by either of two methods.

In both cases, it effectively transposes rows and columns, but it has other functionality as well. The first method works in memory using an existing Table that contains tData. The second, intended for very large table files, applies when only a small number of columns are to be flipped. It uses the Index-Map for the Table to random-access only the specified rows from the file, rather than reading the data to flip from memory.
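
As an illustration of the in-memory case, the following is a minimal sketch of transposing a rectangular table of strings. The class and method names are hypothetical and are not part of MakeFlipTable; the real class works on FileTable instances and does more than a plain transpose.

```java
// Hypothetical helper, not part of HTMLtools: plain row/column transpose.
final class TransposeSketch {
    /** Returns a new table whose rows are the columns of {@code rows}. Assumes a rectangular input. */
    static String[][] transpose(String[][] rows) {
        if (rows.length == 0) {
            return new String[0][0];
        }
        int nRows = rows.length;
        int nCols = rows[0].length;
        String[][] flipped = new String[nCols][nRows];
        for (int r = 0; r < nRows; r++) {
            for (int c = 0; c < nCols; c++) {
                flipped[c][r] = rows[r][c]; // cell (r,c) moves to (c,r)
            }
        }
        return flipped;
    }
}
```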

The index-map method generates transposed files using random-access file indexing. It creates a multi-line header (one line for each column name in the list) from the list of columns previously specified when generating the index map file with the '-makeIndexMapFile:{colName1,colName2,...,colNameN}' command. It analyzes the index map Table and then uses all columns before the ("StartByte", "EndByte") columns to define the flipped Table header. Use the '-flipColumnName:{flipColumnFile,flipColumnName}' or '-flipColumnName:{*LIST*,flipColumnName,v1,v2,...vn}' switch to restrict which flipped column data to use. Use the '-flipRowFilterNames:{flipRowFilterNamesFile}' switch to restrict which flipped row data to use. This method is enabled by the '-flipTableByIndexMap:{flipDataFile,flipIndexMapFile}' switch.

 List of Methods
 ===================
 MakeFlipTable() - Constructor
 setFlippedOutputFile() - set the dir and name for flipped output file.
 makeWorkingTables() - Initialize working Tables required for flipping data
 processData() - create the flipped data Table and write it out
 extractDataRowsByColumnFilters() - extract ftData rows by column-filters 
 makeListOfSeekDataRowsByColumnFilters() - filter rows from Index Map data
 matchColumnFilterData() - test a row of Index Map data to see if match Column lists
 matchRowFilterData() - test a String data row for a row name match.
 flip_ftRowsTo_ftFlippedTable() - create ftFlippedTable from ftRows Table
 saveFlipTableAsHTMLfile() - save flip Table as HTML file.
 saveFlipTableAsTextfile() - save flip Table as a tab-delimited text file.
 mapMultilineHdrHREFs() - create HTML for ftFlipped Table.
 
 List of Tables
 =================
 ftData    - MRR data Table
 ftIndex   - index-map Table for the ftData Table file
 ftRows    - extracted rows from the ftData table that will be flipped
 ftFlipped - flipped Table constructed from ftRows Table.
 
 List of Switches and Globals.java variables set by them
 ============================================================
 -flipTableByIndexMap:{flipDataFile,flipIndexMapFile,(opt)maxRows}
    cvt.flipTableByIndexMapFlag
    cvt.flipDataFile
    cvt.flipIndexMapFile
    cvt.maxFlipSeekRowsToExtract
 -flipColumnName:{flipColumnFile,flipColumnNames}
    cvt.flipColumnFile[0:cvt.nFlipColumns-1]
    cvt.flipColumnName[0:cvt.nFlipColumns-1]
    cvt.flipColumnValues[0:cvt.nFlipColumns-1][]
    cvt.nFlipColumns
 -flipExcludeColumnName:{flipExcludeColumnName}
    cvt.flipExcludeColumnName[0:cvt.nFlipExcludeColumns-1]
    cvt.nFlipExcludeColumns
 -flipOrderHdrColNames:{colHdrName1,colHdrName2,...,colHdrNameN}
    cvt.flipOrderHdrColList[0:nFlipOrderHdrColList-1]
    cvt.nFlipOrderHdrColList
 -flipRowFilterNamesFile:{flipRowFilterNamesFile}
    cvt.flipRowFilterNamesFile
    cvt.flipRowFilterNames
    cvt.nFlipRowFilterNames
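
The switches above share the shape '-name:{v1,v2,...,vn}'. The following is a minimal sketch of splitting such a switch into its name and values. It is an assumption about how the syntax could be parsed, not the HTMLtools parser; the class name, method name, and the file names in the usage note are hypothetical.

```java
// Hypothetical helper, not part of HTMLtools: splits "-name:{a,b,c}" into its parts.
final class SwitchParseSketch {
    /** Returns {name, v1, ..., vn}, or an empty array if the text does not have the "-name:{...}" shape. */
    static String[] parse(String arg) {
        int colon = arg.indexOf(":{");
        // colon < 2 rejects a missing or empty switch name.
        if (!arg.startsWith("-") || colon < 2 || !arg.endsWith("}")) {
            return new String[0];
        }
        String name = arg.substring(1, colon);
        String body = arg.substring(colon + 2, arg.length() - 1);
        String[] values = body.isEmpty() ? new String[0] : body.split(",", -1);
        String[] out = new String[values.length + 1];
        out[0] = name;
        System.arraycopy(values, 0, out, 1, values.length);
        return out;
    }
}
```

For example, `SwitchParseSketch.parse("-flipTableByIndexMap:{data.txt,data.idx,500}")` yields the switch name followed by the three values, where the third would correspond to the optional maxRows argument.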

This code is available at the HTMLtools project on SourceForge at http://htmltools.sourceforge.org/ under the "Common Public License Version 1.0" http://www.opensource.org/licenses/cpl1.0.php.

It was derived and refactored from the open source MAExplorer (http://maexplorer.sourceforge.org/), and Open2Dprot (http://Open2Dprot.sourceforge.net/) Table modules.

$Date: 2009/12/02 11:45:56 $ $Revision: 1.38 $
Copyright 2008, 2009 by Peter Lemkin E-Mail: lemkin@users.sourceforge.net http://lemkingroup.com/


Field Summary
 HTMLtools cvt
          Note all global variables are in Globals.java.
 boolean dataOK
          Processing succeeded and ftFlipped Table is valid.
 long[] endSeekByte
          List of end row byte seek pointers.
 FileTable fio
          Global fileTable instance
 java.lang.String[][] flipColNameFilterData
          These are the list of column name data for filtering each column to be used for the new flipped Table headers.
 java.lang.String flipDataPath
          Input data file path.
 java.lang.String flipIndexMapPath
          Input Index Map file path.
private  java.lang.String flipOutputFile
          The output flip .txt file.
 FileTable ftData
          Input data FileTable instance.
 FileTable ftFlipped
          Flipped data FileTable instance constructed from ftRows Table
 FileTable ftIndex
          Input data index-map FileTable instance mapping ftData file seeks (start,end) bytes.
 FileTable ftRows
          Extracted subset of data rows FileTable instance from the ftData Table
private  int glbMaxRowIndex
          Global max value index of name in table
private  java.lang.String glbMaxRowName
          Global max value name in table
private  int glbMeanRowIndex
          Global mean value index of name in table
private  java.lang.String glbMeanRowName
          Global mean value name in table
private  int glbMinRowIndex
          Global min value index of name in table
private  java.lang.String glbMinRowName
          Global min value name in table
private  int glbStdDevRowIndex
          Global stdDev value index of name in table
private  java.lang.String glbStdDevRowName
          Global stdDev value name in table
 int[] idxColData
          List of ftData.tField[] indexes of the column names.
 int[] idxColIMfilters
          List of ftIndex.tField[] index of the column names.
 int idxEndByte
          Index of the "EndByte" column name in ftIndex.tField[]
 int[] idxOrderedColHdrNames
          Ordered List of ftData.tField[] indexes of the column names for the flipped table header.
 int idxStartByte
          Index of the "StartByte" column name in ftIndex.tField[]
 int maxSeekRowsToExtract
          Maximum size of start/endSeekByte[] lists for reallocation.
 int[] nFlipColNameFilterData
          These are the sizes of the list of column name data for filtering each column to be used for the new flipped Table headers.
 int nSeekRows
          Number of entries currently used in the start/endSeekByte[] lists.
 int[] rowNbrToSeek
          List of row numbers corresponding to the seek pointers.
 java.lang.String sFlipTableReport
          Computation Strings that can be added to the report.
 java.lang.String sFlipTableReportHTML
          HTML version of sFlipTableReport for the generated HTML file.
private  java.lang.String sortTableTitle
          Sort Table title if any
 long[] startSeekByte
          List of start row byte seek pointers.
 
Constructor Summary
MakeFlipTable(HTMLtools cvt, int maxSeekRowsToExtract)
          MakeFlipTable() - Constructor
 
Method Summary
 boolean calcReportFoldChangeABstatistics(FileTable ftFlipped)
          calcReportFoldChangeABstatistics() - calc stats if cvt.reportFoldChangeFlag.
private  boolean extractDataRowsByColumnFilters()
          extractDataRowsByColumnFilters() - extract ftData rows by column-filters saving the resulting rows in ftRows.
private  boolean flip_ftRowsTo_ftFlippedTable()
          flip_ftRowsTo_ftFlippedTable() - create ftFlipped Table from ftRows Table
private  boolean getGlobalStatistics(FileTable ftIndex)
          getGlobalStatistics() - Set up the global statistics if the index map is a .sidx (Statistics Index Map) file, which has the global statistics in header[0:1].
private  void initClassVars()
          initClassVars() - reset the class variables
private  boolean makeListOfSeekDataRowsByColumnFilters(int maxSeekRows)
          makeListOfSeekDataRowsByColumnFilters() filter rows from Index Map data by filtering any of the flipColNameFilterData[] criteria in the index map (which has all of the fields we need to do the filtering) to generate a list of rows that we will actually random access read from the input data.
 boolean makeWorkingTables()
          makeWorkingTables() - Initialize working Tables required for flipping data.
 java.lang.String mapMultilineHdrHREFs(java.lang.String htmlFilePath)
          mapMultilineHdrHREFs() - create HTML for ftFlipped Table.
private  boolean matchColumnFilterData(java.lang.String[] rowData, java.lang.String[] colNames, java.lang.String[][] colNameFilterData, int[] idxColNamesInRowData, int nCols, boolean useExactMatchFlag)
          matchColumnFilterData() - test a row of Index-Map data to see if match Column lists.
private  boolean matchRowFilterData(java.lang.String sRowData, java.lang.String[] rowFilterNames, int nRowFilterNames)
          matchRowFilterData() - test a String data row for a row name match.
 boolean processData()
          processData() - create the flipped data Table and write it out by the following algorithm.
 boolean saveFlipTableAsHTMLfile(java.lang.String oDir, java.lang.String oFile)
          saveFlipTableAsHTMLfile() - save flip Table as HTML file.
 boolean saveFlipTableAsTextfile(java.lang.String oDir, java.lang.String oFile)
          saveFlipTableAsTextfile() - save flip Table as a tab-delimited text file.
 boolean setFlippedOutputFile(java.lang.String flipOutputFile)
          setFlippedOutputFile() - set the dir and name for flipped output file.
private  boolean thresholdFoldChangeColumnsInFlipTable(FileTable ftF, float fcThr, double[] fcAB)
          thresholdFoldChangeColumnsInFlipTable() - threshold columns < flipFCthreshold.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

cvt

public HTMLtools cvt
Note all global variables are in Globals.java. They are accessed through the HTMLtools cvt instance.


fio

public FileTable fio
Global fileTable instance


flipOutputFile

private java.lang.String flipOutputFile
The output flip .txt file. The HTML file has the same base name but has a .html extension instead of the .txt extension. This is set by setFlippedOutputFile().


ftData

public FileTable ftData
Input data FileTable instance. Since we buffer the Table, this just contains the header information.


ftIndex

public FileTable ftIndex
Input data index-map FileTable instance mapping ftData file seeks (start,end) bytes.


ftRows

public FileTable ftRows
Extracted subset of data rows FileTable instance from the ftData Table


ftFlipped

public FileTable ftFlipped
Flipped data FileTable instance constructed from ftRows Table


flipColNameFilterData

public java.lang.String[][] flipColNameFilterData
These are the list of column name data for filtering each column to be used for the new flipped Table headers. The data is contained in files cvt.flipColumnFile[0:cvt.nFlipColumns-1] with column names cvt.flipColumnName[0:cvt.nFlipColumns-1]. The data is in the 2nd dimension of the lists of size nFlipColNameFilterData[0:cvt.nFlipColumns-1]


nFlipColNameFilterData

public int[] nFlipColNameFilterData
These are the sizes of the list of column name data for filtering each column to be used for the new flipped Table headers. The data is contained in files cvt.flipColumnFile[0:cvt.nFlipColumns-1] with column names cvt.flipColumnName[0:cvt.nFlipColumns-1]. The data is in the 2nd dimension of the lists of size nFlipColNameFilterData[0:cvt.nFlipColumns-1]


idxColIMfilters

public int[] idxColIMfilters
List of ftIndex.tField[] index of the column names. Note: this is synced with idxColData[].


idxOrderedColHdrNames

public int[] idxOrderedColHdrNames
Ordered List of ftData.tField[] indexes of the column names for the flipped table header. Set indirectly by -flipOrderHdrColNames:{colHdrName1,colHdrName2,...,colHdrNameN} switch.


idxStartByte

public int idxStartByte
Index of the "StartByte" column name in ftIndex.tField[]


idxEndByte

public int idxEndByte
Index of the "EndByte" column name in ftIndex.tField[]


idxColData

public int[] idxColData
List of ftData.tField[] indexes of the column names. Note: this is synced with idxColIMfilters[].


flipDataPath

public java.lang.String flipDataPath
Input data file path.


flipIndexMapPath

public java.lang.String flipIndexMapPath
Input Index Map file path.


dataOK

public boolean dataOK
Processing succeeded and ftFlipped Table is valid.


maxSeekRowsToExtract

public int maxSeekRowsToExtract
Maximum size of start/endSeekByte[] lists for reallocation.


nSeekRows

public int nSeekRows
Number of entries currently used in the start/endSeekByte[] lists.


rowNbrToSeek

public int[] rowNbrToSeek
List of row numbers corresponding to the seek pointers.


startSeekByte

public long[] startSeekByte
List of start row byte seek pointers.


endSeekByte

public long[] endSeekByte
List of end row byte seek pointers.


glbMinRowName

private java.lang.String glbMinRowName
Global min value name in table


glbMaxRowName

private java.lang.String glbMaxRowName
Global max value name in table


glbMeanRowName

private java.lang.String glbMeanRowName
Global mean value name in table


glbStdDevRowName

private java.lang.String glbStdDevRowName
Global stdDev value name in table


glbMinRowIndex

private int glbMinRowIndex
Global min value index of name in table


glbMaxRowIndex

private int glbMaxRowIndex
Global max value index of name in table


glbMeanRowIndex

private int glbMeanRowIndex
Global mean value index of name in table


glbStdDevRowIndex

private int glbStdDevRowIndex
Global stdDev value index of name in table


sortTableTitle

private java.lang.String sortTableTitle
Sort Table title if any


sFlipTableReport

public java.lang.String sFlipTableReport
Computation Strings that can be added to the report. The HTML version is for the generated HTML file


sFlipTableReportHTML

public java.lang.String sFlipTableReportHTML
Constructor Detail

MakeFlipTable

public MakeFlipTable(HTMLtools cvt,
                     int maxSeekRowsToExtract)
MakeFlipTable() - Constructor

Parameters:
cvt - is instance of HTMLtools
maxSeekRowsToExtract - - max # of rows to extract for flipped Table
Method Detail

initClassVars

private void initClassVars()
initClassVars() - reset the class variables


setFlippedOutputFile

public boolean setFlippedOutputFile(java.lang.String flipOutputFile)
setFlippedOutputFile() - set the dir and name for flipped output file. This sets the output file name either from the -saveTableAsFile switch or using the input data file base name with a -addOutputPostfix substring or "-flipped" in the worst case.

Parameters:
outputDataDir - - directory for saving the flipped Table .txt and .html files
flipOutputFile - - name of the output flipped Table file to be saved
Returns:

makeWorkingTables

public boolean makeWorkingTables()
makeWorkingTables() - Initialize working Tables required for flipping data. This sets up the FileTables for
 ftData - for the input data Table file
 ftIndex - for index map Table file corresponding to input data file rows
 ftRows - for the filtered data rows to be computed
 ftFlipped - for the flipped ftRows data to be computed
 
It also reads the '-flipColumnName:{flipColumnFile,flipColumnName}' or '-flipColumnName:{*LIST*,flipColumnName,v1,v2,...vn}' data, and it reads the '-flipRowFilterNamesFile:{flipRowNameFile}' data filtered by '-flipRowGSPIDfilterSubstring:{substring}'.

Returns:
true if succeed
See Also:
FileTable, FileTable.readAndParseTableFieldsAndIndexMap(java.lang.String), FileTable.readAndParseTable(java.lang.String), FileIO.readFileAsString(java.lang.String), FileTable.lookupFieldIdx(java.lang.String), UtilCM.mapCRLF2space(java.lang.String), UtilCM.replaceSubstrInString(java.lang.String, java.lang.String, java.lang.String), UtilCM.cvs2Array(java.lang.String, java.lang.String), UtilCM.logMsg(java.lang.String)

processData

public boolean processData()
processData() - create the flipped data Table and write it out by the following algorithm.
 [1] Filter all rows into ftRows that match any of the
     flipColNameFilterData[] criteria in the index map (which has all
     of the fields we need to do the filtering) to generate a list of
     rows that we will actually random access read from the input
     data file.
 [2] Create the Flipped Table ftFlipped from the ftRows Table.
 [3] Write out the Flipped Table ftFlipped as a tab-delimited
     text file.
 [4] Create and write out the flipped HTML Table w/multi-line HREF
     mapped headers.
 

Returns:
true if succeed
See Also:
extractDataRowsByColumnFilters(), flip_ftRowsTo_ftFlippedTable(), mapMultilineHdrHREFs(java.lang.String), FileTable.writeTableToTabDelimFile(java.lang.String, boolean), FileIO.mapPathFileSeparators(java.lang.String), FileIO.makePathSubDirs(java.lang.String), FileIO.writeStringToFile(java.lang.String, java.lang.String), UtilCM.logMsg(java.lang.String)
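
The text output of the flipped Table is tab-delimited. The following is a minimal sketch of serializing a table of strings in that form. It is not the FileTable.writeTableToTabDelimFile() implementation; the class name, the method name, and the choice of '\n' as line terminator are assumptions.

```java
// Hypothetical helper, not part of HTMLtools: renders rows as tab-delimited lines.
final class TabDelimSketch {
    /** Joins each row's cells with tabs and ends each row with a newline (assumed terminator). */
    static String toTabDelimited(String[][] table) {
        StringBuilder sb = new StringBuilder();
        for (String[] row : table) {
            sb.append(String.join("\t", row)).append('\n');
        }
        return sb.toString();
    }
}
```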

extractDataRowsByColumnFilters

private boolean extractDataRowsByColumnFilters()
extractDataRowsByColumnFilters() - extract ftData rows by column-filters, saving the resulting rows in ftRows. Test the rows by matching the column filter data flipColNameFilterData[] against data in the ftIndex index-map Table, and save the matching rows in ftRows. The index map Table has all of the fields we need to do the filtering. We get the ftData rows using random-access reads on the disk file with the (start,end) byte-seek data in the ftIndex Table. We do this because we don't read the entire ftData table into memory, since it could be too large.

Returns:
true if succeed
See Also:
FileIO.readRandomAccessLine(java.lang.String, long), FileTable.appendRowToTable(java.lang.String[]), UtilCM.cvs2ArrayNullFill(java.lang.String, java.lang.String, java.lang.String), UtilCM.logMsg(java.lang.String)
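
The random-access read can be sketched with java.io.RandomAccessFile. This is not the FileIO.readRandomAccessLine() implementation. The class and method names are hypothetical, and the sketch assumes that the end byte is exclusive and that the file is UTF-8 text; the actual (StartByte, EndByte) convention of the index map may differ.

```java
import java.io.IOException;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of HTMLtools: reads one row by its byte range.
final class SeekRowSketch {
    /** Reads bytes [startByte, endByte) from the file and decodes them as one row of text. */
    static String readRow(String path, long startByte, long endByte) throws IOException {
        try (RandomAccessFile raf = new RandomAccessFile(path, "r")) {
            byte[] buf = new byte[(int) (endByte - startByte)];
            raf.seek(startByte);   // jump directly to the row; nothing before it is read
            raf.readFully(buf);    // throws EOFException if the range runs past the file end
            return new String(buf, StandardCharsets.UTF_8);
        }
    }
}
```

Only the selected byte ranges are read, so memory use depends on the number and length of the extracted rows rather than on the size of the data file.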

makeListOfSeekDataRowsByColumnFilters

private boolean makeListOfSeekDataRowsByColumnFilters(int maxSeekRows)
makeListOfSeekDataRowsByColumnFilters() filter rows from Index Map data by filtering any of the flipColNameFilterData[] criteria in the index map (which has all of the fields we need to do the filtering) to generate a list of rows that we will actually random access read from the input data.
 This creates the three arrays in these class variables:  
   rowNbrToSeek[0:nSeekRows-1]
   startSeekByte[0:nSeekRows-1]
   endSeekByte[0:nSeekRows-1]
 

Parameters:
maxSeekRows - - is the initial size of the seek table
Returns:
true if succeed
See Also:
matchColumnFilterData(), UtilCM.cvs2l(java.lang.String, long)
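
The three parallel seek lists and their reallocation can be sketched as follows. The field names mirror the class variables, but the class itself, the add() method, and the capacity-doubling policy are assumptions and are not taken from the MakeFlipTable source.

```java
import java.util.Arrays;

// Hypothetical helper, not part of HTMLtools: three parallel lists that grow together.
final class SeekListSketch {
    int[] rowNbrToSeek;
    long[] startSeekByte;
    long[] endSeekByte;
    int nSeekRows = 0;  // entries in use are [0:nSeekRows-1]

    SeekListSketch(int maxSeekRows) {
        int cap = Math.max(1, maxSeekRows);
        rowNbrToSeek = new int[cap];
        startSeekByte = new long[cap];
        endSeekByte = new long[cap];
    }

    /** Appends one matching row, doubling all three arrays when they are full (assumed policy). */
    void add(int rowNbr, long startByte, long endByte) {
        if (nSeekRows == rowNbrToSeek.length) {
            int newCap = rowNbrToSeek.length * 2;
            rowNbrToSeek = Arrays.copyOf(rowNbrToSeek, newCap);
            startSeekByte = Arrays.copyOf(startSeekByte, newCap);
            endSeekByte = Arrays.copyOf(endSeekByte, newCap);
        }
        rowNbrToSeek[nSeekRows] = rowNbr;
        startSeekByte[nSeekRows] = startByte;
        endSeekByte[nSeekRows] = endByte;
        nSeekRows++;
    }
}
```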

matchColumnFilterData

private boolean matchColumnFilterData(java.lang.String[] rowData,
                                      java.lang.String[] colNames,
                                      java.lang.String[][] colNameFilterData,
                                      int[] idxColNamesInRowData,
                                      int nCols,
                                      boolean useExactMatchFlag)
matchColumnFilterData() - test a row of Index-Map data to see if match Column lists.

Parameters:
rowData - - row of the index-map data to test (size nCols)
colNames - - is list of column names to test (size nCols)
colNameFilterData - - is nCols array of arrays of column data instances. If a particular array is null, do not test it.
idxColNamesInRowData - - indexes of colNames for rowData.
nCols - - # of columns to test in rowData[]
useExactMatchFlag - - use exact match flag, else match all substrings
Returns:
true if match or there are no filters, false if there are filters and it failed.
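
One possible reading of this contract is sketched below. It assumes that a row passes only if every column with a non-null filter list matches at least one value in that list, and that the non-exact mode tests whether a filter value occurs as a substring of the cell. Both points are assumptions; the original method may combine the filters differently. The class name is hypothetical, and the colNames parameter is omitted because the sketch does not need it.

```java
// Hypothetical helper, not part of HTMLtools: one interpretation of the column-filter test.
final class ColumnFilterSketch {
    static boolean matches(String[] rowData, String[][] colNameFilterData,
                           int[] idxColNamesInRowData, int nCols, boolean useExactMatchFlag) {
        for (int c = 0; c < nCols; c++) {
            String[] allowed = colNameFilterData[c];
            if (allowed == null) {
                continue;  // no filter for this column, so do not test it
            }
            String cell = rowData[idxColNamesInRowData[c]];
            boolean hit = false;
            for (String v : allowed) {
                if (useExactMatchFlag ? cell.equals(v) : cell.contains(v)) {
                    hit = true;
                    break;
                }
            }
            if (!hit) {
                return false;  // a filtered column failed (assumed AND across columns)
            }
        }
        return true;  // all filters passed, or there were no filters
    }
}
```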

matchRowFilterData

private boolean matchRowFilterData(java.lang.String sRowData,
                                   java.lang.String[] rowFilterNames,
                                   int nRowFilterNames)
matchRowFilterData() - test a String data row to for row name match.

Parameters:
sRowData - - row String
rowFilterNames - - is list of row names to test (size nRowFilterNames).
nRowFilterNames - - number of row filter names to test.
Returns:
true if match or there are no filters, false if there are filters and it failed.

flip_ftRowsTo_ftFlippedTable

private boolean flip_ftRowsTo_ftFlippedTable()
flip_ftRowsTo_ftFlippedTable() - create ftFlipped Table from ftRows Table

Returns:
true if succeed.

saveFlipTableAsHTMLfile

public boolean saveFlipTableAsHTMLfile(java.lang.String oDir,
                                       java.lang.String oFile)
saveFlipTableAsHTMLfile() - save flip Table as HTML file. It forces a ".html" extension on oFile.

Parameters:
oDir - output file directory
oFile - output file
Returns:
true if succeed

saveFlipTableAsTextfile

public boolean saveFlipTableAsTextfile(java.lang.String oDir,
                                       java.lang.String oFile)
saveFlipTableAsTextfile() - save flip Table as a tab-delimited text file. It forces a ".txt" extension on oFile.

Parameters:
oDir - output file directory
oFile - output file
Returns:
true if succeed
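
Forcing an extension on an output file name, as the two save methods above do with ".html" and ".txt", can be sketched as follows. The class and method names are hypothetical, and the handling of dots in directory names and of leading-dot file names is an assumption of this sketch rather than documented behavior.

```java
// Hypothetical helper, not part of HTMLtools: replaces or appends a file extension.
final class ExtensionSketch {
    /** Returns fileName with its extension replaced by ext (for example ".html"), or with ext appended if it has none. */
    static String forceExtension(String fileName, String ext) {
        int slash = Math.max(fileName.lastIndexOf('/'), fileName.lastIndexOf('\\'));
        int dot = fileName.lastIndexOf('.');
        // Only a dot inside the last path component, and not at its start, marks an extension.
        String base = (dot > slash + 1) ? fileName.substring(0, dot) : fileName;
        return base + ext;
    }
}
```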

mapMultilineHdrHREFs

public java.lang.String mapMultilineHdrHREFs(java.lang.String htmlFilePath)
mapMultilineHdrHREFs() - create HTML for ftFlipped Table.

Parameters:
htmlFilePath - is the name of the HTML file to write

getGlobalStatistics

private boolean getGlobalStatistics(FileTable ftIndex)
getGlobalStatistics() - Set up the global statistics if the index map is a .sidx (Statistics Index Map) file, which has the global statistics in header[0:1].


calcReportFoldChangeABstatistics

public boolean calcReportFoldChangeABstatistics(FileTable ftFlipped)
calcReportFoldChangeABstatistics() - calculate statistics if cvt.reportFoldChangeFlag is set. For EG samples cvt.flipAclass and cvt.flipBclass (no common samples), parse (nA,nB,mnA,mnB,sdA,sdB,fcAB=mnA/mnB,pValueAB) into float[]s. Then add string rows for (mnA,mnB,sdA,sdB,fcAB=mnA/mnB,pValueAB) to the ftFlipped table.

Parameters:
ftFlipped - - completed ftFlipped table (before add statistics)
Returns:
true if succeed in generating statistics
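
The per-class statistics named above (mnA, mnB, sdA, sdB, fcAB=mnA/mnB) can be sketched for a single column as follows. The class and method names are hypothetical. The use of the sample standard deviation (n-1 denominator) is an assumption, and the p-value computation is left out because the source does not say which test is used.

```java
// Hypothetical helper, not part of HTMLtools: mean, standard deviation, and fold change.
final class FoldChangeStatsSketch {
    static double mean(double[] x) {
        double sum = 0.0;
        for (double v : x) {
            sum += v;
        }
        return sum / x.length;
    }

    /** Sample standard deviation with an n-1 denominator (assumed; needs at least 2 values). */
    static double stdDev(double[] x) {
        double m = mean(x);
        double ss = 0.0;
        for (double v : x) {
            ss += (v - m) * (v - m);
        }
        return Math.sqrt(ss / (x.length - 1));
    }

    /** Fold change fcAB = mnA / mnB. */
    static double foldChange(double[] a, double[] b) {
        return mean(a) / mean(b);
    }
}
```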

thresholdFoldChangeColumnsInFlipTable

private boolean thresholdFoldChangeColumnsInFlipTable(FileTable ftF,
                                                      float fcThr,
                                                      double[] fcAB)
thresholdFoldChangeColumnsInFlipTable() - threshold columns < flipFCthreshold. If reporting fold change with reportFoldChangeFlag in flip Table reporting, and -flipFCthreshold:{flipFCthreshold} is given, then set the flipFCthrFlag and save the positive value in flipFCthreshold. Keep columns c with fcAB[c] >= fcThr or with 1/fcAB[c] >= fcThr.

Parameters:
ftF - is the ftFlipped table
fcThr - is flipFCthreshold
fcAB - is the fold-change values for the corresponding columns as ftF
Returns:
true if successful
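
The keep rule can be sketched as a per-column mask: a column is kept when its fold change, or the reciprocal of its fold change, reaches the threshold. The class and method names are hypothetical, and dropping columns with a non-positive fold change is an assumption of this sketch.

```java
// Hypothetical helper, not part of HTMLtools: marks the columns that pass the fold-change threshold.
final class FoldChangeThresholdSketch {
    /** keep[c] is true when fcAB[c] >= fcThr or 1/fcAB[c] >= fcThr. */
    static boolean[] keepColumns(double[] fcAB, double fcThr) {
        boolean[] keep = new boolean[fcAB.length];
        for (int c = 0; c < fcAB.length; c++) {
            double fc = fcAB[c];
            // Non-positive fold changes are dropped (assumption; avoids dividing by zero).
            keep[c] = fc > 0.0 && (fc >= fcThr || 1.0 / fc >= fcThr);
        }
        return keep;
    }
}
```

With a threshold of 2.0, fold changes of 2.5 and 0.4 are both kept, since 1/0.4 = 2.5, while 1.1 and 0.9 are dropped.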