Class DataRowStatisticsIndexMap

java.lang.Object
  extended by java.lang.Thread
      extended by DataRowStatisticsIndexMap
All Implemented Interfaces:
java.lang.Runnable

public class DataRowStatisticsIndexMap
extends java.lang.Thread

Class DataRowStatisticsIndexMap adds statistics computed from the data to the IndexMap, which can then be saved as a .sidx file. The ftIM IndexMap is read from the [TODO] add discussion on which data can be used.

 List of Methods
 ===================
 DataRowStatisticsIndexMap() - Constructor
 setIndexMapTable() - set the IndexMap FileTable ftIM to use
 setDataTable() - set the data FileTable ft to use if any
 createStatisticsIndexMapFile() - create & write Statistics Index Map file
 readIndexMapFile() - read the IndexMap FileTable to ftIM.
 readDataFileHeader() - read the data file FileTable to ft.
 computeRowStatistics() - compute row statistics of data (max,min,mean,stddev)
 computeGlobalStatistics() - compute global stats of (max,min,mean,stddev) data
 addRowStatisticsToIndexMap() - add the row stats to the IndexMap ftIM table.
 addGlobalStatisticsToIndexMapHdr() - add 2 header rows hdr[0:1] for global stats.
 writeStatisticsIndexMap() - write extended IndexMap w/statistics as .sidx file.

This code is available at the HTMLtools project on SourceForge at http://htmltools.sourceforge.org/ under the "Common Public License Version 1.0" http://www.opensource.org/licenses/cpl1.0.php.

It was derived and refactored from the open source MAExplorer (http://maexplorer.sourceforge.org/), and Open2Dprot (http://Open2Dprot.sourceforge.net/) Table modules.

$Date: 2009/09/15 11:45:56 $ $Revision: 1.34 $
Copyright 2008, 2009 by Peter Lemkin E-Mail: lemkin@users.sourceforge.net http://lemkingroup.com/


Nested Class Summary
 
Nested classes/interfaces inherited from class java.lang.Thread
java.lang.Thread.State, java.lang.Thread.UncaughtExceptionHandler
 
Field Summary
private  HTMLtools cvt
          converter link
private  java.lang.String dataFilePath
          The full path of the Table data file (.txt extension).
private  int[] dropListCols
          List of columns in FileTable ft that are in the -dropColumn list.
private  FileTable ft
          FileTable of the Data table
private  FileTable ftIM
          FileTable of the IndexMap
private  float glbMaxRowVal
          Global max value from data tables
private  double glbMaxRowValSum
          Global sum of row max values from data tables
private  float glbMeanRowVal
          Global mean value from data tables
private  double glbMeanRowValSum
          Global sum of row mean values from data tables
private  float glbMinRowVal
          Global min value from data tables
private  double glbMinRowValSum
          Global sum of row min values from data tables
private  float glbStdDevRowVal
          Global stdDev value from data tables
private  double glbStdDevRowValSum
          Global sum of row stdDev values from data tables
private  int idxEndByte
           
private  int idxMaxRow
           
private  int idxMeanRow
           
private  int idxMinRow
           
private  int idxStartByte
           
private  int idxStdDevRow
           
private  java.lang.String indexMapFilePath
          The full path of the Table Index Map file (.idx extension).
private  float[] maxRowVal
          Computed row data max values from the data Tables.
private  float[] meanRowVal
          Computed row data mean values from the data Tables.
private  float[] minRowVal
          Computed row data min values from the data Tables.
private  int nD2cols
          Number of data columns in the data Table NOT including the drop columns.
private  int nDcols
          Number of data columns in the data Table including the drop columns.
private  int nDropListCols
          Size of dropListCols[] list of columns in FileTable ft that are in the -dropColumn list.
private  int nIMrows
          Number of data rows in the index map Table.
private  int precision
          Number of digits in output statistics for .sidx Table
private  java.lang.String statIndexMapFilePath
          The full path of the Statistics Index Map file (.sidx extension).
private  float[] stdDevRowVal
          Computed row data stdDev values from the data Tables.
 
Fields inherited from class java.lang.Thread
MAX_PRIORITY, MIN_PRIORITY, NORM_PRIORITY
 
Constructor Summary
DataRowStatisticsIndexMap(HTMLtools cvt, java.lang.String dataFilePath)
          DataRowStatisticsIndexMap() - Constructor
 
Method Summary
 boolean addGlobalStatisticsToIndexMapHdr()
          addGlobalStatisticsToIndexMapHdr() - add 2 header rows hdr[0:1] for global statistics (names and values).
 boolean addRowStatisticsToIndexMap()
          addRowStatisticsToIndexMap() - add row statistics (Min, Max, Mean, StdDev) column data to the ftIM index map table after the ftIM table has been read into memory.
 boolean computeGlobalStatistics()
          computeGlobalStatistics() - compute global statistics of (max,min,mean,stddev) data from global sums computed in computeRowStatistics.
 boolean computeRowStatistics()
          computeRowStatistics() - compute row statistics of data (max,min,mean,stddev).
 boolean createStatisticsIndexMapFile()
          createStatisticsIndexMapFile() - create & write Statistics Index Map file
 boolean isDropColumn(int cTest)
          isDropColumn() - test if column cTest is a -drop list column.
 boolean readDataFileHeader()
          readDataFileHeader() - read the data file FileTable to ft.
 boolean readIndexMapFile()
          readIndexMapFile() - read the IndexMap FileTable to ftIM.
 void setDataTable(FileTable ft)
          setDataTable() - set the data FileTable ft to use, if any, when it is already in memory.
 void setIndexMapTable(FileTable ftIM)
          setIndexMapTable() - set the IndexMap FileTable ftIM to use if it is already in memory.
 void setTablePrecision(int precision)
          setTablePrecision() - set number of digits in output statistics.
 boolean writeStatisticsIndexMap(FileTable ftSIM)
          writeStatisticsIndexMap() - write extended IndexMap with statistics as a .sidx file.
 
Methods inherited from class java.lang.Thread
activeCount, checkAccess, countStackFrames, currentThread, destroy, dumpStack, enumerate, getAllStackTraces, getContextClassLoader, getDefaultUncaughtExceptionHandler, getId, getName, getPriority, getStackTrace, getState, getThreadGroup, getUncaughtExceptionHandler, holdsLock, interrupt, interrupted, isAlive, isDaemon, isInterrupted, join, join, join, resume, run, setContextClassLoader, setDaemon, setDefaultUncaughtExceptionHandler, setName, setPriority, setUncaughtExceptionHandler, sleep, sleep, start, stop, stop, suspend, toString, yield
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

cvt

private HTMLtools cvt
converter link


ftIM

private FileTable ftIM
FileTable of the IndexMap


ft

private FileTable ft
FileTable of the Data table


dataFilePath

private java.lang.String dataFilePath
The full path of the Table data file (.txt extension). We do not read the entire data file into Table ft, but instead random-access (all) rows based on the indexMap Table ftIM.


indexMapFilePath

private java.lang.String indexMapFilePath
The full path of the Table Index Map file (.idx extension). The table is read into ftIM. We do not write the table, but just read it.


statIndexMapFilePath

private java.lang.String statIndexMapFilePath
The full path of the Statistics Index Map file (.sidx extension). The table is the expanded ftIM table after we add the row and global statistics. We do not read the table, but just write it.


precision

private int precision
Number of digits in output statistics for .sidx Table


dropListCols

private int[] dropListCols
List of columns in FileTable ft that are in the -dropColumn list. This is of size nDropListCols.


nDropListCols

private int nDropListCols
Size of dropListCols[] list of columns in FileTable ft that are in the -dropColumn list.


idxStartByte

private int idxStartByte

idxEndByte

private int idxEndByte

idxMinRow

private int idxMinRow

idxMaxRow

private int idxMaxRow

idxMeanRow

private int idxMeanRow

idxStdDevRow

private int idxStdDevRow

nDcols

private int nDcols
Number of data columns in the data Table including the drop columns.


nD2cols

private int nD2cols
Number of data columns in the data Table NOT including the drop columns (computed).


nIMrows

private int nIMrows
Number of data rows in the index map Table.


minRowVal

private float[] minRowVal
Computed row data min values from the data Tables.


maxRowVal

private float[] maxRowVal
Computed row data max values from the data Tables.


meanRowVal

private float[] meanRowVal
Computed row data mean values from the data Tables.


stdDevRowVal

private float[] stdDevRowVal
Computed row data stdDev values from the data Tables.


glbMinRowVal

private float glbMinRowVal
Global min value from data tables


glbMaxRowVal

private float glbMaxRowVal
Global max value from data tables


glbMeanRowVal

private float glbMeanRowVal
Global mean value from data tables


glbStdDevRowVal

private float glbStdDevRowVal
Global stdDev value from data tables


glbMinRowValSum

private double glbMinRowValSum
Global sum of row min values from data tables


glbMaxRowValSum

private double glbMaxRowValSum
Global sum of row max values from data tables


glbMeanRowValSum

private double glbMeanRowValSum
Global sum of row mean values from data tables


glbStdDevRowValSum

private double glbStdDevRowValSum
Global sum of row stdDev values from data tables

Constructor Detail

DataRowStatisticsIndexMap

public DataRowStatisticsIndexMap(HTMLtools cvt,
                                 java.lang.String dataFilePath)
DataRowStatisticsIndexMap() - Constructor

Parameters:
cvt - is an instance of converter
dataFilePath - is the data .txt file that has an associated .idx IndexMap file with the same base name. The .sidx file to be created has the same base name.
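
The base-name relationship between the .txt, .idx and .sidx files can be sketched as follows. This is only an illustration, not code from HTMLtools: the class SidxPathSketch, the method withExtension() and the example path are hypothetical, and the real class may derive its paths differently.

```java
/** Hypothetical helper (not part of HTMLtools): swaps a file's extension. */
final class SidxPathSketch {

    /**
     * Returns path with its extension replaced by newExt (for example ".sidx").
     * If the file name has no extension, newExt is simply appended.
     */
    static String withExtension(String path, String newExt) {
        // Only a dot inside the file name (after the last separator) starts an extension.
        int sep = Math.max(path.lastIndexOf('/'), path.lastIndexOf('\\'));
        int dot = path.lastIndexOf('.');
        String base = (dot > sep + 1) ? path.substring(0, dot) : path;
        return base + newExt;
    }

    public static void main(String[] args) {
        String dataFilePath = "data/expr.txt";   // assumed example path
        System.out.println(withExtension(dataFilePath, ".idx"));   // IndexMap file
        System.out.println(withExtension(dataFilePath, ".sidx"));  // Statistics Index Map file
    }
}
```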
Method Detail

setIndexMapTable

public void setIndexMapTable(FileTable ftIM)
setIndexMapTable() - set the IndexMap FileTable ftIM to use if it is already in memory.


setDataTable

public void setDataTable(FileTable ft)
setDataTable() - set the data FileTable ft to use, if any, when it is already in memory.


setTablePrecision

public void setTablePrecision(int precision)
setTablePrecision() - set number of digits in output statistics. The default is 2.
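
As a rough illustration of what a precision setting could mean for the .sidx output, the sketch below formats a statistic with a fixed number of fractional digits. It assumes that "number of digits" means digits after the decimal point, which the source does not state; PrecisionSketch and format() are hypothetical names, not part of HTMLtools.

```java
import java.util.Locale;

/** Hypothetical helper (not part of HTMLtools): fixed-precision formatting. */
final class PrecisionSketch {

    /**
     * Formats value with the given number of digits after the decimal point.
     * Assumption: precision counts fractional digits; negative values are treated as 0.
     */
    static String format(float value, int precision) {
        int digits = Math.max(0, precision);
        // Locale.ROOT keeps '.' as the decimal separator regardless of the platform locale.
        return String.format(Locale.ROOT, "%." + digits + "f", value);
    }

    public static void main(String[] args) {
        System.out.println(format(3.14159f, 2)); // default precision of 2
        System.out.println(format(1.5f, 3));
    }
}
```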


createStatisticsIndexMapFile

public boolean createStatisticsIndexMapFile()
createStatisticsIndexMapFile() - create & write Statistics Index Map file

Returns:
true if successful
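
The source does not describe what createStatisticsIndexMapFile() does internally. Judging only from the method list, one plausible arrangement is to run the boolean-returning steps in order and stop at the first one that fails. The sketch below shows that pattern with stand-in steps; PipelineSketch and firstFailure() are hypothetical, the step order is an assumption, and the lambdas are stubs rather than calls into the real class.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.BooleanSupplier;

/** Hypothetical sketch (not part of HTMLtools): run boolean steps until one fails. */
final class PipelineSketch {

    /**
     * Runs the steps in insertion order and stops at the first one returning false.
     * Returns the name of the failing step, or null if every step succeeded.
     */
    static String firstFailure(Map<String, BooleanSupplier> steps) {
        for (Map.Entry<String, BooleanSupplier> step : steps.entrySet()) {
            if (!step.getValue().getAsBoolean()) {
                return step.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Step names mirror the documented methods; the order is assumed and the bodies are stubs.
        Map<String, BooleanSupplier> steps = new LinkedHashMap<>();
        steps.put("readIndexMapFile", () -> true);
        steps.put("readDataFileHeader", () -> true);
        steps.put("computeRowStatistics", () -> true);
        steps.put("computeGlobalStatistics", () -> true);
        steps.put("addRowStatisticsToIndexMap", () -> true);
        steps.put("addGlobalStatisticsToIndexMapHdr", () -> true);
        steps.put("writeStatisticsIndexMap", () -> true);

        String failed = firstFailure(steps);
        System.out.println(failed == null ? "all steps succeeded" : "failed at " + failed);
    }
}
```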

readIndexMapFile

public boolean readIndexMapFile()
readIndexMapFile() - read the IndexMap FileTable to ftIM. This can be used if the IndexMap file has not been read yet.

Returns:
true if successful, with ftIM, idxStartByte, and idxEndByte set up.

readDataFileHeader

public boolean readDataFileHeader()
readDataFileHeader() - read the data file FileTable to ft.

Returns:
true if successful, with ft set up.

isDropColumn

public boolean isDropColumn(int cTest)
isDropColumn() - test if column cTest is a -drop list column. If it is not in the list, or if there is no list, return false.

Parameters:
cTest - is the column index to test
Returns:
true if it is a drop list column.
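
The documented behavior (false when the column is not in the list or when there is no list) can be sketched with a simple linear scan. The version below is a stand-alone illustration that takes the list as parameters; the actual method reads the private dropListCols and nDropListCols fields, and its real implementation is not shown in the source.

```java
/** Hypothetical stand-alone sketch of the documented isDropColumn() behavior. */
final class DropColumnSketch {

    /**
     * Returns true only if cTest appears among the first nDropListCols
     * entries of dropListCols; returns false if the list is missing or empty.
     */
    static boolean isDropColumn(int[] dropListCols, int nDropListCols, int cTest) {
        if (dropListCols == null || nDropListCols <= 0) {
            return false; // no list
        }
        // Guard against a count larger than the array actually holds.
        int n = Math.min(nDropListCols, dropListCols.length);
        for (int i = 0; i < n; i++) {
            if (dropListCols[i] == cTest) {
                return true;
            }
        }
        return false; // not in list
    }

    public static void main(String[] args) {
        int[] drops = {0, 3}; // assumed example: columns 0 and 3 are dropped
        System.out.println(isDropColumn(drops, drops.length, 3));
        System.out.println(isDropColumn(drops, drops.length, 1));
        System.out.println(isDropColumn(null, 0, 1));
    }
}
```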

computeRowStatistics

public boolean computeRowStatistics()
computeRowStatistics() - compute row statistics of data (max,min,mean,stddev).

Returns:
true if successful
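
For readers who want to see the four row statistics concretely, here is a minimal sketch for a single row of values that skips dropped columns. It is not the HTMLtools implementation: RowStatsSketch, Stats and compute() are hypothetical names, the drop list is passed as a boolean mask for simplicity, and the standard deviation is assumed to be the population form (dividing by n), which the source does not specify.

```java
/** Hypothetical sketch (not part of HTMLtools): min, max, mean, stdDev of one row. */
final class RowStatsSketch {

    /** Simple holder for the four per-row statistics. */
    static final class Stats {
        final float min, max, mean, stdDev;
        Stats(float min, float max, float mean, float stdDev) {
            this.min = min; this.max = max; this.mean = mean; this.stdDev = stdDev;
        }
    }

    /**
     * Computes statistics over row[], ignoring columns flagged in drop[].
     * drop may be null or shorter than row (missing entries mean "keep").
     * Returns null if no column is left after dropping.
     */
    static Stats compute(float[] row, boolean[] drop) {
        int n = 0;
        double sum = 0.0;
        float min = Float.POSITIVE_INFINITY;
        float max = Float.NEGATIVE_INFINITY;
        for (int c = 0; c < row.length; c++) {
            if (isDropped(drop, c)) continue;
            min = Math.min(min, row[c]);
            max = Math.max(max, row[c]);
            sum += row[c];
            n++;
        }
        if (n == 0) return null;

        double mean = sum / n;
        // Second pass: sum of squared deviations from the mean.
        double sumSq = 0.0;
        for (int c = 0; c < row.length; c++) {
            if (isDropped(drop, c)) continue;
            double d = row[c] - mean;
            sumSq += d * d;
        }
        // Assumption: population standard deviation (divide by n, not n - 1).
        double stdDev = Math.sqrt(sumSq / n);
        return new Stats(min, max, (float) mean, (float) stdDev);
    }

    private static boolean isDropped(boolean[] drop, int c) {
        return drop != null && c < drop.length && drop[c];
    }

    public static void main(String[] args) {
        float[] row = {100f, 2f, 4f, 4f, 4f, 5f, 5f, 7f, 9f};
        boolean[] drop = {true}; // drop column 0 only
        Stats s = compute(row, drop);
        System.out.println(s.min + " " + s.max + " " + s.mean + " " + s.stdDev);
    }
}
```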

computeGlobalStatistics

public boolean computeGlobalStatistics()
computeGlobalStatistics() - compute global statistics of (max,min,mean,stddev) data from global sums computed in computeRowStatistics.

Returns:
true if successful

addRowStatisticsToIndexMap

public boolean addRowStatisticsToIndexMap()
addRowStatisticsToIndexMap() - add row statistics (Min, Max, Mean, StdDev) column data to the ftIM index map table after the ftIM table has been read into memory.

Returns:
true if successful

addGlobalStatisticsToIndexMapHdr

public boolean addGlobalStatisticsToIndexMapHdr()
addGlobalStatisticsToIndexMapHdr() - add 2 header rows hdr[0:1] for global statistics where
  hdr[0] has names
   ("Global Min", "Global Max", "Global Mean", "Global StdDev")
  hdr[1] has global values
   (glbMinRowVal, glbMaxRowVal, glbMeanRowVal, glbStdDevRowVal)
  

Returns:
true if successful
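
The two header rows described above can be pictured as a 2 x 4 table of strings. The sketch below builds such a table from four global values; it is an illustration only. GlobalHeaderSketch and buildHeaderRows() are hypothetical, and formatting the values with a fractional-digit precision is an assumption about how the .sidx output is written.

```java
import java.util.Locale;

/** Hypothetical sketch (not part of HTMLtools): build the two global-statistics header rows. */
final class GlobalHeaderSketch {

    /**
     * Returns hdr[0..1][0..3]: hdr[0] holds the column names and
     * hdr[1] holds the matching global values formatted as text.
     */
    static String[][] buildHeaderRows(float glbMin, float glbMax,
                                      float glbMean, float glbStdDev,
                                      int precision) {
        String[] names = {"Global Min", "Global Max", "Global Mean", "Global StdDev"};
        float[] vals = {glbMin, glbMax, glbMean, glbStdDev};

        // Assumption: precision is the number of digits after the decimal point.
        String fmt = "%." + Math.max(0, precision) + "f";
        String[] values = new String[vals.length];
        for (int i = 0; i < vals.length; i++) {
            values[i] = String.format(Locale.ROOT, fmt, vals[i]);
        }
        return new String[][] {names, values};
    }

    public static void main(String[] args) {
        String[][] hdr = buildHeaderRows(0.5f, 9.25f, 4f, 1.75f, 2);
        System.out.println(String.join("\t", hdr[0]));
        System.out.println(String.join("\t", hdr[1]));
    }
}
```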

writeStatisticsIndexMap

public boolean writeStatisticsIndexMap(FileTable ftSIM)
writeStatisticsIndexMap() - write extended IndexMap with statistics as a .sidx file.

Returns:
true if successful