These release notes contain the following information:
What's New in Insightful Miner 3.0
Supported Platforms and System Requirements
Getting Help in Insightful Miner
Contact Information for Feedback
These Release Notes are current as of June 2003. For updates to the Release Notes that are made after this date and other useful information, please see the Web site:
http://www.insightful.com/support/iminer30/
To familiarize yourself with this release, see the manual Insightful Miner 3 Getting Started Guide. This manual is provided as an Adobe Acrobat PDF file that you can access by selecting Getting Started from the main Help menu in Insightful Miner. The guide references data sets and documents in the examples folder under your Insightful Miner installation directory; you can use these files to familiarize yourself with the functionality in Insightful Miner.
Insightful Miner is now available on Solaris 2.6, 7, 8, and 9. Previously the product was available only on Microsoft Windows.
A variety of components have been added.
Insightful Miner 2.0 uses ODBC to connect to databases. In many cases, a native database driver customized for a particular database provides improved performance.
Insightful Miner 3.0 provides read and write components using native database drivers. On Windows, native database drivers are available for Oracle, DB2, Sybase, and SQL Server. On Solaris, native database drivers are available for Oracle, DB2, and Sybase. For accessing additional databases, read and write components using ODBC are still available on Windows.
The Read Excel File and Write Excel File nodes allow you to read and write Excel data. This functionality was previously provided as part of the Read Other File node.
The Read Excel File component lets you specify which sheet within the Excel file contains the data to read.
The Compare component enables you to compare outputs from different nodes. For example, you could compare the predicted results of one node with the measured data from another.
The Transpose component allows you to exchange rows and columns in a dataset.
The Cox Regression component fits Cox proportional hazards models. These models are historically used for survival and reliability analysis, and more recently have been applied in churn modeling.
Insightful Miner 3.0 introduces a new type of link for passing models between components. The model port is indicated by a circle on the lower left and/or right of the node. Links between model ports are displayed as dotted red lines.
The standard links pass rectangular data sets between components. The new model links pass information about a fitted model between components.
The model ports are not currently available for the S-PLUS Script node.
Model nodes and the Import PMML node have output model ports. Predict, Generate C, Export PMML, and Export HTML nodes have input model ports.
The model nodes are: Linear Regression, Logistic Regression, Classification Tree, Regression Tree, Classification Neural Network, Regression Neural Network, K-Means, Naive Bayes, Principal Components, and Cox Regression. Each of these nodes can create and output a model.
The Predict node has been extensively redesigned and enhanced.
Previously the Predict node could be created only by using the Create Predictor menu item for a model node. The Predict node can still be created in this manner, but it is now also available in the Explorer.
The Predict node has a standard port for the data on which to predict, and a model port to indicate the model node to use for prediction. The model node will be run before the Predict node, and the predictions will be performed using the currently computed model.
If the Predict node is attached to a computed model node and then disconnected, a copy of the model will be stored in the Predict node. This copy of the model will not be changed if the model node is changed. A solid black port indicates that a model is stored, while a gray port indicates that the Predict node contains no model or uses the model link to obtain the model.
Sometimes the data used in prediction contains category values that were not present in the training data used to construct the model. The Predict node lets the user specify what to do when a previously unknown level is encountered: replace it with a specified level, replace it with NA, or signal an error. Different replacement values can be specified for each categorical column.
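The three choices for unknown levels can be illustrated with a small Python sketch. This is not Insightful Miner code: the function name, the policy names ("replace", "na", "error"), and the use of None to stand for NA are all assumptions made for illustration.

```python
def resolve_unknown_levels(values, known_levels, policy="na", replacement=None):
    """Apply one unknown-level policy to a single categorical column.

    Hypothetical helper: None stands in for NA, and the policy names
    are illustrative, not Insightful Miner option names.
    """
    known = set(known_levels)
    resolved = []
    for value in values:
        if value is None or value in known:
            # Levels seen in training (and existing NAs) pass through.
            resolved.append(value)
        elif policy == "replace":
            resolved.append(replacement)
        elif policy == "na":
            resolved.append(None)
        elif policy == "error":
            raise ValueError(f"unknown level: {value!r}")
        else:
            raise ValueError(f"unknown policy: {policy!r}")
    return resolved


# Each categorical column can be given its own policy and replacement.
colors = resolve_unknown_levels(["red", "teal"], ["red", "blue"],
                                policy="replace", replacement="blue")
sizes = resolve_unknown_levels(["S", "XXL"], ["S", "M", "L"], policy="na")
```

Calling the helper once per column mirrors the per-column settings described above.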
The Generate C component generates a set of files that provide C code for predicting from the model. The C code is standard ANSI C, and is stand-alone code that does not require Insightful Miner to run. The code is not platform specific, and is designed to be portable.
The actual model is described in a text format in a model.txt file. This is placed in a text file rather than hard-coded in the C code so that a new model may be substituted by changing the text file, without the need to recompile C code. General data structures are described in IMObjects.c and IMObjects.h, and the model-specific prediction code is in files such as IMTree.c and IMTree.h.
Predictive Modeling Markup Language (PMML) is an XML standard for exchanging descriptions of data mining models. Currently model descriptions in this format are generated by many of the leading data mining products. Import of PMML in commercial products is less widespread, but is expected to become more prevalent in the future.
The Export PMML node can generate PMML for any of the model nodes. The PMML extension mechanism is used to store information used by Insightful Miner that is not available in standard PMML, and to describe models such as Principal Components and Cox Regression that are not defined in PMML.
The Import PMML node is guaranteed to correctly import PMML generated by Insightful Miner.
The Export HTML node creates an HTML report describing the model. Typically this HTML report is also available through the node's custom viewer.
Many Insightful Miner 2.0 nodes have been enhanced.
A Select Table button in each database read/write node allows you to view the available tables in the database, and select one to use.
Insightful Miner 2.0 replaces the existing table when writing to a database. The database write nodes now allow selecting Create New Table (create a table only if it doesn’t exist), Overwrite Table (replace an existing table), or Append To Table (add rows at the end of an existing table).
Read and write are available for SAS 9 files.
The Missing Values component now allows different missing value methods and replacement values to be specified for each column, and it also includes “last value” replacement.
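As a rough illustration of "last value" replacement, the following Python sketch carries the most recent non-missing value forward. It assumes that this is what the option means, that None stands for a missing value, and that leading missing values are left untouched; the function name is hypothetical and none of this is taken from the product.

```python
def fill_with_last_value(column):
    """Replace each missing entry (None) with the last non-missing value.

    Assumption: entries before the first observed value stay missing.
    """
    filled = []
    last_seen = None
    for value in column:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


readings = fill_with_last_value([None, 1.5, None, None, 4.0, None])
```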
The Outlier Detection component now has an Add Outlier Index Column check box. If this is checked, the output contains a column indicating the original row number of each row.
The Bin component now provides extensive options for specifying the bin cut points for each column.
The Filter Columns dialog has been redesigned to use a list of columns with check boxes indicating whether to output the column rather than a pair of list boxes for inclusion and exclusion.
New functions have been added to the expression language used by the Create Columns, Filter Rows, and Split nodes. There are new functions for translating between doubles, dates, and strings (formatDouble, parseDouble, formatDate, parseDate), a function for calculating the sum of a column (columnSum), and a function for producing a categorical value (asCategorical).
The Classification Agreement, Lift Chart, and Regression Agreement property dialogs now allow the user to specify which columns to use. This is most useful when these nodes are used with an S-PLUS Script node that has not added role information to its output columns.
The Classification Agreement report includes precision, recall, and F-measure.
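These notes do not state how the three measures are computed. The Python sketch below uses the standard definitions and assumes that F-measure means the balanced F1 score (the harmonic mean of precision and recall). The function and argument names are illustrative only.

```python
def classification_metrics(true_pos, false_pos, false_neg):
    """Return (precision, recall, f_measure) for one class.

    Standard definitions; a zero denominator yields 0.0 by convention here.
    """
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_measure


# 6 correct positive predictions, 2 false alarms, 6 missed positives.
precision, recall, f_measure = classification_metrics(6, 2, 6)
```

With these counts, precision is 0.75 and recall is 0.5.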
The S-PLUS Library is available for use with Insightful Miner when S-Plus 6.1 Release 4 or higher is installed. This library is part of S-Plus, and is available on the S-Plus CD.
The S-PLUS Library is now on a separate tab in the Explorer. This tab will only appear if the S-PLUS Library has been installed.
The Read and Write S-PLUS Data nodes have been revised. These nodes allow data to be read from (written to) either an S-Plus chapter directory or an S-Plus transport file.
If the S-PLUS Library is installed, S-Plus graph nodes equivalent to those in S-Plus 6 for Solaris are available. These graph nodes are available in the S-PLUS tab of the Explorer.
Dialogs to create graphs are also available from the Chart menu of the Table View. These dialogs can be used to either immediately create a graph by pressing the Apply button, or to add a component to the worksheet by pressing the Add button.
To avoid running out of memory with large data, the dialogs contain a Max Rows field. If the data contains more than the specified number of rows, simple random sampling will be used to reduce the size of the data.
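The effect of the Max Rows field can be sketched in Python as simple random sampling without replacement. The function name, the seed parameter, and the decision to keep the sampled rows in their original order are assumptions for illustration; the notes do not describe how the product implements the sampling.

```python
import random


def limit_rows(rows, max_rows, seed=None):
    """Return at most max_rows rows, sampling uniformly without replacement."""
    if len(rows) <= max_rows:
        return list(rows)
    rng = random.Random(seed)
    # Sample row positions, then sort them to preserve the original order.
    keep = sorted(rng.sample(range(len(rows)), max_rows))
    return [rows[i] for i in keep]


data = list(range(10_000))
plotted = limit_rows(data, max_rows=1_000, seed=42)
```

Data that already fits within the limit is returned unchanged, so small data sets are graphed in full.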
Each graph dialog has a File tab with controls to specify a File Name and File Type. If a file name is specified, a graph file will be saved when the graph is created.
Selecting the Show Parameters Page check box on the Options page results in the Parameters page being added to the S-PLUS Script properties dialog. The parameters table is useful for passing arguments to the S-Plus script. Arguments are entered as name/value pairs that are then passed to the script as the args component of the IM list. The args component is a named character vector.
The Load button is available to select a text file containing a script to load into the script text area.
The Edit button displays the current content of the script text area in a separate text editor. The Notepad editor is used on Windows and the emacs editor is used on Solaris. A different text editor may be specified in the Global Properties dialog.
The Parse button will test whether the text area contains a complete and legitimate S-Plus script.
Show Results During Run and Store Results for View
In previous releases the text and graphic results for the S-PLUS Script node were displayed during run, and the viewer was the default Table View. Options are now available to specify different combinations of showing the results when running or displaying them when viewing.
The Show Results During Run check box indicates whether to display results during a run. This is useful when progress messages are intended to be printed during computation.
The Store Results For View check box indicates whether the results are stored for display during view. Displaying the results when viewing rather than running is consistent with the behavior of other nodes. However, it does take additional time and space to keep track of the results.
In Insightful Miner 2.0, it was usually necessary for the S-Plus script to contain code called during the “test phase” to determine the names, types, and roles of output columns, as well as details on the requirements of the script. The Options page now contains a radio button to select Specify in Script to get this behavior, or Specify Here to specify this information in the dialog.
If Specify Here is selected, the user can specify whether to get all of the data in a Single Block, a sample of the rows in a Single Block, or to use Multiple Blocks. The user can also specify the names, types, and roles for the output columns.
A variety of options that were previously available as global application options are now available as worksheet properties saved within the worksheet. These include the options Maximum Categorical Levels, Date Parsing Format, Caching, etc.
The new option Max Megabytes Per Block limits the amount of memory used for storing a data block, to prevent out-of-memory errors. In addition, new options are available for specifying the location of the Default File Directory, used for resolving relative file paths, and several other worksheet related directories.
When a new worksheet is created, all of these properties are filled in with default values. The worksheet properties for the current worksheet can be edited in the File:Properties dialog. The default values for new worksheets can also be changed with this dialog.
The Toggle Orthogonal Links menu item has been renamed Toggle Diagonal Links. The diagonal/orthogonal option is now available on a per-link basis in addition to an overall worksheet setting.
A collection of components may be collapsed together to create a Collection node. Creating a collection may be useful to combine several nodes into a single conceptual unit, or to reduce the space used on a worksheet.
Options are available for changing the node name, icons, help file, tool tips, exposed links, and other properties.
An Annotation may be added to present descriptive text on the worksheet. The Annotation box contains an editable text field. Options are available for the font, color, and background color.
Long node labels now wrap to appear on multiple lines.
Previously, the Table View was available as an individual component, and as the viewer for components with no custom viewer. This viewer is now available for all nodes using the Table View toolbar button or context menu item. For many nodes the Viewer and Table View are the same.
The worksheet file browser for Open and Save includes an Examples button. Pressing this button will copy the example files from the default examples directory to the user's work directory, and will browse to this directory.
The files are copied to assure that they are in a location in which the user has write access.
Folders and nodes in the Explorer are now editable. It is possible to cut, copy, delete, and rename nodes. Folders may be added, and items moved around to reorganize the items in the Explorer.
Each page in the explorer corresponds to a library of components. Three such libraries are the basic Insightful library, User library, and S-PLUS library. Additional libraries can be created and selectively exposed to the user via library management options in the context menu of the explorer tabs.
A menu item is now available to manage the open viewer windows. This lists the current viewers, and allows you to select which ones to close.
The Advanced tab contains an Execute After combo box that can be used to select a node that must be executed prior to the execution of the current node. This provides a way to control the flow of operations when it is important for one node to compute before the other. For example, a Read Text File node can be set to execute after a Write Text File node that writes the data file to be read.
If you have any version of Insightful Miner prior to version 3.0 Release 1, you need a new license key. This release has a new license manager which is not compatible with a pre-existing license manager.
Insightful Miner is supported on the following:
Windows NT 4.0 Service Pack 6 or later
Windows 2000
Windows XP Professional
Windows Server 2003
The minimum recommended system configuration is a Pentium II/300MHz processor, at least 256MB of RAM, and an SVGA or better graphics card and monitor. You must have at least 125MB of free disk space for the typical installation (and, if not installing on drive C:\, an additional 2MB free disk space on drive C:\ to unpack the distribution). Additional disk space is necessary to run the application.
32-bit version for Solaris 2.6, 7, 8, or 9
Before installing Insightful Miner, review the minimum system configuration. For a typical installation, you need 225MB of disk space, 64MB of base RAM, and 40MB of per-user RAM. To determine the required RAM, add the base RAM to the per-user RAM multiplied by the number of simultaneous users. For example, a single-user Solaris system should have at least 64 + 40*1 = 104MB of RAM.
The base RAM and per-user RAM listings can also be used to calculate minimum swap space requirements. In general, the minimum swap space required is twice the sum of the base RAM and the per-user RAM multiplied by the number of simultaneous users. For example, on a Solaris system with three simultaneous Insightful Miner users, the minimum swap space is 2*(64+(40*3)) = 368MB.
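The RAM and swap rules above can be written as a short Python sketch, using the 64MB base and 40MB per-user figures quoted for a typical Solaris installation. The function names are illustrative only.

```python
BASE_RAM_MB = 64       # base RAM for a typical installation
PER_USER_RAM_MB = 40   # additional RAM per simultaneous user


def min_ram_mb(users):
    """Minimum RAM: base RAM plus per-user RAM times the number of users."""
    return BASE_RAM_MB + PER_USER_RAM_MB * users


def min_swap_mb(users):
    """Minimum swap space: twice the minimum RAM for the same user count."""
    return 2 * min_ram_mb(users)


single_user_ram = min_ram_mb(1)     # 64 + 40*1
three_user_swap = min_swap_mb(3)    # 2*(64 + 40*3)
```

For three simultaneous users this gives 368MB of swap, matching the worked example above.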
Installation instructions are available in the INSTALL.txt file accompanying these Release Notes. Different versions of this file are available for Windows and Solaris.
After installing Insightful Miner, the Insightful Miner 3.0 Release 1 program group appears as an option under Programs when you click the Start button. This program group contains the following options:
· Insightful Miner 3.0 launches the Insightful Miner application.
· Insightful Miner Release Notes opens the release notes (this document).
· Insightful Miner 3.0 Help opens the help system.
· Insightful License Manager launches the Insightful Miner license manager used for entering license keys.
Some editions of Insightful Miner will have the license manager item in a separate Insightful License Manager program group. See the INSTALL.txt file located in the top level of your Insightful Miner CD for details on requesting and installing license keys.
To start Insightful Miner:
1. From the Start menu, choose Programs.
2. Choose the Insightful Miner 3.0 Release 1 program group.
3. Choose Insightful Miner 3.0.
By default, the installation program places a shortcut to Insightful Miner on your desktop. If you kept this option during installation, you can simply double-click the shortcut icon to launch Insightful Miner.
During installation, a shortcut to the IMiner shell script will be placed in a location determined by the system administrator. If this script is installed in a commonly available location in the user’s PATH, it will be directly available by typing IMiner at the Solaris prompt. This script is also available at the top level of the Insightful Miner installation directory.
On Windows, the help system in Insightful Miner uses Microsoft’s current help standard, HTML Help. On Solaris, the help system uses JavaHelp.
To obtain help on a particular component in this release, click the Help button in the properties dialog for the component. Alternatively, you can right-click on a node in your worksheet and choose Help from the context-sensitive menu. To view the table of contents, index, or search pages for the help system, use the options under the Help menu in the Insightful Miner graphical user interface.
You can access online PDF versions of the Getting Started Guide and the User's Guide. In the Insightful Miner main menu, go to Help and select the appropriate manual from the drop-down list.
On Solaris, the X-Windows system is used to display graphical user interfaces. Insightful Miner requires an X-server to be running on the machine displaying the user interface.
To use Insightful Miner on Solaris, you must be able to connect to your local X window server. Thus, you must have the environment variable DISPLAY set, and the X window server on your local machine must allow Insightful Miner to create windows on your machine; see the Solaris programs xauth or xhost. Typically, if you can run xclock on your machine, then Insightful Miner should also be able to access the X server.
To set your display from a C-like shell (csh, tcsh, etc.), use the setenv command:
% setenv DISPLAY <display_name>
where <display_name> identifies the display on your local machine, typically the machine name followed by a display number (for example, mymachine:0.0). From the Bourne- and Korn-like shells (including sh, ksh, bash, etc.), use the following commands:
% DISPLAY=<display_name>;export DISPLAY
You do not need to do this if your DISPLAY variable is already set; check the output from echo $DISPLAY to be sure.
When running Insightful Miner as a batch process, the intent will often be to do background processing on a machine that does not have DISPLAY set. One approach to handling this situation is to run Xvfb on the machine running Insightful Miner. This starts a "virtual frame buffer" that can be used as a headless display.
Some relevant links are:
http://www.itworld.com/AppDev/1461/UIR000330xvfb/
http://developers.sun.com/solaris/articles/solaris_graphics.html#4
Start Xvfb and set the display as follows:
% /usr/X11R6/bin/Xvfb :1 -screen 0 800x600x24 &
% setenv DISPLAY :1.0
Source code is available from:
Solaris versions of Insightful Miner may be accessed remotely from a Windows desktop using X-server software. Insightful Miner has been tested with the following products:
Exceed (Hummingbird)
Reflection X (WRQ)
X-Win32 (StarNet)
Cygwin (Redhat Cygwin)
The graphical user interface (GUI) performance of Insightful Miner will depend upon the X-server software. The performance of the underlying Insightful Miner engine is not affected.
The following notes describe configurations that have been found to produce the best results.
Version 8.0.0 of Exceed was tested, and no significant performance problems were observed. When launching Insightful Miner from an xterm, the xterm pops in front of the Insightful Miner application when it first displays, and the Insightful Miner application must be clicked with the mouse to recover desktop focus. Fonts were smaller than on other X-servers. Exceed seemed to have the best response time of any of the X-servers.
Version 10.0 of Reflection X was tested, and no performance problems were observed.
When using some versions of X-Win32 to start X applications on your desktop, we have found that the "multiple windows" setting causes some latency in GUI response and some defects in displaying GUI elements. We recommend running in "single window" mode.
If you are connecting with XDMCP, then running in "single window" mode should present you with a default window manager from which you may run Insightful Miner.
If you are connecting with rsh or rexec in "single window" mode, the application may be displayed in a "raw" X-session that lacks a window manager. We recommend that you start the Solaris default window manager (/usr/dt/bin/dtwm) using either the X-Win32 "Command" option, or from an xterm on the Solaris host. Once the window manager has been launched, the Insightful Miner application should run and display successfully.
The open-source Cygwin product, currently sponsored by Redhat, provides a UNIX-like environment that runs on top of a Windows operating system. It includes a port of XFree86 that provides X-server capabilities. The X-server utility is /usr/X11R6/bin/XWin.exe, and it is generally invoked from a start-up script in the same directory, named startxwin.sh.
By default, Cygwin installs with the twm window manager, and the startxwin.sh script uses twm as the window manager when launching XWin. While this works fairly well, we have found the WindowMaker (wmaker.exe) window manager to work much better with Insightful Miner. To use the WindowMaker window manager under Cygwin, make sure that the wmaker package is included when installing Cygwin; after the installation, modify the startxwin.sh script to call wmaker instead of twm.
Cygwin also offers the FVWM-2 window manager, but it does not work as well as WindowMaker in displaying the Insightful Miner application.
Recently Cygwin introduced a "multiple window" option for XWin. This appears to have significant defects at the present time. We recommend that you avoid using this option.
· If possible, set the Display Colors in the Windows Control Panel to use True Color (32 bit).
· If the Display Colors are set to High Color (16 bit) the dithering of colors will give overly light images when printing.
· If the Display Colors are set to 16 color the application will fail to start properly.
· Insightful Miner 3.0 is developed and extensively tested on the US English version of Windows. It is not tested on other localized versions of Windows.
· Our experience suggests that the application will install and run properly on European Windows versions (such as French, German, and Spanish) that use only 8-bit character sets.
· Internally, characters are represented using an extended character set. However, the read and write functionality only supports 8-bit characters rather than a full Unicode character set.
· Insightful Miner is unlikely to work completely on Asian versions of Windows that utilize multi-byte characters extensively. The Read Text File and Write Text File components provide a Text Encoding option that can be set to UTF-8 to indicate that the text file contains multi-byte characters in the UTF-8 format. The other read and write nodes typically do not handle multi-byte characters.
· The computational engine checks for an interrupt request whenever a computational node is ready to process a new block of data. If a computation takes a long time to perform, the node may not check for the interrupt very quickly. In this case you can stop the computation by saving the worksheet and exiting the application. Note that it is possible to save a worksheet while the network is running, so you can exit without losing worksheet changes.
· Insightful Miner cannot always predict the amount of memory needed for an operation. If the operation requires more memory than is available, Insightful Miner attempts to detect this and generate a descriptive error message. In some cases, the operation will fail without an error message.
· You can increase the amount of memory the program attempts to access by using the -Xmx argument when starting the program. For example, using -Xmx1024m tells the Java virtual machine used to support the GUI that it can attempt to access 1024MB of RAM for the heap. If too large a value is specified, the application fails to start.
· If the wrong delimiter is specified when reading a text file, the preview may see the whole row or whole file as a single item and may take a long time to compute.
· The status of a read file or read database node does not change when the file is changed external to the program. To re-read the file, invalidate the node.
· ODBC import and export facilities do not support nchar, nvarchar, or ntext data types. The varchar type is supported.
· The read and write database components stop processing when accessing data from a Windows 2000 machine using the Oracle ODBC driver version 8.1.5. The problem is fixed when the Oracle ODBC driver is upgraded to version 8.1.5.7.
· All Write database nodes convert table names to upper case. The read database nodes match table names without regard to upper or lower case, so a table named "Table23" can be read by specifying a table name of "Table23", or "table23", or "TABLE23". Some databases such as Sybase can contain two distinct tables whose names just differ by case. In this situation, it is possible to retrieve the different tables by using an explicit SQL statement, such as "select * from Table23" or "select * from TABLE23". To prevent accidentally creating two such tables, all of the write database nodes convert the specified table name to upper case.
· When a large number of charts are being viewed, sometimes there is a lag associated with the chart scrolling that allows the charts to fall out of alignment with the headings.
· When fitting a single tree with a large Maximum Rows value, you may need to reduce the value in Stop Splitting Complexity When Complexity Changes to grow a tree.
· The dialog is not able to traverse Windows shortcuts pointing to other directories. If an item does not appear as a folder in the left-hand pane in Windows Explorer, it will not appear as a folder in the file selection dialog.
· Files and directories may not be deleted from this dialog.
Please feel free to contact us with suggestions and questions regarding possible bugs in this release. Bug reports can be sent to the following email address:
We are very interested in receiving your comments and suggestions for improving Insightful Miner.
When reporting bugs, please include any warnings and error messages that you see. There are two locations where Insightful Miner writes status messages:
1. The scrolling text pane at the bottom of the Insightful Miner application window; and
2. The Insightful Miner log file.
As Insightful Miner runs, status messages are displayed in the bottom pane of the application window. Additional messages are recorded in the log file, which is stored in your working directory.
On Windows, the default log file location is C:\Program Files\Insightful\Insightful Miner 3.0 Release 1\users\<username>\logfile.txt. On Solaris, the default log file location is ${HOME}/iminer_work/logfile.txt where ${HOME} is the location of the user’s Solaris home directory.
When reporting bugs, please include a copy of the log file as well as any messages you see in the Insightful Miner application window. Insightful Miner currently overwrites the existing log file each time you launch the application.