Data that has already been normalized/summarized outside of S+ArrayAnalyzer
can be imported
for differential expression testing. The data will need to be imported and then
put into the appropriate
object type for analysis in S-PLUS. In most cases, the data to be imported has
the simple matrix
structure of genes represented in the rows and different chips represented in
the columns of the data.
This type of data can be imported through the graphical user interface in Windows
and from the command line in other versions. Both operations are shown below.
Import from the Graphical User Interface
In Windows it is very easy to import data using the graphical user interface.
After S-PLUS is
open and the S+ArrayAnalyzer module has been loaded, select File>Import data…>From
File…

Select Browse from the resulting dialog to choose your file:

Navigate to the directory that contains your data using the Select file to
import dialog box.
After you have selected your file and pressed "OK" the file name and
path will appear in the
"File Name:" box on the Import From File dialog. Make sure that the
type of data you are
importing is correctly selected under the "Data Type:" drop down menu.

The "Data set" box will allow you to name your dataset. Simply enter
a name in the space provided.
If there are existing dataframes in the current working directory they will be
shown in the drop down list.
This is useful if you want to append data to an existing dataframe. In our case
we want to import
new data so we will check "Create new data set" and assign a new name
in the "Data set" box.
Use the "Update Preview" button to examine what the data will look like on import.
This button gives you a quick peek at the data before performing the entire import,
so you can make sure things like the column and row names are being read in and
placed correctly. A quick look at the example data shows that we need to move the
first column over so that the gene names are set as the row names in the
resulting dataframe.

To move the gene names into the row names position, select the "Options" tab.
Then change the "Row name col" setting from the default value of "Auto"
to the numeric value 1.

Return to the "Data Specs" tab and select "Update Preview" again to examine the result.
The gene names now appear correctly placed in the row names position. Other settings
for changing the delimiter, setting the start column and row, and filtering the data
are available under the "Options" and "Filter" tabs. Those tabs are not covered
in more detail here. Select "OK" and the file will be imported under the
specified name.
Import from the Command Line
From the command line the import of data can be completed in one simple line
of code
encompassing all the options seen in the GUI import. The real difference is
that the final
result cannot be previewed. The command below will import the data in the same
fashion
as just performed in the GUI:
filename<-importData(file="fidler.txt",type="ASCII",
delimiter="\t",colNameRow=1,rowNamesCol=1,stringsAsFactors=F)
We can then examine the first few rows of the imported file.
> filename[1:5,1:5]
X20245 X20246 X20247 X30308 X30309
212466_at 5.703759 5.865593 5.403050 5.721424 5.771488
212467_at 8.758925 8.423956 8.769344 8.427560 8.883989
212468_at 6.605608 5.897529 6.368558 5.168173 5.828893
212469_at 8.470683 8.581691 8.622024 8.593048 8.593915
212470_at 8.887343 8.188181 8.483683 8.316764 8.70789
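Because the command-line import cannot be previewed, a few quick checks on the result can stand in for the preview. The sketch below is an illustration only: it uses a small made-up dataframe (the hypothetical object example.data, with invented values) in place of the real import so that it runs on its own, and it relies only on basic S functions (dim, dimnames, sapply, is.numeric). With real data, substitute the object returned by importData.

```r
# Stand-in for the imported data: 3 probe sets (rows) by 2 chips (columns).
# The object name and the values are made up for illustration only.
example.data <- data.frame(X20245 = c(5.70, 8.76, 6.61),
                           X20246 = c(5.87, 8.42, 5.90),
                           row.names = c("212466_at", "212467_at", "212468_at"))

# Rows are genes and columns are chips.
n.genes <- dim(example.data)[1]
n.chips <- dim(example.data)[2]

# Every chip column should be numeric. A character or factor column
# usually means the gene names were not moved into the row names.
all.numeric <- all(sapply(example.data, is.numeric))

# Gene names should be unique row names.
gene.names <- dimnames(example.data)[[1]]
names.unique <- length(unique(gene.names)) == n.genes

if (!all.numeric) {
    stop("non-numeric column found; check the rowNamesCol setting")
}
if (!names.unique) {
    stop("duplicated gene names in the row names")
}
```

If either check stops with an error, the most likely cause is that the gene name column was read in as data rather than as row names.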
Creation of the exprSet Object
To put the data into an object type supported by S+ArrayAnalyzer, some additional
command line code must be run regardless of the import method.
The best object type for previously summarized and/or normalized data is the exprSet
object. Full details on the exprSet object are available under the help menu
(Help>Available Help>arrayanalyzer) by looking under the keyword exprSet
in the index.
The creation of the exprSet can be done in S-PLUS in two basic steps. First
we will create
the phenoData object and then the exprSet. The phenoData object (see keyword
phenoData)
describes the experimental conditions used in creating the data. More specifically,
the
phenoData object describes how the columns (or chips) of the imported dataframe
are
organized in the experiment. In our example we have two conditions, "MUTANT"
and "WILD-TYPE" (abbreviated "MUT" and "WT" in the code), that equally divide the
42 columns (or chips) in the experiment. The following command creates a character
vector of condition labels, one per chip, for use in creating the phenoData object.
factorlist<-rep(c("MUT","WT"),c(21,21))
Look at the factorlist:
> factorlist
[1] "MUT" "MUT" "MUT" "MUT" "MUT" "MUT" "MUT"
[8] "MUT" "MUT" "MUT" "MUT" "MUT" "MUT" "MUT"
[15] "MUT" "MUT" "MUT" "MUT" "MUT" "MUT" "MUT"
[22] "WT" "WT" "WT" "WT" "WT" "WT" "WT"
[29] "WT" "WT" "WT" "WT" "WT" "WT" "WT"
[36] "WT" "WT" "WT" "WT" "WT" "WT" "WT"
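When rep() is given a vector of counts, it repeats each label that many times, so the labels come out in blocks. This describes the experiment correctly only if the columns of the imported data are in the same order (all mutant chips first, then all wild-type chips). A minimal sketch of that check follows, assuming a hypothetical vector of six chip names rather than the 42 columns of the real data. The names chip.names, condition.labels, label.counts and chip.key are illustrative and not part of S+ArrayAnalyzer.

```r
# Hypothetical chip names standing in for dimnames(filename)[[2]].
chip.names <- c("X1", "X2", "X3", "X4", "X5", "X6")

# Three mutant chips followed by three wild-type chips.
condition.labels <- rep(c("MUT", "WT"), c(3, 3))

# One label per chip is required.
if (length(condition.labels) != length(chip.names)) {
    stop("number of labels does not match number of chips")
}

# Count the chips in each condition.
label.counts <- table(condition.labels)

# Pair each chip with its label so the ordering can be confirmed by eye.
chip.key <- data.frame(chip = chip.names, condition = condition.labels)
```

Printing chip.key shows each chip name next to its assigned condition, which makes a mislabeled or misordered column easy to spot before the phenoData object is built.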
Next we create the phenoData object and use the factorlist as the pData slot
and
column names from the dataframe as the varLabels slot.
pd <- new("phenoData", pData=data.frame(factorlist), varLabels=dimnames(filename)[[2]])
Finally, to create the exprSet object:
myExprSetObj <- new("exprSet", exprs=as.matrix(filename), phenoData=pd)
This object can then be used in differential expression testing through the
normal S+ArrayAnalyzer graphical user interface. An important slot to remember
to set on your exprSet object is the "annotation" slot.
This slot contains character data that tells the GUI which annotation library
to use. For example, when using Affymetrix chips one could set the chip name
in order to have the annotation information available in the graphical results
of any differential expression testing.
myExprSetObj@annotation<-"mgu74av2"
Quick Access Script
The commands above are collected below with brief comments and can be copied
from this web page and pasted into a script window in S-PLUS for quick editing.
# import file
filename<-importData(file="fidler.txt",type="ASCII",
delimiter="\t",colNameRow=1,rowNamesCol=1,stringsAsFactors=F)
# create factor list
factorlist<-rep(c("MUT","WT"),c(21,21))
# create phenoData object
pd <- new("phenoData", pData=data.frame(factorlist), varLabels=dimnames(filename)[[2]])
# create exprSet
myExprSetObj <- new("exprSet", exprs=as.matrix(filename), phenoData=pd)
# assign annotation slot
myExprSetObj@annotation<-"mgu74av2"