程序代写代做代考 Excel Java R: Read the contents of a worksheet into an R data.frame

R: Read the contents of a worksheet into an R data.frame

read.xlsx {xlsx} R Documentation

Read the contents of a worksheet into an R data.frame

Description

Read the contents of a worksheet into an R data.frame.

Usage

read.xlsx(file, sheetIndex, sheetName=NULL, rowIndex=NULL,
startRow=NULL, endRow=NULL, colIndex=NULL,
as.data.frame=TRUE, header=TRUE, colClasses=NA,
keepFormulas=FALSE, encoding=”unknown”, …)

read.xlsx2(file, sheetIndex, sheetName=NULL, startRow=1,
colIndex=NULL, endRow=NULL, as.data.frame=TRUE, header=TRUE,
colClasses=”character”, …)

Arguments

file
the path to the file to read.

sheetIndex
a number representing the sheet index in the workbook.

sheetName
a character string with the sheet name.

rowIndex
a numeric vector indicating the rows you want to extract. If NULL, all rows found will be
extracted, unless startRow or endRow are specified.

colIndex
a numeric vector indicating the cols you want to extract. If NULL, all columns found will be
extracted.

as.data.frame
a logical value indicating if the result should be coerced into a data.frame. If FALSE, the
result is a list with one element for each column.

header
a logical value indicating whether the first row corresponding to the first element of the
rowIndex vector contains the names of the variables.

colClasses
For read.xlsx a character vector that represent the class of each column. Recycled as
necessary, or if the character vector is named, unspecified values are taken to be NA. For
read.xlsx2 see readColumns.

R: Read the contents of a worksheet into an R data.frame http://127.0.0.1:17456/library/xlsx/html/read.xlsx.html

1 of 3 10/8/2016 2:52 PM

keepFormulas
a logical value indicating if Excel formulas should be shown as text in R and not evaluated
before bringing them in.

encoding
encoding to be assumed for input strings. See read.table.

startRow
a number specifying the index of starting row. For read.xlsx this argument is active only if
rowIndex is NULL.

endRow
a number specifying the index of the last row to pull. If NULL, read all the rows in the sheet.
For read.xlsx this argument is active only if rowIndex is NULL.


other arguments to data.frame, for example stringsAsFactors

Details

The read.xlsx function provides a high level API for reading data from an Excel worksheet. It calls several
low level functions in the process. Its goal is to provide the conveniency of read.table by borrowing from its
signature.

The function pulls the value of each non empty cell in the worksheet into a vector of type list by preserving
the data type. If as.data.frame=TRUE, this vector of lists is then formatted into a rectangular shape. Special
care is needed for worksheets with ragged data.

An attempt is made to guess the class type of the variable corresponding to each column in the worksheet
from the type of the first non empty cell in that column. If you need to impose a specific class type on a
variable, use the colClasses argument. It is recommended to specify the column classes and not rely on R to
guess them, unless in very simple cases.

Excel internally stores dates and datetimes as numeric values, and does not keep track of time zones and DST.
When a datetime column is brought into R, it is converted to POSIXct class with a GMT timezone. Occasional
rounding errors may appear and the R and Excel string representation my differ by one second. For
read.xlsx2 bring in a datetime column as a numeric one and then convert to class POSIXct or Date. Also
rounding the POSIXct column in R usually does the trick too.

The read.xlsx2 function does more work in Java so it achieves better performance (an order of magnitude
faster on sheets with 100,000 cells or more). The result of read.xlsx2 will in general be different from
read.xlsx, because internally read.xlsx2 uses readColumns which is tailored for tabular data.

Value

A data.frame or a list, depending on the as.data.frame argument. If some of the columns are read as NA’s
it’s an indication that the colClasses argument has not been set properly.

If the sheet is empty, return NULL. If the sheet does not exist, return an error.

R: Read the contents of a worksheet into an R data.frame http://127.0.0.1:17456/library/xlsx/html/read.xlsx.html

2 of 3 10/8/2016 2:52 PM

Author(s)

Adrian Dragulescu

See Also

write.xlsx for writing xlsx documents. See also readColumns for reading only a set of columns into R.

Examples

## Not run:

file <- system.file("tests", "test_import.xlsx", package = "xlsx") res <- read.xlsx(file, 1) # read first sheet head(res) # NA. Population Income Illiteracy Life.Exp Murder HS.Grad Frost Area # 1 Alabama 3615 3624 2.1 69.05 15.1 41.3 20 50708 # 2 Alaska 365 6315 1.5 69.31 11.3 66.7 152 566432 # 3 Arizona 2212 4530 1.8 70.55 7.8 58.1 15 113417 # 4 Arkansas 2110 3378 1.9 70.66 10.1 39.9 65 51945 # 5 California 21198 5114 1.1 71.71 10.3 62.6 20 156361 # 6 Colorado 2541 4884 0.7 72.06 6.8 63.9 166 103766 # >

# To convert an Excel datetime colum to POSIXct, do something like:
# as.POSIXct((x-25569)*86400, tz=”GMT”, origin=”1970-01-01″)
# For Dates, use a conversion like:
# as.Date(x-25569, origin=”1970-01-01″)

res2 <- read.xlsx2(file, 1) ## End(Not run) [Package xlsx version 0.5.7 Index] R: Read the contents of a worksheet into an R data.frame http://127.0.0.1:17456/library/xlsx/html/read.xlsx.html 3 of 3 10/8/2016 2:52 PM