I have provided you with a LAB-01 RMD template to get you started:
You will use the following functions for this lab:
names() # variable names
head() # preview dataset
length() # vector length (number of elements)
dim(), nrow(), ncol() # dataset dimensions
sum(), summary() # summarize numeric vectors
table() # summarize factors / character vectors
This lab uses city tax parcel data from Syracuse, NY. [ Data Dictionary ]
You can load the dataset by including the following code chunk in your file:
URL <- "https://raw.githubusercontent.com/DS4PS/Data-Science-Class/master/DATA/syr_parcels.csv"
dat <- read.csv( URL, stringsAsFactors=FALSE )
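Once a dataset is loaded, the inspection functions listed above can be tried out. The sketch below uses a small made-up data frame called `toy` (a hypothetical stand-in, not the Syracuse parcels data) so that it runs without a download; in the lab you would apply the same functions to `dat`.

```r
# Hypothetical toy data frame; the values are invented for illustration
toy <- data.frame( acres    = c( 0.25, 0.5, 1.0, 2.0 ),
                   land_use = c( "Residential", "Vacant", "Commercial", "Residential" ),
                   stringsAsFactors=FALSE )

names( toy )          # variable names
head( toy, 2 )        # preview the first two rows
length( toy$acres )   # number of elements in the acres vector
summary( toy$acres )  # min, quartiles, mean, and max of a numeric vector
```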
Note that referencing variables in R requires both the dataset name and variable name, separated by the $ operator.
Unlike other stats programs, you can have several datasets loaded at the same time in R. They will often have variables with the same name (if you create a subset, for example, and save it as a new object, you will have two datasets with identical variable names). To avoid conflicts R forces you to use the dataset$variable convention.
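A minimal sketch of the dataset$variable convention, assuming two hypothetical objects (`parcels` and `vacant.parcels`, neither of which is part of the lab data) that share the same variable names:

```r
# Hypothetical toy dataset; names and values are invented for illustration
parcels <- data.frame( acres  = c( 0.5, 1.2, 0.3 ),
                       vacant = c( TRUE, FALSE, TRUE ) )

# A subset saved as a new object keeps the same variable names
vacant.parcels <- parcels[ parcels$vacant, ]

sum( parcels$acres )          # total acres across all parcels
sum( vacant.parcels$acres )   # total acres across vacant parcels only
```

Both objects contain a variable called acres, so the dataset name in front of the $ is what tells R which one you mean.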
Answer the following questions using the Syracuse parcels dataset and the functions listed.
Your solution should include a written response to the question, as well as the code used to generate the result.
dataset dimensions: dim() or nrow()
sum() over the numeric acres vector
sum() over the vacantbuil logical vector
sum() plus length() functions with the logical tax.exempt vector
table() with the neighborhood variable
table() with the neighborhood and land_use variables
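The dimension and table() hints above can be sketched on a small made-up data frame. The object `toy` and its values are hypothetical; only the variable names are borrowed from the lab dataset, and the results say nothing about the actual Syracuse data.

```r
# Hypothetical toy data; neighborhood and land use values are invented
toy <- data.frame(
  neighborhood = c( "Eastside", "Eastside", "Westside", "Westside", "Westside" ),
  land_use     = c( "Residential", "Commercial", "Residential", "Residential", "Commercial" ),
  stringsAsFactors=FALSE )

dim( toy )    # number of rows, then number of columns
nrow( toy )   # number of rows only

table( toy$neighborhood )                 # count of rows in each neighborhood
table( toy$neighborhood, toy$land_use )   # counts for each neighborhood by land use
```

With two variables, table() returns a cross-tabulation with the first variable as rows and the second as columns.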
HELPFUL HINTS:
When you apply a sum() function to a numeric vector it returns the sum of all elements in the vector.
When you apply a sum() function to a logical vector, it will count all of the TRUEs:
x <- c( TRUE, TRUE, FALSE, FALSE, FALSE )
sum( x ) # number of TRUEs
sum( x ) / length( x ) # proportion of TRUEs
R wants to make sure you are aware of missing values, so it will return NA (not available) for functions performed on vectors with missing values.
Add the ‘NA remove’ argument (na.rm=TRUE) to functions to ignore missing values.
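A short illustration of the na.rm argument, using made-up vectors `x` and `y` (hypothetical values, not taken from the lab data):

```r
x <- c( 10, 20, NA, 40 )     # numeric vector with one missing value
sum( x )                     # returns NA because of the missing value
sum( x, na.rm=TRUE )         # ignores the NA and returns 70

y <- c( TRUE, NA, FALSE, TRUE )   # logical vector with one missing value
sum( y, na.rm=TRUE )              # counts the TRUEs, ignoring the NA: 2
length( y )                       # still counts the NA as an element: 4
```

Note that length() includes missing values, so a proportion computed as sum( y, na.rm=TRUE ) / length( y ) treats each NA as if it were not TRUE.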
When you have completed your assignment, knit your RMD file to generate your rendered HTML file. Platforms like Blackboard and Canvas often do not allow you to submit HTML files that contain embedded computer code, so create a zipped folder with both the RMD and HTML files.
Log in to Canvas at http://canvas.asu.edu and navigate to the assignments tab in the course repository. Upload your zipped folder to the appropriate lab submission link.
Remember to:
See Google’s R Style Guide for examples.
If you are having problems with your RMD file, visit the RMD File Styles and Knitting Tips manual.
Note that when you knit a file, it starts from a blank slate. You might have packages loaded or datasets active on your local machine, so you can run code chunks fine. But when you knit you might get errors that functions cannot be located or datasets don’t exist. Be sure that you have included chunks to load these in your RMD file.
Your RMD file will not knit if you have errors in your code. If you get stuck on a question, just add eval=F to the code chunk and it will be ignored when you knit your file. That way I can give you credit for attempting the question and provide guidance on fixing the problem.