14  Saving Data to File



14.1 Key Concepts

In this chapter, we’ll explore the following key concepts:

  • CSV, TSV, & Delimited Files
  • R Data Sets (RDS)
  • RData Format

14.2 Key Takeaways

Too long; didn’t read? Here’s everything you need to know:

  • Importing data is reading data; exporting data is writing data
  • Keep your working directory in mind; files are written there by default
  • Base R function write.table() writes most delimited text files
  • Base R function write.csv() writes CSVs; use write.table() for TSVs and more
  • Package “readr” has write_*() functions for each file type
  • Save a single object with saveRDS() and extension .rds
  • Save several objects or a whole workspace with save() or save.image() and extension .RData
  • Write text to your clipboard with writeClipboard() (Windows only)







ATTENTION

Readers are expected to have read the chapter “Getting Data into R”.




14.3 Functions for Writing Data

We read data into R, or import, with read*() functions.

Similarly, we write data from R, or export, with write*() functions.

  • Most base R reading functions have equivalent writing functions
  • Most package “readr” functions also have similar writing functions
  • Packages for Excel, JSON, and other file formats have writing functions


What is writing data?

Simply put, it’s the act of storing data in a location and format of your choosing.

Typically, your data are stored in an object like a matrix or data frame.

It’s simply a matter of exporting a data-laden object.



14.3.1 A Brief Note on Working Directories

Unless otherwise specified, data are written to your working directory by default.

  • You can often specify different paths to save your data with argument file =
  • Print your working directory with function getwd()
  • Change your working directory with function setwd()
  • Create new directories with function dir.create()
  • See contents of directories with function dir()
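
Try It: The functions above can be sketched together as follows. This is a minimal example that works in a temporary folder; the folder name "exports_demo" and the file name "numbers.csv" are made up for illustration.

```r
# Work inside a temporary directory so nothing in your project is touched
base_dir <- tempdir()
out_dir  <- file.path(base_dir, "exports_demo")   # Hypothetical folder name

dir.create(out_dir, showWarnings = FALSE)         # Create the folder if missing

out_file <- file.path(out_dir, "numbers.csv")     # Build a full path
write.csv(x = data.frame(n = 1:3),
          file = out_file,                        # Write outside the working directory
          row.names = FALSE)

getwd()                                           # Working directory is unchanged
dir(out_dir)                                      # Should include "numbers.csv"
```

Building paths with file.path() avoids calling setwd() at all, which keeps scripts easier to share.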



14.3.2 Base R’s Workhorse Writing Function: write.table()

R’s workhorse reading function, on which other functions depend, is read.table().

R’s workhorse writing function is write.table().


Function wrappers like write.csv() are powered by write.table() under the hood.

write.csv
function (...) 
{
    Call <- match.call(expand.dots = TRUE)
    for (argname in c("append", "col.names", "sep", "dec", "qmethod")) if (!is.null(Call[[argname]])) 
        warning(gettextf("attempt to set '%s' ignored", argname), 
            domain = NA)
    rn <- eval.parent(Call$row.names)
    Call$append <- NULL
    Call$col.names <- if (is.logical(rn) && !rn) 
        TRUE
    else NA
    Call$sep <- ","
    Call$dec <- "."
    Call$qmethod <- "double"
    Call[[1L]] <- quote(utils::write.table)
    eval.parent(Call)
}
<bytecode: 0x000001da0256a920>
<environment: namespace:utils>


If all else fails (and it probably won’t), you can depend on write.table().
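
Try It: Here is a minimal sketch of write.table() with a semicolon delimiter, followed by a check of what the write.csv() source above implies: the wrapper warns and ignores sep =. The data frame "demo" is made up for illustration.

```r
demo <- data.frame(id = 1:3, score = c(9.5, NA, 7.25))   # Made-up data
out_path <- tempfile(fileext = ".txt")                   # Temporary file

write.table(x = demo,
            file = out_path,
            sep = ";",               # Any delimiter you like
            row.names = FALSE,       # Skip row names
            quote = FALSE)           # Skip quotes around text

readLines(out_path)                  # Inspect the raw lines

# The wrapper locks its delimiter: write.csv() warns and ignores sep =
msg <- tryCatch(write.csv(demo, file = tempfile(), sep = ";"),
                warning = function(w) conditionMessage(w))
msg                                  # The captured warning text
```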



14.3.3 Writing Text Files: write.csv() & write_csv()

Comma-separated values (CSV) files are among the most common types of output in R.

However, we can use the same write*() functions to create TSVs and more.


Practice Data: Let’s create a simple data frame to practice writing data.

name <- c("Fatimah", "Li", "Arnold", "Fede", "Sly")   # Character vector
weight <- c(61.4, 68.4, 81.8, 79.9, 90.3)             # Double vector
age <- c(29L, 31L, 44L, 33L, 27L)                     # Integer vector

clients <- data.frame(name, weight, age,              # Create data frame
                      stringsAsFactors = FALSE)       # Default since R 4.0.0

clients
     name weight age
1 Fatimah   61.4  29
2      Li   68.4  31
3  Arnold   81.8  44
4    Fede   79.9  33
5     Sly   90.3  27


Base R: Write CSV files using function write.csv().

write.csv(x = clients,                # Write object "clients"
          file = "clients.csv")       # Write file name and extension

Include Extensions: When writing a file, include the extension in the file name.

  • Saving an R script? Include .r
  • Saving an Excel sheet? Include .xlsx
  • Saving a CSV? Include .csv


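TSVs & Other Delimiters: Function write.csv() always uses commas; it ignores sep = with a warning.

For tabs and other delimiters, use write.table() instead.

write.table(x = clients,
            file = "clients.tsv",     # Use appropriate extension
            sep = "\t",               # Save as tab-delimited
            row.names = FALSE)        # Keep header aligned with columns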


Notable Arguments: Functions write.csv() and write.table() have some notable parameters, e.g.

  • x = specifies the name of the object to write
  • file = specifies the output file name; requires quotes and extension
  • sep = specifies the delimiter in write.table(), e.g. tabs, semicolons, etc.; ignored by write.csv()
  • na = specifies the character(s) to write in place of missing values
  • row.names = specifies whether to write row names; TRUE by default
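
Try It: A minimal sketch of arguments na = and row.names = in write.csv(). The data frame "clients_demo" is made up for illustration and includes one missing value.

```r
clients_demo <- data.frame(name = c("Fatimah", "Li"),
                           weight = c(61.4, NA))     # One missing weight
csv_path <- tempfile(fileext = ".csv")               # Temporary file

write.csv(x = clients_demo,
          file = csv_path,
          na = "",                   # Write blanks instead of "NA"
          row.names = FALSE)         # Skip the row name column

csv_lines <- readLines(csv_path)     # Inspect the raw lines
csv_lines
```

Note that write.csv() wraps text values and column names in double quotes by default.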


Package “readr”: Function write_csv() is similar to write.csv() except:

  • Typically faster at writing data
  • Does not write row names
  • Cannot write files with non-comma delimiters
  • In readr 1.4.0 and later, the output argument is file =; the older path = is deprecated


Writing with “readr”: Observe write_csv() in action:

library(readr)

write_csv(x = clients,
          file = "clients.csv")     # Use path = in readr versions before 1.4.0


TSV Files with “readr”: Bummer! Can’t write TSV files with write_csv().

Hark! Package “readr” has function write_tsv() for precisely that!

library(readr)

write_tsv(x = clients,
          file = "clients.tsv")     # Right tool for the job


In fact, package “readr” has writing functions optimized for many file types.

We encourage you to check out each one, e.g. help(write_delim):

  • write_csv()
  • write_csv2()
  • write_delim()
  • write_excel_csv()
  • write_file()
  • write_rds()
  • write_tsv()
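
Try It: A minimal sketch of write_delim() with a pipe delimiter, assuming package “readr” is installed. The data frame "clients_demo" is made up for illustration, and the file argument is passed by position so the call does not depend on its name.

```r
library(readr)   # Assumes package "readr" is installed

clients_demo <- data.frame(name = c("Fatimah", "Li"),
                           weight = c(61.4, 68.4))
pipe_path <- tempfile(fileext = ".txt")   # Temporary file

write_delim(clients_demo,                 # Object to write
            pipe_path,                    # Output file, passed by position
            delim = "|")                  # Any single-character delimiter

pipe_lines <- readLines(pipe_path)        # Inspect the raw lines
pipe_lines
```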



Mind blown.



14.4 Saving Your Work: R Datasets (RDS)

Manually restoring your workspace to your former session’s glory is a pain.

Hence, base R has functions saveRDS() and readRDS() for single objects, and save() and load() for one or more named objects.


Saving an Object: Save your original object, clients, as a .rds file with saveRDS():

saveRDS(clients, 
        file = "clients_object.rds")


Saving More than One Object: List each object first and save as .RData:

save(age, name, weight, clients,
     file = "clients_all.RData")


Saving your Workspace: Save all objects in your workspace (not your command history) with save.image():

save.image(file = "my_workspace.RData")


Load Objects & Workspaces: Read .rds files with readRDS(); input .RData file names into load():

clients <- readRDS("clients_object.rds")   # Read a single object; assign it a name
load("clients_all.RData")                  # Load multiple objects
load("my_workspace.RData")                 # Load entire workspace


Why save objects and workspaces?

  • Perfectly reproduce your environment for collaborators
  • Load typical header information for scripts, like authors and versions
  • Save and load objects that take a lot of time or computing power to create
  • We haven’t learned many object types, but some are useful when reproduced often
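
Try It: A minimal sketch contrasting the two base R approaches. Function readRDS() returns one object that you name yourself, while load() restores objects under their original names. The data frame "clients_demo" and the environment "sandbox" are made up for illustration.

```r
clients_demo <- data.frame(name = c("Fatimah", "Li"),
                           age = c(29L, 31L))

# One object: saveRDS() and readRDS()
rds_path <- tempfile(fileext = ".rds")
saveRDS(clients_demo, file = rds_path)
restored <- readRDS(rds_path)            # You choose the new name
identical(clients_demo, restored)        # Round trip preserves the object

# Named objects: save() and load()
rdata_path <- tempfile(fileext = ".RData")
save(clients_demo, file = rdata_path)

sandbox <- new.env()                     # Load here to avoid overwriting anything
loaded_names <- load(rdata_path, envir = sandbox)
loaded_names                             # Names of the restored objects
```

Loading into a separate environment is optional; by default, load() restores objects into the calling environment and silently overwrites objects of the same name.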



14.5 Writing Data to Statistical Software Files

The “foreign” and “haven” packages help read and write files used in SAS, SPSS, etc.


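Package “foreign” uses write.foreign() as a catch-all writing function.

Argument package = accepts “SPSS”, “Stata”, or “SAS”.

It writes a plain-text data file plus a code file that the target software runs to read it, so both datafile = and codefile = are required.

It also has the dedicated function write.dta() for Stata files.

library(foreign)

write.foreign(df = clients, 
              datafile = "clients.txt",   # Plain-text data
              codefile = "clients.sas",   # SAS script that reads the data
              package = "SAS")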


Package “haven” has four format-specific functions rather than a single workhorse:

library(haven)

write_dta(clients, "clients.dta")        # Stata
write_sas(clients, "clients.sas7bdat")   # SAS; deprecated in recent haven versions
write_sav(clients, "clients.sav")        # SPSS
write_xpt(clients, "clients.xpt")        # SAS transport file
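
Try It: A minimal round-trip sketch using functions write.dta() and read.dta() from package “foreign”, which ships with most R installations. The data frame "clients_demo" is made up for illustration.

```r
library(foreign)   # Recommended package; usually installed with R

clients_demo <- data.frame(name = c("Fatimah", "Li"),
                           weight = c(61.4, 68.4),
                           stringsAsFactors = FALSE)
dta_path <- tempfile(fileext = ".dta")    # Temporary Stata file

write.dta(clients_demo, file = dta_path)  # Write Stata binary format
back <- read.dta(dta_path)                # Read it back into R

names(back)                               # Column names survive the trip
nrow(back)                                # So do the rows
```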



14.6 Copying Data to Your Clipboard

On Windows, read data from the clipboard with readClipboard(); write with writeClipboard().


The concept, briefly:

  1. When copying text, it goes to your clipboard - the same as writeClipboard()
  2. When pasting text, it comes from your clipboard - the same as readClipboard()


You can write character data to your clipboard from R.


Character Data Only: Here, we’ll copy two objects to the clipboard.

  • Note that only character data will copy to your clipboard
  • Non-character data must be coerced with function as.character()

txt <- "This is a sentence composed of text."
num <- 2.718

writeClipboard(txt)                 # Accepts character data
writeClipboard(num)                 # Error: will not accept numeric data
writeClipboard(as.character(num))   # Accepts when coerced to character
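
Try It: Function writeClipboard() exists only on Windows, so a script that uses it can fail elsewhere. Here is a minimal sketch of a guard; copy_text() is a hypothetical helper, not part of base R.

```r
copy_text <- function(x) {                 # Hypothetical helper
  txt <- as.character(x)                   # Coerce so numbers are accepted
  if (.Platform$OS.type == "windows") {
    utils::writeClipboard(txt)             # Only evaluated on Windows
  } else {
    message("writeClipboard() is Windows-only; nothing was copied.")
  }
  invisible(txt)                           # Return the text that was prepared
}

copied <- copy_text(2.718)                 # Numeric input, coerced for you
copied
```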



14.7 Copying Data to Excel

There’s a “quick and dirty” method to copying and pasting data from R into Excel.

  1. Open the tabular data object with function View()
  2. Highlight each cell, starting at lower-right, ending at upper-left
  3. Right click or Ctrl + C to copy the RStudio Viewer data
  4. Right click or Ctrl + V to paste into Excel


Copy and paste right from your RStudio viewer.
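
Try It: Excel pastes tab-delimited text into separate cells, so a scripted alternative is to build that text with write.table(). This is a minimal sketch; the data frame "clients_demo" is made up for illustration, and the clipboard step only runs on Windows.

```r
clients_demo <- data.frame(name = c("Fatimah", "Li"),
                           weight = c(61.4, 68.4))

# With no file given, write.table() prints to the console; capture those lines
tsv_lines <- capture.output(
  write.table(clients_demo,
              sep = "\t",            # Tabs separate Excel columns
              row.names = FALSE,     # Skip row names
              quote = FALSE)         # Skip quotes around text
)
tsv_lines

if (.Platform$OS.type == "windows") {
  utils::writeClipboard(tsv_lines)   # Then paste into Excel with Ctrl + V
}
```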



14.8 Further Resources



14.9 Works Cited

  • SpongeBob SquarePants
  • Tim & Eric’s Awesome Show
  • Gravity Falls