Module 01: The R Workflow

Environment, Dependencies, and Data Foundations

Agenda

Block  Topic                         Format
1      Environment check             Interactive
2      How R works                   Discussion + exploration
3      Packages and where they live  Hands-on


1. Environment Check

Let’s verify everyone has a working setup.

Is R installed?

R.version.string

You should see something like R version 4.x.x.

What is this actually telling us? R is a program—a binary executable—installed somewhere on your computer. When you type R in a terminal or open RStudio, you’re launching that program.
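
To compare versions in code rather than by eye, base R's `getRversion()` returns a version object that supports comparison operators. The sketch below is only an illustration; the 4.1.0 threshold is an assumption, chosen because the native pipe `|>` used later in this module was introduced in that release.

```r
# Current R version as a comparable object (class "R_system_version")
current <- getRversion()

# Assumed minimum for this module: 4.1.0, the release that added the |> pipe
minimum <- "4.1.0"

if (current >= minimum) {
  message("R ", as.character(current), " is recent enough for this module.")
} else {
  message("R ", as.character(current), " is older than ", minimum, "; consider updating.")
}
```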

Where is R?

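# The R home directory (the root of the R installation)
R.home()

# The directory that contains the R executable itself
R.home("bin")

# On Unix-like systems, you can also check from terminal:
# which R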

Is Git installed?

system("git --version", intern = TRUE)

If this errors, Git isn’t installed or isn’t on your PATH.

Is Quarto installed?

system("quarto --version", intern = TRUE)
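
If you would rather get a yes/no answer than an error, base R's `Sys.which()` looks up executables on your PATH. A minimal sketch, assuming the executables are named `git` and `quarto` on your system:

```r
# Sys.which() returns the full path of each executable found on PATH,
# or an empty string "" for any that are missing. It does not raise an error.
tools_needed <- c("git", "quarto")
found <- Sys.which(tools_needed)

for (tool in tools_needed) {
  if (nzchar(found[[tool]])) {
    message(tool, " found at ", found[[tool]])
  } else {
    message(tool, " not found on PATH")
  }
}
```

An empty result means the same thing as the error above: the tool is either not installed or not on your PATH.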

2. How R Works

If you’re using RStudio, you’ll see something like this:

[Figure: The RStudio IDE]

R is an interpreter

When you type code at the console, R:

  1. Reads your input
  2. Evaluates it
  3. Prints the result
  4. Loops back waiting for more

This is called a REPL (Read-Eval-Print Loop).

# Try this in your console:
2 + 2

[Figure: The R console]

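There’s no separate compilation step for you to run. R parses and evaluates your code one complete expression at a time, which at the console usually means one line at a time.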
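
One turn of the read-eval-print loop can be reproduced by hand with base R's `parse()`, `eval()`, and `print()`. This is an illustration of the idea, not a description of how the console is implemented internally.

```r
# READ: turn source text into an unevaluated expression
expr <- parse(text = "2 + 2")[[1]]
class(expr)   # "call"

# EVAL: evaluate the expression in the global environment
value <- eval(expr, envir = globalenv())

# PRINT: show the result, as the console does automatically
print(value)  # [1] 4
```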

The R session

When R starts, it creates a session—a running instance of the R interpreter with:

  • A global environment (where your objects live)
  • A working directory (where R looks for files by default)
  • Loaded packages (base R plus whatever you load)

# Your current working directory
getwd()

# Your global environment (probably empty if you just started)
ls()

# What packages are currently loaded?
search()

Everything is an object

In R, everything you create is an object stored in an environment:

x <- 42
f <- function(a) a + 1

# Both are objects
class(x)
class(f)

# Both live in your global environment
ls()
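
To make "stored in an environment" concrete, here is a small sketch that builds a separate environment and puts the same two objects in it. The environment name `e` is just an illustrative choice; using a fresh environment keeps the example from touching your workspace.

```r
# A fresh, empty environment
e <- new.env()

# Bind the same two objects inside it
assign("x", 42, envir = e)
assign("f", function(a) a + 1, envir = e)

# List the names bound in e (ls() sorts them alphabetically)
ls(e)                                      # "f" "x"

# Look both objects up by name and call the function on the number
get("f", envir = e)(get("x", envir = e))   # 43

# Functions are objects too, so they can be passed to other functions
sapply(c(1, 2, 3), get("f", envir = e))    # 2 3 4

# At the console, plain assignment binds names in the global environment
environmentName(globalenv())               # "R_GlobalEnv"
```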

The Environment pane shows all objects in your session:

[Figures: the Environment pane in RStudio and in Positron]

3. Packages and Where They Live

What is a package?

A package is a bundle of:

  • Functions
  • Documentation
  • Possibly data
  • Metadata (who wrote it, what it depends on)
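
Each of these pieces can be inspected from the console. The sketch below uses `stats` and `datasets` only because they ship with R, so it should run without installing anything; any installed package name would work in their place.

```r
# Metadata: read the package's DESCRIPTION information
desc <- utils::packageDescription("stats")
desc$Package   # package name
desc$Version   # installed version
desc$Title     # one-line description

# Functions: names the package exports (first few)
head(getNamespaceExports("stats"))

# Data: datasets bundled with the "datasets" package (first few)
head(data(package = "datasets")$results[, "Item"])

# Documentation: open the package's help index (interactive sessions)
# help(package = "stats")
```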

Where do packages come from?

Source        Examples                   Install command
CRAN          dplyr, ggplot2             install.packages("dplyr")
Bioconductor  DESeq2, GenomicRanges      BiocManager::install("DESeq2")
GitHub        Most OHDSI/HADES packages  remotes::install_github("OHDSI/CohortMethod")

Where do packages live on your computer?

This is important to understand:

# R searches these directories for packages, in order
.libPaths()

You’ll likely see:

  1. A user library (packages you install)
  2. A system library (packages that came with R)
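
To see which entry is which on your machine, base R exposes the system library as `.Library` and the expected user library location through the `R_LIBS_USER` environment variable. A rough sketch; the "user or site" label is a simplification, since some installations also have a site-wide library.

```r
# The system library that ships with R
.Library

# Where R expects the user library (the directory may not exist yet)
Sys.getenv("R_LIBS_USER")

# Label each entry of .libPaths() by comparing it with the system library
paths <- .libPaths()
system_lib <- normalizePath(.Library, winslash = "/")
data.frame(
  path = paths,
  kind = ifelse(normalizePath(paths, winslash = "/") == system_lib,
                "system", "user or site")
)
```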

Let’s explore

# Pick the first library path
lib_path <- .libPaths()[1]

# What's in there?
list.dirs(lib_path, recursive = FALSE) |> head(20)

Each subdirectory is an installed package. A package is literally just a folder with a specific structure.
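
To look inside one of those folders, `system.file()` returns the installed location of a package. The sketch uses `stats` only because it is present in every R installation; the exact set of files and subdirectories varies by package.

```r
# Folder where the installed "stats" package lives
pkg_dir <- system.file(package = "stats")

# Its contents: typically DESCRIPTION, NAMESPACE, R, help, and more
list.files(pkg_dir)

# Every installed package has a DESCRIPTION file holding its metadata
file.exists(file.path(pkg_dir, "DESCRIPTION"))
readLines(file.path(pkg_dir, "DESCRIPTION"), n = 3)

# The same inventory without browsing folders by hand
head(rownames(installed.packages()))
```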

Installing a package

# This downloads the package and installs it into the first library in .libPaths() (usually your user library)
install.packages("RSQLite")

# Now it exists on disk
file.exists(file.path(.libPaths()[1], "RSQLite"))
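
Because installing only needs to happen once, scripts often guard the call so it is skipped when the package is already present. The helper below, `install_if_missing()`, is a hypothetical name used for illustration and is not part of any package.

```r
# Hypothetical helper: install a package only if it is not already available.
# Returns (invisibly) TRUE if an install was attempted, FALSE otherwise.
install_if_missing <- function(pkg) {
  # requireNamespace() checks that the package can be loaded (and loads its
  # namespace) without attaching it to the search path
  if (requireNamespace(pkg, quietly = TRUE)) {
    return(invisible(FALSE))
  }
  install.packages(pkg)
  invisible(TRUE)
}

# "stats" ships with R, so nothing is downloaded here
install_if_missing("stats")

# For the package used in this module:
# install_if_missing("RSQLite")
```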

Loading vs installing

Installing = downloading and saving to disk (do once)

Loading = making functions available in your session (do each session)

# Load a package
library(RSQLite)

# Now its functions are available
# Check what's loaded
search()
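
The difference shows up on the search path. The sketch below uses `tools`, a package that ships with R but is usually not attached at startup, so it should run even if RSQLite is not installed. It also shows `pkg::fun()`, which calls a function from an installed package without a `library()` call.

```r
# Is "tools" on the search path yet? (usually not in a fresh session)
"package:tools" %in% search()

# pkg::fun() uses an installed package without attaching it
tools::file_ext("report.qmd")   # "qmd"

# library() attaches the package, adding it to the search path
library(tools)
"package:tools" %in% search()   # TRUE
```

Strictly speaking, `search()` lists attached packages; `loadedNamespaces()` also includes packages that were loaded through `::` or as dependencies but never attached.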

Wrap-up

Today we covered the foundations:

  1. R is a program: a binary that interprets your code
  2. Sessions: R runs in a session with an environment and working directory
  3. Objects: everything in R is an object
  4. Packages are folders: installed to library paths, loaded into sessions

Next module: Git, dependencies, and working with data.