2. Basic Operations in R: A Technical Guide

R
Basics
Operations
Author

Lorenzo Longobardi

Published

November 6, 2024

1. The R Console: Understanding Your Work Environment 🔢

When you open R, you’ll see the console window where you type commands and see results. Each command you type is called an “expression” and R evaluates it immediately when you press Enter.

Basic Mathematical Operations

Let’s look at how R handles calculations and what each symbol means:
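A minimal sketch of a first calculation (the numbers 2 and 3 are illustrative assumptions, not values from the original example):

```r
# Add two numbers; R prints the result at the console
2 + 3  # everything after the # is a comment and is ignored
#> [1] 5
```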

When you run this command:
  1. R reads the numbers on either side of the +
  2. Performs the addition
  3. Returns the result
  4. The # symbol starts a comment; R ignores everything after it on that line

Let’s try multiple operations:
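The expression below is inferred from the step-by-step explanation that follows, so treat it as a sketch rather than the original example:

```r
# Multiplication is evaluated before addition
5 + 3 * 2
#> [1] 11
```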

This result shows that R follows the standard mathematical order of operations:
  1. Multiplication (*) happens before addition (+)
  2. So R first calculates 3 * 2 = 6
  3. Then adds 5 to get 5 + 6 = 11

To change this order, use parentheses:
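Again inferred from the explanation that follows, a sketch of the same numbers with parentheses added:

```r
# Parentheses force the addition to happen first
(5 + 3) * 2
#> [1] 16
```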

Here’s what happened:
  1. Parentheses tell R “do this first”
  2. So R calculates 5 + 3 = 8
  3. Then multiplies by 2 to get 8 * 2 = 16

Understanding Decimal Numbers

R uses decimals (floating-point numbers) by default:

Important points about decimals in R:
  • You can write numbers with or without .0
  • By default R prints up to 7 significant digits and drops trailing zeros
  • Use round() to control decimal places:
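A short sketch of these points, using illustrative values chosen for this example:

```r
# 5 and 5.0 are the same number, stored as a double
5 == 5.0
#> [1] TRUE
typeof(5)
#> [1] "double"

# Division returns a decimal; R prints 7 significant digits by default
10 / 3
#> [1] 3.333333

# round() keeps the requested number of decimal places
round(10 / 3, 2)
#> [1] 3.33
round(3.14159, digits = 3)
#> [1] 3.142
```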

Error Messages and Troubleshooting

You’ll sometimes see error messages. Let’s look at common ones:
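The snippets below are illustrative assumptions about typical mistakes, not the original examples. The faulty lines are commented out so the block still runs, and `undefined_value` is a hypothetical name that was never created:

```r
# Each commented line would stop with an error if typed as-is:
# 5 + * 3      # Error: unexpected '*'
# 1 000        # Error: unexpected numeric constant
# (5 + 3 * 2   # unclosed parenthesis: the prompt changes to + and waits

# Using a name that does not exist is another common error.
# tryCatch() is used here only so the message can be captured and inspected.
msg <- tryCatch(
  undefined_value * 2,
  error = function(e) conditionMessage(e)
)
msg
```

The captured message names the missing object, which is usually the quickest clue to a misspelled variable.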

When you see an error:
  1. Read the message carefully
  2. Check for missing operators (+, -, *, /)
  3. Check for matching parentheses
  4. Look for missing quotation marks
  5. Verify there are no spaces inside numbers (1000, not 1 000)

Console Tips
  • Use the up/down arrow keys to recall previous commands
  • Press Esc to cancel a command (Ctrl+C when R runs in a terminal)
  • Press Ctrl+L to clear the console (in RStudio and the Windows R GUI; the Mac R.app uses Cmd+Option+L)
  • Type ?function_name to get help (e.g., ?round)

2. Variables: Understanding Data Storage 📦

Variables are containers that store values in R. Understanding how R handles variables is crucial for working with data effectively.

The Assignment Operator

R uses <- or = for assignment. While both work, <- is preferred in R:
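A minimal sketch of both operators; the variable names and values are hypothetical:

```r
# Preferred style: the arrow operator
weight <- 25.5

# Also valid, but less common in R code
length_cm = 40

# The right side is evaluated first, then stored under the name on the left
total <- 10 + 5

weight
#> [1] 25.5
total
#> [1] 15
```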

What happens during assignment:
  1. R evaluates the right side first
  2. Creates the name on the left side if it doesn’t already exist
  3. Stores the value in memory
  4. The variable name now points to that value

You can see all variables in memory:
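One way to do this is with ls(), sketched here with two hypothetical variables:

```r
catch_kg <- 12.5
species_name <- "tuna"

# ls() returns the names of all variables in the current environment
ls()

# rm() removes a variable; it no longer appears in ls()
rm(catch_kg)
"catch_kg" %in% ls()
#> [1] FALSE
```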

Variable Naming Rules

R has specific rules for variable names:

Rules for variable names:
  1. Must start with a letter or a . (a leading . cannot be followed by a digit)
  2. Can contain letters, numbers, _ and .
  3. Cannot contain spaces or -
  4. Are case sensitive (weight ≠ Weight)
  5. Cannot be one of R’s reserved words (if, for, etc.)
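The rules above can be sketched with a few hypothetical names; the invalid ones are commented out because they would not parse or would not do what was intended:

```r
# Valid names
fish_weight <- 2.5
fishWeight2 <- 3.1
.hidden <- 1

# Names are case sensitive: these are two different variables
weight <- 10
Weight <- 20
weight == Weight
#> [1] FALSE

# Invalid names (commented out):
# 2fish <- 1        # starts with a number
# fish weight <- 1  # contains a space
# fish-weight <- 1  # read as a subtraction, fish minus weight
# if <- 1           # reserved word
```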

Understanding Variable Updates

Variables can be modified after creation:
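A sketch using the values mentioned in the explanation that follows (count starts at 10 and has 5 added):

```r
count <- 10

# Use the current value on the right side and store the result back
count <- count + 5
count
#> [1] 15
```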

What happens in the update:
  1. R reads the current value of count (10)
  2. Adds 5 to it
  3. Stores the new value (15) back in count
  4. The old value is no longer reachable and is freed later by R’s garbage collector

Multiple Assignments

You can work with multiple variables:
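A minimal sketch with hypothetical variable names and values:

```r
weight_kg <- 2.3
n_fish <- 12

# Combine variables in a calculation
total_weight <- weight_kg * n_fish
total_weight
#> [1] 27.6

# Updating one variable does not change values already computed from it
weight_kg <- 2.5
total_weight
#> [1] 27.6
```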

Each variable:
  1. Exists independently in memory
  2. Can be used in calculations
  3. Can be updated separately
  4. Remains until removed or overwritten

Common Variable Mistakes
  1. Misspelling variable names
  2. Using different capitalization
  3. Not updating variables when needed
  4. Using reserved words as names
  5. Forgetting to assign results to a variable

3. Understanding Data Types 📝

R has several basic data types. Understanding these is crucial for handling different kinds of data correctly.

Numeric Data (Numbers)

R has two types of numbers:
  1. Double (decimal numbers)
  2. Integer (whole numbers)
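A short sketch of the two types, with illustrative values:

```r
# Numbers are doubles unless you ask for an integer with the L suffix
x <- 5
y <- 5L
typeof(x)
#> [1] "double"
typeof(y)
#> [1] "integer"

# Converting a double to an integer drops the fractional part
as.integer(3.9)
#> [1] 3

# Mixing the two in arithmetic gives a double
typeof(y + 1.5)
#> [1] "double"
```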

Key points about numbers:
  • Doubles are the default
  • Integers use less memory (create them with an L suffix, e.g. 5L)
  • Converting a double to an integer drops the fractional part
  • Mathematical operations work with both

Character Data (Text)

Text data must be in quotes:
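A sketch with hypothetical text values, including the two paste functions discussed next:

```r
# Single or double quotes both create character strings
species <- "tuna"
location <- 'North Bay'

# paste() joins items with a space by default
paste("Species:", species)
#> [1] "Species: tuna"

# sep changes the separator
paste(species, location, sep = " - ")
#> [1] "tuna - North Bay"

# paste0() joins with no separator
paste0("catch_", 1)
#> [1] "catch_1"
```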

What’s happening:
  1. R stores text as character “strings”
  2. Each character takes memory
  3. paste() joins items with a space by default (change it with sep)
  4. paste0() concatenates directly, with no separator

Logical Data (TRUE/FALSE)

Logical values are essential for comparisons:
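A minimal sketch of logical values and comparisons, using illustrative numbers:

```r
# Comparisons return TRUE or FALSE
5 > 3
#> [1] TRUE
10 == 12
#> [1] FALSE

# In arithmetic, TRUE counts as 1 and FALSE as 0
TRUE + TRUE
#> [1] 2
sum(c(TRUE, FALSE, TRUE))
#> [1] 2
```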

Understanding logical operations:
  1. TRUE and FALSE are special values
  2. Comparisons return logical values
  3. Can be used in calculations (TRUE=1, FALSE=0)
  4. Essential for filtering data

Type Conversion

R can convert between types:
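A sketch of the common as.*() conversion functions with illustrative inputs. suppressWarnings() is used only to keep the failed conversion quiet; without it R also prints a coercion warning:

```r
as.numeric("42")
#> [1] 42
as.character(42)
#> [1] "42"
as.integer(TRUE)
#> [1] 1

# Text that is not a number cannot be converted and becomes NA
bad <- suppressWarnings(as.numeric("abc"))
bad
#> [1] NA
is.na(bad)
#> [1] TRUE
```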

Important conversion rules:
  1. Not all conversions are possible
  2. Failed conversions produce NA, with a warning
  3. Check results after converting
  4. Use the appropriate conversion function

Checking Data Types

Use these functions to check types:
  • class() - Basic type
  • typeof() - Internal type
  • is.numeric() - Type test
  • str() - Structure and type
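The functions listed above can be tried on a single hypothetical value:

```r
value <- 3.14

class(value)
#> [1] "numeric"
typeof(value)
#> [1] "double"
is.numeric(value)
#> [1] TRUE
is.character(value)
#> [1] FALSE

# str() prints a compact summary of type and content
str(value)
```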

4. Working with Vectors 📚

Vectors are one-dimensional arrays of data. They’re fundamental to R programming.

Creating Vectors

The c() function combines values into a vector:
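A minimal sketch with hypothetical fishing data:

```r
weights <- c(2.5, 3.1, 1.8, 4.2)
species <- c("tuna", "cod", "bass")

length(weights)
#> [1] 4

# Mixing types does not fail: R converts everything to one common type
mixed <- c(1, "two", TRUE)
class(mixed)
#> [1] "character"
mixed
#> [1] "1"    "two"  "TRUE"
```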

What happens:
  1. R allocates memory for all values
  2. All values must be the same type; mixed values are converted to a common type
  3. Vector length is the number of elements
  4. Elements are ordered and indexed

Vector Operations

Operations apply to all elements:
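A sketch of element-wise operations on a hypothetical vector of weights:

```r
weights <- c(2.5, 3.1, 1.8)

# A single number is applied to every element
weights * 2
#> [1] 5.0 6.2 3.6

# Two vectors of the same length are combined position by position
weights + c(1, 2, 3)
#> [1] 3.5 5.1 4.8

# Comparisons are element-wise too
weights > 2
#> [1]  TRUE  TRUE FALSE
```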

Vector rules:
  1. Operations apply element-wise
  2. Vectors should normally be the same length
  3. A shorter vector is recycled if needed
  4. R warns if the longer length isn’t a multiple of the shorter

Accessing Vector Elements

Elements are accessed by position:
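A sketch of the main indexing forms, using hypothetical fish lengths:

```r
lengths_cm <- c(45, 52, 38, 61, 49)

lengths_cm[1]          # first element
#> [1] 45
lengths_cm[c(1, 3)]    # first and third
#> [1] 45 38
lengths_cm[2:4]        # second through fourth
#> [1] 52 38 61
lengths_cm[-1]         # everything except the first
#> [1] 52 38 61 49
lengths_cm[lengths_cm > 50]  # elements that meet a condition
#> [1] 52 61
```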

Understanding indexing:
  1. Positions start at 1 (not 0)
  2. Use [ ] for indexing
  3. Can select multiple elements
  4. Negative indices remove elements

Vector Functions

R has many functions for vectors:
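A few of the most common ones, sketched on a hypothetical vector of catch counts:

```r
catches <- c(12, 7, 7, 15, 9)

length(catches)
#> [1] 5
sum(catches)
#> [1] 50
mean(catches)
#> [1] 10
max(catches)
#> [1] 15
sort(catches)
#> [1]  7  7  9 12 15
unique(catches)
#> [1] 12  7 15  9
```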

Common vector operations:
  1. Statistical functions
  2. Mathematical operations
  3. Sorting and ordering
  4. Finding unique values

5. Understanding Functions 🛠️

Functions are reusable blocks of code that perform specific tasks.

Using Built-in Functions

R has many built-in functions:
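A sketch of a few base R functions, with illustrative inputs:

```r
sqrt(16)
#> [1] 4
abs(-7.5)
#> [1] 7.5
round(3.14159, 2)
#> [1] 3.14
toupper("tuna")
#> [1] "TUNA"
```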

Function components:
  1. Function name
  2. Parentheses ()
  3. Arguments inside parentheses
  4. Returns a value

Function Arguments

Functions can have multiple arguments:
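A sketch using round() and seq() to show positional, named, and default arguments:

```r
# Positional: the order decides which argument is which
round(3.14159, 2)
#> [1] 3.14

# Named: the order no longer matters
round(digits = 2, x = 3.14159)
#> [1] 3.14

# Default: digits is 0 when you leave it out
round(3.14159)
#> [1] 3

# Names make longer calls easier to read
seq(from = 0, to = 10, by = 5)
#> [1]  0  5 10
```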

Understanding arguments:
  1. Can be positional or named
  2. Some have default values
  3. Order matters for positional arguments
  4. Names make code clearer

Getting Help with Functions

R’s help system is comprehensive:
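A minimal sketch of the usual entry points. The help calls are guarded with interactive() as an assumption of this example, so the block also runs cleanly as a script:

```r
# Open the help page for a function (two equivalent forms)
if (interactive()) {
  ?round
  help("round")
}

# Show a function's arguments without opening the help page
args(round)

# Run the examples from a help page (commented out to keep output short)
# example(round)
```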

Help pages show:
  1. Function purpose
  2. Required arguments
  3. Optional arguments
  4. Return value
  5. Examples

6. Practice Exercises 💪

Exercise 1: Calculate Average Catch

Using this week’s daily catch data (in kg), calculate the average catch excluding missing values.
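The exercise data is not reproduced here, so the sketch below uses hypothetical values for `daily_catch`; it shows one possible approach, not the original solution:

```r
# Hypothetical data: seven daily catches in kg, NA marks a missing record
daily_catch <- c(12.5, 15.0, NA, 9.8, 14.2, NA, 11.5)

# na.rm = TRUE drops the missing values before averaging
average_catch <- mean(daily_catch, na.rm = TRUE)
average_catch
#> [1] 12.6
```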

Exercise 2: Analyzing Species Composition

Here’s the data for species caught during a sampling day. How many tuna were caught?
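The original data is not shown here; a sketch of one approach with a hypothetical `species` vector:

```r
# Hypothetical data: species recorded during one sampling day
species <- c("tuna", "cod", "tuna", "bass", "tuna", "cod")

# Comparing gives TRUE/FALSE; sum() counts the TRUE values
tuna_count <- sum(species == "tuna")
tuna_count
#> [1] 3
```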

Exercise 3: Finding Large Fish

Given these fish lengths, identify which ones are above average:
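The lengths from the original exercise are not shown; a sketch with hypothetical values in `fish_lengths`:

```r
# Hypothetical data: fish lengths in cm
fish_lengths <- c(40, 55, 35, 60, 50)

average_length <- mean(fish_lengths)
average_length
#> [1] 48

# Values above the average, and their positions
fish_lengths[fish_lengths > average_length]
#> [1] 55 60 50
which(fish_lengths > average_length)
#> [1] 2 4 5
```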

Exercise 4: Working with Multiple Variables

Analyze this catch data to find:
  1. Total catch weight
  2. Average weight per fish
  3. Maximum weight
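The catch data itself is not shown here; one possible approach, sketched with hypothetical weights in `fish_weights`:

```r
# Hypothetical data: weight of each fish in kg
fish_weights <- c(2, 3.5, 1.5, 4, 3)

total_weight <- sum(fish_weights)
average_weight <- mean(fish_weights)
max_weight <- max(fish_weights)

total_weight
#> [1] 14
average_weight
#> [1] 2.8
max_weight
#> [1] 4
```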

Exercise 5: Data Filtering

Find all catches over 2kg and count how many there are:
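A sketch with hypothetical values in `catch_weights`, since the original data is not shown:

```r
# Hypothetical data: catch weights in kg
catch_weights <- c(1.5, 2.5, 3.0, 1.8, 2.0, 4.1)

# Keep only the catches strictly over 2 kg
large_catches <- catch_weights[catch_weights > 2]
large_catches
#> [1] 2.5 3.0 4.1

# Count them
length(large_catches)
#> [1] 3
```

Note that a catch of exactly 2 kg is not included, because > is a strict comparison.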

Exercise 6: Multiple Operations

For this set of lengths, calculate:
  1. The range (max - min)
  2. The variance
  3. How many fish are between 40 and 50 cm
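One possible approach, sketched with hypothetical lengths. Treating "between 40 and 50" as inclusive of both ends is an assumption of this example:

```r
# Hypothetical data: fish lengths in cm
fish_lengths <- c(38, 42, 45, 51, 48, 55)

# 1. Range as max minus min
length_range <- max(fish_lengths) - min(fish_lengths)
length_range
#> [1] 17

# 2. Sample variance
length_var <- var(fish_lengths)
length_var
#> [1] 37.9

# 3. Count fish with 40 <= length <= 50
n_between <- sum(fish_lengths >= 40 & fish_lengths <= 50)
n_between
#> [1] 3
```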

7. Advanced Vector Operations 🚀

Let’s explore more complex operations with vectors that you’ll commonly use in data analysis.

Vector Arithmetic with Different Lengths

Understanding how R handles vectors of different lengths is crucial:
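The vectors below are inferred from the explanation that follows (10 and 20 added to 1 through 4); the variable names are hypothetical:

```r
long_vec <- c(1, 2, 3, 4)
short_vec <- c(10, 20)

# short_vec is reused as 10, 20, 10, 20
long_vec + short_vec
#> [1] 11 22 13 24

# When the lengths don't divide evenly, R still recycles but warns.
# suppressWarnings() is used here only to keep the output clean.
suppressWarnings(c(1, 2, 3) + c(10, 20))
#> [1] 11 22 13
```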

What’s happening here:
  1. R sees vectors of different lengths
  2. The shorter vector is recycled to match the longer one
  3. 10 is added to 1 and 3, and 20 is added to 2 and 4
  4. R warns if the longer length isn’t a multiple of the shorter

Vector Recycling Rules
  • Shorter vector is repeated
  • The longer length must be a multiple of the shorter to avoid a warning
  • Can lead to unexpected results
  • Best to use equal length vectors

Logical Operations with Vectors

Complex filtering often requires multiple conditions:
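A sketch of combined conditions on two hypothetical, parallel vectors:

```r
weights <- c(1.2, 2.8, 3.5, 0.9, 2.1)
species <- c("tuna", "cod", "tuna", "bass", "cod")

# AND: tuna heavier than 2 kg
weights[weights > 2 & species == "tuna"]
#> [1] 3.5

# OR: very small or very large catches
weights[weights < 1 | weights > 3]
#> [1] 3.5 0.9

# NOT: everything that is not cod
species[!(species == "cod")]
#> [1] "tuna" "tuna" "bass"

# Parentheses group conditions
weights[(species == "tuna" | species == "bass") & weights > 1]
#> [1] 1.2 3.5
```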

Understanding logical operators:
  • & AND: both conditions must be TRUE
  • | OR: at least one condition must be TRUE
  • ! NOT: reverses TRUE/FALSE
  • Use parentheses to group conditions

Missing Values (NA)

Real data often has missing values:
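A minimal sketch with a hypothetical `catch` vector containing two missing values:

```r
catch <- c(12, NA, 9, 15, NA)

# Any calculation that touches NA returns NA
mean(catch)
#> [1] NA

# na.rm = TRUE removes the missing values first
mean(catch, na.rm = TRUE)
#> [1] 12

# Find and count missing values
is.na(catch)
#> [1] FALSE  TRUE FALSE FALSE  TRUE
sum(is.na(catch))
#> [1] 2

# Keep only the non-missing values
catch[!is.na(catch)]
#> [1] 12  9 15
```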

Key points about NA:
  1. NA propagates through calculations
  2. Use na.rm=TRUE to ignore NAs
  3. is.na() identifies missing values
  4. NA is different from zero or an empty string

Vector Sorting and Ordering

R provides multiple ways to sort data:
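A sketch of sort() and order() on two hypothetical, parallel vectors:

```r
weights <- c(3.2, 1.5, 4.8, 2.1)
species <- c("cod", "bass", "tuna", "cod")

# sort() returns the values themselves, in order
sort(weights)
#> [1] 1.5 2.1 3.2 4.8
sort(weights, decreasing = TRUE)
#> [1] 4.8 3.2 2.1 1.5

# order() returns the positions that would sort the vector
order(weights)
#> [1] 2 4 1 3

# Use those positions to reorder a second vector the same way
species[order(weights)]
#> [1] "bass" "cod"  "cod"  "tuna"
```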

Understanding sort vs order:
  1. sort() returns sorted values
  2. order() returns positions
  3. order() is useful for sorting multiple vectors together
  4. Both can sort ascending or descending

8. Understanding Matrices 📊

Matrices are 2-dimensional arrays. They’re useful for tabular data:
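A minimal sketch of creating and indexing a matrix; the name `catch_matrix` and its row and column labels are hypothetical:

```r
# matrix() fills column by column unless byrow = TRUE is given
catch_matrix <- matrix(1:6, nrow = 2, ncol = 3)
rownames(catch_matrix) <- c("day1", "day2")
colnames(catch_matrix) <- c("tuna", "cod", "bass")
catch_matrix

dim(catch_matrix)
#> [1] 2 3

# Index with [row, column]
catch_matrix[1, 2]
#> [1] 3

# Names work as indices too
catch_matrix["day2", "cod"]
#> [1] 4
```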

Matrix properties:
  1. All elements are the same type
  2. Rectangular structure
  3. Can have row/column names
  4. Accessed by row,column index

Matrix Operations
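The original examples for this subsection are not shown; the sketch below covers a few common base R matrix operations on a small hypothetical matrix `m`:

```r
m <- matrix(1:4, nrow = 2)  # columns are (1, 2) and (3, 4)

# Arithmetic is element-wise, as with vectors
m * 2

# t() swaps rows and columns
t(m)

# %*% is true matrix multiplication
m %*% m

# Row and column totals
rowSums(m)
#> [1] 4 6
colSums(m)
#> [1] 3 7
```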

9. Real-World Data Exercise 🌊

Let’s analyze a complete fishing dataset by combining several operations. You’ll need to:
  1. Calculate the average catch (excluding missing values)
  2. Identify which catches were above average
  3. Count how many tuna were caught

Hints
  • For step 1: Use mean() with na.rm = TRUE
  • For step 2: Use > to compare values with the average
  • For step 3: Use sum() with species == "tuna"
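The dataset itself is not reproduced here. Following the hints above, one possible approach is sketched with hypothetical `catches` and `species` vectors:

```r
# Hypothetical data: daily catch in kg and the species recorded each day
catches <- c(12.5, NA, 18.0, 9.5, 15.0, NA, 20.0)
species <- c("tuna", "cod", "tuna", "bass", "cod", "tuna", "tuna")

# Step 1: average catch, ignoring missing values
average_catch <- mean(catches, na.rm = TRUE)
average_catch
#> [1] 15

# Step 2: which catches were above average (NA stays NA in the comparison)
above_average <- catches > average_catch
which(above_average)
#> [1] 3 7

# Step 3: number of tuna
tuna_count <- sum(species == "tuna")
tuna_count
#> [1] 4
```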

Additional Challenge

Once you’ve completed the main tasks, try these additional analyses:
  1. What percentage of catches were above average?
  2. What is the average catch weight specifically for tuna?
  3. On which days (positions) were tuna caught?

10. Additional Resources and Next Steps 📚

After mastering these basics, you can:

  1. Learn Data Frames: The next step for handling real datasets
  2. Explore Visualization: Create plots with ggplot2
  3. Study Statistics: R’s statistical functions
  4. Write Functions: Create your own reusable code
Practice Tips
  1. Type code yourself, don’t copy-paste
  2. Experiment with different values
  3. Try to predict results before running
  4. Use help documentation regularly
  5. Keep notes on new functions

Remember: R has a helpful community. Use resources like:
  • R documentation (?function_name)
  • RStudio Community forums
  • Stack Overflow with the [r] tag
  • R-bloggers for tutorials
