7 👾 Basics of R syntax

Code Works Code Not Working GIFfrom Code Works GIFs

Okay, now we can start learning about the basics of the R coding syntax.

R is a programming language . This means just as you would with learning any other language, you need to immerse yourself into it by practicing and getting your hands dirty. (There’s a reason that annoying owl doesn’t do well when you 👻 it.) No one knows everything about the R coding language. You learn it over time and with more experience.

Coding in general can be very frustrating 🤬. Even for experienced coders (I would count myself as one of them – I know quite a few coding languages and even know a lot of theory behind coding languages and get pretty frustrated at times). But with more practice, you get more and more used to it.

You should not approach coding simply as memorizing what to type. It also requires a different type of thinking and workflow. What I mean by this is that coding is logical 🖖.

Your computer is dumb 🤪. Coding is telling your computer what to do, step by step ☑️. This means that code needs to follow a particular order and you need to be exact. Many people get frustrated and overwhelmed when they are just staring at a blank script file 🙀. You need to first think about what end result you want and need to work backwards from that. Sometimes, if I am stuck, I just start drawing or writing an outline that starts with the end result I want and I start writing backwards what steps need to be done to get there ✍️.

Additionally, you can’t memorize everything. This means that you will have a lot of times where you are troubleshooting bugs in code. There is no central repository of coding information (including me).

When coding, you need to get in the habit of looking up answers for how to do something or how to fix a bug by searching the internet . I did not get this good at coding because I have someone to ask coding questions to. I became good at coding because I learned how to find resources and documentation for how to do these things myself. The common joke is that a coder needs one screen to write their code and about 20 others to display all of the Google Chrome tabs that they have open. This is very true. The most common forum or site that people use to find answers to coding questions is StackOverflow . I not only have it bookmarked but I search non-coding things and Google’s algorithm often assumes I am looking for something on their site.

So… my advice is:

accept the fact that you will get frustrated during the process of writing your code 😔; but trust me, once you have your end result, you’ll be satisfied 💪.
if you want to develop some independence and really understand what your code does, don’t fall into the habit of immediately asking instructors/friends/classmates/etc. to help diagnose an error or to explain the code. Hop on to Google and start searching around.
- This will be less successful and meaningful in the beginning, but over time, you’ll start to understand what to look for, how, and how to understand some of the jargon in some of these resources over time. Again, this is part of the skillset to develop. 🏋️

Okay… finally, some basics of R code. 😩

7.1 Commenting

Denoted with # in a .qmd code chunk or your .R
R does not evaluate this. So comments can be written in natural language
Used to explain what the code:
- is doing
- notes for yourself or teammates
- help explain and document your code for the future
You must document your code
- This is really important. Don’t just write the code, comment different parts to explain what it is doing
Pick a consistent commenting strategy and stick with it
- Some people only use # to add comments to their code.
- My personal preference is to write my comments like a bulleted list.

```{r}
#| label: one-plus-one
# Notes:
    #* Description: R script to add one to one
    #* Updated: 2023-01-01
        #** by: dcr

# Calculate 1+1
1+1
```

Notice, that I don’t use #|. This is because #| is a special thing in a .qmd code chunk to specify options. I instead use # to represent headers and #* for subheadings in my comments. This allows me to create a bulleted-list-type set of comments that are pretty easy to read and to also follow for teammates.

7.2 Objects

Objects represent the temporarily stored (cached) result of any operation/function in R.

You define objects with the assignment operator (<-).

Here is an example:

Say that I want to evaluate the expression \(1+1\). But say I want to access the result of that expression later. Either because I don’t want to rerun the code or if I want to store the result and do some other things to it later.

I can create an object to store the result of me evaluating the expression \(1+1\). R stores this object in your environment (see Chapter 3) and will stay accessible to you until you close out of RStudio (RStudio Team 2020).

# Define a = 1+1
a <- 1+1

So with the code chunk above, what does a represent?

Let’s check:

print(a)

[1] 2

Now let’s say that I want to evaluate the expression \(1+1+1\) later in my code. But I already evaluated part of this expression with \(1+1\)! Since I stored the result in a, I can just evaluate the expression \(a+1\) which is equivalent to \(1+1+1\).

# Evaluate a+1
b <- a+1

We should expect that \(b=3\):

print(b)

[1] 3

7.2.1 Functions

Functions are a specific type of object.

Functions are simply an object that takes an input and does something to that input. R is full of functions.

Some examples of functions that we have already dealt with:

<-: The input is the object and the name you want to assign the object
+: The input are the numbers you want to sum

Functions more generally are predefined with underlying code (often very complex) that do some task for you. They are a shortcut so that you don’t have to rewrite that same code. Otherwise, everyone’s scripts would be much longer than they are.

As functions are a specific type of object, the whole idea behind a function is to only need to write the code once; rather than multiple times throughout a script.

We can write our own functions (advanced topic we aren’t going to go over), and we can access the functions written by other people to make our life easier.

When someone writes a function and want to share it, they often have other related functions and code. They can package all of this code into a library and submit it to code repositories such as Github or CRAN (where you downloaded R; because what you downloaded came with basic functions like + and <-).

If there is a function that someone has written and packaged, you will need to install the underlying code onto your computer so that you can use it for your own work.

7.2.1.1 Installing packages

To install a package from CRAN (often the default place to download packages from), you can use the built-in function install.packages()

```{r}
# Install some package called FakePackage
install.packages("FakePackage")

# Install some package called FakePackage and another called OtherPackage
install.packages(c("FakePackage", "OtherPackage"))
```

The install.package() function takes an input – the name of the package you want to install (surrounded by quotations) – and it will then perform some action with that input – download the underlying code for that package and function.

Note

The input for this and other functions ARE case-sensitive.

For example, if I ran the following instead:

```{r}
install.packages("fakepackage")
```

I would get an error!

7.2.1.2 Loading packages

Now, how do I use my function?

Well, you’ve downloaded the package containing the function that you want to use, but you will need to now load it to your environment. Why? Well, again, remember that functions are just a type of object. More practically, think about the number of packages some one like me has on my computer. If every time I opened R, it automatically loaded all of that code, then my computer would probably melt. So instead, I just want to specify what I want to load.

There are a couple of options: 1. Most people load the entire package by writing:

```{r}
# Load packages
library(FakePackage) # load FakePackage
```

Modularly loading only the functions you want from your packages with the box package (Rudolph and Schubert 2022):

```{r}
# Load packages
box::use(
    FakePackage = FakePackage[FunctionIWant, OtherFunctionIWant], # Load FunctionIWant and OtherFunctionIWant in FakePackage
    OtherPackage = [...] # load all functions in OtherPackage
)
```

What are the benefits of the second option?

I load only the functions that I actually am going to use
Sometimes functions can be named the same thing, do different things, and are from different packages. This can create a lot of confusion and can even lead to errors in R if R is confused about what to use. Loading things purposefully, you can identify all of this before you try to run it.
Puts less strain on my computer
Easier to read. Now I can see what functions I got from which package rather than it just loading everything from the packages I loaded with library()

To use one of the two options:

Using library(): No extra steps, library() comes with R when you downloaded it – from the base-r package (technically)
Using box::use(): Need to install the box package. And then you can just use the code from above!

install.packages("box")