R code (top right button if you hover over a code chunk) or click on links.
Doing statistical calculations by hand? Tedious & error-prone! A computer is faster…
Using spreadsheets? Limited options, change data accidentally…
Using point-and-click software (e.g., SPSS)?
proprietary software = expensive
R = open, extensible (community)
reproducible!
Science/Academia is a marathon and not a sprint
=> it is worthwhile investing in skills with a slow learning curve that will pay off in the long run
R in your research projects until you’re “good enough”. It’s more fun to use it on “actual” problems, and that makes it much easier to learn.
You should all have installed R & RStudio by now! Who had problems doing so?
RStudio Interface
Script pane: view, edit, & save your code
Console: here the commands are run and rudimentary output may be provided
Environment: which variables/data are available
Files, plots, help etc.
Console used as calculator
<- is used to assign values to variables (= is also possible, but discouraged in R)
a, multi etc. are the variable names (some naming rules apply, e.g., no whitespace, must not start with a number, many special characters not allowed)
Typing 2*3 outputs 6, but multi <- 2*3 doesn’t (the result is stored in multi instead)
Variables can contain basically anything (words, numbers, entire tables of data …)
the variables contain the calculated value (i.e. 101) and not the calculation/formula (100+1)
Running a + multi changes neither a nor multi.
(… R won’t warn you about this!)
This code with sqrt(9) looked unfamiliar. sqrt() is an R function that calculates the square root of a number. 9 is the argument that we hand over to the function.
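The console examples discussed above can be sketched as follows. This is a minimal sketch: the values 100 + 1 and 2 * 3 are taken from the text, while the overwriting step near the end is an illustrative assumption.

```r
# Assign the result of a calculation to a variable (nothing is printed)
a <- 100 + 1
multi <- 2 * 3

# Typing a variable name (or a bare calculation) prints the value
a          # 101: the stored value, not the formula 100 + 1
2 * 3      # 6

# Using variables in a calculation does not change them
a + multi  # 107
a          # still 101

# Assumption for illustration: assigning to an existing name
# replaces the old value without any message
a <- 5
a          # 5

# sqrt() is a function; 9 is the argument handed over to it
sqrt(9)    # 3
```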
If you want to know what a function does, which arguments it takes, or which output it generates, you can type into the console: ?functionname
This will open the help file in the Help Pane on the lower right of RStudio.
You can also click on a function in the script or console pane and press the F1 key.
Sometimes, the help page can be a bit overwhelming (lots of technical details etc.). It might help you to scroll down to the examples at the bottom to see the function in action!
Functions often take more than one argument (which have names):
You can explicitly name your arguments (check the help file for the argument names!) or just state the values (but these have to be in the correct order then! See help file).
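A minimal sketch of named versus unnamed arguments. It uses round() from base R as an assumed example, since the text does not say which function the slide showed.

```r
# round() takes the number to round (x) and the number of decimal places (digits)
args(round)                                # prints the argument names of round()

named      <- round(x = 3.14159, digits = 2)  # arguments given by name
positional <- round(3.14159, 2)               # unnamed: values must be in the documented order
reordered  <- round(digits = 2, x = 3.14159)  # named arguments may come in any order
defaulted  <- round(3.14159)                  # digits has a default of 0, so it can be omitted

named       # 3.14
positional  # 3.14
reordered   # 3.14
defaulted   # 3
```

Naming arguments is slightly more typing, but it makes a script easier to read later and protects against mixing up the order.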
There are a number of functions already included with Base R (i.e., R after a new installation), but you can greatly extend the power of R by loading packages (and we will!). Packages can e.g. contain collections of functions someone else wrote, or even data.
You should already have the tidyverse installed (if not, quickly run install.packages("tidyverse") :-) )
But installing is not enough to be able to actually use the functions from that package directly. Usually, you also want to load the package with the library() function. This is the first thing you do at the top of an R script:
(If you don’t load a package, you have to call functions explicitly by packagename::function)
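A minimal sketch of installing, loading, and calling package functions. The tidyverse lines are commented out so the example runs on any R installation; the tools package (shipped with R, but not loaded by default) is used as a stand-in and is an assumption of this sketch, not part of the workshop.

```r
# Run once per computer, not in every script (commented out on purpose):
# install.packages("tidyverse")

# At the top of a script, load the packages you need, e.g.:
# library(tidyverse)

# Stand-in demo with "tools", which is installed with R but not loaded by default
ext_explicit <- tools::file_ext("ahi-cesd.csv")  # explicit packagename::function call

library(tools)                                   # load the package ...
ext_loaded <- file_ext("ahi-cesd.csv")           # ... then call the function directly

ext_explicit  # "csv"
ext_loaded    # "csv"
```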
R project together, which will help you to work with files that belong together.
PS: R can deal with folder and file names that contain spaces, but since some programs can’t, it’s best practice not to use whitespace in file/folder names.
You will find the current project in the top right corner of RStudio.
If you click on the current project, you can open new projects by choosing “Open Project” and selecting the .Rproj file of the project.
You can also just double-click on .Rproj files and RStudio will open with the project loaded.
Existing projects
To open a new script, click File → New File → R Script (Ctrl + Shift + N).
To run a line of the script, you can either click Run at the top right of the pane or press Ctrl + Enter. It will run the code that is highlighted/selected or automatically select the current line (or the complete multi-line command).
To run the whole script/chunk, press Ctrl + Shift + Enter (with full console output) or Ctrl + Shift + S (limited output).
Using scripts
[1] 1 7 12 4 2
[1] 2.000 6.100 9.234 1.230
[1] "hello" "cake" "biscuit"
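Output like the three lines above is what R prints for vectors. A minimal sketch that would produce it, assuming the vectors were built with c() (the variable names are hypothetical). The last lines illustrate what happens with mixed types and what element-wise (“vectorized”) arithmetic looks like.

```r
# c() combines single values into a vector
numbers  <- c(1, 7, 12, 4, 2)
decimals <- c(2, 6.1, 9.234, 1.23)
words    <- c("hello", "cake", "biscuit")

numbers
decimals
words

# A vector holds one type only: mixing numbers and text
# silently turns every element into text
mixed <- c(10, "biscuit", 2.31)
class(mixed)     # "character"

# Vectorized: the operation is applied to every element at once
doubled <- numbers * 2
doubled          # 2 14 24 8 4
mean(numbers)    # 5.2
```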
(… c(10, "biscuit", 2.31) does not work?)
c() function (“combine”).
R is “vectorized”, which allows us to do some funny tricks.
To read in data files, you need to know which format these files have, e.g. .txt or .csv files or some other (proprietary) format. There are packages that enable you to read in data of different formats like Excel (.xlsx).
We will use the files from Fundamentals of Quantitative Analysis: ahi-cesd.csv and participant-info.csv. Save these directly in your project folder on your computer (do not open them!).
Did you find the files? Here are the direct links:
Create a new script with the following content:
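The script itself is not reproduced in this text. A minimal sketch of what it could look like, assuming both files sit in the project folder and using base R’s read.csv(). The object names dat and pinfo are the ones used in the steps that follow; the file check is an added convenience, and a tidyverse-based script might use readr’s read_csv() instead.

```r
# File names as given above; they are looked up in the working directory
# (the project folder when the .Rproj project is open)
files <- c(dat = "ahi-cesd.csv", pinfo = "participant-info.csv")
not_found <- files[!file.exists(files)]

if (length(not_found) > 0) {
  # Nothing is read; report where R looked so the files can be moved there
  message("Not found in ", getwd(), ": ", paste(not_found, collapse = ", "))
} else {
  # Read each CSV file into a data frame
  dat   <- read.csv(files[["dat"]])
  pinfo <- read.csv(files[["pinfo"]])
}
```

If the message appears, the files are most likely in a different folder than the project (for example, the Downloads folder).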
Run the code!
There are several options to get a glimpse at the data:
Click on dat and pinfo in your Environment.
Type View(dat) into the console or into the script pane and run it.
Run str(dat) or str(pinfo) to get an overview of the data.
Run summary(dat).
Run head(dat), print(dat), or even just dat.
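A minimal sketch of these commands. It uses the built-in mtcars data set as a stand-in (an assumption made so the example runs without the CSV files); with the workshop data, replace example_data with dat or pinfo.

```r
example_data <- mtcars     # stand-in data frame that ships with R

str(example_data)          # structure: one line per column with its type and first values
summary(example_data)      # summary statistics for each column
head(example_data)         # only the first six rows
print(example_data)        # the whole table; same as typing example_data on its own line

# View(example_data)       # opens a spreadsheet-like viewer; needs an interactive RStudio session
```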
What is the difference between these commands?
What is the difference between these objects and the objects/variables that you assigned/saved in your Environment earlier?
RStudio’s Environment panel
The two objects we just read in are data frames, which are “tables” of data (they can contain entire data sets). The objects we assigned earlier were simpler (single values, or “one-dimensional” vectors).
Data frames usually have several rows and columns. The columns are the variables and the rows are the observations (more about that later).
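To illustrate the difference, a small data frame can be built by hand. The column names and values below are made up for this sketch and are not taken from the workshop files.

```r
single_value <- 101        # a single value
scores <- c(12, 15, 9)     # a one-dimensional vector

# Each column is a variable, each row is an observation
toy <- data.frame(
  id    = c(1, 2, 3),
  score = scores,
  snack = c("hello", "cake", "biscuit")
)

nrow(toy)    # 3 observations (rows)
ncol(toy)    # 3 variables (columns)
toy$score    # one column, extracted as a vector
toy[2, ]     # one row: the second observation
```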
This was the first chapter of this workshop! Do you have any questions?