# It's easy to get started
::: {.goals}
- Installing R and RStudio
- Creating a new project in RStudio
- Running commands in the R console and from a script file
- R's basic operators
- Using functions
:::
Getting R to run on your computer is usually a relatively smooth
experience[^easy-start-1]: Download and install the latest version of **R**
([https://r-project.org/](https://www.r-project.org/)) and **RStudio**
([https://rstudio.com/products/rstudio](https://www.rstudio.com/products/rstudio/)) for
your operating system. If you already have either program installed on your computer,
there is no need to uninstall the old versions beforehand.
[^easy-start-1]: It can be more complicated if you are working on a machine with
restricted permissions, for example a computer issued to you by your employer. If you
are in a situation like this, and you are experiencing problems with the installation,
please contact your IT department to sort it out.
If you cannot or do not want to install anything on your machine, you can instead make an
account at **RStudio Cloud** (<https://rstudio.cloud/>). This lets you use R and
RStudio in the browser. You can use RStudio Cloud for free for a few hours each month. For
the purpose of working through this book that should be enough.
That's it! Everything else you need can be set up from within RStudio.
::: {.more}
**Additional Software**
If you are using Windows, you might eventually also have to install **RTools**
(<https://cran.r-project.org/bin/windows/Rtools/>). This includes additional tools, such
as a C/C++ compiler, that you will need for things like Bayesian statistics.
For reproducible research, you should also install **Git** (<https://git-scm.com/>), a
version control system which we will briefly talk about later. For now, you do not have to
worry about these things, because you can always do them later.
:::
::: {.exercise .required}
Use one of the methods suggested above to get access to R and RStudio so you are ready to
follow along for the rest of the book.
:::
## Why RStudio?
You might be wondering what RStudio is and why we need it. RStudio is a popular
**Integrated Development Environment (IDE)** for R. It is an *all-in-one* user interface
that provides many conveniences that make working with R a lot easier. To use an analogy:
If R were a horse, RStudio would be a saddle to sit in. You don't need it, but it makes
things more comfortable. There are good alternatives to RStudio, but R's own GUI is not
really one of them:
```{r, echo=FALSE, fig.cap="R's own GUI --- The 90s called, they want their user interface back.", fig.alt="A screenshot of RGui"}
knitr::include_graphics("images/r-gui.jpeg")
```
## Create a new RStudio project
Alright cowboy, let's saddle that horse! 🤠
The first thing you should do whenever you start a new data analysis project with RStudio
is to create an **RStudio project** in a new folder. This is the first step towards
reproducibility: All of your project's data and files will be stored in this folder. If
you ever move the project folder to a new location or share it with someone else, all of
the files will move along with it, reducing the chance that things will break. Another
benefit of using projects is that it helps you to keep your work organized. It will be
much easier for you to switch between projects or get back to a project you haven't
touched in a while.
::: {.exercise}
Open RStudio, go to `File ▸ New Project...` and create a new RStudio project in an empty
folder. Leave *Use version control* and *Use renv* unchecked for now --- we will get back
to those later.
:::
Now that we have our project set up, let's have a look around RStudio. The many panels and
tabs in RStudio may be overwhelming at first, but I promise this gets better over time,
and most of them are actually quite helpful. The console, for example, allows you to
interact with R directly.
```{r, echo=FALSE, fig.cap="RStudio's user interface with the four panels", fig.alt="A screen capture of the RStudio user interface"}
knitr::include_graphics("images/rstudio-layout.gif")
```
## Using R as a calculator
The **console** is the panel on the bottom left, the one with the expectantly blinking
cursor. In the console you can type commands and when you press <kbd>Enter</kbd>, R will
try to interpret them. You can also use <kbd>↑</kbd> and <kbd>↓</kbd> to cycle through
previous commands and <kbd>Ctrl</kbd>/<kbd>Cmd</kbd> + <kbd>↑</kbd> to search through
previous commands.
#### Using operators {.unnumbered}
Let's get familiar with the console by trying some of R's mathematical **operators**. R
has all the operators that you would expect to find in a pocket calculator:
| | |
|--------------------------------------|-------------------|
| Arithmetic | `+ - * / ^` |
| Modular arithmetic | `%% %/%` |
| Relational | `< > <= >= == !=` |
| Logical | `&& || ! | &` |
| Parentheses for modifying precedence | `(1 + 2) * 10` |
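
As a minimal sketch of how these operators behave in the console, the chunk below tries a few from each row of the table. The numbers are arbitrary and chosen only for illustration.

```{r}
7 + 2          # addition
7 / 2          # division
2^10           # exponentiation
7 %% 2         # remainder after dividing 7 by 2
7 %/% 2        # integer division: how many times 2 fits into 7
3 <= 4         # relational operators return TRUE or FALSE
1 == 2
TRUE && FALSE  # logical AND
TRUE || FALSE  # logical OR
!TRUE          # logical NOT
(1 + 2) * 10   # parentheses are evaluated first
```

Without the parentheses, `1 + 2 * 10` evaluates the multiplication first and returns `21` instead of `30`.
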
#### Using functions {.unnumbered}
It also has all the **functions** you would expect to find in a calculator. Functions are
written as the function name followed by the function's argument in parentheses. For
example, for calculating the square root of 2, use `sqrt(2)`.
If a function has multiple **arguments**, they are separated by commas, as in
`log(2, base = 10)`, which will give you the logarithm of 2 to the base 10. Note that
we supply the base as a **named argument**. Functions expect arguments in a certain order.
By using named arguments you can override this order, and you can supply individual
arguments while using the **default value** for other arguments (e.g., the default for
`base` in the `log()` function is Euler's number $e$). This is common for functions with
multiple arguments. In our example, using the named argument isn't really necessary,
because we are providing the arguments in the expected order, but it may still help to
make your code more readable.
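
The different ways of supplying arguments described above can be sketched as follows, using only `sqrt()` and `log()` from base R. The input `2` is just an example value; `x` is the name of the first argument of `log()`.

```{r}
sqrt(2)                # a function with a single argument

log(2, base = 10)      # named argument: logarithm of 2 to the base 10
log(2, 10)             # same result, arguments matched by position
log(base = 10, x = 2)  # named arguments can be given in any order

log(2)                 # base omitted, so the default (Euler's number) is used
```

The first three `log()` calls return the same value, while the last one returns the natural logarithm of 2.
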
::: {.exercise}
In the *console*, try out some of R's operators and see what happens. For example,
1. What does `1 != 1.0` do?
2. What happens if you run an incomplete command like `1 *`?
3. What happens if you run a command that does not make sense, like `1*/ 3`?
::: {.answer}
1. R checks whether the argument on the left side (`1`) is *not equal* to the argument on
the right (`1.0`). It returns `FALSE` because to R these two numbers are identical.
2. R waits for you to finish the command. Press <kbd>Esc</kbd> to terminate the
incomplete command.
3. R throws an error message and tries to tell you what went wrong. If you are working
with R, you will have to get used to getting lots of error messages.
:::
:::
```{r, echo=FALSE, fig.cap="Don't worry -- you can't hurt R, it has been through this before...", fig.alt="Hello Operator Meme", fig.align = 'center'}
knitr::include_graphics("images/operator-meme.jpeg")
```
## Create a script
The console is useful for running quick throw-away commands, and sometimes that's all you
need. However, code that you want to keep should be stored in a **script**. We will use a
special type of script, called an **RMarkdown Notebook**.
**RMarkdown** allows you to do [literate
programming](https://en.wikipedia.org/wiki/Literate_programming). That is, it lets you mix
code and the output the code generates with text. The text can be formatted with a special
syntax to create headings, footnotes, links, and more. This allows you to keep your notes
and your code together and is great for writing reproducible documents. In fact, this
entire book was written with *RMarkdown*.
::: {.exercise}
Go to `File ▸ New File ▸ R Notebook` to create a new RMarkdown Notebook and save it in
your project's working directory as `analysis.Rmd`. Note that the file has appeared in
RStudio's `Files` tab, which shows you the content of your project folder.
:::
The new notebook already contains some example content. You can see that there is R code
that is written inside special blocks, called **code chunks**. You can create new code
chunks with the green button on the top right of the script panel or with the keyboard
shortcut <kbd>Ctrl</kbd>/<kbd>Cmd</kbd> + <kbd>Option</kbd>/<kbd>Alt</kbd> + <kbd>I</kbd>.
Notebooks still allow a flexible console-like workflow because you can run individual
sections or lines of code in any order. You can...
- **run an entire code chunk** by clicking the green arrow on the right of the chunk.
- **run individual lines or sections of code** by using <kbd>Ctrl</kbd>/<kbd>Cmd</kbd> +
<kbd>Enter</kbd> or the buttons on the top right of the script panel.
::: {.exercise}
Add a new *code chunk* to your RMarkdown Notebook in which you calculate the square of
`2021` as well as the square root of `1764`. Then run the chunk.
::: {.answer}
The code chunk in `analysis.Rmd` should look like this.
```{r}`r ''`
2021^2
sqrt(1764)
```
When you run the chunk, the output should be
```{r, echo=FALSE, results='hold'}
2021^2
sqrt(1764)
```
:::
:::