1.3 Programming in R

OK, let’s get started! First you need to import your data into R using the import function from the rio package, so make sure you install rio and load it with library(rio). Wait! What does all this mean? Great question. Before we get to how you import data, let’s clarify some key concepts so that we speak the same language.

A function is a reusable piece of code: you give it inputs (arguments), it performs a task, and it returns a result. Setting an alarm is a loose analogy. You supply a time X and a song Y, and the alarm applies the rule: If DD/MM/YYYY HH:MM:SS = X Then Play Song Y.

Packages are primarily collections of R functions. In addition to functions, packages can also contain data and some other details, all compiled in a well-defined format and created to add specific functionality. There are 10,000+ user-contributed packages, and the number keeps growing. The library is the directory where packages are installed. Does that mean R is useless without additional packages? Not really. There is a set of standard (or base) packages that are considered part of the R source code and are automatically available as part of your R installation. Think of R as a scientific calculator: the base functions are the + and - operations, the base packages are the mean or standard deviation calculations, and the additional packages are installed programs that run specific analyses, like a fixed effects or a Poisson regression.

One other important point is the difference between installing and loading a package. Think of a package as a book. Installing it is like buying the book and putting it on your shelf (the library). Each time you want to read the book, you still need to take it off the shelf. That is what calling library() does.
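To make these ideas concrete, here is a small sketch. The function name check_alarm and its arguments are made up for illustration; they are not part of R or of any package. The last lines use base R only, so nothing needs to be installed.

```r
# A hypothetical alarm function (illustrative names, not a real R function).
# It takes three inputs and returns a message.
check_alarm <- function(current_time, alarm_time, song) {
  if (current_time == alarm_time) {
    paste("Playing", song)
  } else {
    "Not yet"
  }
}

check_alarm("07:00:00", "07:00:00", "Song Y")  # times match, so the song plays
check_alarm("06:59:00", "07:00:00", "Song Y")  # times differ, so nothing plays

# Functions from the base packages are available right after installing R.
mean(c(2, 4, 6))  # arithmetic mean: 4
sd(c(2, 4, 6))    # standard deviation: 2

# The library is the folder (or folders) where installed packages live.
.libPaths()
```

The path printed by .libPaths() depends on your computer, so yours will differ from anyone else’s.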

Let’s install and then load our first package, rio. It is a commonly used package for data import and export.

install.packages("rio")  # Installing rio. You only need to do this step once.
library(rio)  # Loading rio. You need to do this in every new R session, so include it in each new R Markdown file.

When writing code, you will usually start with a code chunk that loads all the libraries you will be using and installs any that are not installed yet. I realize we have not talked about code chunks. A code chunk is the part of the editor that contains your code. In R Markdown, a chunk starts with ```{r} and ends with ``` (three backticks, also called graves). You can also insert one by clicking the Insert button in the editor toolbar.
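One way to write such a first chunk is sketched below. The helper load_or_install is a hypothetical name introduced here for illustration; it is not part of R or rio. It relies only on base R functions (requireNamespace, install.packages, and library).

```r
# Hypothetical helper (not part of R or rio):
# install any package that is missing, then load every package in the list.
load_or_install <- function(packages) {
  for (pkg in packages) {
    # requireNamespace() returns FALSE when a package is not installed.
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)
    }
    # character.only = TRUE tells library() that pkg holds the name as text.
    library(pkg, character.only = TRUE)
  }
  invisible(packages)
}
```

With this in your first chunk, calling load_or_install(c("rio")) would cover both the install and the load step, and you can add more package names to the vector as your project grows. Note that installing from inside a knitted document may require a CRAN mirror to be set, so many people still prefer to run install.packages() once in the console.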

Finally, we will now import data. There are many ways you can do this, but I personally prefer importing from the same folder where my R Markdown file is saved. This helps me avoid any confusion with working directories, and I rarely run into problems using this method. Basically, for every project, create a folder, put all your data in that folder, and save your .rmd file in it (.rmd is the extension of an R Markdown file). If you do this, you should be able to see your datasets when you click on Files in the lower right section of the interface. Using this method, all you need to do when importing or reading your data files is to write the file name (or its path relative to that folder).
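If you are ever unsure which folder R is looking in, the base R functions below offer a quick check. This is a minimal sketch; the path data/sesame13.sav assumes the file sits in a subfolder called data, as in the import code in this section.

```r
getwd()                           # the folder R is currently working in
list.files()                      # files and folders inside that folder
file.exists("data/sesame13.sav")  # TRUE if the file is where the import code expects it
```

When you knit an R Markdown file, the working directory is normally the folder where the .rmd file is saved, which is why keeping the data next to it works well.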

Here, I am importing an SPSS dataset (.sav) called ‘sesame13’, which I keep in a subfolder called data inside my project folder. I then store the imported data in an object called sesame13. Notice that I also set the class to "tbl_df", which returns a tibble (i.e., tibble::tibble()) instead of a standard data frame. A tibble behaves mostly like a data frame but prints more neatly.

sesame13 <- import('data/sesame13.sav', 
                   setclass = "tbl_df")
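Since rio handles export as well as import, a small round trip may help show how the two fit together. The sketch below assumes rio is installed and uses the built-in mtcars dataset and a temporary file, so it does not depend on the sesame13 data; the object names csv_file and cars_back are made up for illustration.

```r
library(rio)

# Write a built-in dataset to a temporary .csv file.
csv_file <- tempfile(fileext = ".csv")
export(mtcars, csv_file)

# Read it back in as a tibble, the same way sesame13 was imported.
cars_back <- import(csv_file, setclass = "tbl_df")
class(cars_back)

# Remove the temporary file when done.
unlink(csv_file)
```

rio chooses the file format from the file extension, so the same import() and export() calls work for .sav, .csv, and other formats it supports.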

The next thing we want to do is take a look at the data without printing all of it. We use the function head to look at the first 6 rows. You can also use tail to look at the last 6 rows.

head(sesame13)
## # A tibble: 6 x 20
##      id  site   sex   age viewing setting treatmen prebody prelet preform
##   <dbl> <dbl> <dbl> <dbl>   <dbl>   <dbl>    <dbl>   <dbl>  <dbl>   <dbl>
## 1     1     1     1    66       1       1        1      16     23      12
## 2     2     1     2    67       3       1        1      30     26       9
## 3     3     1     1    56       3       1        0      22     14       9
## 4     4     1     1    49       1       1        0      23     11      10
## 5     5     1     1    69       4       1        0      32     47      15
## 6     6     1     2    54       3       1        0      29     26      10
## # … with 10 more variables: prenumb <dbl>, prerelat <dbl>, preclass <dbl>,
## #   postbody <dbl>, postlet <dbl>, postform <dbl>, postnumb <dbl>,
## #   postrel <dbl>, postclas <dbl>, ppvt <dbl>
# tail(sesame13)  # Putting the symbol # in front of code comments it out, so it is not executed.
# You can also use # to add comments and notes for you and your collaborators.
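Both head and tail accept an argument n if you want a different number of rows. The sketch below uses the built-in mtcars dataset so that it runs without the sesame13 file; the same calls work on sesame13.

```r
head(mtcars, n = 3)  # first 3 rows instead of the default 6
tail(mtcars, n = 3)  # last 3 rows
dim(mtcars)          # number of rows, then number of columns
names(mtcars)        # column (variable) names
```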