3  Learn (As You Go)

3.1 Find Functions

How would you discover functions for your specific needs?

Books

There are many great books on R that will expose you to functions. But, if you learn best by doing, the most effective technical book will be the one designed toward your specific task.

Large-Language-Models

The hottest new method is to simply ask LLM’s.

Search Engines

Google is a great search engine that most R programmers use when learning the R language. If we search “r capitalize first letter” we see, on 2025-04-14, the following paragraph as the first result:

Convert First letter of every word to Uppercase in R Programming – str_to_title() Function. str_to_title() Function in R Language is used to convert the first letter of every word of a string to Uppercase and the rest of the letters are converted to lower case.

The trick is to, within Google, always write r before a question or the desired command, like how to capitalize first letter or simply capitalize first letter.

This is a simple example. Most of the time it can be difficult to write in English what you want. This will come with time and practice. At first you may find that the Google search results have nothing to do with what you need. That is a sign to re-word your search, or, if you’ve already re-worded your search, it may be a sign that there is no dedicated function for what you need, or that a different approach is needed. It’s rare that there will be no dedicated function so long as your goal is simple. You may find that it is effective to break down what you’re doing into simple steps, and then search for how to do those steps, as opposed to Googling something long and complicated, involving many steps.

Stack Overflow

Speaking of breaking down something complicated so that a search engine like Google can understand it, this is also necessary for others to understand it. For learning R, allowing others to understand your challenge or need is valuable as the R community is not only willing, but also quickly able to help. R users mainly help each other through Stack Overflow. It is a website that easily allows users to ask or answer questions with code, have their code formatted (look nice), and receive feedback.

The main draw of Stack Overflow is that the person asking the question has one main responsibility, and that is to produce what is called a minimally reproducible example: an example that can be used (reproduced) by someone else seeing the question, and that does not have unnecessary detail irrelevant to the question (minimal). Here’s a short example.

Knowing how to make a reprex (minimally reproducible example) is the majority of the work involved in asking a question on Stack Overflow.

Creating Minimally Reproducible Examples

If your question involves data frames, you need to learn how to build a data frame before asking your question on Stack Overflow. To build a data frame, you can use the tibble() function from package tibble.

If you have 2 numeric columns, like in

# A tibble: 12 × 2
    year lifeExp
   <dbl> <chr>  
 1  1952 28.801 
 2  1957 30.332 
 3  1962 31.997 
 4  1967 34.02  
 5  1972 36.088 
 6  1977 38.438 
 7  1952 68.75  
 8  1957 69.96  
 9  1962 71.3   
10  1967 72.13  
11  1972 72.88  
12  1977 74.21  

then the first part of your minimal example might look this:

tibble(x = c(1, 2, 1, 2), y = c(3, 4, 2, 2))
# A tibble: 4 × 2
      x     y
  <dbl> <dbl>
1     1     3
2     2     4
3     1     2
4     2     2

And if what you’re trying to achieve is

# A tibble: 6 × 2
   year mean_lifeExp
  <dbl>        <dbl>
1  1952         48.8
2  1957         50.1
3  1962         51.6
4  1967         53.1
5  1972         54.5
6  1977         56.3

then the second part of your minimal example might look like this:

tibble(x = c(1, 2), mean_y = c(2.5, 2))
# A tibble: 2 × 2
      x mean_y
  <dbl>  <dbl>
1     1    2.5
2     2    2  

To summarize, your entire question on Stack Overflow could look like this:

How can I transform the first tibble into the second tibble with a function?
library(tibble)
tibble(x = c(1, 2, 1, 2), y = c(3, 4, 2, 2))
tibble(x = c(1, 2), mean_y = c(2.5, 2))

To make your question even better, you can format your code by using the reprex function from the reprex package. The curly brackets are needed to tell reprex that you have multiple lines of code.

library(reprex)
reprex(
  {
    library(tibble)
    tibble(x = c(1, 2, 1, 2), y = c(3, 4, 2, 2))
    tibble(x = c(1, 2), mean_y = c(2.5, 2))
  }
)

Finally your question looks friendly:

How can I transform the first tibble into the second tibble with a function?
library(tibble)
tibble(x = c(1, 2, 1, 2), y = c(3, 4, 2, 2))
#> # A tibble: 4 × 2
#>       x     y
#>   <dbl> <dbl>
#> 1     1     3
#> 2     2     4
#> 3     1     2
#> 4     2     2
tibble(x = c(1, 2), mean_y = c(2.5, 2))
#> # A tibble: 2 × 2
#>       x mean_y
#>   <dbl>  <dbl>
#> 1     1    2.5
#> 2     2    2

3.2 Understand Functions

Once you’ve found a function (or usually, a set of functions) recommended to you, it would be wise to understand how the function(s) work; specifically, the inputs and outputs.

Large-Language-Models

LLM’s can explain function inputs and outputs as they are trained on public online knowledge.

Search Engines

Search engines will give a variety of websites. Remember, after googling “r capitalize first letter” we saw the following paragraph as the first result:

Convert First letter of every word to Uppercase in R Programming – str_to_title() Function. str_to_title() Function in R Language is used to convert the first letter of every word of a string to Uppercase and the rest of the letters are converted to lower case.

This paragraph is from a website called GeeksforGeeks. I would not recommend using this website to understand the function. And for multiple reasons.

  1. You are not familiar with the format of the website.
  2. You will find yourself on multiple websites when you need to discover and learn about multiple functions.
  3. You will then have to navigate the formats of these websites.
  4. Many things can get in the way of reading the instructions, like pop-ups to sign up for the website’s email list, advertisements for completely unrelated products (everything you need to learn R is FREE), and recommended articles to distract you.

It is more effective to use a single, standardized resource when learning about functions. Thankfully, R has a few.

After reading the above paragraph and learning that the function we need may be str_to_title(), we can now Google search “r str_to_title” instead of “r capitalize first letter”. Again, Google shows multiple websites, but we are looking for one that is standardized. tidyverse.org is one of those websites, so we click the result that has “tidyverse.org” in the website address This brings us to this page: https://stringr.tidyverse.org/reference/case.html

As standard, there are multiple sections to the webpage describing a function: Usage, Arguments and Examples. Usage shows the format of the inputs to the function. Any input with an = beside it has a default value. A default value usually indicates that most users will not need to change the value.

The Usage str_to_title(string, locale = "en") tells us that

  1. string should be an object containing some string(s) or a string itself. It has no default value; we must provide one.
  2. locale has the default value "en".

The Arguments tell us more about the inputs in case the Usage is not enough. When first learning R, Arguments can be overwhelming; you might quickly find yourself not understanding the words contained therein, and having to continuously look up definitions (or more function documentation) in order to understand.

Stack Overflow

Another way to understand functions is to be presented with answers from others on Stack Overflow. These answers don’t need to be answers to the questions you have posted on Stack Overflow; they can be answers to questions posted by others.

Here is an example from 2019.

There are three separate answers that have up votes (positive feedback represented by the digit on the top left of an answer): 1 using the data.table package; 1 using base R (R without packages); and 1 using tidyr.

Notice how the answer using tidyr is far more simple; it is one line of code. This word tidy keeps popping up, and for good reason: the functions in this package and more broadly in the tidyverse (the tidy universe) are designed to make coding short and simple.

It is possible to add comments to the answers on Stack Overflow, with further questions about the functions if there is something you don’t understand. Fortunately the tidyverse functions are well documented because of their standardized webpages, and because of multiple, free books on using them for specific tasks.