R

Building the R-Bloggers Bluesky Bot with atrrr and GitHub Actions

Bluesky is shaping up to be a nice, “billionaire-proof”1 replacement of what Twitter once was. One of the things the community was still missing, in my opinion, was the R-Bloggers bot that once spread the news about new R blog posts on ex-Twitter. Especially when first learning R, this was an important resource for me and I created my first package using a post from R-Bloggers. Since I have recently published the atrrr package with a few friends, I thought re-creating the bot that posted new entries was a good opportunity to promote that package and show how you can write a completely free bot with it.

Poor Dude’s Janky Bluesky Feed Reader CLI Via atrrr

Have you ever wanted to see your favourite social media posts in your command line? No? Me neither, but at least hrbrmstr has a few months ago. Or to be honest, I don’t know which social media site he prefers, but Bluesky is currently my favourite. With the ease of use and algorithmic curation that I loved about Twitter before its demise and the super interesting and easy to work with AT protocol, which should make Bluesky “billionaire-proof”1, I’m hopeful that this social network it here to stay.

Release: `atrrr`, a wrapper for the AT protocol behind ’Bluesky’

I’m happy to announce that atrrr has made its way to CRAN. The purpose of atrrr is to communicate with the Authenticated transfer protocol (atproto for short), which powers the Twitter replacement social media site Bluesky. I think there are two things that are especially interesting about the package: it gives near limitless access to a social network site from R the backbone of the package was written mostly automatically The first point will make this interesting for teaching, as the well of interesting data that the Twitter research API once was has tried out, thanks to a certain billionaire.

Re-Release: `traktok`

I’m happy to announce that traktok, my package to get content from TikTok, has returned from the dead. That’s slightly exaggerated, because it actually always worked in some shape or form, but up until about September, the most recent state on Github had very limited functionality. Now I extended the package substantially and also gave it an appealing home on a pkgdown site here: https://jbgruber.github.io/traktok/. The main issue I had before, namely that some requests to the unofficial TikTok API need to be signed, still remains unresolved.

CRAN Download counts

I really like developing software and making my own life and work easier with it. But what I enjoy even more is to see others actually use it! So every now and then I look at CRAN download counts of my R packages. I’m not in any top-10 rankings or anything. But that was also never the point. I just like sharing my knowledge and see others use it!

Introducing `askgpt`: a chat interface that helps you to learn R!

Everyone is talking about AI at the moment. So when I talked to my collogues Mariken and Kasper the other day about how to make teaching R more engaging and how to help students overcome their problems, it is no big surprise that the conversation eventually found it’s way to the large language model GPT-3.5 by OpenAI and the chat interface ChatGPT. It’s advantages for learning R (or any programming languages) are rather obvious:

scikit-learn models in R with reticulate

I have tried to venture into Python several times over the years. The language itself seems simple enough to learn but as someone who has only ever used R (and a bit of Stata), there were two things that held me back: I never really found an IDE that I liked. I tried a few different ones including Spyder and Jupyter Notebook (not technically an IDE) and compared to RStudio and R Markdown they felt rather limited.

Let users choose which plot you want to show

If you have build your homepage using blogdown, it’s actually quite simple to integrate Javascript snippets in it. While this is mentioned in the book “blogdown: Creating Websites with R Markdown”, it still took me a little bit to undertstand how it works. As an example, let’s make different versions of a simple plot and let the user decide which one to display. First I make the plots and save them in a sub-directory:

Get all your packages back on R 4.0.0

R 4.0.0 was released on 2020-04-24. Among the many news two stand out for me: First, R now uses stringsAsFactors = FALSE by default, which is especially welcome when reading in data (e.g., via read.csv) and when constructing data.frames. The second news that caught my eye was that all packages need to be reinstalled under the new version. This can be rather cumbersome if you have collected a large number of packages on your machine while using R 3.

(Much) faster unnesting with data.table

Today I was struggling with a relatively simple operation: unnest() from the tidyr package. What it’s supposed to do is pretty simple. When you have a data.frame where one or multiple columns are lists, you can unlist these columns while duplicating the information in other columns if the length of an element is larger than 1. library(tibble) df <- tibble( a = LETTERS[1:5], b = LETTERS[6:10], list_column = list(c(LETTERS[1:5]), "F", "G", "H", "I") ) df ## # A tibble: 5 x 3 ## a b list_column ## <chr> <chr> <list> ## 1 A F <chr [5]> ## 2 B G <chr [1]> ## 3 C H <chr [1]> ## 4 D I <chr [1]> ## 5 E J <chr [1]> library(tidyr) unnest(df, list_column) ## # A tibble: 9 x 3 ## a b list_column ## <chr> <chr> <chr> ## 1 A F A ## 2 A F B ## 3 A F C ## 4 A F D ## 5 A F E ## 6 B G F ## 7 C H G ## 8 D I H ## 9 E J I I came across this a lot while working on data from Twitter since individual tweets can contain multiple hashtags, mentions, URLs and so on, which is why they are stored in lists.