useR!2019: Highlights of the R conference of the year

Did you miss useR!2019?

Too bad for you..

It was an awesome conference!

From tutorials and talks (and food!) to meeting other R users, I just had a great first time (:

[Photo: the R crowd]

In this article, I want to turn my lengthy notes into a recap.

Not sure if that’ll be of any use to you, but at least it will show you what you missed!

Tutorials

The first day was tutorials day..

..which I almost missed!

easyJet decided to cancel my flight the day before. There was no train and no other flight until two days later!!

So I did what I had to do.

I rented a car and drove all night long to NOT MISS the tutorials!

So, I was a bit tired when I arrived there. And a bit stressed out.

[Photo: the parking lot where I spent WAY too much time trying to find the rental shop]

But it went okay.

I started with:

Automatic and Explainable Machine Learning with H2O in R.

[Photo: the h2o tutorial]

I wanted to be at this tutorial because I had heard about H2O, and AutoML, but had never tried them.

I had heard top Kagglers were using it during competitions.

The tutorial was great, as the progression was slow and steady, and we were left with enough time to practice on our own.

At the end of the tutorial, I was able to try h2o by myself on a past project.

Now I’m using it in another project for a client, and it’s performing great!

The biggest strength of h2o, to me, is that you can run a bunch of models and get great results fast.

Unlike caret in R, h2o uses Java to perform all the computations, and everything is parallelized. That makes the computation much quicker.

Plus, by using AutoML, I’m able to find a super-performing model in no time (well, I just have to let it run by itself).
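To give you an idea of what that looks like, here is a minimal sketch of running AutoML from R. It uses iris as a stand-in dataset and a 60-second time budget, both arbitrary choices of mine, not from the tutorial:

```r
library(h2o)
h2o.init()

# Any data frame works; iris is just a stand-in dataset
data <- as.h2o(iris)
splits <- h2o.splitFrame(data, ratios = 0.8, seed = 42)

# Let AutoML try many models within a fixed time budget
aml <- h2o.automl(
  y = "Species",
  training_frame = splits[[1]],
  max_runtime_secs = 60
)

print(aml@leaderboard)   # models ranked by cross-validated performance
preds <- h2o.predict(aml@leader, splits[[2]])
```

You let it run, then pick the leader (or anything else from the leaderboard).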

Now, it’s not perfect of course.

First, the models are not easily interpretable. That’s why the tutorial also covered LIME (Local Interpretable Model-Agnostic Explanations), something I had never heard of before.

But you can still run interpretable models such as GLM or CART with h2o.

Second, it works well when your problem is straightforward, such as plain regression or classification.

In my experience, it’s rarely as simple. Most of the time, you must think creatively and develop more complex models, sometimes multiple layers of models. h2o won’t help you get more creative.

If you’re curious, they have developed tutorials you can find on their GitHub repo.

It’s easy to use.

Docker for Data Science: R, ShinyProxy and more.

[Photo: the ShinyProxy tutorial]

I learned less in this tutorial.

I wish I had had the opportunity to set up ShinyProxy during the tutorial (which is what I had hoped for), but it was more information-oriented than practice-oriented.

Anyway, I still learned a little bit more about shinyproxy. And a little bit about Docker.

I still used the three hours in the room to practice, set it up, and play with Docker.

At the end of the tutorial, I was almost there, and a bit of extra effort led me to successfully set up ShinyProxy a few days later. And more importantly, to understand every step of the way.

Yay o/

Plus, I learned how to deploy Shiny apps with a simple Docker container (without ShinyProxy), which pairs nicely with golem, a recently developed package to help with building Shiny apps.
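For the simple-container route, a minimal Dockerfile can be enough. This is my own sketch (not from the tutorial), using the rocker/shiny base image; the `app/` directory and package list are placeholders for your own app:

```dockerfile
# Hypothetical minimal Dockerfile for serving a Shiny app
FROM rocker/shiny:latest

# Install the packages your app needs
RUN R -e "install.packages(c('shiny', 'dplyr'))"

# Copy your app into the image's default Shiny Server directory
COPY app/ /srv/shiny-server/myapp/

EXPOSE 3838
```

Then `docker build -t myapp .` and `docker run -p 3838:3838 myapp`, and the app is served on port 3838.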

If you want to experiment with shinyproxy, give it a try here: Getting Started with ShinyProxy.

Who’s in the place?

Something I found quite interesting during the conference was how many of the talks were given by R agencies.

They don’t necessarily call themselves that. Each is a team of several R developers that does consultancy work for other companies.

A little bit like me, except they’re multiple people.

So they come to this conference, and one of the ways they put their name out there is by giving talks.

They show the open-source tools they have developed, customer success stories, etc.

It’s quite interesting to hear about other actors and be inspired by them!

So I discovered:

Cynkra, a Zurich-based data consultancy with a strong focus on R

According to their website, they provide 3 major services:

  • Tidy up: A 2-day workshop to learn the basics of data cleaning and reporting
  • R Consulting: Help with picking the right tools, implementing solutions, training and code review.
  • Open-source: Contributing to the R community

They’re still only two people, and the company was founded recently, in early 2018.

Their website looks neat and they have multiple open-source projects: cynkra

MiraiSolutions, another Zurich-based data consultancy

MiraiSolutions is also based in Zurich, except they are much bigger.

I can see 14 people on their website, and they seem to cover a larger part of the field.

They do R, Python, Julia, and even Matlab!

I just learned they are the developers behind XLConnect. Not sure if you’ve already tried to work with Excel files, but this is one of the best packages out there to do that.

They published their useR post before mine, so here is a link to their article if you want to have another excerpt of the conference: Impressions from useR! 2019

Appsilon, End to End Data Science Consulting

I already knew Appsilon, as they once answered a tweet of mine asking for help.

They’re based in Poland and are quite active in the open-source world.

You might not know about their packages, even if you’re a Shiny developer. They have five of them:

  • shiny.semantic to improve the visual of your Shiny app
  • semantic.dashboard to build a dashboard (aren’t you bored of always seeing the same shinydashboard?)
  • shiny.router to do URL routing (no idea what this is)
  • shiny.i18n to create a multi-language app
  • shiny.info to show diagnostic information during the development of the app

My favorite slide from Filip Stachura’s (CEO of Appsilon) talk was this one:

[Slide from Filip Stachura’s talk]

as it shows in a few words the biggest strengths of Shiny:

  • Fast prototyping
  • Clean structure
  • No limitations

He wrote a recap of useR!2019 so I invite you to check it out as well: useR!2019 Toulouse recap.

Jumping Rivers, Data Science Training and Consultancy

Based in Scotland, Jumping Rivers is a team of 9 people doing consultancy and training in R.

They’re even a certified RStudio Partner, which, to my understanding, is a rare thing (there are only 10 partners according to the RStudio Partners page).

I like how they detailed case studies of previous work.

Something that’s still to be done on my website…

Anyway. Check them out here: https://www.jumpingrivers.com/.

ThinkR, R Expertise and R Training

ThinkR is, to my knowledge, the only company specialized in R in France.

At least the only company with more than 1 person.

I knew them before going to the conference, and they had a small desk there so it was super easy to meet and discuss with them.

I could get a few stickers and even interview their CEO, Vincent Guyader. I’ll publish the interview in a few days.

They have a blog with great articles that I highly recommend (as I regularly stumble upon them when googling my R issues): R Task Blog.

Packages to try

A common thread across the talks at useR!2019 was the presentation of a new (or not-so-new) package.

The community is probably more active than ever.

There are more and more packages on CRAN.

And even more that stay in their GitHub repositories.

Here is the list of packages I want to try as soon as I have the opportunity to:

shinyEventLogger: Log events in shiny applications

shinyEventLogger is on CRAN and comes with a vignette.

There aren’t so many tools out there to log events in Shiny.

You have three options:

  1. Look at the files in /var/log/shiny-server/ where your app is hosted (read how to Deploy the app on your server).
  2. Add print statements whenever you want to print something to the log files.
  3. Or use this shinyEventLogger package.

You can easily log the value of a variable, the results of tests, etc.

Then, the logs can be printed to the R console (and find themselves in the log files at /var/log/shiny-server/) but also displayed in the Javascript console or even stored in a database.

This makes searching through logs MUCH easier than the default method.

Here is a use case I have imagined: You can log the duration of an event. So, why not record the duration of a lengthy computation?

As more and more people use your app, or as more data is gathered in your database, this computation could take more and more time. Then, you could set up an alarm that sends you an email when it gets higher than a certain threshold.
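A minimal sketch of that use case, based on my reading of the package (function names like set_logging(), log_event(), and log_value() are from its docs, but the exact arguments may differ):

```r
library(shiny)
library(shinyEventLogger)

# Log to the R console (and hence to the shiny-server log files)
set_logging(r_console = TRUE)

server <- function(input, output, session) {
  output$result <- renderText({
    log_event("Heavy computation")    # an EVENT entry in the logs
    started <- Sys.time()
    result <- sum(rnorm(1e6))         # stand-in for the lengthy computation
    log_value(Sys.time() - started)   # log the duration, per the use case above
    as.character(result)
  })
}
```

From there, a script watching the logs could trigger the email alert.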

promises: Abstractions for Promise-Based Asynchronous Programming

promises is on CRAN and comes with vignettes.

Asynchronous programming is running some R code in the background without keeping your current session busy.

That is super useful when you want to speed up a Shiny app!

The most important point is not so much speeding up the computation itself (that’s a form of parallelization) but being able to keep navigating the Shiny app, or your R session, while the heavy computation runs in the background.

It’s even more important when multiple users share the same Shiny process, since ONE heavy computation can block MANY users at the same time.
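Here is a small sketch of the pattern, combining promises with future (the promise pipe %...>% is from the promises package; the 5-second sleep is my stand-in for a slow computation):

```r
library(shiny)
library(promises)
library(future)
plan(multisession)  # run futures in background R sessions

server <- function(input, output, session) {
  output$summary <- renderText({
    # The heavy work runs in the background; the session stays responsive
    future({
      Sys.sleep(5)          # stand-in for a slow computation
      mean(rnorm(1e6))
    }) %...>%               # promise pipe: runs once the future resolves
      round(3) %...>%
      as.character()
  })
}
```

While that future runs, other outputs and other users keep working.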

I don’t use it much, even though I think it has fantastic potential.

They have many vignettes so.. start there!

future: Unified Parallel and Distributed Processing in R for Everyone

future is on CRAN and comes with a vignette.

Have you already tried to run computations in parallel in R?

It’s not super easy.

And it’s hard to understand how that works exactly.

The purpose of future is to unify the existing parallelization libraries. The other libraries are not discarded, of course, but future offers a single API on top of them. It’s super easy to use.
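In practice it looks like this (the 2-second sleep is just my stand-in for heavy work):

```r
library(future)
plan(multisession)  # background R sessions; other plans: sequential, multicore, cluster

# future() starts the computation without blocking the current session
f <- future({
  Sys.sleep(2)       # stand-in for heavy work
  sum(rnorm(1e6))
})

# Do other things here, then collect the result
# (value() blocks only if the future isn't done yet)
value(f)
```

Switching from parallel to sequential execution is just a different plan() call; the rest of the code stays the same.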

I’ve already tried to use it for recent work, but it turned out that parallelizing took longer than running the computation sequentially.

Yea.. sometimes it’s not faster because of the overhead cost.

That’s not a reason to not keep trying it though!

shinymeta: Record and Expose Shiny app Logic using Metaprogramming

shinymeta is not on CRAN yet, but there is a tutorial.

shinymeta is still brand new and was presented for the first time by Joe Cheng at useR!2019.

He told us he picked this topic for his keynote session.. without having the solution. And that what they had was terrible until about one week before the conference.

So.. The “lifecycle: experimental” icon is not there for nothing, I guess.

What is this about?

The goal of shinymeta is to extract the logic of a Shiny app and run it outside Shiny. For example in an RMarkdown report or even in a simple R script.

It will record what you do with the app, and then automatically extract the code to reproduce the steps without the Shiny app. So it will:

  • Create the variables accordingly (when you play with the widgets)
  • Extract the logic (building a chart, or a table)

so that you already have an autonomous R script that can replicate the same chart or table.

Awesome for making Shiny analysis reproducible!
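From the tutorial, the core idea looks roughly like this. Since the package is experimental, this is only a sketch and the API may well have changed:

```r
library(shiny)
library(shinymeta)

server <- function(input, output, session) {
  # metaReactive() records the code, not just the value
  data <- metaReactive({
    subset(iris, Species == ..(input$species))  # ..() inlines the widget's value
  })

  output$plot <- metaRender(renderPlot, {
    plot(..(data())$Sepal.Length)
  })

  # expandChain() expands the recorded logic into a standalone script
  observeEvent(input$show_code, {
    print(expandChain(output$plot()))
  })
}
```

The expanded code is plain R, with the widget values baked in as ordinary variables.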

polite: Be nice on the web

polite is not on CRAN yet, but there is a tutorial.

I have done my share of web scraping.

What do you do when you need data from a website?

You fire up your RSelenium and start scraping it like crazy!

Well.. no.

The goal of polite is to scrape politely by seeking permission, taking slowly, and never asking twice.

I recommend reading through the tutorial to learn the good practices, and then using them from now on.
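The basic workflow, as I understand it from the docs (the URL and CSS selector here are just placeholders):

```r
library(polite)
library(rvest)

# bow() reads robots.txt and asks for permission before anything else
session <- bow("https://www.r-project.org/", user_agent = "me@example.com")

# scrape() respects the crawl delay and caches results, so it never asks twice
page <- scrape(session)
html_text(html_elements(page, "h1"))
```

Three verbs (bow, nod, scrape) and the good manners are handled for you.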

At least that’s what I plan to do the next time I’ll need to scrape some data!

SuperLearner: Super Learner Prediction

SuperLearner is on CRAN and comes with a vignette.

I missed this talk but want to give it a try!

I am so frustrated with the many different machine-learning packages that each have their own syntax.

So when a package such as SuperLearner comes around and tells me there is another way..

I want to try it.

I say “comes around” but.. it’s a pretty old package.
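The promise, as far as I can tell from the vignette, is one call and one syntax for an ensemble of learners. A toy sketch with invented data (SL.mean and SL.lm are wrappers shipped with the package):

```r
library(SuperLearner)

# Toy data: predict a numeric outcome from two features
set.seed(42)
X <- data.frame(x1 = rnorm(100), x2 = rnorm(100))
Y <- X$x1 + rnorm(100)

# One call: the ensemble weights each learner by cross-validated performance
fit <- SuperLearner(
  Y = Y, X = X, family = gaussian(),
  SL.library = c("SL.mean", "SL.lm")
)
fit$coef   # weight given to each learner
```

Swapping learners in and out is just editing the SL.library vector.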

auth0: Secure Authentication in Shiny with Auth0

auth0 is on CRAN and comes with a tutorial.

One of the most common requests I get from my clients once I have set up a Shiny app for them is..

How do I keep strangers locked out of my app?

They want it accessible from anywhere, so open to the public internet, but they don’t want to show their business secrets to everyone.

One solution: Authentication.

There are many ways to set up authentication for a Shiny app nowadays.

One of them is to use auth0.

I have already set it up and it’s fairly easy. Now I’ve discovered they have made it even easier by providing this package.
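From the package docs, the setup looks as simple as this sketch (you first fill an _auth0.yml file with your own Auth0 credentials; the template generator and wrapper function names below are from the docs):

```r
library(shiny)
library(auth0)

# auth0::use_auth0() generates an _auth0.yml template to fill
# with your Auth0 domain and client credentials

ui <- fluidPage(h1("Only authenticated users see this"))
server <- function(input, output, session) {}

# shinyAppAuth0() wraps the app behind an Auth0 login screen
shinyAppAuth0(ui, server)
```

The only change from a regular app is the final wrapper call.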

No excuse to not set up authentication on your apps anymore!

All in all

It was the first time I participated in the useR conference. I had no idea what to expect, but the conference blew my (non-existent) expectations away!

There are many things I would have liked to cover in this article but.. that’s already a ton of material.

I learned a lot. And I now have a way-too-long list of other things to learn.

It was great to meet so many R users.

I now have a much better idea of who works in the R ecosystem.

I just can’t wait for the next event.

I hope to see you there!
