It's time for statistics departments to start supporting their applied students

Statistics departments are failing their applied students. In this post, I have a lot of opinions and give two pieces of advice: statistics departments need to start supporting their applied students, and they need to hire applied faculty.

Rebecca Barter

I graduated with a PhD from UC Berkeley’s statistics department in December. My PhD dissertation consisted of three 100% applied projects (one of which was a piece of open-source software). This is, unfortunately, incredibly rare. Over the past few years, I’ve had a number of current and prospective statistics PhD students both at Berkeley and outside Berkeley get in touch with me to ask me how I made my way through a statistics PhD by working only on applied projects.

Across (dplyr 1.0.0): applying dplyr functions simultaneously across multiple columns

dplyr 1.0.0 introduces a few new features, the biggest of which is across(), which supersedes the scoped versions of dplyr functions.

Rebecca Barter

Contents: Select helpers: selecting columns to apply the function to; Using in-line functions with across; A mutate example; A select example

I often find that I want to use a dplyr function on multiple columns at once. For instance, perhaps I want to scale all of the numeric variables at once using a mutate function, or I want to provide the same summary for three of my variables.
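As a rough illustration of those two use cases, here is a minimal sketch using across() inside mutate() and summarise(). The data frame `df` and its column names (`group`, `height`, `weight`, `age`) are made up for this example and do not come from the post.

```r
library(dplyr)

# Toy data: the columns are hypothetical, chosen only for illustration.
df <- tibble(
  group  = c("a", "a", "b", "b"),
  height = c(150, 160, 170, 180),
  weight = c(50, 60, 70, 80),
  age    = c(20, 30, 40, 50)
)

# Scale every numeric column at once: across() applies the in-line
# function to each column selected by where(is.numeric).
scaled <- df %>%
  mutate(across(where(is.numeric), ~ (.x - mean(.x)) / sd(.x)))

# Provide the same summary (the mean) for three named variables, by group.
means <- df %>%
  group_by(group) %>%
  summarise(across(c(height, weight, age), mean), .groups = "drop")
```

Here `scaled` keeps the non-numeric `group` column untouched, while `means` has one row per group and one column per summarised variable.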

Tidymodels: tidy machine learning in R

The tidyverse's take on machine learning is finally here. Tidymodels forms the basis of tidy machine learning, and this post provides a whirlwind tour to get you started.

Rebecca Barter

Contents: What is tidymodels; Getting set up; Split into train/test; Define a recipe; Specify the model; Put it all together in a workflow; Tune the parameters; Finalize the workflow; Evaluate the model on the test set; Fitting and using your final model; Variable importance

There’s a new modeling pipeline in town: tidymodels. Over the past few years, tidymodels has been gradually emerging as the tidyverse’s machine learning toolkit.
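To give a flavor of that pipeline, the sketch below strings together a train/test split, a recipe, a model specification, and a workflow, then evaluates on the test set. It is only an assumption-laden outline: it uses the built-in `mtcars` data and a plain linear model rather than whatever the post uses, and it skips the tuning and variable importance steps entirely.

```r
library(tidymodels)

set.seed(123)

# Split into train/test (mtcars is a stand-in dataset, not the post's data).
car_split <- initial_split(mtcars, prop = 0.75)
car_train <- training(car_split)
car_test  <- testing(car_split)

# Define a recipe: predict mpg from all other columns, normalizing predictors.
car_recipe <- recipe(mpg ~ ., data = car_train) %>%
  step_normalize(all_predictors())

# Specify the model (a linear regression here, chosen only for simplicity).
lm_spec <- linear_reg() %>%
  set_engine("lm")

# Put it all together in a workflow.
car_workflow <- workflow() %>%
  add_recipe(car_recipe) %>%
  add_model(lm_spec)

# Fit on the training set and evaluate on the test set in one step.
car_fit <- last_fit(car_workflow, car_split)
car_metrics <- collect_metrics(car_fit)
```

With a model that has tunable parameters, a tuning step would sit between building the workflow and the final fit; it is omitted here to keep the sketch short.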