Posts by Daniel Oehm
Author: Daniel Oehm
Survivor Confessionals Data: Dataset showcase for {survivoR}
Feed: R-bloggers. Author: Daniel Oehm. Confessionals loosely represent a player’s screen time, where they talk strategy and replay events. It is an imperfect measure but can indicate success in the game. It’s often used to show balance or imbalance in the editing. This is a high-level summary of confessionals, a showcase of the dataset, and an analysis of the edit for key demographics. All code is found on GitHub and here’s a link to the package data. TL;DR In summary: Historically, women have had fewer confessionals than men and on average have been under-edited. There doesn’t appear to be a difference ... Read More
How to use multiple colour scales in ggplot with {ggnewscale}
Feed: R-bloggers. Author: Daniel Oehm. For week 23 of Tidy Tuesday, the chart I wanted to make required two colour scales. For context, the dataset detailed pride sponsors that also contributed to anti-LGBTQ+ politicians. TL;DR I wanted to draw rainbows in rainbow colours for the companies that made the HRC business pledge and in a neutral colour for the companies that hadn’t. I could use scale_colour_gradientn() for the coloured rainbows but needed a solution for the neutral rainbows. I probably could have hacked it together by assigning a colour to each line, but fortunately, I found the {ggnewscale} package by Elio ... Read More
Survivor Advantages: Dataset showcase for {survivoR}
Feed: R-bloggers. Author: Daniel Oehm. Advantages were introduced to Survivor to give players an edge and to shake up the strategy. A successful play can help advance the player further in the game but can also make the player a target if others know about it. Advantages build uncertainty into the game and prompt players to adapt. Advantages, particularly hidden immunity idols, are now integral to the game of Survivor. I will showcase two of my favourite new datasets in the {survivoR} package: advantage_details and advantage_movement. I’ll walk through the history of advantages in Survivor to demonstrate how the datasets can be ... Read More
survivoR v1.0 is now on CRAN
Feed: R-bloggers. Author: Daniel Oehm. I’m happy to announce that survivoR v1.0 is now on CRAN. The package now contains all the features intended for the first major release. A big thank you to Carly Levitz for helping collate and test the data. This post details the major updates since v0.9.12. For a complete list of tables and features of the package, please visit the GitHub page. To jump right into it, you can install the package with install.packages("survivoR") Or from GitHub with devtools::install_github("doehm/survivoR") If you find any issues, please raise them on GitHub and I’ll correct them asap. For ... Read More
survivoR now on CRAN!
Feed: R-bloggers. Author: Daniel Oehm. I am pleased to announce survivoR 0.9.2 is now on CRAN. The survivoR data package is a collection of datasets detailing events and the cast across all 40 seasons of the US Survivor, including castaway information, vote history, immunity and reward challenge winners, jury votes, and viewers. It also includes season and tribe colour palettes, and ggplot2 scale functions. To install, simply enter install.packages("survivoR") Or from GitHub with the following devtools::install_github("doehm/survivoR") For more details on the content of the package, please follow the link below or visit the GitHub page. I intend to update the ... Read More
survivoR | Data from the TV series in R
Feed: R-bloggers. Author: Daniel Oehm. 596 episodes. 40 seasons. 1 package! I’m a pretty big fan of Survivor and have religiously watched every season since the first. With 40 seasons under its belt, there’s a tonne of data to dive into. However, getting that data in one place has been tedious. Hence, the survivoR package. survivoR is a collection of datasets detailing events across all 40 seasons of the US Survivor, including castaway information, vote history, immunity and reward challenge winners, jury votes, and viewers. Installation: Currently, the package exists on GitHub and can be installed with the following code ... Read More
Some basics and intuition behind GAN’s in R and Python
Feed: R-bloggers. Author: Daniel Oehm. Generative Adversarial Networks are great for generating something from essentially nothing and there are some interesting uses for them. Most uses are some sort of image processing. Nvidia’s GauGAN is an impressive example, giving the user an MS Paint-like interface and generating landscapes. You can give the beta a shot here. I wanted to take a step back and use a small example to understand the basics and build some intuition behind GANs. There’s a tonne of information out there on how to fit a GAN to generate new hand-drawn numbers, faces or Pokemon ... Read More
Simulating data with Bayesian networks
Feed: R-bloggers. Author: Daniel Oehm. Bayesian networks are really useful for many applications and one of those is to simulate new data. Bayes nets represent data as a probabilistic graph and from this structure it is then easy to simulate new data. This post will demonstrate how to do this with bnlearn. Fit a Bayesian network Before simulating new data we need a model to simulate data from. Using the same Australian Institute of Sport dataset from my previous post on Bayesian networks we’ll set up a simple model. For convenience I’ll subset the data to 6 variables. library(DAAG) library(tidyverse) ... Read More
Use more of your data with matrix factorisation
Feed: R-bloggers. Author: Daniel Oehm. Previously I posted on how to apply gradient descent on linear regression as an example. With that as background, it’s relatively easy to extend the logic to other problems. One of those is matrix factorisation. There are many ways to factorise a matrix into components, such as PCA or singular value decomposition (SVD), but one way is to use gradient descent. While its inception is in image processing, it was popularised by its use with recommender systems (Funk SVD). It is also useful in other areas such as market basket analysis and topic modelling ... Read More
Fun with progress bars: Fish, daggers and the Star Wars trench run
Feed: R-bloggers. Author: Daniel Oehm. If you’re like me, when running a process through a loop you’ll add in counters and progress indicators. That way you’ll know if it will take 5 minutes or much longer. It’s also good for debugging to know when the code wigged out. This is typically what’s done. You take a time stamp at the start – start <- Sys.time(), print out some indicators at each iteration – cat("iteration", k, "// reading file", file, "\n") and print out how long it took at the end – print(Sys.time()-start). The problem is it will print out a new line ... Read More