Posts by R-Bloggers
Author: R-Bloggers
Statistics Sunday: Some Psychometric Tricks in R
It’s been a long time since I’ve posted a Statistics Sunday post! Now that I’ve moved out of my apartment and into my house, I have a bit more time on my hands, but work has been quite busy. Today, I’m preparing for 2 upcoming standard-setting studies by drawing a sample of items from 2 of our exams. So I thought I’d share what I’m up to in order to pass on some of the new psychometric tricks I’ve learned to help me with this project. Because I can’t ...
Statistics Sunday: What Should I Read Next?
When You Need a New Book to Read: I log all of my books on Goodreads. On top of that, whenever I hear about a new book I have to read, I add it on Goodreads, so I remember it. Of course, this means my Goodreads bookshelves are a little out of control. Fortunately, I can use R to dig through my Goodreads to-read shelf and figure out the next book to read and/or buy. If you’re on Goodreads, you can easily download your entire bookshelf, including your to-read books, by going to “My Books” then clicking ...
Statistics Sunday: Visualizing Regression
I had some much needed downtime this weekend, after an exhausting week, along with some self-care – Saturday I had a one-hour deep tissue massage, which left me a little bruised but much more relaxed, and Sunday I spent a few hours in the salon chair having my color touched up, which left me much blonder. Which is why I’m a little late with my Statistics Sunday post, but today, I’m introducing another recently discovered R package: rpart. Short for “recursive partitioning,” this package creates decision trees for classification, regression, and survival analyses ...
Statistics Sunday: Using Text Analysis to Become a Better Writer
We all have words we love to use, and that we perhaps use too much. As an example: I have a tendency to use the same transitional statements, to the point that, before I submit a manuscript, I do a find all to see how many times I’ve used some of my favorites, e.g., additionally, though, and so on. I’m sure we all have our own words we use way too often. Text analysis can also be used to discover patterns in writing, and for a writer, may be ...
Topics and Categories in the Russian Troll Tweets
I decided to return to the analysis I conducted for the IRA tweets dataset. (You can read up on that analysis and R code here.) Specifically, I returned to the LDA results, which looked like they lined up pretty well with the account categories identified by Darren Linvill and Patrick Warren. But with slightly altered code, we can confirm that or see if there’s more to the topics data than meets the eye. (Spoiler alert: There is more than meets the eye.) I reran much of the original ...
Statistics Sunday: Getting Started with the Russian Tweet Dataset
IRA Tweet Data: You may have heard that two researchers at Clemson University analyzed almost 3 million tweets from the Internet Research Agency (IRA) – a “Russian troll factory”. In partnership with FiveThirtyEight, they made all of their data available on GitHub. So of course, I had to read the files into R, which I was able to do with this code:

    files <- c("IRAhandle_tweets_1.csv", "IRAhandle_tweets_2.csv", "IRAhandle_tweets_3.csv",
               "IRAhandle_tweets_4.csv", "IRAhandle_tweets_5.csv", "IRAhandle_tweets_6.csv",
               "IRAhandle_tweets_7.csv", "IRAhandle_tweets_8.csv", "IRAhandle_tweets_9.csv")
    my_files <- paste0("~/Downloads/russian-troll-tweets-master/", files)
    each_file <- function(file) {
      tweet <- read_csv(file)
    }
    library(tidyverse)
    tweet_data <- NULL
    for (file in my_files) {
      temp <- each_file(file)
      temp$id <- sub(".csv", "", file)
      tweet_data <- rbind(tweet_data, temp)
    }

Note that this is a large file, with 2,973,371 observations of 16 variables. Let’s do ...
Statistics Sunday: Highlighting a Subset of Data in ggplot2
Highlighting Specific Cases in ggplot2: Here’s my belated Statistics Sunday post, using a cool technique I just learned about: gghighlight. This R package works with ggplot2 to highlight a subset of data. To demonstrate, I’ll use a dataset I analyzed for a previous post about my 2017 reading habits. [Side note: My reading goal for this year is 60 books, and I’m already at 43! I may have to increase my goal at some point.]

    setwd("~/R")
    library(tidyverse)
    books <- read_csv("2017_books.csv", col_names = TRUE)
    ## Warning: Duplicated column names deduplicated: 'Author' => 'Author_1' [13]
    ## Parsed with column specification:
    ## cols(
    ##   .default ...
Stats Note: Making Sense of Open-Ended Responses with Text Analysis
Using Text Mining on Open Ended Items: Good survey design is both art and science. You have to think about how people will read and process your questions, and what sorts of responses might result from different question forms and wording. One of the big rules I follow in survey design is that you don’t assess any of your most important topics with an open-ended item. Most people skip them, because they’re more work than selecting options from a list, and people who do complete them may give you terse, unhelpful, or gibberish answers. When I was working ...
Statistics Sunday: More Text Analysis – Term Frequency and Inverse Document Frequency
As a mixed methods researcher, I love working with qualitative data, but I also love the idea of using quantitative methods to add some meaning and context to the words. This is the main reason I’ve started digging into using R for text mining, and these skills have paid off in not only fun blog posts about Taylor Swift, Lorde, and “Hotel California”, but also in analyzing data for my job (blog post about that to be posted soon). So today, I thought I’d keep moving forward to ...
Statistics Sunday: Converting Between Effect Sizes for Meta-Analysis
I’m currently working on my promised video on mixed effects meta-analysis, and was planning on covering this particular topic in that video – converting between effect sizes. But I decided to do this as a separate post that I can reference in the video, which I hope to post next week. As a brief refresher, meta-analysis is aimed at estimating the true effect (or effects) in an area of study by combining findings from multiple studies on that topic. Effect sizes, the most frequently used being Cohen’s d, Pearson’s r, and log ...