sharanry.github.io
The Tensorflow Graph Problem
This post is an effort to demonstrate and provide possible solutions for TensorFlow’s graph problem with PyMC4. Recall that TensorFlow represents calculations as a computation graph, and even for very simple models, the PyMC4 computation graph can be very complex…
masalmon.eu
Where to get help with your R question?
I’ll start this post with a general comment for newbies who aren’t at ease enough yet to post their question, no matter what type of questions (see later), to anywhere public: find your safe and friendly space! Do you have an R friend? Maybe this colleague who actually convinced you to start using R? Well, ask this person for help! Remind them it’s their fault you’re struggling…
lenkiefer.com
Connected scatterplot
On Twitter, Claus Wilke asks: “Dear Lazyweb: Is there an accepted name for a plot showing a two-variable time series as a path in the x-y plane? #dataviz @Elijah_Meeks @albertocairo @lenkiefer @sharoz @dataandme pic.twitter.com/N8Edmf8qii” (Claus Wilke, @ClausWilke, July 21, 2018). I call them connected scatterplots, and we’ve made a few here…
www.justadatageek.com
Exploring Burlington County, NJ
Beginning An Exploration of Burlington County, NJ. Early last year, my family and I moved back to the Philadelphia area. We settled in Burlington County, New Jersey. Details on the county and a bit about its history can be found on Wikipedia. I wanted to start this exploration by looking at county employment…
data-chips.com
Hexagon Patterns with R
Perler beads; Perler bead hexagons; Hexagon function (blank design); Hexagon design function #1; Hexagon design function #2; Future hexagon designs. I’m not finished working on hexagons yet…
dsollberger.netlify.com
Let's see if I can make posts (mostly) through RStudio
Let’s see if I can make posts (mostly) through…
engineering.pivotal.io
Let's use Vault - Part 1
There are comments in the config to explain the what and why of the properties. After Vault is deployed we need to manually edit the ingress service. See the GitHub issue below describing the reason for this change. This will show you the ingress service…
lenkiefer.com
Maybe the Linear Probability Model isn't all bad
The Linear Probability Model (LPM) might be bad, but is it all bad? Let’s look at some conditions where the LPM might not be so bad. We’ll also look at some simple adjustments that might improve the performance of the LPM. We’ll also compare the LPM to some common alternatives. Setup Throughout most of this post, we’re going to consider a world where the LPM model is the true model…
rviews.rstudio.com
CVXR
Our next step is to generate the list of constraints. Note that, by default, the relational operators apply over all entries in a vector or matrix…
aosmith.rbind.io
Creating legends when aesthetics are constants in ggplot2
It would be nice to know which line came from which model, and adding a legend is one way to do that…
blog.wallaroolabs.com
Event Triggered Customer Segmentation
Today I’m going to show you how fast and easy it can be to set up a simple application using the Wallaroo Python API to manage an ad…
rmflight.github.io
Finding Modes Using Kernel Density Estimates
First, let’s do this in R. We need some values to work with. Plot the density estimate with the mode location. Let’s do something similar in Python. Start by generating a set of random values. Plot to show that we indeed have it right…
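The idea in the excerpt above can be sketched in plain Python. This is only an illustration, not the post's own code: the function names, the normal-reference bandwidth rule (1.06 · sd · n^(-1/5)), the grid size, and the simulated sample are all assumptions made here, and the plotting step is left out.

```python
import math
import random
import statistics


def rule_of_thumb_bandwidth(values):
    """Normal-reference rule of thumb: 1.06 * sd * n^(-1/5) (an assumed choice)."""
    return 1.06 * statistics.stdev(values) * len(values) ** (-1 / 5)


def gaussian_kde(values, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


def kde_mode(values, grid_size=512):
    """Estimate the mode as the grid point where the density estimate peaks."""
    density = gaussian_kde(values, rule_of_thumb_bandwidth(values))
    lo, hi = min(values), max(values)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    return max(grid, key=density)


# Hypothetical data: 500 draws from a normal distribution centred at 5.
random.seed(1234)
sample = [random.gauss(5.0, 1.0) for _ in range(500)]
mode = kde_mode(sample)
print(f"estimated mode: {mode:.2f}")
```

Evaluating the density on a regular grid and taking the maximum is the simplest approach; a finer grid or a numerical optimiser would refine the estimate.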
www.rdatagen.net
Randomize by, or within, cluster?
Under this design, 33 sites around the country will receive the training at some point, which is no small task (and fortunately as the statistician, this is a part of the study I have little…
www.jessemaegan.com
When in doubt, optimize for joy
My summer started with a bang: The direction the world was pushing me towards and the direction that I wanted to go in were firmly at…
rviews.rstudio.com
Monte Carlo Shiny: Part Three
The sidebar looks like this Let’s dissect one of those fluid rows line-by-line. Finally, we set “SPY” as the default initial value. If the user does nothing, the value will be this default…
aosmith.rbind.io
Simulate! Simulate! - Part3
One of the things I like about simulations is that, with practice, they can be a quick way to check your intuition about a model or relationship. My most recent example is based on a discussion with a student about quadratic effects…
lenkiefer.com
U.S. housing starts are still super low
I try not to use too much jargon (jargon monoxide can be deadly) on this blog. But I’ve got a bit of a technical term I’ve been using to describe U.S. residential construction: super-low…
sciathlon.github.io
Elevation gain run data analysis
Hey everyone! This article is part 2 (see part 1) of last week’s piece about analysing my running data from Strava, and this time it is about my elevation gain data. Again, I am using the rStrava package…
www.rostrum.blog
Footballers are younger than you
Matt Dray. TL;DR: I wrote an R Shiny app that tells you how many players at World Cup 2018 were younger than you. It’s designed to make you feel old. You’re welcome…
lenkiefer.com
How bad is a Linear Probability Model?
I think a lot about predicting/forecasting binary outcomes…
yihui.name/en
The 'Invisible Baby' Metaphor
Be kind: you don’t know the constraints they have. This reminds me of a metaphor I have had on my mind for a long time: the “invisible baby”…
www.tidyverse.org/articles
broom 0.5.0
These changes will most likely affect you when you: Deprecated tidiers still return data frames. Tidiers for mixed models also return data frames. broom 0.5.0 introduces tidiers for: In addition to these new tidiers, this release includes fixes for a large number of bugs in existing tidiers…
www.nomadic-hacker.com
Announcing
Richard Feynman once said: What I cannot create, I do not understand. To understand Neural Networks, and how the recent Machine Learning evolution happened…
nowosad.github.io
Finding similar local landscapes
Can you guess what we are missing in this long list? Yes, you’re (probably) right! Finding places with similar spatial patterns is not a standard spatial operation. However, it could be useful to solve many potential questions…
www.nomadic-hacker.com
From Zero to GPU 1 - A new neural network is born
Welcome to the first post in my From Zero to GPU series, where I talk about aspects of neural network implementations…
masalmon.eu
Get on your soapbox! R blog content and promotion
There are plenty of R things you could blog about! Then, an important aspect of R blogging is that you don’t need to blog about R code! Examples of relevant R content I’ve seen include: Now, if you still have trouble finding ideas, you can find inspiration by…
www.ifconfig.it/hugo
War stories - The Docking Station
This story starts with a phone call at night. If you worked in IT long enough you know what it means…
evangelinereynolds.netlify.com
Where should you declare aesthetics? Globally, or geom-by-geom?
Where should you declare aesthetics? Globally or in the geom_*() function? The answer to this question is, in some sense, personal preference, because there are simply different ways to get the same job done in the ggplot architecture. My preference is declaring all aesthetic mappings as global unless there are conflicts…
djnavarro.net
Day 67-81
The motivation to face the fear is similarly straightforward: my R code runs too slowly for some of the problems I care about. It doesn’t come up that often, to be honest. Most problems I work on are small enough that it really doesn’t matter that my R code is…
magesblog.com
Hierarchical loss reserving with growth curves using brms
The abstract of the paper motivates the model well: As usual, it is the aim to predict the future claims payments for the various accident years…
www.semidocumentedlife.com
how should I get started with R?
Here’s some evergreen advice from David Robinson: Many of the folks I talk to about learning R have little or no experience with “real” programming languages, which described me when I first installed the language. If you’re in this camp, I have a few recommendations to get started…
www.carlbfrederick.com
Concentration of Senate Representation
Since the data are readily available, I decided to look into things to decide how strongly I should argue this point in the future…
lenkiefer.com
Housing in the Golden State
I am headed out west, to California to talk housing at the Western Secondary Market Conference. After my talk they might post my slides online somewhere. If they do I’ll link to them, but for now you can get a preview in this twitter thread…
lenkiefer.com
New Blog Style!
I decided to switch over my blog theme. The Ghostwriter theme I used was nice, but it didn’t have a blog archive. As the number of posts grows, a blog archive is easier to search. We still have tags you can search. I’ve adopted the Hugo Blackburn theme…
yihui.name/en
No Description, Website, or Topics Provided
Usually I tend to be conservative on actively marketing my own products, but I have also seen people who seem to be totally unaware of marketing, which is a pity in my eyes. A very typical example is a GitHub repo that has no description, website URL, or topics, which looks like this: I sigh a sigh every time I see it…
irene.rbind.io
Rats to reefs
This week, I came across two news articles about a study in Nature led by Nick Graham that linked invasive rats on islands to coral reefs…
www.williamrchase.com
Saturday Success #1
I had a few science successes this week, like collecting some decent AFM images, learning purrr, or posting on this blog and tweeting (two of my outreach-related goals)…
www.williamrchase.com
Friday Fails #1
“Stop judging yourself against shiny people. Avoid the shiny people. The shiny people are a lie…
wenlong-liu.github.io
Generate a reproducible map for county-level fertilizer estimation data in U.S.A. using R
More than 70% of researchers have tried and failed to reproduce another scientist’s experiments, and more than half have failed to reproduce their own experiments. There are also some packages required to reproduce this post. If you have not installed them, please run the following code…
research.libd.org/rstatsclub
LIBD rstats club remote useR!2018 notes
Next, check the videos of the talks. There are more videos there than we can check right now but we hope to come back sometime later and check more talks. From checking Twitter, we can say that there were lots of great talks and tutorials. Here are some of the main ones we found in this hour…
yihui.name/en
R Markdown
I don’t know other people’s secrets on how to create successful software, but my experience is that if you create a software package that you like to use by yourself on a daily basis, it is likely to succeed and will be used by many other people, too…
yihui.name/en
The User-Developer Spectrum in the R Ecosystem
Markdown is for 90% of the results from 10% of the effort. Very well said. I also find this odd: “If R seems a bit confusing, disorganized, and perhaps incoherent at times, in some ways that’s because so is data analysis…
blog.wallaroolabs.com
Detecting Spam as it happens
Suppose your social network for chinchilla owners has taken off. Your flagship app contains an embedded chat client, where community members discuss chinchilla-related topics in real-time. As your user base grows, so does its value as a target for advertising…
cattleguard.github.io
How To Apply Google's CausalImpact Package to Analyze Infosec Intervention
Google released their CausalImpact package a few years ago and when they did my mind started racing with ideas for information security and information risk applications. Imagine if you could propose a control, policy change or process improvement with an expected effect on a response variable, which would lead you to purposefully defining a way to measure intervention outcomes…
www.rostrum.blog
How accessible is my post about accessibility?
Matt Dray. The accessibility empathy lab at the Government Digital Services building. Digital accessibility: I wrote about an accessibility workshop at the recent Sprint 18…
simplystatistics.org
Teaching R to New Users - From tapply to the Tidyverse
The intentional ambiguity of the R language, inherited from the S language, is one of its defining…
ropensci.org/technotes
phylogram
As an example, a simple three-leaf dendrogram can be created from a nested list as…
yihui.name/en
Do You Have to Use FontAwesome or Other Libraries for Web Symbols?
Certain HTML entities may not work in all web browsers, but I don’t really care…
mgb-research.netlify.com
Interaction Plots with Continuous Moderators in R
Long ago (the first half of my grad school life), I created a model for a manuscript I submitted…
yihui.name/en
One Little Thing
You can also provide the list of files programmatically, e.g., For multiple files, they are first compressed to a zip file, and the zip file will be embedded…
gcppodcast.com
VirusTotal with Emi Martínez
Emiliano has been with VirusTotal for over 10 years. He has seen the business grow from a small startup in southern Spain into a Google X moonshot under the new Chronicle bet…
yihui.name/en
Write / Don't Write the Whole Book in bookdown
don’t write your book from start to finish in bookdown. it’s too easy to hit a bug, and it’s impossible to interactively debug. also, today my figure captions stopped working. write the whole thing in #bookdown compiles seamlessly to webook, pdf epub…
yihui.name/en
Yue Jiang
Apparently, Yue Jiang (Uchiha?) is a qualified ninja with Sharingan. Once again, we love software…
cevo.com.au
3 months into an all-in AWS migration
I’m currently Tech Lead on an all-in Amazon Web Services (AWS) migration for the Australian arm of a multinational company…
www.rdatagen.net
How the odds ratio confounds
My aim here is to generate a few figures that might highlight some of these issues. With a constant odds ratio of 3, the risk ratios range from 1 to 3, and the risk differences range from almost 0 to just below 0.3…
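The ranges quoted above can be checked with a short calculation. The sketch below is an illustration written for this listing, not the post's code: the function name and the grid of baseline risks are assumptions. It holds the odds ratio at 3, converts each baseline risk to odds, scales the odds, converts back to a risk, and then reports the resulting risk ratios and risk differences.

```python
def risk_under_odds_ratio(baseline_risk, odds_ratio):
    """Risk in the comparison group implied by a baseline risk and an odds ratio."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    comparison_odds = odds_ratio * baseline_odds
    return comparison_odds / (1.0 + comparison_odds)


ODDS_RATIO = 3.0

# Assumed grid of baseline risks: 0.001, 0.002, ..., 0.999.
baseline_risks = [i / 1000 for i in range(1, 1000)]

risk_ratios = []
risk_differences = []
for p0 in baseline_risks:
    p1 = risk_under_odds_ratio(p0, ODDS_RATIO)
    risk_ratios.append(p1 / p0)
    risk_differences.append(p1 - p0)

print(f"risk ratio range:      {min(risk_ratios):.3f} to {max(risk_ratios):.3f}")
print(f"risk difference range: {min(risk_differences):.3f} to {max(risk_differences):.3f}")
```

Algebraically, the risk ratio is 3 / (1 + 2·p0), which falls from 3 toward 1 as the baseline risk p0 rises, while the risk difference 2·p0·(1 − p0) / (1 + 2·p0) peaks at about 0.268 near p0 ≈ 0.37.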
yihui.name/en
Only One Person Can Help You with That
Only one person can help you with that, and it is most effective to contact him directly. That is also what I’m deeply worried about. When users run into certain problems, only one person can help. One person managing more than 12,000 binary packages (although most should be easy to build)…
blog.rstudio.com
RStudio Connect v1.6.4.2 - Security Update
RStudio remains committed to providing the most secure product possible…
evangelinereynolds.netlify.com
Tuition Increases for Tidy Tuesday
Here’s the code for my first Tidy Tuesday…
atusy.github.io/blog
Unicode characters in XeTeX
When you type special characters such as μ or α directly in $\LaTeX{}$, they often end up rendered as □. If you are using XeTeX, \setmainfont{IPAMincho}…
www.tidyverse.org/articles
Carpe Talk
Conference talks are a great opportunity to help people learn about the cool and useful things you have built. Given all the hard work you’ve already put in, a bit of marketing effort can be a wise investment in drumming up users…
www.justadatageek.com
My Thoughts On Bitcoin
I enjoy listening to/reading the stories on the various business news outlets about cryptocurrencies – I have been casually reading about blockchain technology and its different applications, but that is not the focus of this post. While I am not an “expert” on cryptocurrencies, I have come to the opinion that they are not really currencies…
alison.rbind.io
Read data with multiple header rows into R
More options: Now we are ready to diagnose the problem! All together now: the final…
blog.zenggyu.com/en
Setting Up Visual Studio Code
This is one of a series of posts where I document software configurations for personal reference. This post documents the configurations for Visual Studio…
sciathlon.github.io
Strava time vs distance data analysis
Hello everybody, I have a new run data analysis today: from my own Strava data! I have been much slower recently on the posts, and I probably will be until the end of July, because of work for my PhD, so I’m sorry about that…
www.njtierney.com
A note on ggplot code style
I’ve got some opinions about how to write ggplot code. So, if there are more than two sections in a function, these should be separated on a…
mouse-imaging-centre.github.io/blog
Linear Models
Introduction library(tidyverse) library(matlib) library(knitr) library(RColorBrewer) The purpose of this document is to understand the parameter and residuals error estimates in a basic linear regression model when working with binary categorical…
nowosad.github.io
Pattern-based Spatial Analysis - core ideas
Take a look at the example above and compare information depicted by the compositional and co-occurrence histograms. The first one shows that each land cover category occupies a very similar proportion of the area…
www.onceupondata.com
Tidy Eval Meets ggplot2
Then you can either pass column names directly or the variable names. The previous options are good and valid in many cases, but there are some limitations. For instance, you cannot create a function and pass column names unquoted as follows…
simplystatistics.org
What Should be Done When Data Have Creators?
All this got me thinking about how screenwriters are often limited in what they can write by the fact that the material they are writing was originated by someone else…
evangelinereynolds.netlify.com
Wide data to long using the tidyverse (tidyr's gather function)
A wide data storage format is an efficient and compact way to store information. And perhaps this organization makes data easier to inspect. We have wide monitors on our laptops and desktops. However, for visualization and analysis you generally need to transform this data from the wide format to a “tidy”, long format…
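To make the wide-to-long reshaping concrete, here is a minimal sketch in plain Python rather than tidyr. The `gather` helper below is a hypothetical stand-in written for this illustration (it is not tidyr's function), and the small table of countries and years is made-up example data.

```python
def gather(rows, id_column, key_name, value_name):
    """Turn each non-id column of a wide row into its own long-format row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            long_rows.append(
                {id_column: row[id_column], key_name: column, value_name: value}
            )
    return long_rows


# Hypothetical wide table: one row per country, one column per year.
wide = [
    {"country": "A", "2016": 10, "2017": 12},
    {"country": "B", "2016": 7, "2017": 9},
]

# Long table: one row per (country, year) observation.
long = gather(wide, "country", "year", "value")
for row in long:
    print(row)
```

Each wide row with two year columns becomes two long rows, so every row of the result holds exactly one observation.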
mgb-research.netlify.com
Bayesian Multilevel Model with Missing Data
This is the first post in a three-part blog series I am putting together. The focus of this initial post is effective exploration of the reasons for missingness in a particular set of data…
ropensci.org/blog
Exploring ways to address gaps in maternal-child health research
I was aware at all times that I had only islands of knowledge separated by darkness; that I was surrounded by chasms of not-knowing, into one of which I was certain to fall. One of the best ways to start feeling less intimidated is to start talking to others. Ullman continues, I learned I was not alone…
nowosad.github.io
Life (expectancy), animated
Global socio-economic data is easily accessible nowadays…
emmavestesson.netlify.com
My first hackathon (part 2)
Gender pay gap hackathon (part 2) This is part 2 of my blog about the gender pay gap hack that I went to. You can read part 1 here. Reflections It has taken me a long time to write the second part of my experience of the hackathon…
sharanry.github.io
Non-Centered Eight Schools Model with PyMC4
We are finally at a state where we can demonstrate the use of the PyMC4 API side by side with PyMC3 and showcase the consistency in results by using non-centered eight schools model. I will be comparing the PyMC3 and PyMC4 way of doing the same task…
www.aggieerin.com
Quick and Dirty Categorical lavaan
I was tagged today on twitter asking about categorical variables in lavaan. I will say I have not done much with categorical predictors either endogenous or exogenous. I did a quick reproducible example of exogenous variables, and I will refer you to the help guide for lavaan here…
toscano84.github.io
Using R to analyse the German Federal Election
As the title of this post implies we will analyze, using the statistical programming language R, the German Federal Election which took place on 24 September of 2017. It will not be an exhaustive analysis of the results. I’m only interested in visualizing the share of the vote that each party represented in the Parliament (i.e…
mlr-blog.netlify.com
Why R Conference
This July we had the great honor to present mlr and its ecosystem at the WhyR 2018 Conference in Wroclaw in Poland…
www.tidyverse.org/articles
ggplot2 3.0.0
Install ggplot2 with: Aesthetics can now be specified independent of the scale name…
toscano84.github.io
About me
Welcome to my blog! I’m Hugo! My background is in Psychology, more specifically, in the area of face perception. Always eager to learn and after falling in love with R, I will blog mostly about data related topics (e.g. importing, wrangling, visualization, statistics). Not yet Machine Learning, still exploring it…
www.stat.cmu.edu/~ryurko
Bayesian Baby Steps
Before going into the regression example with a predictor, it’s worthwhile to first demonstrate quadratic approximation by just modeling the score differential with a Gaussian…
gcppodcast.com
Connected Games with Unity and Google Cloud with Brett Bibby and Micah Baker
As Product Manager leading the strategy for Gaming on the Google Cloud Platform, Micah is committed to enabling developers to realize their vision for great games…
yutani.rbind.io
gghighlight 0.1.0 Is Released!
gghighlight 0.1.0 is on CRAN now! One more small piece of news: gghighlight got an introductory vignette…
yihui.name/en
On 'Quick Questions'
“Hey Yihui, quick question for you…” Many times I have been asked “quick questions” on Twitter, on Slack, or in emails…
yihui.name/en
A CRANextra Repository for Homebrew and R Users on macOS
There will be no more Step 1, 2, ..…
lenkiefer.com
Exploring housing data with R and IPUMS USA
In this post I want to share some observations on housing in the United States from 1980 to 2016, share some R code for data wrangling, and tri (no that’s not a typo, just a pun) out a visualization technique. Let’s get to it. I’ve been carrying a running conversation with folks on Twitter regarding the U.S. housing market and its future…
simplystatistics.org
Cultural Differences in Map Data Visualization
The maps need to be usable, but they also need to fulfill cognitive goals on cultural levels that go beyond what any given user might know they need. For instance, in the U.S…
djnavarro.net
Day 63-66: Learning to skim
But everything is “under control” now, at least for a very expansive definition of “control”. The kids are keeping themselves occupied, I’ve responded to some overdue emails, and I’m making inroads into the Saturday morning laundry…
www.rostrum.blog
Markov-chaining my PhD thesis
Matt Dray. Doc rot: I wrote a PhD thesis in 2014 called Effects of multiple environmental stressors on litter chemical composition and decomposition. See my viva presentation slides here if you don’t really like words…
divingintogeneticsandgenomics.rbind.io
my first try on Rmarkdown using blogdown
I have used blogdown for writing regular markdown posts, but the real power comes from Rmarkdown! Let me try it for this post. Note that you do not knit the Rmarkdown yourself; rather, you let blogdown do the heavy lifting. It is awesome! blogdown will knit and render the code and the output into an HTML file…
divingintogeneticsandgenomics.rbind.io
hugo academic theme blog down deployment (some details)
It is quite straightforward to have a working site following Alison’s guide. However, you always want some customization of your own site. I took the tips from Leslie and changed to twitter: from to to from to The widget will be gone from the home page. from to Now, you can check the traffic reports in real time…
blog.wallaroolabs.com
Real-time Streaming Pattern
Introduction This week, I will continue to look at data processing patterns used to build event triggered stream processing applications, the use cases that the patterns relate to, and how you would go about implementing within…
blog.sellorm.com
Running Python in the RStudio IDE
I’ve had a few different people ask me variants of the same question lately, which is: “How can I run python code on a server in a similar way to using R with RStudio Server”…
www.tidyverse.org/articles
bench 1.0.1
Install the latest version with: Results are easy to interpret, with human readable units in a rectangular data frame. You can also produce fully custom plots by un-nesting the results and working with the data directly…
divingintogeneticsandgenomics.rbind.io
Backup automatically with cron
cron is a Unix, Solaris, and Linux utility that allows tasks to be run automatically in the background at regular intervals by the cron daemon. Crontab (CRON TABle) is a file which contains the schedule of cron entries to be run at specified times…
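A crontab entry consists of five schedule fields (minute, hour, day of month, month, day of week) followed by the command to run. The sketch below splits one entry into those parts. It is only an illustration: the `parse_crontab_entry` helper, the script path, and the data directory are hypothetical, not taken from the post.

```python
CRON_FIELDS = ("minute", "hour", "day_of_month", "month", "day_of_week")


def parse_crontab_entry(line):
    """Split a crontab line into its five schedule fields and the command."""
    parts = line.split(None, 5)  # at most five splits; the command keeps its spaces
    if len(parts) < 6:
        raise ValueError("expected five schedule fields followed by a command")
    schedule = dict(zip(CRON_FIELDS, parts[:5]))
    return schedule, parts[5]


# Hypothetical backup job: run at 02:30 every Sunday (day of week 0).
entry = "30 2 * * 0 /usr/local/bin/backup.sh /home/user/data"
schedule, command = parse_crontab_entry(entry)

print(schedule)
print(command)
```

An asterisk in a field means "every value", so this entry fires on every day of the month and every month, restricted only by the day-of-week field.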