sharanry.github.io

The Tensorflow Graph Problem

This post is an effort to demonstrate and provide possible solutions for TensorFlow’s graph problem with PyMC4. Recall that TensorFlow represents calculations as a computation graph, and even for very simple models, the PyMC4 computation graph can be very complex


masalmon.eu

Where to get help with your R question?

I’ll start this post with a general comment for newbies who aren’t at ease enough yet to post their question, no matter what type of questions (see later), to anywhere public: find your safe and friendly space! Do you have an R friend? Maybe this colleague who actually convinced you to start using R? Well, ask this person for help! Remind them it’s their fault you’re struggling


lenkiefer.com

Connected scatterplot

On Twitter, Claus Wilke asks: “Dear Lazyweb: Is there an accepted name for a plot showing a two-variable time series as a path in the x-y plane? #dataviz” (Claus Wilke, @ClausWilke, July 21, 2018). I call them connected scatterplots, and we’ve made a few here


www.justadatageek.com

Exploring Burlington County, NJ

Beginning An Exploration of Burlington County, NJ. Early last year, my family and I moved back to the Philadelphia area. We settled in Burlington County, New Jersey. Details on the county and a bit about its history can be found on Wikipedia. I wanted to start this exploration by looking at county employment


data-chips.com

Hexagon Patterns with R

Perler beads; Perler bead hexagons; Hexagon function (blank design); Hexagon design function #1; Hexagon design function #2; Future hexagon designs. I’m not finished working on hexagons yet


dsollberger.netlify.com

Let's see if I can make posts (mostly) through RStudio

Let’s see if I can make posts (mostly) through


engineering.pivotal.io

Let's use Vault - Part 1

There are comments in the config to explain the what and why of the properties. After Vault is deployed we need to manually edit the ingress service. See the GitHub issue below describing the reason for this change. This will show you the ingress service


lenkiefer.com

Maybe the Linear Probability Model isn't all bad

The Linear Probability Model (LPM) might be bad, but is it all bad? Let’s look at some conditions where the LPM might not be so bad. We’ll also look at some simple adjustments that might improve the performance of the LPM, and compare the LPM to some common alternatives. Setup: Throughout most of this post, we’re going to consider a world where the LPM is the true model


rviews.rstudio.com

CVXR

Our next step is to generate the list of constraints. Note that, by default, the relational operators apply over all entries in a vector or matrix


aosmith.rbind.io

Creating legends when aesthetics are constants in ggplot2

It would be nice to know which line came from which model, and adding a legend is one way to do that


blog.wallaroolabs.com

Event Triggered Customer Segmentation

Today I’m going to show you how fast and easy it can be to set up a simple application using the Wallaroo Python API to manage an ad


rmflight.github.io

Finding Modes Using Kernel Density Estimates

First, let’s do this in R. Need some values to work with. Plot the density estimate with the mode location. Let’s do something similar in Python. Start by generating a set of random values. Plot to show indeed we have it right
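A rough sketch of the idea in the excerpt above, in plain Python: estimate a density from random values and take the location of its peak as the mode. This is not the post's own code. The helper names (`gaussian_kde`, `kde_mode`), the normal-reference bandwidth rule, the grid size, and the simulated sample are all assumptions made for illustration.

```python
import math
import random


def gaussian_kde(values, bandwidth):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        )

    return density


def kde_mode(values, grid_size=256):
    """Estimate the mode as the grid point where the density estimate peaks."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    # Assumption: normal-reference bandwidth rule, 1.06 * sd * n^(-1/5).
    bandwidth = 1.06 * sd * n ** (-0.2)
    density = gaussian_kde(values, bandwidth)
    lo, hi = min(values), max(values)
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]
    return max(grid, key=density)


# Hypothetical data: 1000 draws from a normal distribution centred at 5.
random.seed(1)
sample = [random.gauss(5.0, 1.0) for _ in range(1000)]
mode = kde_mode(sample)
print(f"estimated mode: {mode:.2f}")
```

Evaluating the density on a grid keeps the sketch dependency-free; a finer grid or a numerical optimiser would sharpen the estimate at the cost of more computation.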


www.rdatagen.net

Randomize by, or within, cluster?

Under this design, 33 sites around the country will receive the training at some point, which is no small task (and fortunately as the statistician, this is a part of the study I have little


www.jessemaegan.com

When in doubt, optimize for joy

My summer started with a bang: The direction the world was pushing me towards and the direction that I wanted to go in were firmly at


rviews.rstudio.com

Monte Carlo Shiny: Part Three

The sidebar looks like this. Let’s dissect one of those fluid rows line-by-line. Finally, we set “SPY” as the default initial value. If the user does nothing, the value will be this default


aosmith.rbind.io

Simulate! Simulate! - Part3

One of the things I like about simulations is that, with practice, they can be a quick way to check your intuition about a model or relationship. My most recent example is based on a discussion with a student about quadratic effects


lenkiefer.com

U.S. housing starts are still super low

I try not to use too much jargon (jargon monoxide can be deadly) on this blog. But I’ve got a bit of a technical term I’ve been using to describe U.S. residential construction: super-low


sciathlon.github.io

Elevation gain run data analysis

Hey everyone! This article is part 2 (see part 1) of last week’s piece about analysing my running data from Strava, and this time it is about my elevation gain data. Again, I am using the rStrava package


www.rostrum.blog

Footballers are younger than you

Matt Dray. TL;DR: I wrote an R Shiny app that tells you how many players at World Cup 2018 were younger than you. It’s designed to make you feel old. You’re welcome


lenkiefer.com

How bad is a Linear Probability Model?

I think a lot about predicting/forecasting binary outcomes


rmflight.github.io

Split - Unsplit Anti-Pattern

Not bad for the test data


yihui.name/en

The 'Invisible Baby' Metaphor

Be kind: you don’t know the constraints they have. This reminds me of a metaphor I have had on my mind for a long time: the “invisible baby”


www.tidyverse.org/articles

broom 0.5.0

These changes will most likely affect you when you: Deprecated tidiers still return data frames. Tidiers for mixed models also return data frames. broom 0.5.0 introduces tidiers for: In addition to these new tidiers, this release includes fixes for a large number of bugs in existing tidiers


www.nomadic-hacker.com

Announcing

Richard Feynman once said: What I cannot create, I do not understand. To understand Neural Networks, and how the recent Machine Learning evolution happened


nowosad.github.io

Finding similar local landscapes

Can you guess what we are missing in this long list? Yes, you’re (probably) right! Finding places with similar spatial patterns is not a standard spatial operation. However, it could be useful to solve many potential questions


www.nomadic-hacker.com

From Zero to GPU 1 - A new neural network is born

Welcome to the first post in my From Zero to GPU series, where I talk about aspects of neural network implementations


masalmon.eu

Get on your soapbox! R blog content and promotion

There are plenty of R things you could blog about! Then, an important aspect of R blogging is that you don’t need to blog about R code! Examples of relevant R content I’ve seen include: Now, if you still have trouble finding ideas, you can find inspiration by..


www.ifconfig.it/hugo

War stories - The Docking Station

This story starts with a phone call at night. If you worked in IT long enough you know what it means


evangelinereynolds.netlify.com

Where should you declare aesthetics? Globally, or geom-by-geom?

Where should you declare aesthetics? Globally or in the geom_*() function? The answer to this question, in some sense is personal preference, because there are simply different ways to get the same job done in the ggplot architecture. My preference is declaring all aesthetic mappings as global unless there are conflicts


djnavarro.net

Day 67-81

The motivation to face the fear is similarly straightforward: my R code runs too slowly for some of the problems I care about. It doesn’t come up that often, to be honest. Most problems I work on are small enough that it really doesn’t matter that my R code is


www.nomadic-hacker.com

Hello, to a brand new World

So, after putting it off for years


magesblog.com

Hierarchical loss reserving with growth curves using brms

The abstract of the paper motivates the model well: As usual, it is the aim to predict the future claims payments for the various accident years


www.semidocumentedlife.com

how should I get started with R?

Here’s some evergreen advice from David Robinson: Many of the folks I talk to about learning R have little or no experience with “real” programming languages, which described myself when I first installed the language. If you’re in this camp, I have a few recommendations to get started


www.carlbfrederick.com

Concentration of Senate Representation

Since the data are readily available, I decided to look into things to decide how strongly I should argue this point in the future


lenkiefer.com

Housing in the Golden State

I am headed out west, to California to talk housing at the Western Secondary Market Conference. After my talk they might post my slides online somewhere. If they do I’ll link to them, but for now you can get a preview in this twitter thread


lenkiefer.com

New Blog Style!

I decided to switch over my blog theme. The Ghostwriter theme I used was nice, but it didn’t have a blog archive. As the number of posts grows, a blog archive is easier to search. We still have tags you can search. I’ve adopted the Hugo Blackburn theme


yihui.name/en

No Description, Website, or Topics Provided

Usually I tend to be conservative on actively marketing my own products, but I have also seen people who seem to be totally unaware of marketing, which is a pity in my eyes. A very typical example is a GitHub repo that has no description, website URL, or topics, which looks like this: I sigh a sigh every time I see it


irene.rbind.io

Rats to reefs

This week, I came across two news articles about a study in Nature led by Nick Graham that linked invasive rats on islands to coral reefs


www.williamrchase.com

Saturday Success #1

I had a few science successes this week, like collecting some decent AFM images, learning purrr, or posting on this blog and tweeting (two of my outreach-related goals)


www.williamrchase.com

Friday Fails #1

“Stop judging yourself against shiny people. Avoid the shiny people. The shiny people are a lie


wenlong-liu.github.io

Generate a reproducible map for county-level fertilizer estimation data in U.S.A. using R

More than 70% of researchers have tried and failed to reproduce another scientist’s experiments, and more than half have failed to reproduce their own experiments. There are also some packages required to reproduce this post. If you have not installed them, please run the following code


research.libd.org/rstatsclub

LIBD rstats club remote useR!2018 notes

Next, check the videos of the talks. There are more videos there than we can check right now but we hope to come back sometime later and check more talks. From checking Twitter, we can say that there were lots of great talks and tutorials. Here are some of the main ones we found in this hour


yihui.name/en

R Markdown

I don’t know other people’s secrets on how to create successful software, but my experience is that if you create a software package that you like to use by yourself on a daily basis, it is likely to succeed and will be used by many other people, too


yihui.name/en

The User-Developer Spectrum in the R Ecosystem

Markdown is for 90% of the results from 10% of the effort. Very well said. I also find this odd: “If R seems a bit confusing, disorganized, and perhaps incoherent at times, in some ways that’s because so is data analysis


blog.wallaroolabs.com

Detecting Spam as it happens

Suppose your social network for chinchilla owners has taken off. Your flagship app contains an embedded chat client, where community members discuss chinchilla-related topics in real-time. As your user base grows, so does its value as a target for advertising


cattleguard.github.io

How To Apply Google's CausalImpact Package to Analyze Infosec Intervention

Google released their CausalImpact package a few years ago, and when they did my mind started racing with ideas for information security and information risk applications. Imagine if you could propose a control, policy change or process improvement with an expected effect on a response variable, which would lead you to purposefully define a way to measure intervention outcomes


www.rostrum.blog

How accessible is my post about accessibility?

Matt Dray. (Image: The accessibility empathy lab at the Government Digital Services building.) Digital accessibility: I wrote about an accessibility workshop at the recent Sprint 18


simplystatistics.org

Teaching R to New Users - From tapply to the Tidyverse

The intentional ambiguity of the R language, inherited from the S language, is one of its defining


ropensci.org/technotes

phylogram

As an example, a simple three-leaf dendrogram can be created from a nested list as


yihui.name/en

Do You Have to Use FontAwesome or Other Libraries for Web Symbols?

Certain HTML entities may not work in all web browsers, but I don’t really care


mgb-research.netlify.com

Interaction Plots with Continuous Moderators in R

Long ago (the first half of my grad school life), I created a model for a manuscript I submitted


yihui.name/en

One Little Thing

You can also provide the list of files programmatically, e.g., For multiple files, they are first compressed to a zip file, and the zip file will be embedded


gcppodcast.com

VirusTotal with Emi Martínez

Emiliano has been with VirusTotal for over 10 years. He has seen the business grow from a small startup in southern Spain into a Google X moonshot under the new Chronicle bet


yihui.name/en

Write / Don't Write the Whole Book in bookdown

don’t write your book from start to finish in bookdown. it’s too easy to hit a bug, and it’s impossible to interactively debug. also, today my figure captions stopped working. write the whole thing in #bookdown compiles seamlessly to webook, pdf epub


yihui.name/en

Yue Jiang

Apparently, Yue Jiang (Uchiha?) is a qualified ninja with Sharingan. Once again, we love software


cevo.com.au

3 months into an all-in AWS migration

I’m currently Tech Lead on an all-in Amazon Web Services (AWS) migration for the Australian arm of a multinational company


www.rdatagen.net

How the odds ratio confounds

My aim here is to generate a few figures that might highlight some of these issues. With a constant odds ratio of 3, the risk ratios range from 1 to 3, and the risk differences range from almost 0 to just below 0.3
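The arithmetic behind those ranges can be checked with a short Python sketch. It is not taken from the post: the function name and the grid of baseline risks are assumptions for illustration. For each baseline risk it converts risk to odds, multiplies by the odds ratio of 3, converts back to a risk, and then computes the risk ratio and risk difference.

```python
def risk_under_odds_ratio(baseline_risk, odds_ratio):
    """Risk in the exposed group implied by a baseline risk and an odds ratio."""
    baseline_odds = baseline_risk / (1.0 - baseline_risk)
    exposed_odds = odds_ratio * baseline_odds
    return exposed_odds / (1.0 + exposed_odds)


# Assumed grid of baseline risks from 0.01 to 0.99 (hypothetical choice).
rows = []
for i in range(1, 100):
    p0 = i / 100
    p1 = risk_under_odds_ratio(p0, 3.0)
    rows.append((p0, p1, p1 / p0, p1 - p0))  # (baseline, exposed, RR, RD)

for p0, p1, rr, rd in rows[::14]:
    print(f"p0={p0:.2f}  p1={p1:.3f}  RR={rr:.2f}  RD={rd:.3f}")
```

The risk ratio approaches 3 when the baseline risk is tiny and approaches 1 as the baseline risk nears 1, while the risk difference peaks at a baseline risk of (√3 − 1)/2 ≈ 0.37, where it equals 2 − √3 ≈ 0.268.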


yihui.name/en

Only One Person Can Help You with That

Only one person can help you with that, and it is most effective to contact him directly. That is also what I’m deeply worried about. When users run into certain problems, only one person can help. One person managing more than 12,000 binary packages (although most should be easy to build)


blog.rstudio.com

RStudio Connect v1.6.4.2 - Security Update

RStudio remains committed to providing the most secure product possible


rviews.rstudio.com

Solver Interfaces in CVXR

GUROBI is handled in a similar fashion


evangelinereynolds.netlify.com

Tuition Increases for Tidy Tuesday

Here’s the code for my first Tidy Tuesday


atusy.github.io/blog

Unicode characters with XeTeX

When you type special characters such as μ or α directly in $\LaTeX{}$, they often turn into □. If you are using XeTeX, \setmainfont{IPAMincho}


www.tidyverse.org/articles

Carpe Talk

Conference talks are a great opportunity to help people learn about the cool and useful things you have built. Given all the hard work you’ve already put in, a bit of marketing effort can be a wise investment in drumming up users


www.justadatageek.com

My Thoughts On Bitcoin

I enjoy listening to/reading the stories on the various business news outlets about cryptocurrencies – I have been casually reading about blockchain technology and its different applications, but that is not the focus of this post. While I am not an “expert” on cryptocurrencies, I have come to the opinion that they are not really currencies


alison.rbind.io

Read data with multiple header rows into R

More options: Now we are ready to diagnose the problem! All together now: the final


blog.zenggyu.com/en

Setting Up Visual Studio Code

This is one of a series of posts where I document software configurations for personal reference. This post documents the configurations for Visual Studio


lenkiefer.com

Mortgage rates in the 21st century

Let’s compare two charts


sciathlon.github.io

Strava time vs distance data analysis

Hello everybody, I have a new run data analysis today: from my own Strava data! I have been much slower recently on the posts, and I will probably be until the end of July, because of work for my PhD, so I’m sorry about that


www.njtierney.com

A note on ggplot code style

I’ve got some opinions about how to write ggplot code. So, if there are more than two sections in a function, these should be separated on a


mouse-imaging-centre.github.io/blog

Linear Models

Introduction. library(tidyverse); library(matlib); library(knitr); library(RColorBrewer). The purpose of this document is to understand the parameter and residual error estimates in a basic linear regression model when working with binary categorical


nowosad.github.io

Pattern-based Spatial Analysis - core ideas

Take a look at the example above and compare information depicted by the compositional and co-occurrence histograms. The first one shows that each land cover category occupies a very similar proportion of the area


www.onceupondata.com

Tidy Eval Meets ggplot2

Then you can either pass column names directly or the variable names. The previous options are good and valid in many cases, but there are some limitations. For instance, you cannot create a function and pass column names unquoted as follows


simplystatistics.org

What Should be Done When Data Have Creators?

All this got me thinking about how screenwriters are often limited in what they can write by the fact that the material they are writing was originated by someone else


evangelinereynolds.netlify.com

Wide data to long using the tidyverse (tidyr's gather function)

A wide data storage format is an efficient and compact way to store information. And this organization perhaps makes data easier to inspect. We have wide monitors on our laptops and desktops. However, for visualization and analysis you generally need to transform this data from the wide format to a “tidy”, long format


mgb-research.netlify.com

Bayesian Multilevel Model with Missing Data

This is the first post in a three-part blog series I am putting together. The focus of this initial post is effective exploration of the reasons for missingness in a particular set of data


ropensci.org/blog

Exploring ways to address gaps in maternal-child health research

I was aware at all times that I had only islands of knowledge separated by darkness; that I was surrounded by chasms of not-knowing, into one of which I was certain to fall. One of the best ways to start feeling less intimidated is to start talking to others. Ullman continues, I learned I was not alone


atusy.github.io/blog

GitHub pages with Rmarkdown

Belatedly, I have started a blog using R’s blogdown package.


nowosad.github.io

Life (expectancy), animated

Global socio-economic data is easily accessible nowadays


emmavestesson.netlify.com

My first hackathon (part 2)

Gender pay gap hackathon (part 2). This is part 2 of my blog about the gender pay gap hack that I went to. You can read part 1 here. Reflections: It has taken me a long time to write the second part of my experience of the hackathon


sharanry.github.io

Non-Centered Eight Schools Model with PyMC4

We are finally at a state where we can demonstrate the use of the PyMC4 API side by side with PyMC3 and showcase the consistency in results by using the non-centered eight schools model. I will be comparing the PyMC3 and PyMC4 way of doing the same task


www.aggieerin.com

Quick and Dirty Categorical lavaan

I was tagged today on twitter asking about categorical variables in lavaan. I will say I have not done much with categorical predictors either endogenous or exogenous. I did a quick reproducible example of exogenous variables, and I will refer you to the help guide for lavaan here


toscano84.github.io

Using R to analyse the German Federal Election

As the title of this post implies we will analyze, using the statistical programming language R, the German Federal Election which took place on 24 September of 2017. It will not be an exhaustive analysis of the results. I’m only interested in visualizing the share of the vote that each party represented in the Parliament (i.e


mlr-blog.netlify.com

Why R Conference

This July we had the great honor to present mlr and its ecosystem at the WhyR 2018 Conference in Wroclaw in Poland


www.tidyverse.org/articles

ggplot2 3.0.0

Install ggplot2 with: Aesthetics can now be specified independent of the scale name


toscano84.github.io

About me

Welcome to my blog! I’m Hugo! My background is in Psychology, more specifically, in the area of face perception. Always eager to learn and after falling in love with R, I will blog mostly about data related topics (e.g. importing, wrangling, visualization, statistics). Not yet Machine Learning, still exploring it


www.stat.cmu.edu/~ryurko

Bayesian Baby Steps

Before going into the regression example with a predictor, it’s worthwhile to first demonstrate quadratic approximation by just modeling the score differential with a Gaussian


gcppodcast.com

Connected Games with Unity and Google Cloud with Brett Bibby and Micah Baker

As Product Manager leading the strategy for Gaming on the Google Cloud Platform, Micah is committed to enabling developers to realize their vision for great games


yutani.rbind.io

gghighlight 0.1.0 Is Released!

gghighlight 0.1.0 is on CRAN now! One more small piece of news: gghighlight got an introductory vignette


yihui.name/en

On 'Quick Questions'

“Hey Yihui, quick question for you…” Many times I have been asked “quick questions” on Twitter, on Slack, or in emails


yihui.name/en

A CRANextra Repository for Homebrew and R Users on macOS

There will be no more Step 1, 2, ..


lenkiefer.com

Exploring housing data with R and IPUMS USA

In this post I want to share some observations on housing in the United States from 1980 to 2016, share some R code for data wrangling, and tri (no that’s not a typo, just a pun) out visualization techniques. Let’s get to it. I’ve been carrying a running conversation with folks on Twitter regarding the U.S. housing market and its future


simplystatistics.org

Cultural Differences in Map Data Visualization

The maps need to be usable, but they also need to fulfill cognitive goals on cultural levels that go beyond what any given user might know they need. For instance, in the U.S


djnavarro.net

Day 63-66: Learning to skim

But everything is “under control” now, at least for a very expansive definition of “control”. The kids are keeping themselves occupied, I’ve responded to some overdue emails, and I’m making inroads into the Saturday morning laundry


www.rostrum.blog

Markov-chaining my PhD thesis

Matt Dray. Doc rot: I wrote a PhD thesis in 2014 called Effects of multiple environmental stressors on litter chemical composition and decomposition. See my viva presentation slides here if you don’t really like words


divingintogeneticsandgenomics.rbind.io

my first try on Rmarkdown using blogdown

I have used blogdown for writing regular markdown posts, but the real power comes from Rmarkdown! Let me try it for this post. Note that you do not knit the Rmarkdown yourself; rather, you let blogdown do the heavy lifting. It is awesome! blogdown will knit and render the code and the output into an HTML file


divingintogeneticsandgenomics.rbind.io

hugo academic theme blog down deployment (some details)

It is quite straightforward to have a working site following Alison’s guide. However, you always want some customization of your own site. I took the tips from Leslie and changed to twitter. The widget will be gone from the home page. Now, you can check the traffic reports in real-time


blog.wallaroolabs.com

Real-time Streaming Pattern

Introduction This week, I will continue to look at data processing patterns used to build event triggered stream processing applications, the use cases that the patterns relate to, and how you would go about implementing within


blog.sellorm.com

Running Python in the RStudio IDE

I’ve had a few different people ask me variants of the same question lately, which is: “How can I run Python code on a server in a similar way to using R with RStudio Server”


www.tidyverse.org/articles

bench 1.0.1

Install the latest version with: Results are easy to interpret, with human readable units in a rectangular data frame. You can also produce fully custom plots by un-nesting the results and working with the data directly


divingintogeneticsandgenomics.rbind.io

Backup automatically with cron

cron is a Unix, Solaris, and Linux utility that allows tasks to be run automatically in the background at regular intervals by the cron daemon. Crontab (CRON TABle) is a file which contains the schedule of cron entries to be run at specified times