ritsokiguess.site/docs
Tidying weather data
Introduction Weather data often comes in an untidy format that is suitable for looking at, but not so suitable for doing any kind of analysis with…
gcppodcast.com
Dataprep with Eric Anderson
Eric is a Product Manager at Google working on Cloud Dataprep and recently Cloud Dataflow. Previously he was at Amazon Web Services, Harvard Business School, General Electric and University of Utah. He’s from Salt Lake City, Utah and lives in Mountain View, California with his wife and three kids…
lenkiefer.com
Housing market dataviz
IT IS THANKSGIVING WEEK HERE IN THE UNITED STATES. I’m getting ready to go out for a nice casual drive down Interstate I-95. Should be fun. After I get back stuffed with turkey and whatnot, we’ll get back to data visualizations and analysis…
vuorre.netlify.com
Open Access Mandates
Here are some notes on how the six features can be achieved using nothing else but “stuff on the internet”. I won’t provide detailed arguments, but just quick notes on how each of these features can already be implemented with minimal cost to researchers…
www.mytinyshinys.com
EPL week 12
Match of the Day: Following a defeat to North London rivals, Spurs now talking top 4, not title. Pulis is out: Stodgy football and a points-per-game average lower than Steve Clarke will do that for you. His spell at WBA ended with the lowest ppg average of any of his three clubs…
jvera.netlify.com
Geocoding with R and mapZen
The most widely used service for Geocoding is Google Maps, but you’d hit the limits early in your project if you don’t have a paid account. I often place data on a map, so I need precise geocoding. It’s not unlikely you had to test some other geocoding options ‘til you get the budget for a Google Maps paid account…
www.ifconfig.it/hugo
Moving Complexity
I read a lot of discussions about complexity in networking and IT today that include a large amount of FUD…
jvera.netlify.com
Atom for R
Last week I was working on some projects involving Posix and R code. I tend to use Atom Editor for such code instead of Rstudio, and sometimes I think the one and only valid contender against Rstudio is Atom Editor…
ritsokiguess.site/docs
Chuff chuff
Introduction We are very accustomed to journey planners that help us navigate across the city, the country or the world. Google’s transit directions are an example: you enter a start and an end point, and it tells you which vehicles to get and where to transfer to another one…
asch3tti.netlify.com
Attractive Attractor
My first post is going to be trivial but, I think, quite mesmerizing. This is sweet, but something is missing… perhaps there are not enough points. Let’s use 10,000,000! That’s more like it. Now let’s make another attractor, using 10,000,000 magenta points…
jvera.netlify.com
My Docker images for R
This month I reached the dreaded “2 hours limit” at Docker Hub on building my Docker images. The images are built but the message shown is “build cancelled”. You can still use it, but on the front desk, seems that I’m not doing a good job. Clearly, something to fix…
cattleguard.github.io
Data Cleaning Practice Pt1 - Steak Stats
For miles down I-40 you’ll see billboards for this restaurant and brewery advertising a free 72oz steak, if you can eat it. An office debate erupted amongst two of my colleagues over just what types of people end up beating the beast. They argued. I fired up rvest…
blog.mgechev.com
Faster Angular Applications - Understanding Differs. Developing a Custom IterableDiffer
In this article we’ll take a look at another Angular abstraction - the differs and more specifically the IterableDiffer; we’ll explain what the differs are and how the framework uses them internally…
www.onceupondata.com
Giving My First Data Science Talk
People have different reasons to give talks. For me the main motives are: All that showed how a supportive community can make a difference and help more people get engaged…
roelandtn.frama.io
Social division of the residential space of the Greater Mexico City
EDIT: I provided a quick update here, and it will be the main entrance of a post series regarding this subject, cheers! https://roelandtn.frama.io/post/update-for-the-mexico-project/ Hello there! It has been a long time since my last post…
www.jamesuanhoro.com
A Chi-Square test of close fit in covariance-based SEM
TLDR: If you can assume close fit for the RMSEA, there is no reason why you cannot for a Chi-Square test in SEMs…
lenkiefer.com
Majestic mortgage rate plot
COME AND MAKE A MAJESTIC MORTGAGE RATE PLOT WITH ME. We’ll use R to plot a few visualizations of mortgage rates. I recently gave a number of talks about the economic outlook and housing. One point I like to make is that mortgage rates are low. I’ve shown this through a variety of visualizations…
blog.wallaroolabs.com
Non-native event-driven windowing in Wallaroo
Certain applications lend themselves to pure parallel computation better than others. In some cases we require to apply certain algorithms over a “window” in our data…
www.rdatagen.net
Visualizing how confounding biases estimates of population-wide (or marginal) average causal effects
When we are trying to assess the effect of an exposure or intervention on an outcome, confounding is an ever-present threat to our ability to draw the proper conclusions…
www.tidyverse.org/articles
withr 2.1.0
Install the latest version with: One common idiom for dealing with this problem is to save the current state, make your change, then restore the previous state. However this approach can fail if there’s an error before you are able to reset the options. Here are some highlights of new functions for v2.1.0…
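The save, change, restore idiom described in this excerpt, and the way it leaks state when an error interrupts it, can be sketched as follows. This is a minimal illustration in Python rather than R, and it does not show withr's own functions: the `OPTIONS` dictionary, `manual_change`, and `temporary_option` are hypothetical names invented for the example.

```python
from contextlib import contextmanager

# Hypothetical stand-in for a global options store (not part of withr or R).
OPTIONS = {"digits": 7}


def manual_change(work):
    """Save, change, restore by hand: fragile if work() raises."""
    previous = OPTIONS["digits"]
    OPTIONS["digits"] = 3
    work()  # if this raises, the restore on the next line is skipped
    OPTIONS["digits"] = previous


@contextmanager
def temporary_option(name, value):
    """Set an option for the duration of a block, restoring it even on error."""
    previous = OPTIONS[name]
    OPTIONS[name] = value
    try:
        yield
    finally:
        # Runs whether the block finished normally or raised.
        OPTIONS[name] = previous
```

With the manual version, an exception inside `work` leaves `digits` stuck at 3. Wrapping the same work in `with temporary_option("digits", 3):` restores the original value either way, which is the failure mode the excerpt describes.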
cattleguard.github.io
3 Keys for Successful Products and Programs Before You Even Start
If you’ve been an information security practitioner for more than a few years, you’ve likely witnessed your share of disappointing purchases, implementations and initiatives…
gcppodcast.com
Performance Atlas with Colt McAnlis
Colt McAnlis is a Developer Advocate at Google focusing on performance & compression. Before that, he was a graphics programmer in the games industry working at Blizzard, Microsoft (Ensemble), and Petroglyph…
livefreeordichotomize.com
Secret Sampling 🎅🤶
’Tis the season for white elephant / גמד וענק / Yankee swap / secret santa-ing! There are various rules for this, for our version: 🏷 each participant receives the name of someone else to purchase a gift for 🎁 gifts are exchanged at a party 🤔 the receiver is tasked with guessing who the gift-giver was! We thought it’d be particularly fun to do it #rstats…
www.blog.rdata.lu
Functional peace of mind
The problem with this approach is that you cannot reuse any of the code there, even if you put it inside a…
blog.wallaroolabs.com
Identifying Trending Twitter Hashtags in Real-time with Wallaroo
This week we have a guest post written by Hanee’ Medhat Hanee’ is a Big Data Engineer, with experience working with massive data in many industries, such as Telecommunications and Banking…
www.datalorax.com
Mapping Statewide School Ratings with US Census Tracts
In this post, I’d like to share some work related to geo-spatial mapping, statewide school ratings, and US Census Bureau data using census tracts…
blog.mgechev.com
Faster Angular Applications - Part 2. Pure Pipes, Pure Functions and Memoization
In this post, we’ll focus on techniques from functional programming we can apply to improve the performance of our applications, more specifically pure pipes, memoization, and referential…
blog.sellorm.com
Talk: An Operating Model for R
I’ve given a version of this talk at all three EARL conferences this year starting in San Francisco…
livefreeordichotomize.com
Thanksgiving Gantt Chart
Thanksgiving 🦃 is right around the corner 🎉 – this year we are hosting 17 people 😱…
www.ifconfig.it/hugo
White boxes for everyone?
White boxes and their impact on enterprise networking is a hot topic today, with many point of…
blog.mgechev.com
Faster Angular Applications - Part 1. On Push Change Detection and Immutability
On AngularConnect 2017 in London, I gave a talk called “Purely Fast.” In the presentation, I showed how step by step we can improve the performance of a business application…
blog.sellorm.com
Installing R on RedHat Linux 7
Pre-requisites: Before we begin, it’s a good idea to install some general purpose tools that will help us out once R is installed. The first of these dependencies is a bunch of development tools…
cevo.com.au
Recognising the fallacy of a single root cause
When something goes wrong it’s often tempting to attribute it to a single root cause: The site was down because the database crashed. The application broke because the developer pushed bad code. The machine stopped responding because it ran out of disk space…
ewen.io
Hacking Homelessness & PDF Prisons
How to extract data from PDFs (in this case, homelessness figures in London), alongside a showcase of algorithmic tessellation of geospatial…
leonawicz.github.io/blog
Make memorable plots with memery. v0.3.0 now on CRAN.
Please do share your data analyst meme creations.…
www.samatkins.me
Becoming a professional web developer
Photo by Rohit Tandon on Unsplash A few years ago I decided to transition career and become a professional web developer. Married, with a young son, and a full time job as a management consultant (i.e. long hours and lots of travel) this was not an easy undertaking…
livefreeordichotomize.com
LSTM neural nets as told by baseball
Currently I am studying for my qualifying exams on which the topic is “using deep neural networks for classification of time series data…
gcppodcast.com
Smart Parking and IoT Core with Brian Granatir
Brian Granatir has been developing for the cloud since the beginning, back in 2007. He left Oregon and moved to New Zealand to be with his future wife in 2014. In 2017, he joined Smart Parking to help with the development of their new Smart City platform built on GCP…
lenkiefer.com
Tour of U.S. metro area house price trends
HEY! HERE IS A VIDEO SHOWING HOUSE PRICE TRENDS around the United States. Earlier this year we looked at how to get the data and plot it using R. I made the video using the PowerPoint to .mp4 workflow I outlined here. Below I’ll review how to build this file…
ropensci.org/technotes
solrium 1.0
Or get the development version: If the instance uses SSL, simply specify that like: This change does break the interface we had in the old version, but we think it’s worth it…
www.rdatagen.net
A simstudy update provides an excuse to generate and display Likert-type data
The ordinal data is generated after a data set has been created with an adjustment variable. We have to provide the data.table name, the name of the adjustment variable, and the base probabilities. That’s really it…
yutani.rbind.io
An Example Usage of ggplot_add()
You may wonder why this can’t be written like this: Let me explain a bit…
r-tastic.co.uk
Automated and Unmysterious Machine Learning in Cancer Detection
First, let’s load the data: … and do some data cleaning: change column names, get rid of the order in factor levels and remove rows with empty cells: That’s better! Now, let’s set up the local H2O instance…
yonicd.netlify.com
Firearms Sourced and Recovered in the United States and Territories 2010-2016
I want to try and probe a question that was raised since Las Vegas and now revived due to the tragedy in Sutherland Springs,TX: Given the free trade between states, can a state unilaterally regulate firearms…
lenkiefer.com
JOLTS a dataviz trilogy
LET’S TAKE A LOOK AT RECENT LABOR MARKET TRENDS IN THE UNITED STATES. Below we’ll plot labor market trends using the U.S. Bureau of Labor Statistics Job Openings and Labor Turnover Survey (JOLTS). Last year we looked at how to get the data and plot it using R…
www.njtierney.com
Tidyverse Case Study
… data sets regarding songs on the Billboard Hot 100 list from 1960 to 2016, including ranks for the given year, musical features, and lyrics. So, this blogpost walks through how you might start to unpack the data, clean it, and draw some interesting conclusions. I also wanted to avoid the “draw the rest of the fucking owl”…
lenkiefer.com
A note on competing risks
WE ARE LATE FOR HALLOWEEN, but let’s get out our broom and purrr as we tidy some statistical results. Today I had occasion to be reminded of competing risks and a handy statistical result on competing risks from A.P. Basu and J.K…
www.mytinyshinys.com
EPL week 11
Match of the Day: Mourinho still can’t buy a result away against big clubs. The Spanish connection: Morata connected up with Azpilicueta’s cross for Chelsea’s winner against Manchester United and he has now assisted on five of the striker’s seven league goals this season, only topped by six of the 52 goals Diego Costa scored for the…
ewen.io
My Linear(ish) Programming Fantasy
Applying linear programming principles to the selection of a fantasy football…
www.ifconfig.it/hugo
VMware NSX VCP6-NV
Today I passed exam 2V0-642 to update my VCP5-DCV and got VCP6-NV (NSX v6.2). I’ll share here what I used to prepare for the exam. Learning Path I passed exam VCP5-DCV in 2015. At the time I expected to work more with datacenter technology and I needed a more in-depth knowledge of how virtualization works…
lenkiefer.com
Recent trends in U.S. housing markets
LET US REVIEW HOUSING MARKET TRENDS in the United States through the first three quarters of 2017. Economic background The overall economic environment remains favorable for housing. Interest rates are low, the labor market has been solid and income growth, while modest, has begun to tick up…
www.datalorax.com
esvis: Part 1
This is the first of a series of posts to introduce my new esvis R package, why I think it’s important, and some of its capabilities. As of this writing the current version on CRAN is 0.0.1, so it’s obviously still fresh and may have some bugs…
blog.brianz.bz
Accessing VPC Resources from AWS Lambda
I’m currently working on a book for Packt publishing titled Serverless Design Patterns and Best Practices. While writing and whipping out tons of examples is quite a bit of work, and I sometimes curse myself for agreeing to this, I’m quite excited as I work through the chapters and as it comes together…
eddjberry.netlify.com
Machine learning and k-fold cross validation with sparklyr
In this post I’m going to run through a brief example of using sparklyr in R. This package provides a way to connect to Spark from within R, while using the dplyr functions we all know and love. I was entirely new to Spark, and databases in general, before having a play with sparklyr…
blog.wallaroolabs.com
Open-source your startup’s code in 60 days
I’m Vid Jain, CEO & Founder of Wallaroo Labs. I’m writing today to tell you about how we open-sourced Wallaroo, our software framework for data processing, and the lessons we learned along the way. Our engineering team was experienced at writing great software, but now we faced a new set of challenges…
gcppodcast.com
Cloud IoT Core with Indranil Chakraborty and Gabe Weiss
It’s time to learn everything about Cloud IoT Core from Indranil Chakraborty, Product Manager, and Gabe Weiss, Developer Advocate on IoT. Indranil is a product manager for Google Cloud Platform and leads product strategy and development for Cloud IoT Core. Previously, he held product management roles at Google Fiber and sales strategy roles for Google AdWords…
www.rdatagen.net
Thinking about different ways to analyze sub-groups in an RCT
Here’s the scenario: we have an intervention that we think will improve outcomes for a particular population…
www.tidyverse.org/articles
lubridate 1.7.0
Lubridate is a package that makes working with date-time and time-span objects easier. It provides fast and user friendly parsing of date-time strings, extraction and updating of components of date-time objects (years, months, days etc.) and algebraic manipulation on date-time and time-span objects…
www.njtierney.com
Bringing Together People and Projects at ozunconf17
What we created was a bunch of awesome individuals from 12 different countries. Before the ozunconf we discussed and dreamt up projects to work on for a few days, then met up and brought a few of them into reality. Before the ozunconf, we discussed various ideas for projects in GitHub…
www.mytinyshinys.com
EPL week 10
Match of the Day: Kane-less Spurs lose to Pogba-less United and trail City by 8 points. Final Top 6? The accepted ‘top six’, i.e. Chelsea, Spurs, Man City, Liverpool, Arsenal and Man Utd…
magesblog.com
Goodbye Blogger, welcome Hugo
However, over the last year or so, a couple of things started to annoy me so much that I stopped enjoying writing posts on Blogger. For example, there is no easy way to bulk edit older posts (I wanted to change links from files hosted on public Dropbox folders elsewhere), or to write posts in Markdown or better RMarkdown…
www.tidyverse.org/articles
glue 1.2.0
Install the latest version with: glue is also vectorized over its inputs…
vuorre.netlify.com
Bayesian Estimation of Signal Detection Models, Part 4
However, we almost always want to discuss our inference about the population, not individual subjects. Further, if we wish to discuss individual subjects, we should place them in the context of other subjects…
lenkiefer.com
Dynamic Model Averaging Presentation Slides
I PUT TOGETHER SOME SLIDES SUMMARIZING our recent work on dynamic model averaging. See here and here for more blah blah blah. See below for some slides. Click here for a fullscreen version. Making the Preso Let me also share with you the R code I used to generate these slides…
giorasimchoni.com
How Much For the Watch?
This post serves a few purposes: Exercising Deep Learning with Regression, as opposed to Classification, which is my comfort zone. A tribute to my previous working place - ebay. Trying to combine text AND image data in a “Bi-Modal” model - never did it before. Doing something which involves Fashion…
thug-r.life
How many random numbers does it take?
Fermat and his library This morning I woke up to a delightful tweet from fermatslibrary about sampling random uniform numbers and how many it takes, on average, to sum to 1…
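The question in this excerpt can be explored with a short simulation. The sketch below is in Python and is not the post's own code; the function names, seed, and trial count are assumptions made for illustration. It counts how many uniform(0, 1) draws are needed before the running total exceeds 1, then averages that count over many trials. The known theoretical value of that average is e, roughly 2.718.

```python
import random


def draws_until_sum_exceeds_one(rng):
    """Count uniform(0, 1) draws needed for the running total to exceed 1."""
    total = 0.0
    count = 0
    while total <= 1.0:
        total += rng.random()
        count += 1
    return count


def average_draws(trials, seed=0):
    """Average the draw count over `trials` independent repetitions."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    return sum(draws_until_sum_exceeds_one(rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(average_draws(100_000))
```

Every trial needs at least two draws, since a single value in [0, 1) can never exceed 1 on its own.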
lenkiefer.com
A closer look at forecasting recessions with dynamic model averaging
BACK WE GO INTO THE VASTY DEEP. LAST TIME we introduced the idea of using dynamic model averaging to forecast recessions. I was so excited about the new approach that I didn’t take the time to break down what was going on with it…
www.ifconfig.it/hugo
Cisco ASA show connections ordered
When a customer calls with a problem or request I often see a chance to investigate a technology, learn something new or apply random skills to find a creative solution. This time it is about an ASA: the customer noticed too much traffic on the Internet facing interface…
www.jamesuanhoro.com
Misspecification and fit indices in covariance-based SEM
TLDR: If you have good measurement quality, conventional benchmarks for fit indices may lead to bad decisions. Additionally, global fit indices are not informative for investigating misspecification. I am working with one of my professors, Dr…
jvera.netlify.com
Rocker: Docker for Rstats
Last Tuesday (October 24th) I was at Madrid R User Group to give a tech talk about using Docker for automating our setup with Rstudio. Using Rocker images for easy and quick deployment. Here are the slides, hoping someone will find them useful…
www.blog.rdata.lu
Scraping data from the local elections
One of my journalist friends was looking at the result of the local election in Luxembourg and he was dissatisfied because he was unable to compare the results of all the communes…
www.njtierney.com
So, you’ve decided to change your r package name
I’ve had to change my R package names a few times. Every time I’ve had to do this, there were a few things I had to remember to do. Here’s a blog post that describes how to do that. When you change your package name, here is a list of things you need to do. And that’s basically it! Hooray, now to handle the git…
www.mytinyshinys.com
Archer Memes
I have rather belatedly gotten around to viewing the adult animated series, Archer, on Netflix…
www.blog.rdata.lu
Easy peasy STATA-like marginal effects with R
First, let’s load some packages: And it is also possible to plot the effects with base graphics: Which makes it possible to extract the effects for a list of individuals that you can create…
lenkiefer.com
Forecasting recessions with dynamic model averaging
HERE THE LITERATURE IS VASTY DEEP. In this post we’ll dip our toes, ever so slightly, into the dark waters of macroeconometric forecasting. I’ve been studying some techniques and want to try them out. I’m still at the learning and exploring stage, but let’s do it together. In this post we’ll conduct an exercise in forecasting U.S…
yutani.rbind.io
Publish R Markdown to Medium via An RStudio Addin
mediumr allows you to knit and post R Markdown to Medium. You can install mediumr from github with: The addin knits the Rmd and shows the preview dialog…
blog.wallaroolabs.com
Why we used Pony to write Wallaroo
Hi there! Today, I want to talk to you about why we chose to write Wallaroo, our distributed data processing framework for building high-performance streaming data applications, in Pony. It’s a question that has come up with some regular frequency from our more technically minded audiences…
yonicd.netlify.com
Extending slackr
This lets us interact with R and Slack in new ways, by getting active updates on long simulations directly to your (and your team’s) mobile device and multitask away from your computer…
lenkiefer.com
Home sales in expansions and recessions
LET’S LOOK AT NEW HOME SALES. Today the U.S. Census Bureau, jointly with the Department of Housing and Urban Development (HUD), released new home sales estimates through September of 2017…
yutani.rbind.io
How Not To Knit All Rmd Files With Blogdown
Compiling all Rmd files is “safe” in the sense that we can notice if some Rmd becomes impossible to compile due to some breaking changes of some package. But, it may be time-consuming and can be a problem for those who have a lot of .Rmd files…
jessesadler.com
Introduction to Network Analysis with R
Compare this to an adjacency matrix with the same data…
gcppodcast.com
Vint Cerf
Google, the Cloud, or podcasts would not exist without the internet, so it’s with an incredible honor that we celebrate our 100th episode with one of its creators: Vint Cerf…
www.stencilled.me
Visualizing Airbnb listings.
In this day and age of so many sharing services like Uber and Lyft, pricey hotels are being replaced by Airbnb…
yonicd.netlify.com
jsTree htmlwidget
jsTree is an R package that is a standalone htmlwidget for the jsTree JavaScript library. It can be run from the R console directly into the Rstudio Viewer, be used in a Shiny application or be part of an Rmarkdown html output…
leonawicz.github.io/blog
Climate explorer update
These conditional distributions for historical and projected temperature and precipitation over different geographic regions, time periods, climate models and greenhouse gas emissions scenarios represent the source data sets available in the app…
lenkiefer.com
Sharing is caring
LET’S MAKE SHARING BETTER ON THIS PAGE. I saw this today: 💫 how-to for making your #blogdown social-friendly: “Socialize your blogdown” by @xvrdm https://t.co/RurfUvRf6X #rstats #opengraph pic.twitter…
www.rdatagen.net
Who knew likelihood functions could be so pretty?
We are generally most interested in finding out where the peak of that curve is, because the parameters associated with that point (the maximum likelihood estimates) are often used to describe the “true” underlying data generating…
www.gokhanciflikli.com
Automatic Time-Series Forecasting with Prophet
Seasonality and Trends Time-series analysis is a battle on multiple fronts by definition. One has to deal with (dynamic) trends, seasonality effects, and good old noise…
lenkiefer.com
Combining PowerPoint and R's tweenr for smooth animations
IN THIS POST I WANT TO SHARE A METHOD FOR MAKING SMOOTH POWERPOINT ANIMATIONS USING R…
roelandtn.frama.io
liftr (Rmarkdown using docker)
I recently discovered an R package called liftr. It allows building a PDF document (and HTML files, but I didn’t test it) from an Rmarkdown file. Fully integrated in RStudio, the R code is executed (and other code as well, I tried python) and results are displayed…
yutani.rbind.io
Confession
You may notice that diffs of .Rd files have been suppressed by default on GitHub for some time. Do you wonder who did this? It’s me, yay! This is my pull request: Though I thought I’d done the right thing at that time, now I’m afraid this change may be bad for some cases… After the release of roxygen2 6.0.0, the game has changed a bit…
lenkiefer.com
Purrrtier PowerPoint with R
WE ARE ON OUR WAY TOWARDS BUILDING a tidy PowerPoint workflow. In this post I want to build on my earlier posts (see here for an introduction and here for a more sophisticated approach) for building a PowerPoint presentation with R and try to make it even purrrtier…
blog.sellorm.com
Quick Script to Install an R Package from the Command Line
I wrote a really quick script to install R packages from the command line that I thought I’d share. It doesn’t really do a great deal, but you can use it to install one package at a time. Save the below as rpkginstall and make sure it’s executable with chmod +x rpkginstall. Then you can install a package like this example, which would install dplyr, …
www.ifconfig.it/hugo
Innovation sirens singing
In episode 13 of the Network Collective podcast around minute 26 Jordan Martin asks: Aren’t we all just following a trend? The discussion topic is how to mentor juniors in a learning path to grow their skills and be experts…
www.riinu.me
Your first Shiny app
I am completely obsessed with Shiny and these days I end up presenting most of my work in a Shiny app. If it’s not worth putting in a Shiny app it’s not worth doing. Getting started with Shiny is actually a lot easier than a lot of people make it out to be…
blog.wallaroolabs.com
How Wallaroo Scales Distributed State
Scaling stateful applications is hard. As your business grows, you’re eventually going to find that demand is greater than capacity. That means you can’t simply deploy your application to a set number of servers and forget it…
lenkiefer.com
Mortgage rates are low!
MORTGAGE RATES ARE LOW IN THE UNITED STATES. How low? Let’s take a look. We’ll use R to plot a few visualizations of mortgage rates. We’ll also try out some of the nice features in the tibbletime package that help when working with time series data…