ritsokiguess.site/docs

Tidying weather data

Introduction: Weather data often comes in an untidy format that is suitable for looking at, but not so suitable for doing any kind of analysis with


gcppodcast.com

Dataprep with Eric Anderson

Eric is a Product Manager at Google working on Cloud Dataprep and recently Cloud Dataflow. Previously he was at Amazon Web Services, Harvard Business School, General Electric and University of Utah. He’s from Salt Lake City, Utah and lives in Mountain View, California with his wife and three kids


lenkiefer.com

Housing market dataviz

IT IS THANKSGIVING WEEK HERE IN THE UNITED STATES. I’m getting ready to go out for a nice casual drive down Interstate I-95. Should be fun. After I get back stuffed with turkey and whatnot, we’ll get back to data visualizations and analysis


vuorre.netlify.com

Open Access Mandates

Here are some notes on how the six features can be achieved using nothing else but “stuff on the internet”. I won’t provide detailed arguments, but just quick notes on how each of these features can already be implemented with minimal cost to researchers


www.mytinyshinys.com

EPL week 12

Match of the Day: Following a defeat to North London rivals, Spurs now talking top 4, not title. Pulis is out: Stodgy football and a points-per-game average lower than Steve Clarke will do that for you. His spell at WBA ended with the lowest ppg average of any of his three clubs


jvera.netlify.com

Geocoding with R and mapZen

The most widely used service for Geocoding is Google Maps, but you’d hit the limits early in your project if you don’t have a paid account. I often place data on a map, so I need precise geocoding. It’s not unlikely you had to test some other geocoding options ’til you get the budget for a Google Maps paid account


www.ifconfig.it/hugo

Moving Complexity

I read a lot of discussions about complexity in networking and IT today that include a large amount of FUD


jvera.netlify.com

Atom for R

Last week I was working on some projects involving Posix and R code. I tend to use Atom Editor for such code instead of Rstudio, and sometimes I think the one and only valid contender against Rstudio is Atom Editor


ritsokiguess.site/docs

Chuff chuff

Introduction: We are very accustomed to journey planners that help us navigate across the city, the country or the world. Google’s transit directions are an example: you enter a start and an end point, and it tells you which vehicles to get and where to transfer to another one


asch3tti.netlify.com

Attractive Attractor

My first post is going to be trivial but, I think, quite mesmerizing. This is sweet, but something is missing… perhaps there are not enough points. Let’s use 10,000,000! That’s more like it. Now let’s make another attractor, using 10,000,000 magenta points


jvera.netlify.com

My Docker images for R

This month I reached the dreaded “2 hours limit” at Docker Hub on building my Docker images. The images are built but the message shown is “build cancelled”. You can still use it, but on the front desk, seems that I’m not doing a good job. Clearly, something to fix


cattleguard.github.io

Data Cleaning Practice Pt1 - Steak Stats

For miles down I-40 you’ll see billboards for this restaurant and brewery advertising a free 72oz steak, if you can eat it. An office debate erupted amongst two of my colleagues over just what types of people end up beating the beast. They argued. I fired up rvest


blog.mgechev.com

Faster Angular Applications - Understanding Differs. Developing a Custom IterableDiffer

In this article we’ll take a look at another Angular abstraction - the differs and more specifically the IterableDiffer; we’ll explain what the differs are and how the framework uses them internally


www.onceupondata.com

Giving My First Data Science Talk

People have different reasons to give talks. For me the main motives are: All that showed how a supportive community can make a difference and help more people get engaged


roelandtn.frama.io

Social division of the residential space of the Greater Mexico City

EDIT: I provided a quick update here, and it will be the main entrance of a post series regarding this subject, cheers ! https://roelandtn.frama.io/post/update-for-the-mexico-project/ Hello there ! It has been a long time since my last post


www.jamesuanhoro.com

A Chi-Square test of close fit in covariance-based SEM

TLDR: If you can assume close fit for the RMSEA, there is no reason why you cannot for a Chi-Square test in SEMs


lenkiefer.com

Majestic mortgage rate plot

COME AND MAKE A MAJESTIC MORTGAGE RATE PLOT WITH ME. We’ll use R to plot a few visualizations of mortgage rates. I recently gave a number of talks about the economic outlook and housing. One point I like to make is that mortgage rates are low. I’ve shown this through a variety of visualizations


blog.wallaroolabs.com

Non-native event-driven windowing in Wallaroo

Certain applications lend themselves to pure parallel computation better than others. In some cases we need to apply certain algorithms over a “window” in our data


www.rdatagen.net

Visualizing how confounding biases estimates of population-wide (or marginal) average causal effects

When we are trying to assess the effect of an exposure or intervention on an outcome, confounding is an ever-present threat to our ability to draw the proper conclusions


www.tidyverse.org/articles

withr 2.1.0

Install the latest version with: One common idiom for dealing with this problem is to save the current state, make your change, then restore the previous state. However this approach can fail if there’s an error before you are able to reset the options. Here are some highlights of new functions for v2.1.0
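
The save, change, restore idiom described above, and the way it fails when an error interrupts it, can be sketched in a language-neutral way. The following is a minimal Python sketch, not withr's API: OPTIONS, with_options and risky_format are hypothetical names invented for illustration. The point is that the restore step sits in a finally clause, so it runs even if the body raises.

```python
from contextlib import contextmanager

# Hypothetical global settings, standing in for something like R's options().
OPTIONS = {"digits": 7}


@contextmanager
def with_options(**changes):
    """Temporarily override OPTIONS, restoring the old state even on error."""
    saved = {key: OPTIONS[key] for key in changes if key in OPTIONS}
    added = [key for key in changes if key not in OPTIONS]
    OPTIONS.update(changes)
    try:
        yield
    finally:
        # Runs whether the body finished normally or raised.
        OPTIONS.update(saved)
        for key in added:
            OPTIONS.pop(key, None)


def risky_format(value):
    """Round with a temporary precision; may raise before a manual reset could run."""
    with with_options(digits=3):
        if value < 0:
            raise ValueError("negative input")
        return round(value, OPTIONS["digits"])
```

With a manual save and reset, the ValueError path would leave digits at 3; here OPTIONS["digits"] is back to 7 after both the successful call and the failing one.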


cattleguard.github.io

3 Keys for Successful Products and Programs Before You Even Start

If you’ve been an information security practitioner for more than a few years, you’ve likely witnessed your share of disappointing purchases, implementations and initiatives


gcppodcast.com

Performance Atlas with Colt McAnlis

Colt McAnlis is a Developer Advocate at Google focusing on performance & compression. Before that, he was a graphics programmer in the games industry working at Blizzard, Microsoft (Ensemble), and Petroglyph


livefreeordichotomize.com

Secret Sampling 🎅🤶

’Tis the season for white elephant / גמד וענק / Yankee swap / secret santa-ing! There are various rules for this, for our version: 🏷 each participant receives the name of someone else to purchase a gift for 🎁 gifts are exchanged at a party 🤔 the receiver is tasked with guessing who the gift-giver was! We thought it’d be particularly fun to do it #rstats


www.tidyverse.org/articles

tidyverse 1.2.0

The latest version of tidyverse can be installed


www.blog.rdata.lu

Functional peace of mind

The problem with this approach, is that you cannot reuse any of the code there, even if you put it inside a


blog.wallaroolabs.com

Identifying Trending Twitter Hashtags in Real-time with Wallaroo

This week we have a guest post written by Hanee’ Medhat. Hanee’ is a Big Data Engineer, with experience working with massive data in many industries, such as Telecommunications and Banking


www.datalorax.com

Mapping Statewide School Ratings with US Census Tracts

In this post, I’d like to share some work related to geo-spatial mapping, statewide school ratings, and US Census Bureau data using census tracts


blog.mgechev.com

Faster Angular Applications - Part 2. Pure Pipes, Pure Functions and Memoization

In this post, we’ll focus on techniques from functional programming we can apply to improve the performance of our applications, more specifically pure pipes, memoization, and referential
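
Memoization of a pure function can be illustrated outside Angular. This is a minimal Python sketch using the standard library's functools.lru_cache, not the article's own code; the fibonacci function is just an assumed example. Because a pure function's result depends only on its arguments, a cached result can safely be returned instead of recomputing.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fibonacci(n: int) -> int:
    """Pure function: the result depends only on n, so caching it is safe."""
    if n < 2:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)


first = fibonacci(30)                        # computes each of fibonacci(0..30) once
misses_after_first = fibonacci.cache_info().misses
second = fibonacci(30)                       # answered from the cache
misses_after_second = fibonacci.cache_info().misses
```

The second call adds no cache misses, which is the same reasoning that lets a framework skip re-evaluating a pure computation when its inputs have not changed.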


blog.sellorm.com

Talk: An Operating Model for R

I’ve given a version of this talk at all three EARL conferences this year starting in San Francisco


livefreeordichotomize.com

Thanksgiving Gantt Chart

Thanksgiving 🦃 is right around the corner 🎉 – this year we are hosting 17 people 😱


www.ifconfig.it/hugo

White boxes for everyone?

White boxes and their impact on enterprise networking is a hot topic today, with many points of


blog.mgechev.com

Faster Angular Applications - Part 1. On Push Change Detection and Immutability

At AngularConnect 2017 in London, I gave a talk called “Purely Fast.” In the presentation, I showed how step by step we can improve the performance of a business application


blog.sellorm.com

Installing R on RedHat Linux 7

Pre-requisites: Before we begin, it’s a good idea to install some general purpose tools that will help us out once R is installed. The first of these dependencies is a bunch of development tools


cevo.com.au

Recognising the fallacy of a single root cause

When something goes wrong it’s often tempting to attribute it to a single root cause: The site was down because the database crashed. The application broke because the developer pushed bad code. The machine stopped responding because it ran out of disk space


ewen.io

Hacking Homelessness & PDF Prisons

How to extract data from PDFs (in this case, homelessness figures in London), alongside a showcase of algorithmic tessellation of geospatial


leonawicz.github.io/blog

Make memorable plots with memery. v0.3.0 now on CRAN.

Please do share your data analyst meme creations.


www.samatkins.me

Becoming a professional web developer

Photo by Rohit Tandon on Unsplash A few years ago I decided to transition career and become a professional web developer. Married, with a young son, and a full time job as a management consultant (i.e. long hours and lots of travel) this was not an easy undertaking


livefreeordichotomize.com

LSTM neural nets as told by baseball

Currently I am studying for my qualifying exams on which the topic is “using deep neural networks for classification of time series data


gcppodcast.com

Smart Parking and IoT Core with Brian Granatir

Brian Granatir has been developing for the cloud since the beginning, back in 2007. He left Oregon and moved to New Zealand to be with his future wife in 2014. In 2017, he joined Smart Parking to help with the development of their new Smart City platform built on GCP


lenkiefer.com

Tour of U.S. metro area house price trends

HEY! HERE IS A VIDEO SHOWING HOUSE PRICE TRENDS around the United States. Earlier this year we looked at how to get the data and plot it using R. I made the video using the PowerPoint to .mp4 workflow I outlined here. Below I’ll review how to build this file


ropensci.org/technotes

solrium 1.0

Or get the development version: If the instance uses SSL, simply specify that like: This change does break the interface we had in the old version, but we think it’s worth it


www.rdatagen.net

A simstudy update provides an excuse to generate and display Likert-type data

The ordinal data is generated after a data set has been created with an adjustment variable. We have to provide the data.table name, the name of the adjustment variable, and the base probabilities. That’s really it


yutani.rbind.io

An Example Usage of ggplot_add()

You may wonder why this can’t be written like this: Let me explain a bit


r-tastic.co.uk

Automated and Unmysterious Machine Learning in Cancer Detection

First, let’s load the data: .. and do some data cleaning: change column names, get rid of the order in factor levels and remove rows with empty cells: That’s better! Now, let’s set up the local H2O instance… ..


yonicd.netlify.com

Firearms Sourced and Recovered in the United States and Territories 2010-2016

I want to try and probe a question that was raised since Las Vegas and now revived due to the tragedy in Sutherland Springs, TX: Given the free trade between states, can a state unilaterally regulate firearms


lenkiefer.com

JOLTS a dataviz trilogy

LET’S TAKE A LOOK AT RECENT LABOR MARKET TRENDS IN THE UNITED STATES. Below we’ll plot labor market trends using the U.S. Bureau of Labor Statistics Job Openings and Labor Turnover Survey (JOLTS). Last year we looked at how to get the data and plot it using R


www.njtierney.com

Tidyverse Case Study

… data sets regarding songs on the Billboard Hot 100 list from 1960 to 2016, including ranks for the given year, musical features, and lyrics. So, this blogpost walks through how you might start to unpack the data, clean it, and draw some interesting conclusions. I also wanted to avoid the “draw the rest of the fucking owl”


lenkiefer.com

A note on competing risks

WE ARE LATE FOR HALLOWEEN, but let’s get out our broom and purrr as we tidy some statistical results. Today I had occasion to be reminded of competing risks and a handy statistical result on competing risks from A.P. Basu and J.K


www.mytinyshinys.com

EPL week 11

Match of the Day: Mourinho still can’t buy a result away against big clubs. The Spanish connection: Morata connected up with Azpilicueta’s cross for Chelsea’s winner against Manchester United and he has now assisted on five of the striker’s seven league goals this season, only topped by six of the 52 goals Diego Costa scored for the


ewen.io

My Linear(ish) Programming Fantasy

Applying linear programming principles to the selection of a fantasy football


www.ifconfig.it/hugo

VMware NSX VCP6-NV

Today I passed exam 2V0-642 to update my VCP5-DCV and got VCP6-NV (NSX v6.2). I’ll share here what I used to prepare for the exam. Learning Path: I passed exam VCP5-DCV in 2015. At the time I expected to work more with datacenter technology and I needed a more in-depth knowledge of how virtualization works


lenkiefer.com

Recent trends in U.S. housing markets

LET US REVIEW HOUSING MARKET TRENDS in the United States through the first three quarters of 2017. Economic background: The overall economic environment remains favorable for housing. Interest rates are low, the labor market has been solid and income growth, while modest, has begun to tick up


www.datalorax.com

esvis: Part 1

This is the first of a series of posts to introduce my new esvis R package, why I think it’s important, and some of its capabilities. As of this writing the current version on CRAN is 0.0.1, so it’s obviously still fresh and may have some bugs


blog.brianz.bz

Accessing VPC Resources from AWS Lambda

I’m currently working on a book for Packt publishing titled Serverless Design Patterns and Best Practices. While writing and whipping out tons of examples is quite a bit of work, and I sometimes curse myself for agreeing to this, I’m quite excited as I work through the chapters and as it comes together


eddjberry.netlify.com

Machine learning and k-fold cross validation with sparklyr

In this post I’m going to run through a brief example of using sparklyr in R. This package provides a way to connect to Spark from within R, while using the dplyr functions we all know and love. I was entirely new to Spark, and databases in general, before having a play with sparklyr


blog.wallaroolabs.com

Open-source your startup’s code in 60 days

I’m Vid Jain, CEO & Founder of Wallaroo Labs. I’m writing today to tell you about how we open-sourced Wallaroo, our software framework for data processing, and the lessons we learned along the way. Our engineering team was experienced at writing great software, but now we faced a new set of challenges


gcppodcast.com

Cloud IoT Core with Indranil Chakraborty and Gabe Weiss

It’s time to learn everything about Cloud IoT Core from Indranil Chakraborty, Product Manager, and Gabe Weiss, Developer Advocate on IoT. Indranil is a product manager for Google Cloud Platform and leads product strategy and development for Cloud IoT Core. Previously, he held product management roles at Google Fiber and sales strategy roles for Google AdWords


www.rdatagen.net

Thinking about different ways to analyze sub-groups in an RCT

Here’s the scenario: we have an intervention that we think will improve outcomes for a particular population


www.tidyverse.org/articles

lubridate 1.7.0

Lubridate is a package that makes working with date-time and time-span objects easier. It provides fast and user-friendly parsing of date-time strings, extraction and updating of components of date-time objects (years, months, days etc.) and algebraic manipulation on date-time and time-span objects


www.njtierney.com

Bringing Together People and Projects at ozunconf17

What we created was a bunch of awesome individuals from 12 different countries. Before the ozunconf we discussed and dreamt up projects to work on for a few days, then met up and brought a few of them into reality. Before the ozunconf, we discussed various ideas for projects in GitHub


www.mytinyshinys.com

EPL week 10

Match of the Day: Kane-less Spurs lose to Pogba-less United and trail City by 8 points. Final Top 6? The accepted ‘top six’, i.e. Chelsea, Spurs, Man City, Liverpool, Arsenal and Man Utd


magesblog.com

Goodbye Blogger, welcome Hugo

However, over the last year or so, a couple of things started to annoy me so much that I stopped enjoying writing posts on Blogger. For example, there is no easy way to bulk edit older posts (I wanted to change links from files hosted on public Dropbox folders elsewhere), or to write posts in Markdown or better RMarkdown


www.tidyverse.org/articles

glue 1.2.0

Install the latest version with: glue is also vectorized over its inputs


vuorre.netlify.com

Bayesian Estimation of Signal Detection Models, Part 4

However, we almost always want to discuss our inference about the population, not individual subjects. Further, if we wish to discuss individual subjects, we should place them in the context of other subjects


lenkiefer.com

Dynamic Model Averaging Presentation Slides

I PUT TOGETHER SOME SLIDES SUMMARIZING our recent work on dynamic model averaging. See here and here for more blah blah blah. See below for some slides. Click here for a fullscreen version. Making the Preso: Let me also share with you the R code I used to generate these slides


giorasimchoni.com

How Much For the Watch?

This post serves a few purposes: exercising Deep Learning with Regression, as opposed to Classification, which is my comfort zone; a tribute to my previous workplace, eBay; trying to combine text AND image data in a “Bi-Modal” model, which I never did before; and doing something which involves Fashion


thug-r.life

How many random numbers does it take?

Fermat and his library: This morning I woke up to a delightful tweet from fermatslibrary about sampling random uniform numbers and how many it takes, on average, to sum to 1
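
The question in that tweet is easy to check by simulation. The following is a minimal Python sketch (not the post's own code, which is presumably in R): it draws uniform numbers on [0, 1) until the running sum exceeds 1 and averages the number of draws over many trials. The function names and the seed are assumptions made for illustration. The known theoretical answer is Euler's number e, roughly 2.718.

```python
import random


def draws_to_exceed_one(rng: random.Random) -> int:
    """Count uniform draws needed for the running sum to exceed 1."""
    total = 0.0
    count = 0
    while total <= 1.0:
        total += rng.random()
        count += 1
    return count


def average_draws(trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the expected number of draws."""
    rng = random.Random(seed)  # seeded so that reruns give the same estimate
    return sum(draws_to_exceed_one(rng) for _ in range(trials)) / trials


estimate = average_draws(200_000)
```

With a couple of hundred thousand trials the estimate should land within a few thousandths of e; more trials tighten it further.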


lenkiefer.com

A closer look at forecasting recessions with dynamic model averaging

BACK WE GO INTO THE VASTY DEEP. LAST TIME we introduced the idea of using dynamic model averaging to forecast recessions. I was so excited about the new approach that I didn’t take the time to break down what was going on with it


www.ifconfig.it/hugo

Cisco ASA show connections ordered

When a customer calls with a problem or request I often see a chance to investigate a technology, learn something new or apply random skills to find a creative solution. This time is about an ASA, customer noticed too much traffic on the Internet facing interface


www.jamesuanhoro.com

Misspecification and fit indices in covariance-based SEM

TLDR: If you have good measurement quality, conventional benchmarks for fit indices may lead to bad decisions. Additionally, global fit indices are not informative for investigating misspecification. I am working with one of my professors, Dr


jvera.netlify.com

Rocker: Docker for Rstats

Last Tuesday (October 24th) I was at Madrid R User Group to give a tech talk about using Docker for automating our setup with Rstudio, using Rocker images for easy and quick deployment. Here are the slides, hoping someone will find them useful.


www.blog.rdata.lu

Scraping data from the local elections

One of my journalist friends was looking at the result of the local election in Luxembourg and he was dissatisfied because he was unable to compare the results of all the communes


www.njtierney.com

So, you’ve decided to change your r package name

I’ve had to change my R package names a few times. Every time I’ve had to do this, there were a few things I had to remember to do. Here’s a blog post that describes how to do that. When you change your package name, here is a list of things you need to do. And that’s basically it! Hooray, now to handle the git


www.mytinyshinys.com

Archer Memes

I have rather belatedly gotten around to viewing the adult animated series, Archer, on Netflix


www.blog.rdata.lu

Easy peasy STATA-like marginal effects with R

First, let’s load some packages: And it is also possible to plot the effects with base graphics: Which makes it possible to extract the effects for a list of individuals that you can create


lenkiefer.com

Forecasting recessions with dynamic model averaging

HERE THE LITERATURE IS VASTY DEEP. In this post we’ll dip our toes, ever so slightly, into the dark waters of macroeconometric forecasting. I’ve been studying some techniques and want to try them out. I’m still at the learning and exploring stage, but let’s do it together. In this post we’ll conduct an exercise in forecasting U.S


yutani.rbind.io

Publish R Markdown to Medium via An RStudio Addin

mediumr allows you to knit and post R Markdown to Medium. You can install mediumr from github with: The addin knits the Rmd and shows the preview dialog


blog.wallaroolabs.com

Why we used Pony to write Wallaroo

Hi there! Today, I want to talk to you about why we chose to write Wallaroo, our distributed data processing framework for building high-performance streaming data applications, in Pony. It’s a question that has come up with some regular frequency from our more technically minded audiences


yonicd.netlify.com

Extending slackr

This lets us interact with R and Slack in new ways, by getting active updates on long simulations directly to your (and your team’s) mobile device and multitask away from your computer


lenkiefer.com

Home sales in expansions and recessions

LET’S LOOK AT NEW HOME SALES. Today the U.S. Census Bureau, jointly with the Department of Housing and Urban Development (HUD), released new home sales estimates through September of 2017


yutani.rbind.io

How Not To Knit All Rmd Files With Blogdown

Compiling all Rmd files is “safe” in the sense that we can notice if some Rmd becomes impossible to compile due to some breaking changes of some package. But, it may be time-consuming and can be a problem for those who have a lot of .Rmd files


jessesadler.com

Introduction to Network Analysis with R

Compare this to an adjacency matrix with the same data


gcppodcast.com

Vint Cerf

Google, the Cloud, or podcasts would not exist without the internet, so it’s an incredible honor to celebrate our 100th episode with one of its creators: Vint Cerf


www.stencilled.me

Visualizing Airbnb listings.

In this day and age of so many sharing services like Uber and Lyft, pricey hotels are being replaced by Airbnb


yutani.rbind.io

dplyr

Some of my friends weren’t aware that dplyr now accepts characters


leonawicz.github.io/blog

apputils 0.5.0 released

The key updates


yonicd.netlify.com

jsTree htmlwidget

jsTree is an R package that is a standalone htmlwidget for the jsTree JavaScript library. It can be run from the R console directly into the Rstudio Viewer, be used in a Shiny application or be part of an Rmarkdown html output


leonawicz.github.io/blog

Climate explorer update

These conditional distributions for historical and projected temperature and precipitation over different geographic regions, time periods, climate models and greenhouse gas emissions scenarios represent the source data sets available in the app


lenkiefer.com

Sharing is caring

LET’S MAKE SHARING BETTER ON THIS PAGE. I saw this today: 💫 how-to for making your #blogdown social-friendly: “Socialize your blogdown” by @xvrdm https://t.co/RurfUvRf6X #rstats #opengraph pic.twitter


www.rdatagen.net

Who knew likelihood functions could be so pretty?

We are generally most interested in finding out where the peak of that curve is, because the parameters associated with that point (the maximum likelihood estimates) are often used to describe the “true” underlying data generating


www.gokhanciflikli.com

Automatic Time-Series Forecasting with Prophet

Seasonality and Trends: Time-series analysis is a battle on multiple fronts by definition. One has to deal with (dynamic) trends, seasonality effects, and good old noise


lenkiefer.com

Combining PowerPoint and R's tweenr for smooth animations

IN THIS POST I WANT TO SHARE A METHOD FOR MAKING SMOOTH POWERPOINT ANIMATIONS USING R


roelandtn.frama.io

liftr (Rmarkdown using docker)

I recently discovered an R package called liftr. It allows building PDF documents (and HTML files, but I didn’t test that) from an Rmarkdown file. Fully integrated in RStudio, the R code is executed (and other code as well; I tried Python) and results are displayed


yutani.rbind.io

Confession

You may have noticed that diffs of .Rd files have been suppressed by default on GitHub for some time. Do you wonder who did this? It’s me, yay! This is my pull request: Though I thought I’d done the right thing at that time, now I’m afraid this change may be bad for some cases… After the release of roxygen2 6.0.0, the game has changed a bit


lenkiefer.com

Purrrtier PowerPoint with R

WE ARE ON OUR WAY TOWARDS BUILDING a tidy PowerPoint workflow. In this post I want to build on my earlier posts (see here for an introduction and here for a more sophisticated approach) for building a PowerPoint presentation with R and try to make it even purrrtier


blog.sellorm.com

Quick Script to Install an R Package from the Command Line

I wrote a really quick script to install R packages from the command line that I thought I’d share. It doesn’t really do a great deal, but you can use it to install one package at a time. Save the below as rpkginstall and make sure it’s executable with chmod +x rpkginstall. Then you can install a package like this example, which would install dplyr,


www.ifconfig.it/hugo

Innovation sirens singing

In episode 13 of the Network Collective podcast around minute 26 Jordan Martin asks: Aren’t we all just following a trend? The discussion topic is how to mentor juniors in a learning path to grow their skills and be experts


www.riinu.me

Your first Shiny app

I am completely obsessed with Shiny and these days I end up presenting most of my work in a Shiny app. If it’s not worth putting in a Shiny app it’s not worth doing. Getting started with Shiny is actually a lot easier than a lot of people make it out to be


blog.wallaroolabs.com

How Wallaroo Scales Distributed State

Scaling stateful applications is hard. As your business grows, you’re eventually going to find that demand is greater than capacity. That means you can’t simply deploy your application to a set number of servers and forget it


lenkiefer.com

Mortgage rates are low!

MORTGAGE RATES ARE LOW IN THE UNITED STATES. How low? Let’s take a look. We’ll use R to plot a few visualizations of mortgage rates. We’ll also try out some of the nice features in the tibbletime package that help when working with time series data