mlr-blog.netlify.com
First release of mlrMBO - the toolbox for (Bayesian) Black-Box Optimization
We are happy to finally announce the first release of mlrMBO on CRAN after a quite long development time. For the theoretical background and a nearly complete overview of mlrMBO’s capabilities you can check our paper on mlrMBO that we presubmitted to arXiv. The key features of mlrMBO are: Global optimization of expensive Black-Box functions…
livefreeordichotomize.com
Introducing shinyswipr
One day Lucy was sitting around on twitter when she spotted this tweet: I built a “tinder for academic preprints” web app: https://t[... ](https://livefreeordichotomize.com/2017/03/12/introducing-shinyswipr-swipe-your-way-to-a-great-shiny-ui/)
adamspannbauer.github.io
Profanity in Twitch Chat
This post contains references to profane language. All instances of profanity on this page have been censored. The post is part tutorial with code chunks; if you are uninterested in the data manipulation steps, you can jump to Visualizing Profanity in Chat to see the results. Background: In this post we’ll be analyzing profanity in a Twitch…
cevo.com.au
The Four values of a devops transformation
A successful devops transformation sees a change in organisational culture. These changes often come in the way of adoption of specific tools or practices…
mlr-blog.netlify.com
Being successful on Kaggle using mlr
Achieving a good score on a Kaggle competition is typically quite difficult. This blog post outlines 7 tips for beginners to improve their ranking on the Kaggle leaderboards…
adamspannbauer.github.io
lexRankr & Twitter
Packages used: lexRankr, tidyverse, stringr, httr, jsonlite. In this post we’ll get tweets from Twitter using the Twitter API and then analyze the tweets using lexRankr in order to find a user’s most representative tweets…
mlr-blog.netlify.com
OpenML tutorial at useR!2017 Brussels
What is OpenML? Conducting research openly and reproducibly is becoming the gold standard in academic research. Practicing open and reproducible research, however, is hard. OpenML…
blog.sellorm.com
Force all traffic through OpenVPN connection
This is a really quick one, as we use this trick a lot when working remotely, but we always have to scrabble around to find the info! We use the open source OpenVPN for our office VPN…
gcppodcast.com
Python with Jon Wayne Parrott
Jon Wayne focuses on the Python developer experience for Google Cloud Platform…
www.ifconfig.it/hugo
Cisco Live Europe 2017 Berlin
It’s time for CLEUR again, for the second year in Berlin, that’s 5 years in a row and I still get excited when the date arrives. #CLEUR #CCIE #netvet #ciscochampion #merakimaster pic.twitter.com/IyF883JeYk - Gian Paolo (@gp_ifconfig) February 21, 2017 I won’t repeat many considerations I’ve already made last year…
cevo.com.au
Why?
We all know DevOps is not about the tools or the process, it’s about a deeper cultural movement…
www.mytinyshinys.com
Spotify - all 20 million tracks
I have consistently been interested in assessing music information in R, for example my - somewhat dormant - charts dashboard. A recent, excellent blog post by RCharlie featuring the Spotify and Genius APIs to determine a ‘Gloom Index’ for Radiohead tracks piqued my interest…
livefreeordichotomize.com
Intro to GMD
Why? The problem: you are working in a collaborative situation on some form of analysis. You want to produce a nice looking document of your work at the end and have easy contribution from all sources…
cevo.com.au
A spot of AWS
As more enterprises start transitioning over to using AWS, cost optimisation has become a hot topic. Andy and I were recently given the opportunity to explore the potential savings of running on spot instances. What are spot instances? Spare compute capacity that Amazon Web Services (AWS) sells at a discounted price…
www.stencilled.me
Top Beers in 2016.
After completing my previous post on food I wanted to work on something which I have started to explore recently: craft beer. A friend of mine introduced me to a beer club membership prior to which I never knew anything beyond the Coronas …
yonicd.netlify.com
ggedit 0.1.1
In addition to the output UI the user also gets a reactive output that has all the objects that are in the regular ggedit package (plots, layers, scales, themes) both in object and script forms…
lenkiefer.com
A grand tour of house price trends
LET US BUILD ON YESTERDAY’S POST (LINK) and construct more VISUALIZATIONS of house prices. In this post, I’ll include some R code so you can play along…
translatedmedicine.com
Call Me Anytime
It was two o’clock on a Tuesday afternoon and Abigail* was in the emergency room for the third time this month. I was able to step away from the day-to-day struggles of the general inpatient medicine floors to see her in the emergency department. I had promised to keep her out of the hospital, and I had failed…
lenkiefer.com
Recent trends in house prices
IN LATE 2016 HOUSE PRICES recovered back to their pre-recession peak. At least nationally. At least not adjusted for inflation. Let’s talk about it. National trends The chart below shows the Freddie Mac House Price Index (link to source) for the United States from December 2000 to December 2016…
www.samatkins.me
Neighbourhood Map Project
This post relates to a front-end JavaScript and Knockout.js MVVM project, using APIs from Google Maps and Foursquare. The GitHub repo is here, and a demo is live on surge.sh. Context I’ve been meaning to write a blog post on one of my projects and not found the time. So I’m writing about my Neighbourhood Map project for two reasons…
livefreeordichotomize.com
The dire consequences of tests for linearity
This is a tale of the dire type 1 error consequences that occur when you test for linearity…
blog.brianz.bz
How to setup a free SSL certificate for use with Serverless APIs
NOTE!!! As of March 2017 this post is no longer relevant…
www.ifconfig.it/hugo
SDN anyone?
DISCLAIMER I work in the enterprise market, mostly with routing&switching, wireless, security products from multiple vendors. I have no SDN/cloud/ISP real experience yet…
shotwell.ca/blog
The Power of Tidy Data
One problem with data analysis is that you often need to make critical decisions before you really understand the problem…
gcppodcast.com
Improbable with Rob Whitehead
In a previous life, he was an indie iOS developer, and an arms dealer in Second Life! How can I resize a persistent…
www.stencilled.me
Where do people eat in Austin ??
Recently I visited Austin, and many of my friends had mentioned the variety of food options there. So my wife and I decided to search for places to eat on the Foursquare app…
dsnotes.com
Fitting logistic regression on 100gb dataset on a laptop
EDIT: Thanks for the comments; I created a repository with full end-to-end reproducible code. You can find it here - https://github.com/dselivanov/kaggle-outbrain. This is a continuation of Lessons learned from “Outbrain Click Prediction” kaggle competition (part 1). As a quick recap - we achieved MAP@12 ~ 0…
blog.sellorm.com
Production R at ONS
This post originally appeared on the Mango blog, here - http://www.mango-solutions[... ](https://blog.sellorm.com/2017/02/13/production-r-at-ons/)
mlr-blog.netlify.com
mlr Google Summer of Code 2017
We are happy to announce that we applied for another Google Summer of Code project in 2017. Operator Based Machine Learning Pipeline Construction: We aim to change the way we are currently doing data preprocessing in mlr. Have a look at the proposal linked above for more details…
mlr-blog.netlify.com
mlr Workshop 2017
When and Where? In 2017, we are hosting the workshop at LMU Munich. The workshop will run from 6 March to 10 March 2017 (potentially including the Sunday before and the Saturday at the end), hosted by the Ludwig-Maximilians-University Munich. Important Dates: Address: Geschwister-Scholl-Platz 1, Room: M203…
lenkiefer.com
Experimenting with expanding axes
LET US EXPERIMENT A BIT WITH AXES. In this post I’m going to try out some data visualization ideas expanding on our earlier work with tick marks (see post ticks out)…
lenkiefer.com
House prices are highest in coastal metros
TODAY THE NATIONAL ASSOCIATION OF REALTORS (NAR) released (press release) data on metro area median sales prices of existing single-family homes (the U.S. Census and HUD report data on new home sales prices in a joint release). NAR makes the data available (Excel file)…
livefreeordichotomize.com
The prevalence of drunk podcasts
For today’s rendition of I am curious about everything, in Hilary Parker & Roger Peng’s Not So Standard Deviations Episode 32, Roger suggested the prevalence of drunk podcasting has dramatically increased - so I thought I’d dig into it 🚧👷. I pulled the iTunes API for the term drunk in podcasts & plotted the results over time…
www.mytinyshinys.com
Wikipedia Page Views
This is the first in a category of retreads where I look again at past work which could do with some loving care…
lenkiefer.com
Description and presentation, exploration and analysis
THERE IS A LOVELY BOOK on writing style called “Clear and Simple as the Truth” by Francis-Noël Thomas and Mark Turner (webpage). In it Thomas and Turner distinguish between several writing styles including practical style and classic style…
lenkiefer.com
Ticks out!
YOU HAVE SPOKEN and we will go with ticks out, at least 54% of the time. In a graph, should axis ticks face in or out? - Leonard Kiefer (@lenkiefer) February 5, 2017
To celebrate, let’s make an animated gif where the axis expands over time. We’ll use data we used in our mortgage rate post…
lenkiefer.com
Hello Ninja! Crafting a browser-based presentation and how I got (re)started with R
I GIVE A LOT OF TALKS. Some are formal presentations or keynotes to large groups, while many are in small group settings. Sometimes I get impromptu requests so I have to be ready pretty much at all times to give some sort of talk…
dsnotes.com
Large data, feature hashing and online learning
EDIT: Thanks for the comments; I created a repository with full end-to-end reproducible code. You can find it here - https://github.com/dselivanov/kaggle-outbrain[... ](http://dsnotes.com/post/2017-01-27-lessons-learned-from-outbrain-click-prediction-kaggle-competition/)
shotwell.ca/blog
R for Excel Users
Like most people, I first learned to work with numbers through an Excel spreadsheet…
rsangole.netlify.com
Finite Mixture Modeling using Flexmix
This page replicates the code written by Grün & Leisch (2007) in ‘FlexMix: An R package for finite mixture modelling’, University of Wollongong, Australia…
wirtel.be
PythonFOSDEM 2017 - Call for Volunteers
Introduction: The Python community will be represented during FOSDEM 2017 with the Python Devrooms. This year, we will have two devrooms, the first one for 150 people on Saturday and the second one for 450 people on Sunday. It’s really cool because we have accepted 24 talks instead of 16…
wirtel.be
PythonFOSDEM 2017 - Schedule
Introduction: The Python community will be represented during FOSDEM 2017 with the Python Devrooms. This year, we will have two devrooms, the first one for 150 people on Saturday and the second one for 450 people on Sunday. It’s really cool because we have accepted 24 talks instead of 16…
gcppodcast.com
SRE II with Paul Newson
Before joining Google, he cofounded a tiny game technology startup, sold it to Microsoft, where he then worked on DirectX, Xbox, Xbox Live, and Forza Motorsport, before spending some time working on interesting machine learning problems in Microsoft Research…
lenkiefer.com
Wrangling employment data, plotting trends
We will get back to house prices soon. IN THIS POST I WANT TO EXPLORE EMPLOYMENT TRENDS at the state and metro area. Today the U.S. Bureau of Labor Statistics (BLS) released data on state and metro area employment trends. Last month we looked at unemployment trends. Today we’ll look at trends in nonfarm employment…
eddjberry.netlify.com
Intro to R slides
For the Perception Action and Cognition Lab Open Science Week, 2017 (University of Leeds) I gave two talks introducing R. You can see the slides below. The code for the slides can be found over at GitHub…
lenkiefer.com
Visualizing housing value distributions by metro
EARLIER TODAY I HAPPENED ACROSS AN INTERESTING post by Ken Steif (twitter @kensteif) at the Urban Spatial Analysis blog that predicts gentrification using census data. Do take some time to check out the post…
lenkiefer.com
Best year for home sales in a decade
WE ARE ONE MONTH INTO 2017 AND WITH THIS MONTH’S economic releases we’ve completed most of the picture of 2016. These data by and large matched our expectations as I outlined in my 2016 year-in-review. Let’s take a quick look…
blog.mgechev.com
Implementing Angular's Dependency Injection in React. Understanding Element Injectors.
Recently I’ve been blogging mostly about Angular and it’s not by accident! Angular is an amazing framework, bringing a lot of innovation to the front-end technologies, with a great community behind it. At the same time, the projects that I’m working on have various different requirements and sometimes I need to consider different options…
blog.sellorm.com
Accessing CRAN from an internet-less LAN
If you are in an environment where you have R running on a server on a secure LAN with no internet access, you’ll be familiar with the situation outlined in the image above…
lenkiefer.com
Fun with Plotly
RECENTLY I HAVE BEEN EXPLORING FLEXDASHBOARDS to visualize data. In this post I want to focus on a tool I’ve found particularly useful, plotly. Plotly enables you to make interactive html widgets that you can embed in your webpage or view from within R…
www.stencilled.me
NFL Season 2016-17
Hello World !!! This is my first project using d3js. Being a GIS professional, visualization is always a part of the job. I always wanted to learn different ways of visualizing data, be it a simple plot using R or a choropleth map using ArcMap. In this project I am trying to display how the NFL season 2016-17 went about…
lenkiefer.com
GDP Growth Chart (animated)
IN THIS POST I WANT TO SHARE WITH YOU some code to create an animated plot of annual growth rates in U.S. Real Gross Domestic Product (GDP). As in most of my posts, we’ll be creating these graphs in R. GDP Plot On Friday the U.S…
livefreeordichotomize.com
Yoga for modeling
A New Year’s resolution for all of our models: get more flexible! As an aside, I’m better at implementing yoga for my models than yoga for myself, most of the time I end up like this: Anyways, let’s make our models flexible! By flexible, we mean let’s be more intentional about fitting nonlinear parametric…
lenkiefer.com
Converting a Tableau dashboard to a Flexdashboard
Edited on 2017-01-27 to correct typos and fix tooltip in dashboard. IN THIS POST WE WILL CONVERT a data visualization dashboard I made some time ago using Tableau into a flexdashboard using R. On Monday, the Census posted a blog summarizing recent mobility trends. According to the CPS ASEC, 11…
www.onceupondata.com
A Glimpse into The Daily Life of a Data Scientist
A couple of weeks ago, I had a discussion with a co-worker regarding a project I was involved in. I felt that there was no clear understanding of the daily challenges data scientists face…
livefreeordichotomize.com
CatterPlot thank you note
Lara Harmon has put in countless hours to build and uplift the ASA Student community. We are SO grateful…
blog.sellorm.com
Controlling lights from within the RStudio IDE
I completely forgot that I made this silly video last year…
gcppodcast.com
Java with Ray Tsang and Rajeev Dayal
In this second episode of the year we’ll talk Java! Rajeev Dayal is an Engineering Manager at Google New York that manages the Cloud SDK and Java on GCP efforts…
blog.brianz.bz
Authoring Alexa Skills with Python and Lazysusan
Most recently at my day job we were tasked with building an Amazon Alexa app for a client. As soon as I heard rumors that we would be doing an Alexa app I started raising my hand hoping that I’d get put on this project…
livefreeordichotomize.com
Custom JavaScript visualizations in RMarkdown
I happened to stumble upon the preview release page for RStudio recently and noticed something that made me exorbitantly happy. A preview release of RStudio v1.0…
blog.sellorm.com
Now with added Blogdown!
I was lucky enough to be at rstudio::conf in Orlando a couple of weeks ago where my friend, Tareef, told me that Yihui Xie, Software Engineer at RStudio, had been working on…
blog.sellorm.com
RStats - Plumber launcher script
Plumber Launcher Script I use Jeff Allen’s Plumber a lot…
lenkiefer.com
A guide to building an interactive flexdashboard
INTERACTIVE DASHBOARDS CAN BE AN EFFECTIVE WAY to explore and present data…
blog.mgechev.com
Distributing an Angular Library - The Brief Guide
In this post I’ll quickly explain the minimum you need to know in order to publish an Angular component to npm. By the end of the post you’ll know how your module can: be platform independent (i.e. run in Web Workers, Universal); be bundled and distributed; work with Angular’s Ahead-of-Time compiler…
lenkiefer.com
Flexin' Friday
WE WORKED OUT A VISUALIZATION REMIX ON WEDNESDAY and now that it’s Friday it’s time to flex a little. In this post I’m going to remix the remix into a flexdashboard. I made this dashboard using crosstalk and plotly. By using htmlwidgets we can create an interactive dashboard in a static webpage…
gcppodcast.com
Pokémon GO with Edward Wu, Director of Software Engineering at Niantic
He received his PhD from Stanford in Physics in 2009 applying Bayesian parameter estimation models to cosmological data he collected from three visits to Antarctica and the South Pole…
lenkiefer.com
Working on a Workout
SO APPARENTLY IT IS WORKOUT WEDNESDAY, a day when data visualization fans try to build data visualization skills. I heard about it via this post by @hrbrmstr that reconsiders a visualization of state level unemployment rates. (Original post here)…
cevo.com.au
You're doing it wrong if...
I’ve been around for a while, worked through different teams, across different industries and companies. Over time I have learnt a lot of lessons, some the easy way, most the hard way…
blog.mgechev.com
Angular in Production
In this informal essay I’ll go through a case study of my experience in using Angular (2 and above) in production. Last April, together with a small team, we started working on an educational application; the second version of a product that I developed about 3 years ago using Angular 1. The product targets young kids and their parents…
livefreeordichotomize.com
Regression modeling strategies
Frank Harrell teaches an amazing course “Regression Modeling Strategies” based on his book each spring at Vanderbilt. This was one of my all time favorite courses. It has just the right amount of practical strategies, brilliant statistical insight, and zealous disdain for all things stepwise regression…
lenkiefer.com
That's what I'm (cross)talking about!
IN THIS POST I SHARE WITH YOU SOMETHING I BUILT with R, a simple dashboard using crosstalk, plotly, and data tables. By using htmlwidgets we can create an interactive dashboard in a static webpage. In this version I created a few line charts and data tables. Then I created a linked view using crosstalk to enable you to filter the table and chart simultaneously…
lenkiefer.com
Housing's best year in a decade, remix
IN THIS POST I PRESENT a remixed version of my 2016 year in review article. By many measures 2016 was the best year for housing in a decade. Back in May I shared some trends on housing markets and then in December I did a full year recap. This document is a flexdashboard version of my December 2016 year-in-review article…
www.mytinyshinys.com
Notebook Collaboration
Wild boars amongst us. One of the great virtues of R Notebooks is the ease of collaboration and Tuija Sonkkila recently kindly made hers on wild boars available. As - in a former existence - I was a pork buyer for a grocery chain, this piqued (almost-pun intended) my interest…
www.njtierney.com
Magic reprex
Making reproducible examples can be hard. There’s a lot of things you need to consider. Like, making sure your environment is clean, the right packages are loaded, the code is formatted nicely, and images are the right resolution and dimension. Getting all of these ducks lined up can sometimes take a couple of minutes, if you have a nice tightly defined problem. Other times, it can take much, much…
www.seanlnguyen.com
Mapping Starbucks Locations
This is where I’ll be posting tutorials on how to use R and RStudio to create some amazing graphics and visualizations. If you are completely new to R, don’t worry, I will post guides to explain how to start from scratch…
wirtel.be
Python Events in 2017, Need your help!
Events in 2017 Hello, for the PythonFOSDEM [1] on 4th and 5th February in Belgium, I would like to present some slides with the Python events around the…
lenkiefer.com
Mortgage rate flexdashboard
IN THE PAST I HAVE USED MANY DIFFERENT programs to visualize data. I’ve done quite a few visualizations using Tableau. I enjoy using that program, but it does have some drawbacks. As an alternative, I have been exploring using flexdashboards for R. One advantage of flexdashboards is that you can easily incorporate R code into the design of dashboards…
www.samatkins.me
Sublime Text 3 and Python
As I wrote recently, after jumping between JavaScript, GoLang and Python, I’ve decided to heed some good advice and focus on one language for a while. And that language is Python. This post is about my Sublime Text 3 set-up for programming in Python…
livefreeordichotomize.com
dplyr thank you note
It’s that post-holiday time of year to write some thank yous! I’m getting excited to attend rstudio::conf next week, so in that spirit, I have put together a little thank you using…
www.mytinyshinys.com
Exploratory Analysis
Analyzing some Premier League data. I recently had some fan-mail! The author suggested I write a book on the “practical considerations when building shiny”…
www.samatkins.me
100 Days of Code
I’ve joined the #100DaysOfCode Challenge. Over Christmas I decided to focus in Q1 2017 on completing Udacity’s Full-Stack Nanodegree. As part of this, a complementary objective is to really get to grips with Python…
lenkiefer.com
A look back at housing's best year in a decade
THE YEAR IS DRAWING TO A CLOSE and by many measures 2016 was the best year for housing in a decade. Back in May I shared some trends on housing markets in a Tufte style document. Now that we have nearly a full year’s worth of data, let’s see how housing and mortgage markets did in 2016. Well, I’ve created a retrospective looking back at the full year…
www.samatkins.me
Managing Python projects with Virtualenv
Python 2 versus 3 To summarise the Python wiki: “Python 2.x is legacy, Python 3.x is the present and future of the language.” More info is available on the Python wiki…
lenkiefer.com
Data tables are Viz too
THOUGH 2016 IS NOT OVER YET I want to get a jump on my 2017 resolution: make better tables. I’ve been re-reading this paper on the Rudiments of Numeracy by A. S. C. Ehrenberg published in the Journal of the Royal Statistical Society in 1977…
livefreeordichotomize.com
Wait, what are P-values?
Frequently, and especially recently, misunderstandings of common statistical terms/concepts have caused confusion and even anger. I would like to (attempt to) clear up a big player in the world of commonly used (and commonly misunderstood) statistical concepts: the p-value…
lenkiefer.com
Populous metros are heavy!
I WANT TO SHARE WITH YOU a little bit of code to make this whimsical data visualization: Make a simple map First we can construct a map of the lower 48 U.S. states and add a marker for each city. These data are available in the us…
blog.brianz.bz
Serverless 1.x
Since my last posts on Serverless, Serverless has gone 1.0. In fact, as of this writing Serverless is at version 1.3. I’ve had the luck of taking 1.3 for a spin with my new job by implementing an application for the Amazon Alexa platform…
nilsreimer.com
Reimer et al. (2017). Intergroup contact and social change.
Our new paper is now in print. Click here for an open access preprint, and here for all figures, data, and analysis scripts…
lenkiefer.com
Simple tweenr animations with ggplot2
Animations with tweenr: IN THIS POST WE ARE GOING TO CREATE TWO SIMPLE animated data visualizations using the R packages ggplot2, animation, and tweenr. See this post about tweenr for an introduction to tweenr, and more examples here and here…
lenkiefer.com
Even more mortgage rate visualizations
Introduction: WE ARE BACK WITH EVEN MORE WAYS TO VISUALIZE mortgage rates. A few days ago, I shared some ways to visualize mortgage rate trends and here I posted some additional gifs without the code. I’m going to expand on that last post with R code for one of those charts, and give you a totally new one…
lenkiefer.com
See Data, Speak Data Part 1
IF YOU HAVE BEEN FOLLOWING along (welcome if you’re stopping by for the first time) you’ll have seen many different data visualizations that I have shared with you. But let’s take a step back and consider what to do with them…
livefreeordichotomize.com
Hill for the data scientist
This was inspired by Hilary Parker & Roger Peng’s Not So Standard Deviations Episode 28, which can be found…
lenkiefer.com
More amazing ways to visualize mortgage rates
IN THIS POST I AM GOING TO share a couple gifs displaying mortgage rate trends. Check my earlier posts here and here for other charts. I don’t have time tonight to write up all the code for these, so I’m just sharing the images. Check back later (message/tweet at me) if you’d like to see the code…
gcppodcast.com
A Year in Review
Spotify is now on Google Cloud Platform: Kubernetes and Google Container Engine Education: Multiple General Availabilities Machine Learning and Big Data Google Cloud Platform Community Slack. We’ll be back on January 18th, 2017 - See you all…
lenkiefer.com
Let's fix a dot plot
IN THIS POST WE’RE GOING TO REVISE the dotplot code I posted that lets you take the Federal Open Market Committee (FOMC) projections and turn them into an animated dotplot…