mvaugoyeau.netlify.com
The design of my logo
The logo originally. The method of traveling salesperson. My logo. Today, I explain how I designed my logo for the blog from the one drawn by Sébastien Rochette for my thesis and the traveling salesperson method by Antonio S. Chinchón…
ropensci.org/technotes
icon: web icons for rmarkdown
Icons can be added to your R Markdown documents using short prefixes which identify the font’s library…
ramhiser.com
Autoencoders with Keras
I’ve been exploring how useful autoencoders are and how painfully simple they are to implement in Keras…
www.matteodefelice.name
Consecutive days of good weather in the Netherlands
I have recently moved to North Holland, and in the past weeks the weather was particularly fortunate: for many (consecutive) days there was no rain and the temperatures have been very high for this area (the maximum temperature was easily above 25 degrees)…
ndres.me
How to make animated gifs from Matplotlib plots easily
The problem: If you Google how to make an animated Matplotlib graph, you end up with code like this: fig, ax = plt.subplots() x = np.arange(0, 2*np.pi, 0.01) line, = ax.plot(x, np.sin(x)) def animate(i): line.set_ydata(np.sin(x + i/10…
cevo.com.au
Innovating with AWS
Cevo’s General Manager Hannah Browne was invited to take part in this year’s AWS Summit Partner panel - “Innovating with…
www.rdatagen.net
Is non-inferiority on par with superiority?
In the case of an inferiority trial, we add a little twist. Really, we subtract a little twist…
fharrell.com
Navigating Statistical Modeling and Machine Learning
To extend the analogy, the guideposts identified by Frank could be illustrated as a route map if put into the format of a series of junctures (and termini). Here is an example: This allegorical cartoon is simplistic: the situation is certainly much more nuanced than this…
blog.rstudio.com
sparklyr 0.8
MLeap allows you to use your Spark pipelines in any Java-enabled device or service. This works by serializing Spark pipelines which can later be loaded into the Java Virtual Machine (JVM) for scoring without requiring a Spark cluster…
translatedmedicine.com
Is Your Hospital Closer to a Dunkin or Starbucks? (MA edition)
This post is inspired by #tidytuesday. Coffee is the life force for many healthcare workers. Here too, the age old question arises: Dunkin or Starbucks? Sometimes, it just comes down to proximity. We need coffee and we need it now! So, I decided to see what chain was closest to each hospital in…
sciathlon.github.io
My first half marathon and its data analysis
Hi everyone! I ran my first half marathon this spring and wanted to talk about my experience. I also analyzed the results of the race which I found on the Grenoble-Vizille website, so scroll down if you want to only look at the data science part of this article…
www.rostrum.blog
Accessibility workshop at #Sprint18
Matt Dray. Sprinting: Sprint events are a chance for the government digital, data, design and technology community to look back on the work we’ve been doing to transform government and to look forward at what we need to do. Kevin Cunnington (@kevincunnington), Director General of the Government Digital Service (GDS), outlined this in a recent blog…
bayesianbabes.netlify.com
Binomial Probability Distribution
Assume your cat just had a litter of kittens. Con-cat-ulations! However, you suspect you will soon have five different paws gently tapping your face in the morning when you are sleeping…
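As a rough illustration of the distribution named in this post's title, the probability of exactly k successes in n independent trials can be computed with the standard library alone. The litter size and the 0.5 probability below are hypothetical values chosen for illustration; they are not taken from the post.

```python
from math import comb


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials,
    each succeeding with probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


# Hypothetical example: a litter of 5 kittens, each assumed to be
# female with probability 0.5 (both numbers are illustrative assumptions).
litter_size = 5
p_female = 0.5

distribution = [binomial_pmf(k, litter_size, p_female) for k in range(litter_size + 1)]
for k, prob in enumerate(distribution):
    print(f"P({k} female kittens) = {prob:.4f}")
```

The probabilities over all possible counts sum to 1, which is a quick sanity check on any probability mass function.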
bayesianbabes.netlify.com
Unsupervised Categorical Distribution in JAGS
Question: (1) What methods are necessary to find significant covariates? (2) How do different strata affect our coefficients? (3) Must strata be defined a priori or can we find the stratification of 5 intercept groups within our data set? Simulation: To answer our question, we will simulate 1000 plants with 5 different genotypes which will vary by…
blog.rstudio.com
Enterprise Advocate
I joined in early 2014. I was excited by RStudio since I love helping people and, being an open source company, RStudio seemed like a great way to reach a lot of people and get to assist with numerous interesting use cases…
data-chips.com
Pixelating the Mets logo
Perler bead patterns: One of my favorite hobbies is creating pixel art with perler beads. To get the patterns, I usually search for pixelated images on Pinterest. But I’ve always wanted to create my own pixelated images too! I found this quad-ruled boombox notebook at TJMaxx recently and it inspired me to design my own pixel boombox…
data-chips.com
Tidying old functions
Contents: Introducing your old functions to your new friend tidyverse; Read in the sample data with readr; Original functions; Updating the original asset manager tables; Understanding more about tidy evaluation; Summary. Introducing your old functions to your new friend tidyverse: A while back I wrote some R functions to analyze and summarize data for an ongoing quarterly report project I had been…
www.sastibe.de
Use Emacs Org Mode and REST APIs for an up-to-date Stock Portfolio
The API used in their scenario gave different results with a much cleaner nomenclature. For the Alphavantage API, I had to become a little creative with the eLisp code…
cevo.com.au
Using the new Resource Tagging API in anger
Building quality into an automated delivery pipeline can sometimes cause a head scratcher when you have to figure out how to validate the work you’re currently doing. This happened recently as we started extending our validation of Amazon Web Services (AWS) resources to a set of Application Load Balancers (ALBs)…
gcppodcast.com
Beam and Spark with Holden Karau
Upcoming Talks: I have a continuous integration build process set up with Container Builder, but it’s all sequential. I want to speed things up by processing parts of it in parallel…
theaknowles.com
Generate Word Clouds in R from Conference Tweets
A few considerations to bear in mind before forging ahead: Assuming there’s an agreed-upon communal hashtag for a given conference, this is a fairly quick and painless fun little task…
engineering.pivotal.io
How to Install a TLS Certificate on vCenter Server Appliance (VCSA) 6.7
We want a green padlock on our VCSA’s web client; when we use our browser to navigate to our VCSA, we’d like it to look like the following: This blog post describes the steps to follow in order to install a TLS certificate on a VCSA 6.7…
blog.mgechev.com
Introducing Guess.js - a toolkit for enabling data-driven user-experiences on the Web
About two months ago I published my initial research[1] on data-driven bundling. A few weeks after that, I had the opportunity to present my work at RenderConf in Oxford, UK in my talk “Teach Your Bundler Users’ Habits”[2]…
cjbarrie.netlify.com
Mapping the Tunisian Revolution
R provides a growing number of mapping packages. In this post I document my workflow for producing a map of the diffusion of protest during the Tunisian Revolution…
roh.engineering
PE Industrial Engineer Reference Sheet
80% of the items represent 20% of the sales or 20% of the items represent 80% of the cost. This law is a rule of thumb. The operation process chart only has Operations and Inspections. The flow process chart forces a more detailed look at a system. Organizes a large number of ideas into their natural relationships. Shows when each hand is busy and idle…
blog.rstudio.com
RStudio Connect v1.6.2
There are a handful of new features that are highlighted below…
jvera.netlify.com
using inotify for long running R scripts
Sometimes your Rscript takes a long time to run. Not for computation requirements but for the kind of processes involved, for instance when calling APIs with time limits and such (I’d use big data architectures otherwise)…
davemcg.github.io
#BoG18: Talk Notes
Typos everywhere. Things may change dramatically over time as I scan back through notes. I’ve tried to respect #notwitter. Will be updated periodically…
lenkiefer.com
Spotlight on housing affordability
IN MY LINE OF WORK (finance/economics), you see a lot of dual axis line charts. I am of the opinion that dual y axis charts are sort of evil. But in this post I’m going to make one. It’s for a totally legit reason though. Like in an earlier post, we’ll make a graph similar to one I saw on xenographics…
www.mytinyshinys.com
EPL Week 37
Match of the Day: Only one point from the top four this weekend means Chelsea still have a chance to catch Liverpool and Spurs…
www.aggieerin.com
Gathering Text from the Web
Hi everyone! I don’t really feel like working too hard today, so I decided to write a blog post about how my student Will and I used rvest to mine articles from several different news sources for a project…
www.jamesuanhoro.com
Modeling the error variance to account for heteroskedasticity
One of the assumptions that comes with applying OLS estimation for regression models in the social sciences is homoskedasticity; I prefer the term constant error variance (it also goes by spherical disturbances)…
yihui.name/en
No Slap in the Face to Those Who are Smiling
There is an idiom in Chinese that I can roughly translate as “You shall not slap a person in the face when he is smiling at you”. This means if a person is showing a positive attitude, you’d better forgive him even if he has made a mistake…
www.jamesuanhoro.com
Simulating data from regression models
My preferred approach to validating regression models is to simulate data from them, and see if the simulated data capture relevant features of the original data. A basic feature of interest would be the mean. I like this approach because it is extendable to the family of generalized linear models (logistic, Poisson, gamma, ..…
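The validation idea described in this excerpt can be sketched with the standard library: fit a model, simulate new outcomes from the fitted model many times, and check whether a feature of the observed data (here the mean) falls inside the range of the simulated values. This is not the author's code; the "observed" data, the true coefficients, the seed, and the number of replicates are all hypothetical assumptions.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical "observed" data from a simple linear model with Gaussian noise.
x = [i / 10 for i in range(100)]
y = [2.0 + 0.5 * xi + random.gauss(0, 1) for xi in x]

# Ordinary least squares fit for a single predictor.
x_bar, y_bar = mean(x), mean(y)
slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sum(
    (xi - x_bar) ** 2 for xi in x
)
intercept = y_bar - slope * x_bar

# Residual standard deviation (two estimated coefficients, hence n - 2).
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
sigma = (sum(r * r for r in residuals) / (len(x) - 2)) ** 0.5


def simulate_outcomes() -> list[float]:
    """Draw one simulated data set from the fitted model."""
    return [intercept + slope * xi + random.gauss(0, sigma) for xi in x]


# Compare the observed mean with the distribution of simulated means.
simulated_means = sorted(mean(simulate_outcomes()) for _ in range(500))
lower, upper = simulated_means[12], simulated_means[487]  # approx. 2.5% and 97.5%
print(f"observed mean = {y_bar:.3f}, simulated 95% range = [{lower:.3f}, {upper:.3f}]")
```

For a linear model the mean is an easy feature to reproduce; the same loop works with other summaries (variance, proportion of zeros, maximum) by swapping out `mean`.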
ndres.me
Using a neural network to generate your next startup name
Inspired by a Dan Hon article on how to generate British placenames, I decided to train my own network to generate startup names. The original code was made by Andrej Karpathy, but there is a more modern and concise version in Keras, available here…
wytham.rbind.io
How to make a poster in R
The main area where I found the Rossi post lacking was on software. All the suggestions he makes (e.g. PowerPoint, Canva, Adobe Illustrator) are point-and-click. I have nothing against using point-and-click software to make a poster, so if it works for you then go ahead…
www.jessemaegan.com
R4DS May Challenge
Sure, we could do something similar to the first iteration of our online learning community and say we’re going to cover a specified amount of material each week, but instead we’re going to try something new! By signing up for office hours, you are making a commitment to show up during the office hours…
lenkiefer.com
Jobs Friday May 2018
TODAY WAS JOBS FRIDAY. LET’S create a couple of plots to show the trend in employment growth. Each month the U.S. Bureau of Labor Statistics (BLS) releases its employment situation report. Let’s make a couple of plots looking at trends in U.S. nonfarm payrolls. Per usual, let’s make a graph with R…
davemcg.github.io
Template for rmarkdown reports
Since I keep opening up random recent Rmarkdown documents to copy the header to paste into my next document, I figure it would be more efficient to just make a post I could reach from anywhere (with an internet connection)…
blog.sellorm.com
A toy geolocation API in Python
I spent some time last night thinking about ways in which I could improve the Awesome Blogdown website…
blog.wallaroolabs.com
Wallaroo
It’s been over a year since I wrote the first blog post introducing Wallaroo to the world. We’ve covered a lot of ground since then, from introducing the Python API that is our primary product to releasing all our code under an open core model…
www.tidyverse.org/articles
pkgdown 1.0.0
Install pkgdown with: A great way to see what you can do with pkgdown is to look at existing websites…
www.rdatagen.net
How efficient are multifactorial experiments?
In the second scenario, each successive exposure continues to add to the effect, but each additional intervention adds a little less. The first intervention adds 0.8, the second adds 0.6, the third adds 0.4, and the fourth adds 0.2. This is a form of interaction…
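The diminishing increments quoted in this excerpt can be tallied in a few lines. The sketch below uses only the four numbers given above; the "purely additive" comparison column is an assumption added for contrast and is not part of the excerpt.

```python
from itertools import accumulate

# Increments quoted in the excerpt: each successive intervention adds a little less.
increments = [0.8, 0.6, 0.4, 0.2]

# Running total of the effect after 1, 2, 3, and 4 interventions
# (rounded to one decimal to hide floating-point noise).
cumulative = [round(total, 1) for total in accumulate(increments)]

# Assumed comparison: the totals if every intervention added as much as the first.
additive = [round(increments[0] * n, 1) for n in range(1, len(increments) + 1)]

for n, (c, a) in enumerate(zip(cumulative, additive), start=1):
    print(f"{n} intervention(s): diminishing = {c}, purely additive = {a}")
```

The gap between the two columns grows with each added intervention, which is one way to see why the excerpt calls this pattern a form of interaction.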
yihui.name/en
Make the Right / Positive Things Easier to Do than Wrong / Negative Things
First of all, I should make it clear that I do love Stack Overflow, use it almost every day, and truly appreciate its value, despite my previous rants…
www.cultureofinsight.com/blog
Multivariate Dot-Density Maps in R with sf & ggplot2
Background: Last June I did a blog post about building dot-density maps in R using UK Census…
gcppodcast.com
Open Source at Google Cloud Platform with Sarah Novotny
Mark broke SSH access to his Compute Engine instance by accidentally removing the GCP Linux guest environment…
blog.sellorm.com
Packaging Shiny apps - A deep dive
(Or, how to write a Shiny app.R file that only contains a single line of code) This article originally appeared on the Mango Solutions blog. This post is long overdue…
lenkiefer.com
What's up? VSUP, that's what's up.
IN THIS POST WE SHALL EXPLORE VALUE-SUPPRESSING UNCERTAINTY PALETTES. One of my favorite new sites is xenographics, which gives examples of and links to “weird, but (sometimes) useful charts”. The examples xenographics gives are undoubtedly interesting and might help inspire you if you’re looking for something new…
www.mytinyshinys.com
EPL Week 36
For the remainder of the season, I will be travelling with a backup laptop so please excuse any shortfall in posts and site updates. Match of the Day: Signs of life from the bottom three with wins for WBA and Southampton and a draw for Stoke at…
emmavestesson.netlify.com
My first tidy Tuesday
Tidy Tuesday: I have seen some cool graphs on Twitter created for Tidy Tuesday. I wanted to join in on the fun so I downloaded the data from week 3 and started playing…
ritsokiguess.site/docs
Rating rugby league with Stan
What’s in here This is a rather long and complicated…
www.robert-hickman.eu
Riddler 27th April 2018
Formally this is phrased as: Some number, N, of people need to pee, and there is some number, M, of urinals in a row in a men’s room…
shotwell.ca/blog
Testing machine learning models with testthat
Automated testing is a huge part of software development…
www.jakekaupp.com
tidytuesday I'm gonna be...
I’ve been having some days! Between sick kids, myself getting sick, deadlines and development promises, I haven’t had a lot of time to make any kind of blog post…
fharrell.com
Road Map for Choosing Between Statistical Modeling and Machine Learning
Data analysis methods may be described by their areas of applications, but for this article I’m using definitions that are strictly…
ndres.me
Tutorial
Last week I published an article showing you how I built a friend graph using your Facebook data. This article is a detailed version showing you how to do it yourself. Here’s what we’ll end up with: Facebook friend network (click to enlarge). Warning: To make such a graph, you need to scrape all your mutual friends…
translatedmedicine.com
Where are the students who are English Learners in MA?
Inspired by my wife’s work in educating students who are English Learners, this post visualizes the English Learners across Massachusetts. The Massachusetts Department of Education publishes a large amount of education data from across the state. It provides yearly percentages of enrolled students and their characteristics, including English…
www.tidyverse.org/articles
pillar 1.2.2
List columns (and the special case of nested data frames) are a very powerful idiom…
thedatawitch.com
If you're having trouble programming data file imports, try RStudio's code preview
When it comes to importing flat data files stored locally on your computer, such as csv’s or xls’s, you might be uncertain which method to use. It can also be hard to remember how to do it or the options that are available for various file types…
cattleguard.github.io
Dorking Around with CORS
After tuning in to Absolute Appsec ep 2 the other day I got pretty interested in CORS security issues related to misconfiguration and dynamic origin handling…
lcolladotor.github.io
Getting ready to attend rOpenSci unconf18 and probably working on tidyverse-like functions for the first time
These and other questions could involve time getting familiar with. Time that I could spend now, before unconf18, learning and at least getting a better sense of what to expect. Maybe I’m complicating myself and worrying too much about this…
matthewsmith.rbind.io
ITNr Version 0.2.0
International Trade Network (ITN) Analysis in R: A new version of my package ITNr (0.2.0) is now on CRAN! The ITNr package presents a set of functions to clean trade data, implement descriptive analysis of the ITN and create a range of plots…
mailund.github.io/r-programmer-blog
New Package Releases
I have just released version 0.1.0 of foolbox and version 0.1.2 of tailr…
www.rostrum.blog
TWO DOGS IN TOILET ELDERLY LADY INVOLVED
Matt Dray. TL;DR: Animals get stuck in weird places. Oh, and the sf package in R can be used for coordinate reprojection prior to interactive mapping with leaflet. The problem: I often work with data that has coordinates; usually eastings and northings…
blog.wallaroolabs.com
Adventures with cgo
Hi there! You’re about to read part 2 of a 4 part series about Go performance as told from the perspective of Wallaroo, our distributed stream processor. Part 1 covered issues around having non-Go code holding on to pointers to Go objects. This post builds on part 1…
lenkiefer.com
Expanding Expansions, Contracting Recessions
IN THIS POST I WANT TO SHARE A GRAPH looking at the length of economic expansions and recessions in the United States over time. Earlier today, Andrew Chamberlain (on Twitter) observed that at the end of this month the current economic expansion in the U.S. would be the second longest in history…
www.ifconfig.it/hugo
Cisco Candid
At Cisco Live Europe in Barcelona I had a chance to see Cisco Candid (Network Assurance Engine) in action. I shared my views on GestaltIT Tech Talks…
www.mytinyshinys.com
EPL Week 35
For the remainder of the season, I will be travelling with a backup laptop so please excuse any shortfall in posts and site updates. Match of the Day: WBA, Stoke and Southampton edge ever closer to the drop. What with hitting the woodwork three times, the Watford v Palace 0-0 matchup saw some records…
statsbylopez.netlify.com
Rethinking draft curves
Even when you ignore the 2019 pick conveyed to the Colts, the Jets are enormous losers. To be more specific, the Colts acquired 112.4 points worth of draft value in exchange for 52.5 points…
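To put the two figures in this excerpt side by side, here is a small worked calculation. It uses only the two point values quoted above; the draft-value curve that produced them is not reproduced here.

```python
# Draft-value points quoted in the excerpt.
acquired = 112.4   # value the Colts received
given_up = 52.5    # value the Colts sent in exchange

# Net gain in points, rounded to one decimal to hide floating-point noise.
net_gain = round(acquired - given_up, 1)

# Value received per point of value given up.
ratio = acquired / given_up

print(f"Net gain: {net_gain} points; return per point given up: {ratio:.2f}")
```

On these numbers the Colts received a little more than twice the value they gave up.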
brunaw.com
chorrrds
So let’s get started. I will begin by fixing a few things about the data, since they are not perfect, such as: the dataset is in long format, that is, we have one row for each chord of the song, keeping the order in which they appear on the site…
www.rladiesnyc.org
Introduction to Shiny Apps using NBA data
Come out for our May event for a tutorial on Shiny! Julia Wrobel, a PhD candidate in biostatistics at Columbia University, will be teaching us to make and deploy Shiny apps using NBA data…
www.datalorax.com
Writing an R Package Basics (and why I think you should)
On April 10, 2018, I gave a talk entitled Developing your first R package: A case study with esvis for the Eugene R Users Group. Although I discussed my esvis package, the focus of the talk was really on tools and tips for developing R packages…
www.tidyverse.org/articles
bigrquery 1.0.0
Install it with: Four big changes in this version of bigrquery are described in detail below: One of the neatest things about BigQuery is that it supports nested and repeated fields, which are also called structs (or records) and…
blog-mjay.firebaseapp.com
AWS IOT Project with ESP8266
Hello there! Today I completed a POC using an ESP8266 in a slave as well as a client model, which will interface between hardware to provide IoT support. I wish to make this project compatible with AWS so that anyone can deploy their models to S3 and it can serve as a microservice via EC2. I am looking for research enthusiasts who can work with me…
lenkiefer.com
April 2018 Housing Market Update
LAST WEEK I POSTED A THREAD ON TWITTER COVERING RECENT HOUSING MARKET TRENDS AND THE OUTLOOK FOR MORTGAGE RATES: #Mortgage rates are now at their highest level since January 2014…
livefreeordichotomize.com
R release names (Updated)
I always love discussions about R release names and their origin…
aosmith.rbind.io
Simulate! Simulate! - Part 2
It wasn’t until I started working with clients and teaching labs on mixed models in R that I learned how to do simulations to understand how well such models worked under various scenarios…
ndres.me
Using Facebook data to plot my friend network
Inspired by a friend’s post, I decided to plot my Facebook network. To do so, I scraped “mutual friends” and made the following graph: Facebook friend network (click to enlarge). In this blog post, I’ll explain how the graph is made and how clusters are created…
www.tidyverse.org/articles
readxl 1.1.0
The easiest way to install the latest version from CRAN is to install the whole tidyverse. Alternatively, install just readxl from CRAN: Those have now been addressed upstream and version 1.1.0 of readxl embeds a version of libxls that includes those fixes…
dicook.org
Analysing my energy usage
The data is not especially nicely formatted (surprise). The main components are: The code to read the data does the following steps: Below are the calendar plots for the two meters that are running at my apartment. Meter 1 seems to be the daily activity…
ryantravis.netlify.com
Predicting NFL Injuries with Stan Part II
Previously I used data from armchair analysis to build a simple model to predict whether an NFL player would have an injury based only on their position. I restricted the analysis to QBs, RBs, TEs, and WRs…
www.samatkins.me
Converting a React App to TypeScript
Learning some TypeScript has been on my to-do list for some time. I finally found some time and started by reading the docs to get familiar with it. The next step to really help learn it was to actually use it for a project, so I decided to convert an existing React app to use TypeScript. This blog post is a guide on how I did exactly that…
www.ifconfig.it/hugo
Telnet over Internet
A couple of days ago Cisco released a Security Advisory. No big deal so far, level was informational so I didn’t read it right away…
batteriesnotincluded.rbind.io
dockerterm
As seen in the clip above, this initial proof of concept works well, at least on my machine. There are some definite drawbacks and limitations that need to be addressed, but for the most part, I’m pleased with the initial functionality…
ellocke.github.io
(R) Hi-Res Mapping with R for Not-for-Profit Print
Once upon a time (and long before I learned about the tidyverse and %>%), a colleague from a not-for-profit org asked for help with a map for a book…
blog.wallaroolabs.com
Adventures with cgo
A lot of materials have been created to help Go programmers implement Go “best performance practices”. The same cannot be said of cgo performance. This is the first post in a series that will discuss cgo performance considerations…
blog.rstudio.com
Arrow and beyond
Feather was a successful project, and has made it easier for thousands of data scientists and data engineers to collaborate across language boundaries…
blog.millerti.me
Ignoring Self-Signed SSL Certificate Errors while using Git
When I interact with most code repositories they’re often hosted on GitHub, or GitLab, or some other managed Git service. By and large HTTPS support is a foregone conclusion; I never have issues using git commands or GUI clients to interact with those repositories…
lcolladotor.github.io
Latin American R/BioConductor Developers Workshop 2018
Briefly, these are some of the reasons why they are amazing: This is also a reminder that you have to keep trying. Back in 2009 or 2010 I had gotten an offer of support from Bioc for sending someone to Cuernavaca, but due to funding circumstances it fell through. Like I said, all the pieces fell in the right places this time…
ryantravis.netlify.com
Prediction Assessment with Scoring Rules
Anyone trying to learn how to build and assess prediction models is immediately swamped with a litany of strange sounding performance metrics. In the binary outcome case you might encounter: sensitivity, specificity, precision, recall, accuracy, just to name a few…
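The metrics listed in this excerpt all derive from the four cells of a binary confusion matrix. The sketch below shows the standard definitions; the counts are hypothetical and the class name `ConfusionMatrix` is introduced here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionMatrix:
    """Counts from comparing binary predictions with true outcomes."""

    tp: int  # predicted positive, actually positive
    fp: int  # predicted positive, actually negative
    fn: int  # predicted negative, actually positive
    tn: int  # predicted negative, actually negative

    def sensitivity(self) -> float:
        # Also called recall: share of actual positives that were predicted positive.
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        # Share of actual negatives that were predicted negative.
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        # Share of positive predictions that were actually positive.
        return self.tp / (self.tp + self.fp)

    def accuracy(self) -> float:
        # Share of all predictions that were correct.
        total = self.tp + self.fp + self.fn + self.tn
        return (self.tp + self.tn) / total


# Hypothetical counts for 100 cases.
cm = ConfusionMatrix(tp=40, fp=10, fn=20, tn=30)
print(f"sensitivity/recall = {cm.sensitivity():.3f}")
print(f"specificity        = {cm.specificity():.3f}")
print(f"precision          = {cm.precision():.3f}")
print(f"accuracy           = {cm.accuracy():.3f}")
```

Note that sensitivity and recall are two names for the same quantity, which is part of why the list of metrics can feel longer than it is.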
www.rdatagen.net
Testing multiple interventions in a single experiment
First, a bit about multi-factorial data. A single factor is a categorical variable that can have any number of levels. In this context, the factor is usually describing some level of intervention or exposure…
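To make the idea of factors and levels concrete, the sketch below enumerates every combination of levels across several factors, which is what a fully crossed (full factorial) design would require. The factor names and the two levels per factor are hypothetical, and whether the post uses a fully crossed design is an assumption here.

```python
from itertools import product

# Hypothetical factors, each with two levels (0 = not exposed, 1 = exposed).
factors = {"A": [0, 1], "B": [0, 1], "C": [0, 1]}

# Every combination of levels across the factors: one study arm per combination.
arms = [dict(zip(factors, levels)) for levels in product(*factors.values())]

print(f"{len(arms)} arms for {len(factors)} two-level factors")
for arm in arms:
    print(arm)
```

With k two-level factors the number of arms is 2 to the power k, so the design grows quickly as interventions are added.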
www.rladiesnyc.org
Tidy evaluation
For this month’s meetup we have a very special guest: Hadley Wickham! Date: Thursday, April 19, 2018. Time: 6:30pm. Speaker: Hadley…
www.gokhanciflikli.com
Yet Another Caret Workshop
Intro: Yesterday I gave a workshop on applied predictive modelling with caret at the 1st LSE Computational Social Science hackathon. Organiser privileges…
www.jtimm.net
psychological and geographical distance in text
Contents: Concreteness ratings and the lexvarsdatr package; Context & concreteness scores; Geographical distance; FIN; References. This post considers a super-clever study presented in Snefjella and Kuperman (2015), in which the authors investigate the relationship between psychological distance and geographical distance using geolocated…
leonawicz.github.io/blog
tabr package for guitar tablature now on CRAN
While music can be quite complex and a full score will be much longer, something as simple as the following code snippet produces the music notation in the accompanying image. A brief example below highlights the general workflow…
guyabel.com
Animated Directional Chord Diagrams
The next step is to tween the data by migration corridor. (might take a few seconds to fully load) where The function can be used to derive the size of gaps in each frame for a new animated…