dusty.phillips.codes
React Redux Firebase With Firestore Tutorial
Whenever I start a new hobby web project, I just want to jump in and start coding. Instead, I spend many many hours trying to get authentication to work. I’ve got half a dozen half-finished “boilerplate” projects lying around that were supposed to satisfy the desire of, “next time, I can use this boilerplate and authentication will just work…
g-tierney.github.io
Record linkage
I recently encountered a problem that had a surprisingly elegant solution. I struggled a lot with solving this issue, so hopefully in writing this post I can save someone else the trouble! For reasons that are irrelevant, I wanted to track the performance of youth fencers across time…
atusy.github.io/blog
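Classifying the Mechanisms That Generate Missing Values
The other day I tweeted a diagram illustrating examples of how missing values arise, and it got a better response than I expected, so I decided to polish the figures and keep them in a post…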
simplystatistics.org
Constructing a Data Analysis
One of the first revelations I’ve had recently is realizing that data analyses are not naturally occurring phenomena. You will not run into a data analysis while walking out in the woods. Data analyses must be created and constructed by people. One way to think about a data analysis is to think of it as a product to be designed…
data-chips.com
Summer fun
New freelance gig; JavaScript-based data visualization; Two JS data visualization options: D3 and amCharts; D3.JS; amCharts; Crochet washcloths; Pixel art in crochet projects; Peace. This summer has gone by fast…
cevo.com.au
Test Driven Infrastructure
Software development has embraced techniques like TDD (Test Driven Development) to help reduce the cycle time between developing code and validating it works…
rviews.rstudio.com
Learning Analytic Administration through a Sandbox
It all starts with sandboxes. Development sandboxes are dedicated safe spaces for experimentation and creativity. A sandbox is a place where you can go to test and break things, without the ramifications of breaking the real, important things…
blog.wallaroolabs.com
Real-time Streaming Pattern
This week, we continue to look at data processing patterns used to build event triggered stream processing applications, the use cases that the patterns relate to, and how you would go about implementing within…
dusty.phillips.codes
An Order to Learn to Program, Part 3
Part 3: SQL Basics. It’s not common to see SQL as the next language taught after HTML…
robchoudhury.netlify.com
Cyclosporiasis Outbreak in Texas 2017
Over a 20-day period, the several counties in Texas. Things really light up in the counties that contain big cities like Houston, Austin, and Dallas. Based on the CDC dataset, we can see that things get pretty bad, but start to settle down after mid-July…
blog.zenggyu.com/en
Getting to Know a Database Continued
All the functions are created and have been tested in PostgreSQL v10.5. However, it should be noted that the functions are intended for interactive use only and they may not cover every use case…
statsbylopez.netlify.com
Lessons hidden in sports betting markets
With sports betting now legal in several US states, I might as well give away my number one piece of advice for amateurs looking to gamble: Don’t bet. It’s an easy recommendation…
robchoudhury.netlify.com
Miami Mango Ratings
Mangos are grown widely in South Florida, but only a few varieties ever make it north, because they don’t really ship well and there isn’t a huge market…
statsbylopez.netlify.com
Part I
The best team in baseball during the 2013 season was the Detroit Tigers. But on October 19th of that year, the best team in baseball was fighting to save its season…
statsbylopez.netlify.com
Part II
The Caps entered a 2nd-round playoff series with Pittsburgh as decent-sized favorites (58 percent), and sure enough, Washington outplayed its rivals…
statsbylopez.netlify.com
Part III
One way in which professional sports are relatively fair is that, in each season, teams are almost always given an identical number of home games…
statsbylopez.netlify.com
Part IV
Thus, perhaps it’s worth taking a step back to analyze what can be a substantially more dominant but often unquantified force that impacts sporting results: who your opponents…
gcppodcast.com
What's new in App Engine with Steren Giannini and Stewart Reichling
What does it mean when the recommendation is to update your…
ropensci.org/technotes
rgbif
We’ve come a long way since Aug 2011. We’ve added a lot of new functionality and many new contributors…
martakolczynska.com
tidytext analysis of TED talks
This year I spent two weeks of the summer attending the Summer Institute for Computational Social Science Partner Site (SICSS) in Tvärminne and Helsinki, Finland, organized by Matti Nelimarkka from Aalto University and the University of Helsinki, assisted by two TAs: Juho Pääkkönen and Pihla Toivanen from the University of…
robchoudhury.netlify.com
Bird Ages
I wanted to know if there was any signal for how old birds get if they’re closely related. Ultimately, I would love to know how age is affected by size (weight, wingspan, whatever) but I can’t find open datasets for that…
blog.sellorm.com
New from RStudio
The post originally appeared on the Mango Solutions blog. One of the few remaining hurdles when working with R in the enterprise is consistent access to CRAN. Often desktop class systems will have unrestricted access while server systems might not have any access at all…
ryantravis.netlify.com
Simple example of fitting splines with mixed models
I thought it might be of value to provide some code showing how splines can be fit using mixed…
www.granvillematheson.com
The Weather in Stockholm, Inside and Out, and the Curious Case of Summer 2018
To start off with, this is my first blog post! Obligatory yay! Anyhow…
amateurdatasci.rbind.io
Waves Intersecting at Right Angles and a Folded Paper
1 Perpendicular Intersecting Waves 1.1 Problem 1.2 Solution 2 A Folded Paper 2.1 Problem 2.2 Solution: Minimum Area 2.3 Solution: Minimum Crease Length 2.4 Solution: Using R (Without Differentiations) 3 Reference library(knitr) library(tidyverse) library(ggthemes) opts_chunk$set( fig…
ropensci.org/blog
What birds are observed near Radolfzell? Bird occurrence data in R
There are two ways to access eBird data, with an R package for each of these methods. Now, we can make the query. Now that we have the occurrence data, let’s plot it to see whether trimming is required. Yes, trimming is required! It’d have been too bad not to learn how to do it, anyway. We also add the MPI to the map…
eliocamp.github.io/codigo-r
Wrapping around ggplot2 with ggperiodic
As an atmospheric scientist, a lot of my research consists of plotting and looking at global fields of atmospheric variables like pressure, temperature and the like. Since our planet is a sphere (well, almost), it is unbound and so longitude is a periodic dimension. That is, to the right of 180°E you go back to 180°W…
livefreeordichotomize.com
p-value thoughts
A conversation about how “convincing” various studies were based on sample size and p-values led me to post the following poll on…
aosmith.rbind.io
Automating exploratory plots with ggplot2 and purrr
When you have a lot of variables and need to make a lot of exploratory plots it’s usually worthwhile to automate the process in R instead of manually copying and pasting code for every plot…
lenkiefer.com
Facets in space and time
My studies involve a lot of data organized in space and across time. I look at housing data that usually captures activity around the United States, or sometimes the world, and almost always over time. In my data visualization explorations I like to study different ways to visualize trends across both space and time, often simultaneously…
ritsokiguess.site/docs
SAS in R Markdown
One of the courses I teach is called Applications of Statistical Methods…
dsollberger.netlify.com
Trying Out FlexDashboard
To convert my lab, which was previously in an R Markdown document for HTML output, I had to From there, I also started to arrange the “paragraphs” into separate columns for a nice…
blog.rstudio.com
rstudio
There are fifteen contributed talk slots which are 20 minutes long, and are scheduled alongside talks by RStudio employees and invited speakers…
energychisquared.com
Comparing Hydroelectric Generation and the Wholesale Market Price
In recent months, hydroelectric generation has been a talking point in the electricity sector…
atusy.github.io/blog
Measuring the Execution Time of Each Chunk in Rmd
In Jupyter Notebook, putting %%timeit at the top of a code block displays the time taken to evaluate the block. https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling…
r-mageddon.netlify.com
Reanimating the Datasaurus
Whilst browsing Twitter last night I came upon this tweet by the current author of gganimate: I’ve started a gganimate wiki page in order to collect…
atusy.github.io/blog
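Trying Out Parameterized Rmd
Parameterized Rmd looked handy, so here are some notes and experiments. What is parameterized Rmd? An Rmd that uses the list of variables created by params in the YAML header…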
yihui.name/en
Create GIFs with gifski in knitr Documents
This will produce the “Pac man” below (assuming the output format is…
lenkiefer.com
Everything is spiraling out of control!
I saw this fun bit of R code in a tweet by user aschinchon. df <- data.frame(x=0, y=0) for (i in 2:500) { df[i,1] <- df[i-1,1]+((0.98)^i)*cos(i) df[i,2] <- df[i-1,2]+((0.98)^i)*sin(i) } ggplot2::ggplot(df, aes(x,y)) + geom_polygon()+ theme_void()#rstats pic.twitter.com/cgNjyk405f - Antonio S…
mgb-research.netlify.com
It's Alive! First Evidence that IBI VizEdit Works
It is official. The program I have spent the better part of a year working on, the very centerpiece of my dissertation, works…
www.stencilled.me
MTA Subway Ridership
roelandtn.frama.io
Simple mapping with {sf}
This post is based on a notebook I started about R spatial analysis for the project OSGeoLive. It aims to provide a quick introduction to R spatial analysis and cartography and will be extended. R is a language dedicated to statistics and data analysis. It also has a lot of strong packages for spatial analysis…
ritsokiguess.site/docs
A tale of zero kitties
I was reading this by Sara Stoudt and Kellie Ottoboni, and, looking at Kellie’s analysis, I wondered “how would I do it”, realizing that there are many ways to do things in the Tidyverse world…
dusty.phillips.codes
Computer Vision in Three Lines of Code plus a bunch more lines
My wife and I both have a tendency to leave the garage door open. You’re in and out, grabbing garden tools or supplies, and at the end of the day you enter the house through the back door and forget to check the garage…
sharanry.github.io
Google Summer of Code 2018 @ PyMC
I would like to thank my mentor Colin Carroll and the whole PyMC team for constantly guiding and supporting me and reviewing my code during the GSoC project. It was my privilege working with the PyMC community…
www.rdatagen.net
Multivariate ordinal categorical data generation
In the following examples, I assume four items each with four possible responses - which is different from the…
simplystatistics.org
The Law and Order of Data Science
One conversation I’ve had a few times revolves around the question, “What’s the difference between science and data science?” If I were to come up with a simple distinction, I might say that Science starts with a question; data science starts with the…
rviews.rstudio.com
TokyoR #71
You can find this and several more Japanese R and data science books by entering Ishida-san’s name, 石田 基広, into Amazon.co.jp…
blog.wallaroolabs.com
Utilizing Elixir as a lightweight tool to store real-time metrics data
Visibility into performance bottlenecks was the driving force behind the design of Wallaroo’s Monitoring Hub and Metrics…
evangelinereynolds.netlify.com
Visualizing Variance and Standard Deviation
So this wasn’t on today’s to-do list, but there seems to be a cash prize associated with this rabbit hole due to this tweet: So here we go. I’m using the gapminder dataset which is ever-so-handy as it’s available in an R package (thanks Jenny Bryan)…
jenrichmond.rbind.io
hotkeys
As I get more and more familiar with RStudio, there are some commands I find myself typing again and again. Keyboard shortcuts are helpful (disclaimer: these are Mac versions). Option-Command-i will open a new R Markdown code chunk…
visualizingtheleague.com
Clustering High Scorers by Shot Type
People like arguing about the relative greatness of great basketball players. This desire often sees itself expressed in…less than optimally informed ways…
ropensci.org/technotes
Mongolite 2.0
This week version 2.0 of the mongolite package has been released to CRAN. Major new features in this release include support for MongoDB 4.0, GridFS, running database commands, and connection pooling. New in version 2.0 is support for the MongoDB GridFS system…
visualizingtheleague.com
Where Guys Are From, Part One - Heat Maps
I’ve at some point described every place I’ve ever lived as “a good town/city/region/state for basketball”…
ropensci.org/blog
Where to go observe birds in Radolfzell? An answer with R and open data
Yay, we now know where to find a bird hide not too far from the MPI! Let’s create a basemap for our bounding box, and then add roads and buildings to it. Quite pretty! The lakes can be seen because of the absence of roads and buildings on them…
ryantravis.netlify.com
Fantasy Football Player Rankings
Accurate prediction of player performance is of immense value to those of us who play fantasy football. With this in mind, I was curious about how well simple prediction models could perform in this context. Conveniently, Sean J…
www.ifconfig.it/hugo
Network topology validation with CDP and Python
Like most IT professionals, I usually configure network devices in a lab environment before the actual installation at the customer site…
nowosad.github.io
Pattern-based regionalization
Spatial patterns are an underexplored avenue of the Earth and ecological sciences. They could be both an effect of some processes and at the same time affect other ones. For example, a land cover pattern (spatial arrangement of land cover categories) could exist because of an impact of a terrain topology, soils, climate, or human action…
dusty.phillips.codes
Python
I really appreciate Python’s pathlib module for managing filesystem stuff. While I don’t love the argparse module for command line parsing, I don’t think it’s worse than other available options…
blog.rstudio.com
R/Medicine Conference
The keynote speakers will be: Conference talks will address the use of R in medical applications from Phase I clinical trial design through the analysis of the efficacy of medical therapies in public…
martakolczynska.com
Validating survey data
With Przemek Powałko. General population surveys with representative samples should have a similar education structure as shown by data from administrative sources, especially if survey weights are…
coolbutuseless.github.io
gganimate with sprites
A sprite is a 2d bitmap often used by games to represent objects…
blog.zenggyu.com/en
A Comparison of Different Ways to Define and Return Outputs From PL/pgSQL Functions
Based on the matrix, the following pair-wise comparisons were made: The following code shows the definition of each function and how they are called. The following are some takeaway rules summarised from the…
www.jessemaegan.com
Learning to Learn
As we head out on this adventure, there’s nothing I’d love more than to hear what’s working for you, what you think could be improved, and what should be left on the cutting room…
r-mageddon.netlify.com
The Burger King Pandemic
Whilst I was rooting around for inspiration, my girlfriend suggested I should do a post about food, quickly followed up by “burgers!!!”. So a few google searches later and I decided to create an animated map, showing all the countries of the world that had at least one Burger King store. I’m particularly chuffed that I’ve managed to combine a multitude of really cool packages in this post…
coolbutuseless.github.io
gganimate with bitmap fonts
I put together the little animation shown below, and this is a short guide on how I got there…
irene.rbind.io
A Tale of Two Testing Environments
Today marks the second time I’ve debugged the problem of tests that pass with devtools::test() but fail with devtools::check()…
www.ashwinmalshe.com
Application Preparation for MS in Data Analytics
The objective of this post is to help prospective MSDA students in preparing their MSDA applications so as to increase their chances of getting admitted to the program. The suggestions below are actually applicable to any good analytics program…
lenkiefer.com
Core Inflation Viz with Progress Bar
About a year ago I shared code for a dataviz with a progress bar. Let’s update that R code using gifski and tweener. The code below will generate this animated gif: Gif code…
energychisquared.com
Forecasting
Producing a forecast is a task that generally requires knowledge of the sector, hypothesizing about effects that may influence the outcome under study and, of course, having at least minimal skills in…
blog.zenggyu.com/en
Getting to Know a Database
Recently I was introduced to a database which contains tens of thousands of tables and millions of columns. Since I don’t have much documentation at hand, I constantly had the feeling that I didn’t know enough to use the database while I was exploring it. It should be mentioned that the database I encountered is managed by Greenplum, which is based on…
www.riinu.me
My data science toolbox
I’ve been doing data science for over 10 years now. Although most of this time I didn’t realise I was doing data science. I thought I was just doing normal science but focusing on simulations and data analysis, rather than field or lab work…
www.semidocumentedlife.com
comparing audio features from my monthly playlists, using spotifyr
Spotify’s API functionality seems pretty straightforward. Tracks are curated as playlists by users, and tracks have a bunch of metadata (artist, album, etc.). Each of these entities (tracks, albums, artists, users) has an ID, allowing you to organize and jump between them…
www.williamrchase.com
A responsive CV template with HTML/CSS (Hermione Granger's CV)
I really like this CV template: I think it’s stylish, but not so flashy that it distracts from the material. I worked out some bugs with the responsiveness and tested on all viewports, but let me know if there are any problems there…
dusty.phillips.codes
An Order to Learn to Program, Part 2
Part 2: HTML. This is the second in a series on the order to study topics related to programming…
mouse-imaging-centre.github.io/blog
Highlights From APPNING and ISMRM 2018
Jacob’s Highlights: I recently attended the Workshop on Animal Population Imaging (APPNING 2018) held after the ISMRM conference in Paris…
www.ashwinmalshe.com
Installing R and RStudio
This is a short tutorial for the incoming students of UTSA’s MS in Data Analytics program. I am going to assume that the reader has no knowledge of R and RStudio, the Integrated Development Environment (IDE), which we use to code…
www.softinio.com
Life changes and announcing SFBayAreaTech
It is absolutely an awesome experience living in the Bay Area amongst so many super smart techies and great startups. On arrival I had two initial goals, namely, meet everyone in tech and make friends and find a new awesome job with a startup that has a great future and potential solving problems that line up well with my interests and technical interests…
kasparmartens.rbind.io
Neural Processes as distributions over functions
Neural Processes (NPs) caught my attention as they essentially are a neural network (NN) based probabilistic model which can represent a distribution over stochastic processes. So NPs combine elements from two worlds: Both have their advantages and drawbacks…
www.ashwinmalshe.com
Syllabus for DA6233
Data Visualization and Communication - MSDA Fall 2018 Syllabus by Ashwin Malshe on…
cevo.com.au
Why Agile isn't working
Imagine your good friend asks you to come over to his house to check out his brand new Ferrari. Being only ‘slightly’ jealous you head on over. From a distance you see his prize parked majestically in the driveway and, as expected, it looks amazing…
blog.rstudio.com
rstudio
rstudio::conf(2019) continues our tradition of diversity scholarships, and this year we’re doubling the program (again!) to 38 recipients: 32 domestic diversity scholarships available to anyone living in the US or Canada who is a member of a group that is under-represented at rstudio::conf…
engineering.pivotal.io
GiST Support In GPORCA
Pivotal’s SQL Optimizer, GPORCA, did not handle GiST indexes, making any GPORCA-generated plan extremely slow when the input grew large. In this blog post, we will look at what GiST indexes are, how we implemented them in GPORCA, and the resulting performance improvement…
rviews.rstudio.com
Highcharting Jobs Friday
Let’s get to it! That plot is nice, but it’s static! Hover on it and you’ll see what I mean. First, we isolate the most recent month by filtering on the last date. We also don’t want the ADP Estimate and filter that out as well. Finally, let’s compare the ADP Estimates to the actual Nonfarm payroll numbers since 2017…
energychisquared.com
Why This Blog
Back in November 2012 I started a blog devoted to two of my three passions: energy and engineering…
simplystatistics.org
The Trillion Dollar Question
Recently, Apple’s stock price rose to the point where the company’s market valuation was above $1 trillion, the first U.S. company to reach that benchmark. Subsequently, numerous articles were published describing Apple’s journey to this point and why it got there. Most people describe Apple as a technology company. They make technology products: iPhones, iPads, Macs, etc…
www.niklasjohannes.com
Tools for getting started with your PhD
What do I mean by this? Well, see for yourself whether you recognize any of the following behaviors: However, if you left all of those behaviors behind you long ago, well, you can close this tab and save yourself ten minutes…
peerchristensen.netlify.com
Topic Modelling of Trustpilot Reviews with R and tidytext
Improving the look of figures in ggplot2 is fairly simple. For consistency, we’ll create a clean and simple theme based on the APA theme from the jtools package and change some of the features. The background colour will be set to a light grey hue…
www.tidyverse.org/articles
scales 1.0.0
This is a major release with significant changes to the popular formatter functions, and added transformations…
amateurdatasci.rbind.io
Deriving the Quotient Rule from the Product Rule
1 From Product to Quotient 1.1 Product Rule of Differentiation 1.2 Quotient Rule of Differentiation 1.3 Derive the Quotient Rule from the Product Rule 2 Problems 2.1 Extend the Product Rule and Prove 2…
ropensci.org/technotes
phylotaR
In this technote I will outline what phylotaR was developed for, how to install it and how to run it with some simple examples…
blog.themechanicalbear.com
tastytrade - rstats - options...
I have been an options trader and follower of tastytrade research and methods since 2012. For the past few years, I have backtested trading ideas to learn investment strategies and improve my skills in rstats and data analysis…
martakolczynska.com
Dot plot challenge
The August edition of the Storytelling with Data challenge #SWDchallenge stars the dot plot…
ropensci.org/blog
Extracting and Processing eBird Data
Access to the full eBird dataset is provided via two large, tab-separated text files. The eBird Basic Dataset (EBD) contains the bird observation information and consists of one line for each observation of a species on a checklist…
engineering.pivotal.io
Frontend Contract Tests Without Magic Numbers
We will explain how we fell into the magic number anti-pattern in our frontend tests. Then, we demonstrate how to use a contract-stub reader to avoid this…