dusty.phillips.codes

React Redux Firebase With Firestore Tutorial

Whenever I start a new hobby web project, I just want to jump in and start coding. Instead, I spend many, many hours trying to get authentication to work. I’ve got half a dozen half-finished “boilerplate” projects lying around that were supposed to satisfy the desire of, “next time, I can use this boilerplate and authentication will just work


g-tierney.github.io

Record linkage

I recently encountered a problem that had a surprisingly elegant solution. I struggled a lot with solving this issue, so hopefully in writing this post I can save someone else the trouble! For reasons that are irrelevant, I wanted to track the performance of youth fencers across time


atusy.github.io/blog

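Classifying the Mechanisms That Generate Missing Values

The other day I tweeted a diagram illustrating examples of how missing values arise, and it got a better response than I expected, so I decided to polish the figure and write it up as a post.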


simplystatistics.org

Constructing a Data Analysis

One of the first revelations I’ve had recently is realizing that data analyses are not naturally occurring phenomena. You will not run into a data analysis while walking out in the woods. Data analyses must be created and constructed by people. One way to think about a data analysis is to think of it as a product to be designed


data-chips.com

Summer fun

New freelance gig; JavaScript-based data visualization; Two JS data visualization options: D3 and amCharts; D3.JS; amCharts; Crochet washcloths; Pixel art in crochet projects; Peace. This summer has gone by fast


cevo.com.au

Test Driven Infrastructure

Software development has embraced techniques like TDD (Test Driven Development) to help reduce the cycle time between developing code and validating it works


rviews.rstudio.com

Learning Analytic Administration through a Sandbox

It all starts with sandboxes. Development sandboxes are dedicated safe spaces for experimentation and creativity. A sandbox is a place where you can go to test and break things, without the ramifications of breaking the real, important things


blog.wallaroolabs.com

Real-time Streaming Pattern

Introduction This week, we continue to look at data processing patterns used to build event triggered stream processing applications, the use cases that the patterns relate to, and how you would go about implementing within


dusty.phillips.codes

An Order to Learn to Program, Part 3

Part 3: SQL Basics. It’s not common to see SQL as the next language taught after HTML


robchoudhury.netlify.com

Cyclosporiasis Outbreak in Texas 2017

Over a 20-day period, the several counties in Texas. Things really light up in the counties that contain big cities like Houston, Austin, and Dallas. Based on the CDC dataset, we can see that things get pretty bad, but start to settle down after mid-July


blog.zenggyu.com/en

Getting to Know a Database Continued

All the functions are created and have been tested in PostgreSQL v10.5. However, it should be noted that the functions are intended for interactive use only and they may not cover every use case.


statsbylopez.netlify.com

Lessons hidden in sports betting markets

With sports betting now legal in several US states, I might as well give away my number one piece of advice for amateurs looking to gamble: Don’t bet. It’s an easy recommendation


robchoudhury.netlify.com

Miami Mango Ratings

Mangos are grown widely in South Florida, but only a few varieties ever make it north, because they don’t really ship well and there isn’t a huge market


statsbylopez.netlify.com

Part I

The best team in baseball during the 2013 season was the Detroit Tigers. But on October 19th of that year, the best team in baseball was fighting to save its season


statsbylopez.netlify.com

Part II

The Caps entered a 2nd-round playoff series with Pittsburgh as decent-sized favorites (58 percent), and sure enough, Washington outplayed its rivals


statsbylopez.netlify.com

Part III

One way in which professional sports are relatively fair is that, in each season, teams are almost always given an identical number of home games


statsbylopez.netlify.com

Part IV

Thus, perhaps it’s worth taking a step back to analyze what can be a substantially more dominant but often unquantified force that impacts sporting results: who your opponents


gcppodcast.com

What's new in App Engine with Steren Giannini and Stewart Reichling

What does it mean when the recommendation is to update your


ropensci.org/technotes

rgbif

We’ve come a long way since Aug 2011. We’ve added a lot of new functionality and many new contributors


martakolczynska.com

tidytext analysis of TED talks

Setup tidy TED talks Applause, LOL Sentiment This year I spent two weeks of the summer attending the Summer Institute for Computational Social Science Partner Site (SICSS) in Tvärminne and Helsinki, Finland, organized by Matti Nelimarkka from Aalto University and the University of Helsinki, assisted by two TAs: Juho Pääkkönen and Pihla Toivanen from the University of


robchoudhury.netlify.com

Bird Ages

I wanted to know if there was any signal for how old birds get if they’re closely related. Ultimately, I would love to know how age is affected by size (weight, wingspan, whatever) but I can’t find open datasets for that


blog.sellorm.com

New from RStudio

The post originally appeared on the Mango Solutions blog. One of the few remaining hurdles when working with R in the enterprise is consistent access to CRAN. Often desktop class systems will have unrestricted access while server systems might not have any access at all


ryantravis.netlify.com

Simple example of fitting splines with mixed models

I thought it might be of value to provide some code showing how splines can be fit using mixed


www.granvillematheson.com

The Weather in Stockholm, Inside and Out, and the Curious Case of Summer 2018

Background Setup SMHI Daily Minimum and Maximum Weekly Temperatures Time Series High Temperatures Low Temperatures Summary Our office Daily Minimum and Maximum Inside Outside No ventilation! Health and Comfort Conclusions Background To start off with, this is my first blog post! Obligatory yay! Anyhow


amateurdatasci.rbind.io

Waves Intersecting at Right Angles and a Folded Paper

1 Perpendicular Intersecting Waves 1.1 Problem 1.2 Solution 2 A Folded Paper 2.1 Problem 2.2 Solution: Minimum Area 2.3 Solution: Minimum Crease Length 2.4 Solution: Using R (Without Differentiations) 3 Reference library(knitr) library(tidyverse) library(ggthemes) opts_chunk$set( fig


ropensci.org/blog

What birds are observed near Radolfzell? Bird occurrence data in R

There are two ways to access eBird data, with an R package for each of these methods. Now, we can make the query. Now that we have the occurrence data, let’s plot it to see whether trimming is required. Yes, trimming is required! It’d have been too bad not to learn how to do it, anyway. We also add the MPI to the map


eliocamp.github.io/codigo-r

Wrapping around ggplot2 with ggperiodic

As an atmospheric scientist, a lot of my research consists of plotting and looking at global fields of atmospheric variables like pressure, temperature and the like. Since our planet is a sphere (well, almost), it is unbounded and so longitude is a periodic dimension. That is, to the right of 180°E you go back to 180°W
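The periodicity described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the ggperiodic API: the function names `wrap_longitude` and `extend_periodic` are hypothetical, and longitudes are assumed to be in degrees with east positive.

```python
def wrap_longitude(lon_deg: float) -> float:
    """Map any longitude in degrees onto the half-open interval [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


def extend_periodic(lons, values, period=360.0):
    """Repeat the first point one full period later so a curve closes across the seam."""
    return list(lons) + [lons[0] + period], list(values) + [values[0]]


# Stepping 10 degrees east of 180°E lands at 170°W.
east_of_dateline = wrap_longitude(190.0)

# A field sampled at 0, 90, 180, 270 gains a copy of its first value at 360.
closed_lons, closed_vals = extend_periodic([0, 90, 180, 270], [1, 2, 3, 4])
```

Wrapping with the modulo operator keeps every longitude in one canonical range, while repeating the first sample is the simplest way to avoid a gap where the range ends.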


livefreeordichotomize.com

p-value thoughts

A conversation about how “convincing” various studies were based on sample size and p-values led me to post the following poll on


aosmith.rbind.io

Automating exploratory plots with ggplot2 and purrr

When you have a lot of variables and need to make a lot of exploratory plots it’s usually worthwhile to automate the process in R instead of manually copying and pasting code for every plot


atusy.github.io/blog

Docker Notes Roundup

Adding a user to the docker group lets them run docker compose without becoming the superuser.


lenkiefer.com

Facets in space and time

My studies involve a lot of data organized in space and across time. I look at housing data that usually captures activity around the United States, or sometimes the world, and almost always over time. In my data visualization explorations I like to study different ways to visualize trends across both space and time, often simultaneously


blog.themechanicalbear.com

RSI Short Puts

Traders often look for “oversold” stocks to get long


ritsokiguess.site/docs

SAS in R Markdown

Introduction One of the courses I teach is called Applications of Statistical Methods


dsollberger.netlify.com

Trying Out FlexDashboard

To convert my lab, which was previously in an R Markdown document for HTML output, I had to From there, I also started to arrange the “paragraphs” into separate columns for a nice


blog.rstudio.com

rstudio

There are fifteen contributed talk slots which are 20 minutes long, and are scheduled alongside talks by RStudio employees and invited speakers


energychisquared.com

Comparing Hydroelectric Generation and the Wholesale Market Price

In recent months, hydroelectric generation has been a talking point in the electricity sector


atusy.github.io/blog

Measuring Execution Time per Chunk in Rmd

In Jupyter Notebook, putting %%timeit at the top of a code block displays the time it took to evaluate the block. https://jakevdp.github.io/PythonDataScienceHandbook/01.07-timing-and-profiling[... ](https://atusy.github.io/blog/2018/08/18/time-each-chunk/)
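Outside a notebook, the same per-block timing idea can be sketched in plain Python. This is a minimal illustration and not the approach the post takes for Rmd: the helper name `timed` and the label `"sum-of-squares"` are hypothetical, and the block is timed once rather than averaged over repeated runs as %%timeit does.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, results: dict):
    """Record the wall-clock seconds spent inside the with-block under `label`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Store the elapsed time even if the block raises.
        results[label] = time.perf_counter() - start


timings = {}
with timed("sum-of-squares", timings):
    total = sum(i * i for i in range(100_000))
```

After the block runs, `timings["sum-of-squares"]` holds the elapsed seconds for that one block, which is the per-chunk figure one would want to print next to its output.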


r-mageddon.netlify.com

Reanimating the Datasaurus

Whilst browsing Twitter last night I came upon this tweet by the current author of gganimate: I’ve started a gganimate wiki page in order to collect


atusy.github.io/blog

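Trying Out Parameterized Rmd

Parameterized Rmd looked useful, so here are some notes and experiments. What is a parameterized Rmd? An Rmd that uses the list of variables created by params in the YAML header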


yihui.name/en

Create GIFs with gifski in knitr Documents

This will produce the “Pac man” below (assuming the output format is


lenkiefer.com

Everything is spiraling out of control!

I saw this fun bit of R code in a tweet by user aschinchon. df <- data.frame(x=0, y=0) for (i in 2:500) { df[i,1] <- df[i-1,1]+((0.98)^i)*cos(i) df[i,2] <- df[i-1,2]+((0.98)^i)*sin(i) } ggplot2::ggplot(df, aes(x,y)) + geom_polygon()+ theme_void()#rstats pic.twitter.com/cgNjyk405f - Antonio S
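The recurrence in the quoted R snippet is easy to read off: step i moves a distance of 0.98^i in the direction of i radians, so the steps shrink while the heading keeps turning. Below is a rough Python transliteration of just that computation; the function name `spiral_points` and its parameters are hypothetical, and plotting is left out.

```python
import math


def spiral_points(n: int = 500, decay: float = 0.98):
    """Return the (x, y) vertices produced by the recurrence in the quoted R snippet."""
    points = [(0.0, 0.0)]              # corresponds to the first row of df
    for i in range(2, n + 1):          # R's 2:500 includes both endpoints
        x_prev, y_prev = points[-1]
        step = decay ** i              # step length shrinks geometrically
        points.append((x_prev + step * math.cos(i),
                       y_prev + step * math.sin(i)))
    return points


vertices = spiral_points()
```

Because the step lengths form a geometric series, the path stays bounded and winds inward, which is what gives the filled polygon its spiral look.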


mgb-research.netlify.com

It's Alive! First Evidence that IBI VizEdit Works

It is official. The program I have spent the better part of a year working on, the very centerpiece of my dissertation, works


www.stencilled.me

MTA Subway Ridership



roelandtn.frama.io

Simple mapping with {sf}

This post is based on a notebook I started about R spatial analysis for the project OSGeoLive. It aims to provide a quick introduction to R spatial analysis and cartography and will be extended. R is a language dedicated to statistics and data analysis. It also has a lot of strong packages for spatial analysis


ritsokiguess.site/docs

A tale of zero kitties

Introduction I was reading this by Sara Stoudt and Kellie Ottoboni, and, looking at Kellie’s analysis, I wondered “how would I do it”, realizing that there are many ways to do things in the Tidyverse world


dusty.phillips.codes

Computer Vision in Three Lines of Code plus a bunch more lines

My wife and I both have a tendency to leave the garage door open. You’re in and out, grabbing garden tools or supplies, and at the end of the day you enter the house through the back door and forget to check the garage


sharanry.github.io

Google Summer of Code 2018 @ PyMC

I would like to thank my mentor Colin Carroll and the whole PyMC team for constantly guiding and supporting me and reviewing my code during the GSoC project. It was my privilege working with the PyMC community


www.rdatagen.net

Multivariate ordinal categorical data generation

In the following examples, I assume four items each with four possible responses - which is different from the


simplystatistics.org

The Law and Order of Data Science

One conversation I’ve had a few times revolves around the question, “What’s the difference between science and data science?” If I were to come up with a simple distinction, I might say that Science starts with a question; data science starts with the


rviews.rstudio.com

TokyoR #71

You can find this and several more Japanese R and data science books by entering Ishida-san’s name, 石田 基広, into Amazon.co.jp


blog.wallaroolabs.com

Utilizing Elixir as a lightweight tool to store real-time metrics data

Visibility into performance bottlenecks was the driving force behind the design of Wallaroo’s Monitoring Hub and Metrics


evangelinereynolds.netlify.com

Visualizing Variance and Standard Deviation

So this wasn’t on today’s to-do list, but there seems to be a cash prize associated with this rabbit hole due to this tweet: So here we go. I’m using the gapminder dataset which is ever-so-handy as it’s available in an R package (thanks Jenny Bryan)


jenrichmond.rbind.io

hotkeys

There are some commands that I find myself typing again and again as I get more and more familiar with RStudio. Keyboard shortcuts are helpful (disclaimer: these are the Mac versions). Option-Command-i will open a new R Markdown code chunk


visualizingtheleague.com

Clustering High Scorers by Shot Type

People like arguing about the relative greatness of great basketball players. This desire often sees itself expressed in…less than optimally informed ways


ropensci.org/technotes

Mongolite 2.0

This week version 2.0 of the mongolite package has been released to CRAN. Major new features in this release include support for MongoDB 4.0, GridFS, running database commands, and connection pooling. New in version 2.0 is support for the MongoDB GridFS system


visualizingtheleague.com

Where Guys Are From, Part One - Heat Maps

I’ve at some point described every place I’ve ever lived as “a good town/city/region/state for basketball”


ropensci.org/blog

Where to go observe birds in Radolfzell? An answer with R and open data

Yay, we now know where to find a bird hide not too far from the MPI! Let’s create a basemap for our bounding box, and then add roads and buildings to it. Quite pretty! The lakes can be seen because of the absence of roads and buildings on them


ryantravis.netlify.com

Fantasy Football Player Rankings

Accurate prediction of player performance is of immense value to those of us who play fantasy football. With this in mind, I was curious about how well simple prediction models could perform in this context. Conveniently, Sean J


www.ifconfig.it/hugo

Network topology validation with CDP and Python

Like most IT professionals, I usually configure network devices in a lab environment before the actual installation at the customer site


nowosad.github.io

Pattern-based regionalization

Spatial patterns are an underexplored venue of the Earth and ecological sciences. They could be both an effect of some processes and at the same time affect other ones. For example, a land cover pattern (spatial arrangement of land cover categories) could exist because of an impact of a terrain topology, soils, climate, or human action


dusty.phillips.codes

Python

I really appreciate Python’s pathlib module for managing filesystem stuff. While I don’t love the argparse module for command line parsing, I don’t think it’s worse than other available options


blog.rstudio.com

R/Medicine Conference

The keynote speakers will be: Conference talks will address the use of R in medical applications from Phase I clinical trial design through the analysis of the efficacy of medical therapies in public


martakolczynska.com

Validating survey data

Educational attainment data OECD data SDR data Cleaning and merging SDR and OECD data Results The curious case of ISSP Switzerland Conclusion Appendix with Przemek Powałko General population surveys with representative samples should have a similar education structure as shown by data from administrative sources, especially if survey weights are


coolbutuseless.github.io

gganimate with sprites

A sprite is a 2d bitmap often used by games to represent objects


blog.zenggyu.com/en

A Comparison of Different Ways to Define and Return Outputs From PL/pgSQL Functions

Based on the matrix, the following pair-wise comparisons were made: The following code shows the definition of each function and how they are called. The following are some take-away rules summarised from the


www.jessemaegan.com

Learning to Learn

As we head out on this adventure, there’s nothing I’d love more than to hear what’s working for you, what you think could be improved, and what should be left on the cutting room


r-mageddon.netlify.com

The Burger King Pandemic

Whilst I was rooting around for inspiration, my girlfriend suggested I should do a post about food, quickly followed up by “burgers!!!”. So a few google searches later and I decided to create an animated map, showing all the countries of the world that had at least one Burger King store. I’m particularly chuffed that I’ve managed to combine a multitude of really cool packages in this post


coolbutuseless.github.io

gganimate with bitmap fonts

I put together the little animation shown below, and this is a short guide on how I got there


irene.rbind.io

A Tale of Two Testing Environments

Background Lesson 1: check suggested packages Lesson 2: MODULARIZE + use vagrant scripts with caution Internet solutions Conclusion Today marks the second time I’ve debugged the problem of tests that pass with devtools::test() but fail with devtools::check()


www.ashwinmalshe.com

Application Preparation for MS in Data Analytics

The objective of this post is to help prospective MSDA students in preparing their MSDA applications so as to increase their chances of getting admitted to the program. The suggestions below are actually applicable to any good analytics program


lenkiefer.com

Core Inflation Viz with Progress Bar

About a year ago I shared code for a dataviz with a progress bar. Let’s update that R code using gifski and tweener. The code below will generate this animated gif: Gif code


energychisquared.com

Forecasting

Producing a forecast is a task that generally requires knowledge of the sector, hypothesizing about effects that may influence the result under study and, of course, having some minimal skills in


blog.zenggyu.com/en

Getting to Know a Database

Recently I was introduced to a database which contains tens of thousands of tables and millions of columns. Since I didn’t have much documentation at hand, I constantly had the feeling that I didn’t know enough to use the database while I was exploring it. It should be mentioned that the database I encountered is managed by Greenplum, which is based on


www.riinu.me

My data science toolbox

I’ve been doing data science for over 10 years now. Although most of this time I didn’t realise I was doing data science. I thought I was just doing normal science but focusing on simulations and data analysis, rather than field or lab work


www.semidocumentedlife.com

comparing audio features from my monthly playlists, using spotifyr

Spotify’s API functionality seems pretty straightforward. Tracks are curated as playlists by users, and tracks have a bunch of metadata (artist, album, etc.). Each of these entities (tracks, albums, artists, users) has an ID, allowing you to organize and jump between them


www.williamrchase.com

A responsive CV template with HTML/CSS (Hermione Granger's CV)

I really like this CV template. I think it’s stylish, but not so flashy that it distracts from the material. I worked out some bugs with the responsiveness and tested on all viewports, but let me know if there are any problems there


dusty.phillips.codes

An Order to Learn to Program, Part 2

Part 2: HTML. This is the second in a series on the order to study topics related to programming


mouse-imaging-centre.github.io/blog

Highlights From APPNING and ISMRM 2018

Jacob’s Highlights I recently attended the Workshop on Animal Population Imaging (APPNING 2018) held after the ISMRM conference in Paris


www.ashwinmalshe.com

Installing R and RStudio

This is a short tutorial for the incoming students of UTSA’s MS in Data Analytics program. I am going to assume that the reader has no knowledge of R and RStudio, the Integrated Development Environment (IDE), which we use to code


www.softinio.com

Life changes and announcing SFBayAreaTech

It is absolutely an awesome experience living in the Bay Area amongst so many super smart techies and great startups. On arrival I had two initial goals: to meet everyone in tech and make friends, and to find an awesome new job with a startup that has a great future and is solving problems that line up well with my technical interests


kasparmartens.rbind.io

Neural Processes as distributions over functions

Neural Processes (NPs) caught my attention as they essentially are a neural network (NN) based probabilistic model which can represent a distribution over stochastic processes. So NPs combine elements from two worlds: Both have their advantages and drawbacks


www.ashwinmalshe.com

Syllabus for DA6233

Data Visualization and Communication - MSDA Fall 2018 Syllabus by Ashwin Malshe on


cevo.com.au

Why Agile isn't working

Imagine your good friend asks you to come over to his house to check out his brand new Ferrari. Being only ‘slightly’ jealous you head on over. From a distance you see his prize parked majestically in the driveway and, as expected, it looks amazing


blog.rstudio.com

rstudio

rstudio::conf(2019) continues our tradition of diversity scholarships, and this year we’re doubling the program (again!) to 38 recipients: 32 domestic diversity scholarships available to anyone living in the US or Canada who is a member of a group that is under-represented at rstudio::conf


engineering.pivotal.io

GiST Support In GPORCA

Pivotal’s SQL Optimizer, GPORCA, did not handle GiST indexes, which made any GPORCA-generated plan extremely slow when the input grew large. In this blog post, we will look at what GiST indexes are, how we implemented them in GPORCA, and the resulting performance improvement


rviews.rstudio.com

Highcharting Jobs Friday

Let’s get to it! That plot is nice, but it’s static! Hover on it and you’ll see what I mean. First, we isolate the most recent month by filtering on the last date. We also don’t want the ADP Estimate and filter that out as well. Finally, let’s compare the ADP Estimates to the actual Nonfarm payroll numbers since 2017


energychisquared.com

Why This Blog

Back in November 2012 I started a blog dedicated to two of my three passions: energy and engineering


simplystatistics.org

The Trillion Dollar Question

Recently, Apple’s stock price rose to the point where the company’s market valuation was above $1 trillion, the first U.S. company to reach that benchmark. Subsequently, numerous articles were published describing Apple’s journey to this point and why it got there. Most people describe Apple as a technology company. They make technology products: iPhones, iPads, Macs, etc


www.niklasjohannes.com

Tools for getting started with your PhD

What do I mean by this? Well, see for yourself whether you recognize any of the following behaviors: However, if you left all of those behaviors behind you long ago, well, you can close this tab and save yourself ten minutes


peerchristensen.netlify.com

Topic Modelling of Trustpilot Reviews with R and tidytext

Improving the look of figures in ggplot2 is fairly simple. For consistency, we’ll create a clean and simple theme based on the APA theme from the jtools package and change some of the features. The background colour will be set to a light grey hue


www.tidyverse.org/articles

ggplot2 3.0.0: community posts

Spoiler alert: excellent material


yihui.name/en

makeActiveBinding()

Finally, we generate all outputs to build the dependency table


www.tidyverse.org/articles

scales 1.0.0

This is a major release with significant changes to the popular formatter functions, and added transformations


amateurdatasci.rbind.io

Deriving the Quotient Rule from the Product Rule

1 From Product to Quotient 1.1 Product Rule of Differentiation 1.2 Quotient Rule of Differentiation 1.3 Derive the Quotient Rule from the Product Rule 2 Problems 2.1 Extend the Product Rule and Prove 2
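For readers who want the core step named in the outline, one standard way to get the quotient rule from the product rule is sketched below. It assumes that f and g are differentiable, that g(x) is nonzero, and that the quotient f/g is itself differentiable; it is not necessarily the route the post takes.

```latex
\begin{aligned}
f(x) &= \frac{f(x)}{g(x)}\, g(x) \\
f'(x) &= \left(\frac{f}{g}\right)'(x)\, g(x) + \frac{f(x)}{g(x)}\, g'(x)
  && \text{(product rule)} \\
\left(\frac{f}{g}\right)'(x)
  &= \frac{f'(x) - \dfrac{f(x)}{g(x)}\, g'(x)}{g(x)}
   = \frac{f'(x)\, g(x) - f(x)\, g'(x)}{g(x)^{2}}
\end{aligned}
```

The first line only rewrites f as a product, so the rest is the product rule followed by solving for the derivative of the quotient.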


yihui.name/en

Double Negatives

Once again: naming, is,


ropensci.org/technotes

phylotaR

In this technote I will outline what phylotaR was developed for, how to install it and how to run it with some simple examples


blog.themechanicalbear.com

tastytrade - rstats - options...

I have been an options trader and follower of tastytrade research and methods since 2012. For the past few years, I have backtested trading ideas to learn investment strategies and improve my skills in rstats and data analysis


martakolczynska.com

Dot plot challenge

Getting and reshaping the data The Dot Plot The August edition of the Storytelling with Data challenge #SWDchallenge stars the dot plot


ropensci.org/blog

Extracting and Processing eBird Data

Access to the full eBird dataset is provided via two large, tab-separated text files. The eBird Basic Dataset (EBD) contains the bird observation information and consists of one line for each observation of a species on a checklist


engineering.pivotal.io

Frontend Contract Tests Without Magic Numbers

We will explain how we fell into the magic number anti-pattern in our frontend tests. Then, we demonstrate how to use a contract-stub reader to avoid this