mailund.github.io/r-programmer-blog
pmatch 0.1.3
I have just released version 0.1.3 of pmatch to CRAN. There are not a lot of changes to this version compared to 0.1…
blog-mjay.firebaseapp.com
Handling Large text data in production
Question: I have a single 5GB JSON file that consists of millions of queries (questions) with their answers. Can anyone please tell me: 1) How to handle such large datasets? 2) The objective is to find the top 5 queries, along with their answers, that are most similar to a new test query…
www.robert-hickman.eu
Hello World! And A Small Chess Plotting Package
To copy the readme mini-vignette provides a nice overview of the uber-function which goes from pgn -> gif. There’s also a load of semi-arranged smaller functions used to work out the positions of the pieces and plot the board etc…
gcppodcast.com
NVIDIA and Deep Learning Research with Bryan Catanzaro
Bryan Catanzaro is VP of Applied Deep Learning Research at NVIDIA, where he leads a team solving problems in domains ranging from video games to chip design using deep learning…
yihui.name/en
Let MiKTeX Install Missing LaTeX Packages Automatically
From the viewpoint of the developer, it is absolutely the right thing to do to ask users before installing the missing LaTeX packages. However, from the viewpoint of users, I guess 99.99% of users will agree to install the missing packages…
engineering.pivotal.io
Failure is a part of the game
Failure can come in a variety of avenues in life - career, love, and friendships to name a few. I’m choosing to highlight a few extracurricular projects to illustrate how success can come from failure…
leonawicz.github.io/blog
Introducing tabr
While music can be quite complex and a full score will be much longer, something as simple as the following code snippet produces the music notation in the accompanying image. You can install tabr from GitHub with: Finally, there are nonetheless limitations to LilyPond itself…
research.libd.org/rstatsclub
Introduction to Scraping and Wrangling Tables from Research Articles
Next, get the URL of the webpage where the table is stored. This is what the first few lines of our scraped product look like: While this table has the information we want, it is clearly still a mess…
www.rladiesnyc.org
Brooke Watson
Open source software is made for remixing. When I first switched from STATA to R, I was comfortable using predefined packages and commands, but it quickly became apparent that R’s appeal lies in the power to write custom functions and packages. What’s more, because R is open source, these packages don’t have to be built from scratch…
ryanestrellado.netlify.com
California School Dashboards Part 2
This is part two of a three-part series where I’ll be working with California School Dashboard data by cleaning, visualizing, and exploring through modeling. You can read the first part of this series, which shows one way to clean and prepare the data, at this…
www.rdatagen.net
Exploring the underlying theory of the chi-square test through simulation - part 1
Kids today are so sophisticated (at least they are in New York City, where I live). While I didn’t hear about the chi-square test of independence until my first stint in graduate school, they’re already talking about it in high school…
thomasmock.netlify.com
Functional programming in R with Purrr
When you first started in R you likely were writing simple code to generate one outcome. This is great, you are learning about strings, math, and vectors in R! Then you get started with some basic analyses…
blog.mgechev.com
Machine Learning-Driven Bundling. The Future of JavaScript Tooling.
In this article, I’ll introduce the early implementation of a few tools which, based on techniques from machine learning, allow us to perform data-driven chunk clustering and pre-fetching for single-page applications…
lenkiefer.com
Pipe Tweenr
I LIKE TO MAKE ANIMATIONS WITH R. Sometimes folks ask me how they add to understanding. They don’t always, but often, particularly when you are working with time series, I find they help visualize trends and understand the evolution of variables…
r-tastic.co.uk
Prime hints for running a data project in R
I’ve been asked more and more for hints and best practices when working with R. It can be a daunting task, depending on how deep or specialised you want to be…
mailund.github.io/r-programmer-blog
Transforming functions with cases calls
The issue with byte-compilation I wrote about yesterday can indeed be fixed by transforming functions that call cases…
mailund.github.io/r-programmer-blog
Building a package that uses pattern matching
After a week spent programming string algorithms in C—for teaching purposes, I am not working on a new read-mapper—it is nice to get back to programming in R…
emmavestesson.netlify.com
Settlers of Catan
In our living room there is an old chest hiding some real treasures. Every now and again we will get Settlers of Catan out. I never grow tired of playing it as the board changes every time…
mailund.github.io/r-programmer-blog
tailr v0.1.1
As I wrote about here and here, I had a problem in tailr with higher-order functions…
thomasmock.netlify.com
A gentle guide to Tidy statistics in R
We will be using MMSE (mini-mental status exam) scores to assess the degree of cognitive impairment…
engineering.pivotal.io
Benchmarking the Disk Speed of IaaSes
In this blog post we record three metrics: And we record them for the following IaaSes: The table below summarizes our findings: AWS’s io1 storage is a “tunable” storage offering — you specify the number of IOPS you require…
wirtel.be
Challenge for 2018
I think I am just frustrated because I don’t have time to share my experience or just continue to contribute to several open source projects…
jsmayorga.com
Getting Global Fishing Watch Data from Google Big Query using R
Now we are all set to start querying and analyzing Global Fishing Watch’s data…
statsbylopez.netlify.com
The impact of make-up calls is probably bigger than you think
A common line of reasoning behind call reversals is that referees would prefer to not be part of the final narrative as to why a particular team won or lost…
adamspannbauer.github.io
Building a Rasa Chatbot to Perform Natural Language Queries
In this post I’ll be sharing a stateless chat bot built with Rasa. The bot has been trained to perform natural language queries against the iTunes Charts to retrieve app rank data. A preview of the bot’s capabilities can be seen in a small Dash app that appears in the gif below…
wytham.rbind.io
Contra JFK, use R because it is easy, not because it is hard
There’s an attitude that I believe is reasonably common among applied economists, that coding is the easy part of what we do and where we add the least…
yihui.name/en
Don't Use Spaces or Underscores in File Paths; Use Dashes Instead
These could be bad chunk labels: Sigh.…
jsmayorga.com
Mapping the Global Network of Transnational Fisheries
All the analysis is done in R, with Studio, using the following packages: Here is a snippet of the dataset: where: We have excluded here 1) the connections between EU member states, 2) the EU Northern agreements with Norway and Iceland, 3) connections between sovereign states, e.g. France and Reunion, and 4) disputed or jointly managed marine…
www.cultureofinsight.com/blog
Responsive iframes for Shiny Apps
Getting Shiny out into the wild: Shiny has really changed the game in terms of analytical web-application development…
www.mytinyshinys.com
Where should I live?
After a long spate of just soccer posts, it is a relief (delayed pun intended) to turn to a quick look at a relatively new package, weathercan, released under the https://ropensci…
blog.wallaroolabs.com
Your Wallaroo Questions Answered
Wallaroo Labs has received a lot of great feedback from developers on Hacker News and other communities…
www.mytinyshinys.com
EPL Week 30
For the remainder of the season, I will be travelling with a backup laptop, so please excuse any shortfall in posts and site updates. Match of the Day: Rashford schools Alex-Arnold. Palace joined WBA with a league-leading ninth one-goal defeat…
gcppodcast.com
OpenCensus with Morgan McLean and JBD
Morgan McLean is the Product Manager for Tracing, Debugging, and Profiling at Google, including OpenCensus. I heard there are abilities to natively extend Kubernetes - what does that mean, and also how do I do…
roh.engineering
fitur 0.5.25 Release
Fixed appearance of plots; added plot_density function for comparing PDFs of fitted functions; updated argument naming conventions. The output is a ggplot object, so you can add colors, styling, etc…
lbusett.netlify.com
Automatically importing publications from bibtex to a hugo-academic blog
All that is left is to run the script: Running the script will give you files like this one: , where I tweaked the hugo-academic format a bit to include bibliographic info such as volume, number, pages and DOI link…
www.ifconfig.it/hugo
FMC API and TextFSM
Automation and programmability is not a new topic for me. Having studied Information Technology in High School I’ve always coded somehow, never making it my primary focus but always using it as a tool to make my life easier…
nowosad.github.io
Making maps of the USA with R
The next step is to calculate scale relations between the mainland and Hawaii as well as Alaska…
purrple.cat/blog
The not so obvious value of build passing
The real problem however is that because you know it was broken already, you just don’t pay attention so you don’t have a way to quickly assert if you have just added a new 🐞, and then you live in fear…
www.redbandsports.net
Did the 16-team format actually ruin the Brier and Scotties?
By Nick Faris and Guy Spurrier. Brad Gushue had reason to be ruffled. It was last Wednesday in Regina, and the Team Canada skip had just suffered his first loss at the Brier, a 10-7 defeat to Brendan Bottcher of Alberta…
lenkiefer.com
Forecasting Game
LET’S PICK BACK UP where we left off and think about communicating forecast results. To help guide our thinking, let’s set up a little game. Basic setup: Like last time, we’re going to focus on a situation where a forecaster observes some information about the world and makes an announcement about a future binary outcome…
yihui.name/en
Miao YU is Looking for a Postdoc Position in the US
My friend Miao is going to finish his postdoctoral research in Waterloo this summer. Currently he is looking for a new postdoc position in the US (no particular preference over the location in US)…
mailund.github.io/r-programmer-blog
Red-black trees in matchbox
I’m working on implementing red-black search trees in matchbox and have managed most of it by now…
www.jtimm.net
place from text
In this post, we demonstrate some different methodologies for exploring the geographical information found in…
lenkiefer.com
Charting Jobs Friday with R
LAST FRIDAY WAS JOBS FRIDAY, the day when the U.S. Bureau of Labor Statistics (BLS) releases its monthly employment situation report. This report is blanketed with media coverage, and economists and financial analysts all over the world pay close attention to it…
research.libd.org/rstatsclub
Edit your bashrc file for a nicer terminal experience
Next open them with your text editor (say Notepad++, TextMate 2, RStudio, among others) and paste the following contents. The next change will save you a lot of time! Plus it goes nicely with the bash history changes we just made…
mailund.github.io/r-programmer-blog
Linked lists in matchbox
I have started playing with data structures in matchbox and the first structure I implemented had to be linked lists…
research.libd.org/rstatsclub
Textmate setup (Mac only)
Sometimes students are interested in this setup, which is what I’ll document here. Though I want to highlight that you can get a very similar setup using other tools. Note that this setup only works for Mac computers. and under bundle, choose the R bundle as shown below. As you can see, it hasn’t been updated in a while…
purrple.cat/blog
Anagrams
Next we allocate some data structures. Then, the results vector, where we will collect the anagrams. I just hope however that walking through the code is useful…
blog.schochastics.net
Analyzing NBA Player Data III
I decided to map the 70 stats into a 10 dimensional space. This “new” space supposedly preserves the intrinsic distance of the “old” space, but reduces the noise of the original data so that the differences and similarities of players become more evident…
www.sastibe.de
Benchmarking AWS Instances with MNIST classification
These instances differ in two dimensions: price and performance. Obviously, these dimensions are highly correlated, since higher price means (or should mean, at least) higher performance…
roh.engineering
Building Distribution Reference Tables in R
I’ve recently been studying for a professional exam that does not allow computers or advanced calculators. Some of the subject matter will require use of a few statistical distributions which can be very time-consuming to calculate manually…
research.libd.org/rstatsclub
Contributing to the LIBD rstats club
We first need to get the appropriate tools installed on our computer. Ok, let’s go ahead and install it with: The file structure of our blog involves a total of 3 GitHub repositories that are related to each other as shown below. However, you will only need to interact with one of them…
mailund.github.io/r-programmer-blog
Problems With Higher Order Functions in tailr
Ok, there is a problem with higher-order functions in my tailr package that I ran into while writing linked list functions for my matchbox package…
research.libd.org/rstatsclub
Welcome to the LIBD rstats club!
PS This is not a course or boot camp site to get started using R, for that there are other resources available…
theaknowles.com
Workshop materials
The real reason that Sally and I decided not to present, instead seeking out other women in the community to do so, was that we simply knew these women were out there and wanted them to join our…
www.mytinyshinys.com
EPL Week 29
For the remainder of the season, I will be travelling with a backup laptop, so please excuse any shortfall in posts and site updates. Match of the Day: In 273 league appearances with Manchester United, Patrice Evra was part of a lineup that conceded 4 or more goals on 5 occasions…
fharrell.com
Improving Research Through Safer Learning from Data
The following discussion concentrates on inference, although several of the concepts, especially measurement accuracy, fully pertain to exploratory data analysis…
mailund.github.io/r-programmer-blog
Matchbox and CMD CHECK
Ok, first, the package I wrote about yesterday will be called matchbox, following Dmytro Perepolkin’s suggestion (thanks!). You can get it at GitHub…
blog.wallaroolabs.com
Performance testing a low-latency stream processing system
At Wallaroo Labs we’ve been working on our stream processing engine, Wallaroo, for just under two years now. We’ve designed Wallaroo to be able to handle millions of messages a second on a single server with low microsecond latencies…
lcolladotor.github.io
blogdown archetype (template)
I also like reminding myself how to do some common tasks. Basically, the equivalent of the new R Markdown file you get when using RStudio. In my case, I want to remind myself of the YAML options I frequently use (toc, fig height and width) or how to add screenshots…
timtrice.net
Answering Simple Questions in SQL Coding Interviews
Any database server will do. For this instance, MySQL is used. When I read this post I immediately wondered if they were trick questions I wasn’t picking up on. They all seemed very easy to answer. They were a little tricky, but certainly not unreasonably complicated…
mailund.github.io/r-programmer-blog
Help Me Choose a Package Name
What’s in a name? That which we call a rose / By any other word would smell as sweet (William Shakespeare, Romeo and Juliet). I have plans for re-implementing several of the data structures I wrote about in Functional Data Structures in…
emmavestesson.netlify.com
My trip to Mexico in emojis
The code that never really worked: This post is the result of hours and hours of me trying to write some code but never getting it quite right. I think that one of the worst things that you can do to yourself is to be scared to admit that you are struggling because you will end up never trying anything new…
statsbylopez.netlify.com
On the risks of categorizing a continuous variable (with an application to baseball data)
In the third inning during a contest a few weeks back between the Nationals and Cubs, Washington’s Brian Goodwin hit a line drive to left field with two outs and a runner on third. Despite an initial pause, Chicago’s Kyle Schwarber ran in and attempted to field the ball around his knees…
blog.rstudio.com
Platform Deprecation Strategy
In an effort to streamline product development, maintenance, and support to ensure the best experience for our users, we have created a strategy for operating system and browser deprecation…
purrple.cat/blog
arrow, rrrow, rcher, spurrrow
Here I am again at the conundrum of choosing a name for a thing. This is hard, I like when it’s over and I have the perfect name, and I feel finally free to try to match the personality of code to the name…
lcolladotor.github.io
blogdown Insert Image addin
So we need to use either the Markdown or HTML syntax for adding the image. Maybe your initial thought is to use: If you want to edit the height or width, then you need to use the HTML syntax. Something like: So let’s go ahead and select an image we want to upload. In my case, I chose an image that already exists…
ellocke.github.io
(R) Joyplots
German Zweites Juristisches Staatsexamen (2nd State Exam in Laws) is said to be tough. Let’s have a look at how hard it really is by visualising the distribution of grades from the Berlin 2017/IV campaign. The written part of the final exam consists of 7 handwritten 5-hour length cases…
www.redbandsports.net
How many home runs is Josh Donaldson predicted to hit in 2018
Last week, Birds All Day podcast cohost Drew Fairservice re-tweeted some over-under betting lines on individual home run totals for the coming baseball season: "Over on Goldy and Bryce Harper plz. https://t.co/Smha0KYulI" - Drew F (@DrewGROF), February 26, 2018. In the replies to the tweet, he wondered how the lines compared to projections…
www.ifconfig.it/hugo
Mikrotik hAP lite classic
For a network engineer, living and working in the field has some challenges that are not common in office environments…
research.libd.org/rstatsclub
Test post for checking website
Useful links for editing the theme: This blog post was made possible thanks…
yihui.name/en
The R Markdown Source Documents of My Presentations
The Rmd source file can be obtained…
ndres.me
Top 5 Best Jupyter Notebook Extensions
Notebook extensions are plug-ins that you can easily add to your Jupyter notebooks…
mailund.github.io/r-programmer-blog
Variable bindings with pattern matching
I just added a new feature to my pmatch package…
mouse-imaging-centre.github.io/blog
Why Relative Volumes Matter
Figuring out how one group’s brain is different from another’s is a big part of neuroscience. MRI-neuroanatomy – the study of the sizes of brain regions – is a wonderful tool for this job and makes up a sizeable chunk of what we study…
www.aggieerin.com
Working With Messy Text
Heyo! I am doing my best to procrastinate here on a blustery Tuesday afternoon. So, I decided to share some code I’ve put together that solves problems in R that I used to do in perl. HTML or C++ was probably my first real language, but I love the heck out of perl…
purrple.cat/blog
first dplyr mondays
Four issues (and then some comments on other issues) feels minimal, but this includes getting back on track with the codebase and setting up a more formal way of contributing, i.e. through pull requests, reviews, and systematic testing. Today is another day, so I’ll work on another project…
mailund.github.io/r-programmer-blog
Another approach to evaluating dynprog expressions
In the approach to evaluating dynamic-programming expressions, that I wrote about yesterday, I used ranges- and recursion-specifications to build a loop for updating a table and then evaluated that loop inside an environment where local variables would over-scope the quosure environment from the…
lenkiefer.com
Forecasting and deciding binary outcomes under asymmetric information
LAST WEEK IN THE WALL STREET JOURNAL an article talked about how pundits can strategically make probabilistic forecasts. It seems 40% is a sort of magic number, where it’s high enough that if the event comes true you can claim credit as a forecaster, but if it doesn’t happen, you still gave it less than 50/50 odds…
purrple.cat/blog
Strings know their own length
A simple way to get that information in C would be a loop (it’s ok to write loops in C)…
yihui.name/en
The Video of My Talk on blogdown at rstudio
After I gave the talk on blogdown at rstudio::conf 2018, some people really thought I was a fast typist, because I “typed” a relatively long post at the podium in just 20 seconds. On one hand, I was happy to learn that, because my trick worked (too) well on some people…
blog.schochastics.net
Analyzing NBA Player Data II
The next step will be to reduce the noise of the data by considering only players that played more than 500 minutes in a season. This leaves us with 705 rows to analyze. To obtain our very own set of new positions, we will go through the following steps: Now we need to decide on how many components to keep…
adamspannbauer.github.io
Building an Image Search Engine with Pretrained ResNet50 from Keras
In this post we’ll be using the pretrained ResNet50 ImageNet weights shipped with Keras as a foundation for building a small image search engine. In the image below we can see some sample output from our final product. As in my last post, we’ll be working with app icons that were gathered by this scrape script…
mailund.github.io/r-programmer-blog
Comments
Ok, if this is working, it should now be possible to comment on posts here…
mailund.github.io/r-programmer-blog
Evaluating dynprog expressions
I think I now have a complete implementation of the dynamic programming DSL I wrote about the other day…
blog.schochastics.net
Analyzing NBA Player Data I
As a football (soccer) data enthusiast, I have always been jealous of the amount of available data for American sports. While much of the interesting football data is proprietary, you can get virtually anything of interest for the NBA, MLB, NFL or NHL…
davemcg.github.io
Are you in genomics and building models? Stop using ROC - use PR
Area Under the Curve (AUC) of Receiver Operating Characteristic (ROC) is a terrible metric for a genomics problem. Do not use it. This metric also goes by AUC or AUROC. Use Precision Recall AUC…
mailund.github.io/r-programmer-blog
Designing a DSL for dynamic programming
I’m working on an example for one of the chapters of Domain Specific Languages in R that will appear in the printed version but wasn’t included in the earlier e-book…
www.mytinyshinys.com
EPL Week 28
For the remainder of the season, I will be travelling with a backup laptop, so please excuse any shortfall in posts and site updates. Match of the Day: Ashley Young needs to play just over 3 full games to record the most minutes in a season wearing a Man. Utd…
lenkiefer.com
Rock that dadbod plot!
Spring is nearly upon us, or at least we can hope. Let’s examine how housing activity typically rounds into shape as the weather warms up. We’ll make some fun plots with R. Seasonality in housing data: Housing market activity in the United States is highly seasonal. Consider this animated plot. This plot shows U.S…
davemcg.github.io
Something Different
Something Different: Automated Neighborhood Traffic Monitoring, by David McGaughey (2018-03-03; categories and tags: R, python, raspberry, pi). This is, obviously, a personal project. Traffic is a concern in my town…
mailund.github.io/r-programmer-blog
Tick-marks for log10 axis
For the tailr post I needed to plot some benchmark results…
www.carlbfrederick.com
Uncovering the relationships among functions in a package
The practical value of this exercise comes from the following sorts of insights: Thanks for reading! Feel free to join the discussion…
mailund.github.io/r-programmer-blog
Blog Setup
Ok, I hadn’t planned to write any more about how the blog is set up, since that isn’t that interesting to me and probably isn’t to you — either you know a lot more about this than me, in which case I have nothing to teach you, or you just don’t care, which I can relate…
adamspannbauer.github.io
Finding and Using Images' Dominant Colors using Python & OpenCV
This post is about finding an image’s dominant color. To illustrate this concept we’ll be working with app icons from the Apple App Store…
drmolina.netlify.com
OktoberfestR
The app takes data from the Oktoberfest from 1985 until 2016 (so far)…
mailund.github.io/r-programmer-blog
Purpose of this blog
Since I already have a blog, you might be asking, “why another one?”…
purrple.cat/blog
multiple lags with tidy evaluation
Let’s break it down in steps. The function takes 3 parameters: but hopefully with nicer (or at least shorter) syntax: As is often the case, I immediately posted this on Twitter…