mailund.github.io/r-programmer-blog

pmatch 0.1.3

I have just released version 0.1.3 of pmatch to CRAN. There are not a lot of changes to this version compared to 0.1


blog-mjay.firebaseapp.com

Handling Large text data in production

Question: I have a single 5 GB JSON file consisting of millions of queries (questions) with their answers. Can anyone please tell me: 1) How should I handle a dataset this large? 2) The objective is to find the top 5 queries most similar to a new test query, along with their answers


www.robert-hickman.eu

Hello World! And A Small Chess Plotting Package

To copy the readme mini-vignette provides a nice overview of the uber-function which goes from pgn -> gif. There’s also a load of semi-arranged smaller functions used to work out the positions of the pieces and plot the board etc


gcppodcast.com

NVIDIA and Deep Learning Research with Bryan Catanzaro

Bryan Catanzaro is VP of Applied Deep Learning Research at NVIDIA, where he leads a team solving problems in domains ranging from video games to chip design using deep learning


yihui.name/en

Let MiKTeX Install Missing LaTeX Packages Automatically

From the viewpoint of the developer, it is absolutely the right thing to do to ask users before installing the missing LaTeX packages. However, from the viewpoint of users, I guess 99.99% of users will agree to install the missing packages


engineering.pivotal.io

Failure is a part of the game

Failure can come in a variety of avenues in life: career, love, and friendships, to name a few. I’m choosing to highlight a few extracurricular projects to illustrate how success can come from failure


leonawicz.github.io/blog

Introducing tabr

While music can be quite complex and a full score will be much longer, something as simple as the following code snippet produces the music notation in the accompanying image. You can install tabr from GitHub with: Finally, there are nonetheless limitations to LilyPond itself


research.libd.org/rstatsclub

Introduction to Scraping and Wrangling Tables from Research Articles

Next, get the url of the webpage where the table is stored. This is what the first few lines of our scraped product look like: While this table has the information we want, it is clearly still a mess


www.rladiesnyc.org

Brooke Watson

Open source software is made for remixing. When I first switched from STATA to R, I was comfortable using predefined packages and commands, but it quickly became apparent that R’s appeal lies in the power to write custom functions and packages. What’s more, because R is open source, these packages don’t have to be built from scratch


ryanestrellado.netlify.com

California School Dashboards Part 2

This is part two of a three-part series where I’ll be working with California School Dashboard data by cleaning, visualizing, and exploring through modeling. You can read the first part of this series, which shows one way to clean and prepare the data, at this


www.rdatagen.net

Exploring the underlying theory of the chi-square test through simulation - part 1

Kids today are so sophisticated (at least they are in New York City, where I live). While I didn’t hear about the chi-square test of independence until my first stint in graduate school, they’re already talking about it in high school


thomasmock.netlify.com

Functional programming in R with Purrr

When you first started in R you likely were writing simple code to generate one outcome. This is great, you are learning about strings, math, and vectors in R! Then you get started with some basic analyses


blog.mgechev.com

Machine Learning-Driven Bundling. The Future of JavaScript Tooling.

In this article, I’ll introduce the early implementation of a few tools which, based on techniques from machine learning, allow us to perform data-driven chunk clustering and pre-fetching for single-page applications


lenkiefer.com

Pipe Tweenr

I LIKE TO MAKE ANIMATIONS WITH R. Sometimes folks ask me how they add to understanding. They don’t always, but often, particularly when you are working with time series, I find they help visualize trends and understand the evolution of variables


r-tastic.co.uk

Prime hints for running a data project in R

I’ve been asked more and more for hints and best practices when working with R. It can be a daunting task, depending on how deep or specialised you want to be


mailund.github.io/r-programmer-blog

Transforming functions with cases calls

The issue with byte-compilation I wrote about yesterday can indeed be fixed by transforming functions that call cases


mailund.github.io/r-programmer-blog

Building a package that uses pattern matching

After a week spent programming string algorithms in C (for teaching purposes; I am not working on a new read-mapper), it is nice to get back to programming in R


emmavestesson.netlify.com

Settlers of Catan

In our living room there is an old chest hiding some real treasures. Every now and again we will get Settlers of Catan out. I never grow tired of playing it as the board changes every time


mailund.github.io/r-programmer-blog

tailr v0.1.1

As I wrote about here and here, I had a problem in tailr with higher-order functions


thomasmock.netlify.com

A gentle guide to Tidy statistics in R

We will be using MMSE (mini-mental status exam) scores to assess the degree of cognitive impairment


engineering.pivotal.io

Benchmarking the Disk Speed of IaaSes

In this blog post we record three metrics: And we record them for the following IaaSes: The table below summarizes our findings: AWS’s io1 storage is a “tunable” storage offering — you specify the number of IOPS you require


wirtel.be

Challenge for 2018

I think I am just frustrated because I don’t have time to share my experience or just continue to contribute to several open source projects


jsmayorga.com

Getting Global Fishing Watch Data from Google Big Query using R

Now we are all set to start querying and analyzing Global Fishing Watch’s data


statsbylopez.netlify.com

The impact of make-up calls is probably bigger than you think

A common line of reasoning behind call reversals is that referees would prefer to not be part of the final narrative as to why a particular team won or lost


adamspannbauer.github.io

Building a Rasa Chatbot to Perform Natural Language Queries

In this post I’ll be sharing a stateless chat bot built with Rasa. The bot has been trained to perform natural language queries against the iTunes Charts to retrieve app rank data. A preview of the bot’s capabilities can be seen in a small Dash app that appears in the gif below


wytham.rbind.io

Contra JFK, use R because it is easy, not because it is hard

There’s an attitude that I believe is reasonably common among applied economists, that coding is the easy part of what we do and where we add the least


yihui.name/en

Don't Use Spaces or Underscores in File Paths; Use Dashes Instead

These could be bad chunk labels: Sigh.


jsmayorga.com

Mapping the Global Network of Transnational Fisheries

All the analysis is done in R, with Studio, using the following packages: Here is a snippet of the dataset: where: We have excluded here 1) the connections between EU member states, 2) the EU Northern agreements with Norway and Iceland, 3) connections between sovereign states, e.g. France and Reunion, and 4) disputed or jointly managed marine


www.cultureofinsight.com/blog

Responsive iframes for Shiny Apps

Getting Shiny out into the wild: Shiny has really changed the game in terms of analytical web-application development


www.mytinyshinys.com

Where should I live?

After a long spate of just soccer posts, it is a relief (delayed pun intended) to turn to a quick look at a relatively new package, weathercan, released under the https://ropensci...


blog.wallaroolabs.com

Your Wallaroo Questions Answered

Wallaroo Labs has received a lot of great feedback from developers on Hacker News and other communities


www.mytinyshinys.com

EPL Week 30

For the remainder of the season, I will be travelling with a backup laptop, so please excuse any shortfall in posts and site updates. Match of the Day: Rashford schools Alex-Arnold. Palace joined WBA with a league-leading ninth one-goal defeat


gcppodcast.com

OpenCensus with Morgan McLean and JBD

Morgan McLean is the Product Manager for Tracing, Debugging, and Profiling at Google, including OpenCensus. I heard there are abilities to natively extend Kubernetes - what does that mean, and also how do I do


roh.engineering

fitur 0.5.25 Release

Fixed appearance of plots. Added plot_density function for comparing pdfs of fitted functions. Updated argument naming conventions. The output is a ggplot object so you can add colors, styling, etc


lbusett.netlify.com

Automatically importing publications from bibtex to a hugo-academic blog

All that is left is to run the script: Running the script will give you files like this one: , where I tweaked the hugo-academic format a bit to include bibliographic info such as volume, number, pages and doi link


www.ifconfig.it/hugo

FMC API and TextFSM

Automation and programmability is not a new topic for me. Having studied Information Technology in High School I’ve always coded somehow, never making it my primary focus but always using it as a tool to make my life easier


nowosad.github.io

Making maps of the USA with R

The next step is to calculate scale relations between the mainland and Hawaii as well as Alaska


purrple.cat/blog

The not so obvious value of build passing

The real problem however is that because you know it was broken already, you just don’t pay attention so you don’t have a way to quickly assert if you have just added a new 🐞, and then you live in fear


www.redbandsports.net

Did the 16-team format actually ruin the Brier and Scotties?

By Nick Faris and Guy Spurrier. Brad Gushue had reason to be ruffled. It was last Wednesday in Regina, and the Team Canada skip had just suffered his first loss at the Brier, a 10-7 defeat to Brendan Bottcher of Alberta


lenkiefer.com

Forecasting Game

LET’S PICK BACK UP where we left off and think about communicating forecast results. To help guide our thinking, let’s set up a little game. Basic setup: Like last time, we’re going to focus on a situation where a forecaster observes some information about the world and makes an announcement about a future binary outcome


yihui.name/en

Miao YU is Looking for a Postdoc Position in the US

My friend Miao is going to finish his postdoctoral research in Waterloo this summer. Currently he is looking for a new postdoc position in the US (no particular preference over the location in US)


mailund.github.io/r-programmer-blog

Red-black trees in matchbox

I’m working on implementing red-black search trees in matchbox and have managed most of it by now


www.jtimm.net

place from text

In this post, we demonstrate some different methodologies for exploring the geographical information found in


lenkiefer.com

Charting Jobs Friday with R

LAST FRIDAY WAS JOBS FRIDAY, the day when the U.S. Bureau of Labor Statistics (BLS) releases its monthly employment situation report. This report is blanketed with media coverage, and economists and financial analysts all over the world pay close attention to it


research.libd.org/rstatsclub

Edit your bashrc file for a nicer terminal experience

Next open them with your text editor (say Notepad++, TextMate 2, RStudio, among others) and paste the following contents. The next change will save you a lot of time! Plus it goes nicely with the bash history changes we just made


mailund.github.io/r-programmer-blog

Linked lists in matchbox

I have started playing with data structures in matchbox and the first structure I implemented had to be linked lists


research.libd.org/rstatsclub

Textmate setup (Mac only)

Sometimes students are interested in this setup, which is what I’ll document here. Though I want to highlight that you can get a very similar setup using other tools. Note that this setup only works for Mac computers. and under bundle, choose the R bundle as shown below. As you can see, it hasn’t been updated in a while


purrple.cat/blog

Anagrams

Next we allocate some data structures. Then, the results vector, where we will collect the anagrams. I just hope however that walking through the code is useful


blog.schochastics.net

Analyzing NBA Player Data III

I decided to map the 70 stats into a 10 dimensional space. This “new” space supposedly preserves the intrinsic distance of the “old” space, but reduces the noise of the original data so that the differences and similarities of players become more evident


www.sastibe.de

Benchmarking AWS Instances with MNIST classification

These instances differ in two dimensions: price and performance. Obviously, these dimensions are highly correlated, since higher price means (or should mean, at least) higher performance


roh.engineering

Building Distribution Reference Tables in R

I’ve recently been studying for a professional exam that does not allow computers or advanced calculators. Some of the subject matter will require use of a few statistical distributions which can be very time-consuming to calculate manually


research.libd.org/rstatsclub

Contributing to the LIBD rstats club

We first need to get the appropriate tools installed on our computer. Ok, let’s go ahead and install it with The file structure of our blog involves a total of 3 GitHub repositories that are related to each other as shown below. However, you will only need to interact with one of them


mailund.github.io/r-programmer-blog

Problems With Higher Order Functions in tailr

Ok, there is a problem with higher-order functions in my tailr package that I ran into while writing linked list functions for my matchbox package


research.libd.org/rstatsclub

Welcome to the LIBD rstats club!

PS This is not a course or boot camp site to get started using R, for that there are other resources available


theaknowles.com

Workshop materials

The real reason that Sally and I decided not to present, instead seeking out other women in the community to do so, was that we simply knew these women were out there and wanted them to join our


www.mytinyshinys.com

EPL Week 29

For the remainder of the season, I will be travelling with a backup laptop, so please excuse any shortfall in posts and site updates. Match of the Day: In 273 league appearances with Manchester United, Patrice Evra was part of a line up that conceded 4 or more goals on 5 occasions


fharrell.com

Improving Research Through Safer Learning from Data

The following discussion concentrates on inference, although several of the concepts, especially measurement accuracy, fully pertain to exploratory data analysis


mailund.github.io/r-programmer-blog

Matchbox and CMD CHECK

Ok, first, the package I wrote about yesterday will be called matchbox, following Dmytro Perepolkin’s suggestion (thanks!). You can get it at GitHub


blog.wallaroolabs.com

Performance testing a low-latency stream processing system

At Wallaroo Labs we’ve been working on our stream processing engine, Wallaroo for just under two years now. We’ve designed Wallaroo to be able to handle millions of messages a second on a single server with low microsecond latencies


lcolladotor.github.io

blogdown archetype (template)

I also like reminding myself how to do some common tasks. Basically, the equivalent of the new R Markdown file you get when using RStudio. In my case, I want to remind myself of the YAML options I frequently use (toc, fig height and width) or how to add screenshots


timtrice.net

Answering Simple Questions in SQL Coding Interviews

Any database server will do. For this instance, MySQL is used. When I read this post I immediately wondered if they were trick questions I wasn’t picking up on. They all seemed very easy to answer. They were a little tricky, but certainly not unreasonably complicated


gcppodcast.com

Cloud AI with Dr. Fei-Fei Li

Additional sample resources on Dr


mailund.github.io/r-programmer-blog

Help Me Choose a Package Name

“What’s in a name? That which we call a rose / By any other word would smell as sweet” (William Shakespeare, Romeo and Juliet). I have plans for re-implementing several of the data structures I wrote about in Functional Data Structures in


emmavestesson.netlify.com

My trip to Mexico in emojis

The code that never really worked: This post is the result of hours and hours of me trying to write some code but never getting it quite right. I think that one of the worst things that you can do to yourself is to be scared to admit that you are struggling because you will end up never trying anything new


statsbylopez.netlify.com

On the risks of categorizing a continuous variable (with an application to baseball data)

In the third inning during a contest a few weeks back between the Nationals and Cubs, Washington’s Brian Goodwin hit a line drive to left field with two outs and a runner on third. Despite an initial pause, Chicago’s Kyle Schwarber ran in and attempted to field the ball around his knees


blog.rstudio.com

Platform Deprecation Strategy

In an effort to streamline product development, maintenance, and support to ensure the best experience for our users, we have created a strategy for operating system and browser deprecation


purrple.cat/blog

arrow, rrrow, rcher, spurrrow

Here I am again at the conundrum of choosing a name for a thing. This is hard, I like when it’s over and I have the perfect name, and I feel finally free to try to match the personality of code to the name


lcolladotor.github.io

blogdown Insert Image addin

So we need to use either the Markdown or HTML syntax for adding the image. Maybe your initial thought is to use: If you want to edit the height or width, then you need to use the HTML syntax. Something like: So let’s go ahead and select an image we want to upload. In my case, I chose an image that already exists


ellocke.github.io

(R) Joyplots

German Zweites Juristisches Staatsexamen (2nd State Exam in Laws) is said to be tough. Let’s have a look at how hard it really is by visualising the distribution of grades from the Berlin 2017/IV campaign. The written part of the final exam consists of 7 handwritten 5-hour length cases


blog.davisvaughan.com

Copula Resources

A number of resources I found useful while learning about


www.redbandsports.net

How many home runs is Josh Donaldson predicted to hit in 2018

Last week, Birds All Day podcast cohost Drew Fairservice re-tweeted some over-under betting lines on individual home run totals for the coming baseball season. Over on Goldy and Bryce Harper plz. https://t.co/Smha0KYulI - Drew F (@DrewGROF) February 26, 2018 In the replies to the tweet, he wondered how the lines compared to projections


www.ifconfig.it/hugo

Mikrotik hAP lite classic

For a network engineer, living and working in the field has some challenges that are not common in office environments


research.libd.org/rstatsclub

Test post for checking website

Useful links for editing the theme: This blog post was made possible thanks


yihui.name/en

The R Markdown Source Documents of My Presentations

The Rmd source file can be obtained


ndres.me

Top 5 Best Jupyter Notebook Extensions

Notebook extensions are plug-ins that you can easily add to your Jupyter notebooks


mailund.github.io/r-programmer-blog

Variable bindings with pattern matching

I just added a new feature to my pmatch package


mouse-imaging-centre.github.io/blog

Why Relative Volumes Matter

Figuring out how one group’s brain is different from another’s is a big part of neuroscience. MRI-neuroanatomy – the study of the sizes of brain regions – is a wonderful tool for this job and makes up a sizeable chunk of what we study


www.aggieerin.com

Working With Messy Text

Heyo! I am doing my best to procrastinate here on a blustery Tuesday afternoon. So, I decided to share some code I’ve put together that solves problems in R that I used to do in perl. HTML or C++ was probably my first real language, but I love the heck out of perl


purrple.cat/blog

first dplyr mondays

Four issues (and then some comments on other issues) feels minimal, but this includes getting back on track with the codebase and setting up a more formal way of contributing, i.e. through pull requests, reviews, and systematic testing. Today is another day, so I’ll work on another project


mailund.github.io/r-programmer-blog

Another approach to evaluating dynprog expressions

In the approach to evaluating dynamic-programming expressions that I wrote about yesterday, I used range and recursion specifications to build a loop for updating a table and then evaluated that loop inside an environment where local variables would over-scope the quosure environment from the


lenkiefer.com

Forecasting and deciding binary outcomes under asymmetric information

LAST WEEK IN THE WALL STREET JOURNAL an article talked about how pundits can strategically make probabilistic forecasts. It seems 40% is a sort of magic number, where it’s high enough that if the event comes true you can claim credit as a forecaster, but if it doesn’t happen, you still gave it less than 50/50 odds


purrple.cat/blog

Strings know their own length

A simple way to get that information in C would be a loop (it’s ok to write loops in C)
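As a rough illustration of the loop the excerpt mentions, here is a minimal sketch in C that counts the bytes of a NUL-terminated string by walking it until the terminator. The function name `count_chars` is hypothetical and is not taken from the post; the standard library's `strlen` does the same job.

```c
#include <assert.h>
#include <stddef.h>

/* Count the bytes in a NUL-terminated C string by scanning forward
   until the terminating '\0' is found.
   count_chars is a hypothetical name chosen for this sketch. */
size_t count_chars(const char *s) {
    assert(s != NULL);          /* a NULL pointer is not a valid string */
    size_t n = 0;
    while (s[n] != '\0') {      /* stop at the terminator, not counted */
        n++;
    }
    return n;
}
```

The cost is linear in the length of the string, since every byte up to the terminator has to be visited.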


yihui.name/en

The Video of My Talk on blogdown at rstudio

After I gave the talk on blogdown at rstudio::conf 2018, some people really thought I was a fast typist, because I “typed” a relatively long post at the podium in just 20 seconds. On one hand, I was happy to learn that, because my trick worked (too) well on some people


blog.schochastics.net

Analyzing NBA Player Data II

The next step will be to reduce the noise of the data by considering only players that played more than 500 minutes in a season. This leaves us with 705 rows to analyze. To obtain our very own set of new positions, we will go through the following steps: Now we need to decide on how many components to keep


adamspannbauer.github.io

Building an Image Search Engine with Pretrained ResNet50 from Keras

In this post we’ll be using the pretrained ResNet50 ImageNet weights shipped with Keras as a foundation for building a small image search engine. In the below image we can see some sample output from our final product. As in my last post we’ll be working with app icons that were gathered by this scrape script


mailund.github.io/r-programmer-blog

Comments

Ok, if this is working, it should now be possible to comment on posts here


mailund.github.io/r-programmer-blog

Evaluating dynprog expressions

I think I now have a complete implementation of the dynamic programming DSL I wrote about the other day


blog.schochastics.net

Analyzing NBA Player Data I

As a football (soccer) data enthusiast, I have always been jealous of the amount of available data for American sports. While much of the interesting football data is proprietary, you can get virtually anything of interest for the NBA, MLB, NFL or NHL


davemcg.github.io

Are you in genomics and building models? Stop using ROC - use PR

Area Under the Curve (AUC) of Receiver Operating Characteristic (ROC) is a terrible metric for a genomics problem. Do not use it. This metric also goes by AUC or AUROC. Use Precision Recall AUC


mailund.github.io/r-programmer-blog

Designing a DSL for dynamic programming

I’m working on an example for one of the chapters of Domain Specific Languages in R that will appear in the printed version but weren’t included in the earlier e-book


www.mytinyshinys.com

EPL Week 28

For the remainder of the season, I will be travelling with a backup laptop, so please excuse any shortfall in posts and site updates. Match of the Day: Ashley Young needs to play just over 3 full games to record the most minutes in a season wearing a Man. Utd


lenkiefer.com

Rock that dadbod plot!

Spring is nearly upon us, or at least we can hope. Let’s examine how housing activity typically rounds into shape as the weather warms up. We’ll make some fun plots with R. Seasonality in housing data: Housing market activity in the United States is highly seasonal. Consider this animated plot. This plot shows U.S


davemcg.github.io

Something Different

‘Something Different: Automated Neighborhood Traffic Monitoring’, by David McGaughey (2018-03-03). This is, obviously, a personal project. Traffic is a concern in my town


mailund.github.io/r-programmer-blog

Tick-marks for log10 axis

For the tailr post I needed to plot some benchmark results


www.carlbfrederick.com

Uncovering the relationships among functions in a package

The practical value of this exercise comes from the following sorts of insights: Thanks for reading! Feel free to join the discussion


mailund.github.io/r-programmer-blog

Blog Setup

Ok, I hadn’t planned to write any more about how the blog is set up, since that isn’t that interesting to me and probably isn’t to you — either you know a lot more about this than me, in which case I have nothing to teach you, or you just don’t care, which I can relate


adamspannbauer.github.io

Finding and Using Images' Dominant Colors using Python & OpenCV

This post is about finding an image’s dominant color. To illustrate this concept we’ll be working with app icons from the Apple App Store


drmolina.netlify.com

OktoberfestR

The app takes data from the Oktoberfest from 1985 through 2016 (so far)


mailund.github.io/r-programmer-blog

Purpose of this blog

Since I already have a blog, you might be asking, “why another one?”


purrple.cat/blog

multiple lags with tidy evaluation

Let’s break it down in steps. The function takes 3 parameters: but hopefully with nicer (or at least shorter) syntax: As is often the case, I immediately posted this on twitter