Posts

Showing posts from July, 2018

Unlocking the power of relational data visualization with ggraph

I am absolutely thrilled to announce that ggraph has been released on CRAN. This is my most ambitious package to date, designed to solve a problem that plagues data scientists everywhere: visualizing complex networks without creating a mess. If ggraph is new to you, think of it as an extension of the ggplot2 API tailored for relational data, such as networks, graphs, and trees. If you love the layered philosophy of ggplot2, you are going to feel right at home. In this post, I’ll explore the philosophy behind the package and walk you through how to build your first clean, interpretable network visualization.

The Philosophy: Death to Hairballs

There is no shortage of software for creating network visualizations. However, these visualizations often prioritize "impressiveness" over information. We’ve all seen them: dense, unintelligible clusters of nodes and edges that look more like a cat’s hairball than a data insight. It doesn't have to be this way. The greatness of ...

Unlocking Ancient Secrets with Data

A Guide to Discriminant Function Analysis in R

Imagine you are an archaeologist. You’ve discovered a cache of ancient artifacts near a series of salt mines, but you have a problem: you don't know which specific mine these artifacts originated from. Fortunately, science gives us a clue. The chemical composition of the artifacts matches the "brine" (salt water) signatures of the mines. But how do you match them mathematically? Enter Discriminant Function Analysis (DFA). In this tutorial, we will use R to build a predictive model that acts as a "chemical fingerprint scanner," identifying the source of a sample based on its geochemical makeup.

Getting Started

We will be using the BRINE dataset from the Kansas Geological Survey. It contains 19 water samples with measurements for distinct chemical elements (like Calcium, Magnesium, etc.) and the stratigraphic unit (Group) they belong to.

Prerequisites: You will need R installed. We will use the M...