Among the most popular off-the-shelf machine learning packages available for R, caret ought to stand out for its consistency. It reaches out to a wide range of dependencies that deploy and support model building using a uniform, simple syntax. I have been using caret extensively for the past three years, with a precious partial least squares (PLS) tutorial in … Continue reading The tidy caret interface in R

# Convolutional Neural Networks in R

Last time I promised to cover the graph-guided fused LASSO (GFLASSO) in a subsequent post. In the meantime, I wrote a GFLASSO R tutorial for DataCamp that you can freely access here, so give it a try! The plan here is to experiment with convolutional neural networks (CNNs), a form of deep learning. CNNs underlie … Continue reading Convolutional Neural Networks in R

# Linear mixed-effect models in R

Statistical models generally assume that

1. All observations are independent of each other
2. The distribution of the residuals follows $\mathcal{N}(0, \sigma^2)$, irrespective of the values taken by the dependent variable y

When either of the two does not hold, more sophisticated modelling approaches are necessary. Let's consider two hypothetical problems that violate the two respective assumptions, … Continue reading Linear mixed-effect models in R

# Genome-wide association studies in R

This time I elaborate on a much more specific subject that will mostly concern biologists and geneticists. I will try my best to outline the approach so that non-experts will still have a basic understanding. This tutorial illustrates the power of genome-wide association (GWA) studies by mapping the genetic determinants of cholesterol levels using … Continue reading Genome-wide association studies in R

# Partial least squares in R

My last entry introduces principal component analysis (PCA), one of many unsupervised learning tools. I concluded the post with a demonstration of principal component regression (PCR), which essentially is an ordinary least squares (OLS) fit using the first $k$ principal components (PCs) from the predictors. This brings about many advantages: There is virtually no … Continue reading Partial least squares in R
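As a rough illustration of the PCR idea described in this teaser, the sketch below fits an OLS model on the first $k$ PC scores. The data set (`mtcars`), the choice of predictors and `k = 2` are assumptions made only for this example and are not taken from the original post.

```r
# Minimal PCR sketch: OLS on the first k principal component scores.
# mtcars, the predictor subset and k = 2 are illustrative assumptions.
X <- as.matrix(mtcars[, c("disp", "hp", "wt", "qsec")])  # predictors
y <- mtcars$mpg                                          # response
k <- 2                                                   # number of PCs kept

# PCA on centered and scaled predictors
pca_fit <- prcomp(X, center = TRUE, scale. = TRUE)

# Scores of the first k PCs (observations projected onto the PCs)
scores <- pca_fit$x[, 1:k, drop = FALSE]

# PCR is simply OLS of the response on those scores
pcr_fit <- lm(y ~ scores)
coef(pcr_fit)
```

Because the PC scores are mutually uncorrelated, the regression coefficients do not suffer from the collinearity present among the original predictors.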

# Principal Component Analysis in R

Principal component analysis (PCA) is routinely employed on a wide range of problems. From the detection of outliers to predictive modeling, PCA can project the observations described by $p$ variables onto a few orthogonal components defined along the directions where the data 'stretch' the most, rendering a simplified overview. PCA is particularly powerful in dealing with multicollinearity … Continue reading Principal Component Analysis in R
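To make the notion of orthogonal components ordered by 'stretch' concrete, here is a small sketch using base R's `prcomp`. The `USArrests` data set is an assumption chosen for convenience and is not necessarily the one used in the post.

```r
# Minimal PCA sketch with base R; USArrests is an illustrative choice.
pca_arrests <- prcomp(USArrests, center = TRUE, scale. = TRUE)

# Proportion of variance captured by each component, in decreasing order
var_explained <- pca_arrests$sdev^2 / sum(pca_arrests$sdev^2)
round(var_explained, 3)

# The loading vectors are orthonormal, so their cross-product
# is (numerically) the identity matrix
orth <- crossprod(pca_arrests$rotation)
round(orth, 10)
```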

# Probability distributions in R

Some of the most fundamental functions in R, in my opinion, are those that deal with probability distributions. Whenever you compute a P-value you rely on a probability distribution, and there are many types out there. In this exercise I will cover four: Bernoulli, Binomial, Poisson, and Normal distributions. Let me begin with some theory first: Bernoulli … Continue reading Probability distributions in R
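For orientation, the sketch below touches the four distributions mentioned above using base R's `d`/`p`/`q`/`r` function families. The parameter values are arbitrary choices for illustration, not values from the post.

```r
# Base R distribution functions; all parameter values are arbitrary.

# Bernoulli is a Binomial with a single trial: P(X = 1) with p = 0.3
p_bern <- dbinom(1, size = 1, prob = 0.3)

# Binomial: P(X = 3) successes in 10 trials with p = 0.5
p_binom <- dbinom(3, size = 10, prob = 0.5)

# Poisson: P(X <= 2) with rate lambda = 4
p_pois <- ppois(2, lambda = 4)

# Normal: P(Z <= 1.96) for a standard Normal, and the matching quantile
p_norm <- pnorm(1.96)
q_norm <- qnorm(0.975)

# Random draws follow the same naming pattern (rbinom, rpois, rnorm)
set.seed(1)
draws <- rnorm(5, mean = 0, sd = 1)

c(bernoulli = p_bern, binomial = p_binom, poisson = p_pois, normal = p_norm)
```

The prefix tells you what is computed: `d` for the density or mass, `p` for the cumulative probability, `q` for the quantile, and `r` for random draws.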