clustering Playing with dimensions: from Clustering, PCA, t-SNE... to Carl Sagan! (Spanish version) Playing with dimensions Hi there! This post is an experiment that combines the result of t-SNE with two well-known clustering techniques: k-means and hierarchical. This will be the practical section, in R.

libro-vivo-ciencia-datos Launch! Libro Vivo de Ciencia de Datos 📗 (open-source) The Spanish version of the _Data Science Live Book_ is finally available! The book is now open, without language barriers, to Spanish speakers eager to learn 👨🎓👩🎓. This publication is an edition of the English version, revised in both grammar and technical aspects.

shap A gentle introduction to SHAP values in R Opening the black-box in complex models: SHAP values. What are they and how to draw conclusions from them? With R code example!

data preparation New discretization method: Recursive information gain ratio maximization This method can discretize a variable taking the target variable into consideration, similar to what decision trees do but with gain ratio.

R Feature Selection using Genetic Algorithms in R From a gentle introduction to a practical solution, this is a post about feature selection using genetic algorithms in R.

tibble How to apply a function to a matrix/tibble Scenario: we have an id-value table and a matrix/tibble that contains the ids, and we need the labels. It may be useful when predicting the Key (or Ids) of in a

deep-learning How to create a sequential model in Keras for R This tutorial introduces the Deep Learning classification task with Keras. It focuses on one-hot encoding, layer shapes, training, and model evaluation.

machine learning Sample size and class balance on model performance Analyzing how the sample size impacts the accuracy of a classification model

bookdown How to self publish a book: customizing Bookdown tl;dr: This post is related to How to self-publish a book: A handy list of resources. It's centered around Bookdown and some non-standard customizations I found useful to create the Data Science

bookdown How to self-publish a book: A handy list of resources tl;dr: A list of useful resources for self-publishing a book on Amazon using Bookdown.

exploratory data analysis Exploratory Data Analysis in R (introduction) Exploratory data analysis (EDA) is the very first step in a data project. We will create a code template to do this with one function.

rstats R and RStudio installation tutorial This tutorial walks through the initial set-up needed to start developing machine learning models in the incredible R language.

machine learning Introduction to Machine Learning for non-developers About Machine Learning We all know that machine learning is about handling data, but it also can be seen as: The art of finding order in data by browsing its inner information. Some

learning "I hate math!" - Education and Artificial Intelligence to find a meaning in what we do Well, what you hate is the way that math was taught to you. That soup of equations, abstractions, and solutions to problems that we don’t know. It's hard to enjoy the things

data-science-live-book Data Science Live Book available at Amazon! Hi there! tl;dr: The Data Science Live Book is now available at Amazon! Kindle & Paperback versions! 🚀 👉 See at Amazon 📗! Link to the black & white version, also available in full color. It

rstats Exploratory Data Analysis & Data Preparation with 'funModeling' funModeling quick-start This package contains a set of functions related to exploratory data analysis, data preparation, and model performance. It is used by people coming from business, research, and teaching (professors and students)

rstats Data discretization made easy with funModeling tl;dr: Convert numerical variables into categorical, as it is shown in the next image. ⏳ Reading time ~ 6 min. Let's start! The package funModeling (from version > 1.6.6) introduces two functions—

data science Data Science Live Book (open source) ~ new big release! 200-pages Well after some time, and +300 commits, this is the biggest release of the Data Science Live Book! (open source), after the first publication more than 1 year ago :) tl;dr: Hi there!

clustering Playing with dimensions: from Clustering, PCA, t-SNE... to Carl Sagan! Playing with dimensions Hi there! This post is an experiment combining the result of t-SNE with two well known clustering techniques: k-means and hierarchical. This will be the practical section, in R. But

data science Model Performance in Data Science Live Book Hi there! I decided to almost re-write the model validation section since it didn't reflect real case scenarios. Hopefully in the two new chapters you will gain a deeper knowledge on methodological aspects

data science Data Science Live Book - Scoring, Model Performance & profiling - Update! This update contains a new chapter -scoring- which is related to model performance and model deployment, used when predicting a binary outcome. Link to the scoring chapter. Important: To use the following updates please

R Time Series Analysis Using Max/Min... and some Neuroscience. Introduction Time series have maximum and minimum points as general patterns. Sometimes the noise present in them makes it hard to spot the general behavior. In this post, we will smooth time series -reducing noise-

R Anomaly Detection in R Introduction Inspired by this Netflix post, I decided to write a post based on this topic using R. There are several nice packages to achieve this goal, the one we're going to

R Text Mining Analysis: some theory and practice in R Introduction Big Data helps us analyze unstructured data (aka "text") with many techniques; this post presents one: Cosine Similarity. There is also work by other analysts, who scraped