Data Scientist, Research Software Engineer, and teacher. Cassia is passionate about transformative processes in data, technology and life. She graduated in Philosophy and Information Systems and holds a Stricto Sensu Master's Degree in the field of Foundations of Mathematics.
Simple NLP in Python with TextBlob: Lemmatization
TextBlob is a package built on top of two other packages: the Natural Language Toolkit, known mainly by its abbreviation, NLTK, and Pattern. NLTK is a traditional package used for text processing or Natural Language Processing (NLP), and Pattern is built...
Implementing Other SVM Flavors with Python's Scikit-Learn
This guide is the third and final part of three guides about Support Vector Machines (SVMs). In this guide, we will keep working with the forged bank notes use case, have a quick recap of the general idea behind SVMs, understand what the kernel trick is, and implement different types...
Understanding SVM Hyperparameters
This guide is the second part of three guides about Support Vector Machines (SVMs). In this guide, we will keep working on the forged bank notes use case, understand which SVM parameters are already being set by Scikit-learn, what the C and Gamma hyperparameters are, and how to tune them using...
Implementing SVM and Kernel SVM with Python's Scikit-Learn
This guide is the first part of three guides about Support Vector Machines (SVMs). In this series, we will work on a forged bank notes use case, learn about the simple SVM, then about SVM hyperparameters and, finally, learn a concept called the kernel trick and explore other types of...
DBSCAN with Scikit-Learn in Python
You are working at a consulting company as a data scientist. The project you are currently assigned to has data from students who have recently finished courses about finances. The financial company that conducts the courses wants to understand if there are common factors that influence students to purchase the...
Loading a Pretrained TensorFlow Model into TensorFlow Serving
You are part of a project that will use deep learning to try to identify what is in images, such as cars, ducks, mountains, sky, trees, etc. In this project, two things are important: the first one is that the deep learning model trains quickly and efficiently (because...
Converting JSON to a Dictionary in Python
Definitive Guide to the Random Forest Algorithm with Python and Scikit-Learn
The Random Forest algorithm is one of the most flexible, powerful and widely used algorithms for classification and regression, built as an ensemble of Decision Trees. If you aren't familiar with these, no worries: we'll cover all of these concepts. In this in-depth hands-on guide, we'll build an intuition on...
Get Feature Importances for Random Forest with Python and Scikit-Learn
The Random Forest algorithm is a tree-based supervised learning algorithm that uses an ensemble of predictions from many decision trees, either to classify a data point or to determine its approximate value. This means it can be used for either classification or regression. When applied for classification, the class of the...