Using Machine Learning for cluster analysis of data
This article shows how to apply the k-means clustering machine learning (ML) algorithm to sample loan data in Incorta for loan prediction. Cluster analysis, or clustering, is an unsupervised machine learning technique that groups unlabeled data. It aims to form subsets (clusters) of a dataset such that the data points within each cluster are highly similar to one another, while the clusters themselves are clearly distinguishable from each other.
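To make the idea of grouping similar points concrete, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is an illustration only, not the code used in the attached schema: the `kmeans` helper, the `applicants` list, and its (income, loan amount) values are all made up for this example.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Minimal k-means (Lloyd's algorithm) over a list of numeric tuples."""
    rng = random.Random(seed)
    # Start from k distinct points chosen at random as the initial centroids.
    centroids = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[c])
        if new_centroids == centroids:
            break  # Converged: no centroid moved.
        centroids = new_centroids
    return labels, centroids


# Made-up (income, loan amount) pairs in thousands, for illustration only.
applicants = [(30, 5), (32, 6), (35, 7), (90, 40), (95, 42), (100, 45)]
labels, centroids = kmeans(applicants, k=2)
print(labels)
print(centroids)
```

On this toy data the three low-income applicants end up in one cluster and the three high-income applicants in the other. Which cluster receives label 0 depends on the random starting centroids, so the label numbers themselves carry no meaning.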
- Install the following Python libraries on the Incorta server using pip install if they are not already installed.
- From the Incorta UI, import the attached schema zip file and the data file.
- Navigate to the schema and load the clustering table.
- Open the materialized view (MV) in a notebook and run each paragraph in turn to see what it does.
- Finally, load the MV and visualize the results in an Incorta dashboard.
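The notebook paragraphs themselves are not reproduced in this article. As a rough sketch of the kind of steps a clustering notebook might run, the example below assumes pandas and scikit-learn are available. The column names `applicant_income` and `loan_amount`, the sample values, and the choice of two clusters are hypothetical and are not taken from the attached schema.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Made-up rows standing in for the sample loan data; column names are hypothetical.
loans = pd.DataFrame({
    "applicant_income": [30, 32, 35, 90, 95, 100],
    "loan_amount": [5, 6, 7, 40, 42, 45],
})

# Scale the features so no single column dominates the distance calculation.
features = StandardScaler().fit_transform(
    loans[["applicant_income", "loan_amount"]]
)

# n_clusters=2 is an arbitrary choice for this toy data.
model = KMeans(n_clusters=2, n_init=10, random_state=0)

# Attach the cluster label to each row so it can be charted later.
loans["cluster"] = model.fit_predict(features)

print(loans)
```

A table shaped like `loans`, with one cluster label per row, is the sort of output that could then be grouped or colored by cluster in a dashboard.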
Created by: Amit Kothari