Use SparkR DataFrame for Selecting, Filtering, and Grouping your Data
Use the select(), filter(), arrange(), and group_by() functions to work with Spark DataFrames in R.
Get Basic Information about DataFrame in SparkR
Overview R is a popular language in Data Science. Incorta supports creating materialized views (MV) with R. Your first Spark Program The minimum requirement for creating an MV includes the following: # Read data from Incorta into Spark df <- read(...
Overview In this recipe, you’ll learn how to run SQL queries from SparkR. After using the read() Incorta Notebook extension function to load data in Incorta Notebook, the data is in a SparkR DataFrame. You can use sql() to write SQL queries after cr...
A snapshot table holds the same transactional data as its source system, with additional fields for tracking the snapshot date.
This article shows how to use a classification Machine Learning (ML) algorithm on the Iris Flower data set. It first creates a model based on a training data set and then scores the test data set and gives the prediction of the flower based on th...
Use external SQL query tools like DbVisualizer or PSequel to connect to Incorta and run PostgreSQL queries against Incorta data.
This Python script takes an unzipped tenant export file or an unzipped schema export file and produces a CSV file listing all the table column details, such as data type, in a schema. This CSV file can then be imported into Incorta and a dashboar...
Learn about how Incorta allows you to restrict user data access in the Incorta Direct Data Platform.
The purpose of this article is to help users understand data modeling concepts and best practices in Incorta.
A slowly changing dimension (SCD) is a dimension that stores and manages both current and historical data over time.
A take-home guide for an introduction to Machine Learning in the Incorta platform.
How to validate data between an Incorta report and a legacy BI report.
The purpose of this article is to help users understand data modeling concepts and best practices in Incorta.
Date is a common dimension used in most application deployments. It is primarily used to roll up data so it can be viewed across a broad time range, facilitating trend analysis.