Spark Based Extraction
We are trying to use Spark-based extraction to pull a large table from Oracle into Incorta. I have provided all the values requested during the setup process. Below is a screenshot. However, the extraction fails with this error:
INC_03010100: Unknown Error: Extraction failed [java.lang.RuntimeException: Error while extracting data: Unrecognized SQL type -102 Caused by null].
I have ensured that there are no nulls in the column chosen for "Column to Parallelize queries on".
Any help in resolving this would be greatly appreciated.
To run Spark based Extraction, please enable Spark integration and make sure the Spark cluster is up and running. Please run Test Spark as described in this document.
To debug an issue with Spark-based extraction, use the Spark Web UI to check whether a Spark job is launched after you submit the table refresh job from Incorta.
If no Spark job is created, please check the Incorta tenant log file.
To verify whether the issue is related to the specific SQL, you can simplify the SQL by selecting only a few columns, or just one. If that passes, add one column at a time to see which column is causing the issue.
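The column-by-column check above can be sketched as a small Python helper. This is only an illustration: `build_query`, `find_failing_columns`, and the `try_query` callback are hypothetical names, and `try_query` stands in for whatever you use to run a test extraction (for example, editing the table's SQL in Incorta and loading it by hand). As a side note that is not confirmed in this thread, -102 appears to be the Oracle JDBC driver's vendor-specific code for TIMESTAMP WITH LOCAL TIME ZONE (`oracle.jdbc.OracleTypes.TIMESTAMPLTZ`), so columns of that type may be worth testing first.

```python
from typing import Callable, List


def build_query(table: str, columns: List[str]) -> str:
    """Build a simple SELECT over the given columns."""
    return f"SELECT {', '.join(columns)} FROM {table}"


def find_failing_columns(
    table: str,
    columns: List[str],
    try_query: Callable[[str], bool],
) -> List[str]:
    """Add one column at a time and record each column that breaks the query.

    try_query is a hypothetical callback: it receives a SQL string and
    returns True if the extraction succeeds, False if it fails.
    """
    passing: List[str] = []
    failing: List[str] = []
    for col in columns:
        candidate = passing + [col]
        if try_query(build_query(table, candidate)):
            passing.append(col)   # keep the column for later rounds
        else:
            failing.append(col)   # leave it out and keep going
    return failing
```

Any column returned by `find_failing_columns` could then be excluded from the extraction SQL or cast to a type that the extractor accepts, depending on what the data requires.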
With Spark-based extraction, the Spark executors run the Spark job, and that job connects to Oracle via JDBC. Check that the Spark cluster has permission to reach the Oracle database and has the correct Oracle JDBC library available. This should not be a problem if you are running the embedded Spark shipped with Incorta.
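If you run your own Spark cluster, one way to check JDBC access independently of Incorta is to read a single row from the same table with a plain Spark job. The sketch below assumes PySpark and the Oracle thin driver; `oracle_jdbc_options` and `read_one_row` are hypothetical helper names, and the host, port, service name, table, and credentials are placeholders you would replace with your own. It is not how Incorta itself performs the extraction.

```python
from typing import Any, Dict, List


def oracle_jdbc_options(
    host: str, port: int, service: str, table: str, user: str, password: str
) -> Dict[str, str]:
    """Assemble Spark JDBC data source options for an Oracle thin connection."""
    return {
        "url": f"jdbc:oracle:thin:@//{host}:{port}/{service}",
        "dbtable": table,
        "user": user,
        "password": password,
        "driver": "oracle.jdbc.OracleDriver",
    }


def read_one_row(spark: Any, options: Dict[str, str]) -> List[Any]:
    """Read one row over JDBC using an existing SparkSession.

    A missing driver jar, a network or permission problem, or an
    unsupported column type would typically surface here as an exception.
    """
    return spark.read.format("jdbc").options(**options).load().limit(1).collect()
```

From a `pyspark` shell on the cluster, something like `read_one_row(spark, oracle_jdbc_options("dbhost", 1521, "ORCLPDB1", "SALES.ORDERS", "user", "secret"))` would exercise the same connection path. All of those argument values are made-up examples.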