on 02-28-2024 02:12 PM
You need to access AWS services, such as S3, Kafka, or Kinesis, from an Incorta Materialized View (MV) to extract data.
You prefer not to hard-code the credentials in the MV script.
You can set up AWS to use temporary security credentials.
For the AWS-side setup, see the AWS documentation: Temporary security credentials in IAM.
On the Incorta side, the following sample MV code uses AWS temporary security credentials.
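import requests

# In-cluster endpoint of the temporary-credentials service.
# Replace <your incorta host> with the value for your environment.
url = (
    "http://aws-tmp-creds-service."
    "<your incorta host>."
    "svc.cluster.local.:8000/credentials"
)
# Fetch the temporary credentials; fail fast on a timeout or an HTTP error.
response = requests.get(url, timeout=30)
response.raise_for_status()
tmp_creds = response.json()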
# Pass the temporary credentials to the S3A connector through the Hadoop configuration.
hadoop_conf = spark._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3a.access.key", tmp_creds["aws_access_key_id"])
hadoop_conf.set("fs.s3a.secret.key", tmp_creds["aws_secret_access_key"])
hadoop_conf.set("fs.s3a.session.token", tmp_creds["aws_session_token"])
# Read the CSV file from S3; `schema` must be defined earlier in the MV script.
s3_path = "s3a://<your s3 bucket>/<your path>/<your file>"
df_data = (
    spark.read.format("csv")
    .option("header", "true")
    .schema(schema)
    .load(s3_path)
)
# Transformation logic on df_data
df_1 = df_data.filter("...")
df_2 = df_1.withColumn("Col X", ...)
save(df_2)
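The sample above assumes the credentials service always returns all three fields. As a minimal sketch, the step that copies the credentials into the Hadoop configuration could be wrapped in a helper that checks the response first. The helper name `apply_s3a_credentials` and the mapping `S3A_PROPERTIES` are hypothetical names introduced here for illustration; the JSON field names and the `fs.s3a.*` properties are the ones used in the sample above.

```python
# Hypothetical helper (not part of Incorta or Spark): validates the credential
# payload before copying it into the Hadoop configuration.

# Maps each JSON field returned by the credentials service (as used in the
# sample above) to the S3A property it populates.
S3A_PROPERTIES = {
    "aws_access_key_id": "fs.s3a.access.key",
    "aws_secret_access_key": "fs.s3a.secret.key",
    "aws_session_token": "fs.s3a.session.token",
}


def apply_s3a_credentials(hadoop_conf, creds):
    """Set the S3A credential properties on hadoop_conf.

    hadoop_conf: any object with a set(key, value) method, for example
                 spark._jsc.hadoopConfiguration().
    creds:       dict parsed from the credentials service response.
    """
    # Collect fields that are absent or empty.
    missing = [field for field in S3A_PROPERTIES if not creds.get(field)]
    if missing:
        # Report field names only, never the credential values themselves.
        raise ValueError(
            "credential response is missing: " + ", ".join(missing)
        )
    for field, prop in S3A_PROPERTIES.items():
        hadoop_conf.set(prop, creds[field])
```

In the MV script, this would replace the three `set` calls with `apply_s3a_credentials(spark._jsc.hadoopConfiguration(), tmp_creds)`. A missing field then fails with a clear message instead of a `KeyError` or a later S3 authentication error.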