on 02-28-2024 02:12 PM
Scenario
You need to extract data from AWS services, such as S3, Kafka, or Kinesis, in an Incorta Materialized View (MV).
Issues
You do not want to hard-code AWS credentials in the MV script.
Solution
You can set up AWS to use temporary security credentials.
For the AWS-side setup, see the AWS documentation: Temporary security credentials in IAM.
On the Incorta side, the following sample MV code retrieves temporary security credentials and uses them to read a CSV file from S3.
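import requests

# Credentials endpoint of the temporary-credentials service inside the cluster.
url = (
    "http://aws-tmp-creds-service."
    "<your incorta host>."
    "svc.cluster.local.:8000/credentials"
)

# Fetch the credentials; fail fast on a timeout or a non-2xx response.
# The 10-second timeout is an arbitrary choice; adjust it as needed.
response = requests.get(url, timeout=10)
response.raise_for_status()
tmp_creds = response.json()

# Pass the temporary credentials to the S3A connector.
hadoop_conf = spark._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3a.access.key", tmp_creds["aws_access_key_id"])
hadoop_conf.set("fs.s3a.secret.key", tmp_creds["aws_secret_access_key"])
hadoop_conf.set("fs.s3a.session.token", tmp_creds["aws_session_token"])

s3_path = "s3a://<your s3 bucket>/<your path>/<your file>"

# `schema` must be defined beforehand (a StructType or a DDL string for the CSV).
df_data = (
    spark.read.format("csv")
    .option("header", "true")
    .schema(schema)
    .load(s3_path)
)

# Transformation logic on df_data (replace the "..." placeholders with your own logic)
df_1 = df_data.filter("...")
df_2 = df_1.withColumn("Col X", ...)
save(df_2)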
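If the credentials service returns an incomplete response, the sample above fails with a bare KeyError, or later with an S3 authentication error that is harder to trace. The following is a minimal sketch of one way to validate the payload before handing it to Spark. The helper name build_s3a_conf is hypothetical and not part of Incorta or Spark; the JSON field names are the ones the sample already reads. The sketch also sets fs.s3a.aws.credentials.provider to the S3A TemporaryAWSCredentialsProvider class. That is an assumption about your environment: depending on the Hadoop version, the default provider chain may already include it, in which case the extra property is harmless.

```python
# Hadoop S3A property name -> field name in the credentials JSON.
# The field names are the ones read in the sample MV code.
S3A_FIELDS = {
    "fs.s3a.access.key": "aws_access_key_id",
    "fs.s3a.secret.key": "aws_secret_access_key",
    "fs.s3a.session.token": "aws_session_token",
}

# S3A provider class for session credentials (access key, secret key, token).
TEMPORARY_PROVIDER = "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider"


def build_s3a_conf(tmp_creds):
    """Hypothetical helper: map the credentials payload to S3A properties.

    Raises ValueError if any expected field is missing or empty.
    """
    missing = [field for field in S3A_FIELDS.values() if not tmp_creds.get(field)]
    if missing:
        raise ValueError("credentials response is missing: " + ", ".join(missing))
    conf = {prop: tmp_creds[field] for prop, field in S3A_FIELDS.items()}
    conf["fs.s3a.aws.credentials.provider"] = TEMPORARY_PROVIDER
    return conf


# Usage inside the MV, where `spark` and `tmp_creds` already exist:
# hadoop_conf = spark._jsc.hadoopConfiguration()
# for key, value in build_s3a_conf(tmp_creds).items():
#     hadoop_conf.set(key, value)
```

Keeping the mapping in one dictionary means the validation and the configuration cannot drift apart if the service later renames or adds a field.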