dylanwan
Employee

Scenario

You need to extract data from AWS services, such as S3, Kafka, or Kinesis, in an Incorta Materialized View (MV).

Issues

You do not want to hardcode AWS credentials in the MV script.

Solution

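You can set up AWS to issue temporary security credentials. These consist of a short-lived access key ID, secret access key, and session token, issued by the AWS Security Token Service (STS). Because they expire automatically, no long-term keys need to appear in the MV script.

For the AWS-side setup, see the AWS documentation: Temporary security credentials in IAM.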
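For orientation, here is a minimal sketch of how temporary credentials can be requested on the AWS side with the STS AssumeRole call. It is an illustration, not the implementation of the credentials service used in the MV code. The function name `fetch_temp_creds` is hypothetical, `sts_client` is assumed to be a boto3 STS client (for example `boto3.client("sts")`), and the returned key names are chosen only to match the fields that the sample MV code reads.

```python
from typing import Any, Dict


def fetch_temp_creds(
    sts_client: Any,
    role_arn: str,
    session_name: str,
    duration_seconds: int = 3600,
) -> Dict[str, str]:
    """Assume an IAM role and return its short-lived credentials.

    `sts_client` is assumed to be a boto3 STS client; `role_arn` and
    `session_name` are values you choose for your own AWS account.
    """
    response = sts_client.assume_role(
        RoleArn=role_arn,
        RoleSessionName=session_name,
        DurationSeconds=duration_seconds,
    )
    creds = response["Credentials"]
    # Rename the STS fields to the key names the sample MV code expects
    # (an assumption made for this sketch).
    return {
        "aws_access_key_id": creds["AccessKeyId"],
        "aws_secret_access_key": creds["SecretAccessKey"],
        "aws_session_token": creds["SessionToken"],
    }
```

Passing the client in as a parameter keeps the helper easy to exercise with a stub, and leaves the choice of region and base credentials to the caller.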

On the Incorta side, the following sample MV code requests temporary credentials from an internal credentials service, passes them to the S3A connector, and reads a CSV file from S3.

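import requests

# Internal endpoint that issues temporary AWS credentials.
# Replace <your incorta host> with the value for your environment.
url = (
    "http://aws-tmp-creds-service."
    "<your incorta host>."
    "svc.cluster.local.:8000/credentials"
)

# Fail fast on network or HTTP errors instead of parsing an error body.
response = requests.get(url, timeout=30)
response.raise_for_status()
tmp_creds = response.json()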

# Pass the temporary credentials to the S3A connector.
hadoop_conf = spark._jsc.hadoopConfiguration()
hadoop_conf.set("fs.s3a.access.key", tmp_creds["aws_access_key_id"])
hadoop_conf.set("fs.s3a.secret.key", tmp_creds["aws_secret_access_key"])
hadoop_conf.set("fs.s3a.session.token", tmp_creds["aws_session_token"])

# `schema` must be defined earlier in the MV script (a StructType or a DDL string).
s3_path = "s3a://<your s3 bucket>/<your path>/<your file>"
df_data = (
    spark.read.format("csv")
    .option("header", "true")
    .schema(schema)
    .load(s3_path)
)

# Transformation logic on df_data
df_1 = df_data.filter("...")
df_2 = df_1.withColumn("Col X", ...)

# Save the final DataFrame as the MV output.
save(df_2)
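Before the credentials reach the Hadoop configuration, it can help to check that the service response contains every expected field. The following is a minimal sketch under stated assumptions, not part of the original sample. The helper name `s3a_settings` is hypothetical, and it assumes the response uses the same three keys as the code above. It also sets `fs.s3a.aws.credentials.provider` to the S3A `TemporaryAWSCredentialsProvider` explicitly. Depending on the Hadoop version in your environment, that provider may already be in the default chain, so treat that line as optional.

```python
from typing import Dict, Mapping

# Keys the sample MV code reads from the credentials service response.
REQUIRED_KEYS = (
    "aws_access_key_id",
    "aws_secret_access_key",
    "aws_session_token",
)


def s3a_settings(tmp_creds: Mapping[str, str]) -> Dict[str, str]:
    """Validate the credentials payload and map it to S3A properties."""
    missing = [key for key in REQUIRED_KEYS if not tmp_creds.get(key)]
    if missing:
        raise ValueError(
            "credentials response is missing: " + ", ".join(missing)
        )
    return {
        # Assumption: select the session-credentials provider explicitly.
        "fs.s3a.aws.credentials.provider":
            "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider",
        "fs.s3a.access.key": tmp_creds["aws_access_key_id"],
        "fs.s3a.secret.key": tmp_creds["aws_secret_access_key"],
        "fs.s3a.session.token": tmp_creds["aws_session_token"],
    }
```

In the MV script, the returned dictionary could be applied with a loop such as `for key, value in s3a_settings(tmp_creds).items(): spark._jsc.hadoopConfiguration().set(key, value)`. Keeping the mapping separate from Spark makes it possible to test without a running session.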
Last update: 02-28-2024 02:12 PM