Amazon Athena is an interactive query service that makes it easy to analyze data directly in Amazon S3 using standard SQL. Athena engine version 2 is based on Presto 0.217. More details about all the improvements are given here.
I won't go into detail, as the intent is not to introduce Athena but to discuss the Athena engine version 2 release, along with tips and tricks for adapting to the changes while following AWS best practices.
We at Propellor.ai use Athena for various use cases, such as querying CloudWatch logs, CloudTrail logs, VPC flow logs, etc., partitioned and stored on…
Creating an AWS Lambda layer that suits your requirements is a tedious task. In the beginning, we at Propellor.ai used a dedicated on-demand EC2 Amazon Linux instance (shut down after use) to create a
local copy of the dependencies → zip it in Lambda-ready format → transfer it to S3, followed by an automation script to create the final Lambda layer.
Every new library, or new version of an existing dependency, had to go through the same process, which was time-consuming and unproductive for the developers. (Please don't ask why we can't just use AWS SAM. Not again!)
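The "zip in Lambda-ready format" step above can be sketched in a few lines of Python. This is a minimal illustration, not the script we actually ran: the function name build_layer_zip and the directory names are hypothetical. The one fixed detail is the python/ prefix, which is where the Lambda Python runtime looks for a layer's packages.

```python
import zipfile
from pathlib import Path
from typing import List


def build_layer_zip(deps_dir: str, zip_path: str) -> List[str]:
    """Zip every file under deps_dir beneath a top-level python/ folder.

    Lambda extracts layers to /opt, and the Python runtime puts
    /opt/python on sys.path, so packages must live under python/.
    Returns the archive names written, in sorted order.
    """
    root = Path(deps_dir)
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # e.g. deps/requests/api.py -> python/requests/api.py
                arcname = (Path("python") / path.relative_to(root)).as_posix()
                zf.write(path, arcname)
                written.append(arcname)
    return written
```

The deps_dir here is assumed to be the folder that something like pip install -t filled with the dependencies, and the zip file should be written outside that folder so it does not end up inside itself.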
Before we begin, please understand the difference between deploying a static website via an S3 website endpoint and via the S3 REST API endpoint.
Here are the key differences between a website endpoint and a REST API endpoint served by a CloudFront distribution:
In addition, AWS provides a CloudFormation template for deploying a static site via an S3 website endpoint, along with a manual step-by-step explanation. You can find these details here.
There are a bunch of articles out there that talk about serverless logging (mostly Lambda) and best practices. In this article, I will try to cover the tools and practices for AWS serverless workload logs (observability) and the critical insights they provide for building, planning, and scaling a reliable architecture from an AWS Well-Architected Framework (AWS-WA framework) point of view.
Before we jump into the details, here is a quick recap of the fundamental differences between the logs, metrics, and events provided by AWS.
Networking is a big and complicated topic in AWS. In this article, I am sharing my experience of setting up EIP + NAT + IGW + RT to establish a connection from AWS Glue connections to a MySQL database where my EIP is whitelisted.
Before we get into the details, here is a quick summary of the components in the current setup, with a one-line description of each.
It is a well-known fact that S3 + Athena is a match made in heaven, but since the data is in S3 and Athena is serverless, we have to use a Glue crawler to store metadata about what is contained within those S3 locations.
Even small master tables, metrics tables, or daily incremental transactional data with schema changes must be crawled to create a table in Athena.
In the beginning, my team and I used to write Python scripts that uploaded the CSV files to S3 and then triggered a Lambda function, which invoked the relevant crawler and created/updated the table…
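For readers who have not seen this pattern, here is a rough sketch of such a Lambda function, assuming it is wired to S3 object-created events. The prefix-to-crawler mapping and the crawler names are made up for illustration, and this is not our original script. The only AWS call used is the Glue client's start_crawler.

```python
from urllib.parse import unquote_plus

# Hypothetical mapping from S3 key prefix to Glue crawler name.
CRAWLERS = {
    "master/": "master-tables-crawler",
    "daily/": "daily-transactions-crawler",
}


def crawler_for_key(key, crawlers=CRAWLERS):
    """Return the crawler name whose prefix matches the key, or None."""
    for prefix, name in crawlers.items():
        if key.startswith(prefix):
            return name
    return None


def handler(event, context, glue=None):
    """Start the relevant crawler for each uploaded object, once per crawler."""
    if glue is None:
        import boto3  # bundled with the Lambda Python runtime
        glue = boto3.client("glue")
    started = []
    for record in event.get("Records", []):
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        name = crawler_for_key(key)
        if name and name not in started:
            glue.start_crawler(Name=name)
            started.append(name)
    return started
```

Note that start_crawler raises an error if the crawler is already running, so a real function would need to catch that case or check the crawler state first.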
Surprisingly, I couldn't find any article on installing Superset on an AWS EC2 instance running the Ubuntu 16.04 AMI. So, I decided to do it myself. In this approach, I install Superset as a Docker container by following Superset's cloud-native installation procedure. In future articles, I will install the Node-built version of Superset running on a WSGI HTTP server such as Apache or Gunicorn.
Here are the steps I followed to successfully install Superset in DEBUG mode on the default port 8080.
Step 1: Start an Ubuntu 16.04 AMI instance on t2.large (optional)
Step 2: SSH into the…
Transforming complex data into powerful communications.