Boto3 Reference
Get the number of retry attempts from a boto3 client
Source: https://stackoverflow.com/questions/60858442/get-number-of-retries-attempts-from-a-boto3-client
It looks like the ResponseMetadata in every client response comes with that information.
Minimal code example:
import logging
import boto3

logging.basicConfig(level=logging.INFO)  # so the INFO message below is actually shown

client = boto3.client('s3')
response = client.list_buckets()
logging.info("Retry Attempts: %d", response['ResponseMetadata']['RetryAttempts'])
Do NOT hardcode credentials in source code
Source: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html
From the AWS documentation: We do not recommend hard coding credentials in your source code.
Boto3 will by default check these environment variables for credentials:
- AWS_ACCESS_KEY_ID - The access key for your AWS account.
- AWS_SECRET_ACCESS_KEY - The secret key for your AWS account.
- AWS_SESSION_TOKEN - The session key for your AWS account. This is only needed when you are using temporary credentials. The AWS_SECURITY_TOKEN environment variable can also be used, but is only supported for backwards compatibility purposes. AWS_SESSION_TOKEN is supported by multiple AWS SDKs besides Python.
So we do not need to pass them in explicitly like this:
import boto3

client = boto3.client(
    's3',
    aws_access_key_id=ACCESS_KEY,
    aws_secret_access_key=SECRET_KEY,
    aws_session_token=SESSION_TOKEN
)
Instead, this code would do the same thing:
import boto3
client = boto3.client('s3')
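If the credentials live in the shared credentials file (~/.aws/credentials) rather than environment variables, a named profile can be selected instead. A minimal sketch, assuming a profile called dev exists in that file:
import boto3

# Assumes a [dev] profile in ~/.aws/credentials; boto3 reads the keys from
# there, so nothing is hardcoded in the source.
session = boto3.Session(profile_name='dev')
client = session.client('s3')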
Reading contents of a gzip file from AWS S3 in Python
import boto3
import gzip
s3 = boto3.resource("s3")
obj = s3.Object("YOUR_BUCKET_NAME", "path/to/your_key.gz")
with gzip.GzipFile(fileobj=obj.get()["Body"]) as gzipfile:
    content = gzipfile.read()
print(content)
Source: https://stackoverflow.com/questions/41161006/reading-contents-of-a-gzip-file-from-a-aws-s3-in-python
Boto3/S3: Renaming an object using copy_object
import boto3

s3 = boto3.resource('s3')
# S3 has no rename: copy the object to the new key, then delete the old one.
s3.Object('my_bucket', 'new_file_key').copy_from(CopySource='my_bucket/old_file_key')
s3.Object('my_bucket', 'old_file_key').delete()
Source: https://stackoverflow.com/questions/32501995/boto3-s3-renaming-an-object-using-copy-object
Check if a key exists in a bucket in S3 using boto3
import boto3
import botocore
s3 = boto3.resource('s3')
try:
    s3.Object('my-bucket', 'dootdoot.jpg').load()
except botocore.exceptions.ClientError as e:
    if e.response['Error']['Code'] == "404":
        # The object does not exist.
        ...
    else:
        # Something else has gone wrong.
        raise
else:
    # The object does exist.
    ...
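An equivalent check with the client API uses head_object, which raises the same ClientError for a missing key. A variant sketch, not the code from the linked answer:
import boto3
import botocore

s3_client = boto3.client('s3')
try:
    s3_client.head_object(Bucket='my-bucket', Key='dootdoot.jpg')
    exists = True
except botocore.exceptions.ClientError as e:
    # For HEAD requests the error code is the HTTP status, so 404 means missing.
    if e.response['Error']['Code'] == '404':
        exists = False
    else:
        raise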
Source: https://stackoverflow.com/questions/33842944/check-if-a-key-exists-in-a-bucket-in-s3-using-boto3
How to query CloudWatch Logs using boto3 in Python
import boto3
from datetime import datetime, timedelta
import time
client = boto3.client('logs')
query = "fields @timestamp, @message | parse @message \"username: * ClinicID: * nodename: *\" as username, ClinicID, nodename | filter ClinicID = 7667 and username='simran+test@abc.com'"
log_group = '/aws/lambda/NAME_OF_YOUR_LAMBDA_FUNCTION'
start_query_response = client.start_query(
    logGroupName=log_group,
    startTime=int((datetime.today() - timedelta(hours=5)).timestamp()),
    endTime=int(datetime.now().timestamp()),
    queryString=query,
)
query_id = start_query_response['queryId']

response = None
while response is None or response['status'] == 'Running':
    print('Waiting for query to complete ...')
    time.sleep(1)
    response = client.get_query_results(queryId=query_id)
Response:
{
    'results': [
        [
            {
                'field': '@timestamp',
                'value': '2019-12-09 17:07:24.428'
            },
            {
                'field': '@message',
                'value': 'username: simran+test@abc.com ClinicID: 7667 nodename: MacBook-Pro-2.local\n'
            },
            {
                'field': 'username',
                'value': 'simran+test@abc.com'
            },
            {
                'field': 'ClinicID',
                'value': '7667'
            },
            {
                'field': 'nodename',
                'value': 'MacBook-Pro-2.local\n'
            }
        ]
    ]
}
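Each result row is a list of field/value pairs. A small helper like this (not part of the linked answer) flattens a row into a plain dict:
def rows_as_dicts(query_results):
    # query_results is the dict returned by get_query_results above.
    return [{col['field']: col['value'] for col in row}
            for row in query_results['results']]

# e.g. rows_as_dicts(response)[0]['ClinicID'] == '7667'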
Source: https://stackoverflow.com/questions/59240107/how-to-query-cloudwatch-logs-using-boto3-in-python
Read file content from S3 bucket with boto3
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('test-bucket')

# Iterates through all the objects, doing the pagination for you. Each obj
# is an ObjectSummary, so it doesn't contain the body. You'll need to call
# get to get the whole body.
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()
- https://stackoverflow.com/questions/36205481/read-file-content-from-s3-bucket-with-boto3
Quickest Ways to List Files in S3 Bucket
- https://binaryguy.tech/aws/s3/quickest-ways-to-list-files-in-s3-bucket/
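A minimal sketch of one common approach (a list_objects_v2 paginator, which copes with buckets holding more than 1000 keys); the bucket name is a placeholder:
import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# Each page holds up to 1000 keys; the paginator fetches the next page for you.
for page in paginator.paginate(Bucket='YOUR_BUCKET_NAME'):
    for obj in page.get('Contents', []):
        print(obj['Key'])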
Search nested subdirectory in S3 bucket
- https://dev.to/aws-builders/search-nested-subdirectory-in-s3-bucket-52ke
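A minimal sketch of one way to do this, filtering by a nested prefix on the resource API; bucket name and prefix are placeholders:
import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('YOUR_BUCKET_NAME')

# Only keys under the given "directory" prefix are returned.
for obj in bucket.objects.filter(Prefix='parent/child/'):
    print(obj.key)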
Writing a pickle file to an S3 bucket in AWS
import io
import boto3

# new_df is an existing pandas DataFrame; bucket and key are the target
# S3 bucket name and object key.
pickle_buffer = io.BytesIO()
s3_resource = boto3.resource('s3')

new_df.to_pickle(pickle_buffer)
s3_resource.Object(bucket, key).put(Body=pickle_buffer.getvalue())
- https://stackoverflow.com/questions/49120069/writing-a-pickle-file-to-an-s3-bucket-in-aws
Load pickle from s3
import pickle
import boto3
s3 = boto3.resource('s3')
my_pickle = pickle.loads(s3.Bucket("bucket_name").Object("key_to_pickle.pickle").get()['Body'].read())
- https://stackoverflow.com/questions/48964181/how-to-load-a-pickle-file-from-s3-to-use-in-aws-lambda
How to check if a particular directory exists in an S3 bucket using Python and boto3
When checking for an S3 folder, there are two scenarios:
Scenario 1
import boto3

def folder_exists_and_not_empty(bucket: str, path: str) -> bool:
    '''
    Folder should exist.
    Folder should not be empty.
    '''
    s3 = boto3.client('s3')
    if not path.endswith('/'):
        path = path + '/'
    resp = s3.list_objects(Bucket=bucket, Prefix=path, Delimiter='/', MaxKeys=1)
    return 'Contents' in resp
The above code uses MaxKeys=1, which makes it more efficient: even if the folder contains a lot of files, it responds quickly with just one of them. Observe that it checks for Contents in the response.
Scenario 2
import boto3

def folder_exists(bucket: str, path: str) -> bool:
    '''
    Folder should exist.
    Folder could be empty.
    '''
    s3 = boto3.client('s3')
    path = path.rstrip('/')
    resp = s3.list_objects(Bucket=bucket, Prefix=path, Delimiter='/', MaxKeys=1)
    return 'CommonPrefixes' in resp
- Observe that it strips off the trailing / from the path. This prefix matches just that folder and doesn't look inside it.
- Observe that it checks for CommonPrefixes in the response, not Contents.
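A short usage sketch for both helpers; the bucket and path values are placeholders:
print(folder_exists_and_not_empty('my-bucket', 'reports/2020/'))
print(folder_exists('my-bucket', 'reports/2020/'))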
- https://stackoverflow.com/questions/57957585/how-to-check-if-a-particular-directory-exists-in-s3-bucket-using-python-and-boto
Three Ways to Count the Objects in an AWS S3 Bucket
- https://fuzzyblog.io/blog/aws/2019/10/24/three-ways-to-count-the-objects-in-an-aws-s3-bucket.html
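One of the ways, as a minimal boto3 sketch (the linked post lists other approaches as well); the bucket name is a placeholder:
import boto3

s3 = boto3.client('s3')
paginator = s3.get_paginator('list_objects_v2')

# KeyCount is the number of keys returned in each page of the listing.
total = sum(page['KeyCount'] for page in paginator.paginate(Bucket='YOUR_BUCKET_NAME'))
print(total)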