IAM User Setup and AMI Installation
IAM User Setup for AWS S3 Services by CloudZip
The AWS Identity and Access Management (IAM) control panel lets
you create new users and manage detailed credentials for access to your AWS S3 buckets.
AWS recommends that you use IAM credentials for applications and services, instead of using your root access keys.
You may allow or restrict access for the IAM credentials depending on the policy that you create.
By default, new IAM users have no access until you attach a policy.
It may take a few minutes for IAM user policy changes to take effect.
To create IAM user credentials with a policy that allows CloudZip to access your S3 buckets,
log in to the
AWS Identity and Access Management control panel.
Under Details on the left, click the Users link. Next, at the top of the page click the
Create New Users button to display the user name entry page.
In the first text edit field enter a user name like "cloudzipinc",
next select the checkbox for Generate an access key for each user,
and click the Create button to show the access keys page.
Click the Download Credentials button to save your Access and Secret keys, or click
Show User Security Credentials and copy the Access and Secret keys.
Keep the Access and Secret keys in a safe place; you will need to enter them into the CloudZip service forms.
The next step is to set a policy for the IAM user you just created.
From the Users page, select the username you just created to drill into its details.
In the Permissions section, under Managed Policies, click Attach Policy and select the AmazonS3FullAccess policy.
Next, for Insight for Storage service integration, add the CloudWatchFullAccess policy.
That's it: you have created a new set of keys that can access only your S3 buckets and CloudWatch.
Next proceed to the CloudZip service forms and enter the Access and Secret keys in the fields as needed.
Note: Do not create a password, signing certificate, or multi-factor authentication
for the IAM user credentials used for CloudZip.
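If you prefer to script the console steps above, the following is a minimal sketch in Python. It assumes a boto3 IAM client and its create_user, create_access_key, and attach_user_policy calls; the function name create_cloudzip_user and its arguments are hypothetical. The client is passed in as a parameter, so the function itself needs no third-party imports.

```python
# Managed policy ARNs for the two policies named in the steps above.
S3_FULL_ACCESS = "arn:aws:iam::aws:policy/AmazonS3FullAccess"
CLOUDWATCH_FULL_ACCESS = "arn:aws:iam::aws:policy/CloudWatchFullAccess"


def create_cloudzip_user(iam, user_name="cloudzipinc", with_cloudwatch=True):
    """Create an IAM user, generate one access key, and attach the policies.

    `iam` is assumed to be a boto3 IAM client, e.g. boto3.client("iam").
    Returns (access_key_id, secret_access_key). The secret is only
    available at creation time, so store it safely.
    """
    iam.create_user(UserName=user_name)
    key = iam.create_access_key(UserName=user_name)["AccessKey"]

    policies = [S3_FULL_ACCESS]
    if with_cloudwatch:  # only needed for Insight for Storage integration
        policies.append(CLOUDWATCH_FULL_ACCESS)
    for arn in policies:
        iam.attach_user_policy(UserName=user_name, PolicyArn=arn)

    # No password, signing certificate, or MFA device is created here,
    # in line with the note above.
    return key["AccessKeyId"], key["SecretAccessKey"]
```

With boto3 installed and administrator credentials configured, this could be called as create_cloudzip_user(boto3.client("iam")). Policy changes may still take a few minutes to take effect.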
Custom IAM User Policies for AWS S3 Services by CloudZip
In some cases you may wish to add a custom policy or edit an existing custom IAM user policy.
In the IAM User details page, select the user to drill into its details,
then select the Inline Policies label. Click the link to create or edit a custom policy.
Select Custom Policy, enter the custom policy name and insert or paste
the following configuration into the relevant policy text area to allow the
IAM user to read and write to all S3 buckets.
The two S3 statement Action lists below are the minimum required for CloudZip to list, read, and write files: the first covers bucket-level operations and the second covers object-level operations.
"Action": [ "s3:ListAllMyBuckets", "s3:ListBucket", "s3:GetBucketLocation", "s3:GetBucketAcl" ],
"Action": [ "s3:PutObject", "s3:GetObject", "s3:DeleteObject", "s3:GetObjectAcl", "s3:PutObjectAcl" ],
Insight4Storage AWS AMI Instructions
Installation using HVM 64-bit insight4storage-hvm-r3.2xl-1.4x class AMIs
Go to https://aws.amazon.com/marketplace, search for
Insight4Storage, and subscribe to the latest secure AMI.
From your AWS EC2 Instances console, select Launch an On Demand or Spot Instance.
Select Marketplace AMI, Linux 64-bit, then search for and select Insight4Storage.
Select an HVM-compatible EC2 instance type suitable for your S3 footprint; we recommend an r3.2xlarge instance.
The software also runs on smaller instances, depending on the total number of prefixes, so use whatever instance meets your requirements.
On the low end, an m3.medium may be sufficient for measuring up to 10TB per bucket.
This AMI configuration is optimized for r3.2xlarge, which is enough to crawl over 20 million path prefixes per bucket, typically 1-2PB per bucket, for 100+ buckets.
In the Configure Instance Details panel, select an existing IAM role or create a new one.
An existing IAM role must be an EC2 instance profile role that trusts the EC2 service and has S3 (read, write, list) and CloudWatch (list and read) access.
Click Next: Add Storage
The easiest and preferred option is to choose “Create new IAM role”, which opens the IAM Dashboard and lets you
create a new special-purpose role rather than use a pre-existing role that may or may not have the permissions needed.
Select the “Create New Role” button.
Enter a Role Name, for example “insight4storage-role”, and click Next.
In the Select Role Type panel, under AWS Service Roles, select Amazon EC2 to open the Attach Policy panel.
In the Attach Policy panel, select the S3 Full Access and CloudWatch Full Access policies, and click Next.
If you wish to customize the S3 policies, this role requires get, put, and list access plus get and list ACL access to your buckets; it does not need bucket create or delete access.
If you wish to customize the CloudWatch policies, this role requires read and list access to CloudWatch; it does not need write, create, or delete access.
Review and click Create Role
Back in the Instance Launch console, click the refresh button next to “Create new IAM Role” and
choose the role you just created from the drop down.
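If you customize the role rather than attach the full-access policies, the two policy documents involved might look like the following sketch, written as Python dictionaries that print as JSON. The trust policy uses the standard EC2 service principal. The permission action names are an assumed mapping of the get, put, list, and ACL requirements described above, not a list published for Insight4Storage, and build_role_policy is a hypothetical helper name.

```python
import json

# Trust policy: lets EC2 instances assume the role (instance profile role).
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "ec2.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}


def build_role_policy():
    """Return a permissions policy with S3 get/put/list/ACL-read access and
    CloudWatch read/list access. The action names are assumptions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # bucket-level: list buckets, read bucket location and ACLs
                "Effect": "Allow",
                "Action": ["s3:ListAllMyBuckets", "s3:ListBucket",
                           "s3:GetBucketLocation", "s3:GetBucketAcl"],
                "Resource": "*",
            },
            {   # object-level: get and put objects, read object ACLs
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:GetObjectAcl"],
                "Resource": "arn:aws:s3:::*/*",
            },
            {   # CloudWatch: read and list metrics only
                "Effect": "Allow",
                "Action": ["cloudwatch:GetMetricStatistics",
                           "cloudwatch:ListMetrics"],
                "Resource": "*",
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(TRUST_POLICY, indent=2))
    print(json.dumps(build_role_policy(), indent=2))
```

The printed JSON can be pasted into the IAM policy editor. No bucket create or delete actions are included, which matches the requirements stated above.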
Click Next: Tag and add tags if needed.
Click Next: Configure Security Group.
An ephemeral disk of at least 160GB is required.
Adjust the disk sizes if needed; the more path prefixes, the more disk space is required.
The Insight4Storage database is stored on the ephemeral disk.
A 1PB bucket may require 5GB of disk space for Insight4Storage reporting per capture.
Clear the Delete on Termination checkbox if you wish to keep the database image after terminating the instance.
Insight4Storage does not read or store your data, nor does it store your file key names; it only stores size information aggregated per path prefix.
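As a back-of-envelope check on disk sizing, the figures above can be turned into simple arithmetic. The sketch below assumes that database size scales linearly with stored petabytes and with the number of captures kept; that is an assumption, since actual usage depends on the number of path prefixes. The function names are hypothetical.

```python
GB_PER_PB_PER_CAPTURE = 5    # figure quoted above for a 1PB bucket
MIN_EPHEMERAL_DISK_GB = 160  # minimum disk size quoted above


def estimate_db_gb(total_pb, retained_captures):
    """Rough database size in GB, assuming linear scaling (an assumption)."""
    return GB_PER_PB_PER_CAPTURE * total_pb * retained_captures


def fits_minimum_disk(total_pb, retained_captures):
    """True if the estimate fits within the 160GB minimum disk."""
    return estimate_db_gb(total_pb, retained_captures) <= MIN_EPHEMERAL_DISK_GB
```

Under these assumptions, 2PB with 7 retained captures comes to about 70GB and fits the minimum disk, while 2PB with 30 retained captures comes to about 300GB and would need a larger disk.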
Launch the Instance.
After the server status finishes initializing, identify the DNS name and EC2 instance ID of your instance.
Open your browser and access the new instance over HTTP using the DNS name and port 9000,
for example: http://ec2-xxx-xx-xx-xxx.compute-1.amazonaws.com:9000/
Select an existing security group or, most easily, create a new one.
Add a Custom TCP Rule to open port 9000 to your source IP.
You will access the Insight4Storage application using your browser at http://:9000
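The port 9000 rule can also be expressed as data. The sketch below builds a single ingress rule in the IpPermissions shape accepted by the EC2 AuthorizeSecurityGroupIngress API, restricted to one source address. The function name port_9000_rule and the rule description text are hypothetical.

```python
import ipaddress


def port_9000_rule(source_ip):
    """Build one ingress rule allowing TCP 9000 from a single IPv4 address."""
    addr = ipaddress.IPv4Address(source_ip)  # raises ValueError if malformed
    return {
        "IpProtocol": "tcp",
        "FromPort": 9000,
        "ToPort": 9000,
        # /32 limits the rule to exactly one source address
        "IpRanges": [{"CidrIp": f"{addr}/32",
                      "Description": "Insight4Storage web UI"}],
    }
```

With a boto3 EC2 client, the rule could be applied with authorize_security_group_ingress(GroupId=..., IpPermissions=[port_9000_rule("203.0.113.5")]), where the group ID is your own and 203.0.113.5 is a documentation-only example address.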
Finally, review the Launch details and select a key pair.
You may need the key pair to ssh into the instance for maintenance or troubleshooting.
Under Configuration on the left, click the Application link.
The first time you use the Insight4Storage application, log in using the
EC2 instance ID as both user name and password.
You can disable authentication in the Application Configuration page.
The first time you open Insight4Storage, there won’t be any information about your storage.
You will need to capture some data first, and schedule a recurring automated data capture for ongoing analysis.
Under Configuration on the left, click the Capture link.
Next, click the Start Cloudwatch Capture Now button for a quick view.
Depending on how many buckets you have and the speed of the instance and network,
the Cloudwatch Capture may take from a few seconds to a couple of minutes to complete.
Under Configuration on the left, click the Capture link to see whether the current capture is in progress or has finished.
When it finishes, you can click the Top Buckets and Top Paths links to view the footprint details.
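For a sense of the data a CloudWatch capture can draw on, S3 publishes daily storage metrics to CloudWatch in the AWS/S3 namespace. The sketch below builds the parameters for a GetMetricStatistics request for one bucket's BucketSizeBytes metric and picks the latest datapoint from a response. It illustrates the CloudWatch API only; how Insight4Storage queries CloudWatch internally is not documented here, and both function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def bucket_size_query(bucket, now=None, storage_type="StandardStorage"):
    """Build keyword arguments for CloudWatch GetMetricStatistics that ask
    for a bucket's daily BucketSizeBytes datapoints over the last two days."""
    now = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/S3",
        "MetricName": "BucketSizeBytes",
        "Dimensions": [
            {"Name": "BucketName", "Value": bucket},
            {"Name": "StorageType", "Value": storage_type},
        ],
        "StartTime": now - timedelta(days=2),
        "EndTime": now,
        "Period": 86400,  # S3 storage metrics are reported once a day
        "Statistics": ["Average"],
    }


def latest_size_bytes(response):
    """Return the most recent Average datapoint, or None if there is none."""
    points = sorted(response.get("Datapoints", []),
                    key=lambda p: p["Timestamp"])
    return points[-1]["Average"] if points else None
```

With a boto3 CloudWatch client, the request could be sent as get_metric_statistics(**bucket_size_query("my-bucket")) and the result passed to latest_size_bytes. This needs only the read and list CloudWatch access described for the role.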
Under Configuration on the left, click the Tuning link.
Under Schedule, enable the automated scheduled capture, either daily or weekly.
We recommend running the capture overnight: set the start hour after your busy period has ended, since overnight is usually quiet.
Under S3 API Limit, enter the API call limit. The paths capture proceeds until it reaches this limit or has crawled all the paths.
In some cases, for example when using big data applications, a 1PB bucket could have 20+ million prefixes, especially when logging paths are included.
We recommend you exclude paths where map-reduce logs are stored.
Under Configuration on the left, click the Capture link, then the Start Paths Capture Now button.
In the dropdown, select the buckets that have map-reduce or other log directories that should be excluded.
For each bucket, enter the logs path prefix to be excluded, and click Add.
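To illustrate how an API call limit and excluded prefixes interact, here is a minimal sketch of a budgeted breadth-first crawl over path prefixes. It is not the Insight4Storage implementation. The list_children callable is hypothetical and stands in for one S3 listing call per prefix.

```python
from collections import deque


def crawl_prefixes(list_children, api_limit, excluded=()):
    """Breadth-first walk over path prefixes.

    list_children(prefix) is a hypothetical callable that costs one API call
    and returns (child_prefixes, bytes_directly_under_prefix).
    Stops when api_limit calls have been made or every prefix is visited.
    Returns (sizes_by_prefix, calls_used, finished).
    """
    sizes, calls = {}, 0
    queue = deque([""])  # "" is the bucket root
    while queue:
        if calls >= api_limit:
            return sizes, calls, False  # budget used up before finishing
        prefix = queue.popleft()
        children, size = list_children(prefix)
        calls += 1
        sizes[prefix] = size
        for child in children:
            # excluded log paths are never queued, so they cost no API calls
            if not any(child.startswith(x) for x in excluded):
                queue.append(child)
    return sizes, calls, True


if __name__ == "__main__":
    # In-memory stand-in for a bucket: prefix -> (children, bytes)
    tree = {
        "": (["data/", "logs/"], 0),
        "data/": (["data/2015/"], 100),
        "data/2015/": ([], 900),
        "logs/": (["logs/mapreduce/"], 5),
        "logs/mapreduce/": ([], 50),
    }
    print(crawl_prefixes(tree.__getitem__, api_limit=10, excluded=("logs/",)))
```

In this sketch, excluding a log prefix saves one call for that prefix and every prefix beneath it, which is why excluding map-reduce log paths lets the same API limit cover more of the bucket.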
The time to finish depends on the number of prefix paths in the buckets.
Each bucket’s paths are displayed in Top Buckets and Top Paths as its capture finishes.
For example, a 1TB bucket may take 5 minutes to capture paths, while a 1PB bucket may take 2+ hours.