Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
menu search
person
Welcome To Ask or Share your Answers For Others

Categories

I'm getting frustrated by not finding any good explanation on how to list all files in a S3 bucket.

I have this bucket with about 20 images on. All I want to do is to list them. Someone says "just use the S3.list-method". But without any special library there is no S3.list-method. I have a S3.get-method, which I dont get to work. Arggh, would appreciate if someone told me how to simply get an list of all files(filenames) from an S3 bucket.

val S3files = S3.get(bucketName: String, path: Option[String], prefix: Option[String], delimiter: Option[String])

returns an Future[Response]

I dont know how to use this S3.get. What would be the easiest way to list all files in my S3 bucket?

Answers much appreciated!

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
841 views
Welcome To Ask or Share your Answers For Others

1 Answer

With Scala you might now want to use Amazon's official SDK for Java which provides the AmazonS3::listObjects method:

import scala.collection.JavaConverters._
import com.amazonaws.services.s3.model.ObjectListing

def keys(bucket: String): List[String] = nextBatch(s3Client.listObjects(bucket))

private def nextBatch(listing: ObjectListing, keys: List[String] = Nil): List[String] = {

  val pageKeys = listing.getObjectSummaries.asScala.map(_.getKey).toList

  if (listing.isTruncated)
    nextBatch(s3Client.listNextBatchOfObjects(listing), pageKeys ::: keys)
  else
    pageKeys ::: keys
}

Note the recursion on ObjectListing objects:

Since the listing of keys in a bucket is done by batch (using a pagination system as documented here), only up to the first 1000 keys would be returned by s3Client.listObjects(bucket).getObjectSummaries.asScala.map(_.getKey).

Thus the recursive call in order to get all keys in a bucket by asking for the next page of keys while ObjectListing::isTruncated is true.

Beware of memory issues if your bucket is huge though.


s3Client can be built as such:

import com.amazonaws.services.s3.{AmazonS3, AmazonS3ClientBuilder}
import com.amazonaws.auth.{AWSStaticCredentialsProvider, BasicAWSCredentials}

val credentials = new BasicAWSCredentials(awsKey, awsAccessKey)
val s3Client: AmazonS3 = AmazonS3ClientBuilder.standard().withCredentials(new AWSStaticCredentialsProvider(credentials)).build()

with these requirements in build.sbt and the latest version:

libraryDependencies ++= Seq(
  "com.amazonaws" % "aws-java-sdk-bom" % "1.11.391",
  "com.amazonaws" % "aws-java-sdk-s3"  % "1.11.391"
)

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
thumb_up_alt 0 like thumb_down_alt 0 dislike
Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share

548k questions

547k answers

4 comments

86.3k users

...