
Jan Martijn

Share data with Azure Blob Storage using Java

Why Azure Blob Storage?

Azure Blob Storage is a nice offering from Microsoft for storing various types of data in the cloud. In my opinion, having your Java application use Azure Blob Storage to persist generated artifacts like files or movies is a solution that is easy to implement. In this post I will look at the practicalities of connecting a Java application to Azure Blob Storage.

Azure Blob Storage can serve these files or images directly to the browser from pretty much anywhere in the world. Naturally, there are client libraries for various languages, as well as a REST API, for writing to and reading from the storage.

Basics for connecting to Azure Blob Storage

Using Azure Storage requires a storage account. This account, together with an access key that can be retrieved from the Security + networking page in the Azure portal, is enough to set up a connection to the storage.
With the connection established, a container can be created in which the blobs are stored. The object model of Blob Storage consists of three important components: the storage account, the containers within it, and the blobs within each container. This hierarchy gives you an idea of how to structure the blobs you store.
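The three levels of this object model map directly onto the public URL of a blob. A small sketch, using hypothetical account, container, and blob names:

```java
// Sketch: storage account -> container -> blob, composed into the blob URL.
// All names here are hypothetical examples.
public class BlobUrlExample {
    public static void main(String[] args) {
        String account = "mystorageaccount";      // the storage account
        String container = "invoices";            // a container in that account
        String blobName = "2021/invoice-42.pdf";  // a blob in that container

        String blobUrl = String.format("https://%s.blob.core.windows.net/%s/%s",
                account, container, blobName);
        System.out.println(blobUrl);
        // prints https://mystorageaccount.blob.core.windows.net/invoices/2021/invoice-42.pdf
    }
}
```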

Making it work with Java

For the Java part, the question now is: how do I interact with this beautiful service? A couple of dependencies are needed to set up the connection and do something with the response. This is maybe not as straightforward as other blogs and manuals present it. These are the Maven dependencies I used in a project that uploads and reads blobs from the storage.

<properties>
    <stax2.version>4.2.1</stax2.version>
    <woodstox.version>6.2.4</woodstox.version>
</properties>

<dependencies>
    <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-storage-blob</artifactId>
        <version>12.13.0</version>
    </dependency>
    <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-storage-common</artifactId>
        <version>12.13.0</version>
    </dependency>
    <dependency>
        <groupId>com.azure</groupId>
        <artifactId>azure-core</artifactId>
        <version>1.20.0</version>
        <exclusions>
            <exclusion>
                <groupId>com.fasterxml.jackson.dataformat</groupId>
                <artifactId>jackson-dataformat-xml</artifactId>
            </exclusion>
        </exclusions>
    </dependency>
    <dependency>
        <groupId>com.fasterxml.woodstox</groupId>
        <artifactId>woodstox-core</artifactId>
        <version>${woodstox.version}</version>
    </dependency>
    <dependency>
        <groupId>org.codehaus.woodstox</groupId>
        <artifactId>stax2-api</artifactId>
        <version>${stax2.version}</version>
    </dependency>
</dependencies>

When working with an application server like WildFly, or with Spring, some dependencies are already provided because the web server itself uses them. Often these overlap with dependencies of your application, and you let the build tool know that such a dependency will already be provided at runtime. The Azure Storage Java client depends on jackson-dataformat-xml, and the web server used in the project provided that library. The annoying thing, however, was that the web server was missing the parser libraries that jackson-dataformat-xml needs to function correctly for Azure Storage. And yes, this caused the dreaded NoClassDefFoundError at runtime. A Maven solution for this dependency problem can look like this:

<plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-dependency-plugin</artifactId>
    <configuration>
        <artifactItems>
            <artifactItem>
                <groupId>org.codehaus.woodstox</groupId>
                <artifactId>stax2-api</artifactId>
                <version>${stax2.version}</version>
                <type>jar</type>
                <overWrite>false</overWrite>
                <destFileName>stax2-api.jar</destFileName>
            </artifactItem>
            <artifactItem>
                <groupId>com.fasterxml.woodstox</groupId>
                <artifactId>woodstox-core</artifactId>
                <version>${woodstox.version}</version>
                <type>jar</type>
                <overWrite>false</overWrite>
                <destFileName>woodstox-core.jar</destFileName>
            </artifactItem>
        </artifactItems>
        <outputDirectory>${project.build.directory}/additional-jars</outputDirectory>
    </configuration>
</plugin>

Then copy these files to the web server's library directory before startup. With the dependencies present, we can move on to setting up a connection. To access a storage container using the Java client, a storage account and an account key are needed. With the storage account a URL can be constructed, like ENDPOINT_STRING_TEMPLATE = https://%s.blob.core.windows.net, which the client connects to. Using the client library, a BlobContainerClient can be created, which then allows the client to build a BlobClient that is used to upload data.

String storageAccount = System.getenv("BLOB_CONTAINER_SA");
String accountKey = System.getenv("BLOB_CONTAINER_KEY");
var storageSharedKeyCredential = new StorageSharedKeyCredential(storageAccount, accountKey);
var blobServiceClient = new BlobServiceClientBuilder()
        .credential(storageSharedKeyCredential)
        .endpoint(String.format(ENDPOINT_STRING_TEMPLATE, storageAccount))
        .buildClient();

var blobContainerClient = blobServiceClient.getBlobContainerClient(containerName);
if (!blobContainerClient.exists()) {
    blobContainerClient = blobServiceClient.createBlobContainer(containerName);
}

var blobClient = blobContainerClient.getBlobClient("directory/" + file.getFilename());
blobClient.upload(BinaryData.fromString("interesting data"));

From this example you can see that a hierarchical directory structure can be created within a container to categorize the objects.
The blobClient.upload method above won't overwrite an existing blob by default; a boolean can be provided as a second argument to allow this explicitly. In my project we uploaded multiple blobs and wanted to make sure no previous blobs remained in the container. For this, a delete method is available on the BlobClient, and with the following code one can delete all
existing blobs in a container.

blobContainerClient.listBlobs().forEach(blobItem -> {
   var blobClient = blobContainerClient.getBlobClient(blobItem.getName());
   blobClient.delete();
}); 
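Since the project also reads blobs back, here is a short sketch continuing the upload example above (reusing the same blobClient): overwriting explicitly, then downloading the content again.

```java
// Pass true as the second argument to overwrite an existing blob.
blobClient.upload(BinaryData.fromString("interesting data"), true);

// Read the blob back; BinaryData can decode the content as a String.
String content = blobClient.downloadContent().toString();
```

This requires a live storage service (or Azurite, see below) to actually run against.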

How to test?

One of the nicest things I experienced when developing against the Blob Storage service is Azurite, an emulator that can run locally or in Docker. Azurite emulates the behavior of the actual Blob service locally, so you can set up an integration test without depending on an actual Azure account. The fake account used by Azurite can be configured, and data can be uploaded to, and retrieved from, this service. The Docker configuration needed for Azurite can be added to docker-compose.yml like this:

azurite:
  image: mcr.microsoft.com/azure-storage/azurite
  hostname: azurite
  restart: always
  command: "azurite --loose --blobHost 0.0.0.0 --blobPort 10000"
  ports:
    - "10000:10000"

Docker environment configuration can be added to provide the values for the Azurite host and its credentials. These are the default development account credentials of Azurite:

AZURE_STORAGE_HOST: azurite
AZURE_STORAGE_PORT: 10000
BLOB_CONTAINER_SA: devstoreaccount1
BLOB_CONTAINER_KEY: Eby8vdM02xNOcqFlqUwJPLlmEtlCDXJ1OUzFT50uSRZ6IFsuFq2UVErCz4I6tq/K1SZFPTOtr/KBHBeksoGMGw==
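One thing to keep in mind: Azurite uses a different endpoint format than the real service, with the account name in the URL path instead of in the hostname. A hypothetical helper can pick the right format depending on whether an emulator host is configured:

```java
// Hypothetical helper: build the blob endpoint for either Azurite or real Azure.
public class BlobEndpoints {
    static String blobEndpoint(String emulatorHost, int port, String account) {
        if (emulatorHost == null || emulatorHost.isEmpty()) {
            // Real Azure: the account name is part of the hostname.
            return String.format("https://%s.blob.core.windows.net", account);
        }
        // Azurite: the account name is part of the URL path.
        return String.format("http://%s:%d/%s", emulatorHost, port, account);
    }

    public static void main(String[] args) {
        System.out.println(blobEndpoint("azurite", 10000, "devstoreaccount1"));
        // prints http://azurite:10000/devstoreaccount1
        System.out.println(blobEndpoint(null, 0, "mystorageaccount"));
        // prints https://mystorageaccount.blob.core.windows.net
    }
}
```

The endpoint returned for the Azurite case can be passed to the BlobServiceClientBuilder from the connection example earlier instead of the production URL.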

So now you can test against the storage before deploying to the outside world, which is great.

Conclusion

A storage service in the cloud is nothing new and there are numerous offerings, so I cannot tell you that Azure Storage is the right choice for you. On the other hand, I am convinced that the integration with a Java application is not difficult, and that enough functionality and libraries are available to allow various solutions.

Extras

There is a GUI available to see what's stored in your Azure Storage containers, and it allows for manually adding and deleting blobs. This is Azure Storage Explorer, which can be downloaded here: https://azure.microsoft.com/en-us/features/storage-explorer/

For more information about Azurite: https://docs.microsoft.com/en-us/azure/storage/common/storage-use-azurite?tabs=visual-studio


The Azure storage libraries for Java:
https://github.com/Azure/azure-sdk-for-java/tree/main/sdk/storage
