data protection Archives

Building and Protecting Data Lakehouse Projects with Cloudian and Vertica

See how to start a data lakehouse with Vertica EON mode and Cloudian, extend the data lakehouse with Vertica external tables and Cloudian, and protect Vertica datasets with data backup to Cloudian.

Henry Golas, Director of Technology, Cloudian



Over the past year, Cloudian has greatly expanded its support for data analytics through new partnerships. One of those key partnerships is with Vertica, where the combination of Vertica and Cloudian HyperStore enables organizations to build and protect data lakehouses for modern data analytics applications.

This blog highlights the three main use cases we’re currently serving together:

Starting a data lakehouse with Vertica in Eon mode and Cloudian
Extending the data lakehouse with Vertica external tables and Cloudian
Protecting Vertica datasets with data backup to Cloudian

Just as a reminder, Vertica is a unified analytics data warehouse platform, based on a massively scalable architecture, and Cloudian is a software-defined, limitlessly scalable, S3-compatible object storage platform for on-premises and hybrid cloud environments.

Starting a Data Lakehouse with Vertica in Eon Mode and Cloudian

In the data analytics space, Vertica is known for performance, whether it is run in “Enterprise Mode” or “Eon Mode.” In Enterprise Mode, each database node stores a portion of the dataset and performs a portion of the computation. In Eon Mode, Vertica brings its cloud architecture to on-premises deployments and decouples compute from storage: each Vertica node can access a shared communal storage space via the S3 API. The advantages are: a) compute can be scaled as required without having to scale storage, meaning no more server sprawl, and b) storage can be consolidated into a single platform and accessed by various data tools.

Building out Vertica communal storage on Cloudian is easy. For this exercise we are going to assume we have both a functional Vertica and Cloudian HyperStore instance that can communicate via HTTP(s):

Configure a bucket via Cloudian Management Console (CMC) on your HyperStore cluster:
- Let’s use the name “verticabucketoncloudian” for this example.
Create an auth_params.conf file:
- On your Vertica node, create an auth_params.conf file that will be accessible when you create the Vertica database instance.
  The auth_params.conf values required are:
  awsauth = Access_Key:Secret_Key
  awsendpoint = HyperstoreAddress:Port (either 443 or 80)
  awsenablehttps = 0 (required if not using HTTPS)
Create your Vertica in Eon Mode database instance:
- On your Vertica node, create the database instance. Specify the location of your auth_params.conf file to leverage a Cloudian S3 bucket for communal storage.
  admintools -t create_db -x auth_params.conf \
    --communal-storage-location=s3://verticabucketoncloudian \
    --depot-path=/home/dbadmin/depot --shard-count=6 \
    -s vnode01,vnode02,vnode03,vnode04,vnode05,vnode06 -d verticadb -p 'YourDBAdminPasswordHere'
Success! Let’s test.
- Once the above command returns successfully, you can test the Vertica in Eon Mode instance.
- Connect to your db instance and load a dataset.
- Connect to Cloudian bucket “verticabucketoncloudian” via CMC or S3 browser, and you will see objects in the bucket.
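
As a minimal sketch of the auth_params.conf step above, the following Python helper renders the three settings the article lists. The function name, the placeholder credentials, and the host name are hypothetical; only the keys `awsauth`, `awsendpoint`, and `awsenablehttps` come from the text.

```python
def render_auth_params(access_key: str, secret_key: str,
                       host: str, port: int, use_https: bool) -> str:
    """Return the text of an auth_params.conf file for Vertica communal storage."""
    lines = [
        f"awsauth = {access_key}:{secret_key}",
        f"awsendpoint = {host}:{port}",
    ]
    if not use_https:
        # The article notes this setting is required when HTTPS is not used.
        lines.append("awsenablehttps = 0")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(render_auth_params("ACCESS_KEY", "SECRET_KEY",
                             "hyperstore.example.internal", 80,
                             use_https=False), end="")
```

Writing the returned text to a file and passing its path with `-x` matches the admintools command shown above.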

Extending the Data Lakehouse with Vertica External Tables and Cloudian

One of the key tenets of a successful data lakehouse initiative is the ability to access and analyze datasets that have been generated by other analytics platforms.

Prior to the data lakehouse, an ETL (Extract Transform Load) operation would have been required to move data from one analytics platform to another. Today, Vertica can analyze the data in-place by leveraging external tables, without the need for complex and expensive data moves.

Let’s consider the following scenario… we have an ORC dataset, which was generated by an Apache Hive instance, stored on Cloudian, and we need to connect to it with Vertica. To analyze this dataset in-place, use the following Vertica syntax to connect to the ORC dataset:
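
The original snippet did not survive in this copy of the post, so what follows is only a sketch of what such a statement can look like, not the author's exact syntax. It assumes Vertica's `CREATE EXTERNAL TABLE ... AS COPY FROM ... ORC` form; the table name, column definitions, and object path are hypothetical.

```python
def external_orc_table_sql(table: str, columns: dict[str, str], s3_glob: str) -> str:
    """Build a CREATE EXTERNAL TABLE statement that reads ORC files in place."""
    column_list = ", ".join(f"{name} {sql_type}" for name, sql_type in columns.items())
    return (f"CREATE EXTERNAL TABLE {table} ({column_list}) "
            f"AS COPY FROM '{s3_glob}' ORC;")


if __name__ == "__main__":
    # Hypothetical schema and path, for illustration only.
    print(external_orc_table_sql(
        "hive_sales",
        {"sale_id": "INT", "amount": "FLOAT"},
        "s3://hivebucketoncloudian/sales/*.orc",
    ))
```

The column list has to match the schema of the ORC files that Hive wrote; the data itself stays in the Cloudian bucket.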

That is much simpler and easier than working through any data ETL.

Here are the details for the S3 parameters and configuration.

Protecting Vertica Datasets with Data Backup to Cloudian

As with all datasets, backups of data are key to protecting and preserving data. For this purpose, Vertica has its own backup and recovery tool called “vbr,” and Vertica can leverage Cloudian as a backup target.

Vertica has thoroughly documented the process, but here’s a condensed version:

Configure connectivity and credentials for HyperStore
1. HyperStore credentials are important. They are configured within the database, as a security function, and they are configured as environment variables to allow vbr to connect.
  - For the database that is going to be backed up, set the AWSAuth credentials (S3 credentials):
    ALTER DATABASE DEFAULT SET AWSAuth = 'accesskeyid:secretaccesskey';
2. Configure vbr HyperStore URL address and credentials
  export VBR_COMMUNAL_STORAGE_ENDPOINT_URL=http://
  export VBR_COMMUNAL_STORAGE_ACCESS_KEY_ID=
  export VBR_COMMUNAL_STORAGE_SECRET_ACCESS_KEY=
  export VBR_BACKUP_STORAGE_ENDPOINT_URL=http://
  export VBR_BACKUP_STORAGE_ACCESS_KEY_ID=
  export VBR_BACKUP_STORAGE_SECRET_ACCESS_KEY=
  - Keep in mind that you can back up to the same endpoint using the same credentials as the communal storage, but to a different bucket. Or backup can be to a second endpoint with different credentials. Most users will want to back up to a different bucket to reduce associated cost.
Setting the configuration file for vbr
1. There are some additional parameters that must be stored in a configuration file for Vertica to successfully back up and restore with Cloudian.
2. Create a file called “eon_backup_restore.ini” in the home directory of dbadmin
  As a quick reference, /opt/vertica/share/vbr/example_configs contains examples for cloud backups
  eon_backup_restore.ini
  [CloudStorage]
  cloud_storage_backup_path = s3://verticabackuponcloudian/fullbackup/
  cloud_storage_backup_file_system_path = []:/home/dbadmin/backup_locks_dir/
  cloud_storage_concurrency_backup = 10
  cloud_storage_concurrency_restore = 10
  [Misc]
  snapshotName = EONbackup_snapshot
  tempDir = /tmp/vbr
  restorePointLimit = 1
  [Database]
  dbName =
  dbPromptForPassword = True
  dbUser = dbadmin
Target initialization and performing data backup
1. Vertica requires the S3 bucket to be initialized prior to use
  - vbr -t init -c eon_backup_restore.ini
    Initializing backup locations.
    Backup locations initialized.
2. Run the Vertica backup
  - vbr -t backup -c eon_backup_restore.ini
    Enter vertica password:
    Starting backup of database VMart.
    Participating nodes: v_vmart_node0001, …., v_vmart_node0006.
    Snapshotting database.
    Snapshot complete.
    Approximate bytes to copy: x of y total.
    [================================================] 100%
    Copying backup metadata.
    Finalizing backup.
    Backup complete!
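
For teams that generate the vbr configuration from a script, here is a hedged sketch using Python's standard `configparser`. The section and key names are the ones shown in the configuration file above; the function name and its parameters are hypothetical, and the database name is left as an argument because the article does not give one.

```python
import configparser
import io


def render_vbr_config(db_name: str, backup_path: str, lock_dir: str) -> str:
    """Render an eon_backup_restore.ini file as text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep camelCase keys such as snapshotName
    parser["CloudStorage"] = {
        "cloud_storage_backup_path": backup_path,
        "cloud_storage_backup_file_system_path": f"[]:{lock_dir}",
        "cloud_storage_concurrency_backup": "10",
        "cloud_storage_concurrency_restore": "10",
    }
    parser["Misc"] = {
        "snapshotName": "EONbackup_snapshot",
        "tempDir": "/tmp/vbr",
        "restorePointLimit": "1",
    }
    parser["Database"] = {
        "dbName": db_name,
        "dbPromptForPassword": "True",
        "dbUser": "dbadmin",
    }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    print(render_vbr_config("verticadb",
                            "s3://verticabackuponcloudian/fullbackup/",
                            "/home/dbadmin/backup_locks_dir/"))
```

Setting `optionxform` matters here: by default `configparser` lowercases option names, which would change keys such as `snapshotName`.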

I hope this tech blog post helps make your Cloudian and Vertica data lakehouse project a success.

For more information about Cloudian data lakehouse / data analytics solutions, go to S3 Data Lakehouse for Modern Data Analytics.

VMware Cloud Service Providers Can Expand Their Business with Ransomware Protection

By helping their customers protect against ransomware, VCSPs can grow their footprint with existing clients and attract new ones. It’s easy with Object Lock technology.

Van Flowers, Senior Systems Engineer, Cloudian



VMware Cloud Service Providers (VCSPs) have emerged as an excellent alternative to hyperscalers for organizations that want to store their data in the cloud. By employing object storage, such as Cloudian HyperStore, these service providers can deliver hyperscaler-like scalability and flexibility while addressing organizations’ individual performance, data sovereignty, budget and security needs. Data security has become especially important due to the proliferation of ransomware attacks over the past few years, and VCSPs that can help their customers protect their data against this threat can expand their business by growing their footprint with existing clients and attracting new ones.

The best way to protect data against ransomware is with data immutability using Object Lock technology. As perimeter security solutions increasingly prove ineffective in preventing ransomware from getting in, having an immutable data backup copy ensures that this data cannot be deleted or encrypted. In the event of a ransomware attack, organizations can easily recover the unchanged backup without having to pay ransom.

Object Lock can be implemented as part of an automated backup workflow. For example, VCSPs using Veeam, Rubrik or Commvault can deploy Cloudian’s S3 Object Lock to seamlessly integrate a ransomware-proof, immutable S3 bucket into their customer backup solution.

So how do you do it? All that’s needed to create the immutable bucket is to switch on the Object Lock slider, and you’re in the ransomware protection business! This simple task is performed at the time the bucket is created, and from then on, the data written into the bucket cannot be deleted or altered in any way until the defined immutability period expires. The bad guys – even a rogue administrator – can’t change it, but restores can be done in a flash!

Ransomware attacks are growing by the thousands every day, and VCSPs’ ability to offer this vital protection for their customers’ data can be a key contributor to continued growth, and to their customers’ peace of mind.

VCSPs – don’t let another minute go by without having immutable storage available for your customers. Nothing is easier to configure and integrate, or does more to fortify your customers’ data security, than immutable storage built on Cloudian. Contact your local Cloudian representative for more information or drop me a note (vflowers@cloudian.com) or a tweet (@avf925). I’m always happy to help!

And to learn more about Cloudian solutions for VCSPs, visit Object Storage for VMware Cloud Director | Cloudian.

S3 Buckets: Accessing, Managing, and Securing Your Buckets

Amazon Simple Storage Service (Amazon S3) is an object storage solution that provides data availability, performance, security and scalability. Organizations from all industries and of every size may use Amazon S3 storage to safeguard and store any amount of information for a variety of use cases, including websites, data lakes, backup and restore, mobile applications, archives, big data analytics, IoT devices, and enterprise applications.

What Is an AWS S3 Bucket?

To retain your information in Amazon S3, you use resources called objects and buckets. A bucket is a container that houses objects. An object contains a file and all metadata used to describe the file.

To store an object in Amazon S3, you create a bucket and upload the object into it. Once the object is in the bucket, you can move it, download it, or open it. When you no longer need the object or the bucket, you can delete them to reduce the resources you use.

In this article:

How to Use an Amazon S3 Bucket
Tutorial: Creating a Bucket
What Is S3 Bucket Policy?
S3 Bucket URL and Other Methods to Access Your Buckets
S3 Bucket Configuration: Understanding Subresources
Best Practices for Keeping Amazon S3 Buckets Secure
S3 Bucket with Cloudian

This is part of an extensive series of articles about S3 Storage.

How to Use an Amazon S3 Bucket

An S3 customer starts by establishing a bucket in the AWS region of their choosing and assigns it a unique name. AWS suggests that customers select regions that are geographically close to them in order to minimize costs and latency.

After creating the bucket, the user chooses a storage tier based on the usage requirements for the data—there are various S3 tiers ranging in terms of price, accessibility and redundancy. A single bucket can retain objects from distinct S3 storage tiers.

The user may then assign particular access privileges regarding the objects retained in the bucket using various mechanisms, including bucket policies, the AWS IAM service, and ACL.

An AWS customer may work with an Amazon S3 bucket via the APIs, the AWS CLI, or the AWS Management Console.

Related content: Read our guide to the S3 API

Tutorial: Creating a Bucket

Before you can store content in S3, you need to create a new bucket, selecting a bucket name and Region. You may also wish to select additional storage management options for your bucket. Once you have created a bucket, you can’t modify the Region or bucket name.

The AWS account that created the bucket remains the owner. You may upload as many objects as you like to the bucket. According to the default settings, you can have as many as 100 buckets for each AWS account.

S3 lets you create buckets using the S3 Console or the API.

Keep in mind that buckets are priced according to data volume stored in them, and other criteria. Learn more in our guide to S3 pricing

Creating an S3 bucket via the S3 console:

1. Access the S3 console.
2. Select Create bucket.
3. In Bucket name, enter a DNS-compliant name for your bucket.


The bucket name must be unique, begin with a number or lowercase letter, be 3 to 63 characters long, and may not contain any uppercase characters.

4. Select the AWS Region for the bucket. Select a Region near you to keep latency and cost to a minimum and to address regulatory demands. Keep in mind there are special charges for moving objects outside a region.
5. In Bucket settings for Block Public Access, specify whether you want to block or allow public access to the bucket.
6. You can optionally enable the Object Lock feature in Advanced settings > Object Lock.
7. Select Create bucket.
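
The naming rules from step 3 can be checked before a bucket is created. The sketch below is a hypothetical helper, not an AWS API: it covers the rules stated above (3 to 63 characters, no uppercase, starting with a lowercase letter or number) plus a few further S3 rules, assumed here from the general-purpose bucket naming guidelines: names also end with a letter or number, contain only lowercase letters, numbers, dots and hyphens, have no adjacent dots, and are not formatted like an IP address. Global uniqueness cannot be checked offline.

```python
import re

# First and last characters: lowercase letter or digit; 3 to 63 characters overall.
_NAME_RE = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IPV4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Return True if the name passes the offline naming checks described above."""
    if not _NAME_RE.fullmatch(name):
        return False
    if ".." in name:  # adjacent dots are not allowed
        return False
    if _IPV4_RE.fullmatch(name):  # must not look like an IP address
        return False
    return True


if __name__ == "__main__":
    for candidate in ("bucket-one", "Bucket-One", "ab", "192.168.1.1"):
        print(candidate, is_valid_bucket_name(candidate))
```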

What Is S3 Bucket Policy?

S3 provides the concept of a bucket policy, which lets you define access permissions for a bucket and the content stored in it. Technically, it is a resource-based AWS Identity and Access Management (IAM) policy, written in a JSON-based policy language.

For instance, policies permit you to:

Enable read access for anonymous users
Block a particular IP address from accessing the bucket
Limit access to requests from a particular HTTP referrer
Require multi-factor authentication
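
To make the second item in the list concrete, here is a minimal sketch that builds a bucket policy denying requests from one address range. The function name, the `Sid`, and the example CIDR are hypothetical; the policy elements (`Effect`, `Principal`, `Action`, `Resource`, `Condition`) are the standard ones in the JSON policy language.

```python
import json


def deny_ip_policy(bucket: str, blocked_cidr: str) -> str:
    """Return a bucket policy (as JSON text) that denies one source address range."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyBlockedAddress",  # hypothetical statement id
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                # Cover both the bucket itself and every object in it.
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                "Condition": {"IpAddress": {"aws:SourceIp": blocked_cidr}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(deny_ip_policy("bucket-one", "203.0.113.7/32"))
```

The resulting JSON is what you would paste into the bucket policy editor or send through the API; an explicit `Deny` takes precedence over any `Allow`.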

S3 Bucket URLs and Other Methods to Access Your Buckets

You can perform almost any operation using the S3 console, with no need for code. However, S3 also provides a powerful REST API that gives you programmatic access to buckets and objects. You can reference any bucket or the objects within it via a unique Uniform Resource Identifier (URI).

Amazon S3 supports both path-style and virtual-hosted-style URLs for accessing a bucket. Because buckets are reached through these URLs, you should give your buckets DNS-compliant names.

Virtual-Hosted-Style Access

In a virtual-hosted-style request, the bucket name is a component of the domain name within the URL.

Amazon S3 virtual-hosted-style URLs employ this format:

https://bucket-name.s3.Region.amazonaws.com/key name

For example, if you name the bucket bucket-one, select the US East (Northern Virginia) Region, and use kitty.png as your key name, the URL will look as follows:

https://bucket-one.s3.us-east-1.amazonaws.com/kitty.png

Path-Style Access

In Amazon S3, path-style URLs use this format:

https://s3.Region.amazonaws.com/bucket-name/key name

For example, if you created a bucket in the US East (Northern Virginia) Region and named it bucket-one, the path-style URL you use to access the kitty.jpg object in the bucket will look like this:

https://s3.us-east-1.amazonaws.com/bucket-one/kitty.jpg
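
The two URL styles differ only in where the bucket name goes. As a small sketch (the function names are hypothetical), both formats shown above can be produced like this:

```python
from urllib.parse import quote


def virtual_hosted_url(bucket: str, region: str, key: str) -> str:
    """Bucket name is part of the host name."""
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def path_style_url(bucket: str, region: str, key: str) -> str:
    """Bucket name is the first path segment."""
    return f"https://s3.{region}.amazonaws.com/{bucket}/{quote(key)}"


if __name__ == "__main__":
    print(virtual_hosted_url("bucket-one", "us-east-1", "kitty.png"))
    print(path_style_url("bucket-one", "us-east-1", "kitty.jpg"))
```

`quote` percent-encodes characters such as spaces in the key while leaving `/` intact, so keys that look like folder paths keep their structure.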

Accessing a Bucket Via S3 Access Points

As well as working with a bucket directly, you can work with a bucket via an access point.

S3 access points exclusively support virtual-host-style addressing. To address a bucket via an access point, you must employ the following format:

https://AccessPointName-AccountId.s3-accesspoint.region.amazonaws.com.

Accessing a Bucket Using S3://

Certain AWS services need you to specify an Amazon S3 bucket via S3://bucket, where you will need to follow this format:

S3://bucket-name/key-name

Note that when employing this format the bucket name does not feature the AWS Region. For example, a bucket called bucket-one with a kitty.jpg key will look like this:

S3://bucket-one/kitty.jpg
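
Tools that accept this format usually need to split it back into a bucket and a key. A minimal sketch, with a hypothetical function name and a case-insensitive check on the scheme:

```python
def parse_s3_uri(uri: str) -> tuple[str, str]:
    """Split an S3://bucket-name/key-name URI into (bucket, key)."""
    scheme, separator, rest = uri.partition("://")
    if not separator or scheme.lower() != "s3":
        raise ValueError(f"not an S3 URI: {uri!r}")
    bucket, _, key = rest.partition("/")
    if not bucket:
        raise ValueError(f"missing bucket name: {uri!r}")
    return bucket, key


if __name__ == "__main__":
    print(parse_s3_uri("S3://bucket-one/kitty.jpg"))
```

Only the first `/` separates the bucket from the key, so a key such as `photos/2021/kitty.jpg` is returned whole.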

S3 Bucket Configuration: Understanding Subresources

AWS provides various tools for Amazon S3 buckets. An IT specialist can enable versioning on an S3 bucket to retain every version of an object when an operation is carried out on it, for example a delete or copy operation. This helps protect against accidentally deleting an object. Similarly, when creating a bucket, a user can set up server access logs, tags, object-level API logs, and encryption.

S3 Transfer Acceleration can assist with the execution of secure and fast transfers from the client to an S3 bucket via AWS edge locations.

Amazon S3 offers several options for configuring your bucket. It supports subresources, which store and manage the bucket configuration details. You can use the Amazon S3 API to create and manage these subresources, or you can use the AWS SDKs or the console.

These are known as subresources because they exist in the context of a particular object or bucket. The list below covers subresources that let you manage bucket-specific configurations.

cors (cross-origin resource sharing): You may configure your bucket to permit cross-origin requests.

event notification: You may permit your bucket to alert you of particular bucket events.

lifecycle: You may specify lifecycle rules for objects within your bucket that have a well-defined lifecycle.

location: When you create a bucket, you choose the AWS Region where you want Amazon S3 to create it. Amazon S3 stores this information in the location subresource and offers an API so you can retrieve it.

logging: Logging lets you monitor requests for access to the bucket. All access log records give details regarding one access request, including bucket name, requester, request action, request time, error code, and response status.

object locking: Enables the object lock feature for a bucket. You may also wish to configure a default retention period and mode that apply to new objects uploaded to the bucket.

policy and ACL (access control list): Both buckets and the objects stored within them are private, unless you specify otherwise. ACL and bucket policies are two ways to grant permissions for an entire bucket.

replication: This option lets you automatically copy the content of the bucket to additional buckets, in the same or a different AWS Region. Replication is asynchronous.

requestPayment: By default, the AWS account that sets up a bucket also receives bills for requests made to the bucket. This setting lets the bucket creator pass on the cost of downloading data from the bucket to the account downloading the content.

tagging: This setting allows you to add tags to an S3 bucket. This can help you track and organize your costs on S3. AWS shows the tags on your cost allocation report, with costs and usage aggregated by tag.

transfer acceleration: Transfer acceleration enables easy, secure and fast movement of files over long distances between your client and your S3 bucket. Transfer acceleration uses the globally distributed edge locations of Amazon CloudFront.

versioning: Versioning helps you recover from accidental deletes and overwrites.

website: You may configure the bucket for static website hosting.
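
To illustrate one of these subresources, here is a sketch of a lifecycle configuration as a plain Python dictionary. The function name, rule ID, and example values are hypothetical; the dictionary layout (`Rules`, `Filter`, `Transitions`, `Expiration`) follows the shape used by the S3 lifecycle configuration API, and the block only builds the data without calling AWS.

```python
def lifecycle_configuration(prefix: str, days_to_ia: int, days_to_expire: int) -> dict:
    """Build a one-rule lifecycle configuration: move to Standard-IA, then expire."""
    if days_to_expire <= days_to_ia:
        raise ValueError("objects must expire after they transition")
    return {
        "Rules": [
            {
                "ID": "tier-then-expire",  # hypothetical rule name
                "Status": "Enabled",
                "Filter": {"Prefix": prefix},
                "Transitions": [
                    {"Days": days_to_ia, "StorageClass": "STANDARD_IA"}
                ],
                "Expiration": {"Days": days_to_expire},
            }
        ]
    }


if __name__ == "__main__":
    print(lifecycle_configuration("logs/", 30, 365))
```

A rule like this suits data with a well-defined lifecycle, such as logs that are read often for a month and rarely afterwards.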

Best Practices for Keeping Amazon S3 Buckets Secure

AWS S3 Buckets may not be as safe as most users believe. In many cases, AWS permissions are not correctly configured and can expose an organization’s AWS S3 buckets or some of their content.

Although misconfigured permissions are by no means a novel occurrence for many organizations, there is a specific permission that entails increased risk. If you allow objects to be public, this establishes a pathway for cyberattackers to write to S3 buckets that they don’t have the correct permissions to access. Misconfigured buckets are a major root cause behind many well-known attacks.

To protect your S3 buckets, you should apply the following best practices.

Block Public S3 Buckets at the Organization Level

Designate specific AWS accounts for public S3 use, and stop all other S3 buckets from accidentally becoming public by turning on S3 Block Public Access. Use AWS Organizations service control policies (SCPs) to ensure that the Block Public Access setting cannot be changed. S3 Block Public Access provides protection at the account level and on individual buckets, including those that you create in the future.

You can block existing public access, irrespective of whether it was granted by a policy or an ACL, and make sure that public access is not granted to newly created items. As a result, only the designated AWS accounts can have public S3 buckets, and all other accounts are prevented from exposing theirs.
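
Block Public Access is made up of four independent settings. The sketch below lists them by the names used in the S3 API and reports which ones a given configuration leaves switched off; the helper function is hypothetical and works on plain dictionaries rather than calling AWS.

```python
# The four Block Public Access settings, all enabled.
PUBLIC_ACCESS_BLOCK = {
    "BlockPublicAcls": True,        # reject new public ACLs
    "IgnorePublicAcls": True,       # ignore existing public ACLs
    "BlockPublicPolicy": True,      # reject new public bucket policies
    "RestrictPublicBuckets": True,  # restrict access to buckets with public policies
}


def missing_protections(config: dict) -> list:
    """Return the names of settings that are absent or disabled in config."""
    return [name for name in PUBLIC_ACCESS_BLOCK if not config.get(name, False)]


if __name__ == "__main__":
    print(missing_protections({"BlockPublicAcls": True, "IgnorePublicAcls": False}))
```

An audit script could run a check like this against every account or bucket and flag any that return a non-empty list.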

Implement Role-Based Access Control

Define roles that cover the access needs of users and objects. Make sure those roles have the least access needed to carry out the job so that if a user’s account is breached, the damage is kept to a minimum.

AWS security is founded on AWS Identity and Access Management (IAM) policies. A principal is an identity that can be authenticated, for example, with a password. Roles, users, applications, and federated users (from separate systems) may all be principals. When an authenticated principal requests an entity, resource, service, or a different asset, authorization begins.

Authorization policies determine what access the principal has to the resource being requested. Access is granted based on identity-based or resource-based policies. Evaluating the request from each authenticated principal against the applicable policies determines whether the request is permitted.

Another data security approach is to split data across different buckets. For instance, a multi-tenant application could require separate Amazon S3 buckets for every tenant. You can also use another AWS tool, Amazon VPC endpoints, to give resources in your VPC secure access to your Amazon S3 buckets.

Encrypt Your Data

Even with your greatest efforts, it remains good practice to assume that information is always at risk of being exposed. Given this, you should use encryption to stop unauthorized individuals from using your information if they have managed to access it.

Make sure that the data in your Amazon S3 buckets is encrypted both in transit and at rest. If you have just a single bucket, this is likely not complex, but if buckets are being created dynamically, it may be difficult to keep track of them and manage encryption appropriately.

On the server side, Amazon S3 supports encryption at rest; confirm that it is enabled and uses the type of key you need. Encrypting the bucket means that anyone who manages to access the stored data will need the encryption key to decrypt it.

For transport security, HTTPS is used to make sure that information is encrypted from one end to the other. Each newer version of Transport Layer Security (TLS) makes the protocol more secure and does away with out-of-date, now insecure, encryption methods.
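
One common way to enforce encryption in transit is a bucket policy that denies any request not sent over HTTPS. The following is a sketch under that assumption; the function name and `Sid` are hypothetical, while `aws:SecureTransport` is the standard condition key for this check.

```python
import json


def require_tls_policy(bucket: str) -> str:
    """Return a bucket policy (as JSON text) that denies plain-HTTP requests."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "DenyInsecureTransport",  # hypothetical statement id
                "Effect": "Deny",
                "Principal": "*",
                "Action": "s3:*",
                "Resource": [
                    f"arn:aws:s3:::{bucket}",
                    f"arn:aws:s3:::{bucket}/*",
                ],
                # The key evaluates to "false" when the request did not use TLS.
                "Condition": {"Bool": {"aws:SecureTransport": "false"}},
            }
        ],
    }
    return json.dumps(policy, indent=2)


if __name__ == "__main__":
    print(require_tls_policy("bucket-one"))
```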

S3-Compatible Storage On-Premises with Cloudian

Cloudian® HyperStore® is a massive-capacity object storage device that is fully compatible with Amazon S3. It allows you to easily set up an object storage solution in your on-premises data center, enjoying the benefits of cloud-based object storage at much lower cost.

HyperStore can store up to 1.5 Petabytes in a 4U Chassis device, allowing you to store up to 18 Petabytes in a single data center rack. HyperStore comes with fully redundant power and cooling, and performance features including 1.92TB SSD drives for metadata, and 10Gb Ethernet ports for fast data transfer.

HyperStore is an object storage solution you can plug in and start using with no complex deployment. It also offers advanced data protection features, supporting use cases like compliance, healthcare data storage, disaster recovery, ransomware protection and data lifecycle management.

Learn more about Cloudian® HyperStore®.

Object Storage in the Cloud: 4 Providers Compared

What Is Object Storage in the Cloud?

As your business expands, you have to manage isolated but rapidly growing pools of data from various sources, which are used for a variety of business processes and applications. Nowadays, many organizations grapple with a fragmented storage portfolio that slows down innovation and adds complexity to an organization’s applications. Object storage can help your organization break down these silos. It provides cost-effective, highly scalable storage that can retain any type of data in its original format.

Object storage is highly suitable for the cloud as it is flexible, elastic and can be scaled more easily into many petabytes to support indefinite data growth. The architecture manages and stores data as objects, as opposed to block storage, which handles data as blocks within logical volumes, and file storage, where data is stored in a hierarchy of files.

Related content: Read our guides on object storage vs block storage and object storage vs file storage.

In this article:

4 Cloud Object Storage Options
Cloud Object Storage Pros and Cons
Object Storage in the Cloud with Cloudian

4 Cloud Object Storage Options

Let’s review the object storage offerings by some of the world’s leading cloud providers: Amazon Web Services, Microsoft Azure, Google Cloud, and IBM Cloud.

AWS Object Storage

AWS provides a variety of storage classes for different use cases. Amazon S3 is the main object storage platform of AWS, with S3 Standard-IA providing cool storage, and Glacier providing cold storage:

Amazon S3 Standard—this is the storage choice for information that is often accessed, and is great for numerous use cases including dynamic websites, cloud applications, content distribution, data analytics and gaming. It delivers high throughput as well as low latency.
Amazon S3 Standard-Infrequent Access (Amazon S3 Standard—IA)—this is a storage alternative for data which is accessed less often, such as disaster recovery and long-term backups.
Amazon Glacier—this highly durable storage system is optimized for data that is not often accessed, or “cold” data, such as end-of-lifecycle data kept for compliance and regulatory backup purposes. Data is archived for long-term storage, and is immutable and encrypted.

Azure Object Storage

Microsoft offers Azure Blob Storage for object storage in the cloud. Blob storage is suited to storing any form of unstructured data, such as binary or text. This includes videos, images, documents, audio and more. Azure storage offers high-quality data integrity, flexibility and mutability.

Blob storage is employed for serving documents or images directly to a browser, for retaining files for distributed access, streaming audio and video, writing to log files, disaster recovery, storing data for restore and backup, and archiving, so it can be analyzed by an Azure-hosted or on-premises service.

Azure has several storage tiers, including:

Hot access tier— for information that is in or anticipated to be in active use and staged for processing and subsequent migration to the Cool storage tier.
Cool access tier—for data that is intended to stay in the Cool tier for more than 30 days. This includes disaster recovery datasets and short-term backup, older media content that must still be immediately available when accessed, and large data sets.
Archive access tier—for data which will stay in the Archive tier for more than 180 days, and which can tolerate hours of retrieval latency.

Note: The Archive storage tier is not accessible at the storage account level, but only at the blob level. Azure also provides a Premium tier, which is for workloads that need consistent and fast response times.

Google Cloud Storage

Google Cloud Storage (GCS) provides unified object storage for all workloads. It has four classes for backup and archival storage and high-performance object storage. All four classes provide high durability and low latency:

Hot (high-performance) storage—GCS provides regional and multi-regional storage for high-frequency access information.
Multi-regional storage—allows for the storing of information that is often accessed around the world, including streaming videos, serving website content, or mobile and gaming applications.
Regional storage—allows for frequent access to information from within the same region as the Google Compute Engine instance or Google Cloud DataProc cluster that uses it, for example for data analytics.
Nearline (cool) storage—for data that only needs to be accessed less than once a month, but several times a year. Suitable for backups and long-tail multimedia content.
Coldline (cold) storage—for data that only needs to be accessed less than once a year. Suitable for archival data and disaster recovery.
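
The access-frequency guidance above can be turned into a simple rule of thumb. This sketch uses only the thresholds stated in the class descriptions (less than once a year, less than once a month); the function name is hypothetical, and real decisions would also weigh retrieval cost and minimum storage duration, which the article does not cover.

```python
def suggest_gcs_class(accesses_per_year: float) -> str:
    """Suggest a storage class from how often the data is expected to be read."""
    if accesses_per_year < 1:
        return "Coldline"   # accessed less than once a year
    if accesses_per_year < 12:
        return "Nearline"   # accessed less than once a month
    return "Regional or Multi-regional"  # frequently accessed data


if __name__ == "__main__":
    for rate in (0.5, 4, 200):
        print(rate, suggest_gcs_class(rate))
```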

IBM Cloud Object Storage

IBM Cloud provides scalable and flexible cloud storage with policy-driven archive abilities for unstructured data. This cloud storage service is intended for data archiving, for example for the long term retention of data that is infrequently accessed, including for mobile and web applications, and for backup and analytics.

IBM has four storage-class tiers integrated with an Aspera high-speed information transfer option. This allows for the easy transfer of data from and to Cloud Object Storage, and query-in-place functionality.

IBM Cloud Object Storage class tiers:

Standard storage—for active workloads that need high performance and low latency, and data that requires frequent and multiple access in a month. Usage scenarios are for example, active content repositories, analytics, mobile streaming and web content, collaboration and DevOps.
Vault storage—for less active workloads which need real-time, on-demand access but only infrequently, up to once a month. Use cases include digital asset retention and backup.
Cold vault—for cold workloads, where data needs on-demand, real-time access when needed but is mainly archived. For example, data that is accessed several times a year. Common use cases involve long-term backup, large data set preservation such as older media content and scientific data.
Flex storage—this class tier is used for dynamic workloads (combining cold and hot workloads), with data placed according to its access patterns. Typical use cases include cognitive workloads, cloud-native analytics and user-generated content applications.

Cloud Object Storage Pros and Cons

The following are some of the key advantages and disadvantages of object storage in the cloud.

Cloud Object Storage Pros

The key advantages of object storage include:

Data is highly distributed, which ensures it is more resilient to hardware failures or disasters. This way, it is available even if various nodes fail.
Objects are kept in a flat address space, which minimizes complexity and scalability issues.
Data protection is built into this architecture in the form of erasure coding or replication technology.
Object storage is most suitable for cloud storage and static data. Common use cases for object storage include archiving and cloud backup—the technology functions best with data that is more frequently read than written to.
Object storage has developed to the point where it scales at the exabyte level and represents trillions of objects. The use of VMs or commodity hardware enables nodes to be added easily, with the disk space being used more efficiently.
Object storage systems use object IDs (OIDs) to access any piece of data without knowing which physical storage device, directory, or file system it resides on. This abstraction lets object storage devices operate with storage hardware configured in a distributed node architecture, so processing power can scale together with data storage capacity.
I/O requests don’t need to pass through a central controller, allowing for a truly global storage system in which large amounts of data are managed as objects, physically kept anywhere, and retrieved over the internet or a WAN.

Cloud Object Storage Cons

The key disadvantages of object storage include:

Object storage systems are not consistent enough for real-time systems such as transactional databases. An environment or application with a high transaction rate is a poor fit for object storage.
Object storage doesn’t guarantee that read requests will produce the most up-to-date version of the data.
This technology isn’t always appropriate for applications that have high performance demands.
Cloud-based storage often ends up being more expensive because you need to pay for storage on an ongoing basis. With on-premises equipment you pay once and the storage is yours.

Bring Object Storage On-Premises with Cloudian

Learn more about Cloudian® HyperStore®.

Fight Kubernetes Ransomware with Kasten and Cloudian

Adam Bergh

Cloud Native Technical Partnerships at Kasten by Veeam

LinkedIn Profile

Amit Rawlani

Director Technology Alliances, Product & Solution Marketing, Cloudian Inc.

LinkedIn Profile

The threat of ransomware should be thought of as a serious problem for all enterprises. According to an annual report on global cyber security, there were 304 million ransomware attacks worldwide in 2020, a 62% increase from 2019. While most IT organizations are aware of the continuously rising threat of ransomware on traditional applications and infrastructure, modern applications running on Kubernetes are also at risk. The rapid rise of critical applications and data moving into Kubernetes clusters has caught the attention of those seeking to exploit what is perceived to be a new and emerging space. This can leave many organizations ill-prepared to fight back.

Kubernetes Vulnerabilities

Kubernetes itself and many of the most common applications that run in Kubernetes are open-source products. Open source means that the underlying code that makes up the applications is freely available for anyone to review and search for potential vulnerabilities. While not overly common, exploitable bugs in open-source products can be discovered by malicious actors. In addition, misconfigured access controls can unintentionally lead to unauthorized access to applications or even the entire cluster. Kubernetes is updated quarterly, and some applications as often as every week, so it’s crucial for organizations to stay up to date with patching.

Surprisingly, many organizations that use Kubernetes don’t yet have a backup and recovery solution in place — which is a last line of defense against an attack. As ransomware becomes more sophisticated, clusters and applications are at risk of being destroyed, and without a means to restore them, you could suffer devastating data and application loss in the case of an attack.

What to Look for In a Kubernetes Ransomware Protection Platform

When looking for an effective defense against ransomware in your K8s environment, think about these core capabilities:

Backup integrity and immutability: Since backup is your last line of defense, it’s important that your backup solution is reliable, and it’s critical to be confident that your backup target storage locations contain the information you need to recover applications in case of an attack. Having guaranteed immutability of your backup data is a must.
High-performance recovery: No one wants to pay a ransom because it was faster to decrypt the data than to recover it from the backup system. The ability to recover resources quickly is critical, as the cost of ransom typically increases over time. You need to be confident that your recovery performance can meet target requirements even as the amount of data grows.
Operational simplicity: Operations teams must work at scale across multiple clusters in hybrid environments that span cloud and on-premises locations. When you’re working in a high-pressure environment following a ransomware attack, simplicity of operations becomes paramount.

Cloudian and Kasten by Veeam Have the Solution

Kasten by Veeam and Cloudian have teamed up to bring a truly cloud-native approach to this mission-critical problem. The Kasten K10 data management software platform has been purpose-built for Kubernetes. K10’s deep integrations with Kubernetes distributions and cloud storage systems provide for protection and mobility of your entire Kubernetes application. Cloudian’s HyperStore is an enterprise-grade S3-compatible object storage platform running in your data center. Cloudian makes it easy to use private cloud storage to protect your Kubernetes applications with a verified integration with Kasten. With native support of the cloud standard S3 API, including S3 Object Lock data immutability, Kasten and Cloudian offer seamless protection for modern applications at up to 70% less cost than public cloud.

Kasten Cloudian blog diagram 1

Fast recovery: Cloudian provides a local, disk-based object storage target for backing up modern apps using Kasten K10 over your local, high-speed network. The solution lets you back up and restore large data sets in a fraction of the time required for public cloud storage, leading to enhanced Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).

Security and Ransomware Protection

Cloudian is a hardened object storage system that includes enhanced security features such as secure shell, encryption, integrated firewall and RBAC/IAM access controls to protect backup copies against malware. In addition, to protect data from ransomware attacks, Cloudian HyperStore and Kasten support Object Lock for air-tight data immutability all the way up to the operating system root level.

Kasten-Validated Solution

Cloudian is Kasten-validated to ensure trouble-free integration. Kasten’s native support for the S3 API enables seamless integration with Cloudian HyperStore.

Easy as 1-2-3

Setting up Kasten K10 and Cloudian Ransomware Protection is as simple as 3 easy steps:

1. Create a new target bucket on Cloudian HyperStore and enable Object Lock.

Kasten Cloudian blog diagram 2

2. After Kasten K10 installation, check the “Enable Immutable Backups” box when adding a target S3 object storage bucket.

Kasten Cloudian blog diagram 3

3. Validate the Cloudian object storage bucket and specify your protection period.
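
For readers who prefer to script step 1, the underlying S3 calls can be sketched as follows. This is a minimal illustration, assuming the boto3 SDK is installed and that your HyperStore endpoint accepts the standard S3 Object Lock parameters; the endpoint URL, bucket name, COMPLIANCE mode, and retention period are placeholders, and `retention_params` and `create_locked_bucket` are hypothetical helper names. Kasten K10 applies retention itself once immutable backups are enabled, so the per-object helper only shows what that protection looks like at the API level.

```python
from datetime import datetime, timedelta, timezone


def retention_params(days, now=None, mode="COMPLIANCE"):
    """Per-object Object Lock parameters for an S3 PutObject call."""
    if days < 1:
        raise ValueError("days must be at least 1")
    now = now or datetime.now(timezone.utc)
    return {
        "ObjectLockMode": mode,
        # The object cannot be deleted or overwritten before this time.
        "ObjectLockRetainUntilDate": now + timedelta(days=days),
    }


def create_locked_bucket(endpoint_url, bucket):
    """Create a bucket with Object Lock enabled (step 1 above).

    Assumes boto3 is installed and credentials are set in the environment.
    """
    import boto3  # lazy import: only needed when talking to a real endpoint

    s3 = boto3.client("s3", endpoint_url=endpoint_url)
    s3.create_bucket(Bucket=bucket, ObjectLockEnabledForBucket=True)
    return s3
```

A locked write would then look like `s3.put_object(Bucket=bucket, Key=key, Body=data, **retention_params(30))`, where the 30-day period stands in for whatever protection period you choose in step 3.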

GET STARTED WITH KASTEN K10 TODAY!

Try the full-featured and free Edition of Kasten K10 with a fast and easy install.
Watch and read customer webinars and case studies.
Kasten K10 Data Sheet
Try Cloudian free in your data center for 45 days
Cloudian + Kasten By Veeam for Ransomware Protection

NAS Backup & Archive Solution with Rubrik NAS Cloud Direct

Cloudian and Rubrik are simplifying enterprise data protection with a best-in-class NAS backup and archival solution that combines Cloudian HyperStore and Rubrik’s NAS Cloud Direct. This simple solution makes it easy to manage and migrate massive amounts of NAS data to Cloudian on-prem storage without impacting production environments. Cost-effective and highly scalable, this solution delivers new levels of operational efficiency and flexibility to solve challenges for large-scale NAS data management.

With the surging growth in NAS data volumes, the need for an affordable, simple and cost-effective approach to data life cycle and storage management at scale has never been greater. Enterprise organizations must be able to store massive amounts of data while also ensuring that data moving across data centers and to the cloud is simple, seamless, and secure.

Combining Cloudian HyperStore with Rubrik NAS Cloud Direct, a software-only product with a direct-to-object capability, provides a single data management fabric with automated, policy-based protection and allows users to store their NAS backup and archive data in one or multiple geographically separated regions or data centers. Enterprises can extend and scale their Cloudian capacity as needed and non-disruptively while keeping NAS data storage costs to a minimum.

Rubrik NAS Cloud Direct is deployed as a virtual machine and can be up and protecting data from any local or remote NAS platform to Cloudian HyperStore within minutes.

At any scale – from terabytes to petabytes of data and millions to billions of files – Cloudian HyperStore and NAS Cloud Direct eliminate the complexity of tape solutions and the vendor lock-in of disk-to-disk backup solutions, all at a lower cost.

Learn more about this new solution: Download Brief

See how Cloudian and Rubrik are collaborating: https://www-cloudian-com-staging.go-vip.net/rubrik/

Learn more about Cloudian® HyperStore®

What is Object Storage: Definition, How It Works, and Use Cases

What is Object Storage?

Object storage is relatively new when compared with more traditional storage systems such as file or block storage. So, what is object storage, exactly? In short, it is storage for unstructured data that eliminates the scaling limitations of traditional file storage. Limitless scale is the reason that object storage is the storage of the cloud. All of the major public cloud services, including Amazon, Google and Microsoft, employ object storage as their primary storage.

This is part of an extensive series of guides about data security.

In this article:

Object Storage Definition
Object Storage Architecture: How Does It Work?
Object Storage Benefits
Object Storage Use Cases
Selecting the Best Object-Based Storage Solution

Object Storage Definition

Object storage is a technology that manages data as objects. All data is stored in one large repository which may be distributed across multiple physical storage devices, instead of being divided into files or folders.

It is easier to understand object-based storage when you compare it to more traditional forms of storage – file and block storage.

File storage

File storage stores data in folders. This method, also known as hierarchical storage, simulates how paper documents are stored. When data needs to be accessed, a computer system must look for it using its path in the folder structure.

File storage uses TCP/IP as its transport, and devices typically use the NFS protocol in Linux and SMB in Windows.

Block storage

Block storage splits a file into separate data blocks, and stores each of these blocks as a separate data unit. Each block has an address, and so the storage system can find data without needing a path to a folder. This also allows data to be split into smaller pieces and stored in a distributed manner. Whenever a file is accessed, the storage system software assembles the file from the required blocks.

Block storage uses FC or iSCSI for transport, and devices operate as direct attached storage or via a storage area network (SAN).

Object storage

In object storage systems, data blocks that make up a file or “object”, together with its metadata, are all kept together. Extra metadata is added to each object, which makes it possible to access data with no hierarchy. All objects are placed in a unified address space. In order to find an object, users provide a unique ID.

Object-based storage uses TCP/IP as its transport, and devices communicate using HTTP and REST APIs.

Metadata is an important part of object storage technology. Metadata is determined by the user, and allows flexible analysis and retrieval of the data in a storage pool, based on its function and characteristics.

The main advantage of object storage is that you can group devices into large storage pools, and distribute those pools across multiple locations. This not only allows unlimited scale, but also improves resilience and high availability of the data.

Object Storage Architecture: How Does It Work?

Object storage is fundamentally different from traditional file and block storage in the way it handles data. In an object storage system, each piece of data is stored as an object, which contains both the data itself and a unique identifier, known as an object ID. This ID allows the system to locate and retrieve the object without relying on hierarchical file structures or block mappings, enabling faster and more efficient data access.

Object storage architecture typically consists of three main components: the data storage layer, the metadata index, and the API layer. Let’s take a closer look at each of these components and how they work together to create a powerful and flexible storage solution.

Data Storage Layer

The data storage layer is where the actual data objects are stored. In an object storage system, data is typically distributed across multiple storage nodes to ensure high performance, durability, and redundancy. Each storage node typically contains a combination of hard disk drives (HDDs) and solid-state drives (SSDs) to provide the optimal balance between capacity, performance, and cost. Data objects are automatically replicated across multiple nodes, ensuring that data remains available and protected even in the event of hardware failures or other disruptions.

Metadata Index

The metadata index is a critical component of object storage architecture, as it maintains a record of each object’s unique identifier, along with other relevant metadata, such as access controls, creation date, and size. This information is stored separately from the actual data, allowing the system to quickly and efficiently locate and retrieve objects based on their metadata attributes. The metadata index is designed to be highly scalable, enabling it to support millions or even billions of objects within a single object storage system.

API Layer

The API layer is responsible for providing access to the object storage system, allowing users and applications to store, retrieve, and manage data objects. Most object storage systems support a variety of standardized APIs, such as the Simple Storage Service (S3) API from Amazon Web Services (AWS), the OpenStack Swift API, and the Cloud Data Management Interface (CDMI). These APIs enable developers to easily integrate object storage into their applications, regardless of the underlying storage technology or vendor.
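
To make the three components concrete, here is a toy in-memory model. It is a sketch only, not how HyperStore or any production system is implemented: the class name `TinyObjectStore` and its methods are hypothetical, and real systems distribute both layers across many nodes.

```python
import uuid


class TinyObjectStore:
    """Toy in-memory model of the three layers; not a real storage system."""

    def __init__(self):
        self._data = {}   # data storage layer: object ID -> bytes
        self._index = {}  # metadata index: object ID -> metadata dict

    def put(self, body, **metadata):
        """API layer: store an object and return its generated ID."""
        object_id = uuid.uuid4().hex
        self._data[object_id] = bytes(body)
        self._index[object_id] = {"size": len(body), **metadata}
        return object_id

    def get(self, object_id):
        """API layer: fetch an object's bytes directly by ID, no path walk."""
        return self._data[object_id]

    def find(self, **criteria):
        """Search the metadata index without reading any object data."""
        return [
            oid for oid, meta in self._index.items()
            if all(meta.get(k) == v for k, v in criteria.items())
        ]
```

Note that `find` touches only the index, which is the point of keeping metadata separate from the data: a query such as `store.find(kind="xray")` never has to open the objects themselves.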

5 Expert Tips

Jon Toor, CMO

With over 20 years of storage industry experience in a variety of companies including Xsigo Systems and OnStor, and with an MBA in Mechanical Engineering, Jon Toor is an expert and innovator in the ever growing storage space.

Leverage lifecycle policies to manage storage costs
Implement object lifecycle management to automatically transition objects between storage classes based on their age or access patterns. This can help you reduce storage costs by moving infrequently accessed data to colder storage tiers.

Optimize metadata for faster search and analytics
Invest time in designing your object metadata schema. Adding meaningful, searchable metadata can dramatically enhance retrieval speed and enable powerful analytics without needing to process the entire object.

Use erasure coding for efficient data protection
While replication is common, erasure coding provides more efficient storage utilization, especially in environments with large datasets. It offers high durability while using less storage space than simple replication.

Enable versioning for data integrity and compliance
Activate object versioning to protect against accidental overwrites or deletions. This is critical for compliance in industries where data integrity is required over long retention periods.

Implement policy-driven data tiering
Automate data movement between hot, warm, and cold storage using policy-based rules. This approach allows you to maximize cost efficiency by aligning storage costs with data value and access frequency.
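
The lifecycle and tiering tips above can be expressed as a rule document in the shape used by the S3 lifecycle configuration API. The sketch below only builds that document; the rule ID, the `logs/` prefix, the day counts, and the `GLACIER` storage class name are illustrative assumptions, and the storage classes your platform accepts may differ. `lifecycle_rule` is a hypothetical helper name.

```python
def lifecycle_rule(rule_id, prefix, transition_days, storage_class, expire_days=None):
    """Build one rule in the shape used by the S3 lifecycle configuration API."""
    if expire_days is not None and expire_days <= transition_days:
        raise ValueError("expiration must come after the transition")
    rule = {
        "ID": rule_id,
        "Filter": {"Prefix": prefix},  # apply only to keys under this prefix
        "Status": "Enabled",
        "Transitions": [{"Days": transition_days, "StorageClass": storage_class}],
    }
    if expire_days is not None:
        rule["Expiration"] = {"Days": expire_days}
    return rule


# Move objects under logs/ to a colder class after 30 days, delete after a year.
config = {"Rules": [lifecycle_rule("archive-logs", "logs/", 30, "GLACIER", 365)]}
```

With boto3, a document like `config` would be passed as the `LifecycleConfiguration` argument of `put_bucket_lifecycle_configuration`.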

Object Storage Benefits

Exabyte Scalable

Unlike file or block storage, object storage services enable scalability that goes beyond exabytes. While file storage can hold many millions of files, you will eventually hit a ceiling. With unstructured data growing at 50+% per year, more and more users are hitting those limits, or they expect to in the future.

Scale Out Architecture

Object storage makes it easy to start small and grow. In enterprise storage, a simple scaling model is golden. And scale-out storage is about as simple as it gets: you simply add another node to the cluster and that capacity gets folded into the available pool.

HyperStore is an S3-compatible storage system. HyperFile is a connector that allows files to be stored on HyperStore.

Customizable Metadata

While file systems have metadata, the information is limited and basic (date/time created, date/time updated, owner, etc.). Object storage allows users to customize and add as many metadata tags as they need to easily locate the object later. For example, an X-ray could have information about the patient’s age and height, the type of injury, etc.
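
With the S3 API, user-defined metadata travels with the object as HTTP headers prefixed with `x-amz-meta-`. The sketch below shows that mapping for the X-ray example; the tag names and values are made up for illustration, and `to_s3_metadata_headers` is a hypothetical helper name.

```python
def to_s3_metadata_headers(metadata):
    """Map user metadata onto x-amz-meta-* headers used by S3 PUT requests."""
    headers = {}
    for key, value in metadata.items():
        # Header values are strings, so numbers are converted explicitly.
        headers["x-amz-meta-" + key.lower()] = str(value)
    return headers


xray_tags = {"patient-age": 54, "injury-type": "fracture", "modality": "xray"}
headers = to_s3_metadata_headers(xray_tags)
```

SDKs usually hide this step: in boto3, for example, the same tags would be passed as the `Metadata` argument of `put_object`.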

High Sequential Throughput Performance

Early object storage systems did not prioritize performance, but that’s now changed. Now, object stores can provide high sequential throughput performance, which makes them great for streaming large files. Also, object storage services help eliminate networking limitations. Files can be streamed in parallel over multiple pipes, boosting usable bandwidth.
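
Parallel streaming typically relies on HTTP range requests: the client splits the object into byte ranges and fetches each one over its own connection. The following sketch computes the `Range` header values for a given object size and number of streams; `byte_ranges` is a hypothetical helper name, and the actual download logic is left out.

```python
def byte_ranges(object_size, parts):
    """Split an object into contiguous Range header values, one per stream."""
    if object_size < 1 or parts < 1:
        raise ValueError("object_size and parts must be positive")
    parts = min(parts, object_size)  # never create empty ranges
    base, extra = divmod(object_size, parts)
    ranges, start = [], 0
    for i in range(parts):
        length = base + (1 if i < extra else 0)
        end = start + length - 1  # HTTP byte ranges are inclusive
        ranges.append(f"bytes={start}-{end}")
        start = end + 1
    return ranges
```

Each returned value would be sent as the `Range` header of a separate GET request, and the responses are reassembled in order on the client.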

Flexible Data Protection Options

To safeguard against data loss, most traditional storage options utilize fixed RAID groups (groups of hard drives joined together), sometimes in combination with data replication. The problem is, these solutions generally lead to one-size-fits-all data protection. You cannot vary the protection level to suit different data types.

Object storage solutions employ a flexible tool called erasure coding that is similar to old-fashioned RAID in some ways, but is far more flexible. Data is striped across multiple drives or nodes as needed to achieve the needed protection for that data type. Between erasure coding and configurable replication, data protection is both more robust and more efficient.
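
The efficiency difference is easy to quantify. With erasure coding, an object is split into k data fragments plus m parity fragments, so raw capacity consumed is (k + m) / k times the object size and up to m fragments can be lost. The sketch below compares that with simple replication; the 4+2 layout and three-copy replication are illustrative choices, not a statement about any product's defaults.

```python
def erasure_coding(data_fragments, parity_fragments):
    """Return (raw-to-usable storage ratio, fragment losses tolerated)."""
    total = data_fragments + parity_fragments
    return total / data_fragments, parity_fragments


def replication(copies):
    """Return (raw-to-usable storage ratio, copy losses tolerated)."""
    return float(copies), copies - 1


ec_overhead, ec_tolerated = erasure_coding(4, 2)   # 1.5x raw capacity, survives 2 losses
rep_overhead, rep_tolerated = replication(3)       # 3.0x raw capacity, survives 2 losses
```

Both layouts survive two simultaneous failures, but the erasure-coded layout needs half the raw capacity of three-way replication.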

Support for the S3 API

Back when object storage solutions were launched, the interfaces were proprietary. Few application developers wrote to these interfaces. Then Amazon created the Simple Storage Service, or “S3”. They also created a new interface, called the “S3 API”. The S3 API has since become a de facto standard for object storage data transfer.

The existence of a de facto standard changed the game. Now, S3-compatible application developers have a stable and growing market for their applications. And service providers and S3-compatible storage vendors such as Cloudian have a growing user set deploying those applications. The combination sets the stage for rapid market growth.

Lower Total Cost of Ownership (TCO)

Cost is always a factor in storage. And object storage services offer the most compelling story, both in hardware/software costs and in management expenses. By allowing you to start small and scale, this technology minimizes waste, both in the form of extra headcount and unused space. Additionally, object storage systems are inherently easy to manage. With limitless capacity within a single namespace, configurable data protection, geo replication, and policy-based tiering to the cloud, it’s a powerful tool for large-scale data management.

To learn more about Cloudian’s fully native S3-compatible storage in your data center, and how it can cut down your TCO, check out our free trial. Or visit cloudian.com for more information.

Object Storage Use Cases

There are numerous use cases for object storage, thanks to its scalability, flexibility, and ease of use. Some of the most common use cases include:

Backup and archiving
Object storage is an excellent choice for storing backup and archive data, thanks to its durability, scalability, and cost-effectiveness. The ability to store custom metadata with each object allows organizations to easily manage retention policies and ensure compliance with relevant regulations.

Big data analytics
The horizontal scalability and programmability of object storage make it a natural choice for storing and processing large volumes of unstructured data in big data analytics platforms. Custom metadata schemes can be used to enrich the data and enable more advanced analytics capabilities.

Media storage and delivery
Object storage is a popular choice for storing and delivering media files, such as images, video, and audio. Its scalability and performance make it well-suited to handling large volumes of media files, while its support for various data formats and access methods enables seamless integration with content delivery networks and other media delivery solutions.

Internet of Things (IoT)
As the number of connected IoT devices continues to grow, so too does the amount of data they generate. Object storage is well-suited to handle the storage and management of this data, thanks to its scalability, flexibility, and support for unstructured data formats.

How to Choose an Object-Based Storage Solution

When choosing an object storage solution, there are several factors to consider. Some of the most important factors include:

Scalability: One of the primary strengths of object storage is its ability to scale horizontally, so it’s essential to choose a platform that can grow with your organization’s data needs. Look for a solution that can easily accommodate massive amounts of data without sacrificing performance or manageability.
Data durability and protection: Ensuring the integrity and availability of your data is critical, so look for an object storage platform that offers robust data protection features, such as erasure coding, replication, or versioning. Additionally, consider the platform’s durability guarantees – how likely is it that your data will be lost or corrupted?
Cost: Cost is always a consideration when choosing a storage solution, and object storage is no exception. Be sure to evaluate the total cost of ownership (TCO) of the platform, including factors such as hardware, software, maintenance, and support costs. Additionally, if you’re considering a cloud-based solution, be sure to factor in the costs of data transfer and storage.
Performance: While object storage is not typically designed for high-performance, low-latency workloads, it’s still important to choose a platform that can deliver acceptable performance for your organization’s specific use cases. Consider factors such as throughput, latency, and data transfer speed when evaluating performance.
Integration and compatibility: The ability to integrate the object storage platform with your existing infrastructure and applications is essential. Look for a solution that supports industry-standard APIs and protocols, as well as compatibility with your organization’s preferred development languages and tools.

See Additional Guides on Key Data Security Topics

Together with our content partners, we have authored in-depth guides on several other topics that can also be useful as you explore the world of data security.

EDR

Authored by Cynet

Ransomware Protection

Authored by Cynet

AWS SQL

Authored by NetApp

Learn More About Object Storage

Object Storage vs. File Storage: What’s the Difference?

Object Storage vs. Block Storage: What’s the Difference?

6 Best Practices for Object Storage Deployment

How Object Storage Protects You From Ransomware

Enhancing Object Storage Analytics: Adding Metadata Labels to S3 Images with TensorFlow

S3 Compatible Storage Solutions Compared

Understanding Cloud Native Storage

Object Storage in the Cloud: 4 Providers Compared

Cloudian S3 Compatible Enterprise Object Storage – Watch 1 min Overview Video

Are You Prepared For A Ransomware Attack Against Your Data?

Is your data really protected and safe against ransomware attacks? Take this quick ransomware assessment to find out if you’re adequately prepared and if your data is safe and protected.

Grant Jacobson, Director of Technology Alliances and Partner Marketing, Cloudian

View LinkedIn Profile

Find Out If You’re Prepared For A Ransomware Attack Against Your Data!

Do you know if your data is really protected and safe against ransomware attacks which have become one of the top cybersecurity threats facing organizations around the world? With multiple attacks now occurring every minute and an increasing level of ransom payments and other business costs, having the proper cybersecurity protections has become an urgent priority against this accelerating and costly threat.

Organizations that have cybersecurity strategies may feel protected, but they can still be vulnerable. For example, while much has been written about network perimeter and firewall security measures, these defenses typically do nothing to protect data once they are breached.

What would you do if you discovered that your organization’s data was attacked, encrypted and held hostage for ransom? Do you know if you’re adequately prepared and if your data is safe and protected? It’s well worth checking to know just how secure your data storage is and whether your protections are up to date.

Spend just a few minutes to find out with this cybersecurity assessment, below. Upon completion, you will receive a score and a report of your answers along with suggestions for improvement, if needed.

Take the survey now.

5 Reasons Ransomware Protection Needs to Be a Board-Level Conversation

It is not just the responsibility of the IT/IS department to keep the business safe, but the obligation of every CXO and Board member to ask for and implement stringent cyber security measures starting with zero trust, perimeter security, and employee training.

Amit Rawlani, Director of Solutions & Technology Alliances, Cloudian

View LinkedIn Profile

“We are on the cusp of a global pandemic,” Christopher Krebs, the first director of the Cybersecurity and Infrastructure Security Agency (CISA), told Congress in May of 2021. Krebs wasn’t talking about a virus-created pandemic; rather, he was referring to the pandemic of cyber-attacks and data breaches. This warning rang especially true when the Colonial Pipeline ransomware attack crippled the US energy sector the following week.

Your files are encrypted

For the uninitiated, ransomware is the fastest growing malware threat, targeting users and organizations of all types. It works by encrypting the user’s data, rendering the source data and backup data useless, and then demanding a ransom, holding the data hostage until payment is received. Payments are usually demanded in untraceable cryptocurrencies, which can (and in many cases do) end up with state-sponsored bad actors.

Today, protection against and mitigation for a ransomware attack are information technology and information security responsibilities with the C-Suite and Board taking a relatively hands-off approach. But that must change and in some cases is already changing. Here’s why C-Suite and Board members should take this threat seriously and be the driving force to protect the organization against ransomware.

1. To Pay or not to Pay: Financial Impacts of Ransomware

Ransomware impacts organizations of all sizes, across all industries. The security company Sophos⁽¹⁾ found that 51% of the companies surveyed answered in the affirmative when asked if they were attacked by ransomware in 2020, the year of the pandemic. In 73% of those cases, data was successfully encrypted, thereby bringing the business to its knees. More than a quarter of the respondents (26%) admitted to paying the ransom, at an average of $761K per incident, a huge increase from previous years, when a similar report had pegged the average at $133K.

The financial implication of paying the ever-increasing ransom demands aside, the real impact of ransomware is on the business itself. It cripples businesses and renders services ineffective and undeliverable. There is also the threat of data exfiltration which can expose sensitive customer data and leave the organization open to lawsuits and additional financial penalties. This does not even account for the loss of business due to downtime, or the brand damage that the ransomware can cause.

These impacts alone will rope in the Director of IT or IS, CFO, General Counsel, Public Relations, Chief Privacy Officer, CIO, and CISO. The CEO will also be roped in and will have to break the news to her board of directors. It would be far better if she remembers this as the day she was able to say, “We were prepared. We already have the business back up and running. We will not be paying.”

2. The Moral (and Regulatory) Low Ground of Paying a Ransom

Then there is the moral and regulatory dilemma associated with paying a ransom. This practice is actively discouraged by US government agencies, as it encourages and fosters similar and copycat attacks. Added to this is the October 2020 advisory from the Department of the Treasury⁽²⁾, OFAC (Office of Foreign Assets Control), and FinCEN (Financial Crimes Enforcement Network), which discusses “Potential Sanctions Risks for Facilitating Ransomware Payments”. Given that most ransomware payments are untraceable, this exposes organizations, executives, and board members to US government sanctions violations.

3. Cyber Insurance: How to Get, Keep, and Save on This Must-Have for Business Continuity

Cyber security insurance, the fastest growing insurance segment, is another important consideration. As a safeguard, most large organizations require cyber insurance as part of their cyber defense strategy. But insurance companies are not immune to US sanctions violations if a payment is made to rogue nations. Therefore, premiums for ransomware coverage are high, or policies may require up to 50% coinsurance. In some cases, insurers may not even cover businesses unless they are able to show significant cyber security arrangements, along with data immutability, as part of their cyber security plans.

4. The Human Cost of Ransomware

Finally, in addition to its business, insurance, and regulatory impacts, the most reprehensible danger of ransomware is its human impact. This applies across all industries. From disrupted critical utilities in the energy sector and declined credit card and bank transactions in the financial sector to delayed patient care, emergency treatments, and even death in the healthcare sector, the impact of ransomware is real, direct, and all too inhumane.

5. Getting Organized: Plan, Don’t Pay

Without a regularly drilled, top-down plan on how a business will respond to a ransomware attack, an organization is going to make mistakes in the heat of an attack. It will pay the costs of those mistakes, whether to masked malware attackers, through ransomware-induced PR nightmares, or via increased cyber insurance premiums levied for lack of proper preparation and protection. It is not just the responsibility of the IT/IS department to keep the business safe, but the obligation of every CXO and Board member to ask for and implement stringent cyber security measures starting with zero trust, perimeter security, and employee training. But don’t forget to protect the attackers’ ultimate prize, your backup data, in immutable WORM storage.

For all these reasons, ransomware MUST be a C-suite and Board-led conversation. Forrester analysts write: “Implementing an immutable file system with underlying WORM storage will make the system watertight from a ransomware protection perspective.” Data immutability through WORM features such as S3 Object Lock is also now a requirement for many cyber insurance policies to cover the threat of ransomware.

To learn more about solutions for ransomware protection, please visit
https://www-cloudian-com-staging.go-vip.net/lp/lock-ransomware-out-keep-data-safe-ent/

$500 Billion in Lost Market Value: VC Firm Estimates Impact of Public Cloud Costs

VC firm Andreessen Horowitz examined the impact of public cloud costs on public company financials and found that they reduce the total market value for those companies using cloud at scale by at least $500 billion.

Jon Toor, CMO, Cloudian

We believe cloud computing and on-prem computing will always co-exist. A recent article from the venture capital firm Andreessen Horowitz makes a compelling case for that. The article (“The Cost of Cloud, a Trillion Dollar Paradox”) examined the impact of public cloud costs on public company financials and found that they reduce the total market value for those companies using cloud at scale by at least $500 billion.

Here are some of the article’s key findings:

“If you’re operating at scale, the cost of cloud can at least double your infrastructure bill.”: The authors note that public cloud list prices can be 10-12X the cost of running your own data centers. Although use-commitment and volume discounts can reduce the difference, the cloud is still significantly more expensive.
“Some companies we spoke with reported that they exceeded their committed cloud spend forecast by at least 2X.”: Cloud spend can be hard to predict, resulting in spending that often exceeds plan. Companies surveyed for the article indicate that actual spend is often 20% higher than committed spend and at least 2X in some cases.
“Repatriation results in one-third to one-half the cost of running equivalent workloads in the cloud.”: This takes into account the TCO of everything from server racks, real estate, and cooling to network and engineering costs.
“The cost of cloud ‘takes over’ at some point, locking up hundreds of billions of market cap that are now stuck in this paradox: You’re crazy if you don’t start in the cloud; you’re crazy if you stay on it.”: While public cloud delivers on its promise early on, as a company scales and its growth slows, the impact of cloud spend on margins can start to outweigh the benefits. Because this shift happens later in a company’s life, it’s difficult to reverse.
“Think about repatriation upfront.”: By the time cloud costs start to catch up to or even outpace revenue growth, it’s too late. Even modest or modular architectural investment early on reduces the work needed to repatriate workloads in the future. In addition, repatriation can be done incrementally, and in a hybrid fashion.
“Companies need to optimize early, often, and, sometimes, also outside the cloud.”: When evaluating the value of any business, one of the most important factors is the cost of goods sold (COGS). That means infrastructure optimization is key.
“The popularity of Kubernetes and the containerization of software, which makes workloads more portable, was in part a reaction to companies not wanting to be locked into a specific cloud.”: Developers faced with larger-than-expected cloud bills have become more savvy about the need for greater rigor when it comes to cloud spend.
“For large companies — including startups as they reach scale — that [cloud flexibility] tax equates to hundreds of billions of dollars of equity value in many cases.”: This tax is levied long after the companies have committed themselves to the cloud. However, one of the primary reasons organizations have moved to the cloud early on – avoiding large CAPEX outlays – is no longer limited to public clouds. There are now data center alternatives that can be built, deployed, and managed entirely as OPEX.
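To make the repatriation figure above concrete, here is a minimal arithmetic sketch. It applies only the “one-third to one-half” range quoted from the article. The $12 million annual cloud bill is a hypothetical input, not a number from the article, and the function names are made up for the example.

```python
def repatriated_cost_range(annual_cloud_spend):
    """Estimated annual cost after repatriation: one-third to one-half of cloud spend."""
    return annual_cloud_spend / 3, annual_cloud_spend / 2


def savings_range(annual_cloud_spend):
    """Estimated annual savings as (smallest, largest), from the same quoted range."""
    low_cost, high_cost = repatriated_cost_range(annual_cloud_spend)
    return annual_cloud_spend - high_cost, annual_cloud_spend - low_cost


# Hypothetical company spending $12M per year on public cloud.
spend = 12_000_000
low_cost, high_cost = repatriated_cost_range(spend)
min_saving, max_saving = savings_range(spend)
print(f"Repatriated cost: ${low_cost:,.0f} to ${high_cost:,.0f} per year")
print(f"Estimated savings: ${min_saving:,.0f} to ${max_saving:,.0f} per year")
```

The estimate is only as good as the quoted range, which the article describes as a full TCO comparison. A real evaluation would substitute measured costs for both sides.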

In short, the article highlights the need to think carefully about which use cases are better suited for on-prem deployment. Public cloud can provide flexibility and scalability benefits, but at a cost that can significantly impact your company’s financial performance.

Cloudian was founded on the idea of bringing public cloud benefits to the data center, and we now have nearly 700 enterprise and service provider customers that have deployed our award-winning HyperStore object storage platform in on-prem and hybrid cloud environments. On-prem object storage can deliver public cloud-like benefits in your own data center, at lower cost and with performance, agility, security and control advantages. In addition, as long as the object storage is highly S3-compatible, it can integrate easily with public cloud in a hybrid cloud model.

To learn more about how we can help you find the right cloud storage strategy for your organization, visit cloudian.com/solutions/cloud-storage/. You can also read about deploying HyperStore on-prem with AWS Outposts at cloudian.com/aws.

LinkedIn Live: Secure Data with VMware vSAN & Cloudian HyperStore

Our joint solution combines Cloudian Object Storage with VMware’s vSAN Data Persistence platform through VMware Cloud Foundation with Tanzu. Adding Cloudian object storage software to vSAN is simple and easy, and serves any cloud-native or traditional IT application requiring S3-compatible storage.

Grant Jacobson, Director of Technology Alliances and Partner Marketing, Cloudian

Protecting Your Data with VMware vSAN and Cloudian HyperStore

Each month, VMware and Cloudian collaborate to promote our joint solution in a series of short (~15 minutes) LinkedIn Live sessions. Each session highlights a new solution use case. In today’s session, the fourth in our series, we talked about data protection and how to keep data safe. These are lively conversations about the solution and how our customers can take advantage of it to meet their evolving needs. Last month, we covered the new Splunk SmartStore use case, with a 44% TCO savings compared with traditional storage.

Our joint solution became available in February and combines Cloudian Object Storage with VMware’s vSAN Data Persistence platform through VMware Cloud Foundation with Tanzu. Adding Cloudian object storage software to vSAN is simple and easy, and serves any cloud-native or traditional IT application requiring S3-compatible storage. The solution enables many new use cases with Data Protection being one that cuts across all segments: everyone needs to ensure their data stays safe, especially from the accelerating increase in ransomware and other cyberattacks.

If you missed it, watch it here:

If you’d like more information about our solutions with VMware, see our dedicated webpage:
You can also reach us at vSAN@cloudian.com

Object Storage: Better Monetizing Content by Transitioning from Tape

As media organizations look for new ways to monetize their ever-growing content archives, they need to ask themselves whether they have the right storage foundation. In a recent article I wrote for Post Magazine, I discussed the advantages of object storage over tape when it comes to managing and protecting content. Below is a reprint of the article.

David Phillips, Principal Architect for M&E Solutions, Cloudian

Object Storage: Better Monetizing Content by Transitioning from Tape

Media and entertainment companies derive significant recurring revenue through old content. From traditional television syndication to YouTube uploads, this content can be distributed and monetized in several different ways. Many M&E companies, particularly broadcasters, store their content in decades-old LTO tape libraries. With years of material, including thousands of episodes and millions of digital assets, these tape libraries can grow so large that they become unmanageable. Deployments can easily reach several petabytes of data and may sprawl across multiple floors in a broadcaster’s media storage facility. Searching these massive libraries and retrieving specific content can be a cumbersome, time-consuming task, like trying to find a needle in a haystack.

Object storage provides a far simpler, more efficient and cost-effective way for broadcasters to manage their old video content. With limitless scalability, object storage can easily grow to support petabytes of data without occupying a large physical footprint. Moreover, the technology supports rich, customizable metadata, making it easier and quicker to search and retrieve content. Organizations can use a Google-like search tool to immediately retrieve assets, ensuring that they have access to all existing content, no matter how old or obscure, and can readily monetize that content.

Here’s a deeper look at how the two formats compare in searchability, data access, scalability and management.

Searchability and data access

LTO tape was created to store static data for the long haul. Accessing, locating and retrieving this data was always an afterthought. In the most efficient tape libraries today, staff may be able to find a piece of media within a couple of minutes. But even in this scenario, if multiple jobs are queued ahead of the request, finding that asset could take hours. And this assumes that the tape containing the asset is stored in the library and in good condition (i.e., it can be read and doesn’t suffer from a jam).

This also assumes the staff has the proper records to even find the asset. Because of the limitations of the format, LTO tape files do not support detailed metadata. This means that organizations can only search for assets using basic file attributes, such as date created or title. It’s impossible to conduct any sort of an ad hoc search. If a system’s data index doesn’t contain the file attributes that a user is looking for, the only option is to look manually, an untenable task for most M&E organizations that have massive content libraries. This won’t change in the future, as tape cannot support advanced technologies such as artificial intelligence (AI) and machine learning (ML) to improve searchability.

On the other hand, object storage makes it possible to immediately search and access assets. The architecture supports fully-customizable metadata, allowing staff to attach any attributes they want to any asset, no matter how specific. For example, a news broadcast could have metadata identifying the anchors or describing the type of stories covered. When trying to find an asset, a user can search for any of those attributes and rapidly retrieve it. This makes it much easier to find old or existing content and use it for new monetization opportunities, driving much greater return on investment (ROI) from that content. This value will only increase as AI and ML, which are both fully supported in object storage systems, provide new ways to analyze and leverage data (e.g., facial recognition, speech recognition and action analysis), increasing opportunities to monetize archival content.
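The news broadcast example above can be sketched in a few lines. This is a simplified in-memory model of attribute search, not the query interface of any particular object store. The asset keys, anchor names and attribute names are all hypothetical. Real S3 user-defined metadata is a set of string key/value pairs, so the list-valued `anchors` attribute here is a convenience for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    """A stored object plus its custom metadata (hypothetical model)."""
    key: str
    metadata: dict = field(default_factory=dict)


def _matches(actual, wanted):
    # List-like attributes match when they contain the wanted value.
    if isinstance(actual, (list, tuple, set)):
        return wanted in actual
    return actual == wanted


def search(assets, **criteria):
    """Return the keys of assets whose metadata satisfies every criterion."""
    return [
        asset.key
        for asset in assets
        if all(_matches(asset.metadata.get(name), wanted)
               for name, wanted in criteria.items())
    ]


# Hypothetical archive entries.
archive = [
    Asset("news/2019-03-04-evening.mxf",
          {"anchors": ["A. Rivera", "J. Chen"], "story_type": "election"}),
    Asset("news/2020-07-11-morning.mxf",
          {"anchors": ["J. Chen"], "story_type": "weather"}),
    Asset("sports/2018-final.mxf", {"story_type": "sports"}),
]

print(search(archive, anchors="J. Chen"))
print(search(archive, anchors="J. Chen", story_type="weather"))
```

The point of the sketch is that any attribute attached at ingest time becomes a search key later, which is what a fixed tape index cannot offer.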

Scalability and management

Organizations must commit significant staff and resources to manage and grow an LTO tape library. Due to their physical complexity, these libraries can be difficult and expensive to scale. In the age of streaming, broadcasters are increasing their content at breakneck speed. And with the adoption of capacity-intensive formats like 4K, 8K and 360/VR, more data is being created for each piece of content. Just several hundred hours of video in these advanced formats can easily reach a petabyte in size. In LTO environments, the only way to increase capacity is to add more tapes, which is particularly difficult if there are no available library slots. When that’s the case, the only choice is to add another library. Many M&E companies’ tape libraries already stretch across several floors, leaving little room for expansion, especially because new content (in higher resolution formats) tends to use larger data quantities than older content.

Object storage was designed for limitless scalability. It treats data as objects that are stored in a flat address space, which makes it easy to grow deployments via horizontal scaling (or scaling out) rather than vertical scaling (scaling up). To increase a deployment, organizations simply have to add more nodes or devices to their existing system, rather than adding new systems (such as LTO libraries) entirely. Because of this, object storage is simple to scale to hundreds of petabytes and beyond. With data continuing to grow exponentially, especially for video content, being able to scale easily and efficiently helps M&E companies maintain order and visibility over their content, enabling them to easily find and leverage those assets for new opportunities. Increasing the size of a sprawling, messy tape library is exactly the opposite.
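One common way to place objects from a flat address space onto a growing pool of nodes is consistent hashing. The sketch below uses it purely to illustrate scaling out. The article does not say which placement scheme any given product uses, and the node names, key names and `HashRing` class are hypothetical. It shows that adding a node moves only a share of the objects, and all of them onto the new node.

```python
import hashlib
from bisect import bisect_right


class HashRing:
    """Minimal consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=64):
        self.vnodes = vnodes
        self._ring = []    # sorted (hash, node) points
        self._hashes = []  # hashes only, kept in step with _ring for bisect
        for node in nodes:
            self.add_node(node)

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def add_node(self, node):
        # Each node owns several points on the ring to even out the load.
        self._ring.extend((self._hash(f"{node}#{i}"), node) for i in range(self.vnodes))
        self._ring.sort()
        self._hashes = [h for h, _ in self._ring]

    def node_for(self, key):
        # An object belongs to the first ring point clockwise from its hash.
        idx = bisect_right(self._hashes, self._hash(key)) % len(self._ring)
        return self._ring[idx][1]


ring = HashRing(["node-1", "node-2", "node-3"])
keys = [f"asset-{i}" for i in range(1000)]
before = {k: ring.node_for(k) for k in keys}

ring.add_node("node-4")  # scale out by one node
after = {k: ring.node_for(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} objects moved, all to node-4:",
      all(after[k] == "node-4" for k in moved))
```

Objects that did not move keep their original placement, which is why capacity can be added to a running system without reshuffling the whole archive.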

Tape libraries also lack centralized management across locations. To access or manage a given asset, a user has to be near the library where it’s physically stored. For M&E organizations that have tape archives in multiple locations, this causes logistical issues, as each separate archive must be managed individually. As a result, companies often need to hire multiple administrators to operate each archive, driving up costs and causing operational siloing.

Object storage addresses the challenge of geo-distribution with centralized, universal management capabilities. Because the architecture leverages a global namespace and connects all nodes together in a single storage pool, assets can be accessed and managed from any location. While companies can only access data stored on tape directly through a physical copy, object storage enables them to access all content regardless of where it is physically stored. One person can administer an entire globally-distributed deployment, enforcing policies, creating backup copies, provisioning new users and executing other key tasks for the whole organization.

Conclusion

M&E companies still managing video content in LTO tape libraries suffer from major inefficiencies, and in turn, lost revenue. The format simply wasn’t designed for the modern media landscape. Object storage is a much newer architecture that was built to accommodate massive data volumes in the digital age. Object storage’s searchability, accessibility, scalability and centralized management help broadcasters boost ROI from existing content.

To learn more about Cloudian’s Media and Entertainment solutions, visit cloudian.com/solutions/media-and-entertainment/.
